h5md-user

Re: [h5md-user] Unit attribute versus non-dimensionless quantities


From: Felix Höfling
Subject: Re: [h5md-user] Unit attribute versus non-dimensionless quantities
Date: Thu, 01 Aug 2013 18:13:55 +0200
User-agent: Opera Mail/12.15 (Linux)

On 01.08.2013 at 17:15, Pierre de Buyl <address@hidden> wrote:

Peter Colberg <address@hidden> wrote:

Reading works fine, even for attributes (which is the primary concern):

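  import h5py
  f = h5py.File("h5md_units.h5", "r")
  # Low-level API: open the attribute "data" on the root group
  # (the low-level functions expect byte-string names).
  attr = h5py.h5a.open(f.id, b"data")
  # Wrap the attribute's datatype to reach the attributes attached to it.
  datatype = h5py.Datatype(attr.get_type())
  print(datatype.attrs["unit"])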

I am still figuring out how to teach h5py to write data using HDF5
data types.

I would not be discouraged by h5py's separation into low-level and high-level APIs. The high-level API exports a *tiny* subset of HDF5's functionality, so it is not surprising to find that datatypes have so far not been considered in its design.

Would you agree to postpone units until after version 1.0?

I would remove the units attribute from the specification, which, in
its current form, cannot be implemented. The most important thing to
remember is to avoid the worst case of file format design, breaking
backwards compatibility in a future release.

Given your h5py update, this seems the best solution. No units for 1.0
[he he, I don't use them anyway :-)].

P



I think we would miss an important feature of HDF5 if we do not mention (and support) the possibility of annotating data in general; the physical unit is just one important example.

Many people in MD, in particular in the context of force fields and protein simulations, use absolute physical quantities and need units. I have seen many situations where even LJ units were translated to nm and fs. Just guessing or assuming physical units is not in the spirit of a self-describing file format.

The current specification is not as bad as it seems. Units can be specified for all proper datasets; only data stored as attributes cannot carry a unit. We could just leave it as it is.
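A minimal sketch of that distinction, assuming h5py's high-level API. The names "position" and "box_edges" and the unit string "nm" are hypothetical illustrations, not taken from the specification:

```python
import h5py
import numpy as np

# In-memory file so the sketch leaves nothing on disk.
# All object names below are hypothetical.
with h5py.File("units_sketch.h5", "w", driver="core", backing_store=False) as f:
    # A proper dataset can be annotated with a "unit" attribute.
    pos = f.create_dataset("position", data=np.zeros((4, 3)))
    pos.attrs["unit"] = "nm"

    # Data stored as an attribute: an HDF5 attribute cannot itself carry
    # attributes, so there is no place to attach a unit to "box_edges".
    f.attrs["box_edges"] = np.array([10.0, 10.0, 10.0])

    unit = pos.attrs["unit"]
    edges = f.attrs["box_edges"]
    print(unit, edges)
```

The dataset carries its own annotation, while the attribute-stored value can only be annotated indirectly, for instance through its datatype as in the reading example quoted above.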

"Self-descriptiveness" reminds me of another point: "kinetic_energy" is in some sense self-descriptive, but one might want to add a more precise description as a text string, e.g., "kinetic energy per particle". I think a general optional string attribute "description" for all elements (groups and datasets) would find various applications.
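Such an attribute might look as follows in h5py. This is only a sketch of the proposal: the attribute name "description" is the one suggested above, while the group and dataset names and the empty-string fallback for readers are assumptions made for illustration:

```python
import h5py
import numpy as np

# In-memory file; group and dataset names are hypothetical.
with h5py.File("description_sketch.h5", "w", driver="core", backing_store=False) as f:
    obs = f.create_group("observables")
    obs.attrs["description"] = "thermodynamic observables"

    ekin = obs.create_dataset("kinetic_energy", data=np.zeros(10))
    ekin.attrs["description"] = "kinetic energy per particle"

    # No description on this one: the attribute is optional.
    obs.create_dataset("temperature", data=np.zeros(10))

    # A reader falls back to an empty string when the attribute is absent.
    group_description = obs.attrs.get("description", "")
    descriptions = {name: obj.attrs.get("description", "")
                    for name, obj in obs.items()}
    print(group_description, descriptions)
```

Because the attribute is optional, a reader has to tolerate its absence; the `attrs.get` call with a default does that here.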

I have also asked myself whether insisting on storing dimensionful data as HDF5 attributes might be a design flaw of H5MD, but I don't want to re-open this discussion.

Regards,

Felix


