Re: [h5md-user] Species Data Type


From: Pierre de Buyl
Subject: Re: [h5md-user] Species Data Type
Date: Tue, 20 Aug 2013 16:50:15 -0400
User-agent: Internet Messaging Program (IMP) H4 (5.0.21)


Peter Colberg <address@hidden> wrote:

On Fri, Aug 09, 2013 at 07:41:06PM +0200, Olaf Lenz wrote:
Peter Colberg <address@hidden> wrote:
>The same thought also struck my mind after discovering enums. However,
>I think enums are better used only where storage efficiency is needed,
>i.e. for (potentially) large arrays. In particular, the integer values
>of an enum type should not be hard-coded, since they are intended as a
>program-internal representation, while the associated strings are the
>HDF5 data representation.

In the case of the species, memory efficiency might actually be
needed, when it comes to really large systems.

As for the boundaries, I note that the associated integer values of
an enum are explicitly specified by the user, and the conversion to
an integer type is explicitly mentioned in the user's manual. If the
integer values should not carry semantics, the enum type could be
specified without it. I think it is safe to use enums and to specify
the meaning of specific values.

>I actually find variable-length strings easy to use ;-).

There is, however, the memory problem, and the issue that an integer
is actually easier to use in many languages.

That is not the appropriate comparison; one needs to compare the use
of strings versus *enumerations*, not integers. The last thing I would
want to see is a reader that ignores the semantic values ("periodic",
"none") by reading an enum attribute using an integer memory datatype
[1].

In high-level languages, e.g., using h5py, reading strings is more
convenient than reading enums. For enums, the user has to manually
convert the integer values to the enum values using a conveniently
hidden dictionary. This is acceptable for the species array, where
enums provide a benefit over raw integers, but not for attributes
over a set of string values.
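
The manual conversion step mentioned above can be sketched in plain Python.
This is only an illustration under stated assumptions: the mapping
`species_enum`, the sample values in `raw_species`, and the helper `decode`
are hypothetical stand-ins. In h5py the name-to-integer dictionary would
typically be recovered from the dataset's dtype (in recent versions, for
example, via `h5py.check_enum_dtype`), and the integers would come from
reading the dataset.

```python
# Hypothetical name -> integer mapping, standing in for the dictionary
# attached to an HDF5 enum datatype (names and values are assumptions).
species_enum = {"A": 0, "B": 1, "C": 2}

# Integer values as they might come back from reading a species dataset.
raw_species = [0, 2, 1, 1, 0]

# Invert the mapping once so stored integers can be translated to names.
value_to_name = {value: name for name, value in species_enum.items()}


def decode(values, lookup):
    """Translate stored integers into their enum names."""
    return [lookup[v] for v in values]


species_names = decode(raw_species, value_to_name)
print(species_names)
```

For a large species array the extra lookup is a small price for compact
storage; for a single attribute such as a boundary condition, reading the
string directly avoids the step entirely.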

In low-level languages, one has to make sure to call H5Dvlen_reclaim
to free the memory allocated by the HDF5 library upon reading a
variable-length string (array), but that is all there is to it.

Peter

[1] Hard-coded integers belong to the realm of FORTRAN *77* PROGRAMS… ;-)

For the rare strings present in H5MD, I would also keep them as strings.
Defining an enum for a single occurrence of the values is also too much.

Cheers,

Pierre


