
Re: [h5md-user] Species Data Type


From: Peter Colberg
Subject: Re: [h5md-user] Species Data Type
Date: Mon, 5 Aug 2013 12:27:56 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

Hi Nicolas,

Welcome to the H5MD mailing list :-).

On Mon, Aug 05, 2013 at 11:43:11AM +0200, Nicolas Höft wrote:
> I've started working on a simulation involving 'real' atoms.
> To distinguish the atom types, I use the particle species. From a
> simulator's point of view, the number type of the species makes
> perfect sense. But when one has a lot of different atom types, the
> H5MD species type (number) becomes very impractical from a user's
> point of view, because it is hard to remember (and see) which species
> was which atom type.
> 
> A more useful identification in my case would be the short name of
> the atom (e.g. H, Ar and so on). This would also be in line with the
> often-used descriptions of particle types in mixtures ("A", "B",
> ...).
> 
> One solution to allow this kind of species identification is to drop
> the constraint ", and is of scalar integer data type".

The point you make about the identification of species using strings
is entirely valid. On the other hand, using string arrays would be
cumbersome in certain cases, as Pierre notes.

I suggest taking a look at HDF5 enum types, which let one use a
defined set of integer values and associate a string with each
integer. This satisfies both criteria: readable identification and
storage efficiency.
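
To illustrate the idea independently of HDF5, here is a minimal sketch
in plain Python. The species names and integer codes are hypothetical
and chosen only for illustration; they are not part of H5MD.

```python
from array import array
from enum import IntEnum


# Hypothetical species table: each name is tied to a fixed integer code.
class Species(IntEnum):
    H = 0
    Ar = 1
    O = 2


# Per-particle species stored compactly as signed short integers.
species = array("h", [Species.Ar, Species.H, Species.H, Species.O])

# Translate the stored integers back into readable names.
names = [Species(code).name for code in species]
print(names)
```

The stored data stay plain integers, while the name table is kept
alongside them, which is what an HDF5 enum type does inside the file.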

The HDF5 documentation has read and write examples [1] for enumerated types.

[1] http://www.hdfgroup.org/ftp/HDF5/examples/examples-by-api/api18-c.html#dtyp

This is an example output from h5dump:

   DATASET "species" {
      DATATYPE  H5T_ENUM {
         H5T_STD_I16BE;
         "solid"            0;
         "liquid"           1;
         "gas"              2;
         "plasma"           3;
      }
      DATASPACE  SIMPLE { ( 4, 7 ) / ( 4, 7 ) }
      DATA {
      (0,0): solid, solid, solid, solid, solid, solid, solid,
      (1,0): solid, liquid, gas, plasma, solid, liquid, gas,
      (2,0): solid, gas, solid, gas, solid, gas, solid,
      (3,0): solid, plasma, gas, liquid, solid, plasma, gas
      }
   }

h5py provides the metadata attached to the numpy.dtype:

  dtype = f["species"].dtype
  print(dtype.fields["enum"][2]["vals"])
  --> {'solid': 0L, 'plasma': 3L, 'gas': 2L, 'liquid': 1L}
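
For a species dataset, writing and reading such an enum with h5py
might look like the sketch below. It assumes h5py 2.10 or later, which
provides h5py.enum_dtype and h5py.check_enum_dtype; the species names,
codes, and file name are hypothetical, and the file lives only in
memory.

```python
import h5py
import numpy as np

# Hypothetical species table; names and codes are illustrative only.
mapping = {"H": 0, "Ar": 1, "O": 2}

# Enum type backed by 16-bit integers (requires h5py >= 2.10).
dt = h5py.enum_dtype(mapping, basetype=np.int16)

# In-memory file: the core driver without a backing store writes nothing to disk.
with h5py.File("species_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("species", (4,), dtype=dt)
    dset[...] = [1, 0, 0, 2]  # values are stored as integer codes

    # Recover the name -> code table from the dataset's type.
    names_to_codes = h5py.check_enum_dtype(dset.dtype)
    codes = [int(c) for c in dset[...]]

# Invert the table to translate codes back into species names.
code_to_name = {int(v): k for k, v in names_to_codes.items()}
names = [code_to_name[c] for c in codes]
print(names)
```

The dataset itself holds only 16-bit integers; the string labels are
stored once, in the datatype.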

Regards,
Peter


