
Re: [h5md-user] H5MD for proteins


From: Konrad Hinsen
Subject: Re: [h5md-user] H5MD for proteins
Date: Tue, 10 Sep 2013 12:21:08 +0200

Olaf Lenz writes:

 > I would even go one step further: I think the idea of modules should be
 > part of the HDF5 specs. This would yield something like "XHDF5", the "X"
 > standing for "extensible", as in "XML". And indeed, such modules would
 > directly correspond to DTDs or schemas in XML. From my point of view,
 > this is exactly what HDF5 is missing. It would be perfect if there would

That issue has been discussed a few times on the HDF5 mailing list, but
with little success. Someone even proposed a concrete definition:

  
http://mail.lists.hdfgroup.org/pipermail/hdf-forum_lists.hdfgroup.org/2013-January/006440.html

However, this was (justly) criticized by a member of the HDF team for
various deficiencies, such as the lack of a machine-verifiable schema
definition.

My impression is that most HDF5 users don't care, which means the HDF
group doesn't care, and no one else is big enough to get any
proposal accepted.

 > And even better, if you would define the correspondence between XML
 > and HDF5, one could even directly translate between XML and
 > HDF5.

That could be useful for certain applications, but for many others the
mismatch between XML data types and HDF5 data types would be a problem.
It's worth trying, but it's not a small job.
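
To make the type mismatch concrete, here is a minimal sketch using only
the Python standard library. The XML element name and the "dtype"
attribute are made up for this illustration; they come neither from
H5MD nor from any existing XML schema. XML delivers numbers as text, so
the storage type has to be supplied from somewhere else, and choosing a
32-bit float (a common choice for trajectory data) changes the values.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical XML rendering of one particle position; the element and
# attribute names are assumptions made for this illustration.
doc = '<position dtype="float32">0.1 0.2 0.3</position>'
elem = ET.fromstring(doc)

# XML carries only text: the numbers arrive as strings.
tokens = elem.text.split()
type_is_text = all(isinstance(t, str) for t in tokens)

# Parsing the text as Python floats gives 64-bit values...
as_float64 = [float(t) for t in tokens]

# ...while a round trip through 32-bit storage rounds them.
as_float32 = [struct.unpack("<f", struct.pack("<f", x))[0] for x in as_float64]

# True when the 32-bit values differ from the 64-bit ones.
lossy = as_float32 != as_float64
```

A real translator would also have to deal with compound types, array
shapes, and references, which is part of why this is not a small job.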

 > This would basically make HDF5 a binary XML format, something
 > that a number of people have asked for.  It would be really nice
 > for H5MD, too, as an XML file is significantly simpler to produce
 > than an HDF5 file, so if you have your handwritten code, you just
 > need to output an XML file.

I am not sure I want my 10 GB HDF5 trajectories to pass through an XML
phase during construction.

 > However, this is probably too big for us. ;-) Still, it probably helps to
 > think of modules as DTDs for HDF5. The H5MD specs would then represent a
 > "particle trajectory" module.

Yes, that's a good point of view. Something else we can do is provide
a validation program for H5MD data.

BTW, I chose the same approach for Mosaic. In HDF5, every Mosaic data
item has a "type stamp" (an attribute) that identifies it with its
Mosaic type and the Mosaic version number. The Mosaic library can
validate such data items for conformance.
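
A rough sketch of the type-stamp idea, not Mosaic's actual code: plain
dictionaries stand in for HDF5 nodes so that it runs without an HDF5
library. The attribute name, the "type:version" stamp layout, and the
registry of required members are all hypothetical.

```python
# Hypothetical attribute name; the real Mosaic convention may differ.
STAMP_ATTR = "DATA_MODEL_STAMP"

# Hypothetical registry: (type name, version) -> required member names.
REQUIRED = {
    ("configuration", "1.0"): {"positions", "cell"},
    ("universe", "1.0"): {"atoms", "molecules"},
}


def validate(node):
    """Return a list of problems found in one stamped data item."""
    stamp = node.get("attrs", {}).get(STAMP_ATTR)
    if stamp is None:
        return ["missing type stamp"]
    # Split "type:version" into its two parts.
    type_name, _, version = stamp.partition(":")
    required = REQUIRED.get((type_name, version))
    if required is None:
        return [f"unknown type stamp {stamp!r}"]
    missing = sorted(required - set(node.get("children", {})))
    return [f"missing member {name!r}" for name in missing]


# Two stand-in data items: one complete, one lacking the "cell" member.
good = {"attrs": {STAMP_ATTR: "configuration:1.0"},
        "children": {"positions": [], "cell": []}}
bad = {"attrs": {STAMP_ATTR: "configuration:1.0"},
       "children": {"positions": []}}
```

Calling validate(good) returns an empty list, while validate(bad)
reports the missing member. A validation program for H5MD could follow
the same pattern, with the H5MD specification playing the role of the
registry.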

 > And, in contrast to what Konrad claims, I think that even this basic
 > module has its value. First of all, it serves as a basis for further

Me too :-) It's obviously useful for structuring (and sharing)
programs that work on trajectory data. What I question is the
scientific utility because the semantic information about the data in
the file is so weak. It's not that there is no semantic information, but
even 90% of the semantic information you need is 10% short of something
usable.

Konrad.
-- 
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: research AT khinsen DOT fastmail DOT net
http://dirac.cnrs-orleans.fr/~hinsen/
ORCID: http://orcid.org/0000-0003-0330-9428
Twitter: @khinsen
---------------------------------------------------------------------


