On Tue, Jun 16, 2015 at 11:30:28AM +0200, Felix Höfling wrote:
> Peter, did I understand correctly: parallel reading of a dataset must
> take into account whether the dataset is compact or not, otherwise the
> data are inconsistent between the MPI processes? (Actually, such a
> misuse of the HDF5 library should raise an exception in my opinion.)
Fortunately not: the inconsistency arises only when a single process
writes to the compact dataset and all processes subsequently read from
it. Metadata writes go to the per-process cache, and metadata reads come
from the per-process cache. To keep the per-process caches in sync,
metadata writes must be collective.
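
For illustration, a minimal C sketch of the safe pattern (it assumes a
parallel HDF5 build, and that the file already holds a compact scalar
dataset; the file and dataset names are placeholders):

    #include <hdf5.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Every rank opens the file with the MPI-IO driver. */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
        hid_t file = H5Fopen("example.h5", H5F_ACC_RDWR, fapl);

        /* The raw data of a compact dataset lives in the object
         * header, so this H5Dwrite is a metadata write: all ranks
         * must call it with identical data.  Writing from rank 0
         * alone would leave the other ranks' caches stale. */
        hid_t dset = H5Dopen2(file, "step", H5P_DEFAULT);
        long step = 42;
        H5Dwrite(dset, H5T_NATIVE_LONG, H5S_ALL, H5S_ALL,
                 H5P_DEFAULT, &step);

        /* Every rank now reads the value back consistently. */
        long readback;
        H5Dread(dset, H5T_NATIVE_LONG, H5S_ALL, H5S_ALL,
                H5P_DEFAULT, &readback);

        H5Dclose(dset);
        H5Fclose(file);
        H5Pclose(fapl);
        MPI_Finalize();
        return 0;
    }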
> With respect to writing, I don't see any need to require compactness.
> The writer application "knows" whether it uses the MPI interface or not
> and can act accordingly. Second, h5py does a great job in writing H5MD
> files so far. I would not like to break this kind of support by making
> compactness mandatory.
I agree, though h5py should allow the compact layout for efficiency.
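
At the C level, allowing it amounts to creating the dataset with a
compact dataset creation property list, e.g. (a sketch; the helper name
is made up):

    #include <hdf5.h>

    /* Create a scalar long dataset with compact storage, i.e. the
     * data is kept in the object header instead of a separate
     * contiguous block or chunks. */
    static hid_t make_compact_scalar(hid_t loc, const char *name)
    {
        hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
        H5Pset_layout(dcpl, H5D_COMPACT);
        hid_t space = H5Screate(H5S_SCALAR);
        hid_t dset = H5Dcreate2(loc, name, H5T_NATIVE_LONG, space,
                                H5P_DEFAULT, dcpl, H5P_DEFAULT);
        H5Sclose(space);
        H5Pclose(dcpl);
        return dset;
    }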
> For reading, on the other hand, the implementation of a reader has to
> be simple for the sake of robustness. Querying the storage layout
> before reading may be one complication that can be avoided by
> specifying the layout. (This reminds me of an endless discussion about
> the string type.)
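
For reference, the check a layout-agnostic reader would otherwise need
looks like this (a sketch using the plain C API):

    #include <hdf5.h>

    /* Returns nonzero if the dataset uses the compact layout. */
    static int is_compact(hid_t dset)
    {
        hid_t dcpl = H5Dget_create_plist(dset);
        H5D_layout_t layout = H5Pget_layout(dcpl);
        H5Pclose(dcpl);
        return layout == H5D_COMPACT;
    }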
I tested what happens when all processes read a scalar dataset with
contiguous layout. It actually works fine; I get the same read times as
for the compact layout.
How about we include in the specification that the scalar "step"/"time"
dataset SHOULD use a compact layout?
Peter