Re: [h5md-user] fields of observable group
From: Felix Höfling
Subject: Re: [h5md-user] fields of observable group
Date: Tue, 06 Sep 2011 11:00:57 +0200
User-agent: Opera Mail/11.11 (Linux)
On Mon, 05 Sep 2011 11:17:55 +0200, Konrad Hinsen
<address@hidden> wrote:
On 2 Sep, 2011, at 14:07 , Felix Höfling wrote:
How should the edges/offset scheme be extended to account for shapes
such as a truncated octahedron?
I see two options:
1) Introduce a special case for the truncated octahedral shape, which is
probably the most frequent one. The size of the box is specified by the
edges just like for a "normal" (parallelepipedic) box. Some additional
label says that it is a truncated octahedral box, meaning that the unit
cell of the parallelepipedic system actually contains two copies of the
system.
2) Provide a general mechanism for specifying symmetry inside the box.
This would allow the simulation of arbitrary crystals while maintaining
their symmetry. The part of the simulation universe stored explicitly
becomes the asymmetric unit, to which a set of symmetry transformations is
applied implicitly to reconstruct the whole system. The most
straightforward way to store the symmetry information is as a list of
symmetry transformations, each consisting of one 3x3 rotation matrix plus
a translation vector.
In my "molecular system" data model, I have chosen the second approach
because in my field of work (molecular biophysics), crystals are
important because crystallography is the main source for protein
structures.
An optional set of affine symmetry transformations sounds very good. In
the most general case, this would include a matrix A (allowing for
reflections, rotations, rescaling) and a translation vector b for each
copy: x' = A x + b.
If we restrict to isometric transformations, the matrix shall be
orthogonal. This appears to be pretty general already, see
http://en.wikipedia.org/wiki/Euclidean_group.
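The isometry restriction can be checked numerically. The following is only a minimal sketch in plain Python: the helper name is_orthogonal and the tolerance are assumptions made for illustration and are not part of any H5MD proposal. It tests A^T A = I, which holds for rotations and reflections but not for a rescaling.

```python
# Sketch: test whether a d x d matrix A (nested lists) is orthogonal,
# i.e. whether x' = A x + b preserves distances.
# The name is_orthogonal and the tolerance are illustrative assumptions.

def is_orthogonal(A, tol=1e-12):
    d = len(A)
    for i in range(d):
        for j in range(d):
            # (A^T A)_ij = sum_k A_ki * A_kj
            dot = sum(A[k][i] * A[k][j] for k in range(d))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return True

rotation_z90 = [[0.0, -1.0, 0.0],
                [1.0,  0.0, 0.0],
                [0.0,  0.0, 1.0]]
reflection_x = [[-1.0, 0.0, 0.0],
                [0.0,  1.0, 0.0],
                [0.0,  0.0, 1.0]]
rescaling = [[2.0, 0.0, 0.0],
             [0.0, 2.0, 0.0],
             [0.0, 0.0, 2.0]]

print(is_orthogonal(rotation_z90))  # rotation: isometric
print(is_orthogonal(reflection_x))  # reflection: isometric
print(is_orthogonal(rescaling))     # rescaling: not isometric
```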
I am thinking of two optional attributes, "transformation" and "shift",
attached to the "box" group. They hold datasets of ranks 3 and 2,
respectively: a square matrix and a vector for each copy of the stored
particle coordinates. If both attributes are present, their first
dimensions must agree; the remaining dimensions correspond to the space
dimension d. [Alternatively, one may store a d by d+1 matrix (A, b).]
The order of operations is understood such that the matrix
multiplication is carried out first, then the translation. The identity
transformation shall not be specified and is always included [or would it
be better to require it explicitly if the attributes are present?]. This
would imply the following HDF5 structure:
parameters
 \-- box
      +-- [transformation] [#copies-1][d][d]  (-1 because the identity is included by default)
      +-- [shift] [#copies-1][d]
      +-- edges
      |    +-- sample
      |    +-- time
      |    \-- step
      \-- offset
           +-- sample
           +-- time
           \-- step
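To make the proposed shapes and the order of operations concrete, here is a rough sketch in plain Python. It does not read or write HDF5; the function names apply_affine and all_copies and the sample numbers are invented for illustration. It assumes the convention suggested above: the identity copy is implicit, the matrix is applied first, then the shift.

```python
# Sketch: expand stored positions into all copies.
# transformation has shape [#copies-1][d][d], shift has shape [#copies-1][d].
# Function names and sample values are illustrative assumptions.

def apply_affine(A, b, x):
    """Return A x + b: matrix multiplication first, then translation."""
    d = len(x)
    return [sum(A[i][k] * x[k] for k in range(d)) + b[i] for i in range(d)]

def all_copies(positions, transformation, shift):
    """Return a list of copies; the implicit identity copy comes first."""
    if len(transformation) != len(shift):
        raise ValueError("first dimensions of transformation and shift must agree")
    copies = [[list(x) for x in positions]]  # identity, not stored explicitly
    for A, b in zip(transformation, shift):
        copies.append([apply_affine(A, b, x) for x in positions])
    return copies

# Two copies in total: the identity plus an inversion followed by a shift.
transformation = [[[-1.0, 0.0, 0.0],
                   [0.0, -1.0, 0.0],
                   [0.0, 0.0, -1.0]]]
shift = [[5.0, 5.0, 5.0]]
positions = [[1.0, 2.0, 3.0], [0.5, 0.0, 4.0]]

copies = all_copies(positions, transformation, shift)
print(len(copies))   # number of copies including the identity
print(copies[1][0])  # image of the first particle in the second copy
```

If the identity were instead required explicitly, the only change in this sketch would be to drop the line that seeds the list with the untransformed positions.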
One problem appears here: if the box size fluctuates, the shift vector has
to be adjusted as the simulation progresses. If the matrix is orthogonal,
i.e., of norm unity, it is unaffected. Maybe the shift vector should be
unified with 'offset'?
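The dependence of the shift on the box size can be illustrated with a toy calculation. The sketch below assumes a cuboid box whose edges are given as a list of lengths, and a shift that is a fixed fraction of those edges. Both the function name shift_for_sample and the fractional representation are assumptions for illustration only, not something proposed in this thread.

```python
# Sketch: a shift defined as a fixed fraction of the box edges has to be
# recomputed whenever the edges change, while an orthogonal matrix would not.
# shift_for_sample and the fractional representation are hypothetical.

def shift_for_sample(fraction, edges):
    """Absolute shift vector for one sample of a cuboid box."""
    return [f * e for f, e in zip(fraction, edges)]

fraction = [0.5, 0.5, 0.5]  # shift by half the box in every direction
edges_per_sample = [[10.0, 10.0, 10.0], [10.5, 10.5, 10.5]]

for edges in edges_per_sample:
    print(edges, shift_for_sample(fraction, edges))
```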
Shall the boundary conditions of the box be stored in an H5MD file? I can
think of open boundaries, periodic boundaries and (a bit weird)
Klein-bottle boundaries (a torus plus a twist). The same question arises
for the velocities in case of Lees-Edwards boundaries.
The offset can be useful if, e.g., different simulation snapshots shall
be glued together (for creating layered structures of pre-equilibrated
phases). And it is necessary for the complete description of an
arbitrarily positioned box in space. Of course, it is redundant for
particle positions reduced to the periodic box.
Ultimately this comes down to the question of what conditions you want
to impose on the particle coordinates stored in the trajectory. There
are various options:
1) No conditions at all. A particle implicitly stands for all its
periodic images, which are constructed by applying integer multiples of
the edge vectors. There is then no point in storing an offset. This
convention makes life easy for programs generating a trajectory, but
requires more work by programs that read a trajectory. It provides the most
freedom for pairs of generators/readers to establish their own
conventions, but leaves the verification of these conventions to the
readers.
2) Coordinates are required to be in the interval [offset ...
offset+edge[. Trajectory generating programs must ensure this condition,
but have the freedom of specifying the offset as it suits them. Reading
programs gain a bit compared to 1) but must still handle arbitrary
offsets.
3) Coordinates are required to be in an interval defined by the
trajectory format specification, such as [0 ... edge[ or [-edge/2 ...
edge/2[. This puts more constraints on trajectory generators but
provides the most guarantees to readers.
There are also intermediate choices, such as permitting various
conventions and indicating them in metadata.
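What a reader or writer has to do under conventions 2) and 3) can be sketched for a single coordinate of a cuboid box. This is only an illustration: the function name wrap and the sample numbers are assumptions, and a general parallelepipedic box would need the same operation in fractional coordinates.

```python
import math

# Sketch: map a coordinate to its periodic image in a half-open interval.
# The name wrap and the sample values are illustrative assumptions.

def wrap(x, edge, lower):
    """Return the periodic image of x in [lower, lower + edge[."""
    return x - edge * math.floor((x - lower) / edge)

edge = 10.0
offset = -3.0
x = 27.5  # unconstrained coordinate, as permitted by convention 1)

print(wrap(x, edge, offset))     # convention 2): [offset ... offset+edge[  -> -2.5
print(wrap(x, edge, 0.0))        # convention 3): [0 ... edge[             -> 7.5
print(wrap(x, edge, -edge / 2))  # convention 3): [-edge/2 ... edge/2[     -> -2.5
```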
In biomolecular simulations, the usual convention is 1) because it
permits a useful arrangement for visualization: coordinates can be
arranged such that biologically important molecular assemblies are
represented in the way that is most useful to a biologist. This comes
down to having a specific arrangement for each individual simulation.
All programs get a bit more complicated, but it's the users who gain.
However, the same effect can be obtained otherwise, e.g. by storing an
explicit "visualization offset" outside of the trajectory.
The role of the offset may change with the inclusion of Euclidean
transformations, see above. Just one remark: for studies of transport in
liquids, it is desirable to store the unwrapped trajectory of each
particle in periodic space, otherwise the displacements after long time
lags get unphysically truncated. Thus the positions should definitely not
be enforced to be within the unit cell, leaving it open to the writer
whether to reduce them or not.
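The truncation of displacements can be seen in a toy example. The sketch assumes a single particle moving at constant velocity along one axis of a periodic box; all names and numbers are made up for illustration.

```python
import math

# Sketch: displacement computed from unwrapped versus wrapped coordinates.
# A particle moves ballistically in 1D; all values are illustrative.

def wrap(x, edge):
    """Return the periodic image of x in [0 ... edge[."""
    return x - edge * math.floor(x / edge)

edge = 10.0
velocity = 0.75  # distance travelled per step
steps = 100

unwrapped = [velocity * n for n in range(steps + 1)]
wrapped = [wrap(x, edge) for x in unwrapped]

true_displacement = unwrapped[-1] - unwrapped[0]
apparent_displacement = wrapped[-1] - wrapped[0]

print(true_displacement)      # 75.0
print(apparent_displacement)  # 5.0, can never exceed the box edge
```

A mean-square displacement computed from the wrapped coordinates would therefore saturate at a value set by the box size.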
Felix