octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

rewrite of structs - advice sought


From: Jaroslav Hajek
Subject: rewrite of structs - advice sought
Date: Tue, 22 Jun 2010 16:27:28 +0200

hi all,

a couple of weeks ago I started to work on a larger change to Octave,
rewriting the C++ implementation of structs.

Rationale:

The current core class for structs is Octave_map, with an associated
octave_value container octave_struct.
Octave_map is basically a thin layer above std::map<std::string,
Cell>, with field lookup being handled by std::map and indexing by
Cell (i.e. Array).
This ensures efficient addition and lookup of keys, in basic
expressions like s.x and s.y = 1. std::map generally handles these
operations in logarithmic time.

While this design is efficient for field-based access, it suffers from
several significant shortcomings:

1. the std::map is directly contained in Octave_map, meaning that it
can't be shared amongst copies. Because std::map is normally not
reference counted, any copy of Octave_map, e.g. query for map_value,
or returning it from a DEFUN involves reconstruction of the std::map
structure (which is usually a red-black tree of std::pair<key,
value>).
Although this will be normally handled in linear time by the standard
library, it is still a considerable overhead typically requiring as
many unconstrained allocations as there are fields.

2. the majority of structs in real world usage are probably scalars.
In spite of this fact, there is no storage optimization for scalar
structs (as there is for numeric or logical scalars), and each field
of a scalar struct requires an extra singleton cell. This makes use of
lots of scalar structures (which is sometimes handy) quite wasteful.

3. as in 1, array-like operations with structs that do not involve
field manipulation, such as () indexing without field access, require
reconstruction of the std::map structure. This is typically done by
sequential field access, promoting the operation to superlinear time
(N*log(N)).

4. as in 3, built-ins are missing potential optimizations for
manipulating struct arrays. For instance:

octave:1> bins = glob ("/usr/bin/*");
octave:2> tic; cellfun (@stat, bins, "uniformoutput", false); toc
Elapsed time is 0.06951 seconds.
octave:3> tic; cellfun (@stat, bins); toc
Elapsed time is 1.07957 seconds.

while the first call leaves the outputs from stat() as scalar structs,
the second call accumulates them into an array for more efficient
storage. The accumulation itself, however, accounts for 93% of the
running time here, which is somewhat embarrassing. For comparison,
using a logical function:

octave:4> tic; cellfun (@is_absolute_filename, bins, "uniformoutput",
false); toc
Elapsed time is 0.003546 seconds.
octave:5> tic; cellfun (@is_absolute_filename, bins); toc
Elapsed time is 0.00264812 seconds.

here, the scalar result accumulation is optimized, taking advantage of
the storage optimizations for logical scalars.

Most of the above also applies to user-defined classes.

First code for the proposed changes is available in my public repo:
http://hg.tw-math.de/highegg/
It now finally compiles, but is still far from being debugged.

Summary of changes:

The core classes for storing structs are octave_map and
octave_scalar_map, the latter being simplified and specialized for
scalars. It also has a specialized container (octave_scalar_struct).

The field storage is based on directly contained std::vector (of Cell
or octave_value), thus involving only vector clone when copied.
Additionally, both classes store an octave_fields object, which is a
reference-counted encapsulation of a key -> index map, used by both
scalar and array maps. The index is used on the stored std::vector to
obtain a field reference.

This allows struct arrays and scalars to share the field information
via reference counting, copying only std::vectors instead of
std::maps, which is typically way more efficient, and requires just a
single allocation per operation.

Complexity of most operations (s.x, s.x = y, isfield, fieldnames,
orderfields) can be preserved (or improved, because in fact some of
these are suboptimal now). A notable exception is rmfield - removing a
field in this data structure takes up to linear time instead of
logarithmic time. However, since rmfield deletes out-of-place and thus
involves a linear-time copy anyway, the complexity of the built-in
will not change as seen from the interpreter, only for C++ extensions.

Regarding compatibility, after some thinking I decided to let the old
Octave_map implementation co-exist with the new ones, and make them
convertible to each other.

Thus, code going like

Octave_map m = args(i).map_value ();
// manipulate m
retval = octave_value (m);

will continue to work without changes, although slowed down by the
internal conversion. Eventually it may also produce a deprecation
warning.

Several questions:

1. general opinions about this? Is this a worthy goal for Octave? So
far it had been mostly fun, now the hard work (debugging) begins, so
I'd like to know if I should abandon the effort in time.

2. naming: I chose octave_map and octave_scalar_map, so that
octave_map and Octave_map only differ in capitalization. Since
Octave_map should eventually go away, this almost-nameclash seemed OK
to me, but I'm open to objections.

3. anything else...

regards

-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


reply via email to

[Prev in Thread] Current Thread [Next in Thread]