lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Reference member = shared_ptr


From: Vadim Zeitlin
Subject: Re: [lmi] Reference member = shared_ptr
Date: Thu, 4 Feb 2021 17:49:40 +0100

On Tue, 2 Feb 2021 22:15:10 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:

GC> On 2/2/21 3:28 PM, Vadim Zeitlin wrote:

[...discussing cache implementation...]

GC> That's an intriguing idea, and Vaclav had said:
GC> 
GC> https://lists.nongnu.org/archive/html/lmi/2012-06/msg00007.html
GC> | I ended up using pointers in the end, because DBDictionary is uncopyable
GC> | -- and [...] cannot be easily made copyable.
GC> 
GC> The present design, with shared_ptr, seems to be the most reasonable
GC> way to obtain the current behavior--if class X HAS-A cached class F,
GC> and instances X0a, X0b, ...X0z are created with instance F0, but
GC> then the file underlying F is changed resulting in instance F1,
GC> then subsequently-created instance X1a, X1b... use F1, but earlier
GC> instances like X0k keep using F0.

 It all boils down to whether this is acceptable (or even desirable)
behaviour or not. It seems to me that it could be rather surprising to keep
using old data, not present anywhere on the disk any more, and it could
also be surprising, or worse, to use inconsistent data in X0a and X1a.

GC> If we make F a single mutable
GC> object, and replace F0 with F1, then instance X0k would use F1,
GC> which could have astonishing results. For example, maybe F was
GC> changed in a way that would prevent X0k from being created.

 If this really can happen, then I think the only reasonable solution is to
recreate all X objects whenever F changes. This is, of course, much more
involved, but I just don't see how could it make sense for X0k to continue
to be used when the current F state is incompatible with its existence.

GC> Alternatively, we could presumably preserve the current behavior
GC> with some kind of multimap that would preserve all the F0, F1...F99
GC> ever created, but return only the latest F99 for new queries. This
GC> would move the responsibility for extending the lifetime of all FN
GC> into the cache's data structure, instead of just letting shared_ptr
GC> handle that.

 Yes, this is definitely a bad idea.

GC> There are many tradeoffs, and the design we have isn't clearly
GC> worse than any other. It may be unwise to spend time changing it.

 Just to be perfectly clear: I'm not speaking about trade-offs but about
the fundamental goals here. If the current behaviour is what we need, then
using shared_ptr<const T> is indeed a pretty standard implementation of it.
There are different and IMHO slightly better approaches and personally I
prefer to use wrapper objects with value-like semantics using reference
counting and copy-on-write internally, which is advantageous from both the
code clarity (in particular, such objects are by definition never null) and
performance point of view (MT-safety of shared_ptr<> doesn't come for free)
standpoints, but this is indeed just a trade-off between the quality and
the simplicity of implementation that is probably not important enough to
worry about.

 However it is definitely important to know whether the current behaviour
is indeed what you want. I just can't convince myself that it is, which is
why I keep bringing it up. Of course, convincing myself shouldn't be your
goal, it's enough if you're sure of this yourself, and if you are, then I
won't mention it again.


GC> That (and several like it) are shared_ptr because of cache_file_reads.
GC> 
GC>   basic_values.hpp:    std::shared_ptr<MortalityRates>     MortalityRates_;
GC> 
GC> That (and several like it) probably arise from the bad old days
GC> of the 1990s, when objects that might exceed 65536 bytes had to
GC> be allocated on the heap. At least we can change these to
GC> unique_ptr; please tell me whether you think new branch
GC> odd/unique_vs_shared should be merged into master.

 One important comment on this branch is that its premise that "forward
declarations no longer suffice" [when unique_ptr is used] is wrong. It is
perfectly possible to declare unique_ptr<> with incomplete classes, you
just need to take care to avoid _using_ it before including the complete
class declaration. And "using" here means both using it in your own code
and in the compiler-generated special functions. So you could have kept the
forward declarations unchanged and just would need to explicitly define
BasicValues move ctor and assignment operator and dtor as "= default" in
basicvalues.cpp file, where the full declarations of everything are
available.

 Whether this should be done or not is a different question, I'm
sympathetic to the "these objects should probably be held directly" part,
so perhaps not. But you definitely don't have to do it right now if you
don't want to.

 So, to summarize, I don't think odd/unique_vs_shared branch should be
merged in its current state as it seems to be in some kind of undecided
middle position between 2 solutions:

(a) Use unique_ptr<> with incomplete classes to reduce physical
    dependencies.
(b) Use objects directly to reduce run-time overhead and make the code
    simpler.

I don't know whether (a) or (b) are better, but both seem better than the
current change to me.

GC>   account_value.hpp:    std::shared_ptr<Ledger         > ledger_;
GC> 
GC> Here, class AccountValue does its work and then vanishes (its dtor
GC> executes), leaving behind only the Ledger it produces, which can be
GC> used downstream because the shared_ptr endows it with eternal life.
GC> Isn't that an appropriate use?

 I'd say definitely not. The ownership of this Ledger is never shared with
anybody by this class (if it happens, it's done outside of it), so it
should use unique_ptr<> and return it from its ledger_from_av() function.
Notice that the caller could easily put it into shared_ptr<> if it really
needed to share its ownership -- but getting rid of shared_ptr<> once
you've started using it isn't possible any more, so returning unique_ptr<>
is almost always a better idea than returning a shared_ptr<>.

 IOW, eternal life is not always a blessing, it could also be a cancer.
Well defined lifetime is more desirable. At least from software management
perspective.

GC>   ledger.cpp:// TODO ?? Is it really a good idea to have shared_ptr data 
members?
GC> 
GC> Probably not. Especially because that comment block speaks of
GC> using 'new' to create a copy, which is a very 1990s solution; and,
GC> again, the original reason for using shared_ptr's here may have
GC> been to work around the 65536-byte limit on an 8086 data segment.
GC> But every time I contemplate class ledger_map_holder, a darkness
GC> descends on me--it's like wishing that you had built your house
GC> out of brick instead of lumber, but it seems like a lot of work
GC> to go back and revise that decision. Still, if you're inclined to
GC> rectify this, I'd be glad to accept patches.

 I think it would be actually good to address this, so I've added it to the
middle (and not the bottom) of my TODO complicated-multi-dimensional-data-
structure-usually-called-list-for-brevity.

GC>   multidimgrid_any.hpp:    typedef std::shared_ptr<MultiDimAxisAny> 
AxisAnyPtr;
GC> 
GC> Wholesale replacement would be the way to address this and other
GC> concerns with the product-editor code, which constitutes about
GC> one-quarter of lmi's use of shared_ptr.

 Yes, I'm not proud of this :-(

GC>   rate_table.hpp:    explicit table(std::shared_ptr<table_impl> const& impl)
GC> 
GC> Perhaps this is a case where you'd still use shared_ptr today?

 No, I believe the only reason I used it there was that it had been done
before the switch to C++11 in lmi. Today I would have used unique_ptr<>
here and, more importantly, use rvalue reference for the second parameter
of the database ctor. I could definitely change this code to avoid
shared_ptr<> misuse, please let me know if you agree it's worth doing.

 Thanks,
VZ

Attachment: pgpJX0cOzpcoR.pgp
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]