Re: [lmi] Micro-optimization in ledger_format


From: Greg Chicares
Subject: Re: [lmi] Micro-optimization in ledger_format
Date: Fri, 18 Jan 2019 00:50:56 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.0

On 2019-01-17 02:33, Vadim Zeitlin wrote:
[...]
>  I could profile this to find out where exactly the time is spent but, in
> principle, it's not very surprising that creating a new object is more
> expensive than not doing it.
> 
>  IOW, what you should really compare it with is this:

[...patch snipped here; I'll apply and commit it soon...]

> and this version is actually faster than the original one.

Thanks. Distracted by incidental syntactic concerns, I hadn't noticed
that I had the 'static' variable in the wrong place. For each of the
1000 iterations in the timing loop, I had:

  X& auxiliary_function()
  {
    static X x;     // Constructed OAOO.
    x.imbue(facet); // Executed 1000 times!
    return x;
  }

  void cast_function()
  {
    X& x {auxiliary_function()}; // Not static: called 1000 times.
    do_something_with(x);
  }

but the efficient way is:

  X auxiliary_function() // Called only once.
  {
    X x;
    x.imbue(facet);
    return x;
  }

  void cast_function()
  {
    static X x {auxiliary_function()}; // Initialized OAOO.
    do_something_with(x);
  }
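
For concreteness, here is a self-contained sketch of the same pattern
applied to a stream with an imbued locale, which is roughly the kind of
object under discussion; the names 'comma_punct', 'make_imbued_stream',
and 'format_with_commas' are illustrative only, not lmi's actual code:

  // Illustrative sketch only--not lmi's actual ledger_format code.

  #include <locale>
  #include <sstream>
  #include <string>

  class comma_punct : public std::numpunct<char>
  {
    protected:
      char do_thousands_sep() const override {return ',';}
      std::string do_grouping() const override {return "\3";}
  };

  std::ostringstream make_imbued_stream() // Called only once.
  {
    std::ostringstream oss;
    oss.imbue(std::locale(oss.getloc(), new comma_punct)); // Locale owns the facet.
    return oss;
  }

  std::string format_with_commas(long long n)
  {
    static std::ostringstream oss {make_imbued_stream()}; // Initialized OAOO.
    oss.str(""); // Discard prior contents; the imbued locale persists.
    oss << n;
    return oss.str();
  }

With these definitions, format_with_commas(1234567) yields "1,234,567",
and imbue() runs only on first use, just as in the corrected example above.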

>   Speed tests...
>   stream_cast     : 1.522e-03 s mean;       1517 us least of 100 runs
>   minimalistic    : 1.166e-03 s mean;       1157 us least of 100 runs
>   static stream   : 8.668e-04 s mean;        858 us least of 100 runs
>   static facet too: 8.570e-04 s mean;        850 us least of 100 runs
>   without str()   : 7.687e-04 s mean;        757 us least of 100 runs
> 
> and looks more informative. Of course, not all digits of the result are
> significant, but they're mostly stable.
> 
>  Now, for comparison, your version move-constructing a new interpreter
> every time yields this:
> 
>   Speed tests...
>   stream_cast     : 1.643e-03 s mean;       1516 us least of 100 runs
>   minimalistic    : 1.242e-03 s mean;       1222 us least of 100 runs
>   static stream   : 9.102e-04 s mean;        892 us least of 100 runs
>   static facet too: 1.194e-03 s mean;       1164 us least of 100 runs
>   without str()   : 1.089e-03 s mean;       1060 us least of 100 runs
> 
> i.e. the last 2 lines are indeed considerably slower.
> 
>  But my version above gives
> 
>   Speed tests...
>   stream_cast     : 2.178e-03 s mean;       1589 us least of 100 runs
>   minimalistic    : 1.203e-03 s mean;       1180 us least of 100 runs
>   static stream   : 8.986e-04 s mean;        890 us least of 100 runs
>   static facet too: 7.020e-04 s mean;        697 us least of 100 runs
>   without str()   : 5.933e-04 s mean;        585 us least of 100 runs
> 
> i.e. it is a bit faster than the original one.

Corresponding to your last two runs above (after adopting your welcome
addition of a 'for 0 to 1000' timing loop), I have:

*** 'static' used unwisely:
  stream_cast     : 3.367e-003 s mean;       3238 us least of 100 runs
  minimalistic    : 2.584e-003 s mean;       2573 us least of 100 runs
  static stream   : 1.418e-003 s mean;       1164 us least of 100 runs
  static facet too: 1.560e-003 s mean;       1538 us least of 100 runs
  without str()   : 1.551e-003 s mean;       1519 us least of 100 runs

*** 'static' used judiciously:
  stream_cast     : 3.306e-003 s mean;       3206 us least of 100 runs
  minimalistic    : 2.576e-003 s mean;       2564 us least of 100 runs
  static stream   : 1.358e-003 s mean;       1161 us least of 100 runs
  static facet too: 8.661e-004 s mean;        858 us least of 100 runs
  without str()   : 8.474e-004 s mean;        838 us least of 100 runs

The middle "static stream" lines show no timing difference for either
of us, because no code was changed there. The difference seen in the
"static facet too" lines shows the benefit of calling imbue() OAOO.


