Re: [lmi] overview of C++ expression template libraries


From: Vadim Zeitlin
Subject: Re: [lmi] overview of C++ expression template libraries
Date: Mon, 22 Mar 2021 12:39:50 +0100

On Mon, 22 Mar 2021 02:58:52 +0000 Greg Chicares <gchicares@sbcglobal.net> wrote:

GC> In lmi's problem domain, vectors are largely sufficient, and arrays
GC> of greater rank are rare. Many libraries target multidimensional
GC> arrays, and std::vector wouldn't work for them.

 Yes, this is definitely a good point. It still seems like it ought to be
possible to make the same logic work for std::vector<>, but somehow nobody
seems to have ever been interested in this.

GC> >  From this point of view, Boost.YAP[1] library looks promising, as it
GC> > seems to allow exactly this, and I was going to look at it "soon" but
GC> > just didn't have time to do it yet. But, again, I don't know if it's
GC> > still worth doing this if you've already committed to using PETE for
GC> > the observable future.
GC> > 
GC> > [1]: https://www.boost.org/doc/libs/master/doc/html/yap.html
GC> 
GC> Looks like PETE to me. Compare:
GC> 
GC> // Assigns some expression e to the given vector by evaluating e
GC> // elementwise, to avoid temporaries and allocations.
GC> template <typename T, typename Expr>
GC> std::vector<T> & assign (std::vector<T> & vec, Expr const & e)
GC> {
GC>     decltype(auto) expr = boost::yap::as_expr(e);
GC>     assert(equal_sizes(vec.size(), expr));
GC>     for (std::size_t i = 0, size = vec.size(); i < size; ++i) {
GC>         vec[i] = boost::yap::evaluate(
GC>             boost::yap::transform(boost::yap::as_expr(expr), take_nth{i}));
GC>     }
GC>     return vec;
GC> }

 But this is part of the _implementation_, not how you use the library.
I.e. you're supposed to write this once but then, as the example from which
the above was taken shows, you can just do

        std::vector<int> a,b,c,d;
        std::vector<double> e(n);

        [...]

        // After this point, no allocations occur.

        assign(b, 2);
        assign(d, a + b * c);

        a += if_else(d < 30, b, c);

        assign(e, c);
        e += e - 4 / (c + 1);
 
 I'm not sure why we need to use assign() instead of just the assignment
operator, but this still looks reasonably clear to me.
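
 One plausible reason (an inference from the C++ language rules, not
something stated in this thread or quoted from the YAP documentation):
operator= can only be declared as a non-static member function, so it
cannot be added to std::vector from outside, whereas compound assignments
such as += may be non-member functions. A minimal sketch, in which
constant_expr is a hypothetical stand-in for a real expression type:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an expression: indexable, with a size.
struct constant_expr
{
    int value;
    std::size_t count;
    int operator[](std::size_t) const { return value; }
    std::size_t size() const { return count; }
};

// Allowed: a compound assignment operator may be a non-member function.
template <typename T, typename Expr>
std::vector<T>& operator+=(std::vector<T>& vec, Expr const& e)
{
    assert(vec.size() == e.size());
    for (std::size_t i = 0; i < vec.size(); ++i) {
        vec[i] += e[i];
    }
    return vec;
}

// Not allowed: operator= must be a non-static member function, so
//   template <typename T, typename Expr>
//   std::vector<T>& operator=(std::vector<T>&, Expr const&);
// is ill-formed. A named function takes its place.
template <typename T, typename Expr>
std::vector<T>& assign(std::vector<T>& vec, Expr const& e)
{
    assert(vec.size() == e.size());
    for (std::size_t i = 0; i < vec.size(); ++i) {
        vec[i] = e[i];
    }
    return vec;
}
```

 With these definitions, assign(v, constant_expr{2, 3}) overwrites a
three-element vector and v += constant_expr{5, 3} updates it in place.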

 Of course, it's conceptually similar to PETE, but there is nothing
really wrong with that, is there?

GC> Here's a library:
GC>   https://github.com/wichtounet/etl
GC> that "makes extensive use of C++17 and some features of C++20".
GC> 
GC> > and it seems clear that things could be much improved
GC> > simply by using C++17-specific features such as if constexpr and fold
GC> > expressions.
GC> 
GC> "improved" in what sense?

 In the sense of making the library implementation simpler and more
efficient. In fact, the author of ETL, linked above, wrote a couple
of posts explaining the advantages of migrating his library to C++17. I
admit I have not followed all the details, but I assume he knows what he's
talking about...

GC> Most of lmi resists vectorization: it consists of
GC>   for(int year = 0; year < 100 - age; ++year)
GC>     for(int month = 0; month < 12; ++month)
GC>        do_something();
GC> where do_something() involves only simple + - * / arithmetic,
GC> but depends on a vast and dense thicket of conditionals, with
GC> rounding at each step. The rounding means that 100 years or
GC> even 12 months cannot be run in parallel on separate cores.
GC> It's embarrassingly serial.

 But couldn't we still process 12 months in a single loop iteration,
operating on all 12 months in parallel? This should be possible with AVX
instructions.
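
 To make the precondition concrete, here is a sketch of the kind of loop
that could be processed twelve months at a time. It assumes, purely for
illustration, that each month's result depends only on that month's
inputs; the function name, the formula and the rounding to cents are all
hypothetical and not taken from lmi. Whether a compiler actually emits AVX
instructions for it depends on the compiler, its flags and the rounding
function used.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical monthly step: apply a rate to twelve balances and round each
// result to cents. No iteration reads another iteration's result, which is
// what would let the twelve months be computed side by side.
std::array<double, 12> credit_interest(
    std::array<double, 12> const& balance,
    std::array<double, 12> const& rate)
{
    std::array<double, 12> result{};
    for (std::size_t m = 0; m < 12; ++m) {
        double const cents = balance[m] * (1.0 + rate[m]) * 100.0;
        result[m] = std::nearbyint(cents) / 100.0;
    }
    return result;
}
```

 If month m needed the rounded result of month m - 1, this loop would
carry a dependency between iterations and could not be vectorized this way.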

GC> For all those various parts of lmi to work together, there
GC> needs to be one shared vector datatype. If that datatype is
GC> to be std::vector, then I don't think we're likely to find
GC> a better library than PETE. We could write one ourselves,
GC> but I don't think we'll find one.

 As I wrote, I tend to agree, with Boost.YAP being the only library I've
found that really decouples the ET logic from the actual data
representation. This comes at the cost of needing to write things like
assign() above ourselves, but, in principle, this might not be too bad.

GC> If we can make 10% of lmi's code twice as fast, then lmi
GC> runs only 5% faster.

 I do agree with the idea, but I'd still like to say that the gains from
using AVX may be significantly higher than "only" 100%.
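
 The arithmetic behind both remarks is Amdahl's law. A small sketch (the
function name overall_speedup is made up for illustration):

```cpp
#include <cassert>
#include <cmath>

// Overall speedup when a fraction p of the run time (0 <= p <= 1) is made
// s times faster and the rest is unchanged (Amdahl's law).
double overall_speedup(double p, double s)
{
    return 1.0 / ((1.0 - p) + p / s);
}
```

 With p = 0.10 and s = 2 this gives 1 / 0.95, about 1.053, which is the 5%
figure quoted above. A larger s helps, but with p fixed at 0.10 the result
can never exceed 1 / 0.9, about 1.11, however fast that 10% becomes.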

GC> I just want to regain some of the expressiveness of another
GC> language almost as old, APL, where
GC>     A += B + C;
GC> works as expected for std::vectors.

 Yes, and it really surprises me that there is no existing library that
allows this. Perhaps I'm missing something here and it's not as simple (or
as useful?) to write one as I think it would be.
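
 For reference, a bare-bones sketch of what such a library would have to
do for addition only. This is neither PETE nor Boost.YAP; sum_expr and
is_operand are names invented here, and a real library would also need the
other operators, scalar operands and its own namespace.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Lazy elementwise sum. It stores references only and allocates nothing,
// so it must not outlive the full expression in which it was created.
template <typename L, typename R>
struct sum_expr
{
    L const& lhs;
    R const& rhs;
    auto operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

// Restricts the operators below to vectors and to nodes built from them.
template <typename T>
struct is_operand : std::false_type {};
template <typename T, typename A>
struct is_operand<std::vector<T, A>> : std::true_type {};
template <typename L, typename R>
struct is_operand<sum_expr<L, R>> : std::true_type {};

// Builds a node instead of computing a temporary vector.
template <typename L, typename R,
    typename = std::enable_if_t<is_operand<L>::value && is_operand<R>::value>>
sum_expr<L, R> operator+(L const& lhs, R const& rhs)
{
    assert(lhs.size() == rhs.size());
    return sum_expr<L, R>{lhs, rhs};
}

// Evaluates the whole expression one element at a time, in a single loop.
template <typename T, typename E,
    typename = std::enable_if_t<is_operand<E>::value>>
std::vector<T>& operator+=(std::vector<T>& vec, E const& e)
{
    assert(vec.size() == e.size());
    for (std::size_t i = 0; i < vec.size(); ++i) {
        vec[i] += e[i];
    }
    return vec;
}
```

 With these in scope, A += B + C; compiles for std::vector operands and
runs as one loop without allocating. Plain assignment from an expression
would still need a named function, because operator= cannot be overloaded
outside the class.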

 Regards,
VZ
