[Taler] wirewatch update


From: Christian Grothoff
Subject: [Taler] wirewatch update
Date: Tue, 22 Jun 2021 19:57:39 +0200

Hi all,

Executive summary:
==================

I just got taler-exchange-wirewatch to import transactions into Postgres
at 50k transactions / second via the Taler bank REST API. To give
everybody an idea (also for the future) of how I did this, here is a
short write-up.


The fun part:
=============

The code change was a bit funky. What we used to do was update a
'reserve' table, which contains the current reserve balance, and a
'reserves_in' table with details about the transaction:

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
for (batch-of-individual-bank-transactions) {
  INSERT reserve (...) ON CONFLICT DO NOTHING;
  if (LAST_INSERT_STATUS == CONFLICT) {
    // Note: this virtually never happens in the benchmark
    INSERT reserves_in (...) ON CONFLICT DO NOTHING;
    if (LAST_INSERT_STATUS != CONFLICT) {
      // reserve existed, UPDATE instead of INSERT
      // Note: this virtually never happens in the benchmark
      SELECT reserve (old_balance);
      UPDATE reserve (...); // update balance
    }
  } else {
    INSERT reserves_in (...);
  }
}
COMMIT;

That reached roughly 6500 TPS, but running more transactions in parallel
caused serialization failures. I have now changed this to:

BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
for (batch-of-individual-bank-transactions) {
  INSERT reserve (...) ON CONFLICT DO NOTHING;
  conflict_reserve = (LAST_INSERT_STATUS == CONFLICT);
  INSERT reserves_in (...) ON CONFLICT DO NOTHING;
  conflict_reserve_in = (LAST_INSERT_STATUS == CONFLICT);
  if (conflict_reserve_in) {
    // 'assert(conflict_reserve)'
    continue; // transaction was already known, continue batch
  }
  if (conflict_reserve) {
    COMMIT;
    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    UPDATE reserve (...); // reserve existed, update balance
    COMMIT;
    // for rest of batch, go back to weaker isolation mode
    BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
  }
}
COMMIT;

OMG why?
========

Basically, the fast path is that the reserve is 'new' and we just do two
inserts. If the reserve already exists, we first check whether the
specific transaction is already known. If it is, we do nothing. But if
the transaction is new and the reserve already exists (and has a
balance), we need to update the balance.
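
To make the three paths concrete, here is a minimal Python sketch of the
control flow only. It does not talk to Postgres: two dicts stand in for
the 'reserve' and 'reserves_in' tables, and the names (Stats,
import_batch, wire_id, reserve_pub) are made up for illustration and are
not taken from the exchange code.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    fast_path: int = 0        # new reserve: just the two inserts
    duplicates: int = 0       # transaction already known: nothing to do
    balance_updates: int = 0  # reserve existed: balance must be updated


def import_batch(batch, reserves, reserves_in, stats):
    """batch is an iterable of (wire_id, reserve_pub, amount) tuples."""
    for wire_id, reserve_pub, amount in batch:
        # Stands in for: INSERT reserve (...) ON CONFLICT DO NOTHING
        conflict_reserve = reserve_pub in reserves
        if not conflict_reserve:
            reserves[reserve_pub] = amount
        # Stands in for: INSERT reserves_in (...) ON CONFLICT DO NOTHING
        conflict_reserve_in = wire_id in reserves_in
        if conflict_reserve_in:
            assert conflict_reserve
            stats.duplicates += 1
            continue  # transaction was already known, continue batch
        reserves_in[wire_id] = (reserve_pub, amount)
        if conflict_reserve:
            # In the real flow this is the short SERIALIZABLE transaction.
            reserves[reserve_pub] += amount
            stats.balance_updates += 1
        else:
            stats.fast_path += 1


reserves, reserves_in, stats = {}, {}, Stats()
batch = [(1, "A", 10), (2, "B", 5), (3, "A", 7), (1, "A", 10)]
import_batch(batch, reserves, reserves_in, stats)
print(stats, reserves)
```

In the benchmark almost every entry would land in the fast_path counter,
which is why the cost of the slow path hardly matters.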

The reason for switching to SERIALIZABLE is that 'READ COMMITTED' can
theoretically be insufficient here, as the UPDATE statement technically
combines a read+write, and so serializability violations (with another
transaction going in between the read+write) could break the balance
update. Since this path is highly atypical, using SERIALIZABLE here does
not matter for overall system performance.
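
The hazard in question is the generic "lost update". The following
Python sketch models it with plain integers and a fixed interleaving. It
is only an illustration of a read+write being split by another writer,
not a statement about what Postgres actually does for a given UPDATE;
the function names are made up.

```python
def run_interleaved(balance, credits):
    """All writers read the balance first, then each writes read + credit."""
    snapshots = [balance for _ in credits]  # every writer sees the old value
    for seen, credit in zip(snapshots, credits):
        balance = seen + credit  # a later write clobbers the earlier one
    return balance


def run_serial(balance, credits):
    """Each writer reads and writes before the next one starts."""
    for credit in credits:
        balance = balance + credit
    return balance


lost = run_interleaved(100, [10, 20])
correct = run_serial(100, [10, 20])
print(lost, correct)
```

With the interleaved schedule one of the two credits disappears, which
is exactly the kind of outcome the stronger isolation level is meant to
rule out.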

The rest we run inside of a READ COMMITTED transaction to batch insert
multiple values in one transaction (which is much more efficient than
running each INSERT in its own transaction), and as we _only_ write,
READ COMMITTED should be safe (basically all we need is to avoid
conflicting WRITEs, and the DB shouldn't allow UNIQUE invariants to be
broken regardless of the transaction model).
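
As a small stand-alone illustration of "many inserts, one transaction,
UNIQUE still holds", here is a sketch using Python's sqlite3 module.
SQLite is only a stand-in: it has no READ COMMITTED mode, and INSERT OR
IGNORE plays the role of ON CONFLICT DO NOTHING. The column names and
the helper functions are assumptions for the example, not the exchange
schema.

```python
import sqlite3


def make_db():
    # isolation_level=None: we issue BEGIN/COMMIT ourselves
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE reserves_in ("
        " wire_id INTEGER PRIMARY KEY,"
        " reserve_pub TEXT NOT NULL,"
        " amount INTEGER NOT NULL)"
    )
    return conn


def insert_batch(conn, rows):
    """Insert all rows in one transaction; return how many were new."""
    inserted = 0
    conn.execute("BEGIN")
    for row in rows:
        cur = conn.execute(
            "INSERT OR IGNORE INTO reserves_in VALUES (?, ?, ?)", row
        )
        inserted += cur.rowcount  # 0 if the row was already present
    conn.execute("COMMIT")
    return inserted


conn = make_db()
first = insert_batch(conn, [(1, "A", 10), (2, "B", 5)])
second = insert_batch(conn, [(2, "B", 5), (3, "C", 1)])
total = conn.execute("SELECT count(*) FROM reserves_in").fetchone()[0]
print(first, second, total)
```

The per-statement row count is what the pseudocode above calls
LAST_INSERT_STATUS: a count of zero means the insert hit a conflict.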

Now, I'm not entirely sure why using SERIALIZABLE for the outer
transaction drastically increases the number of serialization
violations, since as far as I can tell the final semantics are the same.
If someone here knows why this is, I'd love to hear an explanation ;-).


If you care to reproduce:
=========================

$ cd exchange.git/src/benchmark/
$ taler-bank-benchmark -c bank-benchmark.conf -p 500 -P 64 -r 130000 \
    -L WARNING -s 65000000 -K
-> prepares a bank with 65 M inbound reserve transactions 'in memory'
using 500 clients and 64 worker threads (yeah, don't run this on a small
system).
Do NOT press enter, that'll terminate the bank!

$ taler-exchange-dbinit -c bank-benchmark.conf -r # reset DB

# Now, run 'wirewatch' to benchmark:
$ taler-exchange-wirewatch -c bank-benchmark.conf -S 4096 -L WARNING

Alas, the above is sequential. So for good benchmarking, run a bunch in
parallel:

$ for n in `seq 1 24`; do timeout 60 taler-exchange-wirewatch \
    -c bank-benchmark.conf -S 4096 -L WARNING -t & done

After that, check how much work was done:
$ echo 'select count(*) from reserves_in;' | psql talercheck



Happy hacking!

Christian
