Re: [Taler] wirewatch update
From: Calvin Burns
Subject: Re: [Taler] wirewatch update
Date: Fri, 27 Aug 2021 18:05:43 +0000
On Tue, 06/22/2021 07:57:39 PM, Christian Grothoff wrote:
> Hi all,
>
> Executive summary:
> ==================
>
> I just got taler-exchange-wirewatch to import transactions into Postgres
> at 50k / second via the Taler bank REST API. To give everybody an idea
> (also for the future) how I did this, here is a short write-up.
>
>
> The fun part:
> =============
>
> The code change was a bit funky. What we used to do was update a
> 'reserve' table which contains the current reserve balance, and a
> 'reserve_in' table with details about the transaction:
>
> BEGIN TRANSACTION ISOLATION SERIALIZABLE;
> for (batch-of-individual-bank-transactions) {
>   INSERT reserve (...) ON CONFLICT DO NOTHING;
>   if (LAST_INSERT_STATUS == CONFLICT) {
>     // Note: this virtually never happens in the benchmark
>     INSERT reserves_in (...) ON CONFLICT DO NOTHING;
>     if (LAST_INSERT_STATUS != CONFLICT) {
>       // reserve existed, UPDATE instead of INSERT
>       // Note: this virtually never happens in the benchmark
>       SELECT reserve (old_balance);
>       UPDATE reserve (...) // update balance
>     }
>   } else {
>     INSERT reserves_in (...);
>   }
> }
> COMMIT;
>
> That got to about 6500 TPS, but running more transactions in parallel
> caused serialization failures. I have now changed this to:
>
> BEGIN TRANSACTION ISOLATION READ COMMITTED;
> for (batch-of-individual-bank-transactions) {
>   INSERT reserve (...) ON CONFLICT DO NOTHING;
>   conflict_reserve = (LAST_INSERT_STATUS == CONFLICT);
>   INSERT reserves_in (...) ON CONFLICT DO NOTHING;
>   conflict_reserve_in = (LAST_INSERT_STATUS == CONFLICT);
>   if (conflict_reserve_in) {
>     // 'assert(conflict_reserve)'
>     continue; // transaction was already known, continue batch
>   }
>   if (conflict_reserve) {
>     COMMIT;
>     BEGIN TRANSACTION ISOLATION SERIALIZABLE;
>     UPDATE reserve (...) // reserve existed, update balance
>     COMMIT;
>     // for rest of batch, go back to weaker isolation mode
>     BEGIN TRANSACTION ISOLATION READ COMMITTED;
>   }
> }
> COMMIT;
>
> OMG why?
> ========
>
> Basically, the fast path is that the reserve is 'new' and we just do 2
> inserts. If the reserve already exists, we first check whether the
> specific transaction is already known. If it is, we do nothing. But if
> the reserve exists (and has a balance) and the transaction is new, we
> need to update the balance.
>
> The reason for switching to SERIALIZABLE is that 'READ COMMITTED' can
> theoretically be insufficient here, as the UPDATE statement technically
> combines a read+write, so a serializability violation (another
> transaction slipping in between the read and the write) could break the
> balance update. Since this path is highly atypical, going SERIALIZABLE
> here doesn't matter for system performance.
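(A concrete illustration of the hazard, with invented names against SQLite:
when the read and the write are separate statements, a writer slipping in
between them gets silently overwritten; folding the arithmetic into a single
UPDATE composes correctly. The interleaving is simulated sequentially on one
connection, so this shows the shape of the bug, not an actual concurrent run.)

```python
import sqlite3

db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE reserve (pub TEXT PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO reserve VALUES ('r1', 0)")

# Pattern A: read, then write back a value computed in the application.
# If another client's deposit lands between the SELECT and the UPDATE,
# the stale 'old' clobbers it.
old = db.execute("SELECT balance FROM reserve WHERE pub='r1'").fetchone()[0]
db.execute("UPDATE reserve SET balance=? WHERE pub='r1'", (old + 10,))  # client 1
db.execute("UPDATE reserve SET balance=? WHERE pub='r1'", (old + 5,))   # client 2, stale read
print(db.execute("SELECT balance FROM reserve WHERE pub='r1'").fetchone()[0])  # 5, not 15

# Pattern B: one statement with in-place arithmetic; each UPDATE operates
# on the row's current value, so concurrent deposits compose.
db.execute("UPDATE reserve SET balance=0 WHERE pub='r1'")
db.execute("UPDATE reserve SET balance = balance + 10 WHERE pub='r1'")
db.execute("UPDATE reserve SET balance = balance + 5  WHERE pub='r1'")
print(db.execute("SELECT balance FROM reserve WHERE pub='r1'").fetchone()[0])  # 15
```

(In Postgres, the single in-place UPDATE of pattern B takes a row lock and
re-reads the latest committed row even under READ COMMITTED; it is the split
read+write of pattern A that needs the stronger isolation level.)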
>
> The rest we run inside of a READ COMMITTED transaction to batch insert
> multiple values in one transaction (which is much more efficient than
> running each INSERT in its own transaction), and as we _only_ write,
> READ COMMITTED should be safe (basically all we need is to avoid
> conflicting WRITEs, and the DB shouldn't allow UNIQUE invariants to be
> broken regardless of the transaction model).
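(The batching effect can be sketched with SQLite and invented names: one
COMMIT per batch instead of one per row. With Postgres the same shape also
saves per-transaction overhead, and the UNIQUE constraint keeps duplicate
rows out regardless of isolation level.)

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE reserves_in (wire_ref TEXT PRIMARY KEY, amount INTEGER)")

batch = [(f"tx-{i}", i) for i in range(4096)]

# one transaction around the whole batch; the context manager commits on exit,
# so there is a single durable write instead of 4096 of them
with db:
    db.executemany(
        "INSERT INTO reserves_in(wire_ref, amount) VALUES(?, ?) "
        "ON CONFLICT(wire_ref) DO NOTHING",
        batch)

print(db.execute("SELECT count(*) FROM reserves_in").fetchone()[0])  # 4096
```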
>
> Now, I'm not entirely sure why using SERIALIZABLE for the outer
> transaction drastically increases the number of serialization violations
> -- as far as I can tell, the final semantics are the same.
Perhaps Postgres uses (for efficiency) criteria that are sufficient but
not necessary for declaring a transaction graph "not serializable"
[2, §3.3.1], so there may be cases with a lot of false positives.

[1] https://www.postgresql.org/docs/13/transaction-iso.html
[2] arXiv:1208.4179v1
> If someone here knows why this is, I'd love to hear an explanation ;-).
>
>
> If you care to reproduce:
> =========================
>
> $ cd exchange.git/src/benchmark/
> $ taler-bank-benchmark -c bank-benchmark.conf -p 500 -P 64 -r 130000 \
>     -L WARNING -s 65000000 -K
> -> prepares a bank with 65 M inbound reserve transactions 'in memory'
> using 500 clients and 64 worker threads (yeah, don't run this on a small
> system).
> Do NOT press enter, that'll terminate the bank!
>
> $ taler-exchange-dbinit -c bank-benchmark.conf -r # reset DB
>
> # Now, run 'wirewatch' to benchmark:
> $ taler-exchange-wirewatch -c bank-benchmark.conf -S 4096 -L WARNING
>
> Alas, the above is sequential. So for good benchmarking, run a bunch in
> parallel:
>
> $ for n in `seq 1 24`; do timeout 60 taler-exchange-wirewatch \
>     -c bank-benchmark.conf -S 4096 -L WARNING -t & done
>
> After that, check how much work was done:
> $ echo 'select count(*) from reserves_in;' | psql talercheck
>
>
>
> Happy hacking!
>
> Christian
>