[RFC 00/24] backup performance: block_status + async
From: Vladimir Sementsov-Ogievskiy
Subject: [RFC 00/24] backup performance: block_status + async
Date: Fri, 15 Nov 2019 17:14:20 +0300
Hi all!
This series does the following things:
1. bring block_status to block-copy, for efficient chunk sizes and
handling ZERO clusters. (mirror does it)
2. bring aio-task-pool to block-copy, for parallel copying loop
iteration. (mirror does it its own way)
4. add speed limit and cancelling possibility to block-copy (for 5.)
5. use block-copy in backup directly, to bring block_status and async
tasks into backup
6. add some python scripts to benchmark the results
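To illustrate item 1, here is a minimal Python sketch of how block-status
information could drive chunking. It is not code from the series: the Status
enum, the extent tuples and plan_copy() are hypothetical names, and max_chunk
only borrows its name from patch 16. The assumption is that a ZERO extent can
be handled with a single write-zeroes operation, while DATA extents are split
into chunks of at most max_chunk bytes.

```python
from enum import Enum
from typing import Iterator, List, Tuple


class Status(Enum):
    """Hypothetical per-extent status, loosely modelled on block_status."""
    DATA = "data"
    ZERO = "zero"


# (offset, length, status); a made-up representation for this sketch
Extent = Tuple[int, int, Status]


def plan_copy(extents: List[Extent],
              max_chunk: int) -> Iterator[Tuple[str, int, int]]:
    """Yield (action, offset, length) operations for the given extents."""
    for offset, length, status in extents:
        if status is Status.ZERO:
            # Assumption: a zero extent needs no read, only one zero write.
            yield ("write_zeroes", offset, length)
            continue
        # DATA extent: split into chunks no larger than max_chunk.
        end = offset + length
        while offset < end:
            n = min(max_chunk, end - offset)
            yield ("copy", offset, n)
            offset += n


if __name__ == "__main__":
    extents = [(0, 100, Status.DATA), (100, 50, Status.ZERO)]
    for op in plan_copy(extents, max_chunk=64):
        print(op)
```

The point of the sketch is only that chunk boundaries follow the extents
reported by the status query instead of a fixed cluster grid.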
The main theme is async handling of copying loop iterations, which
already works and improves performance in mirror (in its own way) and in
qcow2 (using aio-task-pool).
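A rough Python sketch of the idea, with a bounded number of copy iterations
in flight. This is an asyncio analogy only, not the aio-task-pool API:
copy_chunk(), copy_parallel() and the in-memory buffers are hypothetical, and
max_workers just borrows its name from patch 16.

```python
import asyncio


async def copy_chunk(src: bytes, dst: bytearray,
                     offset: int, length: int) -> None:
    """Copy one chunk; the sleep(0) stands in for real asynchronous I/O."""
    await asyncio.sleep(0)
    dst[offset:offset + length] = src[offset:offset + length]


async def copy_parallel(src: bytes, dst: bytearray,
                        chunk: int, max_workers: int) -> int:
    """Copy src into dst chunk by chunk with at most max_workers in flight.

    Returns the peak number of chunk copies that were in flight at once.
    """
    sem = asyncio.Semaphore(max_workers)
    in_flight = 0
    peak = 0

    async def worker(offset: int, length: int) -> None:
        nonlocal in_flight, peak
        async with sem:  # limits concurrency to max_workers
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                await copy_chunk(src, dst, offset, length)
            finally:
                in_flight -= 1

    tasks = [worker(off, min(chunk, len(src) - off))
             for off in range(0, len(src), chunk)]
    await asyncio.gather(*tasks)
    return peak


if __name__ == "__main__":
    source = bytes(range(256)) * 4
    target = bytearray(len(source))
    print(asyncio.run(copy_parallel(source, target, chunk=100, max_workers=4)))
```

With a sequential loop each iteration waits for the previous one to finish;
here up to max_workers iterations overlap, which is where the gain on
high-latency targets would be expected to come from.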
Here are the results:
----------  -------------  -----------------  -------------  -----------------  -------------
            backup-old     backup-old(no CR)  backup-new     backup-new(no CR)  mirror
ssd -> ssd  9.88 +- 0.85   8.85 +- 0.48       5.39 +- 0.04   4.06 +- 0.01       4.15 +- 0.03
ssd -> hdd  10.90 +- 0.30  10.39 +- 0.41      9.36 +- 0.06   9.24 +- 0.06       9.00 +- 0.06
hdd -> hdd  20.09 +- 0.23  20.15 +- 0.07      48.65 +- 1.86  20.62 +- 0.08      19.82 +- 0.37
----------  -------------  -----------------  -------------  -----------------  -------------
----------  -------------  -------------  -------------
            backup-old     backup-new     mirror
nbd -> ssd  30.69 +- 0.23  9.02 +- 0.00   9.06 +- 0.03
ssd -> nbd  36.94 +- 0.01  11.50 +- 0.08  10.12 +- 0.05
----------  -------------  -------------  -------------
Here:
"old" means "before series"
"new" means "after series"
"no CR" means "copy_range disabled"
"nbd" is an NBD server on another node, run as
"qemu-nbd --persistent --nocache -p 10810 ones1000M-source"
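For reading the "+-" columns: a minimal sketch of how such a figure could be
produced from repeated runs. It is not the simplebench.py from this series;
bench() and fmt() are hypothetical helpers, and treating the "+-" part as a
sample standard deviation is an assumption (the series may define it
differently).

```python
import statistics
import time
from typing import Callable, Tuple


def bench(func: Callable[[], None], runs: int = 3) -> Tuple[float, float]:
    """Run func several times; return (mean, sample stdev) of wall time."""
    if runs < 2:
        raise ValueError("need at least two runs to estimate a deviation")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)


def fmt(mean: float, dev: float) -> str:
    """Format a result in the same 'mean +- dev' style as the tables."""
    return f"{mean:.2f} +- {dev:.2f}"


if __name__ == "__main__":
    # Toy workload standing in for one backup or mirror run.
    print(fmt(*bench(lambda: sum(range(100000)), runs=5)))
```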
RFC.1
What I noticed is that copy_range makes things worse. Is there any
case or benchmark which shows that copy_range improves performance?
Possibly we should disable it by default.
RFC.2
The last patch isn't for commit. Possibly I should make some generic
example; ideas are welcome.
RFC.3
The series is big and splittable. The reason for sending it all together
is that I wanted to get the whole picture to benchmark.
Also, I have in mind benchmarking backup to qcow2 with different levels
of defragmentation and with/without compression.
The future plan is obvious: move block-commit and block-stream to use
block-copy, which will unify the code path and bring the same performance
gains to commit and stream.
Vladimir Sementsov-Ogievskiy (24):
block/block-copy: specialcase first copy_range request
block/block-copy: use block_status
block/block-copy: factor out block_copy_find_inflight_req
block/block-copy: refactor interfaces to use bytes instead of end
block/block-copy: rename start to offset in interfaces
block/block-copy: reduce intersecting request lock
block/block-copy: hide structure definitions
block/block-copy: rename in-flight requests to tasks
block/block-copy: alloc task on each iteration
block/block-copy: add state pointer to BlockCopyTask
block/block-copy: move task size initial calculation to _task_create
block/block-copy: move block_copy_task_create down
block/block-copy: use aio-task-pool API
block/block-copy: More explicit call_state
block/block-copy: implement block_copy_async
block/block-copy: add max_chunk and max_workers paramters
block/block-copy: add ratelimit to block-copy
block/block-copy: add block_copy_cancel
blockjob: add set_speed to BlockJobDriver
job: call job_enter from job_user_pause
backup: move to block-copy
python: add simplebench.py
python: add qemu/bench_block_job.py
python: benchmark new backup architecture
qapi/block-core.json | 9 +-
include/block/block-copy.h | 90 ++---
include/block/block_int.h | 7 +
include/block/blockjob_int.h | 2 +
block/backup-top.c | 6 +-
block/backup.c | 184 ++++++----
block/block-copy.c | 608 ++++++++++++++++++++++++++++-----
block/replication.c | 1 +
blockdev.c | 5 +
blockjob.c | 6 +
job.c | 1 +
block/trace-events | 1 +
python/bench-example.py | 93 +++++
python/qemu/bench_block_job.py | 114 +++++++
python/simplebench.py | 122 +++++++
tests/qemu-iotests/129 | 3 +-
tests/qemu-iotests/185 | 3 +-
tests/qemu-iotests/219 | 1 +
tests/qemu-iotests/257 | 1 +
tests/qemu-iotests/257.out | 306 ++++++++---------
20 files changed, 1184 insertions(+), 379 deletions(-)
create mode 100755 python/bench-example.py
create mode 100755 python/qemu/bench_block_job.py
create mode 100644 python/simplebench.py
--
2.21.0