[Qemu-devel] Block layer meeting notes from June 10/11
From: Stefan Hajnoczi
Date: Fri, 13 Jun 2014 21:28:28 +0800
Kevin, Markus, Benoit, and I recently had the opportunity to meet
face-to-face to discuss ongoing design challenges in the QEMU block
layer. I am sharing the meeting notes in this email. Questions and
comments welcome!
These notes are somewhat sparse. If you want full details, please add
an agenda item to the QEMU Community Call so we can have a live
discussion during the next call. Or just reply to this email thread.
The general theme is that the "node name" concept is being introduced
into the QMP commands. Node names allow the client to operate on specific
nodes in a BlockDriverState graph, not just the "drive" or root node.
This is much more powerful but also adds complexity. We need to make
sure the operations are safe, make sense, and do not destroy data.
1. I/O throttling groups (multiple disks sharing I/O throttling quota)
* BDSes already have multiple children; do they need multiple parents
as well (and a way to distinguish them)?
Action:
* Add bpsgroup=<name> property to -drive [Benoit]
* First drive using bpsgroup= can set the initial group bps/iops values
* Successive drives will get EBUSY if they try to set group bps/iops values
* throttle_set_limits on drive0 linked to a group will update the
group bps/iops values
* When last drive using group is deleted, the group is destroyed too
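The group semantics listed above can be sketched as a toy model. This is only an illustration in Python, not QEMU code: the class name, method names, and the shape of the limits dict are all assumptions made for the example.

```python
import errno


class ThrottleGroupRegistry:
    """Toy registry of named throttle groups shared by several drives."""

    def __init__(self):
        # group name -> {"limits": dict, "members": set of drive names}
        self._groups = {}

    def attach(self, drive, group, limits=None):
        """Attach a drive; only the drive that creates the group may pass limits."""
        entry = self._groups.get(group)
        if entry is None:
            # First drive using the group sets the initial values.
            entry = {"limits": dict(limits or {}), "members": set()}
            self._groups[group] = entry
        elif limits:
            # A successive drive tried to set group values at attach time.
            raise OSError(errno.EBUSY, "group limits already set")
        entry["members"].add(drive)

    def set_limits(self, drive, group, limits):
        """Model of throttle_set_limits on a member: updates the whole group."""
        entry = self._groups[group]
        if drive not in entry["members"]:
            raise KeyError(drive)
        entry["limits"].update(limits)

    def detach(self, drive, group):
        """Remove a drive; the group is destroyed with its last member."""
        entry = self._groups[group]
        entry["members"].discard(drive)
        if not entry["members"]:
            del self._groups[group]

    def limits(self, group):
        return dict(self._groups[group]["limits"])

    def exists(self, group):
        return group in self._groups
```

In this sketch a drive that fails with EBUSY is not attached, so it can retry without limits and then adjust the group through set_limits.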
2. Dynamic graph reconfiguration (e.g. adding block filters, taking
snapshots, etc.)
* Where does the new node get inserted and how to specify how it is
linked up with the existing nodes?
* On a given "arrow" between two nodes (only works with 1 child, 1 parent)
* On a given set of arrows (possibly more complex than what is really needed?)
* How does removing a node work with more than one child of the deleted node?
* Keep using the existing QMP command for I/O throttling for now,
until we understand the general problem reasonably well
Action:
* Figure out the general problem
* Split I/O throttling off into own BDS [Benoît]
* Requires some care with snapshots etc.
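To make the "insert on an arrow" idea concrete, here is a minimal Python sketch of the simple case (1 child, 1 parent). The Node class and both helper functions are hypothetical and exist only for this illustration. Removal deliberately refuses a node with more than one child, since the notes leave that case open.

```python
class Node:
    """Toy graph node with named child links (e.g. "file", "backing")."""

    def __init__(self, name):
        self.name = name
        self.children = {}  # link name -> child Node


def insert_on_arrow(parent, link, new_node, new_link="file"):
    """Insert new_node on the arrow parent --link--> child."""
    child = parent.children[link]
    new_node.children[new_link] = child
    parent.children[link] = new_node


def remove_from_arrow(parent, link):
    """Remove the node below parent; only defined if it has exactly one child."""
    node = parent.children[link]
    if len(node.children) != 1:
        raise ValueError(
            "ambiguous: node %r has %d children" % (node.name, len(node.children))
        )
    (only_child,) = node.children.values()
    parent.children[link] = only_child
    return node
```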
3. Proper specification for blockdev-add
* What (magic) does -drive add that blockdev-add is missing?
* Filename parsing and protocol detection
* Format probing
* Desugaring -drive
* What does BDRV_O_PROTOCOL mean?
* Disable format probing
* Parse protocol name from filename (but not from options QDict)
* Then put filename into options QDict
* Set bs->growable
* Disable adding of a bs->file layer
* Ignore BDRV_O_SNAPSHOT
* Which callers need which of these properties?
Action:
* Convert network block drivers to QDict options (keep legacy
filename parsing for compatibility) [volunteer?]
* Add network block drivers to blockdev-add [volunteer?]
* Translate bdrv_open() arguments into options qdict, if appropriate [Kevin]
* Translate legacy "filename" to qdict
* Specify bdrv_open() behavior (especially magic) [Kevin]
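The "translate legacy filename to qdict" step might look roughly like the following Python sketch. It is a simplification and not the actual bdrv_open() logic: the function name, the small protocol set, and the choice to keep an explicitly given "driver" option are assumptions made for the example.

```python
# Illustrative subset of protocol prefixes; an assumption for this sketch.
KNOWN_PROTOCOLS = {"file", "nbd", "http"}


def desugar_filename(filename, options=None):
    """Return a new options dict with the legacy filename folded in."""
    opts = dict(options or {})
    if "filename" in opts:
        raise ValueError("filename given both as argument and as option")
    # The protocol name is parsed from the filename only, never from options.
    prefix, sep, _rest = filename.partition(":")
    if sep and prefix in KNOWN_PROTOCOLS:
        protocol = prefix
    else:
        protocol = "file"
    # Keep an explicit driver choice if the caller already made one.
    opts.setdefault("driver", protocol)
    # Then put the filename into the options dict.
    opts["filename"] = filename
    return opts
```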
4. BDS graph rules and manipulating arbitrary nodes
* A proper design: iterate children, safely manipulate graph
Action:
* Get rid of bdrv_swap() and update child/parent pointers instead
(depends on BlockBackend) [Markus?]
* Add notifier list to BDS so users can get updated when pointer changes:
bdrv_register_bs_pointer(bs, &mystruct->bs) /* automatically refresh pointer */
* Add parents and child list to BlockDriverState (could be realloc
array or just a function interface that operates on
->file/->backing_hd) [nice to have]
* Audit drivers
* Especially VMDK and quorum
* Make them use generic child interface
* Use child list where generic block layer currently hardcodes
->backing_hd and ->file
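As a rough picture of "update child/parent pointers instead of bdrv_swap()", the Python sketch below keeps a parent list and a watcher list per node and rewires both when a node is replaced. Python has no raw pointers, so the registered pointer is modelled as an (owner, attribute) pair. All names here are hypothetical, including register_pointer, which only mimics the idea behind the proposed bdrv_register_bs_pointer().

```python
class Node:
    """Toy node that tracks both its children and its parents."""

    def __init__(self, name):
        self.name = name
        self.children = {}  # link name -> child Node
        self.parents = []   # list of (parent Node, link name)
        self.watchers = []  # list of (owner object, attribute name)


def set_child(parent, link, child):
    """Point parent's link at child and keep the parent lists consistent."""
    old = parent.children.get(link)
    if old is not None:
        old.parents.remove((parent, link))
    parent.children[link] = child
    child.parents.append((parent, link))


def register_pointer(node, owner, attr):
    """Store node in owner.attr and refresh it whenever node is replaced."""
    setattr(owner, attr, node)
    node.watchers.append((owner, attr))


def replace_node(old, new):
    """Make every parent and every registered user refer to new instead of old."""
    for parent, link in list(old.parents):
        set_child(parent, link, new)
    for owner, attr in old.watchers:
        setattr(owner, attr, new)
    new.watchers.extend(old.watchers)
    old.watchers = []
```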
* Mutual exclusion of operations/background jobs (bs->in_use / BlockOpType)
* Streaming in two different parts of the backing chain - allowed?
(Benoît thought not, but does anything break?)
* Does streaming only require that streamed images stay read-only
(i.e. backing chain segment on which the operation is performed)?
* Live commit in the opposite direction at the same time?
Action:
* Draw up matrix of operations (mirror, stream, resize, etc)
* Make op blocker mechanism use matrix as data instead of code
(define an array)
* Enforce that new QMP/QAPI commands and block jobs add themselves to
the matrix
* Recursively add blockers to child nodes (driver method?) [Benoit]
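A data-driven blocker matrix with recursive application to child nodes could be sketched as below. This is a Python toy, not the bs->in_use or BlockOpType code. The conflict entries are placeholders only, since the notes leave the real answers open, and every name in the block is an assumption.

```python
OPS = ("mirror", "stream", "commit", "resize")

# Placeholder matrix: each unordered pair listed here is ASSUMED to conflict.
# The real entries are exactly what the action item above asks to work out.
CONFLICTS = {
    frozenset(("stream", "commit")),
    frozenset(("stream",)),          # stream vs. stream
    frozenset(("resize", "mirror")),
}


def conflicts(op_a, op_b):
    """Look the pair up in the matrix; unknown operations are rejected."""
    for op in (op_a, op_b):
        if op not in OPS:
            raise KeyError("operation %r is not in the matrix" % op)
    return frozenset((op_a, op_b)) in CONFLICTS


class Node:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.active_ops = []


def walk(node):
    """Yield node and all of its descendants."""
    yield node
    for child in node.children:
        yield from walk(child)


def try_start(node, op):
    """Start op on node and all descendants, or fail without side effects."""
    subtree = list(walk(node))
    for n in subtree:
        for running in n.active_ops:
            if conflicts(op, running):
                return False
    for n in subtree:
        n.active_ops.append(op)
    return True
```

Keeping the matrix as plain data means a new command has to be listed in OPS before it can run at all, which is one way to enforce the "add themselves to the matrix" rule.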
* Arbitrary nodes
* drive-mirror of arbitrary node
* block-stream of arbitrary node
* Jeff Cody's block-commit of arbitrary node patch series
Action:
* Add base-nodename argument to block-stream command [Benoit]
* Add top-nodename argument to block-stream command [Benoit]
* If command can modify part of a backing chain, need to add option
to update the parent's backing filename field on disk!
* Add optional backing-filename argument (since libvirt may use fd
passing and QEMU's filename is useless)
* Add boolean whether to update backing file (for users who don't
need to override backing filename)
* drive-mirror (block-mirror) of arbitrary node [Benoit]
* Deprecate filename references in QMP commands in favour of node
names (e.g. streaming base) [Jeff?]
5. BlockBackend split
Action:
* Split BlockBackend from BlockDriverState [Markus]
* bdrv_new+bdrv_open and bdrv_close+bdrv_unref should be the same,
eliminate ENOMEDIUM semantics
* Make block driver private embedded in BlockDriverState instead of
opaque pointer
6. Dataplane programming model recap
* What do block drivers need to be careful of?
* Any comments on new docs/dataplane.txt documentation?
Action:
* AioContext assertions to prevent callbacks in wrong event loop [Stefan]
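The kind of assertion meant here can be illustrated with a small Python toy that uses threads in place of event loops. It does not reflect the real AioContext API. ToyAioContext, assert_in_context, and completion_callback are hypothetical names invented for the example.

```python
import threading


class ToyAioContext:
    """Toy event-loop context that remembers which thread runs it."""

    def __init__(self):
        # Assume the creating thread is the one running this event loop.
        self._owner = threading.get_ident()

    def assert_in_context(self):
        """Fail loudly if called from a thread other than the owner."""
        if threading.get_ident() != self._owner:
            raise AssertionError("callback invoked outside its event loop")


def completion_callback(ctx, results):
    """Example callback that checks its context before touching state."""
    ctx.assert_in_context()
    results.append("done")
```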