Re: [Qemu-devel] [PATCH v2 1/1] NBD proto: add WRITE_ZEROES extension
From: Alex Bligh
Subject: Re: [Qemu-devel] [PATCH v2 1/1] NBD proto: add WRITE_ZEROES extension
Date: Thu, 31 Mar 2016 14:53:22 +0100
On 31 Mar 2016, at 14:02, Denis V. Lunev <address@hidden> wrote:
> From: Pavel Borzenkov <address@hidden>
>
> There exist some cases when a client knows that the data it is going to
> write is all zeroes. Such cases include mirroring or backing up a device
> implemented by a sparse file.
Useful.
> -- bit 0, `NBD_CMD_FLAG_FUA`; valid during `NBD_CMD_WRITE`. SHOULD be
> - set to 1 if the client requires "Force Unit Access" mode of
> - operation. MUST NOT be set unless transmission flags included
> - `NBD_FLAG_SEND_FUA`.
> +- bit 0, `NBD_CMD_FLAG_FUA`; valid during `NBD_CMD_WRITE` and
> + `NBD_CMD_WRITE_ZEROES` commands. SHOULD be set to 1 if the client requires
> + "Force Unit Access" mode of operation. MUST NOT be set unless transmission
> + flags included `NBD_FLAG_SEND_FUA`.
Not your fault, but this should actually say "unless export flags
included". Transmission flags would be the flags with the command.
> +- bit 1, `NBD_CMD_MAY_TRIM`; defined by the experimental `WRITE_ZEROES`
> + extension; see below.
For consistency, it would probably be useful to add here:
MUST NOT be set unless the export flags include `NBD_FLAG_SEND_WRITE_ZEROES`.
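As a rough illustration of that rule, here is a minimal Python sketch of a client-side check made before the request is built. The command number (6) and the two command-flag bits are taken from the patch. The bit used for `NBD_FLAG_SEND_WRITE_ZEROES` is an assumption, since the quoted text does not give it, and the function name is hypothetical. The header layout (magic, command flags, type, handle, offset, length, all big-endian) is the usual NBD request format as far as I know, so treat it as an assumption too.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513
NBD_CMD_WRITE_ZEROES = 6              # from the patch
NBD_CMD_FLAG_FUA = 1 << 0             # command flag bit 0, from the patch
NBD_CMD_FLAG_MAY_TRIM = 1 << 1        # command flag bit 1, from the patch
NBD_FLAG_SEND_FUA = 1 << 3            # existing export flag
NBD_FLAG_SEND_WRITE_ZEROES = 1 << 6   # ASSUMPTION: bit not given in the quoted text


def build_write_zeroes_request(export_flags, handle, offset, length,
                               fua=False, may_trim=False):
    """Hypothetical helper: validate against export flags, then pack the header."""
    if not export_flags & NBD_FLAG_SEND_WRITE_ZEROES:
        raise ValueError("server did not advertise NBD_FLAG_SEND_WRITE_ZEROES")
    cmd_flags = 0
    if fua:
        if not export_flags & NBD_FLAG_SEND_FUA:
            raise ValueError("server did not advertise NBD_FLAG_SEND_FUA")
        cmd_flags |= NBD_CMD_FLAG_FUA
    if may_trim:
        # Only reachable when WRITE_ZEROES was advertised (checked above).
        cmd_flags |= NBD_CMD_FLAG_MAY_TRIM
    # No payload follows: the header alone describes the range to zero.
    return struct.pack(">IHHQQI", NBD_REQUEST_MAGIC, cmd_flags,
                       NBD_CMD_WRITE_ZEROES, handle, offset, length)
```

The point of the check is that the new command flag is never sent to a server that has not opted in, mirroring the existing wording for `NBD_CMD_FLAG_FUA`.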
>
> #### Request types
>
> @@ -523,6 +528,10 @@ The following request types exist:
> A client MUST NOT send a trim request unless `NBD_FLAG_SEND_TRIM`
> was set in the transmission flags field.
>
> +* `NBD_CMD_WRITE_ZEROES` (6)
> +
> + Defined by the experimental `WRITE_ZEROES` extension; see below.
> +
> * Other requests
>
> Some third-party implementations may require additional protocol
> @@ -654,6 +663,53 @@ option reply type.
> message if they do not also send it as a reply to the
> `NBD_OPT_SELECT` message.
>
> +### `WRITE_ZEROES` extension
> +
> +There exist some cases when a client knows that the data it is going to write
> +is all zeroes. Such cases include mirroring or backing up a device implemented
> +by a sparse file. With current NBD command set, the client has to issue
> +`NBD_CMD_WRITE` command with zeroed payload and transfer these zero bytes
> +through the wire. The server has to write the data onto disk, effectively
> +losing the sparseness.
> +
> +To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
> +one new command and one new command flag.
> +
> +* `NBD_CMD_WRITE_ZEROES` (6)
> +
> + A write request with no payload. Length and offset define the location
> + and amount of data to be zeroed.
> +
> + The server MUST zero out the data on disk, and then send the reply
> + message. The server MAY send the reply message before the data has
> + reached permanent storage.
> +
> + A client MUST NOT send a write zeroes request unless
> + `NBD_FLAG_SEND_WRITE_ZEROES` was set in the transmission flags field.
> +
> + If the `NBD_FLAG_SEND_FUA` flag was set in the transmission flags field,
> + the client MAY set the flag `NBD_CMD_FLAG_FUA` in the command flags field.
> + If this flag was set, the server MUST NOT send the reply until it has
> + ensured that the newly-zeroed data has reached permanent storage.
> +
> + If the flag `NBD_CMD_FLAG_MAY_TRIM` was set by the client in the command
> + flags field, the server MAY use trimming to zero out the area, but it
> + MUST ensure that the data reads back as zero.
> +
Can you give an example of a situation where the client would not set this
flag and it would be undesirable for the server to create a 'hole' using
'trim'-type technology, even though the client didn't ask for it?
I suspect there are already some backends (e.g. Ceph behind qemu-nbd) which
will effectively do a 'trim' if you write 4k of zeroes, even under
current circumstances.
I.e., why not always permit trimming, PROVIDED the data always reads back
as zero? This would be far simpler.
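To make the "always permit trimming" idea concrete, here is a toy Python sketch of a backend that stores only allocated blocks. It is not taken from any real server; the class name and block size are hypothetical. Whole blocks inside the zeroed range are dropped (the 'trim'), partial blocks are zeroed in place, and a missing block always reads back as zero.

```python
BLOCK_SIZE = 4096  # hypothetical allocation unit


class ToySparseBackend:
    """In-memory stand-in for a sparse file: a missing block is a hole."""

    def __init__(self):
        self.blocks = {}  # block index -> bytearray(BLOCK_SIZE)

    def read(self, offset, length):
        out = bytearray()
        pos, end = offset, offset + length
        while pos < end:
            index, start = divmod(pos, BLOCK_SIZE)
            n = min(BLOCK_SIZE - start, end - pos)
            block = self.blocks.get(index)
            # Holes read back as zeroes.
            out += block[start:start + n] if block is not None else bytes(n)
            pos += n
        return bytes(out)

    def write(self, offset, data):
        pos = 0
        while pos < len(data):
            index, start = divmod(offset + pos, BLOCK_SIZE)
            n = min(BLOCK_SIZE - start, len(data) - pos)
            block = self.blocks.setdefault(index, bytearray(BLOCK_SIZE))
            block[start:start + n] = data[pos:pos + n]
            pos += n

    def write_zeroes(self, offset, length):
        pos, end = offset, offset + length
        while pos < end:
            index, start = divmod(pos, BLOCK_SIZE)
            n = min(BLOCK_SIZE - start, end - pos)
            if n == BLOCK_SIZE:
                self.blocks.pop(index, None)  # whole block: deallocate it
            elif index in self.blocks:
                self.blocks[index][start:start + n] = bytes(n)  # partial block
            pos += n
```

From the client's side the result is the same whether or not a hole was created, which is why a separate "may trim" flag looks unnecessary as long as the read-back guarantee holds.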
--
Alex Bligh