Re: [PATCH RFC 0/3] hw/block/nvme: dif-based end-to-end data protection
From: Klaus Jensen
Subject: Re: [PATCH RFC 0/3] hw/block/nvme: dif-based end-to-end data protection support
Date: Fri, 18 Dec 2020 10:39:01 +0100
On Dec 17 13:14, Keith Busch wrote:
> On Thu, Dec 17, 2020 at 10:02:19PM +0100, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> >
> > This series adds support for extended LBAs and end-to-end data
> > protection. Marked RFC, since there are a bunch of issues that could use
> > some discussion.
> >
> > Storing metadata bytes contiguously with the logical block data and
> > creating a physically extended logical block basically breaks the DULBE
> > and deallocation support I just added. Formatting a namespace with
> > protection information requires the app- and reftags of deallocated or
> > unwritten blocks to be 0xffff and 0xffffffff respectively; this could be
> > used to reintroduce DULBE support in that case, albeit at a somewhat
> > higher cost than the block status flag-based approach.
> >
> > There are basically three ways of storing metadata (and maybe a fourth,
> > but that is probably quite the endeavour):
> >
> > 1. Storing metadata as extended blocks directly on the blockdev. That
> > is the approach used in this RFC.
> >
> > 2. Use a separate blockdev. Incidentally, this is also the easiest
> > and most straightforward solution to support MPTR-based "separate
> > metadata". This also allows DULBE and block deallocation to be
> > supported using the existing approach.
> >
> > 3. A hybrid of 1 and 2 where the metadata is stored contiguously at
> > the end of the nvme-ns blockdev.
> >
> > Option 1 obviously works well with DIF-based protection information and
> > extended LBAs since it maps one to one. Option 2 works flawlessly with
> > MPTR-based metadata, but extended LBAs can be "emulated" at the cost of
> > a bunch of scatter/gather operations.
>
> Are there any actual users of extended metadata that we care about? I'm
> aware of only a few niche places that can even access an extended
> metadata format. There's no kernel support in any major OS that I know
> of.
>
Yes, there are definitely actual users in enterprise storage. But the
main use case here is testing (using extended LBAs with SPDK for
instance).

> Option 2 sounds fine.
>
> If option 3 means that you're still using MPTR, but just sequester space
> at the end of the backing block device for meta-data purposes, then that
> is fine too. You can even resize it dynamically if you want to support
> different metadata sizes.

Heh, I tend to think that my English vocabulary is pretty decent but I
had to look up 'sequester'. I just learned a new word today \o/

Yes. I actually also like option 3. Technically option 2 does not break
image interoperability between devices (ide, virtio), but you would
leave out a bunch of metadata that your application might depend on, so
I don't see any way to not break interoperability really.

I would then be just fine with "emulating" extended LBAs at the cost of
more I/O, because I would like the device to support that mode of
operation as well. We have not implemented this, but my gut feeling says
that it can be done.
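
A minimal sketch of what that emulation could look like for option 3,
assuming all data blocks come first and the per-block metadata is packed
contiguously after them. Every name here (ns_layout, ext_offset,
data_offset, meta_offset, read_extended) is hypothetical and not taken from
the RFC or from QEMU, and a flat in-memory buffer stands in for the
blockdev:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout descriptor; not QEMU's namespace structure. */
typedef struct {
    uint64_t nlb;   /* number of logical blocks in the namespace */
    uint32_t lbads; /* data bytes per logical block, e.g. 512 */
    uint32_t ms;    /* metadata bytes per logical block, e.g. 8 */
} ns_layout;

/* Option 1: data and metadata stored together as one extended block. */
static uint64_t ext_offset(const ns_layout *l, uint64_t lba)
{
    return lba * ((uint64_t)l->lbads + l->ms);
}

/* Option 3: plain data blocks at the start of the image ... */
static uint64_t data_offset(const ns_layout *l, uint64_t lba)
{
    return lba * l->lbads;
}

/* ... and the metadata region sequestered after the last data block. */
static uint64_t meta_offset(const ns_layout *l, uint64_t lba)
{
    return l->nlb * l->lbads + lba * l->ms;
}

/*
 * Emulate an extended-LBA read on top of the option 3 layout: two
 * copies per block (data, then metadata) gathered into one interleaved
 * buffer. The caller provides nlb * (lbads + ms) bytes in `out`.
 */
static void read_extended(const ns_layout *l, const uint8_t *image,
                          uint64_t slba, uint32_t nlb, uint8_t *out)
{
    assert(slba + nlb <= l->nlb);

    for (uint32_t i = 0; i < nlb; i++) {
        uint64_t lba = slba + i;

        memcpy(out, image + data_offset(l, lba), l->lbads);
        out += l->lbads;
        memcpy(out, image + meta_offset(l, lba), l->ms);
        out += l->ms;
    }
}
```

The extra cost shows up as the second copy per block; against a real
blockdev that would be a second (small) I/O per block, or one extra I/O per
request if the metadata for the whole LBA range is read in one go and
interleaved afterwards.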
>
> > The 4th option is extending an existing image format (QCOW2) or creating
> > something on top of RAW to support metadata bytes per block. But both
> > approaches require full API support through the block layer. And
> > probably a lot of other stuff that I did not think about.
>
> It definitely sounds appealing to push the feature to a lower level if
> you're really willing to see that through.
>
Yes, it's super appealing and I would like to have some input from the
block layer guys on this. That is, if anyone has ever explored it?

> In any case, calculating T10 CRCs is *really* slow unless you have
> special hardware and software support for it.
>
Yeah. I know this is super slow. But for emulation and testing purposes
I think it is a nice feature for the device to optionally offer.
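
To illustrate where the cost comes from, here is a minimal bit-at-a-time
sketch of the T10 DIF guard CRC (16 bits, polynomial 0x8bb7, initial value
0, no reflection, no final XOR). The function name crc_t10dif_bitwise is
made up for this example; a real implementation would more likely use a
lookup table or a carry-less multiply instruction:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-16/T10-DIF over `len` bytes. Eight shift/XOR steps per
 * input byte, which is why the software-only path is slow.
 */
static uint16_t crc_t10dif_bitwise(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0;

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        /* Feed the next byte into the top of the 16-bit register. */
        crc ^= (uint16_t)((uint16_t)buf[i] << 8);

        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x8bb7);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }

    return crc;
}
```

As a sanity check, the conventional CRC test string "123456789" should
produce 0xd0db with these parameters.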