qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually wor


From: Tom Yan
Subject: Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg
Date: Mon, 7 Sep 2020 09:29:14 +0800

Your patch only fixed it for sg devices. S_ISBLK() will never be true
when sg_get_max_segments() is only called in raw_refresh_limits() when
bs->sg is true.

With that said, it's not like we cannot ditch the bs->sg check in
raw_refresh_limits() and switch to S_ISBLK()/S_ISCHR(). (Although not
strictly necessary, I would also do S_ISCHR() for
SG_GET_SG_TABLESIZE.)

I don't know if it's a better approach though. Seems repetitive to me
(and bs->sg is not only used/checked in
file-posix/raw_refresh_limits):
https://github.com/qemu/qemu/blob/v5.1.0/block/file-posix.c#L3390
https://github.com/qemu/qemu/blob/v5.1.0/block/file-posix.c#L3504

On Mon, 7 Sep 2020 at 03:50, Dmitry Fomichev <Dmitry.Fomichev@wdc.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Qemu-block <qemu-block-
> > bounces+dmitry.fomichev=wdc.com@nongnu.org> On Behalf Of Maxim
> > Levitsky
> > Sent: Sunday, September 6, 2020 1:06 PM
> > To: Tom Yan <tom.ty89@gmail.com>; eblake@redhat.com;
> > pbonzini@redhat.com; fam@euphon.net; anielhb@linux.vnet.ibm.com;
> > kwolf@redhat.com; mreitz@redhat.com
> > Cc: qemu-devel@nongnu.org; qemu-block@nongnu.org
> > Subject: Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that
> > actually works with sg
> >
> > On Sun, 2020-09-06 at 23:26 +0800, Tom Yan wrote:
> > > I don't disagree with your proposal, but the thing is, do we even need
> > > another field/limit for case 1? For example, do we *really* need to
> > > clamp sizes[2] (NBD_INFO_BLOCK_SIZE) in nbd/server.c and
> > > max_io_sectors (SCSIBlockLimits) in hw/scsi/scsi-disk.c to to any kind
> > > of "dynamic" limit?
> >
> > It depends on if we have other block drivers that have IO segment/max
> > transfer sizes,
> > that we need to enforce, other than the SG_IO case.
> >
> > Knowing that the commit that added these limits to file-posix, which I was
> > fixing is relatively
> > new, I guess there are other cases for these limits.
> >
> > I'll check this tomorrow.
> >
> > >
> > > Either way I can add another field (`max_pwrite`, maybe?) to
> > > BlockLimits, as an infrastructure/hint, but what should be switched to
> > > it, and what value should each block driver set it to, is probably
> > > beyond what I can take.
> >
> > Implementation wise, I think I can do this, but I'll wait few days for the
> > feedback on my proposal.
> >
> > Thanks for finding this bug!
>
> There was also 
> https://lore.kernel.org/qemu-devel/20200811225122.17342-2-dmitry.fomichev@wdc.com/
> posted three weeks ago, but, seemingly, it was ignored... sorry if this
> bug is already being fixed in a different way, but I think the fix presented
> in that patch is worth considering.
>
> Very best,
> Dmitry
>
> >
> > Best regards,
> >       Maxim Levitsky
> >
> > >
> > > IMHO my patches should be merged first (they are really just fixing a
> > > regression and some bugs). If anyone finds it necessary and is capable
> > > and willing to fix the insufficiency of BlockLimits, they can do it
> > > afterwards with another patch series as an improvement. I can add a
> > > patch to remove the clamping in nbd/server.c as a perhaps-temporary
> > > measure to address the original issue, if desired.
> > >
> > > On Sun, 6 Sep 2020 at 20:53, Maxim Levitsky <mlevitsk@redhat.com>
> > wrote:
> > > > On Sun, 2020-09-06 at 19:04 +0800, Tom Yan wrote:
> > > > > Maybe you want to add some condition for this:
> > > > > https://github.com/qemu/qemu/blob/v5.1.0/nbd/server.c#L659
> > > > > Or not clamp it at all.
> > > > >
> > > > > On Sun, 6 Sep 2020 at 18:58, Tom Yan <tom.ty89@gmail.com> wrote:
> > > > > > In commit 867eccfed84f96b54f4a432c510a02c2ce03b430, Levitsky
> > appears
> > > > > > to have assumed that the only "SCSI Passthrough" is `-device
> > > > > > scsi-generic`, while the fact is there's also `-device scsi-block`
> > > > > > (passthrough without the sg driver). Unlike `-device scsi-hd`, 
> > > > > > getting
> > > > > > max_sectors is necessary to it (more precisely, hw_max_sectors
> > might
> > > > > > what matters, but BLKSECTGET reports max_sectors, so).
> > > > > >
> > > > > > I'm unsure about how qemu-nbd works, but the commit clearly
> > wasn't the
> > > > > > right approach to fix the original issue it addresses. (It should 
> > > > > > for
> > > > > > example ignore the max_transfer if it will never matter in to it, or
> > > > > > overrides it in certain cases; when I glimpsed over
> > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=1647104, I don't see
> > how
> > > > > > it could be file-posix problem when it is reporting the right thing,
> > > > > > regardless of whether "removing" the code helps.)
> > > > > >
> > > > > > I don't think we want to "mark" `-device scsi-block` as sg either. 
> > > > > > It
> > > > > > will probably bring even more unexpected problems, because they
> > are
> > > > > > literally different sets of things behind the scene / in the kernel.
> > > >
> > > > Yes, I did overlook the fact that scsi-block is kind of hybrid 
> > > > passthough
> > device,
> > > > doing SG_IO for some things and regular IO for others.
> > > >
> > > > I don't think that my approach was wrong way to fix the problem, but as
> > you found
> > > > out, abusing 'bs->sg' hack (which I would be very happy to remove
> > completely)
> > > > wasn't a good idea.
> > > > I actualy was aware of scsi-block and that it does SG_IO but it
> > > > was forgotten some where on the way.
> > > >
> > > > So in summary what the problem is and what I think is the right 
> > > > solution:
> > > >
> > > >
> > > > Each qemu block driver exposes block limits and assumes that they are
> > the same
> > > > for two IO interfaces the block driver can expose:
> > > >
> > > > 1. Regular qemu blk_pread/pwrite alike functions
> > > > 2. blk_ioctl (tiny wrapper against SG_IO), supported by posix-raw on
> > > >    host block devices/sg char devices, and by iscsi
> > > >
> > > > The problem is that these two interfaces can have different block 
> > > > limits.
> > > >
> > > > I don't know about iscsi, but for files, doing regular IO is always 
> > > > unlimited,
> > > > since it passes through the kernel block layer and segemented there on
> > > > demand which is faster that doing it in userspace, while SG_IO is passed
> > as is
> > > > to the underlying SCSI device and lacks this segmentation.
> > > >
> > > > Regardless of how NBD uses these limits, I think that these limits 
> > > > should
> > be correctly
> > > > exposed by the block drivers, and therefore I propose is that each qemu
> > block driver
> > > > would expose a pair of block limits.
> > > > One for the regular block IO, and other for SG_IO.
> > > >
> > > > Then block driver clients (like scsi devices that you mention, nbd, etc)
> > > > can choose which limit to use, depending on which IO api they use.
> > > > The scsi-generic, and scsi-block can use the SG_IO limits,
> > > > while the rest  can use the normal (unlimited for file I/O) limits, 
> > > > including
> > the NBD.
> > > >
> > > > Best regards,
> > > >         Maxim Levitsky
> > > >
> > > > > > On Fri, 4 Sep 2020 at 10:09, Tom Yan <tom.ty89@gmail.com> wrote:
> > > > > > > sg devices have different major/minor than their corresponding
> > > > > > > block devices. Using sysfs to get max segments never really worked
> > > > > > > for them.
> > > > > > >
> > > > > > > Fortunately the sg driver provides an ioctl to get sg_tablesize,
> > > > > > > which is apparently equivalent to max segments.
> > > > > > >
> > > > > > > Signed-off-by: Tom Yan <tom.ty89@gmail.com>
> > > > > > > ---
> > > > > > >  block/file-posix.c | 17 ++++++++++++++++-
> > > > > > >  1 file changed, 16 insertions(+), 1 deletion(-)
> > > > > > >
> > > > > > > diff --git a/block/file-posix.c b/block/file-posix.c
> > > > > > > index 411a23cf99..9e37594145 100644
> > > > > > > --- a/block/file-posix.c
> > > > > > > +++ b/block/file-posix.c
> > > > > > > @@ -1178,6 +1178,21 @@ static int
> > sg_get_max_transfer_length(int fd)
> > > > > > >  #endif
> > > > > > >  }
> > > > > > >
> > > > > > > +static int sg_get_max_segments(int fd)
> > > > > > > +{
> > > > > > > +#ifdef SG_GET_SG_TABLESIZE
> > > > > > > +    long max_segments = 0;
> > > > > > > +
> > > > > > > +    if (ioctl(fd, SG_GET_SG_TABLESIZE, &max_segments) == 0) {
> > > > > > > +        return max_segments;
> > > > > > > +    } else {
> > > > > > > +        return -errno;
> > > > > > > +    }
> > > > > > > +#else
> > > > > > > +    return -ENOSYS;
> > > > > > > +#endif
> > > > > > > +}
> > > > > > > +
> > > > > > >  static int get_max_transfer_length(int fd)
> > > > > > >  {
> > > > > > >  #if defined(BLKSECTGET) && defined(BLKSSZGET)
> > > > > > > @@ -1268,7 +1283,7 @@ static void
> > hdev_refresh_limits(BlockDriverState *bs, Error **errp)
> > > > > > >          bs->bl.max_transfer = pow2floor(ret);
> > > > > > >      }
> > > > > > >
> > > > > > > -    ret = get_max_segments(s->fd);
> > > > > > > +    ret = bs->sg ? sg_get_max_segments(s->fd) :
> > get_max_segments(s->fd);
> > > > > > >      if (ret > 0) {
> > > > > > >          bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
> > > > > > >                                             ret * 
> > > > > > > qemu_real_host_page_size);
> > > > > > > --
> > > > > > > 2.28.0
> > > > > > >
> >
> >
>



reply via email to

[Prev in Thread] Current Thread [Next in Thread]