
Re: [RFC 0/3] block/file-posix: Work around XFS bug


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [RFC 0/3] block/file-posix: Work around XFS bug
Date: Tue, 29 Oct 2019 11:48:10 +0000

29.10.2019 11:50, Max Reitz wrote:
> On 28.10.19 12:25, Vladimir Sementsov-Ogievskiy wrote:
>> 28.10.2019 14:04, Kevin Wolf wrote:
>>> Am 27.10.2019 um 13:35 hat Stefan Hajnoczi geschrieben:
>>>> On Fri, Oct 25, 2019 at 11:58:46AM +0200, Max Reitz wrote:
> 
> [...]
> 
>>>>> (3) Drop handle_alloc_space(), i.e. revert c8bb23cbdbe32f.
>>>>>       To my knowledge I’m the only one who has provided any benchmarks for
>>>>>       this commit, and even then I was a bit skeptical because it performs
>>>>>       well in some cases and bad in others.  I concluded that it’s
>>>>>       probably worth it because the “some cases” are more likely to occur.
>>>>>
>>>>>       Now we have this problem of corruption here (granted due to a bug in
>>>>>       the XFS driver), and another report of massively degraded
>>>>>       performance on ppc64
>>>>>       (https://bugzilla.redhat.com/show_bug.cgi?id=1745823 – sorry, a
>>>>>       private BZ; I hate that :-/  The report is about 40 % worse
>>>>>       performance for an in-guest fio write benchmark.)
>>>>>
>>>>>       So I have to ask the question about what the justification for
>>>>>       keeping c8bb23cbdbe32f is.  How much does performance increase with
>>>>>       it actually?  (On non-(ppc64+XFS) machines, obviously)
>>>>>
>>>>>       Advantages:
>>>>>       + Trivial
>>>>>       + No layering violations
>>>>>       + We wouldn’t need to keep track of whether the kernel bug has been
>>>>>         fixed or not
>>>>>       + Fixes the ppc64+XFS performance problem
>>>>>
>>>>>       Disadvantages:
>>>>>       - Reverts cluster allocation performance to pre-c8bb23cbdbe32f
>>>>>         levels, whatever that means
>>>>
>>>> My favorite because it is clean and simple, but Vladimir has a valid
>>>> use-case for requiring this performance optimization so reverting isn't
>>>> an option.
>>>
>>> Vladimir also said that qcow2 subclusters would probably also solve his
>>> problem, so maybe reverting and applying the subcluster patches instead
>>> is a possible solution, too?
>>
>> I'm not sure about ssd case, it may need write-zero optimization anyway.
> 
> What exactly do you need?  Do you actually need to write zeroes (e.g.
> because you’re storing images on block devices) or would it be
> sufficient to just drop the COW areas when bdrv_has_zero_init() and
> bdrv_has_zero_init_truncate() are true?

Hmm, what do you mean? We need to zero the COW areas. The original way is to
write actual zeroes; the optimized way is fallocate. What do you mean by
"drop"? Mark the subclusters as zeroes in the metadata?

But we'll still have COW areas within a subcluster, and we'll need to zero
them directly. And fallocate will most probably be faster in the ssd + ext4
case.
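
For context, a simplified sketch of the two paths being compared. The names
mirror block/qcow2.c, but this is only an illustration of the idea behind
commit c8bb23cbdbe32f, not the exact code:

/*
 * Sketch: two ways of zeroing the COW areas of a newly allocated
 * qcow2 cluster.  Assumes the usual qcow2 context (qemu/osdep.h,
 * block/qcow2.h); error handling and detail elided.
 */

/* Pre-c8bb23cbdbe32f: write an explicit zero buffer over the COW
 * regions (the head and tail of the cluster around the guest data). */
static int coroutine_fn zero_cow_by_writing(BlockDriverState *bs,
                                            uint64_t offset, uint64_t bytes)
{
    BDRVQcow2State *s = bs->opaque;
    void *buf = qemu_blockalign0(bs, bytes);    /* zeroed bounce buffer */
    int ret = bdrv_co_pwrite(s->data_file, offset, bytes, buf, 0);
    qemu_vfree(buf);
    return ret;
}

/* c8bb23cbdbe32f (handle_alloc_space): if the COW regions are known to
 * read back as zeroes anyway, zero the whole cluster in one call.
 * BDRV_REQ_NO_FALLBACK means "fail rather than emulate with buffer
 * writes", so on XFS this ends up as fallocate(FALLOC_FL_ZERO_RANGE). */
static int coroutine_fn zero_cow_by_fallocate(BlockDriverState *bs,
                                              uint64_t cluster_offset,
                                              uint64_t cluster_size)
{
    BDRVQcow2State *s = bs->opaque;
    return bdrv_co_pwrite_zeroes(s->data_file, cluster_offset, cluster_size,
                                 BDRV_REQ_NO_FALLBACK);
}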

> 
> I’m asking because Dave Chinner said
> (https://bugzilla.redhat.com/show_bug.cgi?id=1765547#c7) that
> fallocate() is always slow at least with aio=native because it needs to
> wait for all concurrent AIO writes to finish, and so it causes the AIO
> pipeline to stall.
> 
> (He suggested using XFS extent size hints to get the same effect as
> write-zeroes for free, basically, but I don’t know whether that’s really
> useful to us; as unallocated areas on XFS read back as zero anyway.)
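
(For reference, extent size hints can be set per file with the
FS_IOC_FSSETXATTR ioctl. A minimal, untested sketch, assuming an
already-open image fd; the function name is made up:)

#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/* Ask XFS to allocate extents of at least hint_bytes for this file, so
 * small allocating writes don't fragment the image.  Note that XFS only
 * honors a new hint while the file has no allocated extents. */
static int set_extent_size_hint(int fd, uint32_t hint_bytes)
{
    struct fsxattr fsx;

    if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0) {
        return -errno;
    }
    fsx.fsx_xflags |= FS_XFLAG_EXTSIZE;
    fsx.fsx_extsize = hint_bytes;    /* e.g. 1 MiB */
    if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0) {
        return -errno;
    }
    return 0;
}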
> 
>>> We already have some cases where the existing handle_alloc_space()
>>> causes performance to actually become worse, and serialising requests as
>>> a workaround isn't going to make performance any better. So even on
>>> these grounds, keeping commit c8bb23cbdbe32f is questionable.
>>>
>>
>> Can keeping handle_alloc_space under some config option be an option?
> 
> Hm.  A config option is weird if you’re the only one who’s going to
> enable it.  But other than that I don’t have anything against it.
> 

It's just a bit easier for us to maintain a config option than an out-of-tree
patch. On the other hand, maintaining this one patch separately is not a real
problem. It could also return to the tree once the XFS bug is fixed.
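
(Purely hypothetical illustration of such gating; the configure switch
name is invented:)

/* Compile the optimization out unless enabled at configure time. */
static bool qcow2_can_skip_cow_zeroing(void)
{
#ifdef CONFIG_QCOW2_HANDLE_ALLOC_SPACE
    return true;
#else
    return false;
#endif
}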

-- 
Best regards,
Vladimir
