

From: Kevin Wolf
Subject: Re: [PATCH for-5.1] file-posix: Mitigate file fragmentation with extent size hints
Date: Mon, 13 Jul 2020 15:45:00 +0200

On 13.07.2020 at 15:12, Kevin Wolf wrote:
> On 13.07.2020 at 11:08, Max Reitz wrote:
> > On 10.07.20 18:12, Max Reitz wrote:
> > > On 07.07.20 18:17, Kevin Wolf wrote:
> > >> On 07.07.2020 at 16:23, Kevin Wolf wrote:
> > >>> Especially when O_DIRECT is used with image files so that the page cache
> > >>> indirection can't cause a merge of allocating requests, the file will
> > >>> fragment on the file system layer, with a potentially very small
> > >>> fragment size (this depends on the requests the guest sent).
> > >>>
> > >>> On Linux, fragmentation can be reduced by setting an extent size hint
> > >>> when creating the file (at least on XFS, it can't be set any more after
> > >>> the first extent has been allocated), basically giving raw files a
> > >>> "cluster size" for allocation.
> > >>>
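
For illustration only (this is not the code from the patch): on Linux the
hint is set through the FS_IOC_FSGETXATTR/FS_IOC_FSSETXATTR ioctls from
<linux/fs.h>. A minimal Python sketch, assuming a Linux host;
set_extent_size_hint is a made-up helper name, and the ioctl numbers are
hardcoded as they come out on x86-64 for the 28-byte struct fsxattr:

```python
import fcntl
import struct
import tempfile

# Assumed layout of struct fsxattr from <linux/fs.h>:
# fsx_xflags, fsx_extsize, fsx_nextents, fsx_projid, fsx_cowextsize (u32 each)
# followed by 8 bytes of padding.
FSXATTR_FORMAT = "=5I8x"

# ioctl request numbers as encoded on x86-64 (an assumption; other
# architectures may encode the direction bits differently).
FS_IOC_FSGETXATTR = 0x801C581F
FS_IOC_FSSETXATTR = 0x401C5820
FS_XFLAG_EXTSIZE = 0x00000800


def set_extent_size_hint(fd: int, hint_bytes: int) -> bool:
    """Try to set an extent size hint on fd.

    Returns False if the file system refuses (for example because it does
    not support the ioctl or the hint at all).
    """
    try:
        # Read the current attributes so that unrelated flags are preserved.
        size = struct.calcsize(FSXATTR_FORMAT)
        raw = fcntl.ioctl(fd, FS_IOC_FSGETXATTR, bytes(size))
        xflags, _extsize, nextents, projid, cowextsize = struct.unpack(
            FSXATTR_FORMAT, raw
        )
        # Enable the extent size flag and store the requested hint.
        updated = struct.pack(
            FSXATTR_FORMAT,
            xflags | FS_XFLAG_EXTSIZE,
            hint_bytes,
            nextents,
            projid,
            cowextsize,
        )
        fcntl.ioctl(fd, FS_IOC_FSSETXATTR, updated)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # The hint has to be set while the file is still empty.
    with tempfile.NamedTemporaryFile() as image:
        applied = set_extent_size_hint(image.fileno(), 1024 * 1024)
        print("extent size hint applied:", applied)
```

Whether this succeeds depends on the file system under the temporary
directory, so the sketch only reports the outcome instead of treating a
refusal as an error.
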
> > >>> This adds a create option to set the extent size hint, and changes the
> > >>> default from not setting a hint to setting it to 1 MB. The main reason
> > >>> why qcow2 defaults to smaller cluster sizes is that COW becomes more
> > >>> expensive, which is not an issue with raw files, so we can choose a
> > >>> larger size. The tradeoff here is only potentially wasted disk space.
> > >>>
> > >>> For qcow2 (or other image formats) over file-posix, the advantage should
> > >>> even be greater because they grow sequentially without leaving holes, so
> > >>> there won't be wasted space. Setting even larger extent size hints for
> > >>> such images may make sense. This can be done with the new option, but
> > >>> let's keep the default conservative for now.
> > >>>
> > >>> The effect is very visible with a test that intentionally creates a
> > >>> badly fragmented file with qemu-img bench (the time difference while
> > >>> creating the file is already remarkable) and then looks at the number of
> > >>> extents and the time a simple "qemu-img map" takes.
> > >>>
> > >>> Without an extent size hint:
> > >>>
> > >>>     $ ./qemu-img create -f raw -o extent_size_hint=0 ~/tmp/test.raw 10G
> > >>>     Formatting '/home/kwolf/tmp/test.raw', fmt=raw size=10737418240 extent_size_hint=0
> > >>>     $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 0
> > >>>     Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 0, step size 8192)
> > >>>     Run completed in 25.848 seconds.
> > >>>     $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 4096
> > >>>     Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 4096, step size 8192)
> > >>>     Run completed in 19.616 seconds.
> > >>>     $ filefrag ~/tmp/test.raw
> > >>>     /home/kwolf/tmp/test.raw: 2000000 extents found
> > >>>     $ time ./qemu-img map ~/tmp/test.raw
> > >>>     Offset          Length          Mapped to       File
> > >>>     0               0x1e8480000     0               /home/kwolf/tmp/test.raw
> > >>>
> > >>>     real    0m1,279s
> > >>>     user    0m0,043s
> > >>>     sys     0m1,226s
> > >>>
> > >>> With the new default extent size hint of 1 MB:
> > >>>
> > >>>     $ ./qemu-img create -f raw -o extent_size_hint=1M ~/tmp/test.raw 10G
> > >>>     Formatting '/home/kwolf/tmp/test.raw', fmt=raw size=10737418240 extent_size_hint=1048576
> > >>>     $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 0
> > >>>     Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 0, step size 8192)
> > >>>     Run completed in 11.833 seconds.
> > >>>     $ ./qemu-img bench -f raw -t none -n -w ~/tmp/test.raw -c 1000000 -S 8192 -o 4096
> > >>>     Sending 1000000 write requests, 4096 bytes each, 64 in parallel (starting at offset 4096, step size 8192)
> > >>>     Run completed in 10.155 seconds.
> > >>>     $ filefrag ~/tmp/test.raw
> > >>>     /home/kwolf/tmp/test.raw: 178 extents found
> > >>>     $ time ./qemu-img map ~/tmp/test.raw
> > >>>     Offset          Length          Mapped to       File
> > >>>     0               0x1e8480000     0               /home/kwolf/tmp/test.raw
> > >>>
> > >>>     real    0m0,061s
> > >>>     user    0m0,040s
> > >>>     sys     0m0,014s
> > >>>
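
As a sanity check on the numbers quoted above, the arithmetic can be
redone in a few lines. This is only a sketch: the timings and extent
counts are copied from the two runs, and the helper name end_of_pass is
made up for the example.

```python
REQUESTS_PER_PASS = 1_000_000
REQUEST_SIZE = 4096
STEP = 8192


def end_of_pass(offset: int) -> int:
    """Offset just past the last byte written by one qemu-img bench pass."""
    return offset + (REQUESTS_PER_PASS - 1) * STEP + REQUEST_SIZE


# The two passes (offsets 0 and 4096) interleave and together fill a
# contiguous range, which is the single range "qemu-img map" reports.
image_end = max(end_of_pass(0), end_of_pass(4096))
total_requests = 2 * REQUESTS_PER_PASS

# Measurements copied from the message: (without hint, with 1 MB hint).
first_pass_seconds = (25.848, 11.833)
second_pass_seconds = (19.616, 10.155)
map_seconds = (1.279, 0.061)
extents = (2_000_000, 178)

print("mapped length:", hex(image_end))
print("requests per extent without hint:", total_requests / extents[0])
print("first pass speedup: %.2f" % (first_pass_seconds[0] / first_pass_seconds[1]))
print("second pass speedup: %.2f" % (second_pass_seconds[0] / second_pass_seconds[1]))
print("map speedup: %.1f" % (map_seconds[0] / map_seconds[1]))
print("extent reduction: %.0f" % (extents[0] / extents[1]))
```

Without the hint, the extent count matches the number of 4 KiB requests
exactly, i.e. every single request ended up in its own extent.
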
> > >>> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > >>
> > >> I also need to squash in a few trivial qemu-iotests updates, for which I
> > >> won't send a v2:
> > > 
> > > The additional specifications in 243 make it print a warning on tmpfs
> > > (because the option doesn’t work there).  I suppose the same may be true
> > > on other filesystems as well.  Should it be filtered out?
> 
> I guess we just shouldn't print a warning if the requested hint is 0.
> 
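
The rule suggested here could look roughly like this. It is a
hypothetical sketch of the policy only; should_warn and
report_hint_result are invented names and not functions in QEMU:

```python
import warnings


def should_warn(hint_bytes: int, set_succeeded: bool) -> bool:
    """Warn only if a non-zero hint was requested and could not be applied."""
    return hint_bytes != 0 and not set_succeeded


def report_hint_result(hint_bytes: int, set_succeeded: bool) -> None:
    """Emit a warning according to the policy above."""
    if should_warn(hint_bytes, set_succeeded):
        warnings.warn("Failed to set extent size hint")


if __name__ == "__main__":
    # A hint of 0 asks for "no hint", so a file system that cannot set
    # hints (such as tmpfs in test 243) stays silent.
    report_hint_result(0, set_succeeded=False)
    # A non-zero hint that could not be applied is still reported.
    report_hint_result(1024 * 1024, set_succeeded=False)
```
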
> > This patch also breaks 059, 106, and 175.
> 
> Hm, I was sure I had tested raw... Anyway, 059 should filter out the
> actual size (how could this ever work?), and 175 is obvious, too - it
> tries to be clever, but not clever enough.
> 
> 106 is a bit mysterious because the error message implies that the
> images end up smaller than before, which shouldn't be the case. I'll
> have a look.

Ah, it misinterprets MiB as KiB, so the error says the image is smaller
than expected while it's actually larger. I'll just disable the extent
size hint for this one, too.

Kevin
