From: Max Reitz
Subject: Re: [Qemu-block] Broken aarch64 by qcow2: skip writing zero buffers to empty COW areas [v2]
Date: Thu, 22 Aug 2019 17:25:09 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 22.08.19 14:09, Max Reitz wrote:
> (CC-ing Paolo because of the XFS connection, and Stefan because why not.)
>
> On 22.08.19 13:27, Lukáš Doktor wrote:
>> On 21. 08. 19 at 19:51, Max Reitz wrote:
>>> On 21.08.19 16:14, Lukáš Doktor wrote:
>>>> Hello guys,
>>>>
>>>> First attempt was rejected due to zip attachment, let's try it again with
>>>> just Avocado-vt debug.log and serial console log files attached.
>>>>
>>>> I bisected a regression on aarch64 all the way to this commit: "qcow2:
>>>> skip writing zero buffers to empty COW areas"
>>>> c8bb23cbdbe32f5c326365e0a82e1b0e68cdcd8a. Would you please have a look at
>>>> it?
>>>
>>> I think I can see the issue on my x64 system (I don’t see the XFS
>>> corruption, but the installation fails because of some segfaults).
>>>
>>> I haven’t found a simpler way to reproduce the problem yet, though,
>>> which is a pain... :-/
>>>
>>> It looks like the problem disappears when I configure qemu with
>>> “--disable-xfsctl”. Can you try that?
>>>
>>> Max
>>>
>>
>> Hello Max,
>>
>> yes, I'm getting the same behavior. With "--disable-xfsctl" it works well.
>> Also, looking at the option, I understand why it only failed on aarch64
>> for me: I don't have the libs installed on the other machines, therefore
>> it was disabled by "./configure" there. Anyway, I guess disabling it in
>> my builds won't really fix the issue, right? :-)
>
> Thanks!
>
> No, it won’t, but it means the actual root of the problem is probably
> rather in some XFS-related code (be it because qemu uses it the wrong
> way or because of XFS kernel code) than in the pure qcow2 commit that
> made the problem surface by exercising it heavily. (Or in an
> interaction between the two.)

OK, I got a simpler reproducer now:

$ ./qemu-img create -f qcow2 test.qcow2 1M
$ (for i in $(seq 15 -1 0); do \
      echo "aio_write -P 42 $((i * 64 + 1))k 62k"; \
  done) \
  | ./qemu-io test.qcow2
$ for i in $(seq 0 15); do \
      echo $i; \
      ofs=$((i * 64)); \
      ./qemu-io -c "read -P 0 ${ofs}k 1k" \
                -c "read -P 42 $((ofs + 1))k 62k" \
                -c "read -P 0 $((ofs + 63))k 1k" \
                test.qcow2 \
      | grep 'verification'; \
  done

On XFS with --enable-xfsctl, this almost always gives me a verification
failure somewhere. (On tmpfs or with --disable-xfsctl, it never fails.)

So it seems to be related to I/O being issued from back to front, that
is, from the end of the image toward the start.
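
For illustration only, here is a small sketch of the request layout that the reproducer generates, in submission order. It is plain shell arithmetic and does not touch qemu at all. It assumes 64 KiB clusters (the qcow2 default); the function name layout is made up for this example. Each cluster gets a 62k write starting 1k in, so the first and last 1k of every cluster are never written and are expected to read back as zeroes.

```shell
#!/bin/sh
# Hypothetical helper: print, for each cluster, the unwritten 1k head,
# the 62k write, and the unwritten 1k tail (all offsets in KiB).
# Assumption: 64 KiB clusters, matching the "i * 64" in the reproducer.
layout() {
    cluster_kib=64
    # Same order as the reproducer: last cluster first, first cluster last.
    for i in $(seq 15 -1 0); do
        head=$((i * cluster_kib))   # start of the cluster, left unwritten
        write=$((head + 1))         # the aio_write begins 1k into the cluster
        tail=$((head + 63))         # last 1k of the cluster, left unwritten
        echo "cluster=$i head=${head}k write=${write}k+62k tail=${tail}k"
    done
}

layout
```

The first line printed is for cluster 15 (write at 961k) and the last one is for cluster 0 (write at 1k), which is the back-to-front order mentioned above.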

(You can also reproduce it with a plain “qemu-img bench” invocation,
like “./qemu-img bench -w --pattern=42 -o 1k -S 64k -s 62k test.qcow2”
(on, say, a 4 GB image), but then the failure appears much later in the
image, because you have to wait for some requests to come in reverse
(by chance) first.)

Max