qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: iotest failure -- test possibly not using sufficiently unique temp f


From: Thomas Huth
Subject: Re: iotest failure -- test possibly not using sufficiently unique temp filename?
Date: Fri, 18 Oct 2019 13:43:24 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0

On 18/10/2019 10.42, Max Reitz wrote:
> On 18.10.19 08:20, Thomas Huth wrote:
>> On 17/10/2019 18.41, Peter Maydell wrote:
>>> On Fri, 27 Sep 2019 at 17:44, Max Reitz <address@hidden> wrote:
>>>>
>>>> On 27.09.19 18:39, Peter Maydell wrote:
>>>>> Hi; I just saw this iotest failure (on an s390x box, as it happens):
>>>>>
>>>>>   TEST    iotest-qcow2: 130 [fail]
>>>>> QEMU          --
>>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../s390x-softmmu/qemu-system-s390x"
>>>>> -nodefaults -display none -machine accel=qtest
>>>>> QEMU_IMG      -- 
>>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-img"
>>>>> QEMU_IO       --
>>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-io"
>>>>> --cache writeback -f qcow2
>>>>> QEMU_NBD      -- 
>>>>> "/home/linux1/qemu/build/all/tests/qemu-iotests/../../qemu-nbd"
>>>>> IMGFMT        -- qcow2 (compat=1.1)
>>>>> IMGPROTO      -- file
>>>>> PLATFORM      -- Linux/s390x lxub05 4.15.0-58-generic
>>>>> TEST_DIR      -- /home/linux1/qemu/build/all/tests/qemu-iotests/scratch
>>>>> SOCKET_SCM_HELPER --
>>>>> /home/linux1/qemu/build/all/tests/qemu-iotests/socket_scm_helper
>>>>>
>>>>> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
>>>>> 12:27:16.948075733 -0400
>>>>> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
>>>>> 2019-09-27 12:01:23.649722655 -0400
>>>>> @@ -18,20 +18,22 @@
>>>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>>>  (qemu) commit testdisk
>>>>>  (qemu)
>>>>> -image: TEST_DIR/t.IMGFMT
>>>>> -file format: IMGFMT
>>>>> -virtual size: 64 MiB (67108864 bytes)
>>>>> -backing file: TEST_DIR/t.IMGFMT.orig
>>>>> -backing file format: raw
>>>>> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared 
>>>>> "write" lock
>>>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>>>
>>>>>  === Marking image dirty (lazy refcounts) ===
>>>>>
>>>>> +qemu-img: TEST_DIR/t.IMGFMT: Failed to get "write" lock
>>>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>>>> -wrote 4096/4096 bytes at offset 0
>>>>> -4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>>> +qemu-io: can't open device
>>>>> /home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2: Failed
>>>>> to get "write" lock
>>>>> +Is another process using the image
>>>>> [/home/linux1/qemu/build/all/tests/qemu-iotests/scratch/t.qcow2]?
>>>>> +no file open, try 'help open'
>>>>>  image: TEST_DIR/t.IMGFMT
>>>>>  file format: IMGFMT
>>>>>  virtual size: 64 MiB (67108864 bytes)
>>>>> +backing file: TEST_DIR/t.IMGFMT.orig
>>>>> +backing file format: raw
>>>>>  Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
>>>>> backing_file=TEST_DIR/t.IMGFMT.orig backing_fmt=raw
>>>>>  wrote 4096/4096 bytes at offset 0
>>>>>  4 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>>>
>>>>>
>>>>>
>>>>> This looks suspiciously like the test isn't using a unique
>>>>> filename for its disk image: "qemu-iotests/scratch/t.qcow2"
>>>>> in the build directory, and so perhaps it has collided with
>>>>> another iotest ?
>>>>>
>>>>> If we run 'make check' with a -j<something> option do the
>>>>> iotests all get run serially anyway, or do they run in
>>>>> parallel against each other ?
>>>>
>>>> As far as I know, all iotests are executed serially.  Anything else
>>>> would not work with the same scratch directory.
>>>>
>>>> The only thing I suspect is that some tool has been accidentally left
>>>> running by some previous test that still accesses its own image.  But I
>>>> don’t know.
>>>
>>> Just saw this one again with the same iotest 130 on the same
>>> s390 box; only difference is that the log this time around
>>> has the first part where qemu-img fails, but not the second part
>>> where qemu-io fails:
>>>
>>> --- /home/linux1/qemu/tests/qemu-iotests/130.out        2019-05-10
>>> 12:27:16.948075733 -0400
>>> +++ /home/linux1/qemu/build/all/tests/qemu-iotests/130.out.bad
>>> 2019-10-17 11:56:43.450750873 -0400
>>> @@ -18,11 +18,8 @@
>>>  QEMU X.Y.Z monitor - type 'help' for more information
>>>  (qemu) commit testdisk
>>>  (qemu)
>>> -image: TEST_DIR/t.IMGFMT
>>> -file format: IMGFMT
>>> -virtual size: 64 MiB (67108864 bytes)
>>> -backing file: TEST_DIR/t.IMGFMT.orig
>>> -backing file format: raw
>>> +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Failed to get shared "write" 
>>> lock
>>> +Is another process using the image [TEST_DIR/t.IMGFMT]?
>>>
>>>  === Marking image dirty (lazy refcounts) ===
>>>
>>> On the host machine there don't seem to be any stray
>>> processes which might have held the file open, and
>>> indeed the file doesn't exist at all, so it got removed
>>> by some cleanup or other.
>>
>> Ok, so unless someone has a clue what might be going on here (is there a
>> race in the test?), I'd suggest that we simply remove 130 from the auto
>> group again. Shall I send a patch?
> 
> I don’t have much of an idea.  It looks like maybe the qemu process
> (which dos the commit) is lingering, but that shouldn’t be because
> _cleanup_qemu always waits for it.  (Also, I can’t reproduce the problem
> on my system.)
> 
> The only hunch that I have is that 130 seems to be the only test that
> uses _cleanup_qemu to kill a qemu process (i.e. without wait=1) while it
> has taken locks on an image and then still uses the image afterwards.
> Maybe making it quit qemu through HMP would fix the problem.  But
> knowing is difficult because I can’t reproduce it.

I also can't reproduce the problem. I've now run 130 several hundred
times in a loop on a s390x box and never saw the problem.

I think we should disable it for the time being.

 Thomas

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]