Re: [PATCH v3 0/7] Switch iotests to using Async QMP

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 0/7] Switch iotests to using Async QMP

From:	Hanna Reitz
Subject:	Re: [PATCH v3 0/7] Switch iotests to using Async QMP
Date:	Wed, 13 Oct 2021 16:49:29 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.0

On 13.10.21 16:00, John Snow wrote:



On Wed, Oct 13, 2021 at 8:51 AM John Snow <jsnow@redhat.com> wrote:



    On Wed, Oct 13, 2021 at 4:45 AM Hanna Reitz <hreitz@redhat.com> wrote:

        On 13.10.21 00:34, John Snow wrote:
        > Based-on: <20211012214152.802483-1-jsnow@redhat.com>
        >            [PULL 00/10] Python patches
        > GitLab:
        https://gitlab.com/jsnow/qemu/-/commits/python-aqmp-iotest-wrapper
        > CI: https://gitlab.com/jsnow/qemu/-/pipelines/387210591
        >
        > Hiya,
        >
        > This series continues where the last two AQMP series left
        off and adds a
        > synchronous 'legacy' wrapper around the new AQMP interface,
        then drops
        > it straight into iotests to prove that AQMP is functional
        and totally
        > cool and fine. The disruption and churn to iotests is pretty
        minimal.
        >
        > In the event that a regression happens and I am not
        physically proximate
        > to inflict damage upon, one may set the
        QEMU_PYTHON_LEGACY_QMP variable
        > to any non-empty string as it pleases you to engage the QMP
        machinery
        > you are used to.
        >
        > I'd like to try and get this committed early in the 6.2
        development
        > cycle to give ample time to smooth over any possible
        regressions. I've
        > tested it locally and via gitlab CI, across Python versions
        3.6 through
        > 3.10, and "worksforme". If something bad happens, we can
        revert the
        > actual switch-flip very trivially.

        So running iotests locally, I got one failure:

        $ TEST_DIR=/tmp/vdi-tests ./check -c writethrough -vdi 300
        [...]
        300                             fail       [10:28:06] [10:28:11]
        5.1s                 output mismatch (see 300.out.bad)
        --- /home/maxx/projects/qemu/tests/qemu-iotests/300.out
        +++ 300.out.bad
        @@ -1,4 +1,5 @@
        -.......................................
        +..............ERROR:qemu.aqmp.qmp_client.qemu-b-222963:Task.Reader:

        ConnectionResetError: [Errno 104] Connection reset by peer
        +.........................
          ----------------------------------------------------------------------
          Ran 39 tests
        [...]


    Oh, unfortunate.


        I’m afraid I can’t really give a reproducer or anything.  It
        feels like


    Thank you for the report!

        just some random spurious timing-related error. Although then
        again,
        300 does have an `except machine.AbnormalShutdown` clause at one
        point...  So perhaps that’s the culprit, and we need to
        disable logging
        there.


    I'll investigate!

Unfortunately, even in a loop some 150 times I couldn't reproduce thisone. As you point out, it appears to be just a failure caused bylogging. The test logic itself completes as expected.

Still, I would expect, on a "clean" shutdown of the destination host(where the destination process fails to load the migration stream andvoluntarily exits with an error code) to end with a FIN/ACK for TCP or... uh, whatever happens for a UNIX socket. Where's the ConnectionReset coming from? Did the destination VM process *crash*?

I'm not so sure that I *should* silence this error, but I also can'treproduce it at all to answer these questions, so uh. uhhh. I guess Iwill just hammer it on a loop a few hundred times more and see if Iget lucky.

I could reproduce it, by running 20 instances concurrently. (Needs achange to testrunner.py, so that the reference outputs don’t collide:

diff --git a/tests/qemu-iotests/testrunner.pyb/tests/qemu-iotests/testrunner.py

index a56b6da396..fd0a3a1eeb 100644
--- a/tests/qemu-iotests/testrunner.py
+++ b/tests/qemu-iotests/testrunner.py
@@ -221,7 +221,7 @@ def find_reference(self, test: str) -> str:

     def do_run_test(self, test: str) -> TestResult:
         f_test = Path(test)
-        f_bad = Path(f_test.name + '.out.bad')
+        f_bad = Path(f'{os.getpid()}-{f_test.name}.out.bad')
         f_notrun = Path(f_test.name + '.notrun')
         f_casenotrun = Path(f_test.name + '.casenotrun')
         f_reference = Path(self.find_reference(test))

)

And then:

$ while TEST_DIR=/tmp/vdi-$$ ./check -vdi 300; do; done

Which pretty quickly shows the error in at least one of those loops(under a minute).


As far as I can tell, changing the log level in 300 does indeed fix it:

diff --git a/tests/qemu-iotests/300 b/tests/qemu-iotests/300
index 10f9f2a8da..096f5dabf0 100755
--- a/tests/qemu-iotests/300
+++ b/tests/qemu-iotests/300
@@ -27,6 +27,7 @@ from typing import Dict, List, Optional
 from qemu.machine import machine

 import iotests
+from iotests import change_log_level


 BlockBitmapMapping = List[Dict[str, object]]

@@ -464,7 +465,8 @@ classTestBlockBitmapMappingErrors(TestDirtyBitmapMigration):

         # Expect abnormal shutdown of the destination VM because of
         # the failed migration
         try:
-            self.vm_b.shutdown()
+            with change_log_level('qemu.aqmp'):
+                self.vm_b.shutdown()
         except machine.AbnormalShutdown:
             pass

[Prev in Thread]

Current Thread

[Next in Thread]

[PATCH v3 5/7] iotests: Conditionally silence certain AQMP errors, (continued)
- [PATCH v3 5/7] iotests: Conditionally silence certain AQMP errors, John Snow, 2021/10/12
  - Re: [PATCH v3 5/7] iotests: Conditionally silence certain AQMP errors, Hanna Reitz, 2021/10/13
- [PATCH v3 6/7] python/aqmp: Create sync QMP wrapper for iotests, John Snow, 2021/10/12
  - Re: [PATCH v3 6/7] python/aqmp: Create sync QMP wrapper for iotests, Hanna Reitz, 2021/10/13
- [PATCH v3 7/7] python, iotests: replace qmp with aqmp, John Snow, 2021/10/12
  - Re: [PATCH v3 7/7] python, iotests: replace qmp with aqmp, Eric Blake, 2021/10/13
- [PATCH v3 1/7] python/machine: remove has_quit argument, John Snow, 2021/10/12
- Re: [PATCH v3 0/7] Switch iotests to using Async QMP, Hanna Reitz, 2021/10/13
  - Re: [PATCH v3 0/7] Switch iotests to using Async QMP, John Snow, 2021/10/13
    - Re: [PATCH v3 0/7] Switch iotests to using Async QMP, John Snow, 2021/10/13
    - Re: [PATCH v3 0/7] Switch iotests to using Async QMP, Hanna Reitz <=
    - Re: [PATCH v3 0/7] Switch iotests to using Async QMP, John Snow, 2021/10/13

Prev by Date: Re: [PATCH 00/16] fdt: Make OF_BOARD a boolean option
Next by Date: Re: [PATCH 09/13] iotests: split linters.py out from 297
Previous by thread: Re: [PATCH v3 0/7] Switch iotests to using Async QMP
Next by thread: Re: [PATCH v3 0/7] Switch iotests to using Async QMP
Index(es):
- Date
- Thread