Re: 'make check-acceptance' failing on s390 tests?
From: Thomas Huth
Subject: Re: 'make check-acceptance' failing on s390 tests?
Date: Mon, 21 Feb 2022 16:27:12 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.5.0

On 18/02/2022 16.04, Peter Maydell wrote:
> Hi; is anybody else seeing 'make check-acceptance' fail on some of
> the s390 tests?
>
>  (009/183) tests/avocado/boot_linux.py:BootLinuxS390X.test_s390_ccw_virtio_tcg:
> INTERRUPTED: Test interrupted by SIGTERM\nRunner error occurred:
> Timeout reached\nOriginal status: ERROR\n{'name':
> '009-tests/avocado/boot_linux.py:BootLinuxS390X.test_s390_ccw_virtio_tcg',
> 'logdir':
> '/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/clang/tests/results/j...
> (900.20 s)
>
>  (090/183) tests/avocado/machine_s390_ccw_virtio.py:S390CCWVirtioMachine.test_s390x_fedora:
> FAIL: b'1280 800\n' != b'1024 768\n' (26.79 s)
>
> I've cc'd Daniel because the 090 at least looks like a resolution
> baked into the test case, and commit de72c4b7c that went in
> last month changed the EDID reported resolution from 1024x768
> to 1280x800.

Yes, that seems to be right - since the default monitor resolution changed,
the screenshot now has a different size, too. I sent a patch here:
https://lists.gnu.org/archive/html/qemu-devel/2022-02/msg04473.html
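
For illustration only, here is a minimal sketch of how the size could be read from a screenshot header instead of being compared against a baked-in string. It assumes the screenshot is a binary PPM (P6) file whose second header line holds "width height", which is what the b'1280 800\n' comparison in the failure message suggests. The helper name read_ppm_size and the in-memory sample header are hypothetical and not taken from the actual test or from the linked patch.

```python
import io


def read_ppm_size(stream):
    """Return (width, height) from the header of a binary PPM (P6) stream.

    Hypothetical helper: assumes the magic number is on the first line and
    the dimensions are on the second line, with no comment lines between.
    """
    magic = stream.readline()
    if magic != b"P6\n":
        raise ValueError("not a binary PPM file: %r" % magic)
    width, height = (int(value) for value in stream.readline().split())
    return width, height


# Made-up in-memory header standing in for a real screendump file.
sample = io.BytesIO(b"P6\n1280 800\n255\n")
size = read_ppm_size(sample)
```

A test could then compare the parsed size against whatever resolution it expects, so that a changed default shows up as a clear width/height mismatch.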

> Not sure about the timeout on the boot test: the avocado log
> shows it booting at least as far as
> "Kernel 5.3.7-301.fc31.s390x on an s390x (ttysclp0)"
> and then there's no further output until the timeout.
> Unfortunately the avocado log doesn't seem to include useful
> information like "this is the string we were waiting to see", so
> I'm not sure exactly what's gone wrong there.
>
> (I continue to find the Avocado tests rather opaque: when you
> get a series of green OK's that's fine, but when you get a failure
> it's often non-obvious why it failed or how to do simple things
> like "rerun just that one failed test" or "run the failing command,
> interactively on the command line".)

For me, it's even worse with the tests in tests/avocado/boot_linux.py: none
of them work on my local laptop, so I have always ignored them until now.
FWIW, I'm seeing this Python backtrace in the log:

Reproduced traceback from: /home/thuth/tmp/qemu-build/tests/venv/lib64/python3.6/site-packages/avocado/core/test.py:770
Traceback (most recent call last):
  File "/home/thuth/tmp/qemu-build/tests/avocado/boot_linux.py", line 30, in test_pc_i440fx_tcg
    self.launch_and_wait(set_up_ssh_connection=False)
  File "/home/thuth/tmp/qemu-build/tests/avocado/avocado_qemu/__init__.py", line 636, in launch_and_wait
    cloudinit.wait_for_phone_home(('0.0.0.0', self.phone_home_port), self.name)
  File "/home/thuth/tmp/qemu-build/tests/venv/lib64/python3.6/site-packages/avocado/utils/cloudinit.py", line 192, in wait_for_phone_home
    s = PhoneHomeServer(address, instance_id)
  File "/home/thuth/tmp/qemu-build/tests/venv/lib64/python3.6/site-packages/avocado/utils/cloudinit.py", line 173, in __init__
    HTTPServer.__init__(self, address, PhoneHomeServerHandler)
  File "/usr/lib64/python3.6/socketserver.py", line 456, in __init__
    self.server_bind()
  File "/usr/lib64/python3.6/http/server.py", line 136, in server_bind
    socketserver.TCPServer.server_bind(self)
  File "/usr/lib64/python3.6/socketserver.py", line 470, in server_bind
    self.socket.bind(self.server_address)
TypeError: an integer is required (got type NoneType)

... no clue how to debug these problems, though.
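
One possible reading of the backtrace, offered as a guess rather than a diagnosis: the address tuple handed to socket.bind() is ('0.0.0.0', self.phone_home_port), so a TypeError about NoneType would be consistent with phone_home_port being None at that point. The sketch below only reproduces that class of error with the standard library. The helper name try_bind is made up, and it binds to 127.0.0.1 rather than 0.0.0.0 to keep the example local.

```python
import socket


def try_bind(port):
    """Try to bind a TCP socket to the given port and describe the outcome."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind(("127.0.0.1", port))
        except TypeError as exc:
            # The exact message text differs between Python versions.
            return "TypeError: %s" % exc
        return "bound to port %d" % sock.getsockname()[1]


# Passing None as the port mimics an unset phone-home port (an assumption).
none_result = try_bind(None)
```

Calling try_bind(0) instead lets the kernel pick a free port, which is the non-failing case. Why the port would be unset in the first place is not something this sketch can answer.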

> The 090 failure didn't cause the merge to be rejected because
> in commit 333168efe5c8 we disabled both these tests when
> running on GitLab.
>
> Suggestion: we should either disable tests entirely (except
> for manual "I want to run this known-flaky test") or not at
> all, rather than disabling them only on GitLab. If I'm running
> 'make check-acceptance' locally I don't want to be distracted
> by tests we know to be dodgy, any more than if I were running
> the CI on GitLab.

IIRC I only saw the occasional hangs of the test on GitLab, and never on my
local host ... but I see your point ... I'm fine if we replace the
@skipIf(os.getenv('GITLAB_CI')...) there with a
@skipUnless(os.getenv('AVOCADO_ALLOW_FLAKY_TESTS')...) or something similar.
Would you have some spare time to write such a patch?
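
Roughly what I have in mind, as a standalone sketch using the plain unittest decorator and not the actual patch: the variable name AVOCADO_ALLOW_FLAKY_TESTS is just the placeholder from above, and the class, the test names and the run_example() helper are made up for illustration.

```python
import os
import unittest


class FlakyExample(unittest.TestCase):
    """Made-up test case showing an opt-in gate for a known-flaky test."""

    @unittest.skipUnless(os.getenv('AVOCADO_ALLOW_FLAKY_TESTS'),
                         'Test is unstable; set AVOCADO_ALLOW_FLAKY_TESTS to run it')
    def test_flaky(self):
        # Stand-in for a test that hangs occasionally.
        self.assertTrue(True)

    def test_stable(self):
        # Always runs, regardless of the environment variable.
        self.assertEqual(1 + 1, 2)


def run_example():
    """Run the example test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(FlakyExample)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Note that the decorator argument is evaluated when the class is defined, so the variable has to be set in the environment before the test module is loaded. With the variable unset, the flaky test is skipped everywhere, locally and in the CI alike.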
Thomas