qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: tests/test-char hangs in CI (Centos7 with check-crypto-old-nettle)


From: Alex Bennée
Subject: Re: tests/test-char hangs in CI (Centos7 with check-crypto-old-nettle)
Date: Wed, 16 Dec 2020 15:46:52 +0000
User-agent: mu4e 1.5.7; emacs 28.0.50

Alex Bennée <alex.bennee@linaro.org> writes:

> Hi,
>
> Over the last few weeks we've seen a steadily more common failure in
> test-char on CI. It seems to only manifest itself on the
> check-crypto-old-nettle build set and fails with a message like:
>
>   Running test test-char
>   Unexpected error in object_property_try_add() at ../qom/object.c:1219:
>   attempt to add duplicate property 'serial-id' to object (type 'container')
>   ERROR test-char - too few tests run (expected 38, got 9)
>   make: *** [run-test-87] Error 1

This is fixed by Eduardo's patch:

  Subject: [PATCH 0/2] Fix test-char reference counting bug
  Date: Tue, 15 Dec 2020 17:41:31 -0500
  Message-Id: <20201215224133.3545901-1-ehabkost@redhat.com>

which also fixes:
>
> passed with flying colours. It seems there must be an interaction
> between the tests to cause the failure. Attempts to valgrind out any
> strangeness were met with valgrind hanging on exit but otherwise
> passing. You can avoid the hang in valgrind by commenting out:
>
> --8<---------------cut here---------------start------------->8---
> modified   tests/test-char.c
> @@ -1434,11 +1434,11 @@ static void char_hotswap_test(void)
>      g_free(chr_args);
>  }
>  
> -static SocketAddress tcpaddr = {
> -    .type = SOCKET_ADDRESS_TYPE_INET,
> -    .u.inet.host = (char *)"127.0.0.1",
> -    .u.inet.port = (char *)"0",
> -};
> +/* static SocketAddress tcpaddr = { */
> +/*     .type = SOCKET_ADDRESS_TYPE_INET, */
> +/*     .u.inet.host = (char *)"127.0.0.1", */
> +/*     .u.inet.port = (char *)"0", */
> +/* }; */
>  #ifndef WIN32
>  static SocketAddress unixaddr = {
>      .type = SOCKET_ADDRESS_TYPE_UNIX,
> @@ -1534,12 +1534,12 @@ int main(int argc, char **argv)
>      g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
>                           &client8 ##name, char_socket_client_dupid_test)
>  
> -    if (has_ipv4) {
> -        SOCKET_SERVER_TEST(tcp, &tcpaddr);
> -        SOCKET_CLIENT_TEST(tcp, &tcpaddr);
> -        g_test_add_data_func("/char/socket/server/two-clients/tcp", &tcpaddr,
> -                             char_socket_server_two_clients_test);
> -    }
> +    /* if (has_ipv4) { */
> +    /*     SOCKET_SERVER_TEST(tcp, &tcpaddr); */
> +    /*     SOCKET_CLIENT_TEST(tcp, &tcpaddr); */
> +    /*     g_test_add_data_func("/char/socket/server/two-clients/tcp", 
> &tcpaddr, */
> +    /*                          char_socket_server_two_clients_test); */
> +    /* } */
>  #ifndef WIN32
>      SOCKET_SERVER_TEST(unix, &unixaddr);
>      SOCKET_CLIENT_TEST(unix, &unixaddr);
> --8<---------------cut here---------------end--------------->8---

the need for the above, however we still see the "leaked" fd's and I
also noticed you can't run the same test in parallel in the same
directory (soak test and valgrind) because of a non-unique socket
address in:

#ifndef WIN32
static SocketAddress unixaddr = {
    .type = SOCKET_ADDRESS_TYPE_UNIX,
    .u.q_unix.path = (char *)"test-char.sock",
};
#endif

>
> At that point valgrind doesn't have any smoking guns of corruption but
> minor leaks and also it seems a bunch of fd's being used up:
>
>   [alex@centos7 crypto-old-nettle]$ valgrind --tool=memcheck 
> --leak-check=full --track-origins=yes --track-fds=yes ./tests/test-char
>   ==28263== Memcheck, a memory error detector
>   ==28263== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
>   ==28263== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
>   ==28263== Command: ./tests/test-char
>   ==28263==
>   /char/null: OK
>   /char/invalid: OK
>   /char/ringbuf: OK
>   /char/mux: OK
>   /char/stdio: OK
>   /char/pipe: OK
>   /char/file: OK
>   /char/file-fifo: OK
>   /char/udp: OK
>   /char/serial: OK
>   /char/hotswap: OK
>   /char/websocket: OK
>   /char/socket/server/mainloop/unix: OK
>   /char/socket/server/wait-conn/unix: OK
>   /char/socket/server/mainloop-fdpass/unix: OK
>   /char/socket/server/wait-conn-fdpass/unix: OK
>   /char/socket/server/two-clients/unix: OK
>   /char/socket/client/mainloop/unix: OK
>   /char/socket/client/wait-conn/unix: OK
>   /char/socket/client/mainloop-reconnect/unix: OK
>   /char/socket/client/wait-conn-reconnect/unix: OK
>   /char/socket/client/mainloop-fdpass/unix: OK
>   /char/socket/client/wait-conn-fdpass/unix: OK
>   /char/socket/client/reconnect-error/unix: OK
>   ==28263==
>   ==28263== FILE DESCRIPTORS: 12 open at exit.
>   ==28263== Open AF_UNIX socket 12: <unknown>
>   ==28263==    at 0x5B9EAE7: socket (syscall-template.S:81)
>   ==28263==    by 0x14B1AC: qemu_socket (osdep.c:472)
>   ==28263==    by 0x1562FB: unix_connect_saddr (qemu-sockets.c:985)
>   ==28263==    by 0x1575E8: socket_connect (qemu-sockets.c:1158)
>   ==28263==    by 0x12A1D2: qio_channel_socket_connect_sync 
> (channel-socket.c:145)
>   ==28263==    by 0x13A7B8: tcp_chr_connect_client_task (char-socket.c:1123)
>   ==28263==    by 0x125640: qio_task_thread_worker (task.c:124)
>   ==28263==    by 0x14F245: qemu_thread_start (qemu-thread-posix.c:521)
>   ==28263==    by 0x588AEA4: start_thread (pthread_create.c:307)
>   ==28263==    by 0x5B9D96C: clone (clone.S:111)
>   ==28263==
>   ==28263== Open AF_UNIX socket 11: <unknown>
>   ==28263==    at 0x5B9EAE7: socket (syscall-template.S:81)
>   ==28263==    by 0x14B1AC: qemu_socket (osdep.c:472)
>   ==28263==    by 0x1562FB: unix_connect_saddr (qemu-sockets.c:985)
>   ==28263==    by 0x1575E8: socket_connect (qemu-sockets.c:1158)
>   ==28263==    by 0x12A1D2: qio_channel_socket_connect_sync 
> (channel-socket.c:145)
>   ==28263==    by 0x13A7B8: tcp_chr_connect_client_task (char-socket.c:1123)
>   ==28263==    by 0x125640: qio_task_thread_worker (task.c:124)
>   ==28263==    by 0x14F245: qemu_thread_start (qemu-thread-posix.c:521)
>   ==28263==    by 0x588AEA4: start_thread (pthread_create.c:307)
>   ==28263==    by 0x5B9D96C: clone (clone.S:111)
>   ==28263==
>   ==28263== Open file descriptor 10:
>   ==28263==    at 0x5B9DC1D: eventfd (eventfd.c:28)
>   ==28263==    by 0x4EC6B06: g_wakeup_new (gwakeup.c:146)
>   ==28263==    by 0x4E808B3: g_main_context_new (gmain.c:656)
>   ==28263==    by 0x4E842C4: g_get_worker_context (gmain.c:5797)
>   ==28263==    by 0x4E84332: ref_unix_signal_handler_unlocked (gmain.c:5205)
>   ==28263==    by 0x4E84458: g_child_watch_source_new (gmain.c:5384)
>   ==28263==    by 0x4EA5E64: wait_for_child (gtestutils.c:2814)
>   ==28263==    by 0x4EA9993: g_test_trap_subprocess (gtestutils.c:3100)
>   ==28263==    by 0x11BCB3: char_stdio_test (test-char.c:125)
>   ==28263==    by 0x4EA92E0: test_case_run (gtestutils.c:2255)
>   ==28263==    by 0x4EA92E0: g_test_run_suite_internal (gtestutils.c:2339)
>   ==28263==    by 0x4EA919A: g_test_run_suite_internal (gtestutils.c:2351)
>   ==28263==    by 0x4EA947D: g_test_run_suite (gtestutils.c:2426)
>   ==28263==
>   ==28263== Open file descriptor 8:
>   ==28263==    at 0x5B9DC1D: eventfd (eventfd.c:28)
>   ==28263==    by 0x4EC6B06: g_wakeup_new (gwakeup.c:146)
>   ==28263==    by 0x4E808B3: g_main_context_new (gmain.c:656)
>   ==28263==    by 0x4E8098C: g_main_context_default (gmain.c:692)
>   ==28263==    by 0x4E818F4: g_source_attach (gmain.c:1194)
>   ==28263==    by 0x158D5B: qemu_init_main_loop (main-loop.c:167)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 7:
>   ==28263==    at 0x5B9DC1D: eventfd (eventfd.c:28)
>   ==28263==    by 0x16C76B: event_notifier_init (event_notifier-posix.c:41)
>   ==28263==    by 0x1626A7: aio_context_new (async.c:508)
>   ==28263==    by 0x158CF9: qemu_init_main_loop (main-loop.c:159)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 6:
>   ==28263==    at 0x5B9DEB7: epoll_create1 (syscall-template.S:81)
>   ==28263==    by 0x165B5D: fdmon_epoll_setup (fdmon-epoll.c:151)
>   ==28263==    by 0x16269D: aio_context_new (async.c:506)
>   ==28263==    by 0x158CF9: qemu_init_main_loop (main-loop.c:159)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 5:
>   ==28263==    at 0x5B9DC1D: eventfd (eventfd.c:28)
>   ==28263==    by 0x16C76B: event_notifier_init (event_notifier-posix.c:41)
>   ==28263==    by 0x1626A7: aio_context_new (async.c:508)
>   ==28263==    by 0x1588AF: iohandler_init.part.0 (main-loop.c:551)
>   ==28263==    by 0x158C23: iohandler_init (main-loop.c:575)
>   ==28263==    by 0x158C23: qemu_set_fd_handler (main-loop.c:572)
>   ==28263==    by 0x158CF1: qemu_signal_init (main-loop.c:109)
>   ==28263==    by 0x158CF1: qemu_init_main_loop (main-loop.c:154)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 4:
>   ==28263==    at 0x5B9DEB7: epoll_create1 (syscall-template.S:81)
>   ==28263==    by 0x165B5D: fdmon_epoll_setup (fdmon-epoll.c:151)
>   ==28263==    by 0x16269D: aio_context_new (async.c:506)
>   ==28263==    by 0x1588AF: iohandler_init.part.0 (main-loop.c:551)
>   ==28263==    by 0x158C23: iohandler_init (main-loop.c:575)
>   ==28263==    by 0x158C23: qemu_set_fd_handler (main-loop.c:572)
>   ==28263==    by 0x158CF1: qemu_signal_init (main-loop.c:109)
>   ==28263==    by 0x158CF1: qemu_init_main_loop (main-loop.c:154)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 3:
>   ==28263==    at 0x5B97C89: syscall (syscall.S:38)
>   ==28263==    by 0x16C0C1: qemu_signalfd (compatfd.c:103)
>   ==28263==    by 0x158CC7: qemu_signal_init (main-loop.c:101)
>   ==28263==    by 0x158CC7: qemu_init_main_loop (main-loop.c:154)
>   ==28263==    by 0x118FD7: main (test-char.c:1453)
>   ==28263==
>   ==28263== Open file descriptor 2: /dev/pts/1
>   ==28263==    <inherited from parent>
>   ==28263==
>   ==28263== Open file descriptor 1: /dev/pts/1
>   ==28263==    <inherited from parent>
>   ==28263==
>   ==28263== Open file descriptor 0: /dev/pts/1
>   ==28263==    <inherited from parent>
>   ==28263==
>   ==28263==
>   ==28263== HEAP SUMMARY:
>   ==28263==     in use at exit: 62,568 bytes in 537 blocks
>   ==28263==   total heap usage: 5,803 allocs, 5,266 frees, 2,400,984 bytes 
> allocated
>   ==28263==
>   ==28263== 576 bytes in 1 blocks are possibly lost in loss record 352 of 359
>   ==28263==    at 0x4C2C089: calloc (vg_replace_malloc.c:762)
>   ==28263==    by 0x4012784: allocate_dtv (dl-tls.c:317)
>   ==28263==    by 0x4012784: _dl_allocate_tls (dl-tls.c:533)
>   ==28263==    by 0x588B87B: allocate_stack (allocatestack.c:539)
>   ==28263==    by 0x588B87B: pthread_create@@GLIBC_2.2.5 
> (pthread_create.c:447)
>   ==28263==    by 0x1500E5: qemu_thread_create (qemu-thread-posix.c:558)
>   ==28263==    by 0x1619D8: rcu_init_complete (rcu.c:379)
>   ==28263==    by 0x16FC9C: __libc_csu_init (in 
> /home/alex/lsrc/qemu.git/builds/crypto-old-nettle/tests/test-char)
>   ==28263==    by 0x5AC14E4: (below main) (libc-start.c:225)
>   ==28263==
>   ==28263== 576 bytes in 1 blocks are possibly lost in loss record 353 of 359
>   ==28263==    at 0x4C2C089: calloc (vg_replace_malloc.c:762)
>   ==28263==    by 0x4012784: allocate_dtv (dl-tls.c:317)
>   ==28263==    by 0x4012784: _dl_allocate_tls (dl-tls.c:533)
>   ==28263==    by 0x588B87B: allocate_stack (allocatestack.c:539)
>   ==28263==    by 0x588B87B: pthread_create@@GLIBC_2.2.5 
> (pthread_create.c:447)
>   ==28263==    by 0x4EC7C81: g_system_thread_new (gthread-posix.c:1177)
>   ==28263==    by 0x4EAA82E: g_thread_new_internal (gthread.c:874)
>   ==28263==    by 0x4EAA8E7: g_thread_new (gthread.c:827)
>   ==28263==    by 0x4E842E0: g_get_worker_context (gmain.c:5798)
>   ==28263==    by 0x4E84332: ref_unix_signal_handler_unlocked (gmain.c:5205)
>   ==28263==    by 0x4E84458: g_child_watch_source_new (gmain.c:5384)
>   ==28263==    by 0x4EA5E64: wait_for_child (gtestutils.c:2814)
>   ==28263==    by 0x4EA9993: g_test_trap_subprocess (gtestutils.c:3100)
>   ==28263==    by 0x11BCB3: char_stdio_test (test-char.c:125)
>   ==28263==
>   ==28263== LEAK SUMMARY:
>   ==28263==    definitely lost: 0 bytes in 0 blocks
>   ==28263==    indirectly lost: 0 bytes in 0 blocks
>   ==28263==      possibly lost: 1,152 bytes in 2 blocks
>   ==28263==    still reachable: 61,416 bytes in 535 blocks
>   ==28263==         suppressed: 0 bytes in 0 blocks
>   ==28263== Reachable blocks (those to which a pointer was found) are not 
> shown.
>   ==28263== To see them, rerun with: --leak-check=full --show-leak-kinds=all
>   ==28263==
>   ==28263== For lists of detected and suppressed errors, rerun with: -s
>   ==28263== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
>
> I also have a core dump of the binary when it fails but I suspect it
> requires someone with a bit more deftness of QOM to see if anything can
> be devised from its guts.
>
> So anyone want to hazard a guess at what's going on?


-- 
Alex Bennée



reply via email to

[Prev in Thread] Current Thread [Next in Thread]