From: "Alex Bennée" <alex.bennee@linaro.org>
To: qemu-devel@nongnu.org
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
"Thomas Huth" <thuth@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"Mark Cave-Ayland" <mark.cave-ayland@ilande.co.uk>,
"Philippe Mathieu-Daudé" <pmathieu@redhat.com>,
"Li Feng" <fengli@smartx.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>
Subject: tests/test-char hangs in CI (Centos7 with check-crypto-old-nettle)
Date: Tue, 15 Dec 2020 20:31:37 +0000 [thread overview]
Message-ID: <87eejqu6y9.fsf@linaro.org> (raw)
Hi,
Over the last few weeks we've seen a steadily more common failure in
test-char on CI. It seems to only manifest itself on the
check-crypto-old-nettle build set and fails with a message like:
Running test test-char
Unexpected error in object_property_try_add() at ../qom/object.c:1219:
attempt to add duplicate property 'serial-id' to object (type 'container')
ERROR test-char - too few tests run (expected 38, got 9)
make: *** [run-test-87] Error 1
This would point the finger at test-char /char/serial
I've managed to re-create the crash on Centos7 (x86_64) with the
following build config:
'../../configure' '--disable-gcrypt' '--enable-nettle' '--target-list=x86_64-softmmu,x86_64-linux-user'
The failure rate is something in the region of 1% which I confirmed by
spamming it with my retry.py script:
retry.py -n 100 -c -- ./tests/test-char
Amusingly attempts to hit it just concentrating on the /char/serial test
with:
retry.py -n 1000 -c -- ./tests/test-char -p /char/serial
passed with flying colours. It seems there must be an interaction
between the tests to cause the failure. Attempts to valgrind out any
strangeness were met with valgrind hanging on exit but otherwise
passing. You can avoid the hang in valgrind by commenting out:
--8<---------------cut here---------------start------------->8---
modified tests/test-char.c
@@ -1434,11 +1434,11 @@ static void char_hotswap_test(void)
g_free(chr_args);
}
-static SocketAddress tcpaddr = {
- .type = SOCKET_ADDRESS_TYPE_INET,
- .u.inet.host = (char *)"127.0.0.1",
- .u.inet.port = (char *)"0",
-};
+/* static SocketAddress tcpaddr = { */
+/* .type = SOCKET_ADDRESS_TYPE_INET, */
+/* .u.inet.host = (char *)"127.0.0.1", */
+/* .u.inet.port = (char *)"0", */
+/* }; */
#ifndef WIN32
static SocketAddress unixaddr = {
.type = SOCKET_ADDRESS_TYPE_UNIX,
@@ -1534,12 +1534,12 @@ int main(int argc, char **argv)
g_test_add_data_func("/char/socket/client/dupid-reconnect/" # name, \
&client8 ##name, char_socket_client_dupid_test)
- if (has_ipv4) {
- SOCKET_SERVER_TEST(tcp, &tcpaddr);
- SOCKET_CLIENT_TEST(tcp, &tcpaddr);
- g_test_add_data_func("/char/socket/server/two-clients/tcp", &tcpaddr,
- char_socket_server_two_clients_test);
- }
+ /* if (has_ipv4) { */
+ /* SOCKET_SERVER_TEST(tcp, &tcpaddr); */
+ /* SOCKET_CLIENT_TEST(tcp, &tcpaddr); */
+ /* g_test_add_data_func("/char/socket/server/two-clients/tcp", &tcpaddr, */
+ /* char_socket_server_two_clients_test); */
+ /* } */
#ifndef WIN32
SOCKET_SERVER_TEST(unix, &unixaddr);
SOCKET_CLIENT_TEST(unix, &unixaddr);
--8<---------------cut here---------------end--------------->8---
At that point valgrind doesn't have any smoking guns of corruption but
minor leaks and also it seems a bunch of fd's being used up:
[alex@centos7 crypto-old-nettle]$ valgrind --tool=memcheck --leak-check=full --track-origins=yes --track-fds=yes ./tests/test-char
==28263== Memcheck, a memory error detector
==28263== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==28263== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==28263== Command: ./tests/test-char
==28263==
/char/null: OK
/char/invalid: OK
/char/ringbuf: OK
/char/mux: OK
/char/stdio: OK
/char/pipe: OK
/char/file: OK
/char/file-fifo: OK
/char/udp: OK
/char/serial: OK
/char/hotswap: OK
/char/websocket: OK
/char/socket/server/mainloop/unix: OK
/char/socket/server/wait-conn/unix: OK
/char/socket/server/mainloop-fdpass/unix: OK
/char/socket/server/wait-conn-fdpass/unix: OK
/char/socket/server/two-clients/unix: OK
/char/socket/client/mainloop/unix: OK
/char/socket/client/wait-conn/unix: OK
/char/socket/client/mainloop-reconnect/unix: OK
/char/socket/client/wait-conn-reconnect/unix: OK
/char/socket/client/mainloop-fdpass/unix: OK
/char/socket/client/wait-conn-fdpass/unix: OK
/char/socket/client/reconnect-error/unix: OK
==28263==
==28263== FILE DESCRIPTORS: 12 open at exit.
==28263== Open AF_UNIX socket 12: <unknown>
==28263== at 0x5B9EAE7: socket (syscall-template.S:81)
==28263== by 0x14B1AC: qemu_socket (osdep.c:472)
==28263== by 0x1562FB: unix_connect_saddr (qemu-sockets.c:985)
==28263== by 0x1575E8: socket_connect (qemu-sockets.c:1158)
==28263== by 0x12A1D2: qio_channel_socket_connect_sync (channel-socket.c:145)
==28263== by 0x13A7B8: tcp_chr_connect_client_task (char-socket.c:1123)
==28263== by 0x125640: qio_task_thread_worker (task.c:124)
==28263== by 0x14F245: qemu_thread_start (qemu-thread-posix.c:521)
==28263== by 0x588AEA4: start_thread (pthread_create.c:307)
==28263== by 0x5B9D96C: clone (clone.S:111)
==28263==
==28263== Open AF_UNIX socket 11: <unknown>
==28263== at 0x5B9EAE7: socket (syscall-template.S:81)
==28263== by 0x14B1AC: qemu_socket (osdep.c:472)
==28263== by 0x1562FB: unix_connect_saddr (qemu-sockets.c:985)
==28263== by 0x1575E8: socket_connect (qemu-sockets.c:1158)
==28263== by 0x12A1D2: qio_channel_socket_connect_sync (channel-socket.c:145)
==28263== by 0x13A7B8: tcp_chr_connect_client_task (char-socket.c:1123)
==28263== by 0x125640: qio_task_thread_worker (task.c:124)
==28263== by 0x14F245: qemu_thread_start (qemu-thread-posix.c:521)
==28263== by 0x588AEA4: start_thread (pthread_create.c:307)
==28263== by 0x5B9D96C: clone (clone.S:111)
==28263==
==28263== Open file descriptor 10:
==28263== at 0x5B9DC1D: eventfd (eventfd.c:28)
==28263== by 0x4EC6B06: g_wakeup_new (gwakeup.c:146)
==28263== by 0x4E808B3: g_main_context_new (gmain.c:656)
==28263== by 0x4E842C4: g_get_worker_context (gmain.c:5797)
==28263== by 0x4E84332: ref_unix_signal_handler_unlocked (gmain.c:5205)
==28263== by 0x4E84458: g_child_watch_source_new (gmain.c:5384)
==28263== by 0x4EA5E64: wait_for_child (gtestutils.c:2814)
==28263== by 0x4EA9993: g_test_trap_subprocess (gtestutils.c:3100)
==28263== by 0x11BCB3: char_stdio_test (test-char.c:125)
==28263== by 0x4EA92E0: test_case_run (gtestutils.c:2255)
==28263== by 0x4EA92E0: g_test_run_suite_internal (gtestutils.c:2339)
==28263== by 0x4EA919A: g_test_run_suite_internal (gtestutils.c:2351)
==28263== by 0x4EA947D: g_test_run_suite (gtestutils.c:2426)
==28263==
==28263== Open file descriptor 8:
==28263== at 0x5B9DC1D: eventfd (eventfd.c:28)
==28263== by 0x4EC6B06: g_wakeup_new (gwakeup.c:146)
==28263== by 0x4E808B3: g_main_context_new (gmain.c:656)
==28263== by 0x4E8098C: g_main_context_default (gmain.c:692)
==28263== by 0x4E818F4: g_source_attach (gmain.c:1194)
==28263== by 0x158D5B: qemu_init_main_loop (main-loop.c:167)
==28263== by 0x118FD7: main (test-char.c:1453)
==28263==
==28263== Open file descriptor 7:
==28263== at 0x5B9DC1D: eventfd (eventfd.c:28)
==28263== by 0x16C76B: event_notifier_init (event_notifier-posix.c:41)
==28263== by 0x1626A7: aio_context_new (async.c:508)
==28263== by 0x158CF9: qemu_init_main_loop (main-loop.c:159)
==28263== by 0x118FD7: main (test-char.c:1453)
==28263==
==28263== Open file descriptor 6:
==28263== at 0x5B9DEB7: epoll_create1 (syscall-template.S:81)
==28263== by 0x165B5D: fdmon_epoll_setup (fdmon-epoll.c:151)
==28263== by 0x16269D: aio_context_new (async.c:506)
==28263== by 0x158CF9: qemu_init_main_loop (main-loop.c:159)
==28263== by 0x118FD7: main (test-char.c:1453)
==28263==
==28263== Open file descriptor 5:
==28263== at 0x5B9DC1D: eventfd (eventfd.c:28)
==28263== by 0x16C76B: event_notifier_init (event_notifier-posix.c:41)
==28263== by 0x1626A7: aio_context_new (async.c:508)
==28263== by 0x1588AF: iohandler_init.part.0 (main-loop.c:551)
==28263== by 0x158C23: iohandler_init (main-loop.c:575)
==28263== by 0x158C23: qemu_set_fd_handler (main-loop.c:572)
==28263== by 0x158CF1: qemu_signal_init (main-loop.c:109)
==28263== by 0x158CF1: qemu_init_main_loop (main-loop.c:154)
==28263== by 0x118FD7: main (test-char.c:1453)
==28263==
==28263== Open file descriptor 4:
==28263== at 0x5B9DEB7: epoll_create1 (syscall-template.S:81)
==28263== by 0x165B5D: fdmon_epoll_setup (fdmon-epoll.c:151)
==28263== by 0x16269D: aio_context_new (async.c:506)
==28263== by 0x1588AF: iohandler_init.part.0 (main-loop.c:551)
==28263== by 0x158C23: iohandler_init (main-loop.c:575)
==28263== by 0x158C23: qemu_set_fd_handler (main-loop.c:572)
==28263== by 0x158CF1: qemu_signal_init (main-loop.c:109)
==28263== by 0x158CF1: qemu_init_main_loop (main-loop.c:154)
==28263== by 0x118FD7: main (test-char.c:1453)
==28263==
==28263== Open file descriptor 3:
==28263== at 0x5B97C89: syscall (syscall.S:38)
==28263== by 0x16C0C1: qemu_signalfd (compatfd.c:103)
==28263== by 0x158CC7: qemu_signal_init (main-loop.c:101)
==28263== by 0x158CC7: qemu_init_main_loop (main-loop.c:154)
==28263== by 0x118FD7: main (test-char.c:1453)
==28263==
==28263== Open file descriptor 2: /dev/pts/1
==28263== <inherited from parent>
==28263==
==28263== Open file descriptor 1: /dev/pts/1
==28263== <inherited from parent>
==28263==
==28263== Open file descriptor 0: /dev/pts/1
==28263== <inherited from parent>
==28263==
==28263==
==28263== HEAP SUMMARY:
==28263== in use at exit: 62,568 bytes in 537 blocks
==28263== total heap usage: 5,803 allocs, 5,266 frees, 2,400,984 bytes allocated
==28263==
==28263== 576 bytes in 1 blocks are possibly lost in loss record 352 of 359
==28263== at 0x4C2C089: calloc (vg_replace_malloc.c:762)
==28263== by 0x4012784: allocate_dtv (dl-tls.c:317)
==28263== by 0x4012784: _dl_allocate_tls (dl-tls.c:533)
==28263== by 0x588B87B: allocate_stack (allocatestack.c:539)
==28263== by 0x588B87B: pthread_create@@GLIBC_2.2.5 (pthread_create.c:447)
==28263== by 0x1500E5: qemu_thread_create (qemu-thread-posix.c:558)
==28263== by 0x1619D8: rcu_init_complete (rcu.c:379)
==28263== by 0x16FC9C: __libc_csu_init (in /home/alex/lsrc/qemu.git/builds/crypto-old-nettle/tests/test-char)
==28263== by 0x5AC14E4: (below main) (libc-start.c:225)
==28263==
==28263== 576 bytes in 1 blocks are possibly lost in loss record 353 of 359
==28263== at 0x4C2C089: calloc (vg_replace_malloc.c:762)
==28263== by 0x4012784: allocate_dtv (dl-tls.c:317)
==28263== by 0x4012784: _dl_allocate_tls (dl-tls.c:533)
==28263== by 0x588B87B: allocate_stack (allocatestack.c:539)
==28263== by 0x588B87B: pthread_create@@GLIBC_2.2.5 (pthread_create.c:447)
==28263== by 0x4EC7C81: g_system_thread_new (gthread-posix.c:1177)
==28263== by 0x4EAA82E: g_thread_new_internal (gthread.c:874)
==28263== by 0x4EAA8E7: g_thread_new (gthread.c:827)
==28263== by 0x4E842E0: g_get_worker_context (gmain.c:5798)
==28263== by 0x4E84332: ref_unix_signal_handler_unlocked (gmain.c:5205)
==28263== by 0x4E84458: g_child_watch_source_new (gmain.c:5384)
==28263== by 0x4EA5E64: wait_for_child (gtestutils.c:2814)
==28263== by 0x4EA9993: g_test_trap_subprocess (gtestutils.c:3100)
==28263== by 0x11BCB3: char_stdio_test (test-char.c:125)
==28263==
==28263== LEAK SUMMARY:
==28263== definitely lost: 0 bytes in 0 blocks
==28263== indirectly lost: 0 bytes in 0 blocks
==28263== possibly lost: 1,152 bytes in 2 blocks
==28263== still reachable: 61,416 bytes in 535 blocks
==28263== suppressed: 0 bytes in 0 blocks
==28263== Reachable blocks (those to which a pointer was found) are not shown.
==28263== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==28263==
==28263== For lists of detected and suppressed errors, rerun with: -s
==28263== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
I also have a core dump of the binary when it fails but I suspect it
requires someone with a bit more deftness of QOM to see if anything can
be devised from its guts.
So anyone want to hazard a guess at what's going on?
--
Alex Bennée
next reply other threads:[~2020-12-15 20:54 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-15 20:31 Alex Bennée [this message]
2020-12-16 15:46 ` tests/test-char hangs in CI (Centos7 with check-crypto-old-nettle) Alex Bennée
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87eejqu6y9.fsf@linaro.org \
--to=alex.bennee@linaro.org \
--cc=berrange@redhat.com \
--cc=fengli@smartx.com \
--cc=marcandre.lureau@redhat.com \
--cc=mark.cave-ayland@ilande.co.uk \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=pmathieu@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.