On Mon, Sep 27, 2021 at 05:17:01PM +0000, Raphael Norwitz wrote: > In the vhost-user-blk-test, as of now there is nothing stoping > vhost-user-blk in QEMU writing to the socket right after forking off the > storage daemon before it has a chance to come up properly, leaving the > test hanging forever. This intermittently hanging test has caused QEMU > automation failures reported multiple times on the mailing list [1]. > > This change makes the storage-daemon notify the vhost-user-blk-test > that it is fully initialized and ready to handle client connections by > creating a pidfile on initialiation. This ensures that the storage-daemon > backend won't miss vhost-user messages and thereby resolves the hang. > > [1] https://lore.kernel.org/qemu-devel/CAFEAcA8kYpz9LiPNxnWJAPSjc=nv532bEdyfynaBeMeohqBp3A@mail.gmail.com/ The following hack triggers the hang. The stack traces match what Peter posted in the thread linked above. I'll take a closer look. --- diff --git a/storage-daemon/qemu-storage-daemon.c b/storage-daemon/qemu-storage-daemon.c index 10a1a33761..50c9caa1b2 100644 --- a/storage-daemon/qemu-storage-daemon.c +++ b/storage-daemon/qemu-storage-daemon.c @@ -316,6 +316,8 @@ int main(int argc, char *argv[]) signal(SIGPIPE, SIG_IGN); #endif + sleep(5); + error_init(argv[0]); qemu_init_exec_dir(argv[0]); os_setup_signal_handling();