On Fri, 30 Jul 2021 04:40:45 -0300
Leonardo Bras wrote:

> From source host viewpoint, losing a connection during migration will
> cause the sockets to get stuck in sendmsg() syscall, waiting for
> the receiving side to reply.
>
> In migration, yank works by shutting down the migration QIOChannel fd.
> This causes a failure in the next sendmsg() for that fd, and the whole
> migration gets cancelled.
>
> In multifd, due to having multiple sockets in multiple threads,
> on a connection loss there will be extra sockets stuck in sendmsg(),
> and because they will be holding their own mutex, there is a good chance
> the main migration thread can get stuck in multifd_send_pages()
> waiting for one of those mutexes.
>
> While it's waiting, the main migration thread can't run sendmsg() on
> its fd, and therefore can't cause the migration to be cancelled, thus
> causing yank not to work.
>
> Fix this by shutting down all migration fds (including multifd ones),
> so no thread gets stuck in sendmsg() while holding a lock, and thus
> allowing the main migration thread to properly cancel migration when
> yank is used.
>
> There is no need to do the same procedure for yank to work in the
> receiving host since ops->recv_pages() is kept outside the mutex protected
> code in multifd_recv_thread().
>
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1970337
> Reported-by: Li Xiaohui
> Signed-off-by: Leonardo Bras
> ---

Hi,
There is an easier explanation: I forgot the send side of multifd
altogether (I thought it was covered by migration_channel_connect()).
So yank won't actually shutdown() the multifd sockets on the send side.

In the bug report you wrote
> (As a test, I called qio_channel_shutdown() in every multifd iochannel
> and yank worked just fine, but I could not retry migration, because it
> was still 'ongoing')

That sounds like a bug in the error handling for multifd. But from a
quick look at the code, it should properly fail the migration.

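To illustrate why shutting down every socket matters, here is a minimal
standalone sketch in Python. It is not QEMU code: the names sender() and
yank_demo() are made up for this example, a socketpair stands in for one
multifd channel, and a plain threading.Lock stands in for the per-channel
mutex. One thread holds the lock while blocked in a send; the main thread
cannot take the lock until the socket is shut down from outside.

```python
import socket
import threading


def sender(sock, lock, holding, outcome):
    """Hold the channel lock and keep sending until the send fails."""
    payload = b"\0" * 65536
    with lock:
        holding.set()  # tell the main thread the lock is now held
        try:
            while True:
                # Blocks once the socket buffer is full, because the
                # peer never reads (stand-in for a lost connection).
                sock.sendall(payload)
        except OSError as exc:
            outcome["error"] = exc


def yank_demo():
    # Hypothetical stand-in for one multifd channel.
    src, dst = socket.socketpair()
    lock = threading.Lock()
    holding = threading.Event()
    outcome = {}

    worker = threading.Thread(target=sender,
                              args=(src, lock, holding, outcome))
    worker.start()
    holding.wait()

    # The "main thread" cannot get the lock while the sender holds it.
    stuck = not lock.acquire(timeout=0.5)

    # The yank step: shut the socket down from another thread so the
    # send fails and the sender leaves the locked region.
    src.shutdown(socket.SHUT_RDWR)

    recovered = lock.acquire(timeout=5)
    if recovered:
        lock.release()
    worker.join(timeout=5)
    src.close()
    dst.close()
    return stuck, recovered, outcome.get("error")


if __name__ == "__main__":
    print(yank_demo())
```

The sender only releases the lock because its own socket was shut down.
Shutting down some other socket would leave it holding the lock.
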
BTW: You can shut down outgoing sockets from outside of QEMU with the
'ss' utility, like this: 'sudo ss -K dst dport = '

Regards,
Lukas Straub