On 2022/6/14 10:14 PM, Dr. David Alan Gilbert wrote:
> I don't think we can tell which one of them triggered the error; so the
> only thing I can suggest is that we document the need for optmem_max
> setting; I wonder how we get a better answer than 'a few 100KB'?
> I guess it's something like the number of packets inflight *
> sizeof(cmsghdr) ?
>
> Dave

Three cases with errno ENOBUFS are described in the official doc
(https://www.kernel.org/doc/html/v5.12/networking/msg_zerocopy.html):

1. The socket option was not set
2. The socket exceeds its optmem limit
3. The user exceeds its ulimit on locked pages

For case 1, if the code logic is correct, this possibility can be ignored.

For case 2, I asked a kernel developer about the reason for "a few 100KB".
He said that the recommended value is meant to improve the performance of
zero_copy send. If the NIC sends data slower than the data is generated,
then even with optmem set to 100KB there is still a chance that sendmsg
returns with errno ENOBUFS.

For case 3, if I do not set max locked memory for qemu, the max locked
memory is unlimited. When I did set max locked memory for qemu, I found
that once the memory usage exceeds the limit, OOM occurs. Does this mean
that sendmsg cannot return with errno ENOBUFS at all when the user exceeds
its ulimit on locked pages?

If the above is true, can we treat the errno as case 2?

I modified the code logic to call sendmsg again when the errno is ENOBUFS
and set optmem back to the initial 20KB
(echo 20480 > /proc/sys/net/core/optmem_max). With that, the multifd
zero_copy migration goes well.

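To show the retry idea on its own, outside of QEMU, here is a minimal sketch. It is not the QEMU code: send_fn, send_retry_enobufs, fake_send and the max_retries bound are hypothetical names made up for this illustration. It differs from the unbounded "goto retry" in that it gives up after a fixed number of ENOBUFS failures.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical callback: one send attempt, returning bytes sent or -1
 * with errno set, in the style of sendmsg(2). */
typedef ssize_t (*send_fn)(void *opaque, const void *buf, size_t len);

/* Repeat the send while it fails with ENOBUFS, allowing at most
 * max_retries extra attempts. Returns the result of the last attempt;
 * errno is left as set by that attempt. */
static ssize_t send_retry_enobufs(send_fn fn, void *opaque,
                                  const void *buf, size_t len,
                                  int max_retries)
{
    int retries = 0;

    for (;;) {
        ssize_t ret = fn(opaque, buf, len);

        if (ret >= 0 || errno != ENOBUFS || retries >= max_retries) {
            return ret;
        }
        retries++;
    }
}

/* Fake sender used only for illustration: fails with ENOBUFS while the
 * counter behind opaque is positive, then reports the buffer as sent. */
static ssize_t fake_send(void *opaque, const void *buf, size_t len)
{
    int *remaining_failures = opaque;

    (void)buf;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        errno = ENOBUFS;
        return -1;
    }
    return (ssize_t)len;
}
```

With a real socket, the callback would wrap sendmsg() with MSG_ZEROCOPY. I guess a real retry loop might also want to read the completion notifications from the socket error queue (recvmsg() with MSG_ERRQUEUE) between attempts rather than spin, but I have not tested that.
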
Here are the changes I made to the code:

Signed-off-by: chuang xu
---
 io/channel-socket.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/io/channel-socket.c b/io/channel-socket.c
index dc9c165de1..9267f55a1d 100644
--- a/io/channel-socket.c
+++ b/io/channel-socket.c
@@ -595,9 +595,7 @@ static ssize_t qio_channel_socket_writev(QIOChannel *ioc,
 #ifdef QEMU_MSG_ZEROCOPY
         case ENOBUFS:
             if (sflags & MSG_ZEROCOPY) {
-                error_setg_errno(errp, errno,
-                                 "Process can't lock enough memory for using MSG_ZEROCOPY");
-                return -1;
+                goto retry;
             }
             break;
 #endif
--

Dave, what's your take?

Best Regards,
chuang xu