UDP recvmsg blocks after select(), 2.6 bug?

From: Joris van Rantwijk @ 2004-10-06 14:52 UTC
  To: linux-kernel

Hello,

I have a problem where the sequence of events is as follows:
 - application does select() on a UDP socket descriptor
 - select returns success with descriptor ready for reading
 - application does recvfrom() on this descriptor and this recvfrom()
   blocks forever

My understanding of POSIX is limited, but it seems to me that a read call
must never block after select just said that it's ok to read from the
descriptor. So any such behaviour would be a kernel bug.

This problem recurs, but only about once per week on average, so it is
hard to debug. It is nonetheless a real problem: strace output confirms
that the sequence of events is exactly as described above. My kernel
version is 2.6.7.

From a brief look at the kernel UDP code, I suspect a problem in
net/ipv4/udp.c, udp_recvmsg(): it reads the first available datagram
from the queue, then checks the UDP checksum. If the UDP checksum fails at
this point, the datagram is discarded and the process blocks until the next
datagram arrives.
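
The suspected interaction can be written down as a toy model. This is not
kernel code; struct toy_queue and both functions are invented for the
example. It also assumes that readiness is reported as soon as anything
is queued, before the checksum has been verified, which is exactly the
part I have not confirmed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code: a receive queue whose entries only
 * record whether the datagram's UDP checksum is valid. */
struct toy_queue {
    const bool *csum_ok;  /* csum_ok[i] is true if datagram i is intact */
    size_t head;          /* index of the next datagram to dequeue */
    size_t count;         /* total number of queued datagrams */
};

/* Assumed select() behaviour: readable as soon as anything is queued,
 * without looking at any checksum. */
bool toy_poll_readable(const struct toy_queue *q)
{
    return q->head < q->count;
}

/* Assumed blocking-receive behaviour: drop datagrams with a bad
 * checksum and keep going. Returns true if a datagram was delivered,
 * false if the queue ran dry, which is where a real blocking call
 * would go to sleep. */
bool toy_recv(struct toy_queue *q)
{
    while (q->head < q->count) {
        if (q->csum_ok[q->head++])
            return true;   /* valid datagram delivered to the caller */
        /* bad checksum: discard and look at the next one */
    }
    return false;          /* nothing left: the caller would block here */
}
```

With a queue holding a single corrupted datagram, toy_poll_readable()
returns true and toy_recv() then returns false, which matches the hang
seen in the strace output.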

Could someone please help me track down this problem?
Am I correct in my reasoning that the select() -> recvmsg() sequence must
never block?
If yes, is it possible that this problem is triggered by a failed UDP
checksum in the udp_recvmsg() function?
If yes, can we do something to fix this?

Thanks,
  Joris.
