* Kernel UDP behavior with missing destinations
@ 2019-05-15 19:56 Adam Urban
  2019-05-16 14:47 ` Willem de Bruijn
  0 siblings, 1 reply; 12+ messages in thread
From: Adam Urban @ 2019-05-15 19:56 UTC (permalink / raw)
  To: netdev

We have an application where we use sendmsg() to send (lots of) UDP
packets to multiple destinations over a single socket, repeatedly and
at a pretty constant rate, using IPv4.

In some cases, some of these destinations are no longer present on the
network, but we continue sending data to them anyway. A missing device
is usually a temporary situation, but it can last for
days/weeks/months.

We are seeing an issue where even packets sent to destinations that
are present on the network get dropped while the kernel performs ARP
updates.

We see a -1 EAGAIN (Resource temporarily unavailable) return value
from sendmsg() when this is happening:

sendmsg(72, {msg_name(16)={sa_family=AF_INET, sin_port=htons(1234),
sin_addr=inet_addr("10.1.2.3")}, msg_iov(1)=[{"\4\1"..., 96}],
msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = -1 EAGAIN (Resource
temporarily unavailable)
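
For reference, the send path boils down to something like this (a
simplified sketch; the helper name, addresses and sizes are
illustrative, not our actual code):

#include <stdint.h>
#include <stdio.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* One unconnected UDP socket, many destinations, fixed-rate sends. */
static int send_one(int fd, const char *ip, uint16_t port,
                    const void *buf, size_t len)
{
    struct sockaddr_in dst = { .sin_family = AF_INET,
                               .sin_port = htons(port) };
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg = { .msg_name = &dst, .msg_namelen = sizeof(dst),
                          .msg_iov = &iov, .msg_iovlen = 1 };

    inet_pton(AF_INET, ip, &dst.sin_addr);

    /* This is the call that intermittently returns -1 EAGAIN. */
    if (sendmsg(fd, &msg, MSG_NOSIGNAL) < 0) {
        perror("sendmsg");
        return -1;
    }
    return 0;
}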

Looking at packet captures, during this time you see the kernel ARPing
for the devices that aren't on the network, timing out, ARPing again,
timing out, and then finally ARPing a third time before the entry is
set back to the INCOMPLETE state (after very briefly being in FAILED).

"Good" packets don't start going out again until that third timeout
happens, and then they go out for about 1s until the 3s delay from ARP
kicks in again.

Interestingly, this isn't an all-or-nothing situation. With only a few
(2-3) devices missing, we don't run into this "blocking" situation and
data always goes out. But once 4 or more devices are missing, it
happens. Setting static ARP entries for the missing devices, even if
they are bogus, resolves the issue, but of course results in packets
with a bogus destination going out on the wire instead of getting
dropped by the kernel.

Can anyone explain why this is happening? I have tried tuning the
unres_qlen sysctl without effect, and will next try passing the
MSG_DONTWAIT flag to sendmsg() to see if that helps. But I want to
make sure I understand what is going on.

Are there any parameters we can tune so that UDP packets sent to
INCOMPLETE destinations are immediately dropped? What's the best way
to prevent a socket from becoming unavailable while ARP operations are
in progress (assuming ARP is the cause)?


* Re: Kernel UDP behavior with missing destinations
  2019-05-15 19:56 Kernel UDP behavior with missing destinations Adam Urban
@ 2019-05-16 14:47 ` Willem de Bruijn
  2019-05-16 15:43   ` Adam Urban
  2019-05-16 16:05   ` Eric Dumazet
  0 siblings, 2 replies; 12+ messages in thread
From: Willem de Bruijn @ 2019-05-16 14:47 UTC (permalink / raw)
  To: Adam Urban; +Cc: Network Development

On Wed, May 15, 2019 at 3:57 PM Adam Urban <adam.urban@appleguru.org> wrote:
>
> We have an application where we are use sendmsg() to send (lots of)
> UDP packets to multiple destinations over a single socket, repeatedly,
> and at a pretty constant rate using IPv4.
>
> In some cases, some of these destinations are no longer present on the
> network, but we continue sending data to them anyways. The missing
> devices are usually a temporary situation, but can last for
> days/weeks/months.
>
> We are seeing an issue where packets sent even to destinations that
> are present on the network are getting dropped while the kernel
> performs arp updates.
>
> We see a -1 EAGAIN (Resource temporarily unavailable) return value
> from the sendmsg() call when this is happening:
>
> sendmsg(72, {msg_name(16)={sa_family=AF_INET, sin_port=htons(1234),
> sin_addr=inet_addr("10.1.2.3")}, msg_iov(1)=[{"\4\1"..., 96}],
> msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = -1 EAGAIN (Resource
> temporarily unavailable)
>
> Looking at packet captures, during this time you see the kernel arping
> for the devices that aren't on the network, timing out, arping again,
> timing out, and then finally arping a 3rd time before setting the
> INCOMPLETE state again (very briefly being in a FAILED state).
>
> "Good" packets don't start going out again until the 3rd timeout
> happens, and then they go out for about 1s until the 3s delay from ARP
> happens again.
>
> Interestingly, this isn't an all or nothing situation. With only a few
> (2-3) devices missing, we don't run into this "blocking" situation and
> data always goes out. But once 4 or more devices are missing, it
> happens. Setting static ARP entries for the missing supplies, even if
> they are bogus, resolves the issue, but of course results in packets
> with a bogus destination going out on the wire instead of getting
> dropped by the kernel.
>
> Can anyone explain why this is happening? I have tried tuning the
> unres_qlen sysctl without effect and will next try to set the
> MSG_DONTWAIT socket option to try and see if that helps. But I want to
> make sure I understand what is going on.
>
> Are there any parameters we can tune so that UDP packets sent to
> INCOMPLETE destinations are immediately dropped? What's the best way
> to prevent a socket from being unavailable while arp operations are
> happening (assuming arp is the cause)?

Sounds like you are hitting the SO_SNDBUF limit due to datagrams being
held on the neighbor queue, especially since the issue occurs only once
the number of unreachable destinations exceeds some threshold. Does
/proc/net/stat/ndisc_cache show unresolved_discards? Increasing
unres_qlen may only make matters worse, since more datagrams can then
be queued. See also the branch on NUD_INCOMPLETE in __neigh_event_send.
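
To sanity-check that from the application side, a small diagnostic
sketch (fd is the UDP socket; SIOCOUTQ is documented in udp(7) as the
amount of unsent data in the socket send queue, and datagrams parked on
an unresolved neighbour queue are still charged to the sending socket):

#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <linux/sockios.h>

/* Print how much of the send buffer is currently in use. */
static void report_sndbuf_usage(int fd)
{
    int queued = 0, limit = 0;
    socklen_t optlen = sizeof(limit);

    if (ioctl(fd, SIOCOUTQ, &queued) < 0)
        perror("ioctl(SIOCOUTQ)");
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &limit, &optlen) < 0)
        perror("getsockopt(SO_SNDBUF)");

    printf("send queue: %d bytes of %d byte limit\n", queued, limit);
}

If "queued" keeps climbing towards the SO_SNDBUF limit while the
unreachable destinations are being probed, that would support the
theory above.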


* Re: Kernel UDP behavior with missing destinations
  2019-05-16 14:47 ` Willem de Bruijn
@ 2019-05-16 15:43   ` Adam Urban
  2019-05-16 16:05   ` Eric Dumazet
  1 sibling, 0 replies; 12+ messages in thread
From: Adam Urban @ 2019-05-16 15:43 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Network Development

/proc/net/stat/ndisc_cache appears to show 0 unresolved_discards:

entries,allocs,destroys,hash_grows,lookups,hits,res_failed,rcv_probes_mcast,rcv_probes_ucast,periodic_gc_runs,forced_gc_runs,unresolved_discards,table_fulls
00000005,00000005,00000000,00000000,00000000,00000000,00000000,00000000,00000000,000021af,00000000,00000000,00000000
00000005,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000


On Thu, May 16, 2019 at 10:48 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Wed, May 15, 2019 at 3:57 PM Adam Urban <adam.urban@appleguru.org> wrote:
> >
> > We have an application where we are use sendmsg() to send (lots of)
> > UDP packets to multiple destinations over a single socket, repeatedly,
> > and at a pretty constant rate using IPv4.
> >
> > In some cases, some of these destinations are no longer present on the
> > network, but we continue sending data to them anyways. The missing
> > devices are usually a temporary situation, but can last for
> > days/weeks/months.
> >
> > We are seeing an issue where packets sent even to destinations that
> > are present on the network are getting dropped while the kernel
> > performs arp updates.
> >
> > We see a -1 EAGAIN (Resource temporarily unavailable) return value
> > from the sendmsg() call when this is happening:
> >
> > sendmsg(72, {msg_name(16)={sa_family=AF_INET, sin_port=htons(1234),
> > sin_addr=inet_addr("10.1.2.3")}, msg_iov(1)=[{"\4\1"..., 96}],
> > msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = -1 EAGAIN (Resource
> > temporarily unavailable)
> >
> > Looking at packet captures, during this time you see the kernel arping
> > for the devices that aren't on the network, timing out, arping again,
> > timing out, and then finally arping a 3rd time before setting the
> > INCOMPLETE state again (very briefly being in a FAILED state).
> >
> > "Good" packets don't start going out again until the 3rd timeout
> > happens, and then they go out for about 1s until the 3s delay from ARP
> > happens again.
> >
> > Interestingly, this isn't an all or nothing situation. With only a few
> > (2-3) devices missing, we don't run into this "blocking" situation and
> > data always goes out. But once 4 or more devices are missing, it
> > happens. Setting static ARP entries for the missing supplies, even if
> > they are bogus, resolves the issue, but of course results in packets
> > with a bogus destination going out on the wire instead of getting
> > dropped by the kernel.
> >
> > Can anyone explain why this is happening? I have tried tuning the
> > unres_qlen sysctl without effect and will next try to set the
> > MSG_DONTWAIT socket option to try and see if that helps. But I want to
> > make sure I understand what is going on.
> >
> > Are there any parameters we can tune so that UDP packets sent to
> > INCOMPLETE destinations are immediately dropped? What's the best way
> > to prevent a socket from being unavailable while arp operations are
> > happening (assuming arp is the cause)?
>
> Sounds like hitting SO_SNDBUF limit due to datagrams being held on the
> neighbor queue. Especially since the issue occurs only as the number
> of unreachable destinations exceeds some threshold. Does
> /proc/net/stat/ndisc_cache show unresolved_discards? Increasing
> unres_qlen may make matters only worse if more datagrams can get
> queued. See also the branch on NUD_INCOMPLETE in __neigh_event_send.


* Re: Kernel UDP behavior with missing destinations
  2019-05-16 14:47 ` Willem de Bruijn
  2019-05-16 15:43   ` Adam Urban
@ 2019-05-16 16:05   ` Eric Dumazet
  2019-05-16 16:14     ` Eric Dumazet
  2019-05-17  0:27     ` Adam Urban
  1 sibling, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2019-05-16 16:05 UTC (permalink / raw)
  To: Willem de Bruijn, Adam Urban; +Cc: Network Development



On 5/16/19 7:47 AM, Willem de Bruijn wrote:
> On Wed, May 15, 2019 at 3:57 PM Adam Urban <adam.urban@appleguru.org> wrote:
>>
>> We have an application where we are use sendmsg() to send (lots of)
>> UDP packets to multiple destinations over a single socket, repeatedly,
>> and at a pretty constant rate using IPv4.
>>
>> In some cases, some of these destinations are no longer present on the
>> network, but we continue sending data to them anyways. The missing
>> devices are usually a temporary situation, but can last for
>> days/weeks/months.
>>
>> We are seeing an issue where packets sent even to destinations that
>> are present on the network are getting dropped while the kernel
>> performs arp updates.
>>
>> We see a -1 EAGAIN (Resource temporarily unavailable) return value
>> from the sendmsg() call when this is happening:
>>
>> sendmsg(72, {msg_name(16)={sa_family=AF_INET, sin_port=htons(1234),
>> sin_addr=inet_addr("10.1.2.3")}, msg_iov(1)=[{"\4\1"..., 96}],
>> msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = -1 EAGAIN (Resource
>> temporarily unavailable)
>>
>> Looking at packet captures, during this time you see the kernel arping
>> for the devices that aren't on the network, timing out, arping again,
>> timing out, and then finally arping a 3rd time before setting the
>> INCOMPLETE state again (very briefly being in a FAILED state).
>>
>> "Good" packets don't start going out again until the 3rd timeout
>> happens, and then they go out for about 1s until the 3s delay from ARP
>> happens again.
>>
>> Interestingly, this isn't an all or nothing situation. With only a few
>> (2-3) devices missing, we don't run into this "blocking" situation and
>> data always goes out. But once 4 or more devices are missing, it
>> happens. Setting static ARP entries for the missing supplies, even if
>> they are bogus, resolves the issue, but of course results in packets
>> with a bogus destination going out on the wire instead of getting
>> dropped by the kernel.
>>
>> Can anyone explain why this is happening? I have tried tuning the
>> unres_qlen sysctl without effect and will next try to set the
>> MSG_DONTWAIT socket option to try and see if that helps. But I want to
>> make sure I understand what is going on.
>>
>> Are there any parameters we can tune so that UDP packets sent to
>> INCOMPLETE destinations are immediately dropped? What's the best way
>> to prevent a socket from being unavailable while arp operations are
>> happening (assuming arp is the cause)?
> 
> Sounds like hitting SO_SNDBUF limit due to datagrams being held on the
> neighbor queue. Especially since the issue occurs only as the number
> of unreachable destinations exceeds some threshold. Does
> /proc/net/stat/ndisc_cache show unresolved_discards? Increasing
> unres_qlen may make matters only worse if more datagrams can get
> queued. See also the branch on NUD_INCOMPLETE in __neigh_event_send.
> 

We probably should add a ttl on arp queues.

neigh_probe() could do that quite easily.



* Re: Kernel UDP behavior with missing destinations
  2019-05-16 16:05   ` Eric Dumazet
@ 2019-05-16 16:14     ` Eric Dumazet
  2019-05-16 16:32       ` Adam Urban
  2019-05-17  0:27     ` Adam Urban
  1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2019-05-16 16:14 UTC (permalink / raw)
  To: Willem de Bruijn, Adam Urban; +Cc: Network Development



On 5/16/19 9:05 AM, Eric Dumazet wrote:

> We probably should add a ttl on arp queues.
> 
> neigh_probe() could do that quite easily.
> 

Adam, all you need to do is increase the UDP socket sndbuf,

either by increasing /proc/sys/net/core/wmem_default

or by using setsockopt( ... SO_SNDBUF ... ).
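
For example (a minimal sketch; the 2 MB figure is only a placeholder,
not a recommendation):

#include <stdio.h>
#include <sys/socket.h>

/*
 * Enlarge this socket's send buffer.  The kernel doubles the requested
 * value (see socket(7)) and caps the result at
 * /proc/sys/net/core/wmem_max unless SO_SNDBUFFORCE is used.
 */
static void bump_sndbuf(int fd)
{
    int bytes = 2 * 1024 * 1024;    /* placeholder value */

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) < 0)
        perror("setsockopt(SO_SNDBUF)");
}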



* Re: Kernel UDP behavior with missing destinations
  2019-05-16 16:14     ` Eric Dumazet
@ 2019-05-16 16:32       ` Adam Urban
  2019-05-16 17:03         ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Adam Urban @ 2019-05-16 16:32 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Willem de Bruijn, Network Development

Eric, thanks. Increasing wmem_default from 229376 to 2293760 indeed
makes the issue go away on my test bench. What's a good way to
determine the optimal value here? I assume this is in bytes and needs
to be large enough that SO_SNDBUF doesn't fill up before the kernel
drops the queued packets. How often does that happen?

On Thu, May 16, 2019 at 12:14 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 5/16/19 9:05 AM, Eric Dumazet wrote:
>
> > We probably should add a ttl on arp queues.
> >
> > neigh_probe() could do that quite easily.
> >
>
> Adam, all you need to do is to increase UDP socket sndbuf.
>
> Either by increasing /proc/sys/net/core/wmem_default
>
> or using setsockopt( ... SO_SNDBUF ... )
>


* Re: Kernel UDP behavior with missing destinations
  2019-05-16 16:32       ` Adam Urban
@ 2019-05-16 17:03         ` Eric Dumazet
  2019-05-16 21:42           ` Adam Urban
  0 siblings, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2019-05-16 17:03 UTC (permalink / raw)
  To: Adam Urban, Eric Dumazet; +Cc: Willem de Bruijn, Network Development



On 5/16/19 9:32 AM, Adam Urban wrote:
> Eric, thanks. Increasing wmem_default from 229376 to 2293760 indeed
> makes the issue go away on my test bench. What's a good way to
> determine the optimal value here? I assume this is in bytes and needs
> to be large enough so that the SO_SNDBUF doesn't fill up before the
> kernel drops the packets. How often does that happen?

You have to count the max number of ARP queues your UDP socket could hit.

Say this number is X.

Then wmem_default should be set to X * unres_qlen_bytes + Y

with Y = 229376 (the default wmem_default).

Then, you might need to increase the qdisc limits.

If no ARP queue is active, all UDP packets could sit in the qdisc and might hit
the qdisc limit sooner, thus dropping packets at the qdisc.

(This is assuming your UDP application can blast packets at a rate above the link rate.)
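
As a rough worked example of that rule (the destination count and the
unres_qlen_bytes value below are assumptions; read the real limit from
/proc/sys/net/ipv4/neigh/default/unres_qlen_bytes):

/* sndbuf >= X * unres_qlen_bytes + Y; all inputs are placeholders. */
long X = 10;                     /* worst-case unresolved destinations */
long unres_qlen_bytes = 212992;  /* per-neighbour queue limit (sysctl) */
long Y = 229376;                 /* current wmem_default               */

long suggested_wmem = X * unres_qlen_bytes + Y;
/* 10 * 212992 + 229376 = 2359296 bytes, roughly 2.3 MB in this example. */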

> 
> On Thu, May 16, 2019 at 12:14 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>>
>>
>> On 5/16/19 9:05 AM, Eric Dumazet wrote:
>>
>>> We probably should add a ttl on arp queues.
>>>
>>> neigh_probe() could do that quite easily.
>>>
>>
>> Adam, all you need to do is to increase UDP socket sndbuf.
>>
>> Either by increasing /proc/sys/net/core/wmem_default
>>
>> or using setsockopt( ... SO_SNDBUF ... )
>>


* Re: Kernel UDP behavior with missing destinations
  2019-05-16 17:03         ` Eric Dumazet
@ 2019-05-16 21:42           ` Adam Urban
  0 siblings, 0 replies; 12+ messages in thread
From: Adam Urban @ 2019-05-16 21:42 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Willem de Bruijn, Network Development

How can I see if there is an active arp queue?

Regarding the qdisc, I don't think we're bumping up against that (at
least not in my tiny bench setup):

tc -s qdisc show
qdisc fq_codel 0: dev eth0 root refcnt 2 limit 10240p flows 1024
quantum 1514 target 5.0ms interval 100.0ms ecn
 Sent 925035443 bytes 8988011 pkt (dropped 0, overlimits 0 requeues 3)
 backlog 0b 0p requeues 3
  maxpacket 717 drop_overlimit 0 new_flow_count 1004 ecn_mark 0
  new_flows_len 0 old_flows_len 0

I'm still not sure I 100% understand the relationship between the
socket send buffer (the wmem_default sysctl setting or the SO_SNDBUF
socket option), the ARP queue (arp_queue), and the unres_qlen_bytes
sysctl setting. I've made a public Google spreadsheet here to try to
calculate this value based on some inputs and assumptions. Can you
take a look and see if I got this somewhat right?

https://docs.google.com/spreadsheets/d/1t9_UowY6sok8xvK8Tx_La_jB4iqpewJT5X4WANj39gg/edit?usp=sharing

On Thu, May 16, 2019 at 1:03 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 5/16/19 9:32 AM, Adam Urban wrote:
> > Eric, thanks. Increasing wmem_default from 229376 to 2293760 indeed
> > makes the issue go away on my test bench. What's a good way to
> > determine the optimal value here? I assume this is in bytes and needs
> > to be large enough so that the SO_SNDBUF doesn't fill up before the
> > kernel drops the packets. How often does that happen?
>
> You have to count the max number of arp queues your UDP socket could hit.
>
> Say this number is X
>
> Then wmem_default should be set  to X * unres_qlen_bytes + Y
>
> With Y =  229376  (the default  wmem_default)
>
> Then, you might need to increase the qdisc limits.
>
> If no arp queue is active, all UDP packets could be in the qdisc and might hit sooner
> the qdisc limit, thus dropping packets on the qdisc.
>
> (This is assuming your UDP application can blast packets at a rate above the link rate)
>
> >
> > On Thu, May 16, 2019 at 12:14 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >>
> >>
> >>
> >> On 5/16/19 9:05 AM, Eric Dumazet wrote:
> >>
> >>> We probably should add a ttl on arp queues.
> >>>
> >>> neigh_probe() could do that quite easily.
> >>>
> >>
> >> Adam, all you need to do is to increase UDP socket sndbuf.
> >>
> >> Either by increasing /proc/sys/net/core/wmem_default
> >>
> >> or using setsockopt( ... SO_SNDBUF ... )
> >>


* Re: Kernel UDP behavior with missing destinations
  2019-05-16 16:05   ` Eric Dumazet
  2019-05-16 16:14     ` Eric Dumazet
@ 2019-05-17  0:27     ` Adam Urban
  2019-05-17  3:22       ` Willem de Bruijn
  1 sibling, 1 reply; 12+ messages in thread
From: Adam Urban @ 2019-05-17  0:27 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Willem de Bruijn, Network Development

And replying to your earlier comment about a TTL: yes, I think a TTL on
arp_queues would be hugely helpful.

In any environment where you are streaming time-sensitive UDP traffic,
you really want the kernel tuned to immediately drop the outgoing
packet if the destination isn't already known/in the ARP table. It
doesn't make sense to keep the packet around while the kernel ARPs:
by the time the answer lands in the ARP table, the UDP packet that was
queued while waiting on the ARP reply is likely already out of date.
(And if no answer ever comes, I definitely don't want it filling up
buffers with useless/old packets!)

On Thu, May 16, 2019 at 12:05 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
>
> On 5/16/19 7:47 AM, Willem de Bruijn wrote:
> > On Wed, May 15, 2019 at 3:57 PM Adam Urban <adam.urban@appleguru.org> wrote:
> >>
> >> We have an application where we are use sendmsg() to send (lots of)
> >> UDP packets to multiple destinations over a single socket, repeatedly,
> >> and at a pretty constant rate using IPv4.
> >>
> >> In some cases, some of these destinations are no longer present on the
> >> network, but we continue sending data to them anyways. The missing
> >> devices are usually a temporary situation, but can last for
> >> days/weeks/months.
> >>
> >> We are seeing an issue where packets sent even to destinations that
> >> are present on the network are getting dropped while the kernel
> >> performs arp updates.
> >>
> >> We see a -1 EAGAIN (Resource temporarily unavailable) return value
> >> from the sendmsg() call when this is happening:
> >>
> >> sendmsg(72, {msg_name(16)={sa_family=AF_INET, sin_port=htons(1234),
> >> sin_addr=inet_addr("10.1.2.3")}, msg_iov(1)=[{"\4\1"..., 96}],
> >> msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = -1 EAGAIN (Resource
> >> temporarily unavailable)
> >>
> >> Looking at packet captures, during this time you see the kernel arping
> >> for the devices that aren't on the network, timing out, arping again,
> >> timing out, and then finally arping a 3rd time before setting the
> >> INCOMPLETE state again (very briefly being in a FAILED state).
> >>
> >> "Good" packets don't start going out again until the 3rd timeout
> >> happens, and then they go out for about 1s until the 3s delay from ARP
> >> happens again.
> >>
> >> Interestingly, this isn't an all or nothing situation. With only a few
> >> (2-3) devices missing, we don't run into this "blocking" situation and
> >> data always goes out. But once 4 or more devices are missing, it
> >> happens. Setting static ARP entries for the missing supplies, even if
> >> they are bogus, resolves the issue, but of course results in packets
> >> with a bogus destination going out on the wire instead of getting
> >> dropped by the kernel.
> >>
> >> Can anyone explain why this is happening? I have tried tuning the
> >> unres_qlen sysctl without effect and will next try to set the
> >> MSG_DONTWAIT socket option to try and see if that helps. But I want to
> >> make sure I understand what is going on.
> >>
> >> Are there any parameters we can tune so that UDP packets sent to
> >> INCOMPLETE destinations are immediately dropped? What's the best way
> >> to prevent a socket from being unavailable while arp operations are
> >> happening (assuming arp is the cause)?
> >
> > Sounds like hitting SO_SNDBUF limit due to datagrams being held on the
> > neighbor queue. Especially since the issue occurs only as the number
> > of unreachable destinations exceeds some threshold. Does
> > /proc/net/stat/ndisc_cache show unresolved_discards? Increasing
> > unres_qlen may make matters only worse if more datagrams can get
> > queued. See also the branch on NUD_INCOMPLETE in __neigh_event_send.
> >
>
> We probably should add a ttl on arp queues.
>
> neigh_probe() could do that quite easily.
>


* Re: Kernel UDP behavior with missing destinations
  2019-05-17  0:27     ` Adam Urban
@ 2019-05-17  3:22       ` Willem de Bruijn
  2019-05-17 12:57         ` David Laight
  0 siblings, 1 reply; 12+ messages in thread
From: Willem de Bruijn @ 2019-05-17  3:22 UTC (permalink / raw)
  To: Adam Urban; +Cc: Eric Dumazet, Willem de Bruijn, Network Development

On Thu, May 16, 2019 at 8:27 PM Adam Urban <adam.urban@appleguru.org> wrote:
>
> And replying to your earlier comment about TTL, yes I think a TTL on
> arp_queues would be hugely helpful.
>
> In any environment where you are streaming time-sensitive UDP traffic,
> you really want the kernel to be tuned to immediately drop the
> outgoing packet if the destination isn't yet known/in the arp table
> already...

For packets that need to be sent immediately or not at all, you
probably do not want a TTL, but simply for the send call to fail
immediately with EAGAIN instead of queuing the packet for ARP
resolution at all, which is approximated with unres_qlen 0.

The relation between unres_qlen_bytes, arp_queue and SO_SNDBUF is
pretty straightforward in principle. Packets can be queued on the ARP
queue until the byte limit is reached. Any packets on this queue still
have their memory counted towards their socket send budget. If queuing
a packet would push the queue over the threshold, older packets are
freed and dropped as needed. Calculating the exact numbers is not as
straightforward, since, for instance, skb->truesize is a kernel
implementation detail.

The simple solution is just to overprovision the socket SO_SNDBUF. If
only a few sockets in the system perform this role, that seems
perfectly fine.
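
If fail-fast really is the goal, the unres_qlen route looks roughly
like this sketch (it uses the byte-based unres_qlen_bytes knob; the
interface name is an assumption, a "default" entry also exists, and
this changes neighbour queueing for all traffic on that interface, not
just one socket):

#include <stdio.h>

/*
 * Set the unresolved-neighbour queue limit to zero so datagrams are not
 * parked behind ARP resolution.  Equivalent to:
 *   echo 0 > /proc/sys/net/ipv4/neigh/eth0/unres_qlen_bytes
 */
static int zero_unres_qlen(const char *ifname)
{
    char path[128];
    FILE *f;

    snprintf(path, sizeof(path),
             "/proc/sys/net/ipv4/neigh/%s/unres_qlen_bytes", ifname);
    f = fopen(path, "w");
    if (!f)
        return -1;
    fputs("0", f);
    fclose(f);
    return 0;
}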

> Doesn't make sense to keep it around while it arps, since
> by the time it has an answer and gets it into the arp table, the UDP
> packet that it queued for sending while waiting on the arp reply is
> likely already out of date. (And if it doesn't get an answer, I
> definitely don't want it filling up buffers with useless/old packets!)
>
> On Thu, May 16, 2019 at 12:05 PM Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> >
> >
> > On 5/16/19 7:47 AM, Willem de Bruijn wrote:
> > > On Wed, May 15, 2019 at 3:57 PM Adam Urban <adam.urban@appleguru.org> wrote:
> > >>
> > >> We have an application where we are use sendmsg() to send (lots of)
> > >> UDP packets to multiple destinations over a single socket, repeatedly,
> > >> and at a pretty constant rate using IPv4.
> > >>
> > >> In some cases, some of these destinations are no longer present on the
> > >> network, but we continue sending data to them anyways. The missing
> > >> devices are usually a temporary situation, but can last for
> > >> days/weeks/months.
> > >>
> > >> We are seeing an issue where packets sent even to destinations that
> > >> are present on the network are getting dropped while the kernel
> > >> performs arp updates.
> > >>
> > >> We see a -1 EAGAIN (Resource temporarily unavailable) return value
> > >> from the sendmsg() call when this is happening:
> > >>
> > >> sendmsg(72, {msg_name(16)={sa_family=AF_INET, sin_port=htons(1234),
> > >> sin_addr=inet_addr("10.1.2.3")}, msg_iov(1)=[{"\4\1"..., 96}],
> > >> msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = -1 EAGAIN (Resource
> > >> temporarily unavailable)
> > >>
> > >> Looking at packet captures, during this time you see the kernel arping
> > >> for the devices that aren't on the network, timing out, arping again,
> > >> timing out, and then finally arping a 3rd time before setting the
> > >> INCOMPLETE state again (very briefly being in a FAILED state).
> > >>
> > >> "Good" packets don't start going out again until the 3rd timeout
> > >> happens, and then they go out for about 1s until the 3s delay from ARP
> > >> happens again.
> > >>
> > >> Interestingly, this isn't an all or nothing situation. With only a few
> > >> (2-3) devices missing, we don't run into this "blocking" situation and
> > >> data always goes out. But once 4 or more devices are missing, it
> > >> happens. Setting static ARP entries for the missing supplies, even if
> > >> they are bogus, resolves the issue, but of course results in packets
> > >> with a bogus destination going out on the wire instead of getting
> > >> dropped by the kernel.
> > >>
> > >> Can anyone explain why this is happening? I have tried tuning the
> > >> unres_qlen sysctl without effect and will next try to set the
> > >> MSG_DONTWAIT socket option to try and see if that helps. But I want to
> > >> make sure I understand what is going on.
> > >>
> > >> Are there any parameters we can tune so that UDP packets sent to
> > >> INCOMPLETE destinations are immediately dropped? What's the best way
> > >> to prevent a socket from being unavailable while arp operations are
> > >> happening (assuming arp is the cause)?
> > >
> > > Sounds like hitting SO_SNDBUF limit due to datagrams being held on the
> > > neighbor queue. Especially since the issue occurs only as the number
> > > of unreachable destinations exceeds some threshold. Does
> > > /proc/net/stat/ndisc_cache show unresolved_discards? Increasing
> > > unres_qlen may make matters only worse if more datagrams can get
> > > queued. See also the branch on NUD_INCOMPLETE in __neigh_event_send.
> > >
> >
> > We probably should add a ttl on arp queues.
> >
> > neigh_probe() could do that quite easily.
> >


* RE: Kernel UDP behavior with missing destinations
  2019-05-17  3:22       ` Willem de Bruijn
@ 2019-05-17 12:57         ` David Laight
  2019-05-17 13:20           ` Willem de Bruijn
  0 siblings, 1 reply; 12+ messages in thread
From: David Laight @ 2019-05-17 12:57 UTC (permalink / raw)
  To: 'Willem de Bruijn', Adam Urban; +Cc: Eric Dumazet, Network Development

From: Willem de Bruijn
> Sent: 17 May 2019 04:23
> On Thu, May 16, 2019 at 8:27 PM Adam Urban <adam.urban@appleguru.org> wrote:
> >
> > And replying to your earlier comment about TTL, yes I think a TTL on
> > arp_queues would be hugely helpful.
> >
> > In any environment where you are streaming time-sensitive UDP traffic,
> > you really want the kernel to be tuned to immediately drop the
> > outgoing packet if the destination isn't yet known/in the arp table
> > already...

I suspect we may suffer from the same problems when sending out a lot
of RTP (think of sending 1000s of UDP messages to different addresses
every 20ms).
For various reasons the sends are done from a single raw socket (rather
than 'connected' UDP sockets).

> For packets that need to be sent immediately or not at all, you
> probably do not want a TTL, but simply for the send call to fail
> immediately with EAGAIN instead of queuing the packet for ARP
> resolution at all. Which is approximated with unres_qlen 0.
> 
> The relation between unres_qlen_bytes, arp_queue and SO_SNDBUF is
> pretty straightforward in principal. Packets can be queued on the arp
> queue until the byte limit is reached. Any packets on this queue still
> have their memory counted towards their socket send budget. If a
> packet is queued that causes to exceed the threshold, older packets
> are freed and dropped as needed. Calculating the exact numbers is not
> as straightforward, as, for instance, skb->truesize is a kernel
> implementation detail.

But 'fiddling' with the arp queue will affect all traffic.
So you'd need it to be a per-socket option, so that it is a property
of the message by the time it reaches the ARP code.

> The simple solution is just to overprovision the socket SO_SNDBUF. If
> there are few sockets in the system that perform this role, that seems
> perfectly fine.

That depends on how often you are sending messages compared to the
ARP timeout. If you are sending 50 messages a second to each of 1000
destinations, the over-provisioning of SO_SNDBUF would have to be extreme.
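
As a back-of-envelope illustration (assuming unres_qlen_bytes is left
at the common default of 212992 bytes): with 1000 destinations that can
all be unresolved at once, the X * unres_qlen_bytes + Y rule above
works out to roughly 1000 * 212992, i.e. on the order of 200 MB of send
buffer for a single socket.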

FWIW we do sometimes see sendmsg() taking much longer than expected,
but haven't yet tracked down why.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


* Re: Kernel UDP behavior with missing destinations
  2019-05-17 12:57         ` David Laight
@ 2019-05-17 13:20           ` Willem de Bruijn
  0 siblings, 0 replies; 12+ messages in thread
From: Willem de Bruijn @ 2019-05-17 13:20 UTC (permalink / raw)
  To: David Laight; +Cc: Adam Urban, Eric Dumazet, Network Development

On Fri, May 17, 2019 at 8:57 AM David Laight <David.Laight@aculab.com> wrote:
>
> From: Willem de Bruijn
> > Sent: 17 May 2019 04:23
> > On Thu, May 16, 2019 at 8:27 PM Adam Urban <adam.urban@appleguru.org> wrote:
> > >
> > > And replying to your earlier comment about TTL, yes I think a TTL on
> > > arp_queues would be hugely helpful.
> > >
> > > In any environment where you are streaming time-sensitive UDP traffic,
> > > you really want the kernel to be tuned to immediately drop the
> > > outgoing packet if the destination isn't yet known/in the arp table
> > > already...
>
> I suspect we may suffer from the same problems when sending out a lot
> of RTP (think of sending 1000s of UDP messages to different addresses
> every 20ms).
> For various reasons the sends are done from a single raw socket (rather
> than 'connected' UDP sockets).
>
> > For packets that need to be sent immediately or not at all, you
> > probably do not want a TTL, but simply for the send call to fail
> > immediately with EAGAIN instead of queuing the packet for ARP
> > resolution at all. Which is approximated with unres_qlen 0.
> >
> > The relation between unres_qlen_bytes, arp_queue and SO_SNDBUF is
> > pretty straightforward in principal. Packets can be queued on the arp
> > queue until the byte limit is reached. Any packets on this queue still
> > have their memory counted towards their socket send budget. If a
> > packet is queued that causes to exceed the threshold, older packets
> > are freed and dropped as needed. Calculating the exact numbers is not
> > as straightforward, as, for instance, skb->truesize is a kernel
> > implementation detail.
>
> But 'fiddling' with the arp queue will affect all traffic.
> So you'd need it to be per socket option so that it is a property
> of the message by the time it reaches the arp code.

A per-socket or even per-datagram do-not-queue signal would be
interesting, where any queuing would instead result in send failure
(though this feedback signal does not work for secondary qdiscs).

The recent SCM_TXTIME cmsg has a deadline mode that might implement
this, in which case we would only have to check for it in the neighbor
layer.
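
For reference, the userspace side of that already exists for the etf
qdisc; a sketch of setting a per-datagram deadline with SO_TXTIME /
SCM_TXTIME (needs a 4.19+ kernel and matching headers; whether the
neighbour layer would honour it, as suggested above, is the
hypothetical part):

#include <string.h>
#include <stdint.h>
#include <time.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Opt the socket in to per-packet transmit times with deadline mode. */
static int enable_txtime(int fd)
{
    struct sock_txtime cfg = {
        .clockid = CLOCK_TAI,
        .flags   = SOF_TXTIME_DEADLINE_MODE | SOF_TXTIME_REPORT_ERRORS,
    };

    return setsockopt(fd, SOL_SOCKET, SO_TXTIME, &cfg, sizeof(cfg));
}

/* Attach a deadline (ns on CLOCK_TAI) to one sendmsg() call;
 * cbuf must be at least CMSG_SPACE(sizeof(uint64_t)) bytes. */
static void set_deadline(struct msghdr *msg, char *cbuf, size_t cbuflen,
                         uint64_t deadline_ns)
{
    struct cmsghdr *cm;

    msg->msg_control = cbuf;
    msg->msg_controllen = cbuflen;

    cm = CMSG_FIRSTHDR(msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_TXTIME;
    cm->cmsg_len = CMSG_LEN(sizeof(deadline_ns));
    memcpy(CMSG_DATA(cm), &deadline_ns, sizeof(deadline_ns));

    msg->msg_controllen = CMSG_SPACE(sizeof(deadline_ns));
}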

>
> > The simple solution is just to overprovision the socket SO_SNDBUF. If
> > there are few sockets in the system that perform this role, that seems
> > perfectly fine.
>
> That depends on how often you are sending messages compared to the
> arp timeout. If you are sending 50 messages a second to each of 1000
> destinations the over provisioning of SO_SNDBUF would have to be extreme.
>
> FWIW we do sometimes see sendmsg() taking much longer than expected,
> but haven't get tracked down why.

I've observed this problem with health checks under particular ARP
settings as well.


Thread overview: 12+ messages
2019-05-15 19:56 Kernel UDP behavior with missing destinations Adam Urban
2019-05-16 14:47 ` Willem de Bruijn
2019-05-16 15:43   ` Adam Urban
2019-05-16 16:05   ` Eric Dumazet
2019-05-16 16:14     ` Eric Dumazet
2019-05-16 16:32       ` Adam Urban
2019-05-16 17:03         ` Eric Dumazet
2019-05-16 21:42           ` Adam Urban
2019-05-17  0:27     ` Adam Urban
2019-05-17  3:22       ` Willem de Bruijn
2019-05-17 12:57         ` David Laight
2019-05-17 13:20           ` Willem de Bruijn
