* After a while of system running no incoming UDP any more?
@ 2017-07-24 12:09 Marc Haber
  2017-07-24 14:19 ` Paolo Abeni
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Haber @ 2017-07-24 12:09 UTC (permalink / raw)
  To: netdev

Hi,

I am running ~ 50 servers, most of them as KVM guests, some of them as
Xen guests, and even fewer on hardware, and have recently updated
to Debian stretch. I usually use kernels locally built from the latest
vanilla stable release.

Roughly since the upgrade to Debian stretch and kernel 4.12, some of my
systems have begun to not forward UDP packets (such as incoming DNS
replies) to the user space. When this happens, I see the packet coming
in on tcpdump -p, but the application never sees it and eventually times
out. An strace on the process sees the process waiting on the select()
syscall and nothing happens when the system receives the UDP packet. I
do also see the same phenomenon with ntp. A reboot always fixes the
issue. 
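
The symptom described above (a process blocked in select() that is never
woken although the datagram shows up in tcpdump) can be checked for with a
small probe. The following is a minimal sketch, not something from this
thread: the names wait_for_datagram and loopback_probe are made up for
illustration, and it only exercises the loopback interface, which may or
may not be affected in the same way as traffic arriving on eth0.

```python
import select
import socket


def wait_for_datagram(sock, timeout):
    """Return the next datagram on sock, or None if none arrives in time."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None  # select() timed out: nothing was delivered to the socket
    data, _addr = sock.recvfrom(4096)
    return data


def loopback_probe(payload=b"ping", timeout=2.0):
    """Send payload to a throwaway UDP socket on 127.0.0.1 and wait for it."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: the kernel picks a free port
        sender.sendto(payload, receiver.getsockname())
        return wait_for_datagram(receiver, timeout)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    reply = loopback_probe()
    print("delivered" if reply is not None else "timed out")
```

On a healthy system the probe should report a delivery; a timeout would
match the behaviour seen with dig and ntp.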

Running wireshark on a pcap file obtained on an affected system does
show all checksums to be in order. Both IPv4 and IPv6 are affected, and
in the DNS case, switching dig/drill or even the system resolver to TCP
also fixes the issue.

This happens only after the system has been running for a few days, and
I have seen this happen on both KVM and Xen guests, but not (yet) on
real hardware. In my zoo of servers, this happens - over the entire
sample - about twice a week, often enough to be annoying and seldom
enough to make debugging really difficult since you never know in
advance which system will have the issue next.

I have therefore been reluctant to downgrade kernel or system since that
would mean days of work. Bisecting is probably out of the question since
you'll never know when "git bisect good" is a sufficiently safe
assumption.

Before I begin running older kernels on production systems, I would like
to ask whether there have been recent changes in the 4.11 => 4.12
development cycle that might cause an issue like that.

Since I have never seen the issue on stretch systems when they were
still running 4.11.8 (the latest 4.11 kernel that I had deployed before
switching over to 4.12), I do really suspect the kernel, and I do also
suspect that network interface offloading is probably not the culprit.

On the KVM guests, I use virtio-net, and I had that one high on my list
until one of the two Xen guests, which doesn't show any network modules
loaded, started showing the phenomenon as well.

That Xen guest outputs the following to lshw -C network:

  *-network
       description: Ethernet interface
       physical id: 1
       logical name: eth0
       serial: 0e:06:5f:74:48:97
       capabilities: ethernet physical
       configuration: broadcast=yes driver=vif ip=<redacted> link=yes multicast=yes

I assume that this one is not using virtio-net, so virtio-net seems
to be in the clear as well.

Any idea what might be happening here and what else I could try?

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

* Re: After a while of system running no incoming UDP any more?
  2017-07-24 12:09 After a while of system running no incoming UDP any more? Marc Haber
@ 2017-07-24 14:19 ` Paolo Abeni
  2017-07-25 11:57   ` Marc Haber
  2017-07-28  6:26   ` Marc Haber
  0 siblings, 2 replies; 14+ messages in thread
From: Paolo Abeni @ 2017-07-24 14:19 UTC (permalink / raw)
  To: Marc Haber, netdev

Hi,

On Mon, 2017-07-24 at 14:09 +0200, Marc Haber wrote:
> Before I begin running older kernels on production systems, I would like
> to ask whether there have been recent changes in the 4.11 => 4.12
> development cycle that might cause an issue like that.

While there has been some activity regarding the UDP protocol lately,
almost nothing touched UDP in the 4.11 release cycle.

The issue you describe looks similar to the bug fixed by the commit
9bd780f5e066 ("udp: fix poll()"), but the bugged code is only in later
kernels. 

> Any idea what might be happening here and what else I could try?

Once a system enters the buggy state, do the packets reach the
relevant socket's queue?

ss -u
nstat |grep -e Udp -e Ip

will help check that.

Thanks,

Paolo

* Re: After a while of system running no incoming UDP any more?
  2017-07-24 14:19 ` Paolo Abeni
@ 2017-07-25 11:57   ` Marc Haber
  2017-07-25 12:17     ` Paolo Abeni
  2017-07-28  6:26   ` Marc Haber
  1 sibling, 1 reply; 14+ messages in thread
From: Marc Haber @ 2017-07-25 11:57 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: netdev

Hi Paolo,

thanks for your answer. I appreciate that.

On Mon, Jul 24, 2017 at 04:19:10PM +0200, Paolo Abeni wrote:
> On Mon, 2017-07-24 at 14:09 +0200, Marc Haber wrote:
> > Before I begin running older kernels on production systems, I would like
> > to ask whether there have been recent changes in the 4.11 => 4.12
> > development cycle that might cause an issue like that.
> 
> While there has been some activity regarding the UDP protocol lately,
> almost nothing touched UDP in the 4.11 release cycle.

4.11 is good, 4.12 is bad.

> The issue you describe looks similar to the bug fixed by the commit
> 9bd780f5e066 ("udp: fix poll()"), but the bugged code is only in later
> kernels. 

That one is in v4.13-rc1 and v4.13-rc2, but it doesn't apply to my 4.12
trees.

> > Any idea what might be happening here and what else I could try?
> 
> Once a system enters the buggy state, do the packets reach the
> relevant socket's queue?
> 
> ss -u

That one only shows table headers on an unaffected system in normal
operation, right?

> nstat |grep -e Udp -e Ip
> 
> will help checking that.

An unaffected system will show UdpInDatagrams, right?

But where is the connection to the relevant socket's queue?

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

* Re: After a while of system running no incoming UDP any more?
  2017-07-25 11:57   ` Marc Haber
@ 2017-07-25 12:17     ` Paolo Abeni
  2017-07-26  8:10       ` Marc Haber
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Abeni @ 2017-07-25 12:17 UTC (permalink / raw)
  To: Marc Haber; +Cc: netdev

On Tue, 2017-07-25 at 13:57 +0200, Marc Haber wrote:
> On Mon, Jul 24, 2017 at 04:19:10PM +0200, Paolo Abeni wrote:
> > Once a system enters the buggy state, do the packets reach the
> > relevant socket's queue?
> > 
> > ss -u
> 
> That one only shows table headers on an unaffected system in normal
> operation, right?

This one shows the current length of the socket receive queue (Recv-Q,
the first column). If the packets land in the skbuff (and the user
space reader for some reason is not woken up), that value will grow over
time.

> > nstat |grep -e Udp -e Ip
> > 
> > will help checking that.
> 
> An unaffected system will show UdpInDatagrams, right?
> 
> But where is the connection to the relevant socket's queue?

If the socket queue length (as reported above) does not increase,
IP/UDP stats could give a hint of where and why the packets stop
traversing the network stack.
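
As a rough illustration of watching the IPv4 and UDP counters over a test
interval: on Linux, /proc/net/snmp lists each protocol as a header line
of counter names followed by a line of values. The sketch below (the
helper names parse_snmp, delta and read_snmp are assumptions, not an
existing tool) turns that into a dictionary keyed the way nstat prints
the names, and reports what changed between two snapshots. IPv6 counters
live in a differently formatted file and are not covered here.

```python
def parse_snmp(text):
    """Parse /proc/net/snmp-style text into {"UdpInDatagrams": 123, ...}."""
    counters = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # The file comes in pairs: a line of counter names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        _proto, numbers = values.split(":", 1)
        for name, number in zip(names.split(), numbers.split()):
            counters[proto + name] = int(number)
    return counters


def delta(before, after):
    """Return only the counters that changed between two snapshots."""
    return {
        key: after[key] - before.get(key, 0)
        for key in after
        if after[key] != before.get(key, 0)
    }


def read_snmp(path="/proc/net/snmp"):
    """Take a snapshot of the live counters (Linux only)."""
    with open(path) as handle:
        return parse_snmp(handle.read())
```

A possible use: call read_snmp(), send a single test query, call
read_snmp() again, and look at delta(before, after). If UdpInErrors
moves while UdpInDatagrams does not, the datagram was dropped inside the
UDP layer rather than queued on the socket.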

Beyond that, you can try using perf probes or kprobe/systemtap to [try
to] track the relevant packets inside the kernel.

/P

* Re: After a while of system running no incoming UDP any more?
  2017-07-25 12:17     ` Paolo Abeni
@ 2017-07-26  8:10       ` Marc Haber
  2017-07-26  8:33         ` Paolo Abeni
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Haber @ 2017-07-26  8:10 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: netdev

On Tue, Jul 25, 2017 at 02:17:52PM +0200, Paolo Abeni wrote:
> On Tue, 2017-07-25 at 13:57 +0200, Marc Haber wrote:
> > On Mon, Jul 24, 2017 at 04:19:10PM +0200, Paolo Abeni wrote:
> > > Once a system enters the buggy state, do the packets reach the
> > > relevant socket's queue?
> > > 
> > > ss -u
> > 
> > That one only shows table headers on an unaffected system in normal
> > operation, right?
> 
> This one shows the current length of the socket receive queue (Recv-Q,
> the first column). If the packets land in the skbuff (and the user
> space reader for some reason is not woken up), that value will grow over
> time.

Only that there is no value:
[4/4992]mh@swivel:~ $ ss -u
Recv-Q Send-Q Local Address:Port                 Peer Address:Port              
[5/4992]mh@swivel:~ $

(is that the intended behavior on a system that is not affected by the
issue?)

> > > nstat |grep -e Udp -e Ip
> > > 
> > > will help checking that.
> > 
> > An unaffected system will show UdpInDatagrams, right?
> > 
> > But where is the connection to the relevant socket's queue?
> 
> If the socket queue length (as reported above) does not increase,
> IP/UDP stats could give a hint of where and why the packets stop
> traversing the network stack.

We'll see. Still waiting for the phenomenon to show up again.

> Beyond that, you can try using perf probes or kprobe/systemtap to [try
> to] track the relevant packets inside the kernel.

That's way beyond my kernel foo, I'm afraid.

Thanks for helping, I'll report back.

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

* Re: After a while of system running no incoming UDP any more?
  2017-07-26  8:10       ` Marc Haber
@ 2017-07-26  8:33         ` Paolo Abeni
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Abeni @ 2017-07-26  8:33 UTC (permalink / raw)
  To: Marc Haber; +Cc: netdev

On Wed, 2017-07-26 at 10:10 +0200, Marc Haber wrote:
> On Tue, Jul 25, 2017 at 02:17:52PM +0200, Paolo Abeni wrote:
> > On Tue, 2017-07-25 at 13:57 +0200, Marc Haber wrote:
> > > On Mon, Jul 24, 2017 at 04:19:10PM +0200, Paolo Abeni wrote:
> > > > Once a system enters the buggy state, do the packets reach the
> > > > relevant socket's queue?
> > > > 
> > > > ss -u
> > > 
> > > That one only shows table headers on an unaffected system in normal
> > > operation, right?
> > 
> > This one shows the current length of the socket receive queue (Recv-Q,
> > the first column). If the packets land in the skbuff (and the user
> > space reader for some reason is not woken up), that value will grow over
> > time.
> 
> Only that there is no value:
> [4/4992]mh@swivel:~ $ ss -u
> Recv-Q Send-Q Local Address:Port                 Peer Address:Port              
> [5/4992]mh@swivel:~ $
> 
> (is that the intended behavior on a system that is not affected by the
> issue?)

That means there are no connected UDP sockets open in the system at the
moment you run ss -u. I forgot to specify that you must also add the '-a'
command line option to the 'ss' tool to show all UDP sockets regardless
of their state:

ss -ua
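
To spot a growing receive queue in that output without reading it by
eye, something along these lines could be used. It is a sketch under
assumptions: the function name backlog is invented here, the column
layout (an optional State column before Recv-Q) is taken from typical
iproute2 output rather than from this thread, and the rows in the usage
example are made up.

```python
def backlog(ss_output):
    """Return [(local_address, recv_q), ...] for sockets with queued data."""
    lines = [line for line in ss_output.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Locate Recv-Q by name: with -a a State column usually precedes it.
    recv_idx = header.index("Recv-Q")
    result = []
    for line in lines[1:]:
        fields = line.split()
        recv_q = int(fields[recv_idx])
        if recv_q > 0:
            # Columns after Recv-Q are Send-Q, then the local address:port.
            result.append((fields[recv_idx + 2], recv_q))
    return result


if __name__ == "__main__":
    example = (
        "State   Recv-Q Send-Q Local Address:Port Peer Address:Port\n"
        "UNCONN  0      0      127.0.0.1:domain   *:*\n"
        "UNCONN  4352   0      *:ntp              *:*\n"
    )
    for address, queued in backlog(example):
        print(address, queued)
```

Feeding it the text printed by ss -ua at intervals would show whether
any socket's Recv-Q keeps climbing while its reader is stuck.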

Anyway this issue looks quite similar to:

https://bugzilla.kernel.org/show_bug.cgi?id=196469

Which contains some more information. I suggest following up on that
bugzilla entry.

Cheers,

Paolo

* Re: After a while of system running no incoming UDP any more?
  2017-07-24 14:19 ` Paolo Abeni
  2017-07-25 11:57   ` Marc Haber
@ 2017-07-28  6:26   ` Marc Haber
  2017-07-28  8:05     ` Eric Dumazet
  2017-07-28  8:07     ` Paolo Abeni
  1 sibling, 2 replies; 14+ messages in thread
From: Marc Haber @ 2017-07-28  6:26 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: netdev

On Mon, Jul 24, 2017 at 04:19:10PM +0200, Paolo Abeni wrote:
> Once a system enters the buggy state, do the packets reach the
> relevant socket's queue?
> 
> ss -u
> nstat |grep -e Udp -e Ip
> 
> will help checking that.

I now have the issue on one machine, a Xen guest acting as authoritative
nameserver for my domains. Here are the outputs during normal use, with
artificial queries coming in on eth0:

[9/1075]mh@impetus:~ $ ss -u
Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
0      0                                              127.0.0.1:56547                                                        127.0.0.1:domain               
0      0                                         216.231.132.60:27667                                                       198.41.0.4:domain               
0      0                                         216.231.132.60:44121                                                          8.8.8.8:domain               
0      0                                         216.231.132.60:29814                                                       198.41.0.4:domain               
[10/1076]mh@impetus:~ $ ss -u
Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
[11/1076]mh@impetus:~ $ ss -u
Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
[12/1076]mh@impetus:~ $ ss -u
Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
[13/1076]mh@impetus:~ $ ss -u
Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
[14/1076]mh@impetus:~ $ nstat  | grep -e Udp -e Ip
IpInReceives                    400688             0.0
IpInAddrErrors                  18567              0.0
IpInUnknownProtos               3                  0.0
IpInDelivers                    330634             0.0
IpOutRequests                   283637             0.0
UdpInDatagrams                  145860             0.0
UdpNoPorts                      1313               0.0
UdpInErrors                     9356               0.0
UdpOutDatagrams                 153093             0.0
UdpIgnoredMulti                 34148              0.0
Ip6InReceives                   161178             0.0
Ip6InNoRoutes                   8                  0.0
Ip6InDelivers                   73841              0.0
Ip6OutRequests                  77575              0.0
Ip6InMcastPkts                  87332              0.0
Ip6OutMcastPkts                 109                0.0
Ip6InOctets                     21880674           0.0
Ip6OutOctets                    9633059            0.0
Ip6InMcastOctets                9371483            0.0
Ip6OutMcastOctets               6636               0.0
Ip6InNoECTPkts                  161202             0.0
Ip6InECT1Pkts                   15                 0.0
Ip6InECT0Pkts                   11                 0.0
Ip6InCEPkts                     4                  0.0
Udp6InDatagrams                 11725              0.0
Udp6NoPorts                     2                  0.0
Udp6InErrors                    1989               0.0
Udp6OutDatagrams                14483              0.0
IpExtInBcastPkts                34148              0.0
IpExtInOctets                   47462716           0.0
IpExtOutOctets                  31262696           0.0
IpExtInBcastOctets              7476059            0.0
IpExtInNoECTPkts                400178             0.0
IpExtInECT1Pkts                 22                 0.0
IpExtInECT0Pkts                 481                0.0
IpExtInCEPkts                   14                 0.0
[15/1077]mh@impetus:~ $ nstat  | grep -e Udp -e Ip
IpInReceives                    25                 0.0
IpInDelivers                    25                 0.0
IpOutRequests                   16                 0.0
UdpInDatagrams                  1                  0.0
UdpInErrors                     24                 0.0
UdpOutDatagrams                 16                 0.0
Ip6InReceives                   15                 0.0
Ip6InDelivers                   14                 0.0
Ip6OutRequests                  12                 0.0
Ip6InMcastPkts                  1                  0.0
Ip6InOctets                     1219               0.0
Ip6OutOctets                    4384               0.0
Ip6InMcastOctets                131                0.0
Ip6InNoECTPkts                  15                 0.0
IpExtInOctets                   11779              0.0
IpExtOutOctets                  1023               0.0
IpExtInNoECTPkts                25                 0.0
[16/1077]mh@impetus:~ $ nstat  | grep -e Udp -e Ip
IpInReceives                    24                 0.0
IpInDelivers                    24                 0.0
IpOutRequests                   18                 0.0
UdpInErrors                     22                 0.0
UdpOutDatagrams                 16                 0.0
Ip6InReceives                   15                 0.0
Ip6InDelivers                   12                 0.0
Ip6OutRequests                  10                 0.0
Ip6InMcastPkts                  3                  0.0
Ip6InOctets                     1160               0.0
Ip6OutOctets                    2456               0.0
Ip6InMcastOctets                216                0.0
Ip6InNoECTPkts                  15                 0.0
IpExtInOctets                   8612               0.0
IpExtOutOctets                  1127               0.0
IpExtInNoECTPkts                24                 0.0
[17/1077]mh@impetus:~ $ nstat  | grep -e Udp -e Ip
IpInReceives                    5                  0.0
IpInDelivers                    4                  0.0
IpOutRequests                   3                  0.0
UdpNoPorts                      1                  0.0
UdpInErrors                     2                  0.0
UdpOutDatagrams                 1                  0.0
Ip6InReceives                   12                 0.0
Ip6InDelivers                   12                 0.0
Ip6OutRequests                  10                 0.0
Ip6InOctets                     944                0.0
Ip6OutOctets                    2364               0.0
Ip6InNoECTPkts                  12                 0.0
IpExtInOctets                   429                0.0
IpExtOutOctets                  226                0.0
IpExtInNoECTPkts                5                  0.0
[18/1077]mh@impetus:~ $

And here, hopefully a bit more helpful:

[19/1078]mh@impetus:~ $ ss -u ; nstat  | grep -e Udp -e Ip ; dig +time=2 @8.8.8.8 zugschlus.de mx ; ss -u ; nstat  | grep -e Udp -e Ip 
Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
0      0                                         216.231.132.60:27333                                                       198.41.0.4:domain               
0      0                                         216.231.132.60:38101                                                       198.41.0.4:domain               
0      0                                         216.231.132.60:15836                                                       198.41.0.4:domain               
0      0                                         216.231.132.60:50655                                                          8.8.8.8:domain               
0      0                                         216.231.132.60:41953                                                       198.41.0.4:domain               
0      0                                         216.231.132.60:6888                                                        198.41.0.4:domain               
0      0                                         216.231.132.60:51441                                                       198.41.0.4:domain               
0      0                                         216.231.132.60:42503                                                       198.41.0.4:domain               
0      0                                         216.231.132.60:12575                                                       198.41.0.4:domain               
0      0                                         216.231.132.60:13857                                                       198.41.0.4:domain
0      0                                         216.231.132.60:16419                                                    192.36.148.17:domain
0      0                                         216.231.132.60:39227                                                       198.41.0.4:domain
0      0                                              127.0.0.1:54608                                                        127.0.0.1:domain
0      0                                         216.231.132.60:20818                                                       198.41.0.4:domain
0      0                                         216.231.132.60:56662                                                       198.41.0.4:domain
0      0                                         216.231.132.60:48259                                                    192.36.148.17:domain
0      0                                         216.231.132.60:37803                                                       198.41.0.4:domain
IpInReceives                    59                 0.0
IpInAddrErrors                  1                  0.0
IpInDelivers                    56                 0.0
IpOutRequests                   50                 0.0
UdpInDatagrams                  1                  0.0
UdpInErrors                     50                 0.0
UdpOutDatagrams                 47                 0.0
UdpIgnoredMulti                 1                  0.0
Ip6InReceives                   75                 0.0
Ip6InDelivers                   73                 0.0
Ip6OutRequests                  64                 0.0
Ip6InMcastPkts                  2                  0.0
Ip6InOctets                     7837               0.0
Ip6OutOctets                    11876              0.0
Ip6InMcastOctets                279                0.0
Ip6InNoECTPkts                  75                 0.0
Udp6InErrors                    3                  0.0
IpExtInBcastPkts                1                  0.0
IpExtInOctets                   18447              0.0
IpExtOutOctets                  3478               0.0
IpExtInBcastOctets              183                0.0
IpExtInNoECTPkts                59                 0.0

; <<>> DiG 9.10.3-P4-Debian <<>> +time=2 @8.8.8.8 zugschlus.de mx
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached
Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port
0      0                                         216.231.132.60:7879                                                      202.12.27.33:domain
0      0                                         216.231.132.60:32711                                                     202.12.27.33:domain
0      0                                         216.231.132.60:54238                                                     202.12.27.33:domain
0      0                                         216.231.132.60:30948                                                   192.228.79.201:domain
0      0                                         216.231.132.60:4106                                                      202.12.27.33:domain
0      0                                         216.231.132.60:6667                                                      202.12.27.33:domain
0      0                                         216.231.132.60:2090                                                    192.228.79.201:domain
0      0                                         216.231.132.60:60459                                                   192.228.79.201:domain
0      0                                         216.231.132.60:16427                                                     202.12.27.33:domain
0      0                                         216.231.132.60:9019                                                      202.12.27.33:domain
0      0                                         216.231.132.60:2113                                                      202.12.27.33:domain
0      0                                         216.231.132.60:34907                                                     202.12.27.33:domain
0      0                                         216.231.132.60:34654                                                     202.12.27.33:domain
0      0                                         216.231.132.60:47725                                                     202.12.27.33:domain
0      0                                         216.231.132.60:35774                                                     202.12.27.33:domain
IpInReceives                    38                 0.0
IpInDelivers                    38                 0.0
IpOutRequests                   38                 0.0
UdpInDatagrams                  2                  0.0
UdpInErrors                     34                 0.0
UdpOutDatagrams                 36                 0.0
Ip6InReceives                   14                 0.0
Ip6InDelivers                   13                 0.0
Ip6OutRequests                  13                 0.0
Ip6InMcastPkts                  1                  0.0
Ip6InOctets                     1046               0.0
Ip6OutOctets                    6277               0.0
Ip6InMcastOctets                133                0.0
Ip6InNoECTPkts                  13                 0.0
Ip6InECT0Pkts                   1                  0.0
Udp6InDatagrams                 1                  0.0
Udp6OutDatagrams                1                  0.0
IpExtInOctets                   15963              0.0
IpExtOutOctets                  2397               0.0
IpExtInNoECTPkts                37                 0.0
IpExtInECT0Pkts                 1                  0.0
[20/1079]mh@impetus:~ $
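
The nstat samples above can be condensed into one number per interval.
The sketch below is illustrative only (parse_nstat and udp_error_share
are made-up names): it treats UdpInErrors plus UdpInDatagrams as the
total of incoming datagrams that had a matching socket, which is a
simplification. The sample text is copied from the interval just before
the dig run.

```python
def parse_nstat(text):
    """Parse nstat lines of the form 'Name  value  rate' into {name: value}."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def udp_error_share(counters):
    """Fraction of incoming UDP datagrams counted as errors, not delivered."""
    delivered = counters.get("UdpInDatagrams", 0)
    errors = counters.get("UdpInErrors", 0)
    total = delivered + errors
    return errors / total if total else 0.0


SAMPLE = """\
UdpInDatagrams                  1                  0.0
UdpInErrors                     50                 0.0
UdpOutDatagrams                 47                 0.0
"""

if __name__ == "__main__":
    share = udp_error_share(parse_nstat(SAMPLE))
    print(f"UDP input error share: {share:.1%}")
```

For that interval 50 of 51 datagrams were counted as input errors, and
for the interval covering the dig run it is 34 of 36, so nearly all
incoming IPv4 UDP traffic is being dropped before it reaches a socket
queue.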

I am afraid I cannot keep this state for much longer than a few
additional hours as this is an authoritative name server...

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421

* Re: After a while of system running no incoming UDP any more?
  2017-07-28  6:26   ` Marc Haber
@ 2017-07-28  8:05     ` Eric Dumazet
  2017-07-28  8:15       ` Paolo Abeni
  2017-07-28  8:07     ` Paolo Abeni
  1 sibling, 1 reply; 14+ messages in thread
From: Eric Dumazet @ 2017-07-28  8:05 UTC (permalink / raw)
  To: Marc Haber; +Cc: Paolo Abeni, netdev

On Fri, 2017-07-28 at 08:26 +0200, Marc Haber wrote:
> On Mon, Jul 24, 2017 at 04:19:10PM +0200, Paolo Abeni wrote:
> > Once a system enters the buggy state, do the packets reach the
> > relevant socket's queue?
> > 
> > ss -u
> > nstat |grep -e Udp -e Ip
> > 
> > will help checking that.
> 
> I now have the issue on one machine, a Xen guest acting as authoritative
> nameserver for my domains. Here are the outputs during normal use, with
> artificial queries coming in on eth0:
> 
> [9/1075]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> 0      0                                              127.0.0.1:56547                                                        127.0.0.1:domain               
> 0      0                                         216.231.132.60:27667                                                       198.41.0.4:domain               
> 0      0                                         216.231.132.60:44121                                                          8.8.8.8:domain               
> 0      0                                         216.231.132.60:29814                                                       198.41.0.4:domain               
> [10/1076]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> [11/1076]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> [12/1076]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> [13/1076]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> [14/1076]mh@impetus:~ $ nstat  | grep -e Udp -e Ip
> IpInReceives                    400688             0.0
> IpInAddrErrors                  18567              0.0
> IpInUnknownProtos               3                  0.0
> IpInDelivers                    330634             0.0
> IpOutRequests                   283637             0.0
> UdpInDatagrams                  145860             0.0
> UdpNoPorts                      1313               0.0
> UdpInErrors                     9356               0.0
> UdpOutDatagrams                 153093             0.0
> UdpIgnoredMulti                 34148              0.0
> Ip6InReceives                   161178             0.0
> Ip6InNoRoutes                   8                  0.0
> Ip6InDelivers                   73841              0.0
> Ip6OutRequests                  77575              0.0
> Ip6InMcastPkts                  87332              0.0
> Ip6OutMcastPkts                 109                0.0
> Ip6InOctets                     21880674           0.0
> Ip6OutOctets                    9633059            0.0
> Ip6InMcastOctets                9371483            0.0
> Ip6OutMcastOctets               6636               0.0
> Ip6InNoECTPkts                  161202             0.0
> Ip6InECT1Pkts                   15                 0.0
> Ip6InECT0Pkts                   11                 0.0
> Ip6InCEPkts                     4                  0.0
> Udp6InDatagrams                 11725              0.0
> Udp6NoPorts                     2                  0.0
> Udp6InErrors                    1989               0.0
> Udp6OutDatagrams                14483              0.0
> IpExtInBcastPkts                34148              0.0
> IpExtInOctets                   47462716           0.0
> IpExtOutOctets                  31262696           0.0
> IpExtInBcastOctets              7476059            0.0
> IpExtInNoECTPkts                400178             0.0
> IpExtInECT1Pkts                 22                 0.0
> IpExtInECT0Pkts                 481                0.0
> IpExtInCEPkts                   14                 0.0
> [15/1077]mh@impetus:~ $ nstat  | grep -e Udp -e Ip
> IpInReceives                    25                 0.0
> IpInDelivers                    25                 0.0
> IpOutRequests                   16                 0.0
> UdpInDatagrams                  1                  0.0
> UdpInErrors                     24                 0.0
> UdpOutDatagrams                 16                 0.0
> Ip6InReceives                   15                 0.0
> Ip6InDelivers                   14                 0.0
> Ip6OutRequests                  12                 0.0
> Ip6InMcastPkts                  1                  0.0
> Ip6InOctets                     1219               0.0
> Ip6OutOctets                    4384               0.0
> Ip6InMcastOctets                131                0.0
> Ip6InNoECTPkts                  15                 0.0
> IpExtInOctets                   11779              0.0
> IpExtOutOctets                  1023               0.0
> IpExtInNoECTPkts                25                 0.0
> [16/1077]mh@impetus:~ $ nstat  | grep -e Udp -e Ip
> IpInReceives                    24                 0.0
> IpInDelivers                    24                 0.0
> IpOutRequests                   18                 0.0
> UdpInErrors                     22                 0.0
> UdpOutDatagrams                 16                 0.0
> Ip6InReceives                   15                 0.0
> Ip6InDelivers                   12                 0.0
> Ip6OutRequests                  10                 0.0
> Ip6InMcastPkts                  3                  0.0
> Ip6InOctets                     1160               0.0
> Ip6OutOctets                    2456               0.0
> Ip6InMcastOctets                216                0.0
> Ip6InNoECTPkts                  15                 0.0
> IpExtInOctets                   8612               0.0
> IpExtOutOctets                  1127               0.0
> IpExtInNoECTPkts                24                 0.0
> [17/1077]mh@impetus:~ $ nstat  | grep -e Udp -e Ip
> IpInReceives                    5                  0.0
> IpInDelivers                    4                  0.0
> IpOutRequests                   3                  0.0
> UdpNoPorts                      1                  0.0
> UdpInErrors                     2                  0.0
> UdpOutDatagrams                 1                  0.0
> Ip6InReceives                   12                 0.0
> Ip6InDelivers                   12                 0.0
> Ip6OutRequests                  10                 0.0
> Ip6InOctets                     944                0.0
> Ip6OutOctets                    2364               0.0
> Ip6InNoECTPkts                  12                 0.0
> IpExtInOctets                   429                0.0
> IpExtOutOctets                  226                0.0
> IpExtInNoECTPkts                5                  0.0
> [18/1077]mh@impetus:~ $
> 
> And here, hopefully a bit more helpful:
> 
> [19/1078]mh@impetus:~ $ ss -u ; nstat  | grep -e Udp -e Ip ; dig +time=2 @8.8.8.8 zugschlus.de mx ; ss -u ; nstat  | grep -e Udp -e Ip 
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> 0      0                                         216.231.132.60:27333                                                       198.41.0.4:domain               
> 0      0                                         216.231.132.60:38101                                                       198.41.0.4:domain               
> 0      0                                         216.231.132.60:15836                                                       198.41.0.4:domain               
> 0      0                                         216.231.132.60:50655                                                          8.8.8.8:domain               
> 0      0                                         216.231.132.60:41953                                                       198.41.0.4:domain               
> 0      0                                         216.231.132.60:6888                                                        198.41.0.4:domain               
> 0      0                                         216.231.132.60:51441                                                       198.41.0.4:domain               
> 0      0                                         216.231.132.60:42503                                                       198.41.0.4:domain               
> 0      0                                         216.231.132.60:12575                                                       198.41.0.4:domain               
> 0      0                                         216.231.132.60:13857                                                       198.41.0.4:domain
> 0      0                                         216.231.132.60:16419                                                    192.36.148.17:domain
> 0      0                                         216.231.132.60:39227                                                       198.41.0.4:domain
> 0      0                                              127.0.0.1:54608                                                        127.0.0.1:domain
> 0      0                                         216.231.132.60:20818                                                       198.41.0.4:domain
> 0      0                                         216.231.132.60:56662                                                       198.41.0.4:domain
> 0      0                                         216.231.132.60:48259                                                    192.36.148.17:domain
> 0      0                                         216.231.132.60:37803                                                       198.41.0.4:domain
> IpInReceives                    59                 0.0
> IpInAddrErrors                  1                  0.0
> IpInDelivers                    56                 0.0
> IpOutRequests                   50                 0.0
> UdpInDatagrams                  1                  0.0
> UdpInErrors                     50                 0.0
> UdpOutDatagrams                 47                 0.0
> UdpIgnoredMulti                 1                  0.0
> Ip6InReceives                   75                 0.0
> Ip6InDelivers                   73                 0.0
> Ip6OutRequests                  64                 0.0
> Ip6InMcastPkts                  2                  0.0
> Ip6InOctets                     7837               0.0
> Ip6OutOctets                    11876              0.0
> Ip6InMcastOctets                279                0.0
> Ip6InNoECTPkts                  75                 0.0
> Udp6InErrors                    3                  0.0
> IpExtInBcastPkts                1                  0.0
> IpExtInOctets                   18447              0.0
> IpExtOutOctets                  3478               0.0
> IpExtInBcastOctets              183                0.0
> IpExtInNoECTPkts                59                 0.0
> 
> ; <<>> DiG 9.10.3-P4-Debian <<>> +time=2 @8.8.8.8 zugschlus.de mx
> ; (1 server found)
> ;; global options: +cmd
> ;; connection timed out; no servers could be reached
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port
> 0      0                                         216.231.132.60:7879                                                      202.12.27.33:domain
> 0      0                                         216.231.132.60:32711                                                     202.12.27.33:domain
> 0      0                                         216.231.132.60:54238                                                     202.12.27.33:domain
> 0      0                                         216.231.132.60:30948                                                   192.228.79.201:domain
> 0      0                                         216.231.132.60:4106                                                      202.12.27.33:domain
> 0      0                                         216.231.132.60:6667                                                      202.12.27.33:domain
> 0      0                                         216.231.132.60:2090                                                    192.228.79.201:domain
> 0      0                                         216.231.132.60:60459                                                   192.228.79.201:domain
> 0      0                                         216.231.132.60:16427                                                     202.12.27.33:domain
> 0      0                                         216.231.132.60:9019                                                      202.12.27.33:domain
> 0      0                                         216.231.132.60:2113                                                      202.12.27.33:domain
> 0      0                                         216.231.132.60:34907                                                     202.12.27.33:domain
> 0      0                                         216.231.132.60:34654                                                     202.12.27.33:domain
> 0      0                                         216.231.132.60:47725                                                     202.12.27.33:domain
> 0      0                                         216.231.132.60:35774                                                     202.12.27.33:domain
> IpInReceives                    38                 0.0
> IpInDelivers                    38                 0.0
> IpOutRequests                   38                 0.0
> UdpInDatagrams                  2                  0.0
> UdpInErrors                     34                 0.0
> UdpOutDatagrams                 36                 0.0
> Ip6InReceives                   14                 0.0
> Ip6InDelivers                   13                 0.0
> Ip6OutRequests                  13                 0.0
> Ip6InMcastPkts                  1                  0.0
> Ip6InOctets                     1046               0.0
> Ip6OutOctets                    6277               0.0
> Ip6InMcastOctets                133                0.0
> Ip6InNoECTPkts                  13                 0.0
> Ip6InECT0Pkts                   1                  0.0
> Udp6InDatagrams                 1                  0.0
> Udp6OutDatagrams                1                  0.0
> IpExtInOctets                   15963              0.0
> IpExtOutOctets                  2397               0.0
> IpExtInNoECTPkts                37                 0.0
> IpExtInECT0Pkts                 1                  0.0
> [20/1079]mh@impetus:~ $
> 
> I am afraid I cannot keep this state for much longer than a few
> additional hours as this is an authoritative name server...
> 
> Greetings
> Marc
> 
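
The nstat deltas quoted above can be condensed into one number: the share of incoming UDP datagrams that were counted as errors instead of being delivered. Below is a minimal Python sketch, assuming the three-column `name value rate` layout shown in the thread. The helper names `parse_nstat` and `udp_error_share` are made up for illustration and are not part of nstat or iproute2, and the ratio ignores UdpNoPorts.

```python
# Sample taken from the first nstat delta quoted in this thread.
SAMPLE = """\
IpInReceives                    25                 0.0
IpInDelivers                    25                 0.0
UdpInDatagrams                  1                  0.0
UdpInErrors                     24                 0.0
UdpOutDatagrams                 16                 0.0
"""


def parse_nstat(text):
    """Return {counter_name: int_value} for every well-formed nstat line."""
    counters = {}
    for line in text.splitlines():
        # Tolerate mail quote markers ("> ") in front of pasted output.
        fields = line.lstrip("> ").split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # blank line, shell prompt, or nstat's "#kernel" header
        try:
            counters[fields[0]] = int(fields[1])
        except ValueError:
            continue  # not a counter line
    return counters


def udp_error_share(counters, prefix="Udp"):
    """Fraction of received datagrams that ended up in <prefix>InErrors."""
    delivered = counters.get(prefix + "InDatagrams", 0)
    errors = counters.get(prefix + "InErrors", 0)
    total = delivered + errors
    return errors / total if total else 0.0


if __name__ == "__main__":
    print(udp_error_share(parse_nstat(SAMPLE)))
```

For the sample, 24 of the 25 datagrams seen in that interval were counted as errors. Passing `prefix="Udp6"` gives the same ratio for the IPv6 counters.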

To confirm the refcount issue, could you send us the output of:

cat /proc/net/udp6

Normally, the refcount column ("ref") should have a value of 2.
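
The check can be scripted. The sketch below is an illustration only: it assumes the usual 13-column layout of /proc/net/udp6, with `inode` as the tenth field and `ref` as the eleventh, and the function name `sockets_with_unexpected_ref` is hypothetical. It returns every socket whose refcount differs from the expected value.

```python
def sockets_with_unexpected_ref(proc_text, expected_ref=2):
    """Return (local_address, inode, ref) for rows whose ref != expected_ref.

    proc_text is the full text of /proc/net/udp6 (or /proc/net/udp).
    """
    suspects = []
    for line in proc_text.splitlines():
        fields = line.split()
        # Data rows have 13 fields and start with a slot number like "293:".
        if len(fields) < 13 or not fields[0].rstrip(":").isdigit():
            continue  # header line or truncated row
        local_address, inode, ref = fields[1], int(fields[9]), int(fields[10])
        if ref != expected_ref:
            suspects.append((local_address, inode, ref))
    return suspects


# Typical use on a live system:
#   with open("/proc/net/udp6") as f:
#       for row in sockets_with_unexpected_ref(f.read()):
#           print(row)
```

An empty result means every listed socket has the expected refcount. A value that keeps growing for a long-lived socket would point at a reference leak.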


* Re: After a while of system running no incoming UDP any more?
  2017-07-28  6:26   ` Marc Haber
  2017-07-28  8:05     ` Eric Dumazet
@ 2017-07-28  8:07     ` Paolo Abeni
  2017-07-28 12:14       ` Marc Haber
  1 sibling, 1 reply; 14+ messages in thread
From: Paolo Abeni @ 2017-07-28  8:07 UTC (permalink / raw)
  To: Marc Haber; +Cc: netdev

Hi,

On Fri, 2017-07-28 at 08:26 +0200, Marc Haber wrote:
> On Mon, Jul 24, 2017 at 04:19:10PM +0200, Paolo Abeni wrote:
> > Once that a system enter the buggy status, do the packets reach the
> > relevant socket's queue?
> > 
> > ss -u
> > nstat |grep -e Udp -e Ip
> > 
> > will help checking that.
> 
> I now have the issue on one machine, a Xen guest acting as authoritative
> nameserver for my domains. Here are the outputs during normal use, with
> artificial queries coming in on eth0:
> 
> [9/1075]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> 0      0                                              127.0.0.1:56547                                                        127.0.0.1:domain               
> 0      0                                         216.231.132.60:27667                                                       198.41.0.4:domain               
> 0      0                                         216.231.132.60:44121                                                          8.8.8.8:domain               
> 0      0                                         216.231.132.60:29814                                                       198.41.0.4:domain               
> [10/1076]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> [11/1076]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> [12/1076]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> [13/1076]mh@impetus:~ $ ss -u
> Recv-Q Send-Q                                     Local Address:Port                                                      Peer Address:Port                
> [14/1076]mh@impetus:~ $ nstat  | grep -e Udp -e Ip
> IpInReceives                    400688             0.0
> IpInAddrErrors                  18567              0.0
> IpInUnknownProtos               3                  0.0
> IpInDelivers                    330634             0.0
> IpOutRequests                   283637             0.0
> UdpInDatagrams                  145860             0.0
> UdpNoPorts                      1313               0.0
> UdpInErrors                     9356               0.0

Thanks for the info. This is consistent with what was reported at:

https://bugzilla.kernel.org/show_bug.cgi?id=196469

and should be fixed by this patch:

http://marc.info/?l=linux-netdev&m=150115960024825&w=2

(approval pending)

As a workaround, you can disable UDP early demux:

echo 0 > /proc/sys/net/ipv4/udp_early_demux

(this affects both IPv4 and IPv6)

and, if the system is already in the bad state, raise the UDP accounted
memory limit by writing values greater than the current ones to
/proc/sys/net/ipv4/udp_mem (the actual values depend on the system's
total memory).
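
A sketch of computing the raised limits follows. udp_mem holds three page counts (min, pressure, max). The helper name `scaled_udp_mem` and the factor of 2 are arbitrary choices for illustration, not values recommended anywhere in this thread.

```python
def scaled_udp_mem(current, factor=2):
    """Return a new udp_mem string with each of the three limits multiplied.

    current is the content of /proc/sys/net/ipv4/udp_mem: three
    whitespace-separated page counts (min, pressure, max).
    """
    values = [int(v) for v in current.split()]
    if len(values) != 3:
        raise ValueError("expected three values, got %r" % (current,))
    if factor < 1:
        raise ValueError("factor must be >= 1 to raise the limits")
    return " ".join(str(v * factor) for v in values)


# Typical use as root on the affected system:
#   path = "/proc/sys/net/ipv4/udp_mem"
#   with open(path) as f:
#       new_limits = scaled_udp_mem(f.read())
#   with open(path, "w") as f:
#       f.write(new_limits)
```

The change made this way does not persist across a reboot.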

Feel free to test the above patch on your systems.

Cheers,

Paolo


* Re: After a while of system running no incoming UDP any more?
  2017-07-28  8:05     ` Eric Dumazet
@ 2017-07-28  8:15       ` Paolo Abeni
  2017-07-28  8:41         ` Eric Dumazet
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Abeni @ 2017-07-28  8:15 UTC (permalink / raw)
  To: Eric Dumazet, Marc Haber; +Cc: netdev

Hi,

On Fri, 2017-07-28 at 01:05 -0700, Eric Dumazet wrote:
> 
> To confirm the refcount issue, you might send us :
> 
> cat /proc/net/udp6
> 
> Normally, the refcount column should have a value of 2

I think that the leaked sockets are still unhashed on close() (via
udp_lib_close() -> sk_common_release() -> sk->sk_prot->unhash()), so they
should not be listed there, should they? (Rushed answer, subject to
lack-of-coffee issues.)

Thanks,

Paolo


* Re: After a while of system running no incoming UDP any more?
  2017-07-28  8:15       ` Paolo Abeni
@ 2017-07-28  8:41         ` Eric Dumazet
  0 siblings, 0 replies; 14+ messages in thread
From: Eric Dumazet @ 2017-07-28  8:41 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Marc Haber, netdev

On Fri, 2017-07-28 at 10:15 +0200, Paolo Abeni wrote:
> Hi,
> 
> On Fri, 2017-07-28 at 01:05 -0700, Eric Dumazet wrote:

> I think that the leaked sockets are still unhashed on close() (via
> udp_lib_close() -> sk_common_release() ->sk_proto->unhash()), so they
> should not be listed there ?!? (Rush answer, subject to lack of coffee
> issues)

If the daemon is still running, the sockets are not closed, and we can
watch the refcount increase.

We can see this even with a UDP_RR workload (no UDP flood at all):

lpk51:~# while :; do ./super_netperf 10 -H 2002:af6:b34:: -t UDP_RR -l 1 -- -N; done


lpk52:~# cat /proc/net/udp6|grep -v " 2 "
  sl  local_address                         remote_address                        st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode ref pointer drops
  293: F60A02200000340B0000000000000000:F0A9 F60A02200000330B0000000000000000:AB11 01 00000000:00000000 00:00000000 00000000     0        0 583763 387 ffff880ddb862c00 0
 3251: F60A02200000340B0000000000000000:FC37 F60A02200000330B0000000000000000:EA7F 01 00000000:00000000 00:00000000 00000000     0        0 586126 356 ffff880dd25a7940 0
 4043: F60A02200000340B0000000000000000:FF4F F60A02200000330B0000000000000000:9A35 01 00000500:00000000 00:00000000 00000000     0        0 583760 409 ffff880ddb863180 0
 7832: F60A02200000340B0000000000000000:8E1C F60A02200000330B0000000000000000:DEC1 01 00000000:00000000 00:00000000 00000000     0        0 583766 387 ffff880ddb863700 0
 8568: F60A02200000340B0000000000000000:90FC F60A02200000330B0000000000000000:8A60 01 00000000:00000000 00:00000000 00000000     0        0 580369 379 ffff880d9e7b2700 0
13085: F60A02200000340B0000000000000000:A2A1 F60A02200000330B0000000000000000:CDB3 01 00000000:00000000 00:00000000 00000000     0        0 586915 397 ffff880e31994140 0
14975: F60A02200000340B0000000000000000:AA03 F60A02200000330B0000000000000000:9C71 01 00000000:00000000 00:00000000 00000000     0        0 586123 355 ffff880d9d1baec0 0
15155: F60A02200000340B0000000000000000:AAB7 F60A02200000330B0000000000000000:D324 01 00000000:00000000 00:00000000 00000000     0        0 583769 367 ffff880dd88141c0 0
25421: F60A02200000340B0000000000000000:D2D1 F60A02200000330B0000000000000000:CA33 01 00000000:00000000 00:00000000 00000000     0        0 586120 424 ffff880dfda4b0c0 0
28774: F60A02200000340B0000000000000000:DFEA F60A02200000330B0000000000000000:DD05 01 00000500:00000000 00:00000000 00000000     0        0 583757 411 ffff880ddb862680 0
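
The grep filter above matches " 2 " anywhere on a line, so it can hide
an entry whose uid or timeout column happens to be 2. A minimal sketch
of a stricter check, assuming the column layout shown in the listing
(ref is the 11th whitespace-separated field); the function name
list_leaky_udp6 is made up for illustration:

```shell
# list_leaky_udp6 [FILE]
# Print "slot local_address ref" for every socket whose ref column is
# not 2. FILE defaults to /proc/net/udp6; pass /proc/net/udp for IPv4.
# Assumption: the ref count is field 11, as in the listing above.
list_leaky_udp6() {
    awk 'NR > 1 && $11 != 2 {      # NR > 1 skips the header line
        sub(/:$/, "", $1)          # drop the trailing colon from the slot
        print $1, $2, $11
    }' "${1:-/proc/net/udp6}"
}
```

Running it in a loop (for example once a minute) while the daemon is
up should show whether the ref value of a given socket keeps growing.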


* Re: After a while of system running no incoming UDP any more?
  2017-07-28  8:07     ` Paolo Abeni
@ 2017-07-28 12:14       ` Marc Haber
  2017-08-11 14:34         ` Marc Haber
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Haber @ 2017-07-28 12:14 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: netdev

On Fri, Jul 28, 2017 at 10:07:57AM +0200, Paolo Abeni wrote:
> As a workaround you can disable UDP early demux:
> 
> echo 0 > /proc/sys/net/ipv4/udp_early_demux
> 
> (will affect both ipv4 and ipv6).
> 
> and (if the system is already in the bad state) raise the UDP
> accounted memory limit by writing values greater than the current
> ones to /proc/sys/net/ipv4/udp_mem (the actual values depend on the
> system's total memory).

I can confirm that these two changes make a system in bad state work
again immediately. Will try the patch on 4.12.4 later today.
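
The two steps can be sketched as a small script. This is only an
illustration: the helper names and the factor of 2 are assumptions
(the advice above only says "greater than the current ones"), and
apply_udp_workaround has to run as root.

```shell
# scaled_udp_mem [FILE] [FACTOR]
# Print the three udp_mem values (min, pressure, max) multiplied by
# FACTOR. FILE defaults to /proc/sys/net/ipv4/udp_mem, FACTOR to 2.
scaled_udp_mem() {
    awk -v f="${2:-2}" '{ printf "%d %d %d\n", $1 * f, $2 * f, $3 * f }' \
        "${1:-/proc/sys/net/ipv4/udp_mem}"
}

# apply_udp_workaround
# Disable UDP early demux, then raise the accounted memory limits.
# Neither setting survives a reboot.
apply_udp_workaround() {
    echo 0 > /proc/sys/net/ipv4/udp_early_demux
    # Compute the new values first, then write them back.
    new=$(scaled_udp_mem /proc/sys/net/ipv4/udp_mem 2) &&
        echo "$new" > /proc/sys/net/ipv4/udp_mem
}
```

Calling scaled_udp_mem on its own prints the proposed values without
changing anything, which is handy for a dry run.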

Thanks for helping!

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421


* Re: After a while of system running no incoming UDP any more?
  2017-07-28 12:14       ` Marc Haber
@ 2017-08-11 14:34         ` Marc Haber
  2017-08-11 20:07           ` Marc Haber
  0 siblings, 1 reply; 14+ messages in thread
From: Marc Haber @ 2017-08-11 14:34 UTC (permalink / raw)
  To: Paolo Abeni, netdev

On Fri, Jul 28, 2017 at 02:14:34PM +0200, Marc Haber wrote:
> I can confirm that these two changes make a system in bad state work
> again immediately. Will try the patch on 4.12.4 later today.

After upgrading my test systems to 4.12.5, the issue reappeared. This
shows me that the patch indeed helped (my patched 4.12.4 kernels didn't
show the bad behavior), and that the patch didn't make its way into
4.12.5. I have applied the patch to 4.12.5; the kernels are building.

The run-time fix works as well.

Greetings
Marc



* Re: After a while of system running no incoming UDP any more?
  2017-08-11 14:34         ` Marc Haber
@ 2017-08-11 20:07           ` Marc Haber
  0 siblings, 0 replies; 14+ messages in thread
From: Marc Haber @ 2017-08-11 20:07 UTC (permalink / raw)
  To: Paolo Abeni, netdev

On Fri, Aug 11, 2017 at 04:34:53PM +0200, Marc Haber wrote:
> On Fri, Jul 28, 2017 at 02:14:34PM +0200, Marc Haber wrote:
> > I can confirm that these two changes make a system in bad state work
> > again immediately. Will try the patch on 4.12.4 later today.
> 
> After upgrading my test systems to 4.12.5, the issue reappeared. This
> shows me that the patch indeed helped (my patched 4.12.4 kernels didn't
> show the bad behavior), and that the patch didn't make its way into
> 4.12.5. I have applied the patch to 4.12.5; the kernels are building.

It seems to be in the freshly released 4.12.6.

Greetings
Marc



end of thread, other threads:[~2017-08-11 20:07 UTC | newest]

Thread overview: 14+ messages
2017-07-24 12:09 After a while of system running no incoming UDP any more? Marc Haber
2017-07-24 14:19 ` Paolo Abeni
2017-07-25 11:57   ` Marc Haber
2017-07-25 12:17     ` Paolo Abeni
2017-07-26  8:10       ` Marc Haber
2017-07-26  8:33         ` Paolo Abeni
2017-07-28  6:26   ` Marc Haber
2017-07-28  8:05     ` Eric Dumazet
2017-07-28  8:15       ` Paolo Abeni
2017-07-28  8:41         ` Eric Dumazet
2017-07-28  8:07     ` Paolo Abeni
2017-07-28 12:14       ` Marc Haber
2017-08-11 14:34         ` Marc Haber
2017-08-11 20:07           ` Marc Haber
