* IPv6 neighbor discovery issues on 4.18
@ 2018-08-31 14:49 Brian Rak
From: Brian Rak @ 2018-08-31 14:49 UTC
  To: netdev

We've upgraded a few machines to a 4.18.3 kernel and we're running into 
weird IPv6 neighbor discovery issues.  Basically, the machines stop 
responding to inbound IPv6 neighbor solicitation requests, which very 
quickly breaks all IPv6 connectivity.
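
One way to confirm the solicitations actually arrive and go unanswered 
is to watch the ICMPv6 type byte with tcpdump (135/136 are neighbor 
solicitation/advertisement; the offset assumes no extension headers 
before the ICMPv6 header):

# tcpdump -i br0 -n 'icmp6 and (ip6[40] == 135 or ip6[40] == 136)'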

It seems like the routing table gets confused:

# ip -6 route get fe80::4e16:fc00:c7a0:7800 dev br0
RTNETLINK answers: Network is unreachable
# ping6 fe80::4e16:fc00:c7a0:7800 -I br0
connect: Network is unreachable
Yet:

# ip -6 route | grep fe80 | grep br0
fe80::/64 dev br0 proto kernel metric 256 pref medium

fe80::4e16:fc00:c7a0:7800 is the link-local IP of the server's default 
gateway.

In this case, br0 has a single adapter attached to it.

I haven't been able to come up with any sort of reproduction steps here; 
this seems to happen after a few days of uptime in our environment.  The 
last known-good release we have here is 4.17.13.

Any suggestions for troubleshooting this?  Sometimes we see machines fix 
themselves, but we haven't been able to figure out what triggers the 
recovery.


* Re: IPv6 neighbor discovery issues on 4.18 (and now 4.19)
@ 2019-01-09 21:33 Brian Rak
From: Brian Rak @ 2019-01-09 21:33 UTC
  To: netdev


On 8/31/2018 10:49 AM, Brian Rak wrote:
> We've upgraded a few machines to a 4.18.3 kernel and we're running 
> into weird IPv6 neighbor discovery issues.  Basically, the machines 
> stop responding to inbound IPv6 neighbor solicitation requests, which 
> very quickly breaks all IPv6 connectivity.
>
> It seems like the routing table gets confused:
>
> # ip -6 route get fe80::4e16:fc00:c7a0:7800 dev br0
> RTNETLINK answers: Network is unreachable
> # ping6 fe80::4e16:fc00:c7a0:7800 -I br0
> connect: Network is unreachable
> Yet:
>
> # ip -6 route | grep fe80 | grep br0
> fe80::/64 dev br0 proto kernel metric 256 pref medium
>
> fe80::4e16:fc00:c7a0:7800 is the link-local IP of the server's default 
> gateway.
>
> In this case, br0 has a single adapter attached to it.
>
> I haven't been able to come up with any sort of reproduction steps 
> here; this seems to happen after a few days of uptime in our 
> environment.  The last known-good release we have here is 4.17.13.
>
> Any suggestions for troubleshooting this?  Sometimes we see machines 
> fix themselves, but we haven't been able to figure out what triggers 
> the recovery.
>
So, we're still seeing this on 4.19.13.  I've been investigating this a 
little further and have discovered a few more things:

The server also fails to respond to IPv6 neighbor discovery requests:

16:12:10.181769 IP6 fe80::629c:9fff:fe22:4b80 > ff02::1:ff00:33: ICMP6, neighbor solicitation, who has 2001:x::33, length 32

But this IP is configured properly:

# ip -6 addr show dev br0
7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
     inet6 2001:x::33/64 scope global
        valid_lft forever preferred_lft forever
     inet6 fe80::ec4:7aff:fe88:c48c/64 scope link
        valid_lft forever preferred_lft forever
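
One thing worth ruling out on our side is the multicast subscription, 
since solicitations for this address arrive on the solicited-node group 
(ff02::1:ff00:33); this shows whether br0 is actually joined to it:

# ip -6 maddr show dev br0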

I found some instructions that suggest using `perf` to determine where 
packets are getting dropped, so I tried:

# perf record -g -a -e skb:kfree_skb
# perf script

which showed me this seemingly relevant place (and a bunch of other 
drops):

swapper     0 [037] 161501.062542: skb:kfree_skb: skbaddr=0xffff968771988600 protocol=34525 location=0xffffffff94796c6a
         ffffffff9468d50b kfree_skb+0x7b ([kernel.kallsyms])
         ffffffff94796c6a ndisc_send_skb+0x2fa ([kernel.kallsyms])
         ffffffff947975b4 ndisc_send_na+0x184 ([kernel.kallsyms])
         ffffffff94798143 ndisc_recv_ns+0x2f3 ([kernel.kallsyms])
         ffffffff94799b46 ndisc_rcv+0xe6 ([kernel.kallsyms])
         ffffffff947a1fa8 icmpv6_rcv+0x428 ([kernel.kallsyms])
         ffffffff9477bcd3 ip6_input_finish+0xf3 ([kernel.kallsyms])
         ffffffff9477c11f ip6_input+0x3f ([kernel.kallsyms])
         ffffffff9477c787 ip6_mc_input+0x97 ([kernel.kallsyms])
         ffffffff9477c0cc ip6_rcv_finish+0x7c ([kernel.kallsyms])
         ffffffff947d9fd2 ip_sabotage_in+0x42 ([kernel.kallsyms])
         ffffffff946f3822 nf_hook_slow+0x42 ([kernel.kallsyms])
         ffffffff9477c569 ipv6_rcv+0xc9 ([kernel.kallsyms])
         ffffffff946a5de7 __netif_receive_skb_one_core+0x57 ([kernel.kallsyms])
         ffffffff946a5e48 __netif_receive_skb+0x18 ([kernel.kallsyms])
         ffffffff946a5145 netif_receive_skb_internal+0x45 ([kernel.kallsyms])
         ffffffff946a520c netif_receive_skb+0x1c ([kernel.kallsyms])
         ffffffff947c7d03 br_netif_receive_skb+0x43 ([kernel.kallsyms])
         ffffffff947c7ded br_pass_frame_up+0xcd ([kernel.kallsyms])
         ffffffff947c80ca br_handle_frame_finish+0x24a ([kernel.kallsyms])
         ffffffff947dae0f br_nf_hook_thresh+0xdf ([kernel.kallsyms])
         ffffffff947dbf19 br_nf_pre_routing_finish_ipv6+0x109 ([kernel.kallsyms])
         ffffffff947dc39a br_nf_pre_routing_ipv6+0xfa ([kernel.kallsyms])
         ffffffff947dbbe9 br_nf_pre_routing+0x1c9 ([kernel.kallsyms])
         ffffffff946f3822 nf_hook_slow+0x42 ([kernel.kallsyms])
         ffffffff947c850f br_handle_frame+0x1ef ([kernel.kallsyms])
         ffffffff946a5471 __netif_receive_skb_core+0x211 ([kernel.kallsyms])
         ffffffff946a5dcb __netif_receive_skb_one_core+0x3b ([kernel.kallsyms])
         ffffffff946a5e48 __netif_receive_skb+0x18 ([kernel.kallsyms])
         ffffffff946a5145 netif_receive_skb_internal+0x45 ([kernel.kallsyms])
         ffffffff946a6fb0 napi_gro_receive+0xd0 ([kernel.kallsyms])
         ffffffffc05c319f ixgbe_clean_rx_irq+0x46f ([kernel.kallsyms])
         ffffffffc05c4610 ixgbe_poll+0x280 ([kernel.kallsyms])
         ffffffff946a6729 net_rx_action+0x289 ([kernel.kallsyms])
         ffffffff94c000d1 __softirqentry_text_start+0xd1 ([kernel.kallsyms])
         ffffffff94075108 irq_exit+0xe8 ([kernel.kallsyms])
         ffffffff94a01a69 do_IRQ+0x59 ([kernel.kallsyms])
         ffffffff94a0098f ret_from_intr+0x0 ([kernel.kallsyms])
         ffffffff9464e01d cpuidle_enter_state+0xbd ([kernel.kallsyms])
         ffffffff9464e287 cpuidle_enter+0x17 ([kernel.kallsyms])
         ffffffff940a3cd3 call_cpuidle+0x23 ([kernel.kallsyms])
         ffffffff940a3f78 do_idle+0x1c8 ([kernel.kallsyms])
         ffffffff940a4203 cpu_startup_entry+0x73 ([kernel.kallsyms])
         ffffffff9403fade start_secondary+0x1ae ([kernel.kallsyms])
         ffffffff940000d4 secondary_startup_64+0xa4 ([kernel.kallsyms])

However, I can't seem to determine why this is failing.  It seems like 
the only way to hit kfree_skb within ndisc_send_skb would be if 
icmp6_dst_alloc fails?
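
For reference, this is the path I mean, from 4.19's net/ipv6/ndisc.c 
(trimmed; sk here is net->ipv6.ndisc_sk from earlier in the function):

	if (!dst) {
		struct flowi6 fl6;
		int oif = skb->dev->ifindex;

		icmpv6_flow_init(sk, &fl6, type, saddr, daddr, oif);
		dst = icmp6_dst_alloc(skb->dev, &fl6);
		if (IS_ERR(dst)) {
			kfree_skb(skb);	/* the drop showing up in the trace */
			return;
		}

		skb_dst_set(skb, dst);
	}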


* Re: IPv6 neighbor discovery issues on 4.18 (and now 4.19)
@ 2019-01-11 18:19 Brian Rak
From: Brian Rak @ 2019-01-11 18:19 UTC
  To: netdev


On 1/9/2019 4:33 PM, Brian Rak wrote:
>
> On 8/31/2018 10:49 AM, Brian Rak wrote:
>> We've upgraded a few machines to a 4.18.3 kernel and we're running 
>> into weird IPv6 neighbor discovery issues.  Basically, the machines 
>> stop responding to inbound IPv6 neighbor solicitation requests, which 
>> very quickly breaks all IPv6 connectivity.
>>
>> It seems like the routing table gets confused:
>>
>> # ip -6 route get fe80::4e16:fc00:c7a0:7800 dev br0
>> RTNETLINK answers: Network is unreachable
>> # ping6 fe80::4e16:fc00:c7a0:7800 -I br0
>> connect: Network is unreachable
>> Yet:
>>
>> # ip -6 route | grep fe80 | grep br0
>> fe80::/64 dev br0 proto kernel metric 256 pref medium
>>
>> fe80::4e16:fc00:c7a0:7800 is the link-local IP of the server's 
>> default gateway.
>>
>> In this case, br0 has a single adapter attached to it.
>>
>> I haven't been able to come up with any sort of reproduction steps 
>> here; this seems to happen after a few days of uptime in our 
>> environment.  The last known-good release we have here is 4.17.13.
>>
>> Any suggestions for troubleshooting this?  Sometimes we see machines 
>> fix themselves, but we haven't been able to figure out what triggers 
>> the recovery.
>>
> So, we're still seeing this on 4.19.13.  I've been investigating this 
> a little further and have discovered a few more things:
>
> The server also fails to respond to IPv6 neighbor discovery requests:
>
> 16:12:10.181769 IP6 fe80::629c:9fff:fe22:4b80 > ff02::1:ff00:33: ICMP6, neighbor solicitation, who has 2001:x::33, length 32
>
> But this IP is configured properly:
>
> # ip -6 addr show dev br0
> 7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP qlen 1000
>     inet6 2001:x::33/64 scope global
>        valid_lft forever preferred_lft forever
>     inet6 fe80::ec4:7aff:fe88:c48c/64 scope link
>        valid_lft forever preferred_lft forever
>
> I found some instructions that suggest using `perf` to determine where 
> packets are getting dropped, so I tried:
>
> # perf record -g -a -e skb:kfree_skb
> # perf script
>
> which showed me this seemingly relevant place (and a bunch of other 
> drops):
>
> swapper     0 [037] 161501.062542: skb:kfree_skb: skbaddr=0xffff968771988600 protocol=34525 location=0xffffffff94796c6a
>         ffffffff9468d50b kfree_skb+0x7b ([kernel.kallsyms])
>         ffffffff94796c6a ndisc_send_skb+0x2fa ([kernel.kallsyms])
>         ffffffff947975b4 ndisc_send_na+0x184 ([kernel.kallsyms])
>         ffffffff94798143 ndisc_recv_ns+0x2f3 ([kernel.kallsyms])
>         ffffffff94799b46 ndisc_rcv+0xe6 ([kernel.kallsyms])
>         ffffffff947a1fa8 icmpv6_rcv+0x428 ([kernel.kallsyms])
>         ffffffff9477bcd3 ip6_input_finish+0xf3 ([kernel.kallsyms])
>         ffffffff9477c11f ip6_input+0x3f ([kernel.kallsyms])
>         ffffffff9477c787 ip6_mc_input+0x97 ([kernel.kallsyms])
>         ffffffff9477c0cc ip6_rcv_finish+0x7c ([kernel.kallsyms])
>         ffffffff947d9fd2 ip_sabotage_in+0x42 ([kernel.kallsyms])
>         ffffffff946f3822 nf_hook_slow+0x42 ([kernel.kallsyms])
>         ffffffff9477c569 ipv6_rcv+0xc9 ([kernel.kallsyms])
>         ffffffff946a5de7 __netif_receive_skb_one_core+0x57 ([kernel.kallsyms])
>         ffffffff946a5e48 __netif_receive_skb+0x18 ([kernel.kallsyms])
>         ffffffff946a5145 netif_receive_skb_internal+0x45 ([kernel.kallsyms])
>         ffffffff946a520c netif_receive_skb+0x1c ([kernel.kallsyms])
>         ffffffff947c7d03 br_netif_receive_skb+0x43 ([kernel.kallsyms])
>         ffffffff947c7ded br_pass_frame_up+0xcd ([kernel.kallsyms])
>         ffffffff947c80ca br_handle_frame_finish+0x24a ([kernel.kallsyms])
>         ffffffff947dae0f br_nf_hook_thresh+0xdf ([kernel.kallsyms])
>         ffffffff947dbf19 br_nf_pre_routing_finish_ipv6+0x109 ([kernel.kallsyms])
>         ffffffff947dc39a br_nf_pre_routing_ipv6+0xfa ([kernel.kallsyms])
>         ffffffff947dbbe9 br_nf_pre_routing+0x1c9 ([kernel.kallsyms])
>         ffffffff946f3822 nf_hook_slow+0x42 ([kernel.kallsyms])
>         ffffffff947c850f br_handle_frame+0x1ef ([kernel.kallsyms])
>         ffffffff946a5471 __netif_receive_skb_core+0x211 ([kernel.kallsyms])
>         ffffffff946a5dcb __netif_receive_skb_one_core+0x3b ([kernel.kallsyms])
>         ffffffff946a5e48 __netif_receive_skb+0x18 ([kernel.kallsyms])
>         ffffffff946a5145 netif_receive_skb_internal+0x45 ([kernel.kallsyms])
>         ffffffff946a6fb0 napi_gro_receive+0xd0 ([kernel.kallsyms])
>         ffffffffc05c319f ixgbe_clean_rx_irq+0x46f ([kernel.kallsyms])
>         ffffffffc05c4610 ixgbe_poll+0x280 ([kernel.kallsyms])
>         ffffffff946a6729 net_rx_action+0x289 ([kernel.kallsyms])
>         ffffffff94c000d1 __softirqentry_text_start+0xd1 ([kernel.kallsyms])
>         ffffffff94075108 irq_exit+0xe8 ([kernel.kallsyms])
>         ffffffff94a01a69 do_IRQ+0x59 ([kernel.kallsyms])
>         ffffffff94a0098f ret_from_intr+0x0 ([kernel.kallsyms])
>         ffffffff9464e01d cpuidle_enter_state+0xbd ([kernel.kallsyms])
>         ffffffff9464e287 cpuidle_enter+0x17 ([kernel.kallsyms])
>         ffffffff940a3cd3 call_cpuidle+0x23 ([kernel.kallsyms])
>         ffffffff940a3f78 do_idle+0x1c8 ([kernel.kallsyms])
>         ffffffff940a4203 cpu_startup_entry+0x73 ([kernel.kallsyms])
>         ffffffff9403fade start_secondary+0x1ae ([kernel.kallsyms])
>         ffffffff940000d4 secondary_startup_64+0xa4 ([kernel.kallsyms])
>
> However, I can't seem to determine why this is failing.  It seems like 
> the only way to hit kfree_skb within ndisc_send_skb would be if 
> icmp6_dst_alloc fails?


So, I applied a dumb patch to log failures:

diff -baur linux-4.19.13/net/ipv6/ndisc.c linux-4.19.13-dirty/net/ipv6/ndisc.c
--- linux-4.19.13/net/ipv6/ndisc.c    2018-12-29 07:37:59.000000000 -0500
+++ linux-4.19.13-dirty/net/ipv6/ndisc.c    2019-01-09 16:37:59.140042846 -0500
@@ -470,6 +470,7 @@
          icmpv6_flow_init(sk, &fl6, type, saddr, daddr, oif);
          dst = icmp6_dst_alloc(skb->dev, &fl6);
          if (IS_ERR(dst)) {
+            net_warn_ratelimited("Dropping ndisc response due to icmp6_dst_alloc failure: %ld\n", PTR_ERR(dst));
              kfree_skb(skb);
              return;
          }

That ends up producing a bunch of this:

[73531.594663] ICMPv6: Dropping ndisc response due to icmp6_dst_alloc failure: -12
[73532.361678] ICMPv6: Dropping ndisc response due to icmp6_dst_alloc failure: -12
[73533.319860] ICMPv6: Dropping ndisc response due to icmp6_dst_alloc failure: -12
[73534.089759] ICMPv6: Dropping ndisc response due to icmp6_dst_alloc failure: -12

-12 is ENOMEM, which suggests that dst_alloc itself is failing 
(ip6_dst_alloc looks to be a simple wrapper around dst_alloc).
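
If I'm reading the 4.19 source correctly, dst_alloc (net/core/dst.c) 
refuses the allocation outright once the entry count is over gc_thresh 
and the gc callback reports failure (trimmed excerpt):

	if (ops->gc && dst_entries_get_fast(ops) > ops->gc_thresh) {
		if (ops->gc(ops)) {	/* ip6_dst_gc for IPv6 */
			printk_ratelimited(KERN_NOTICE "Route cache is full: "
					   "consider increasing sysctl "
					   "net.ipv[46].route.max_size.\n");
			return NULL;	/* icmp6_dst_alloc maps this to ERR_PTR(-ENOMEM) */
		}
	}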

If I look at `trace-cmd record -p function -l ip6_dst_gc`, I see that 
this function is getting called about once a second.
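
That would make sense if gc keeps "failing": ip6_dst_gc in 
net/ipv6/route.c (again 4.19-era, trimmed) returns nonzero exactly when 
the entry count is still above max_size after a gc run:

	static int ip6_dst_gc(struct dst_ops *ops)
	{
		struct net *net = container_of(ops, struct net,
					       ipv6.ip6_dst_ops);
		int rt_max_size = net->ipv6.sysctl.ip6_rt_max_size;
		int entries;
		...
		net->ipv6.ip6_rt_gc_expire++;
		fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, true);
		entries = dst_entries_get_slow(ops);
		...
		/* "failure" == still over net.ipv6.route.max_size */
		return entries > rt_max_size;
	}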

I have net.ipv6.route.max_size=4096, and the machine only has 376 routes 
(per `ip -6 route | wc -l`).  However, raising this sysctl to 65k seems 
to instantly fix IPv6 (I'm not sure yet whether this is a permanent fix).
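
For anyone else hitting this, the runtime change is just, e.g.:

# sysctl -w net.ipv6.route.max_size=65536

though if entries are leaking, this presumably only buys time.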

Does this indicate that the machine is leaking IPv6 dst entries?  How 
would I determine what is leaking?

This is from shortly after raising the max_size:

# cat /proc/net/rt6_stats
02b9 015f 13e597 04ab 0000 1031 0b3c
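
If I'm reading rt6_stats_seq_show in net/ipv6/route.c (4.19-era) right, 
the seven fields are printed like this:

	seq_printf(seq, "%04x %04x %04x %04x %04x %04x %04x\n",
		   net->ipv6.rt6_stats->fib_nodes,
		   net->ipv6.rt6_stats->fib_route_nodes,
		   atomic_read(&net->ipv6.rt6_stats->fib_rt_alloc),
		   net->ipv6.rt6_stats->fib_rt_entries,
		   net->ipv6.rt6_stats->fib_rt_cache,
		   dst_entries_get_slow(&net->ipv6.ip6_dst_ops),
		   net->ipv6.rt6_stats->fib_discarded_routes);

so the sixth field (current dst entries) would be 0x1031 = 4145, already 
above the old max_size of 4096 even though we only have 376 routes, 
which does look like a leak to me.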

