netdev.vger.kernel.org archive mirror
* [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
@ 2021-01-29 17:44 Pierre Cheynier
  2021-01-30  3:27 ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Pierre Cheynier @ 2021-01-29 17:44 UTC (permalink / raw)
  To: netdev; +Cc: kuba

Dear list,

I noticed this assertion failure after upgrading to 5.10.x (the latest kernel I tried is 5.10.11).
It is triggered indirectly by my use of the vxlan module; the assertion output below should give you the information you need about my hardware context (i40e).

[    8.842462] ------------[ cut here ]------------
[    8.847081] RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c (557)
[    8.853541] WARNING: CPU: 0 PID: 15 at net/ipv4/udp_tunnel_nic.c:557 __udp_tunnel_nic_reset_ntf+0xde/0xf0 [udp_tunnel]
[    8.864226] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel sg mlx4_en mlx4_core ipvlan i40e(+) ptp pps_core ahci(+) libahci libata ipmi_si ipmi_devintf ipmi_msghandler ip_vs_mh ip_vs nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c crc32c_intel
[    8.886539] CPU: 0 PID: 15 Comm: kworker/0:1 Not tainted 5.10.11-1.el7.x86_64 #1
[    8.893927] Hardware name: Quanta Cloud Technology Inc. QuantaPlex T42S-2U(LBG-2)/T42S-2U MB (Lewisburg-2), BIOS 3A14.Q301 05/03/2019
[    8.905919] Workqueue: events work_for_cpu_fn
[    8.910283] RIP: 0010:__udp_tunnel_nic_reset_ntf+0xde/0xf0 [udp_tunnel]
[    8.916896] Code: ef 20 00 00 00 0f 85 5f ff ff ff ba 2d 02 00 00 48 c7 c6 32 23 19 c0 48 c7 c7 10 2e 19 c0 c6 05 cf 20 00 00 01 e8 6f f9 74 c3 <0f> 0b e9 39 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
[    8.935641] RSP: 0018:ffff9afc86447b68 EFLAGS: 00010286
[    8.940868] RAX: 0000000000000000 RBX: ffff8ea4435a6768 RCX: 0000000000000000
[    8.948000] RDX: ffff8ea41fe27a20 RSI: ffff8ea41fe17c40 RDI: ffff8ea41fe17c40
[    8.955133] RBP: ffff8e9cc85ef000 R08: ffff8ea41fe17c40 R09: ffff9afc86447980
[    8.962265] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
[    8.969399] R13: 0000000000000000 R14: ffff8ea4435a6008 R15: ffff8ea445320000
[    8.976533] FS:  0000000000000000(0000) GS:ffff8ea41fe00000(0000) knlGS:0000000000000000
[    8.984617] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.990365] CR2: 00007f3d32b26160 CR3: 000000038d60a001 CR4: 00000000007706f0
[    8.997498] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    9.004639] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    9.011785] PKRU: 55555554
[    9.014499] Call Trace:
[    9.016968]  i40e_setup_pf_switch+0x3e8/0x5e0 [i40e]
[    9.021949]  i40e_probe.part.0.cold+0x87a/0x11f2 [i40e]
[    9.027175]  ? kmem_cache_alloc+0x39e/0x3f0
[    9.031361]  ? irq_get_irq_data+0xa/0x20
[    9.035286]  ? mp_check_pin_attr+0x13/0xc0
[    9.039399]  ? irq_get_irq_data+0xa/0x20
[    9.043329]  ? mp_map_pin_to_irq+0xd2/0x2f0
[    9.047514]  ? acpi_register_gsi_ioapic+0x90/0x170
[    9.052309]  ? pci_conf1_read+0xa4/0x100
[    9.056235]  ? pci_bus_read_config_word+0x49/0x70
[    9.060938]  ? do_pci_enable_device+0xd0/0x100
[    9.065385]  local_pci_probe+0x42/0x80
[    9.069140]  ? __schedule+0x32f/0x7e0
[    9.072803]  work_for_cpu_fn+0x16/0x20
[    9.076556]  process_one_work+0x1b0/0x350
[    9.080568]  worker_thread+0x1dc/0x3a0
[    9.084322]  ? process_one_work+0x350/0x350
[    9.088510]  kthread+0xfe/0x140
[    9.088513]  ? kthread_park+0x90/0x90
[    9.088516]  ret_from_fork+0x1f/0x30
[    9.088522] ---[ end trace daa573e87ec91564 ]---

Cheers,
-- 
Pierre Cheynier
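
For readers less used to kernel traces: entries such as `__udp_tunnel_nic_reset_ntf+0xde/0xf0 [udp_tunnel]` give the symbol name, the offset into that function, the function's total size, and (in brackets) the module it lives in. The following is only a small userspace sketch of pulling those fields apart; `struct trace_sym` and `parse_trace_sym()` are hypothetical names made up for this illustration and are not part of any kernel tooling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one "symbol+0xoff/0xsize [module]" entry. */
struct trace_sym {
	char name[128];
	unsigned long offset;	/* bytes into the function */
	unsigned long size;	/* total size of the function */
	char module[64];	/* left empty when no [module] suffix is present */
};

/*
 * Parse one entry; returns 0 on success, -1 if the text does not have the
 * expected shape. Entries with a leading "? " (unreliable frames) are not
 * handled by this sketch and are rejected.
 */
static int parse_trace_sym(const char *text, struct trace_sym *out)
{
	int n;

	assert(text != NULL && out != NULL);
	memset(out, 0, sizeof(*out));
	n = sscanf(text, " %127[^+ ]+0x%lx/0x%lx [%63[^]]]",
		   out->name, &out->offset, &out->size, out->module);
	/* Three fields are mandatory; the module suffix is optional. */
	return n >= 3 ? 0 : -1;
}
```

Applied to the RIP line above, this yields offset 0xde within a function of size 0xf0, i.e. the warning fires near the end of `__udp_tunnel_nic_reset_ntf()`.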

* Re: [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-01-29 17:44 [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c Pierre Cheynier
@ 2021-01-30  3:27 ` Jakub Kicinski
  2021-02-02  9:59   ` Pierre Cheynier
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2021-01-30  3:27 UTC (permalink / raw)
  To: Pierre Cheynier, Jesse Brandeburg, Tony Nguyen, intel-wired-lan; +Cc: netdev

On Fri, 29 Jan 2021 17:44:12 +0000 Pierre Cheynier wrote:
> Dear list,
> 
> I noticed this assertion failure after upgrading to 5.10.x (the latest kernel I tried is 5.10.11).
> It is triggered indirectly by my use of the vxlan module; the assertion output below should give you the information you need about my hardware context (i40e).
> 
> [    8.842462] ------------[ cut here ]------------
> [    8.847081] RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c (557)
> [    8.853541] WARNING: CPU: 0 PID: 15 at net/ipv4/udp_tunnel_nic.c:557 __udp_tunnel_nic_reset_ntf+0xde/0xf0 [udp_tunnel]

> [    8.910283] RIP: 0010:__udp_tunnel_nic_reset_ntf+0xde/0xf0 [udp_tunnel]

> [    9.014499] Call Trace:
> [    9.016968]  i40e_setup_pf_switch+0x3e8/0x5e0 [i40e]
> [    9.021949]  i40e_probe.part.0.cold+0x87a/0x11f2 [i40e]
> [    9.065385]  local_pci_probe+0x42/0x80

Thanks for the report!

I must have missed that i40e_setup_pf_switch() is called from the probe
path.

Intel folks, does the UDP port table get reset only when reinit is true?
So can this be the fix?

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 521ea9df38d5..4f3e7201ec1e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -14269,7 +14269,8 @@ static int i40e_setup_pf_switch(struct i40e_pf *pf, bool reinit)
        i40e_ptp_init(pf);
 
        /* repopulate tunnel port filters */
-       udp_tunnel_nic_reset_ntf(pf->vsi[pf->lan_vsi]->netdev);
+       if (reinit)
+               udp_tunnel_nic_reset_ntf(pf->vsi[pf->lan_vsi]->netdev);
 
        return ret;
 }

Or do we need to exclude the first call like this?

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 521ea9df38d5..823c054f4c23 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -14269,7 +14269,8 @@ static int i40e_setup_pf_switch(struct i40e_pf *pf, bool reinit)
        i40e_ptp_init(pf);
 
        /* repopulate tunnel port filters */
-       udp_tunnel_nic_reset_ntf(pf->vsi[pf->lan_vsi]->netdev);
+       if (pf->lan_vsi != I40E_NO_VSI)
+               udp_tunnel_nic_reset_ntf(pf->vsi[pf->lan_vsi]->netdev);
 
        return ret;
 }
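
To make the failure mode concrete, here is a toy userspace model of it, not kernel code. It assumes what the trace suggests: the notifier checks that RTNL is held, the probe path reaches it without the lock and with reinit false, and a rebuild path reaches it with the lock held and reinit true (that last part is an assumption of this sketch, not something confirmed in this thread). Every `toy_*` name is hypothetical. Unlike the real check, which warns once, this model counts every failed check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the RTNL mutex: only tracks whether it is held. */
static bool rtnl_held;
static int rtnl_warnings;	/* how often the lock check tripped */
static int reset_calls;		/* how many notifications were issued */

static void toy_rtnl_lock(void)   { assert(!rtnl_held); rtnl_held = true; }
static void toy_rtnl_unlock(void) { assert(rtnl_held);  rtnl_held = false; }

/* Models the RTNL assertion: complain, but carry on, if the lock is not held. */
static void toy_assert_rtnl(const char *where)
{
	if (!rtnl_held) {
		rtnl_warnings++;
		fprintf(stderr, "RTNL: assertion failed at %s\n", where);
	}
}

/* Models the reset notifier, which expects its caller to hold RTNL. */
static void toy_reset_ntf(void)
{
	toy_assert_rtnl("toy_reset_ntf");
	reset_calls++;
}

/*
 * Models the tail of i40e_setup_pf_switch(). With guard == false the
 * notifier is always called; with guard == true it is called only when
 * reinit is set, which is the idea behind the first question above.
 */
static void toy_setup_pf_switch(bool reinit, bool guard)
{
	if (!guard || reinit)
		toy_reset_ntf();
}

/* Probe path: reinit is false and nobody has taken the lock. */
static void toy_probe(bool guard)
{
	toy_setup_pf_switch(false, guard);
}

/* Rebuild path: assumed to run with the lock held and reinit true. */
static void toy_rebuild(bool guard)
{
	toy_rtnl_lock();
	toy_setup_pf_switch(true, guard);
	toy_rtnl_unlock();
}
```

In this model `toy_probe(false)` trips the check, while `toy_probe(true)` skips the notification and `toy_rebuild(true)` issues it under the lock. Whether skipping the notification at probe time is actually safe for the hardware is exactly the open question put to the Intel developers.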

* RE: [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-01-30  3:27 ` Jakub Kicinski
@ 2021-02-02  9:59   ` Pierre Cheynier
  2021-02-02 16:30     ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Pierre Cheynier @ 2021-02-02  9:59 UTC (permalink / raw)
  To: Jakub Kicinski, Jesse Brandeburg, Tony Nguyen, intel-wired-lan; +Cc: netdev


On Sat, 30 Jan 2021 04:27:00 +0100 Jakub Kicinski wrote:

> I must have missed that i40e_setup_pf_switch() is called from the probe
> path.

Do you want me to apply these patches, rebuild and tell you what the
outcome is?

--
Pierre

* Re: [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-02-02  9:59   ` Pierre Cheynier
@ 2021-02-02 16:30     ` Jakub Kicinski
  2021-02-02 16:55       ` Nguyen, Anthony L
  2021-02-03 14:23       ` [Intel-wired-lan] " Sokolowski, Jan
  0 siblings, 2 replies; 11+ messages in thread
From: Jakub Kicinski @ 2021-02-02 16:30 UTC (permalink / raw)
  To: Pierre Cheynier; +Cc: Jesse Brandeburg, Tony Nguyen, intel-wired-lan, netdev

On Tue, 2 Feb 2021 09:59:56 +0000 Pierre Cheynier wrote:
> On Sat, 30 Jan 2021 04:27:00 +0100 Jakub Kicinski wrote:
> 
> > I must have missed that i40e_setup_pf_switch() is called from the probe
> > path.  
> 
> Do you want me to apply these patches, rebuild and tell you what's the
> outcome?

I was hoping someone from Intel would step in and help.

* Re: [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-02-02 16:30     ` Jakub Kicinski
@ 2021-02-02 16:55       ` Nguyen, Anthony L
  2021-02-03 14:23       ` [Intel-wired-lan] " Sokolowski, Jan
  1 sibling, 0 replies; 11+ messages in thread
From: Nguyen, Anthony L @ 2021-02-02 16:55 UTC (permalink / raw)
  To: p.cheynier, kuba
  Cc: netdev, Sokolowski, Jan, Brandeburg, Jesse, intel-wired-lan,
	Loktionov, Aleksandr

On Tue, 2021-02-02 at 08:30 -0800, Jakub Kicinski wrote:
> On Tue, 2 Feb 2021 09:59:56 +0000 Pierre Cheynier wrote:
> > On Sat, 30 Jan 2021 04:27:00 +0100 Jakub Kicinski wrote:
> > 
> > > I must have missed that i40e_setup_pf_switch() is called from the
> > > probe
> > > path.  
> > 
> > Do you want me to apply these patches, rebuild and tell you what's
> > the
> > outcome?
> 
> I was hoping someone from Intel would step in and help.

I inquired with the i40e team about these proposed fixes. Adding a
couple of developers directly.


* RE: [Intel-wired-lan] [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-02-02 16:30     ` Jakub Kicinski
  2021-02-02 16:55       ` Nguyen, Anthony L
@ 2021-02-03 14:23       ` Sokolowski, Jan
  2021-02-03 15:25         ` Pierre Cheynier
  1 sibling, 1 reply; 11+ messages in thread
From: Sokolowski, Jan @ 2021-02-03 14:23 UTC (permalink / raw)
  To: Jakub Kicinski, Pierre Cheynier; +Cc: intel-wired-lan, netdev

It has been mentioned that the error only appeared recently, after upgrade to 5.10.X. What's the last known working configuration it was tested on? A bisection could help us investigate.
Jan


* RE: [Intel-wired-lan] [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-02-03 14:23       ` [Intel-wired-lan] " Sokolowski, Jan
@ 2021-02-03 15:25         ` Pierre Cheynier
  2021-02-03 16:05           ` Pierre Cheynier
  0 siblings, 1 reply; 11+ messages in thread
From: Pierre Cheynier @ 2021-02-03 15:25 UTC (permalink / raw)
  To: Sokolowski, Jan, Jakub Kicinski; +Cc: intel-wired-lan, netdev


On Wed, 3 Feb 2021 15:23:54 +0100 Sokolowski, Jan wrote:

> It has been mentioned that the error only appeared recently, after upgrade to 5.10.X. What's the last known working configuration it was tested on? A bisection could help us investigate.

Unfortunately I moved from one LTS to another: I was on 5.4 before, and as far as I know this UDP tunnel offloading feature landed in 5.9.

Maybe Jakub can point me to specific 5.9 or 5.10 kernel versions I could try, so that I can help narrow down where this was introduced (or whether it was present from the start)?

--
Pierre

* RE: [Intel-wired-lan] [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-02-03 15:25         ` Pierre Cheynier
@ 2021-02-03 16:05           ` Pierre Cheynier
  2021-02-03 17:08             ` Jakub Kicinski
  0 siblings, 1 reply; 11+ messages in thread
From: Pierre Cheynier @ 2021-02-03 16:05 UTC (permalink / raw)
  To: Sokolowski, Jan, Jakub Kicinski; +Cc: intel-wired-lan, netdev


On Wed, 3 Feb 2021 16:25:12 +0100 Pierre Cheynier wrote:
> On Wed, 3 Feb 2021 15:23:54 +0100 Sokolowski, Jan wrote:
> 
> > It has been mentioned that the error only appeared recently, after upgrade to 5.10.X. What's the last known working configuration it was tested on? A bisection could help us investigate.
> 
> I unfortunately moved from one LTS to another, meaning I was in 5.4 before, and this UDP tunnel offloading feature landed in 5.9 as far as I know.
> 
> Maybe Jakub can give pointers to specific 5.9 or 5.10 kernel versions I can eventually try, so that I can help refine where this was introduced (or if it was present from the start)?

So I think I was incorrect: i40e support for this infrastructure first appears in 5.10.
From what I can see (Jakub will confirm), this started with the
initial implementation for i40e (see 40a98cb6f01f013b8ab0ce7b28f705423ee16836).

--
Pierre

* Re: [Intel-wired-lan] [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-02-03 16:05           ` Pierre Cheynier
@ 2021-02-03 17:08             ` Jakub Kicinski
  2021-02-08  9:09               ` Pierre Cheynier
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Kicinski @ 2021-02-03 17:08 UTC (permalink / raw)
  To: Pierre Cheynier, Sokolowski, Jan; +Cc: intel-wired-lan, netdev

On Wed, 3 Feb 2021 16:05:40 +0000 Pierre Cheynier wrote:
> On Wed, 3 Feb 2021 16:25:12 +0100 Pierre Cheynier wrote:
> > On Wed, 3 Feb 2021 15:23:54 +0100 Sokolowski, Jan wrote:
> >   
> > > It has been mentioned that the error only appeared recently, after upgrade to 5.10.X. What's the last known working configuration it was tested on? A bisection could help us investigate.  
> > 
> > I unfortunately moved from one LTS to another, meaning I was in 5.4 before, and this UDP tunnel offloading feature landed in 5.9 as far as I know.
> > 
> > Maybe Jakub can give pointers to specific 5.9 or 5.10 kernel versions I can eventually try, so that I can help refine where this was introduced (or if it was present from the start)?  
> 
> So I think I was incorrect, the support of this infrastructure for i40e appears in 5.10.
> From what I'm seeing, and Jakub will confirm, I think this started with the
> initial implementation for i40e (see 40a98cb6f01f013b8ab0ce7b28f705423ee16836).

Yup! I'm pretty sure it's my conversion. The full commit quote upstream:

40a98cb6f01f ("i40e: convert to new udp_tunnel infrastructure")

It should trigger if you have vxlan module loaded (or built in) 
and then reload or re-probe i40e.

Let us know if you can't repro; it should pop up pretty reliably.

* RE: [Intel-wired-lan] [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-02-03 17:08             ` Jakub Kicinski
@ 2021-02-08  9:09               ` Pierre Cheynier
  2021-02-10  8:23                 ` Sokolowski, Jan
  0 siblings, 1 reply; 11+ messages in thread
From: Pierre Cheynier @ 2021-02-08  9:09 UTC (permalink / raw)
  To: Jakub Kicinski, Sokolowski, Jan; +Cc: intel-wired-lan, netdev


On Wed, 3 Feb 2021 18:08:31 +0100 Jakub Kicinski wrote:
> Yup! I'm pretty sure it's my conversion. The full commit quote upstream:
> 
> 40a98cb6f01f ("i40e: convert to new udp_tunnel infrastructure")
> 
> It should trigger if you have vxlan module loaded (or built in)
> and then reload or re-probe i40e.
> 
> Let us know if you can't repro; it should pop up pretty reliably.

I'm not sure whether this is under investigation on the Intel side; I can help test patches
or provide more info if needed.

--
Pierre

* RE: [Intel-wired-lan] [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c
  2021-02-08  9:09               ` Pierre Cheynier
@ 2021-02-10  8:23                 ` Sokolowski, Jan
  0 siblings, 0 replies; 11+ messages in thread
From: Sokolowski, Jan @ 2021-02-10  8:23 UTC (permalink / raw)
  To: Pierre Cheynier, Jakub Kicinski; +Cc: intel-wired-lan, netdev

The issue has been reproduced and is under investigation. Once we have more information or potential fixes, we'll contact you again.

Jan

Thread overview: 11+ messages
2021-01-29 17:44 [5.10] i40e/udp_tunnel: RTNL: assertion failed at net/ipv4/udp_tunnel_nic.c Pierre Cheynier
2021-01-30  3:27 ` Jakub Kicinski
2021-02-02  9:59   ` Pierre Cheynier
2021-02-02 16:30     ` Jakub Kicinski
2021-02-02 16:55       ` Nguyen, Anthony L
2021-02-03 14:23       ` [Intel-wired-lan] " Sokolowski, Jan
2021-02-03 15:25         ` Pierre Cheynier
2021-02-03 16:05           ` Pierre Cheynier
2021-02-03 17:08             ` Jakub Kicinski
2021-02-08  9:09               ` Pierre Cheynier
2021-02-10  8:23                 ` Sokolowski, Jan
