netdev.vger.kernel.org archive mirror
From: Saeed Mahameed <saeedm@mellanox.com>
To: "toke@toke.dk" <toke@toke.dk>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Cc: Eran Ben Elisha <eranbe@mellanox.com>,
	Tariq Toukan <tariqt@mellanox.com>,
	"brouer@redhat.com" <brouer@redhat.com>
Subject: Re: Kernel oops with mlx5 and dual XDP redirect programs
Date: Wed, 3 Oct 2018 23:44:07 +0000
Message-ID: <4e2cfdc3db244f4b9483a0c3dfc62fae55238bb3.camel@mellanox.com>
In-Reply-To: <877eize5ro.fsf@toke.dk>

On Wed, 2018-10-03 at 11:30 +0200, Toke Høiland-Jørgensen wrote:
> Hi Saeed
> 
> I can reliably oops the kernel with the mlx5 driver, by installing
> XDP_REDIRECT programs on two devices so they redirect to each other,
> and then removing them while there is traffic on the interface.
> 
> Steps to reproduce:
> 
> # cd ~/build/linux/samples/bpf
> # ./xdp_redirect_map $(</sys/class/net/ens1f1/ifindex) $(</sys/class/net/ens1f0/ifindex)
> # ./xdp_redirect_map $(</sys/class/net/ens1f0/ifindex) $(</sys/class/net/ens1f1/ifindex)
> 
> Now, run some traffic (e.g., using pktgen) across the interfaces, and
> while the traffic is running, interrupt one of the xdp_redirect_map
> commands (thus unloading the eBPF program). This results in a kernel
> oops with the backtrace below. I get no crash if there's only a single
> XDP program.

Hi Toke,

What seems to be happening is that, while traffic is being redirected
to the other device, the driver is trying to unload the program and
restart the rings. From the call trace below we can see:

[ 1400.972318] RIP: 0010:mlx5e_xdp_xmit+0x7b/0x2a0 [mlx5_core]
[ 1401.077409]  bq_xmit_all+0x5e/0x160
[ 1401.080897]  dev_map_enqueue+0x12e/0x140
[ 1401.084823]  xdp_do_redirect+0x1a9/0x2a0
[ 1401.088756]  mlx5e_xdp_handle+0x24f/0x2b0 [mlx5_core]

and
[ 1401.154559] RIP: 0010:mlx5e_open_channels+0x65e/0x1390 [mlx5_core]
[ 1401.222834]  ? mlx5e_open_channels+0x5e1/0x1390 [mlx5_core]
[ 1401.228404]  ? rcu_exp_wait_wake+0x550/0x550
[ 1401.232674]  ? free_one_page+0x68/0x370
[ 1401.236519]  mlx5e_open_locked+0x28/0xa0 [mlx5_core]
[ 1401.241491]  mlx5e_xdp+0x2b2/0x300 [mlx5_core]
[ 1401.245936]  dev_xdp_install+0x4c/0x70
[ 1401.249686]  do_setlink+0xcdb/0xd10
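
A side note on the first trace, offered as a reading of the oops rather
than a confirmed diagnosis: the bytes marked at the faulting RIP in the
Code: line (<49> 8b 8c 24 a8 3f 00 00) appear to decode to
"mov 0x3fa8(%r12),%rcx", and the register dump shows R12 = 0 with
CR2 = 0x3fa8. That would be consistent with a NULL pointer being
dereferenced at offset 0x3fa8 inside mlx5e_xdp_xmit. The small C sketch
below only checks that arithmetic; the helper names load_disp32 and
fault_address are made up for illustration and handle just this one
instruction shape.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes at the faulting RIP, copied from the "Code:" line of the oops
 * (the byte shown in <> there is the first byte of the instruction). */
const uint8_t fault_insn[8] = { 0x49, 0x8b, 0x8c, 0x24, 0xa8, 0x3f, 0x00, 0x00 };

/*
 * Extract the 32-bit displacement of a load encoded as:
 *   REX prefix, opcode 0x8b, ModRM (mod=10, rm=100), SIB, disp32 (little endian).
 * Returns 0 on success, -1 if the bytes do not have that shape.
 * Hypothetical helper: it is not a general x86 decoder.
 */
int load_disp32(const uint8_t *insn, size_t len, uint32_t *disp)
{
    if (len < 8)
        return -1;
    if ((insn[0] & 0xf0) != 0x40 || insn[1] != 0x8b)
        return -1;              /* not a REX-prefixed mov reg, r/m */
    if ((insn[2] & 0xc0) != 0x80 || (insn[2] & 0x07) != 0x04)
        return -1;              /* not mod=10 followed by a SIB byte */
    *disp = (uint32_t)insn[4] | (uint32_t)insn[5] << 8 |
            (uint32_t)insn[6] << 16 | (uint32_t)insn[7] << 24;
    return 0;
}

/* Effective address of the load for a given base register value (R12 here). */
uint64_t fault_address(uint64_t base, uint32_t disp)
{
    return base + disp;
}
```

Feeding fault_insn to load_disp32 and then calling fault_address with a
base of 0 (the R12 value in the dump) gives an address that can be
compared against the CR2 value reported in the oops.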

I think that the mlx5 driver doesn't know how to tell the other device
to stop transmitting to it while it is resetting. Maybe Tariq or
Jesper know more about this?
I will look at this tomorrow afternoon and will try to reproduce it.
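
To make the missing piece concrete, here is a minimal user-space sketch
of one possible shape for such a guard: the device being reset clears a
flag before its channels go away, and the transmit entry point used by
the redirecting device checks that flag first. This is an illustration
under assumptions, not the mlx5 code and not a proposed patch; all
names (fake_priv, xdp_tx_enabled, fake_xdp_xmit, fake_reset_begin,
fake_reset_end) are hypothetical, and the sketch does not model waiting
for transmitters that already passed the check, which a real driver
would also need.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CHANNELS 8              /* arbitrary size for the sketch */

struct channel {
    int sent;                       /* stand-in for an XDP send queue */
};

struct fake_priv {
    atomic_bool xdp_tx_enabled;     /* hypothetical "safe to redirect here" flag */
    struct channel *c[NUM_CHANNELS];/* NULL while channels are being rebuilt */
};

/* Entry point used by the redirecting device; refuses while the target resets. */
int fake_xdp_xmit(struct fake_priv *priv, unsigned int cpu)
{
    if (!atomic_load(&priv->xdp_tx_enabled))
        return -ENETDOWN;
    if (cpu >= NUM_CHANNELS || !priv->c[cpu])
        return -ENXIO;
    priv->c[cpu]->sent++;
    return 0;
}

/* Start of a reset: close the gate first, then drop the channels. */
void fake_reset_begin(struct fake_priv *priv)
{
    atomic_store(&priv->xdp_tx_enabled, false);
    /* A real driver would have to wait here for in-flight transmitters. */
    for (size_t i = 0; i < NUM_CHANNELS; i++)
        priv->c[i] = NULL;
}

/* End of a reset: install every channel, and only then reopen the gate. */
void fake_reset_end(struct fake_priv *priv, struct channel *chs)
{
    for (size_t i = 0; i < NUM_CHANNELS; i++)
        priv->c[i] = &chs[i];
    atomic_store(&priv->xdp_tx_enabled, true);
}
```

The point of the ordering is that the flag is cleared before any
channel pointer becomes invalid and set only after all of them are
valid again, so a transmitter that observes the flag as set never sees
a half-built channel array.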

What is interesting is that at the mlx5e_open_channels stage all previous
TX queues should still be active and not yet destroyed; only later, when
we switch to the new channels, do we stop and destroy the older TX/RX
queues. The question is how reliable this call trace is.

Thanks for the report.

> 
> Is this something you could look into, please? :)

> 
> -Toke
> 
> 
> [ 1400.937870] BUG: unable to handle kernel paging request at
> 0000000000003fa8
> [ 1400.944826] PGD 800000072cc7b067 P4D 800000072cc7b067 PUD
> 72cc7a067 PMD 0 
> [ 1400.951693] Oops: 0000 [#1] SMP PTI
> [ 1400.955184] CPU: 5 PID: 10392 Comm: xdp_redirect_ma Not tainted
> 4.19.0-rc5-xdptest-g5be3ebf+ #17
> [ 1400.965344] Hardware name: LENOVO 30B3005DMT/102F, BIOS S00KT56A
> 01/15/2018
> [ 1400.972318] RIP: 0010:mlx5e_xdp_xmit+0x7b/0x2a0 [mlx5_core]
> [ 1400.977889] Code: 8b 0d 29 d9 4f 3f 39 8f 48 39 00 00 b8 fa ff ff
> ff 0f 86 45 01 00 00 48 8b 87 40 39 00 00 48 63 c9 4c 8b 24 c8 b8 9c
> ff ff ff <49> 8b 8c 24 a8 3f 00 00 4d 8d bc 24 c0 3c 00 00 83 e1 01
> 0f 84 19
> [ 1400.996624] RSP: 0018:ffff90209fb43bb0 EFLAGS: 00010202
> [ 1401.002001] RAX: 00000000ffffff9c RBX: 0000000000000000 RCX:
> 0000000000000005
> [ 1401.009122] RDX: ffffc7627fd75190 RSI: 0000000000000010 RDI:
> ffff902084580000
> [ 1401.016250] RBP: ffffc7627fd75190 R08: ffff901f9821c100 R09:
> ffffc7627fd75210
> [ 1401.023379] R10: 00000000000005dc R11: 0000000000000000 R12:
> 0000000000000000
> [ 1401.030500] R13: ffff902081580000 R14: 0000000000000001 R15:
> ffffc7627fd75190
> [ 1401.037645] FS:  00007f460fa96700(0000) GS:ffff90209fb40000(0000)
> knlGS:0000000000000000
> [ 1401.045718] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1401.051452] CR2: 0000000000003fa8 CR3: 000000076c3b6006 CR4:
> 00000000003606e0
> [ 1401.058573] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 1401.065823] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [ 1401.072943] Call Trace:
> [ 1401.075390]  <IRQ>
> [ 1401.077409]  bq_xmit_all+0x5e/0x160
> [ 1401.080897]  dev_map_enqueue+0x12e/0x140
> [ 1401.084823]  xdp_do_redirect+0x1a9/0x2a0
> [ 1401.088756]  mlx5e_xdp_handle+0x24f/0x2b0 [mlx5_core]
> [ 1401.093821]  ? resched_cpu+0x5f/0x70
> [ 1401.097399]  ? __xdp_return+0x189/0x400
> [ 1401.101242]  mlx5e_skb_from_cqe_linear+0xdd/0x180 [mlx5_core]
> [ 1401.106987]  mlx5e_handle_rx_cqe+0x43/0xe0 [mlx5_core]
> [ 1401.112130]  mlx5e_poll_rx_cq+0xcb/0x940 [mlx5_core]
> [ 1401.117094]  mlx5e_napi_poll+0xa6/0xc90 [mlx5_core]
> [ 1401.121966]  ? smp_reschedule_interrupt+0x16/0xd0
> [ 1401.126789]  ? reschedule_interrupt+0xf/0x20
> [ 1401.131057]  ? reschedule_interrupt+0xa/0x20
> [ 1401.135321]  net_rx_action+0x279/0x3d0
> [ 1401.139071]  __do_softirq+0xf2/0x28e
> [ 1401.142651]  irq_exit+0xb6/0xc0
> [ 1401.145792]  do_IRQ+0x52/0xd0
> [ 1401.148785]  common_interrupt+0xf/0xf
> [ 1401.152445]  </IRQ>
> [ 1401.154559] RIP: 0010:mlx5e_open_channels+0x65e/0x1390 [mlx5_core]
> [ 1401.160734] Code: 8b 00 48 05 a8 00 00 00 48 89 85 78 3c 00 00 48
> 8b 83 f8 8d 01 00 48 89 85 80 3c 00 00 48 8b 83 f0 8d 01 00 8b 80 a8
> fb 03 00 <0f> c8 89 85 88 3c 00 00 41 0f b6 45 16 88 85 8c 3c 00 00
> 49 83 bd
> [ 1401.179463] RSP: 0018:ffffa7628dd43808 EFLAGS: 00000282 ORIG_RAX:
> ffffffffffffffd4
> [ 1401.187024] RAX: 0000000000080000 RBX: ffff9020845808c0 RCX:
> 0000000000000000
> [ 1401.194325] RDX: ffffa7628dd43894 RSI: 0000000000000000 RDI:
> ffff901f8a0e0000
> [ 1401.201463] RBP: ffff901f8a0d8000 R08: ffffe1799d283800 R09:
> 0000000000000008
> [ 1401.208582] R10: 0000000000000000 R11: 0000000000000002 R12:
> 0000000000000000
> [ 1401.215702] R13: ffff902084583940 R14: 0000000000000000 R15:
> 0000000000000000
> [ 1401.222834]  ? mlx5e_open_channels+0x5e1/0x1390 [mlx5_core]
> [ 1401.228404]  ? rcu_exp_wait_wake+0x550/0x550
> [ 1401.232674]  ? free_one_page+0x68/0x370
> [ 1401.236519]  mlx5e_open_locked+0x28/0xa0 [mlx5_core]
> [ 1401.241491]  mlx5e_xdp+0x2b2/0x300 [mlx5_core]
> [ 1401.245936]  dev_xdp_install+0x4c/0x70
> [ 1401.249686]  do_setlink+0xcdb/0xd10
> [ 1401.253300]  ? flat_send_IPI_allbutself+0x6c/0xa0
> [ 1401.258003]  ? __update_load_avg_se+0x20c/0x290
> [ 1401.262530]  rtnl_setlink+0x104/0x140
> [ 1401.266189]  rtnetlink_rcv_msg+0x269/0x310
> [ 1401.270283]  ? _cond_resched+0x16/0x40
> [ 1401.274029]  ? __kmalloc_node_track_caller+0x1dd/0x2a0
> [ 1401.279162]  ? rtnl_calcit.isra.32+0x110/0x110
> [ 1401.283601]  netlink_rcv_skb+0xdb/0x110
> [ 1401.287437]  netlink_unicast+0x18b/0x250
> [ 1401.291359]  netlink_sendmsg+0x2c7/0x3b0
> [ 1401.295287]  sock_sendmsg+0x30/0x40
> [ 1401.298776]  __sys_sendto+0xd8/0x150
> [ 1401.302351]  ? __sys_getsockname+0xac/0xc0
> [ 1401.306448]  ? netlink_setsockopt+0x2e/0x2b0
> [ 1401.310718]  ? __sys_setsockopt+0x7c/0xe0
> [ 1401.314867]  __x64_sys_sendto+0x24/0x30
> [ 1401.318709]  do_syscall_64+0x4f/0x100
> [ 1401.322372]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 1401.327420] RIP: 0033:0x7f460f3a83dd
> [ 1401.330997] Code: 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 8b 05 7a
> 13 2c 00 85 c0 75 3e 45 31 c9 45 31 c0 4c 63 d1 48 63 ff b8 2c 00 00
> 00 0f 05 <48> 3d 00 f0 ff ff 77 0b c3 66 2e 0f 1f 84 00 00 00 00 00
> 48 8b 15
> [ 1401.349733] RSP: 002b:00007ffd28d23138 EFLAGS: 00000246 ORIG_RAX:
> 000000000000002c
> [ 1401.357293] RAX: ffffffffffffffda RBX: ffffffffffffff90 RCX:
> 00007f460f3a83dd
> [ 1401.364413] RDX: 000000000000002c RSI: 00007ffd28d23170 RDI:
> 0000000000000003
> [ 1401.371533] RBP: 00007ffd28d231e0 R08: 0000000000000000 R09:
> 0000000000000000
> [ 1401.378767] R10: 0000000000000000 R11: 0000000000000246 R12:
> 0000000000000006
> [ 1401.385895] R13: 00007ffd28d237f0 R14: 00007ffd28d23830 R15:
> 00007ffd28d2388c
> [ 1401.393016] Modules linked in: rpcrdma ib_umad sunrpc ib_ipoib
> rdma_ucm mlx5_ib binfmt_misc ib_uverbs snd_hda_codec_hdmi intel_rapl
> sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mlx5_core
> kvm_intel snd_hda_codec_realtek snd_hda_codec_generic kvm
> snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm e1000e uas
> irqbypass snd_timer crct10dif_pclmul snd mei_me usb_storage
> crc32_pclmul ghash_clmulni_intel wmi_bmof mei lpc_ich soundcore mlxfw
> pata_acpi mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs
> iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 raid10
> raid456 libcrc32c async_raid6_recov async_memcpy async_pq async_xor
> xor async_tx raid6_pq raid1 raid0 multipath linear nouveau video
> i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops drm mxm_wmi
> [ 1401.463638]  aesni_intel aes_x86_64 crypto_simd cryptd glue_helper
> ahci libahci wmi
> [ 1401.471289] CR2: 0000000000003fa8
> [ 1401.474617] ---[ end trace 1a0d8962c7db30ed ]---
> [ 1401.528487] RIP: 0010:mlx5e_xdp_xmit+0x7b/0x2a0 [mlx5_core]
> [ 1401.534058] Code: 8b 0d 29 d9 4f 3f 39 8f 48 39 00 00 b8 fa ff ff
> ff 0f 86 45 01 00 00 48 8b 87 40 39 00 00 48 63 c9 4c 8b 24 c8 b8 9c
> ff ff ff <49> 8b 8c 24 a8 3f 00 00 4d 8d bc 24 c0 3c 00 00 83 e1 01
> 0f 84 19
> [ 1401.552789] RSP: 0018:ffff90209fb43bb0 EFLAGS: 00010202
> [ 1401.558012] RAX: 00000000ffffff9c RBX: 0000000000000000 RCX:
> 0000000000000005
> [ 1401.565132] RDX: ffffc7627fd75190 RSI: 0000000000000010 RDI:
> ffff902084580000
> [ 1401.572252] RBP: ffffc7627fd75190 R08: ffff901f9821c100 R09:
> ffffc7627fd75210
> [ 1401.579371] R10: 00000000000005dc R11: 0000000000000000 R12:
> 0000000000000000
> [ 1401.586493] R13: ffff902081580000 R14: 0000000000000001 R15:
> ffffc7627fd75190
> [ 1401.593726] FS:  00007f460fa96700(0000) GS:ffff90209fb40000(0000)
> knlGS:0000000000000000
> [ 1401.601797] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1401.607533] CR2: 0000000000003fa8 CR3: 000000076c3b6006 CR4:
> 00000000003606e0
> [ 1401.614653] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 1401.621772] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> 0000000000000400
> [ 1401.628895] Kernel panic - not syncing: Fatal exception in
> interrupt
> [ 1401.635280] Kernel Offset: 0x5000000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 1401.694263] Rebooting in 5 seconds..

Thread overview: 8+ messages
2018-10-03  9:30 Kernel oops with mlx5 and dual XDP redirect programs Toke Høiland-Jørgensen
2018-10-03 23:44 ` Saeed Mahameed [this message]
2018-10-04 12:03   ` Toke Høiland-Jørgensen
2018-10-18 21:53   ` Toke Høiland-Jørgensen
2018-10-22 17:57     ` Saeed Mahameed
2018-10-23 10:10       ` Toke Høiland-Jørgensen
2018-10-23 18:01         ` Saeed Mahameed
2018-10-23 20:29           ` Toke Høiland-Jørgensen
