* XDP_TX bug report on mlx4
From: Jesper Dangaard Brouer @ 2016-09-16 19:03 UTC
  To: Brenden Blanco
  Cc: brouer, netdev, Tariq Toukan, Alexei Starovoitov, Tom Herbert,
	Saeed Mahameed, Rana Shahout, Eran Ben Elisha

Hi Brenden,

I've discovered a bug with XDP_TX recycling of pages in the mlx4 driver.

If I increase the number of RX and TX queues/channels via ethtool cmd:
 ethtool -L mlx4p1 rx 10 tx 10

Then, when running the xdp2 program, which does XDP_TX, the kernel
crashes with page errors, because the page refcnt drops to zero or even
goes negative.  I've noticed that pages delivered to mlx4_en_rx_recycle()
can have a page refcnt of zero, which is wrong; they should always have
a refcnt of 1 (for XDP).
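
For reference, xdp2 is one of the kernel's samples/bpf programs; at its
core it returns XDP_TX for every packet, roughly like the minimal sketch
below (the real sample also parses headers, swaps the MAC addresses and
keeps per-protocol counters)::

 #include <linux/bpf.h>

 /* minimal section annotation, as the samples' bpf_helpers.h defines it */
 #define SEC(NAME) __attribute__((section(NAME), used))

 SEC("xdp1")
 int xdp_prog(struct xdp_md *ctx)
 {
     /* bounce every received packet back out the same port */
     return XDP_TX;
 }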

Debugging it further, I find that this can happen when mlx4_en_rx_recycle()
is called from mlx4_en_recycle_tx_desc().  This is the TX cleanup function
associated with the TX rings used only for XDP_TX.  Nothing other than the
XDP_TX action should be able to place packets into these TX rings, which
use mlx4_en_recycle_tx_desc() for cleanup.

Do you have any idea of what could be going wrong in this case?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


* Re: XDP_TX bug report on mlx4
From: Brenden Blanco @ 2016-09-16 19:17 UTC
  To: Jesper Dangaard Brouer
  Cc: netdev, Tariq Toukan, Alexei Starovoitov, Tom Herbert,
	Saeed Mahameed, Rana Shahout, Eran Ben Elisha

On Fri, Sep 16, 2016 at 09:03:40PM +0200, Jesper Dangaard Brouer wrote:
> Hi Brenden,
> 
> I've discovered a bug with XDP_TX recycling of pages in the mlx4 driver.
> 
> If I increase the number of RX and TX queues/channels via ethtool cmd:
>  ethtool -L mlx4p1 rx 10 tx 10
> 
> Then, when running the xdp2 program, which does XDP_TX, the kernel
> crashes with page errors, because the page refcnt drops to zero or even
> goes negative.  I've noticed that pages delivered to mlx4_en_rx_recycle()
> can have a page refcnt of zero, which is wrong; they should always have
> a refcnt of 1 (for XDP).
> 
> Debugging it further, I find that this can happen when mlx4_en_rx_recycle()
> is called from mlx4_en_recycle_tx_desc().  This is the TX cleanup function
> associated with the TX rings used only for XDP_TX.  Nothing other than the
> XDP_TX action should be able to place packets into these TX rings, which
> use mlx4_en_recycle_tx_desc() for cleanup.

Sounds pretty straightforward, let me look into it.
> 
> Do you have any idea of what could be going wrong in this case?
> 
> -- 
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   Author of http://www.iptv-analyzer.org
>   LinkedIn: http://www.linkedin.com/in/brouer
> 
> 


* Re: XDP_TX bug report on mlx4
From: Jesper Dangaard Brouer @ 2016-09-16 19:24 UTC
  To: Brenden Blanco
  Cc: netdev, Tariq Toukan, Tom Herbert, Saeed Mahameed, Rana Shahout,
	Eran Ben Elisha, brouer

On Fri, 16 Sep 2016 12:17:27 -0700
Brenden Blanco <bblanco@plumgrid.com> wrote:

> On Fri, Sep 16, 2016 at 09:03:40PM +0200, Jesper Dangaard Brouer wrote:
> > Hi Brenden,
> > 
> > I've discovered a bug with XDP_TX recycling of pages in the mlx4 driver.
> > 
> > If I increase the number of RX and TX queues/channels via ethtool cmd:
> >  ethtool -L mlx4p1 rx 10 tx 10
> > 
> > Then, when running the xdp2 program, which does XDP_TX, the kernel
> > crashes with page errors, because the page refcnt drops to zero or even
> > goes negative.  I've noticed that pages delivered to mlx4_en_rx_recycle()
> > can have a page refcnt of zero, which is wrong; they should always have
> > a refcnt of 1 (for XDP).
> > 
> > Debugging it further, I find that this can happen when mlx4_en_rx_recycle()
> > is called from mlx4_en_recycle_tx_desc().  This is the TX cleanup function
> > associated with the TX rings used only for XDP_TX.  Nothing other than the
> > XDP_TX action should be able to place packets into these TX rings, which
> > use mlx4_en_recycle_tx_desc() for cleanup.
> 
> Sounds pretty straightforward, let me look into it.

Here is some debug info I instrumented my kernel with, and I've
attached my minicom output with a warning and a panic.
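
The refcnt warning in the attached log comes from a sanity check I added
in the recycle path, along these lines (a paraphrased sketch from memory,
not the exact debug diff)::

 bool mlx4_en_rx_recycle(struct mlx4_en_rx_ring *ring,
                         struct mlx4_en_rx_alloc *frame)
 {
     struct mlx4_en_page_cache *cache = &ring->page_cache;

     /* XDP pages must arrive here holding exactly one reference;
      * a refcnt of zero means someone else already freed the page */
     WARN_ON(page_ref_count(frame->page) == 0);

     if (cache->index >= MLX4_EN_CACHE_SIZE)
         return false;

     cache->buf[cache->index++] = *frame;
     return true;
 }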

Enable some driver debug printks via::
 ethtool -s mlx4p1 msglvl drv on

Debug normal situation::

 $ grep recycle_ring minicom_capturefile.log08
 [  520.746610] mlx4_en: mlx4p1: Set tx_ring[56]->recycle_ring = rx_ring[0]
 [  520.747042] mlx4_en: mlx4p1: Set tx_ring[57]->recycle_ring = rx_ring[1]
 [  520.747470] mlx4_en: mlx4p1: Set tx_ring[58]->recycle_ring = rx_ring[2]
 [  520.747918] mlx4_en: mlx4p1: Set tx_ring[59]->recycle_ring = rx_ring[3]
 [  520.748330] mlx4_en: mlx4p1: Set tx_ring[60]->recycle_ring = rx_ring[4]
 [  520.748749] mlx4_en: mlx4p1: Set tx_ring[61]->recycle_ring = rx_ring[5]
 [  520.749181] mlx4_en: mlx4p1: Set tx_ring[62]->recycle_ring = rx_ring[6]
 [  520.749620] mlx4_en: mlx4p1: Set tx_ring[63]->recycle_ring = rx_ring[7]

Change $ ethtool -L mlx4p1 rx 9 tx 9 ::

 [  911.594692] mlx4_en: mlx4p1: Set tx_ring[56]->recycle_ring = rx_ring[0]
 [  911.608345] mlx4_en: mlx4p1: Set tx_ring[57]->recycle_ring = rx_ring[1]
 [  911.622008] mlx4_en: mlx4p1: Set tx_ring[58]->recycle_ring = rx_ring[2]
 [  911.636364] mlx4_en: mlx4p1: Set tx_ring[59]->recycle_ring = rx_ring[3]
 [  911.650015] mlx4_en: mlx4p1: Set tx_ring[60]->recycle_ring = rx_ring[4]
 [  911.663690] mlx4_en: mlx4p1: Set tx_ring[61]->recycle_ring = rx_ring[5]
 [  911.677356] mlx4_en: mlx4p1: Set tx_ring[62]->recycle_ring = rx_ring[6]
 [  911.690924] mlx4_en: mlx4p1: Set tx_ring[63]->recycle_ring = rx_ring[7]
 [  911.704544] mlx4_en: mlx4p1: Set tx_ring[64]->recycle_ring = rx_ring[8]
 [  911.718171] mlx4_en: mlx4p1: Set tx_ring[65]->recycle_ring = rx_ring[9]
 [  911.731772] mlx4_en: mlx4p1: Set tx_ring[66]->recycle_ring = rx_ring[10]
 [  911.745438] mlx4_en: mlx4p1: Set tx_ring[67]->recycle_ring = rx_ring[11]
 [  911.759063] mlx4_en: mlx4p1: Set tx_ring[68]->recycle_ring = rx_ring[12]
 [  911.772741] mlx4_en: mlx4p1: Set tx_ring[69]->recycle_ring = rx_ring[13]
 [  911.786415] mlx4_en: mlx4p1: Set tx_ring[70]->recycle_ring = rx_ring[14]
 [  911.800070] mlx4_en: mlx4p1: Set tx_ring[71]->recycle_ring = rx_ring[15]

Change $ ethtool -L mlx4p1 rx 10 tx 10::

 netif_set_real_num_tx_queues() setting dev->real_num_tx_queues(now:80) = 64
 mlx4_en: mlx4p1:   frag:0 - size:1522 prefix:0 stride:4096
 mlx4_en_init_recycle_ring() Set tx_ring[64]->recycle_ring = rx_ring[0]
 mlx4_en_init_recycle_ring() Set tx_ring[65]->recycle_ring = rx_ring[1]
 mlx4_en_init_recycle_ring() Set tx_ring[66]->recycle_ring = rx_ring[2]
 mlx4_en_init_recycle_ring() Set tx_ring[67]->recycle_ring = rx_ring[3]
 mlx4_en_init_recycle_ring() Set tx_ring[68]->recycle_ring = rx_ring[4]
 mlx4_en_init_recycle_ring() Set tx_ring[69]->recycle_ring = rx_ring[5]
 mlx4_en_init_recycle_ring() Set tx_ring[70]->recycle_ring = rx_ring[6]
 mlx4_en_init_recycle_ring() Set tx_ring[71]->recycle_ring = rx_ring[7]
 mlx4_en_init_recycle_ring() Set tx_ring[72]->recycle_ring = rx_ring[8]
 mlx4_en_init_recycle_ring() Set tx_ring[73]->recycle_ring = rx_ring[9]
 mlx4_en_init_recycle_ring() Set tx_ring[74]->recycle_ring = rx_ring[10]
 mlx4_en_init_recycle_ring() Set tx_ring[75]->recycle_ring = rx_ring[11]
 mlx4_en_init_recycle_ring() Set tx_ring[76]->recycle_ring = rx_ring[12]
 mlx4_en_init_recycle_ring() Set tx_ring[77]->recycle_ring = rx_ring[13]
 mlx4_en_init_recycle_ring() Set tx_ring[78]->recycle_ring = rx_ring[14]
 mlx4_en_init_recycle_ring() Set tx_ring[79]->recycle_ring = rx_ring[15]
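
For context, the printks above come from the XDP recycle-ring setup.
Paraphrased (not the verbatim driver code), the XDP TX rings sit at the
tail of the tx_ring array, one per RX ring, and each recycles pages back
into the RX ring it serves::

 /* paraphrased sketch; field names as in my instrumented tree */
 int base = priv->tx_ring_num - priv->xdp_ring_num; /* first XDP ring */
 int i;

 for (i = 0; i < priv->rx_ring_num; i++)
     priv->tx_ring[base + i]->recycle_ring = priv->rx_ring[i];

This is why the base index shifts (tx_ring[56], then tx_ring[64]) as the
TX/RX ring counts change.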

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

[-- Attachment #2: minicom_capturefile.log17.txt --]
[-- Type: text/plain, Size: 8293 bytes --]


[   95.777366] systemd[1]: Started Session c1 of user jbrouer.
[   95.783108] systemd[1]: Starting Session c1 of user jbrouer.
[  102.577674] XXX: netif_set_real_num_tx_queues() setting dev->real_num_tx_queues(now:64) = 80
[  102.586160] mlx4_en: mlx4p1: Using 80 TX rings
[  102.590640] mlx4_en: mlx4p1: Using 0 TX rings for XDP
[  102.595702] mlx4_en: mlx4p1: Using 10 RX rings
[  102.600183] mlx4_en: mlx4p1:   frag:0 - size:1522 prefix:0 stride:1536
[  102.612851] mlx4_en: mlx4p1: Setting RSS context tunnel type to RSS on inner headers
[  102.677064] mlx4_en: mlx4p1: Link Down
[  103.830065] mlx4_en: mlx4p1: Link Up
[  171.811607] XXX: netif_set_real_num_tx_queues() calling qdisc_reset_all_tx_gt(64)
[  171.819208] XXX: netif_set_real_num_tx_queues() setting dev->real_num_tx_queues(now:80) = 64
[  171.827796] mlx4_en: mlx4p1:   frag:0 - size:1522 prefix:0 stride:4096
[  171.840921] mlx4_en: mlx4p1: Setting RSS context tunnel type to RSS on inner headers
[  171.877866] XXX: mlx4_en_init_recycle_ring() Set tx_ring[64]->recycle_ring = rx_ring[0]
[  171.886379] XXX: mlx4_en_init_recycle_ring() Set tx_ring[65]->recycle_ring = rx_ring[1]
[  171.894938] XXX: mlx4_en_init_recycle_ring() Set tx_ring[66]->recycle_ring = rx_ring[2]
[  171.903442] XXX: mlx4_en_init_recycle_ring() Set tx_ring[67]->recycle_ring = rx_ring[3]
[  171.911979] XXX: mlx4_en_init_recycle_ring() Set tx_ring[68]->recycle_ring = rx_ring[4]
[  171.920469] XXX: mlx4_en_init_recycle_ring() Set tx_ring[69]->recycle_ring = rx_ring[5]
[  171.929016] XXX: mlx4_en_init_recycle_ring() Set tx_ring[70]->recycle_ring = rx_ring[6]
[  171.937564] XXX: mlx4_en_init_recycle_ring() Set tx_ring[71]->recycle_ring = rx_ring[7]
[  171.946102] XXX: mlx4_en_init_recycle_ring() Set tx_ring[72]->recycle_ring = rx_ring[8]
[  171.954628] XXX: mlx4_en_init_recycle_ring() Set tx_ring[73]->recycle_ring = rx_ring[9]
[  171.963190] XXX: mlx4_en_init_recycle_ring() Set tx_ring[74]->recycle_ring = rx_ring[10]
[  171.971851] XXX: mlx4_en_init_recycle_ring() Set tx_ring[75]->recycle_ring = rx_ring[11]
[  171.980436] XXX: mlx4_en_init_recycle_ring() Set tx_ring[76]->recycle_ring = rx_ring[12]
[  171.989039] XXX: mlx4_en_init_recycle_ring() Set tx_ring[77]->recycle_ring = rx_ring[13]
[  171.997641] XXX: mlx4_en_init_recycle_ring() Set tx_ring[78]->recycle_ring = rx_ring[14]
[  172.006253] XXX: mlx4_en_init_recycle_ring() Set tx_ring[79]->recycle_ring = rx_ring[15]
[  172.025565] mlx4_en: mlx4p1: Link Down
[  173.154950] mlx4_en: mlx4p1: Link Up
[  209.277225] systemd-logind[778]: New session c2 of user jbrouer.
[  209.284629] systemd[1]: Started Session c2 of user jbrouer.
[  209.290519] systemd[1]: Starting Session c2 of user jbrouer.
[  259.151792] XXX: mlx4_en_xmit_frame(cpu:2) tx_drop (tx_ind:66) *doorbell_pending:0
[  259.152002] XXX: mlx4_en_rx_recycle(cpu:6) page refcnt(0) bug cache->index:128/128
[  259.152003] ------------[ cut here ]------------
[  259.152007] WARNING: CPU: 6 PID: 0 at drivers/net/ethernet/mellanox/mlx4/en_rx.c:532 mlx4_en_rx_recycle+0x9c/0xb0 [mlx4_en]
[  259.152018] Modules linked in: coretemp kvm_intel kvm mxm_wmi irqbypass i2c_i801 intel_cstate i2c_smbus intel_rapl_perf sg i2c_core pcspkr shpchp video wmi acpi_pad nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip_tables x_tables mlx4_en mlx5_core e1000e ptp sd_mod serio_raw mlx4_core pps_core devlink hid_generic
[  259.152020] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 4.8.0-rc4-xdp02_seperate_xdp_struct+ #89
[  259.152021] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 Extreme4, BIOS P2.10 05/12/2015
[  259.152022]  0000000000000000 ffff88041fb83d58 ffffffff813a8ccb 0000000000000000
[  259.152023]  0000000000000000 ffff88041fb83d98 ffffffff81060f3b 000002141fb83d70
[  259.152024]  ffff88041fb83dd8 ffff8804081e8000 0000000000000151 0000000000000009
[  259.152024] Call Trace:
[  259.152029]  <IRQ>  [<ffffffff813a8ccb>] dump_stack+0x4d/0x72
[  259.152032]  [<ffffffff81060f3b>] __warn+0xcb/0xf0
[  259.152034]  [<ffffffff8106102d>] warn_slowpath_null+0x1d/0x20
[  259.152035]  [<ffffffffa016a30c>] mlx4_en_rx_recycle+0x9c/0xb0 [mlx4_en]
[  259.152037]  [<ffffffffa0166a2e>] mlx4_en_recycle_tx_desc+0x4e/0xf0 [mlx4_en]
[  259.152038]  [<ffffffffa01678a7>] mlx4_en_poll_tx_cq+0x1e7/0x480 [mlx4_en]
[  259.152040]  [<ffffffff815f0f6c>] net_rx_action+0x1fc/0x350
[  259.152042]  [<ffffffff8170b83e>] __do_softirq+0xce/0x2cf
[  259.152043]  [<ffffffff8106635b>] irq_exit+0xab/0xb0
[  259.152044]  [<ffffffff8170b584>] do_IRQ+0x54/0xd0
[  259.152046]  [<ffffffff81709bbf>] common_interrupt+0x7f/0x7f
[  259.152048]  <EOI>  [<ffffffff815b213f>] ? cpuidle_enter_state+0x12f/0x300
[  259.152049]  [<ffffffff815b2347>] cpuidle_enter+0x17/0x20
[  259.152051]  [<ffffffff810a3c94>] cpu_startup_entry+0x2b4/0x300
[  259.152053]  [<ffffffff8103c611>] start_secondary+0x101/0x120
[  259.152054] ---[ end trace 2b35fea8b90ba9b5 ]---
[  259.152055] page:ffffea000a8e8300 count:0 mapcount:0 mapping:          (null) index:0x0
[  259.152056] flags: 0x2fffff80000000()
[  259.152056] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
[  259.152067] ------------[ cut here ]------------
[  259.152068] kernel BUG at ./include/linux/mm.h:445!
[  259.152068] invalid opcode: 0000 [#1] PREEMPT SMP
[  259.152074] Modules linked in: coretemp kvm_intel kvm mxm_wmi irqbypass i2c_i801 intel_cstate i2c_smbus intel_rapl_perf sg i2c_core pcspkr shpchp video wmi acpi_pad nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip_tables x_tables mlx4_en mlx5_core e1000e ptp sd_mod serio_raw mlx4_core pps_core devlink hid_generic
[  259.152075] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G        W       4.8.0-rc4-xdp02_seperate_xdp_struct+ #89
[  259.152076] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z97 Extreme4, BIOS P2.10 05/12/2015
[  259.152076] task: ffff88040d5bab80 task.stack: ffff88040d5d0000
[  259.152078] RIP: 0010:[<ffffffffa0166acd>]  [<ffffffffa0166acd>] mlx4_en_recycle_tx_desc+0xed/0xf0 [mlx4_en]
[  259.152079] RSP: 0018:ffff88041fb83dd8  EFLAGS: 00010292
[  259.152079] RAX: 000000000000003e RBX: ffff88040b76d440 RCX: 0000000000000001
[  259.152079] RDX: 0000000000000001 RSI: 0000000000000286 RDI: ffffffff81c40b90
[  259.152080] RBP: ffff88041fb83e00 R08: 0000000000000000 R09: 000000000000003e
[  259.152080] R10: 000000000000000a R11: 0000000000000000 R12: ffff8803f1b40840
[  259.152081] R13: 0000000000000151 R14: 0000000000000009 R15: 0000000000005440
[  259.152081] FS:  0000000000000000(0000) GS:ffff88041fb80000(0000) knlGS:0000000000000000
[  259.152082] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  259.152082] CR2: 00007fd47c1c4520 CR3: 00000003e970a000 CR4: 00000000001406e0
[  259.152082] Stack:
[  259.152084]  ffffea000a8e8300 00000002a3a0c000 0000100000000000 0000000000000150
[  259.152085]  ffff88040b748800 ffff88041fb83e90 ffffffffa01678a7 ffff88040b3ed500
[  259.152086]  0000000000000020 ffff88040b760000 ffff8803f1b40000 0367cd5100000200
[  259.152086] Call Trace:
[  259.152088]  <IRQ> 
[  259.152088]  [<ffffffffa01678a7>] mlx4_en_poll_tx_cq+0x1e7/0x480 [mlx4_en]
[  259.152089]  [<ffffffff815f0f6c>] net_rx_action+0x1fc/0x350
[  259.152090]  [<ffffffff8170b83e>] __do_softirq+0xce/0x2cf
[  259.152091]  [<ffffffff8106635b>] irq_exit+0xab/0xb0
[  259.152092]  [<ffffffff8170b584>] do_IRQ+0x54/0xd0
[  259.152093]  [<ffffffff81709bbf>] common_interrupt+0x7f/0x7f
[  259.152095]  <EOI> 
[  259.152095]  [<ffffffff815b213f>] ? cpuidle_enter_state+0x12f/0x300
[  259.152096]  [<ffffffff815b2347>] cpuidle_enter+0x17/0x20
[  259.152097]  [<ffffffff810a3c94>] cpu_startup_entry+0x2b4/0x300
[  259.152098]  [<ffffffff8103c611>] start_secondary+0x101/0x120
[  259.152108] Code: 5c 5d c3 e8 b6 a4 ff e0 8b 43 14 48 83 c4 18 5b 41 5c 5d c3 48 8b 05 13 80 ab e1 eb a0 0f 0b 48 c7 c6 60 9e 17 a0 e8 a3 8b 01 e1 <0f> 0b 90 0f 1f 44 00 00 55 48 63 c2 89 d1 49 89 f2 48 c1 e0 06 
[  259.152110] RIP  [<ffffffffa0166acd>] mlx4_en_recycle_tx_desc+0xed/0xf0 [mlx4_en]
[  259.152110]  RSP <ffff88041fb83dd8>
[  259.152117] ---[ end trace 2b35fea8b90ba9b6 ]---
[  259.152118] Kernel panic - not syncing: Fatal exception in interrupt
[  259.159443] Kernel Offset: disabled
[  259.640689] ---[ end Kernel panic - not syncing: Fatal exception in interrupt


* Re: XDP_TX bug report on mlx4
From: Brenden Blanco @ 2016-09-18 23:59 UTC
  To: Jesper Dangaard Brouer
  Cc: netdev, Tariq Toukan, Tom Herbert, Saeed Mahameed, Rana Shahout,
	Eran Ben Elisha

On Fri, Sep 16, 2016 at 09:24:43PM +0200, Jesper Dangaard Brouer wrote:
> On Fri, 16 Sep 2016 12:17:27 -0700
> Brenden Blanco <bblanco@plumgrid.com> wrote:
> 
> > On Fri, Sep 16, 2016 at 09:03:40PM +0200, Jesper Dangaard Brouer wrote:
> > > Hi Brenden,
> > > 
> > > I've discovered a bug with XDP_TX recycling of pages in the mlx4 driver.
> > > 
> > > If I increase the number of RX and TX queues/channels via ethtool cmd:
> > >  ethtool -L mlx4p1 rx 10 tx 10
> > > 
> > > Then, when running the xdp2 program, which does XDP_TX, the kernel
> > > crashes with page errors, because the page refcnt drops to zero or even
> > > goes negative.  I've noticed that pages delivered to mlx4_en_rx_recycle()
> > > can have a page refcnt of zero, which is wrong; they should always have
> > > a refcnt of 1 (for XDP).
> > > 
> > > Debugging it further, I find that this can happen when mlx4_en_rx_recycle()
> > > is called from mlx4_en_recycle_tx_desc().  This is the TX cleanup function
> > > associated with the TX rings used only for XDP_TX.  Nothing other than the
> > > XDP_TX action should be able to place packets into these TX rings, which
> > > use mlx4_en_recycle_tx_desc() for cleanup.
> > 
> > Sounds pretty straightforward, let me look into it.
> 
> Here is some debug info I instrumented my kernel with, and I've
> attached my minicom output with a warning and a panic.

Thanks for the info.

I've also spent this weekend trying to debug this (it's pretty easy to
reproduce), but with no conclusive answer. I was investigating the
sequence in mlx4_en_stop_port to see whether RX might still be running
through the function, on the theory that the XDP TX path might race with
mlx4_en_free_tx_buf. For instance, I tried moving the napi_synchronize
loop to just below the msleep(10). No improvement.
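
Roughly, the experiment looked like this (paraphrased from memory, not
the exact diff; function and field names as I recall them)::

 /* in mlx4_en_stop_port(), reordered: */
 msleep(10);

 /* moved up: ensure no NAPI poll, and thus no XDP_TX xmit or TX
  * completion, is still in flight before the TX buffers get freed */
 for (i = 0; i < priv->rx_ring_num; i++)
     napi_synchronize(&priv->rx_cq[i]->napi);

 /* ... rest of the stop sequence, then free the TX buffers ... */
 for (i = 0; i < priv->tx_ring_num; i++)
     mlx4_en_free_tx_buf(dev, priv->tx_ring[i]);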

Unfortunately, I'm out of options, since my one test box has decided not
to reboot itself automatically, and I'll be out of email communication
(for 3 weeks) before anybody can physically resuscitate it (tomorrow).
> 
> Enable some driver debug printks via::
>  ethtool -s mlx4p1 msglvl drv on
> 
> Debug normal situation::
> 
[...]


* Re: XDP_TX bug report on mlx4
From: Brenden Blanco @ 2016-10-13 19:46 UTC
  To: Jesper Dangaard Brouer
  Cc: netdev, Tariq Toukan, Tom Herbert, Saeed Mahameed, Rana Shahout,
	Eran Ben Elisha

On Sun, Sep 18, 2016 at 07:59:43PM -0400, Brenden Blanco wrote:
> On Fri, Sep 16, 2016 at 09:24:43PM +0200, Jesper Dangaard Brouer wrote:
> > On Fri, 16 Sep 2016 12:17:27 -0700
> > Brenden Blanco <bblanco@plumgrid.com> wrote:
> > 
> > > On Fri, Sep 16, 2016 at 09:03:40PM +0200, Jesper Dangaard Brouer wrote:
> > > > Hi Brenden,
> > > > 
> > > > I've discovered a bug with XDP_TX recycling of pages in the mlx4 driver.
> > > > 
> > > > If I increase the number of RX and TX queues/channels via ethtool cmd:
> > > >  ethtool -L mlx4p1 rx 10 tx 10
> > > > 
> > > > Then, when running the xdp2 program, which does XDP_TX, the kernel
> > > > crashes with page errors, because the page refcnt drops to zero or even
> > > > goes negative.  I've noticed that pages delivered to mlx4_en_rx_recycle()
> > > > can have a page refcnt of zero, which is wrong; they should always have
> > > > a refcnt of 1 (for XDP).
> > > > 
> > > > Debugging it further, I find that this can happen when mlx4_en_rx_recycle()
> > > > is called from mlx4_en_recycle_tx_desc().  This is the TX cleanup function
> > > > associated with the TX rings used only for XDP_TX.  Nothing other than the
> > > > XDP_TX action should be able to place packets into these TX rings, which
> > > > use mlx4_en_recycle_tx_desc() for cleanup.
> > > 
> > > Sounds pretty straightforward, let me look into it.
> > 
> > Here is some debug info I instrumented my kernel with, and I've
> > attached my minicom output with a warning and a panic.
> 
> Thanks for the info.
> 
> I've also spent this weekend trying to debug this (it's pretty easy to
> reproduce), but with no conclusive answer. I was investigating the
> sequence in mlx4_en_stop_port to see whether RX might still be running
> through the function, on the theory that the XDP TX path might race with
> mlx4_en_free_tx_buf. For instance, I tried moving the napi_synchronize
> loop to just below the msleep(10). No improvement.
> 
> Unfortunately, I'm out of options, since my one test box has decided not
> to reboot itself automatically, and I'll be out of email communication
> (for 3 weeks) before anybody can physically resuscitate it (tomorrow).

Sorry to leave this hanging for so long. I've just recently come back
from vacation and I think I now have a fix for this. I'll send a patch
to the list shortly.

> > 
> > Enable some driver debug printks via::
> >  ethtool -s mlx4p1 msglvl drv on
> > 
> > Debug normal situation::
> > 
> [...]

