linux-kernel.vger.kernel.org archive mirror
* Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
@ 2019-02-08 18:29 Sander Eikelenboom
  2019-02-08 18:52 ` Heiner Kallweit
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2019-02-08 18:29 UTC (permalink / raw)
  To: Realtek linux nic maintainers, Heiner Kallweit
  Cc: Linus Torvalds, linux-kernel, netdev

L.S.,

While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
which I haven't encountered with Linux 4.20.x.

Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.

If you need more info, want me to run a debug patch etc., please feel free to ask.

--
Sander


[ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
[ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
[ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
[ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 6466.611579] RIP: e030:dql_completed+0x126/0x140
[ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
[ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
[ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
[ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
[ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
[ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
[ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
[ 6466.722165] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
[ 6466.733228] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
[ 6466.758366] Call Trace:
[ 6466.768118]  <IRQ>
[ 6466.778214]  rtl8169_poll+0x4f4/0x640
[ 6466.789198]  net_rx_action+0x23d/0x370
[ 6466.798467]  __do_softirq+0xed/0x229
[ 6466.807039]  irq_exit+0xb7/0xc0
[ 6466.815471]  xen_evtchn_do_upcall+0x27/0x40
[ 6466.826647]  xen_do_hypervisor_callback+0x29/0x40
[ 6466.835902]  </IRQ>
[ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
[ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
[ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
[ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
[ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
[ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
[ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
[ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
[ 6466.926647]  ? xen_hypercall_mmu_update+0xa/0x20
[ 6466.938195]  ? xen_set_pte_at+0x78/0xe0
[ 6466.947046]  ? __handle_mm_fault+0xc43/0x1060
[ 6466.955772]  ? do_mmap+0x44b/0x5b0
[ 6466.964410]  ? handle_mm_fault+0xf8/0x200
[ 6466.973290]  ? __do_page_fault+0x231/0x4a0
[ 6466.981973]  ? page_fault+0x8/0x30
[ 6466.990904]  ? page_fault+0x1e/0x30
[ 6466.999585] Modules linked in:
[ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
[ 6467.016751] RIP: e030:dql_completed+0x126/0x140
[ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
[ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
[ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
[ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
[ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
[ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
[ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
[ 6467.084916] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
[ 6467.093352] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
[ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
[ 6467.118166] Kernel Offset: disabled
(XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
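
For readers unfamiliar with BQL: the BUG at lib/dynamic_queue_limits.c:27 sits in dql_completed(), and in the 5.0 sources it appears to be the check that a driver never reports more bytes completed than it has reported queued. The sketch below is a userspace model of that invariant only, under that assumption. struct dql_model and the model_* helpers are hypothetical names, not kernel API, and the real function goes on to recalculate the queue limit, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the two BQL byte counters (not struct dql). */
struct dql_model {
	unsigned int num_queued;    /* total bytes the driver reported as queued */
	unsigned int num_completed; /* total bytes the driver reported as completed */
};

/* Models the accounting side of dql_queued(): bytes handed to the NIC. */
static void model_queued(struct dql_model *dql, unsigned int count)
{
	dql->num_queued += count;
}

/*
 * Models the sanity check at the top of dql_completed().
 * Returns false where the kernel would hit BUG_ON().
 */
static bool model_completed(struct dql_model *dql, unsigned int count)
{
	/* Can't complete more than what is still outstanding. */
	if (count > dql->num_queued - dql->num_completed)
		return false;
	dql->num_completed += count;
	return true;
}

/* Walks through an in-order completion and a completion with nothing queued. */
static void model_demo(void)
{
	struct dql_model dql = { 0, 0 };

	model_queued(&dql, 66);
	assert(model_completed(&dql, 66));  /* queued first: accepted */
	assert(!model_completed(&dql, 66)); /* nothing outstanding: would BUG */
}
```

The value 66 is chosen to match the 0x42 seen in RAX and RSI in the register dump, on the unverified assumption that it is the completed byte count. In this model, a completion that is accounted before the matching model_queued() call is rejected; whether that ordering is what happened here is not established by the trace alone.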


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-08 18:29 Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27! Sander Eikelenboom
@ 2019-02-08 18:52 ` Heiner Kallweit
  2019-02-08 20:55   ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-08 18:52 UTC (permalink / raw)
  To: Sander Eikelenboom, Realtek linux nic maintainers
  Cc: Linus Torvalds, linux-kernel, netdev

On 08.02.2019 19:29, Sander Eikelenboom wrote:
> L.S.,
> 
> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
> which I haven't encountered with Linux 4.20.x.
> 
> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
> 
> If you need more info, want me to run a debug patch etc., please feel free to ask.
> 
Thanks for the report. However, I see no change in the r8169 driver between
4.20 and 5.0 with regard to the BQL code, so the root cause could be
somewhere else. Therefore I'm afraid a bisect will be needed.

> --
> Sander
> 
Heiner


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-08 18:52 ` Heiner Kallweit
@ 2019-02-08 20:55   ` Sander Eikelenboom
  2019-02-08 21:22     ` Heiner Kallweit
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2019-02-08 20:55 UTC (permalink / raw)
  To: Heiner Kallweit, Realtek linux nic maintainers
  Cc: Linus Torvalds, linux-kernel, netdev

On 08/02/2019 19:52, Heiner Kallweit wrote:
> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>> L.S.,
>>
>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>> which I haven't encountered with Linux 4.20.x.
>>
>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>
>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>
> Thanks for the report. However, I see no change in the r8169 driver between
> 4.20 and 5.0 with regard to the BQL code, so the root cause could be
> somewhere else. Therefore I'm afraid a bisect will be needed.

Hmm, I did some digging and I think:
bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue

would be candidates, which were merged in 5.0.

I have reverted the first two; let's see how that works out.

--
Sander

 

* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-08 20:55   ` Sander Eikelenboom
@ 2019-02-08 21:22     ` Heiner Kallweit
  2019-02-08 21:45       ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-08 21:22 UTC (permalink / raw)
  To: Sander Eikelenboom, Realtek linux nic maintainers
  Cc: Linus Torvalds, linux-kernel, netdev

On 08.02.2019 21:55, Sander Eikelenboom wrote:
> On 08/02/2019 19:52, Heiner Kallweit wrote:
>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>> L.S.,
>>>
>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>> which I haven't encountered with Linux 4.20.x.
>>>
>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>
>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>
>> Thanks for the report. However, I see no change in the r8169 driver between
>> 4.20 and 5.0 with regard to the BQL code, so the root cause could be
>> somewhere else. Therefore I'm afraid a bisect will be needed.
> 
> Hmm, I did some digging and I think:
> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
> 
You're right. I thought this had already been added in 4.20.
I copied the BQL code pattern from the mlx4 driver, and so far I haven't heard about
this issue from any user of physical hardware. Since a lot of mainboards have onboard
Realtek network chips, I have quite a few testers out there.
Does the issue occur under specific circumstances, like very high load?

If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
as author of the underlying changes.

> would be candidates, which were merged in 5.0.
> 
> I have reverted the first two; let's see how that works out.
> 
> --
> Sander
> 
Heiner
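
For context on the xmit_more change discussed above: as far as I recall, __netdev_sent_queue() wraps __netdev_tx_sent_queue(), which accounts the transmitted bytes to BQL and returns whether the driver should notify the hardware (ring the doorbell) now or may defer that until the last packet of a batch. The following is a simplified userspace model of that decision, not the kernel code. struct txq_model, its fields, and both functions are hypothetical, and the simple byte limit is a stand-in for the real BQL logic.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a TX queue; field names are illustrative only. */
struct txq_model {
	unsigned int bytes_queued; /* bytes accounted so far */
	unsigned int limit;        /* stand-in for the BQL byte limit */
	bool stopped;              /* queue stopped because the limit was reached */
	unsigned int doorbells;    /* how often the hardware was notified */
};

/*
 * Accounts bytes and returns true if the caller should notify the
 * hardware now. With xmit_more set, only the byte count is updated and
 * the doorbell is deferred unless the queue is already stopped.
 */
static bool model_sent_queue(struct txq_model *q, unsigned int bytes,
			     bool xmit_more)
{
	q->bytes_queued += bytes;
	if (xmit_more)
		return q->stopped;

	/* Last packet of a batch: evaluate the limit and always notify. */
	if (q->bytes_queued >= q->limit)
		q->stopped = true;
	return true;
}

/* Models the tail of a driver's start_xmit path. */
static void model_start_xmit(struct txq_model *q, unsigned int len,
			     bool xmit_more)
{
	if (model_sent_queue(q, len, xmit_more))
		q->doorbells++;
}
```

In this model, a packet sent with xmit_more set on a running queue is accounted but does not trigger a doorbell; the next packet without xmit_more does. The model says nothing about when completions may run relative to the accounting, which is the part the BUG report puts in question.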


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-08 21:22     ` Heiner Kallweit
@ 2019-02-08 21:45       ` Sander Eikelenboom
  2019-02-08 21:50         ` Heiner Kallweit
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2019-02-08 21:45 UTC (permalink / raw)
  To: Heiner Kallweit, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 08/02/2019 22:22, Heiner Kallweit wrote:
> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>> L.S.,
>>>>
>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>>> which I haven't encountered with Linux 4.20.x.
>>>>
>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>
>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>
>>> Thanks for the report. However, I see no change in the r8169 driver between
>>> 4.20 and 5.0 with regard to the BQL code, so the root cause could be
>>> somewhere else. Therefore I'm afraid a bisect will be needed.
>>
>> Hmm, I did some digging and I think:
>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>
> You're right. I thought this had already been added in 4.20.
> I copied the BQL code pattern from the mlx4 driver, and so far I haven't heard about
> this issue from any user of physical hardware. Since a lot of mainboards have onboard
> Realtek network chips, I have quite a few testers out there.
> Does the issue occur under specific circumstances, like very high load?

Yep, the box is already quite contended by the Xen VMs and, if I remember correctly, it occurred while compiling a kernel
on the host.

> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
> as author of the underlying changes.

It could also be that the barriers weren't as unneeded as assumed.
Since we are almost at rc6, I took the liberty of CCing Eric now.

BTW, am I correct that these patches are merely optimizations?
If so, and provided they revert cleanly, perhaps we should consider, at this point in the RC cycle,
reverting them for 5.0 and trying again for 5.1?

--
Sander



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-08 21:45       ` Sander Eikelenboom
@ 2019-02-08 21:50         ` Heiner Kallweit
  2019-02-08 23:09           ` Eric Dumazet
  2019-02-08 23:34           ` Sander Eikelenboom
  0 siblings, 2 replies; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-08 21:50 UTC (permalink / raw)
  To: Sander Eikelenboom, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 08.02.2019 22:45, Sander Eikelenboom wrote:
> On 08/02/2019 22:22, Heiner Kallweit wrote:
>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>> L.S.,
>>>>>
>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>
>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>
>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>
>>>> Thanks for the report. However, I see no change in the r8169 driver between
>>>> 4.20 and 5.0 with regard to the BQL code, so the root cause could be
>>>> somewhere else. Therefore I'm afraid a bisect will be needed.
>>>
>>> Hmm, I did some digging and I think:
>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>
>> You're right. I thought this had already been added in 4.20.
>> I copied the BQL code pattern from the mlx4 driver, and so far I haven't heard about
>> this issue from any user of physical hardware. Since a lot of mainboards have onboard
>> Realtek network chips, I have quite a few testers out there.
>> Does the issue occur under specific circumstances, like very high load?
> 
> Yep, the box is already quite contended by the Xen VMs and, if I remember correctly, it occurred while compiling a kernel
> on the host.
> 
>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>> as author of the underlying changes.
> 
> It could also be that the barriers weren't as unneeded as assumed.

The barriers were removed after adding xmit_more handling. Therefore it would be good to
test also with only 
bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
removed.

> Since we are almost at RC6 i took the liberty to CC Eric now.
> 
Sure, thanks.

> BTW am i correct these patches are merely optimizations ?

Yes

> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
> to revert them for 5.0 and try again for 5.1 ?
> 
Before removing both it would be good to test with only the barrier-removal removed.

> --
> Sander
> 
Heiner

> 
>>
>>> would be candidates, which were merged in 5.0.
>>>
>>> I have reverted the first two, see how that works out.
>>>
>>> --
>>> Sander
>>>
>> Heiner
>>
>>>  
>>>>> --
>>>>> Sander
>>>>>
>>>> Heiner
>>>>
>>>>>
>>>>> [ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
>>>>> [ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
>>>>> [ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
>>>>> [ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>> [ 6466.611579] RIP: e030:dql_completed+0x126/0x140
>>>>> [ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>> [ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>> [ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>> [ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>> [ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>> [ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>> [ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>> [ 6466.722165] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>> [ 6466.733228] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>> [ 6466.758366] Call Trace:
>>>>> [ 6466.768118]  <IRQ>
>>>>> [ 6466.778214]  rtl8169_poll+0x4f4/0x640
>>>>> [ 6466.789198]  net_rx_action+0x23d/0x370
>>>>> [ 6466.798467]  __do_softirq+0xed/0x229
>>>>> [ 6466.807039]  irq_exit+0xb7/0xc0
>>>>> [ 6466.815471]  xen_evtchn_do_upcall+0x27/0x40
>>>>> [ 6466.826647]  xen_do_hypervisor_callback+0x29/0x40
>>>>> [ 6466.835902]  </IRQ>
>>>>> [ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
>>>>> [ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>>>> [ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
>>>>> [ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
>>>>> [ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
>>>>> [ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
>>>>> [ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
>>>>> [ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
>>>>> [ 6466.926647]  ? xen_hypercall_mmu_update+0xa/0x20
>>>>> [ 6466.938195]  ? xen_set_pte_at+0x78/0xe0
>>>>> [ 6466.947046]  ? __handle_mm_fault+0xc43/0x1060
>>>>> [ 6466.955772]  ? do_mmap+0x44b/0x5b0
>>>>> [ 6466.964410]  ? handle_mm_fault+0xf8/0x200
>>>>> [ 6466.973290]  ? __do_page_fault+0x231/0x4a0
>>>>> [ 6466.981973]  ? page_fault+0x8/0x30
>>>>> [ 6466.990904]  ? page_fault+0x1e/0x30
>>>>> [ 6466.999585] Modules linked in:
>>>>> [ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
>>>>> [ 6467.016751] RIP: e030:dql_completed+0x126/0x140
>>>>> [ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>> [ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>> [ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>> [ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>> [ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>> [ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>> [ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>> [ 6467.084916] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>> [ 6467.093352] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>> [ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
>>>>> [ 6467.118166] Kernel Offset: disabled
>>>>> (XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>>>>
>>>>
>>>
>>>
>>
> 
> 



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-08 21:50         ` Heiner Kallweit
@ 2019-02-08 23:09           ` Eric Dumazet
  2019-02-09  9:02             ` Heiner Kallweit
  2019-02-08 23:34           ` Sander Eikelenboom
  1 sibling, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2019-02-08 23:09 UTC (permalink / raw)
  To: Heiner Kallweit, Sander Eikelenboom,
	Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev



On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>> L.S.,
>>>>>>
>>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>
>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>
>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>
>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>
>>>> Hmm, I did some digging and I think:
>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>
>>> You're right. Thought this was added in 4.20 already.
>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>> have onboard Realtek network I have quite a few testers out there.
>>> Does the issue occur under specific circumstances like very high load?
>>
>> Yep, the box is already quite contended with the Xen VMs, and if I remember correctly it occurred while compiling a kernel
>> on the host.
>>
>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>> as author of the underlying changes.
>>
>> It could also be the barriers weren't that unneeded as assumed.
> 
> The barriers were removed after adding xmit_more handling. Therefore it would be good to
> test also with only 
> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
> removed.
> 
>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>
> Sure, thanks.
> 
>> BTW am i correct these patches are merely optimizations ?
> 
> Yes
> 
>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>> to revert them for 5.0 and try again for 5.1 ?
>>
> Before removing both it would be good to test with only the barrier-removal removed.
> 

Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
looks buggy to me, since the skb might have been freed already on another cpu when you call

You could try:

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
        dma_addr_t mapping;
        u32 opts[2], len;
        bool stop_queue;
+       bool door_bell;
        int frags;
 
        if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
@@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
        /* Force memory writes to complete before releasing descriptor */
        dma_wmb();
 
+       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
+
        txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
 
        /* Force all memory writes to complete before notifying device */
@@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
        if (unlikely(stop_queue))
                netif_stop_queue(dev);
 
-       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
+       if (door_bell) {
                RTL_W8(tp, TxPoll, NPQ);
                mmiowb();
        }
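
To illustrate why the ordering in the diff above might matter, here is a
minimal userspace sketch in C. It is not kernel code: toy_dql, toy_sent(),
toy_completed() and the two xmit_* helpers are hypothetical stand-ins, and
the model simply assumes that the TX completion (which in the driver would
run from rtl8169_poll, possibly on another CPU) fires right after the
descriptor is released to the device. Under that assumption, accounting the
bytes only after the release lets the completion side see fewer queued bytes
than it tries to complete, which is the kind of condition the check in
dql_completed() rejects.

```c
#include <stdbool.h>

/* Hypothetical, simplified byte counters; not the kernel's struct dql. */
struct toy_dql {
	unsigned int num_queued;    /* bytes accounted as sent */
	unsigned int num_completed; /* bytes accounted as completed */
};

/* Account bytes handed to the device (stand-in for the "sent" side). */
static void toy_sent(struct toy_dql *q, unsigned int bytes)
{
	q->num_queued += bytes;
}

/*
 * Account completed bytes. Returns false where a real implementation
 * would refuse to complete more than is currently in the queue.
 */
static bool toy_completed(struct toy_dql *q, unsigned int bytes)
{
	if (bytes > q->num_queued - q->num_completed)
		return false;
	q->num_completed += bytes;
	return true;
}

/*
 * Ordering assumed to be problematic: the descriptor is released first,
 * the completion runs immediately, and only then are the bytes accounted.
 */
static bool xmit_publish_first(struct toy_dql *q, unsigned int len)
{
	bool ok = toy_completed(q, len); /* completion races ahead */

	toy_sent(q, len);                /* accounting arrives too late */
	return ok;
}

/*
 * Ordering suggested by the diff: account the bytes before the device
 * (and therefore the completion path) can see the descriptor.
 */
static bool xmit_account_first(struct toy_dql *q, unsigned int len)
{
	toy_sent(q, len);
	return toy_completed(q, len);
}
```

Starting from a zeroed struct toy_dql, xmit_publish_first() reports a
rejected completion while xmit_account_first() succeeds. The model only
covers the accounting order; it does not capture the skb itself being freed
before its fields are read.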




* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-08 21:50         ` Heiner Kallweit
  2019-02-08 23:09           ` Eric Dumazet
@ 2019-02-08 23:34           ` Sander Eikelenboom
  2019-02-09  9:10             ` Heiner Kallweit
  1 sibling, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2019-02-08 23:34 UTC (permalink / raw)
  To: Heiner Kallweit, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 08/02/2019 22:50, Heiner Kallweit wrote:
> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>> L.S.,
>>>>>>
>>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>
>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>
>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>
>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>
>>>> Hmm, I did some digging and I think:
>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>
>>> You're right. Thought this was added in 4.20 already.
>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>> have onboard Realtek network I have quite a few testers out there.
>>> Does the issue occur under specific circumstances like very high load?
>>
>> Yep, the box is already quite contended with the Xen VMs, and if I remember correctly it occurred while compiling a kernel
>> on the host.
>>
>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>> as author of the underlying changes.
>>
>> It could also be the barriers weren't that unneeded as assumed.
> 
> The barriers were removed after adding xmit_more handling. Therefore it would be good to
> test also with only 
> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
> removed.

*arghh* *grmbl*

With both
    bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3
    and
    2e6eedb4813e34d8d84ac0eb3afb668966f3f356
reverted, I get yet another splat:

[ 3769.246083] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 3769.246095] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
[ 3769.246096] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 3769.246098] Call Trace:
[ 3769.246104]  <IRQ>
[ 3769.246114]  dump_stack+0x5c/0x7b
[ 3769.246120]  warn_alloc+0x103/0x190
[ 3769.246122]  __alloc_pages_nodemask+0xe3d/0xe80
[ 3769.246128]  ? inet_gro_receive+0x232/0x2c0
[ 3769.246130]  page_frag_alloc+0x117/0x150
[ 3769.246132]  __napi_alloc_skb+0x83/0xd0
[ 3769.246137]  rtl8169_poll+0x210/0x640
[ 3769.246140]  net_rx_action+0x23d/0x370
[ 3769.246145]  __do_softirq+0xed/0x229
[ 3769.246149]  irq_exit+0xb7/0xc0
[ 3769.246152]  xen_evtchn_do_upcall+0x27/0x40
[ 3769.246154]  xen_do_hypervisor_callback+0x29/0x40
[ 3769.246155]  </IRQ>
[ 3769.246161] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
[ 3769.246163] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
[ 3769.246164] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
[ 3769.246166] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
[ 3769.246167] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
[ 3769.246167] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
[ 3769.246168] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
[ 3769.246169] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
[ 3769.246173]  _raw_spin_lock+0x16/0x20
[ 3769.246176]  list_lru_add+0x59/0x170
[ 3769.246179]  inode_lru_list_add+0x1b/0x40
[ 3769.246182]  iput+0x18b/0x1a0
[ 3769.246184]  __dentry_kill+0xc5/0x170
[ 3769.246186]  shrink_dentry_list+0x93/0x1c0
[ 3769.246187]  prune_dcache_sb+0x4d/0x70
[ 3769.246191]  super_cache_scan+0x104/0x190
[ 3769.246194]  do_shrink_slab+0x12c/0x1e0
[ 3769.246196]  shrink_slab+0xdf/0x2b0
[ 3769.246198]  shrink_node+0x158/0x470
[ 3769.246200]  do_try_to_free_pages+0xd1/0x380
[ 3769.246202]  try_to_free_pages+0xb2/0xe0
[ 3769.246204]  __alloc_pages_nodemask+0x603/0xe80
[ 3769.246207]  ? xas_load+0x9/0x80
[ 3769.246209]  ? find_get_entry+0x58/0x120
[ 3769.246210]  pagecache_get_page+0xde/0x210
[ 3769.246213]  grab_cache_page_write_begin+0x17/0x30
[ 3769.246215]  ext4_da_write_begin+0xc4/0x340
[ 3769.246217]  generic_perform_write+0xb8/0x1b0
[ 3769.246219]  __generic_file_write_iter+0x13c/0x1b0
[ 3769.246223]  ext4_file_write_iter+0x121/0x3c0
[ 3769.246225]  __vfs_write+0x123/0x1a0
[ 3769.246226]  vfs_write+0xab/0x1a0
[ 3769.246229]  ksys_write+0x4d/0xc0
[ 3769.246232]  do_syscall_64+0x49/0x100
[ 3769.246234]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3769.246237] RIP: 0033:0x7fee5b265730
[ 3769.246238] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
[ 3769.246239] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3769.246240] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
[ 3769.246241] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
[ 3769.246241] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
[ 3769.246242] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
[ 3769.246243] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
[ 3769.246244] Mem-Info:
[ 3769.246249] active_anon:152383 inactive_anon:99216 isolated_anon:0
                active_file:51569 inactive_file:85922 isolated_file:0
                unevictable:552 dirty:6866 writeback:0 unstable:0
                slab_reclaimable:6707 slab_unreclaimable:16166
                mapped:1870 shmem:6 pagetables:2716 bounce:0
                free:3639 free_pcp:900 free_cma:0
[ 3769.246252] Node 0 active_anon:609532kB inactive_anon:396864kB active_file:206276kB inactive_file:343688kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:7480kB dirty:27464kB writeback:0kB shmem:24kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[ 3769.246253] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:8056kB inactive_anon:0kB active_file:92kB inactive_file:148kB unevictable:0kB writepending:8kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:20kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[ 3769.246256] lowmem_reserve[]: 0 1865 1865 1865
[ 3769.246258] Node 0 DMA32 free:7076kB min:19472kB low:21380kB high:23288kB active_anon:601840kB inactive_anon:396512kB active_file:206216kB inactive_file:343644kB unevictable:2208kB writepending:27256kB present:2080768kB managed:1833792kB mlocked:2208kB kernel_stack:9392kB pagetables:10844kB bounce:0kB free_pcp:3600kB local_pcp:596kB free_cma:0kB
[ 3769.246260] lowmem_reserve[]: 0 0 0 0
[ 3769.246262] Node 0 DMA: 6*4kB (UE) 4*8kB (UME) 4*16kB (UME) 2*32kB (UE) 6*64kB (UE) 2*128kB (UM) 4*256kB (UME) 3*512kB (UME) 2*1024kB (ME) 1*2048kB (M) 0*4096kB = 7480kB
[ 3769.246267] Node 0 DMA32: 66*4kB (UM) 271*8kB (UME) 218*16kB (UME) 45*32kB (UME) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7360kB
[ 3769.246272] 144878 total pagecache pages
[ 3769.246276] 6812 pages in swap cache
[ 3769.246277] Swap cache stats: add 62616, delete 55806, find 31/55
[ 3769.246278] Free swap  = 3943164kB
[ 3769.246278] Total swap = 4194300kB
[ 3769.246279] 524181 pages RAM
[ 3769.246279] 0 pages HighMem/MovableOnly
[ 3769.246280] 61765 pages reserved
[ 3769.246280] 0 pages cma reserved
[ 3769.246284] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 3769.246286] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
[ 3769.246287] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 3769.246287] Call Trace:
[ 3769.246288]  <IRQ>
[ 3769.246290]  dump_stack+0x5c/0x7b
[ 3769.246291]  warn_alloc+0x103/0x190
[ 3769.246293]  __alloc_pages_nodemask+0xe3d/0xe80
[ 3769.246294]  ? inet_gro_receive+0x232/0x2c0
[ 3769.246296]  page_frag_alloc+0x117/0x150
[ 3769.246297]  __napi_alloc_skb+0x83/0xd0
[ 3769.246299]  rtl8169_poll+0x210/0x640
[ 3769.246300]  net_rx_action+0x23d/0x370
[ 3769.246302]  __do_softirq+0xed/0x229
[ 3769.246304]  irq_exit+0xb7/0xc0
[ 3769.246305]  xen_evtchn_do_upcall+0x27/0x40
[ 3769.246306]  xen_do_hypervisor_callback+0x29/0x40
[ 3769.246307]  </IRQ>
[ 3769.246308] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
[ 3769.246310] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
[ 3769.246310] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
[ 3769.246311] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
[ 3769.246312] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
[ 3769.246313] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
[ 3769.246313] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
[ 3769.246314] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
[ 3769.246316]  _raw_spin_lock+0x16/0x20
[ 3769.246317]  list_lru_add+0x59/0x170
[ 3769.246318]  inode_lru_list_add+0x1b/0x40
[ 3769.246320]  iput+0x18b/0x1a0
[ 3769.246321]  __dentry_kill+0xc5/0x170
[ 3769.246322]  shrink_dentry_list+0x93/0x1c0
[ 3769.246323]  prune_dcache_sb+0x4d/0x70
[ 3769.246325]  super_cache_scan+0x104/0x190
[ 3769.246326]  do_shrink_slab+0x12c/0x1e0
[ 3769.246328]  shrink_slab+0xdf/0x2b0
[ 3769.246329]  shrink_node+0x158/0x470
[ 3769.246331]  do_try_to_free_pages+0xd1/0x380
[ 3769.246333]  try_to_free_pages+0xb2/0xe0
[ 3769.246334]  __alloc_pages_nodemask+0x603/0xe80
[ 3769.246336]  ? xas_load+0x9/0x80
[ 3769.246337]  ? find_get_entry+0x58/0x120
[ 3769.246338]  pagecache_get_page+0xde/0x210
[ 3769.246340]  grab_cache_page_write_begin+0x17/0x30
[ 3769.246341]  ext4_da_write_begin+0xc4/0x340
[ 3769.246342]  generic_perform_write+0xb8/0x1b0
[ 3769.246344]  __generic_file_write_iter+0x13c/0x1b0
[ 3769.246345]  ext4_file_write_iter+0x121/0x3c0
[ 3769.246347]  __vfs_write+0x123/0x1a0
[ 3769.246348]  vfs_write+0xab/0x1a0
[ 3769.246349]  ksys_write+0x4d/0xc0
[ 3769.246350]  do_syscall_64+0x49/0x100
[ 3769.246352]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3769.246353] RIP: 0033:0x7fee5b265730
[ 3769.246354] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
[ 3769.246354] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3769.246355] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
[ 3769.246356] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
[ 3769.246357] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
[ 3769.246357] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
[ 3769.246358] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
[ 3769.246364] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[ 3769.246366] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
[ 3769.246366] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[ 3769.246366] Call Trace:
[ 3769.246367]  <IRQ>
[ 3769.246368]  dump_stack+0x5c/0x7b
[ 3769.246370]  warn_alloc+0x103/0x190
[ 3769.246371]  __alloc_pages_nodemask+0xe3d/0xe80
[ 3769.246373]  ? inet_gro_receive+0x232/0x2c0
[ 3769.246374]  page_frag_alloc+0x117/0x150
[ 3769.246375]  __napi_alloc_skb+0x83/0xd0
[ 3769.246376]  rtl8169_poll+0x210/0x640
[ 3769.246378]  net_rx_action+0x23d/0x370
[ 3769.246379]  __do_softirq+0xed/0x229
[ 3769.246381]  irq_exit+0xb7/0xc0
[ 3769.246382]  xen_evtchn_do_upcall+0x27/0x40
[ 3769.246383]  xen_do_hypervisor_callback+0x29/0x40
[ 3769.246383]  </IRQ>
[ 3769.246385] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
[ 3769.246386] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
[ 3769.246387] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
[ 3769.246388] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
[ 3769.246388] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
[ 3769.246389] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
[ 3769.246390] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
[ 3769.246390] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
[ 3769.246392]  _raw_spin_lock+0x16/0x20
[ 3769.246393]  list_lru_add+0x59/0x170
[ 3769.246395]  inode_lru_list_add+0x1b/0x40
[ 3769.246396]  iput+0x18b/0x1a0
[ 3769.246397]  __dentry_kill+0xc5/0x170
[ 3769.246398]  shrink_dentry_list+0x93/0x1c0
[ 3769.246399]  prune_dcache_sb+0x4d/0x70
[ 3769.246401]  super_cache_scan+0x104/0x190
[ 3769.246402]  do_shrink_slab+0x12c/0x1e0
[ 3769.246404]  shrink_slab+0xdf/0x2b0
[ 3769.246405]  shrink_node+0x158/0x470
[ 3769.246407]  do_try_to_free_pages+0xd1/0x380
[ 3769.246408]  try_to_free_pages+0xb2/0xe0
[ 3769.246410]  __alloc_pages_nodemask+0x603/0xe80
[ 3769.246411]  ? xas_load+0x9/0x80
[ 3769.246413]  ? find_get_entry+0x58/0x120
[ 3769.246414]  pagecache_get_page+0xde/0x210
[ 3769.246415]  grab_cache_page_write_begin+0x17/0x30
[ 3769.246416]  ext4_da_write_begin+0xc4/0x340
[ 3769.246418]  generic_perform_write+0xb8/0x1b0
[ 3769.246420]  __generic_file_write_iter+0x13c/0x1b0
[ 3769.246421]  ext4_file_write_iter+0x121/0x3c0
[ 3769.246422]  __vfs_write+0x123/0x1a0
[ 3769.246423]  vfs_write+0xab/0x1a0
[ 3769.246424]  ksys_write+0x4d/0xc0
[ 3769.246426]  do_syscall_64+0x49/0x100
[ 3769.246427]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3769.246428] RIP: 0033:0x7fee5b265730
[ 3769.246429] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
[ 3769.246430] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3769.246431] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
[ 3769.246431] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
[ 3769.246432] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
[ 3769.246433] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
[ 3769.246433] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710


 
>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>
> Sure, thanks.
> 
>> BTW am i correct these patches are merely optimizations ?
> 
> Yes
> 
>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>> to revert them for 5.0 and try again for 5.1 ?
>>
> Before removing both it would be good to test with only the barrier-removal removed.
> 
>> --
>> Sander
>>
> Heiner
> 
>>
>>>
>>>> would be candidates, which were merged in 5.0.
>>>>
>>>> I have reverted the first two, see how that works out.
>>>>
>>>> --
>>>> Sander
>>>>
>>> Heiner
>>>
>>>>  
>>>>>> --
>>>>>> Sander
>>>>>>
>>>>> Heiner
>>>>>
>>>>>>
>>>>>> [ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
>>>>>> [ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
>>>>>> [ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
>>>>>> [ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>> [ 6466.611579] RIP: e030:dql_completed+0x126/0x140
>>>>>> [ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>> [ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>> [ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>> [ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>> [ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>> [ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>> [ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>> [ 6466.722165] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>> [ 6466.733228] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>> [ 6466.758366] Call Trace:
>>>>>> [ 6466.768118]  <IRQ>
>>>>>> [ 6466.778214]  rtl8169_poll+0x4f4/0x640
>>>>>> [ 6466.789198]  net_rx_action+0x23d/0x370
>>>>>> [ 6466.798467]  __do_softirq+0xed/0x229
>>>>>> [ 6466.807039]  irq_exit+0xb7/0xc0
>>>>>> [ 6466.815471]  xen_evtchn_do_upcall+0x27/0x40
>>>>>> [ 6466.826647]  xen_do_hypervisor_callback+0x29/0x40
>>>>>> [ 6466.835902]  </IRQ>
>>>>>> [ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
>>>>>> [ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>>>>> [ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
>>>>>> [ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
>>>>>> [ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
>>>>>> [ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
>>>>>> [ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
>>>>>> [ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
>>>>>> [ 6466.926647]  ? xen_hypercall_mmu_update+0xa/0x20
>>>>>> [ 6466.938195]  ? xen_set_pte_at+0x78/0xe0
>>>>>> [ 6466.947046]  ? __handle_mm_fault+0xc43/0x1060
>>>>>> [ 6466.955772]  ? do_mmap+0x44b/0x5b0
>>>>>> [ 6466.964410]  ? handle_mm_fault+0xf8/0x200
>>>>>> [ 6466.973290]  ? __do_page_fault+0x231/0x4a0
>>>>>> [ 6466.981973]  ? page_fault+0x8/0x30
>>>>>> [ 6466.990904]  ? page_fault+0x1e/0x30
>>>>>> [ 6466.999585] Modules linked in:
>>>>>> [ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
>>>>>> [ 6467.016751] RIP: e030:dql_completed+0x126/0x140
>>>>>> [ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>> [ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>> [ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>> [ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>> [ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>> [ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>> [ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>> [ 6467.084916] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>> [ 6467.093352] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>> [ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>> [ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
>>>>>> [ 6467.118166] Kernel Offset: disabled
>>>>>> (XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
> 



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-08 23:09           ` Eric Dumazet
@ 2019-02-09  9:02             ` Heiner Kallweit
  2019-02-09  9:34               ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-09  9:02 UTC (permalink / raw)
  To: Eric Dumazet, Sander Eikelenboom, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 09.02.2019 00:09, Eric Dumazet wrote:
> 
> 
> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>> L.S.,
>>>>>>>
>>>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen, I got the nasty splat below,
>>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>>
>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>
>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>
>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>
>>>>> Hmm i did some diging and i think:
>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>
>>>> You're right. Thought this was added in 4.20 already.
>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>> have onboard Realtek network I have quite a few testers out there.
>>>> Does the issue occur under specific circumstances like very high load?
>>>
>>> Yep, the box is already quite contented with the Xen VM's and if I remember correctly it occurred while kernel compiling
>>> on the host.
>>>
>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>> as author of the underlying changes.
>>>
>>> It could also be the barriers weren't that unneeded as assumed.
>>
>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>> test also with only 
>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>> removed.
>>
>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>
>> Sure, thanks.
>>
>>> BTW am i correct these patches are merely optimizations ?
>>
>> Yes
>>
>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>> to revert them for 5.0 and try again for 5.1 ?
>>>
>> Before removing both it would be good to test with only the barrier-removal removed.
>>
> 
> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
> looks buggy to me, since the skb might have been freed already on another cpu when you call
> 
> You could try :
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>         dma_addr_t mapping;
>         u32 opts[2], len;
>         bool stop_queue;
> +       bool door_bell;
>         int frags;
>  
>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>         /* Force memory writes to complete before releasing descriptor */
>         dma_wmb();
>  
> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
> +
>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>  
>         /* Force all memory writes to complete before notifying device */
> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>         if (unlikely(stop_queue))
>                 netif_stop_queue(dev);
>  
> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
> +       if (door_bell) {
>                 RTL_W8(tp, TxPoll, NPQ);
>                 mmiowb();
>         }
> 
Thanks a lot for checking and for the proposed fix.
Sander, can you try this patch on top of 5.0-rc5 without reverting the two commits?

> 
> .
> 
Heiner
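
For readers following the thread, the ordering problem Eric points at can be
modeled in plain userspace C. This is only a sketch: struct bql_model,
model_sent(), model_completed() and replay_one_packet() are hypothetical names
that do not exist in the kernel, and the invariant (a completion may not cover
more bytes than are currently in flight) is an assumption about the check that
fires in dql_completed(), not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the BQL byte counters; not kernel code. */
struct bql_model {
	unsigned int num_queued;    /* bytes reported by the xmit path */
	unsigned int num_completed; /* bytes reported by TX completion */
};

/* Stands in for the accounting done by __netdev_sent_queue(). */
static void model_sent(struct bql_model *q, unsigned int bytes)
{
	q->num_queued += bytes;
}

/*
 * Stands in for the completion-side accounting. Returns false where
 * the real code would hit its BUG_ON (assumed invariant: a completion
 * may not cover more bytes than are currently in flight).
 */
static bool model_completed(struct bql_model *q, unsigned int bytes)
{
	if (bytes > q->num_queued - q->num_completed)
		return false;
	q->num_completed += bytes;
	return true;
}

/*
 * Replays one packet of len bytes. If account_first is true, the bytes
 * are accounted before the descriptor is released (the order used in
 * the proposed patch). Otherwise the completion is modeled as winning
 * the race and running before the xmit path has done its accounting.
 * Returns true if the completion was accepted.
 */
static bool replay_one_packet(bool account_first, unsigned int len)
{
	struct bql_model q = { 0, 0 };
	bool ok;

	if (account_first) {
		model_sent(&q, len);
		ok = model_completed(&q, len);
	} else {
		ok = model_completed(&q, len);
		model_sent(&q, len);
	}
	return ok;
}
```

In this model replay_one_packet(true, len) is accepted for any non-zero len,
while replay_one_packet(false, len) is rejected, because the completion sees
zero bytes in flight. The model does not cover the second hazard Eric mentions,
namely reading skb->len after the skb may already have been freed.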


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-08 23:34           ` Sander Eikelenboom
@ 2019-02-09  9:10             ` Heiner Kallweit
  0 siblings, 0 replies; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-09  9:10 UTC (permalink / raw)
  To: Sander Eikelenboom, Realtek linux nic maintainers, Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 09.02.2019 00:34, Sander Eikelenboom wrote:
> On 08/02/2019 22:50, Heiner Kallweit wrote:
>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>> L.S.,
>>>>>>>
>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen i the nasty splat below, 
>>>>>>> that I haven encountered with Linux 4.20.x.
>>>>>>>
>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>
>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>
>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>
>>>>> Hmm i did some diging and i think:
>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>
>>>> You're right. Thought this was added in 4.20 already.
>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>> have onboard Realtek network I have quite a few testers out there.
>>>> Does the issue occur under specific circumstances like very high load?
>>>
>>> Yep, the box is already quite contented with the Xen VM's and if I remember correctly it occurred while kernel compiling
>>> on the host.
>>>
>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>> as author of the underlying changes.
>>>
>>> It could also be the barriers weren't that unneeded as assumed.
>>
>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>> test also with only 
>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>> removed.
> 
> *arghh* *grmbl*
> 
> with both:
>     bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3
>     and
>     2e6eedb4813e34d8d84ac0eb3afb668966f3f356 
> reverted i get yet another splat:
> 
Puh, I'm not a memory management expert. The traces also include a failed memory
allocation from a file system operation. Maybe the system is running low on memory?
The issue occurs so deep in the memory management code that I wonder if and how
this could be caused by the network driver.


> [ 3769.246083] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
> [ 3769.246095] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
> [ 3769.246096] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
> [ 3769.246098] Call Trace:
> [ 3769.246104]  <IRQ>
> [ 3769.246114]  dump_stack+0x5c/0x7b
> [ 3769.246120]  warn_alloc+0x103/0x190
> [ 3769.246122]  __alloc_pages_nodemask+0xe3d/0xe80
> [ 3769.246128]  ? inet_gro_receive+0x232/0x2c0
> [ 3769.246130]  page_frag_alloc+0x117/0x150
> [ 3769.246132]  __napi_alloc_skb+0x83/0xd0
> [ 3769.246137]  rtl8169_poll+0x210/0x640
> [ 3769.246140]  net_rx_action+0x23d/0x370
> [ 3769.246145]  __do_softirq+0xed/0x229
> [ 3769.246149]  irq_exit+0xb7/0xc0
> [ 3769.246152]  xen_evtchn_do_upcall+0x27/0x40
> [ 3769.246154]  xen_do_hypervisor_callback+0x29/0x40
> [ 3769.246155]  </IRQ>
> [ 3769.246161] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
> [ 3769.246163] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
> [ 3769.246164] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
> [ 3769.246166] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
> [ 3769.246167] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
> [ 3769.246167] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
> [ 3769.246168] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
> [ 3769.246169] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
> [ 3769.246173]  _raw_spin_lock+0x16/0x20
> [ 3769.246176]  list_lru_add+0x59/0x170
> [ 3769.246179]  inode_lru_list_add+0x1b/0x40
> [ 3769.246182]  iput+0x18b/0x1a0
> [ 3769.246184]  __dentry_kill+0xc5/0x170
> [ 3769.246186]  shrink_dentry_list+0x93/0x1c0
> [ 3769.246187]  prune_dcache_sb+0x4d/0x70
> [ 3769.246191]  super_cache_scan+0x104/0x190
> [ 3769.246194]  do_shrink_slab+0x12c/0x1e0
> [ 3769.246196]  shrink_slab+0xdf/0x2b0
> [ 3769.246198]  shrink_node+0x158/0x470
> [ 3769.246200]  do_try_to_free_pages+0xd1/0x380
> [ 3769.246202]  try_to_free_pages+0xb2/0xe0
> [ 3769.246204]  __alloc_pages_nodemask+0x603/0xe80
> [ 3769.246207]  ? xas_load+0x9/0x80
> [ 3769.246209]  ? find_get_entry+0x58/0x120
> [ 3769.246210]  pagecache_get_page+0xde/0x210
> [ 3769.246213]  grab_cache_page_write_begin+0x17/0x30
> [ 3769.246215]  ext4_da_write_begin+0xc4/0x340
> [ 3769.246217]  generic_perform_write+0xb8/0x1b0
> [ 3769.246219]  __generic_file_write_iter+0x13c/0x1b0
> [ 3769.246223]  ext4_file_write_iter+0x121/0x3c0
> [ 3769.246225]  __vfs_write+0x123/0x1a0
> [ 3769.246226]  vfs_write+0xab/0x1a0
> [ 3769.246229]  ksys_write+0x4d/0xc0
> [ 3769.246232]  do_syscall_64+0x49/0x100
> [ 3769.246234]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 3769.246237] RIP: 0033:0x7fee5b265730
> [ 3769.246238] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
> [ 3769.246239] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 3769.246240] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
> [ 3769.246241] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
> [ 3769.246241] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
> [ 3769.246242] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
> [ 3769.246243] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
> [ 3769.246244] Mem-Info:
> [ 3769.246249] active_anon:152383 inactive_anon:99216 isolated_anon:0
>                 active_file:51569 inactive_file:85922 isolated_file:0
>                 unevictable:552 dirty:6866 writeback:0 unstable:0
>                 slab_reclaimable:6707 slab_unreclaimable:16166
>                 mapped:1870 shmem:6 pagetables:2716 bounce:0
>                 free:3639 free_pcp:900 free_cma:0
> [ 3769.246252] Node 0 active_anon:609532kB inactive_anon:396864kB active_file:206276kB inactive_file:343688kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:7480kB dirty:27464kB writeback:0kB shmem:24kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> [ 3769.246253] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:8056kB inactive_anon:0kB active_file:92kB inactive_file:148kB unevictable:0kB writepending:8kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:20kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [ 3769.246256] lowmem_reserve[]: 0 1865 1865 1865
> [ 3769.246258] Node 0 DMA32 free:7076kB min:19472kB low:21380kB high:23288kB active_anon:601840kB inactive_anon:396512kB active_file:206216kB inactive_file:343644kB unevictable:2208kB writepending:27256kB present:2080768kB managed:1833792kB mlocked:2208kB kernel_stack:9392kB pagetables:10844kB bounce:0kB free_pcp:3600kB local_pcp:596kB free_cma:0kB
> [ 3769.246260] lowmem_reserve[]: 0 0 0 0
> [ 3769.246262] Node 0 DMA: 6*4kB (UE) 4*8kB (UME) 4*16kB (UME) 2*32kB (UE) 6*64kB (UE) 2*128kB (UM) 4*256kB (UME) 3*512kB (UME) 2*1024kB (ME) 1*2048kB (M) 0*4096kB = 7480kB
> [ 3769.246267] Node 0 DMA32: 66*4kB (UM) 271*8kB (UME) 218*16kB (UME) 45*32kB (UME) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7360kB
> [ 3769.246272] 144878 total pagecache pages
> [ 3769.246276] 6812 pages in swap cache
> [ 3769.246277] Swap cache stats: add 62616, delete 55806, find 31/55
> [ 3769.246278] Free swap  = 3943164kB
> [ 3769.246278] Total swap = 4194300kB
> [ 3769.246279] 524181 pages RAM
> [ 3769.246279] 0 pages HighMem/MovableOnly
> [ 3769.246280] 61765 pages reserved
> [ 3769.246280] 0 pages cma reserved
> [ 3769.246284] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
> [ 3769.246286] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
> [ 3769.246287] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
> [ 3769.246287] Call Trace:
> [ 3769.246288]  <IRQ>
> [ 3769.246290]  dump_stack+0x5c/0x7b
> [ 3769.246291]  warn_alloc+0x103/0x190
> [ 3769.246293]  __alloc_pages_nodemask+0xe3d/0xe80
> [ 3769.246294]  ? inet_gro_receive+0x232/0x2c0
> [ 3769.246296]  page_frag_alloc+0x117/0x150
> [ 3769.246297]  __napi_alloc_skb+0x83/0xd0
> [ 3769.246299]  rtl8169_poll+0x210/0x640
> [ 3769.246300]  net_rx_action+0x23d/0x370
> [ 3769.246302]  __do_softirq+0xed/0x229
> [ 3769.246304]  irq_exit+0xb7/0xc0
> [ 3769.246305]  xen_evtchn_do_upcall+0x27/0x40
> [ 3769.246306]  xen_do_hypervisor_callback+0x29/0x40
> [ 3769.246307]  </IRQ>
> [ 3769.246308] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
> [ 3769.246310] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
> [ 3769.246310] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
> [ 3769.246311] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
> [ 3769.246312] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
> [ 3769.246313] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
> [ 3769.246313] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
> [ 3769.246314] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
> [ 3769.246316]  _raw_spin_lock+0x16/0x20
> [ 3769.246317]  list_lru_add+0x59/0x170
> [ 3769.246318]  inode_lru_list_add+0x1b/0x40
> [ 3769.246320]  iput+0x18b/0x1a0
> [ 3769.246321]  __dentry_kill+0xc5/0x170
> [ 3769.246322]  shrink_dentry_list+0x93/0x1c0
> [ 3769.246323]  prune_dcache_sb+0x4d/0x70
> [ 3769.246325]  super_cache_scan+0x104/0x190
> [ 3769.246326]  do_shrink_slab+0x12c/0x1e0
> [ 3769.246328]  shrink_slab+0xdf/0x2b0
> [ 3769.246329]  shrink_node+0x158/0x470
> [ 3769.246331]  do_try_to_free_pages+0xd1/0x380
> [ 3769.246333]  try_to_free_pages+0xb2/0xe0
> [ 3769.246334]  __alloc_pages_nodemask+0x603/0xe80
> [ 3769.246336]  ? xas_load+0x9/0x80
> [ 3769.246337]  ? find_get_entry+0x58/0x120
> [ 3769.246338]  pagecache_get_page+0xde/0x210
> [ 3769.246340]  grab_cache_page_write_begin+0x17/0x30
> [ 3769.246341]  ext4_da_write_begin+0xc4/0x340
> [ 3769.246342]  generic_perform_write+0xb8/0x1b0
> [ 3769.246344]  __generic_file_write_iter+0x13c/0x1b0
> [ 3769.246345]  ext4_file_write_iter+0x121/0x3c0
> [ 3769.246347]  __vfs_write+0x123/0x1a0
> [ 3769.246348]  vfs_write+0xab/0x1a0
> [ 3769.246349]  ksys_write+0x4d/0xc0
> [ 3769.246350]  do_syscall_64+0x49/0x100
> [ 3769.246352]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 3769.246353] RIP: 0033:0x7fee5b265730
> [ 3769.246354] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
> [ 3769.246354] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 3769.246355] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
> [ 3769.246356] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
> [ 3769.246357] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
> [ 3769.246357] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
> [ 3769.246358] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
> [ 3769.246364] ld: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
> [ 3769.246366] CPU: 2 PID: 3201 Comm: ld Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-doflr+ #1
> [ 3769.246366] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
> [ 3769.246366] Call Trace:
> [ 3769.246367]  <IRQ>
> [ 3769.246368]  dump_stack+0x5c/0x7b
> [ 3769.246370]  warn_alloc+0x103/0x190
> [ 3769.246371]  __alloc_pages_nodemask+0xe3d/0xe80
> [ 3769.246373]  ? inet_gro_receive+0x232/0x2c0
> [ 3769.246374]  page_frag_alloc+0x117/0x150
> [ 3769.246375]  __napi_alloc_skb+0x83/0xd0
> [ 3769.246376]  rtl8169_poll+0x210/0x640
> [ 3769.246378]  net_rx_action+0x23d/0x370
> [ 3769.246379]  __do_softirq+0xed/0x229
> [ 3769.246381]  irq_exit+0xb7/0xc0
> [ 3769.246382]  xen_evtchn_do_upcall+0x27/0x40
> [ 3769.246383]  xen_do_hypervisor_callback+0x29/0x40
> [ 3769.246383]  </IRQ>
> [ 3769.246385] RIP: e030:__pv_queued_spin_lock_slowpath+0xda/0x280
> [ 3769.246386] Code: 14 41 bc 01 00 00 00 41 bd 00 01 00 00 3c 02 0f 94 c0 0f b6 c0 48 89 04 24 c6 45 14 00 ba 00 80 00 00 c6 43 01 01 eb 0b f3 90 <83> ea 01 0f 84 49 01 00 00 0f b6 03 84 c0 75 ee 44 89 e8 f0 66 44
> [ 3769.246387] RSP: e02b:ffffc90005b0f780 EFLAGS: 00000202
> [ 3769.246388] RAX: 0000000000000001 RBX: ffff8880047c9200 RCX: 0000000000000001
> [ 3769.246388] RDX: 0000000000007d75 RSI: 0000000000000000 RDI: ffff8880047c9200
> [ 3769.246389] RBP: ffff88807d4a1a80 R08: ffffc90005b0f978 R09: ffffc90005b0f978
> [ 3769.246390] R10: ffffc90005b0f9d0 R11: ffff88807fc17000 R12: 0000000000000001
> [ 3769.246390] R13: 0000000000000100 R14: 0000000000000000 R15: 00000000000c0000
> [ 3769.246392]  _raw_spin_lock+0x16/0x20
> [ 3769.246393]  list_lru_add+0x59/0x170
> [ 3769.246395]  inode_lru_list_add+0x1b/0x40
> [ 3769.246396]  iput+0x18b/0x1a0
> [ 3769.246397]  __dentry_kill+0xc5/0x170
> [ 3769.246398]  shrink_dentry_list+0x93/0x1c0
> [ 3769.246399]  prune_dcache_sb+0x4d/0x70
> [ 3769.246401]  super_cache_scan+0x104/0x190
> [ 3769.246402]  do_shrink_slab+0x12c/0x1e0
> [ 3769.246404]  shrink_slab+0xdf/0x2b0
> [ 3769.246405]  shrink_node+0x158/0x470
> [ 3769.246407]  do_try_to_free_pages+0xd1/0x380
> [ 3769.246408]  try_to_free_pages+0xb2/0xe0
> [ 3769.246410]  __alloc_pages_nodemask+0x603/0xe80
> [ 3769.246411]  ? xas_load+0x9/0x80
> [ 3769.246413]  ? find_get_entry+0x58/0x120
> [ 3769.246414]  pagecache_get_page+0xde/0x210
> [ 3769.246415]  grab_cache_page_write_begin+0x17/0x30
> [ 3769.246416]  ext4_da_write_begin+0xc4/0x340
> [ 3769.246418]  generic_perform_write+0xb8/0x1b0
> [ 3769.246420]  __generic_file_write_iter+0x13c/0x1b0
> [ 3769.246421]  ext4_file_write_iter+0x121/0x3c0
> [ 3769.246422]  __vfs_write+0x123/0x1a0
> [ 3769.246423]  vfs_write+0xab/0x1a0
> [ 3769.246424]  ksys_write+0x4d/0xc0
> [ 3769.246426]  do_syscall_64+0x49/0x100
> [ 3769.246427]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 3769.246428] RIP: 0033:0x7fee5b265730
> [ 3769.246429] Code: 73 01 c3 48 8b 0d 68 d7 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d d9 2f 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 7e 9b 01 00 48 89 04 24
> [ 3769.246430] RSP: 002b:00007fff33183dd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ 3769.246431] RAX: ffffffffffffffda RBX: 0000000000000710 RCX: 00007fee5b265730
> [ 3769.246431] RDX: 0000000000000710 RSI: 000055559bed78b0 RDI: 0000000000000049
> [ 3769.246432] RBP: 000055559bed78b0 R08: 0000000000000b40 R09: 0000000001c0320c
> [ 3769.246433] R10: 00007fee5be91e80 R11: 0000000000000246 R12: 0000000000000710
> [ 3769.246433] R13: 0000000000000001 R14: 00005555a2690050 R15: 0000000000000710
> 
> 
>  
>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>
>> Sure, thanks.
>>
>>> BTW am i correct these patches are merely optimizations ?
>>
>> Yes
>>
>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>> to revert them for 5.0 and try again for 5.1 ?
>>>
>> Before removing both it would be good to test with only the barrier-removal removed.
>>
>>> --
>>> Sander
>>>
>> Heiner
>>
>>>
>>>>
>>>>> would be candidates, which were merged in 5.0.
>>>>>
>>>>> I have reverted the first two, see how that works out.
>>>>>
>>>>> --
>>>>> Sander
>>>>>
>>>> Heiner
>>>>
>>>>>  
>>>>>>> --
>>>>>>> Sander
>>>>>>>
>>>>>> Heiner
>>>>>>
>>>>>>>
>>>>>>> [ 6466.554866] kernel BUG at lib/dynamic_queue_limits.c:27!
>>>>>>> [ 6466.571425] invalid opcode: 0000 [#1] SMP NOPTI
>>>>>>> [ 6466.585890] CPU: 3 PID: 7057 Comm: as Not tainted 5.0.0-rc5-20190208-thp-net-florian-doflr+ #1
>>>>>>> [ 6466.598693] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>>> [ 6466.611579] RIP: e030:dql_completed+0x126/0x140
>>>>>>> [ 6466.624339] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>>> [ 6466.648130] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>>> [ 6466.659616] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>>> [ 6466.672835] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>>> [ 6466.684521] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>>> [ 6466.696824] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>>> [ 6466.709953] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>>> [ 6466.722165] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>>> [ 6466.733228] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [ 6466.746581] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>>> [ 6466.758366] Call Trace:
>>>>>>> [ 6466.768118]  <IRQ>
>>>>>>> [ 6466.778214]  rtl8169_poll+0x4f4/0x640
>>>>>>> [ 6466.789198]  net_rx_action+0x23d/0x370
>>>>>>> [ 6466.798467]  __do_softirq+0xed/0x229
>>>>>>> [ 6466.807039]  irq_exit+0xb7/0xc0
>>>>>>> [ 6466.815471]  xen_evtchn_do_upcall+0x27/0x40
>>>>>>> [ 6466.826647]  xen_do_hypervisor_callback+0x29/0x40
>>>>>>> [ 6466.835902]  </IRQ>
>>>>>>> [ 6466.845361] RIP: e030:xen_hypercall_mmu_update+0xa/0x20
>>>>>>> [ 6466.853390] Code: 51 41 53 b8 00 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 01 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
>>>>>>> [ 6466.874031] RSP: e02b:ffffc90003c0bdd0 EFLAGS: 00000246
>>>>>>> [ 6466.883452] RAX: 0000000000000000 RBX: 000000041f83bfe8 RCX: ffffffff8100102a
>>>>>>> [ 6466.891986] RDX: deadbeefdeadf00d RSI: deadbeefdeadf00d RDI: deadbeefdeadf00d
>>>>>>> [ 6466.903402] RBP: 0000000000000fe8 R08: 000000000000000b R09: 0000000000000000
>>>>>>> [ 6466.911201] R10: deadbeefdeadf00d R11: 0000000000000246 R12: 800000050c346067
>>>>>>> [ 6466.918491] R13: ffff8880607c4fe8 R14: ffff888005082800 R15: 0000000000000000
>>>>>>> [ 6466.926647]  ? xen_hypercall_mmu_update+0xa/0x20
>>>>>>> [ 6466.938195]  ? xen_set_pte_at+0x78/0xe0
>>>>>>> [ 6466.947046]  ? __handle_mm_fault+0xc43/0x1060
>>>>>>> [ 6466.955772]  ? do_mmap+0x44b/0x5b0
>>>>>>> [ 6466.964410]  ? handle_mm_fault+0xf8/0x200
>>>>>>> [ 6466.973290]  ? __do_page_fault+0x231/0x4a0
>>>>>>> [ 6466.981973]  ? page_fault+0x8/0x30
>>>>>>> [ 6466.990904]  ? page_fault+0x1e/0x30
>>>>>>> [ 6466.999585] Modules linked in:
>>>>>>> [ 6467.007533] ---[ end trace 94bec01608fe4061 ]---
>>>>>>> [ 6467.016751] RIP: e030:dql_completed+0x126/0x140
>>>>>>> [ 6467.024271] Code: 2b 47 54 ba 00 00 00 00 c7 47 54 ff ff ff ff 0f 48 c2 48 8b 15 7b 39 4a 01 48 89 57 58 e9 48 ff ff ff 44 89 c0 e9 40 ff ff ff <0f> 0b 8b 47 50 29 e8 41 0f 48 c3 eb 9f 90 90 90 90 90 90 90 90 90
>>>>>>> [ 6467.039726] RSP: e02b:ffff88807d4c3e78 EFLAGS: 00010297
>>>>>>> [ 6467.047243] RAX: 0000000000000042 RBX: ffff8880049cf800 RCX: 0000000000000000
>>>>>>> [ 6467.054202] RDX: 0000000000000001 RSI: 0000000000000042 RDI: ffff8880049cf8c0
>>>>>>> [ 6467.062000] RBP: ffff888077df7260 R08: 0000000000000001 R09: 0000000000000000
>>>>>>> [ 6467.069664] R10: 00000000387c2336 R11: 00000000387c2336 R12: 0000000010000000
>>>>>>> [ 6467.077715] R13: ffff888077df6898 R14: ffff888077df75c0 R15: 0000000000454677
>>>>>>> [ 6467.084916] FS:  00007fd869147200(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
>>>>>>> [ 6467.093352] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> [ 6467.101492] CR2: 00007fd867dfd000 CR3: 0000000074884000 CR4: 0000000000000660
>>>>>>> [ 6467.110542] Kernel panic - not syncing: Fatal exception in interrupt
>>>>>>> [ 6467.118166] Kernel Offset: disabled
>>>>>>> (XEN) [2019-02-08 18:04:48.854] Hardware Dom0 crashed: rebooting machine in 5 seconds.
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
> 
> 



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-09  9:02             ` Heiner Kallweit
@ 2019-02-09  9:34               ` Sander Eikelenboom
  2019-02-09  9:59                 ` Heiner Kallweit
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2019-02-09  9:34 UTC (permalink / raw)
  To: Heiner Kallweit, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 09/02/2019 10:02, Heiner Kallweit wrote:
> On 09.02.2019 00:09, Eric Dumazet wrote:
>>
>>
>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>> L.S.,
>>>>>>>>
>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen i the nasty splat below, 
>>>>>>>> that I haven encountered with Linux 4.20.x.
>>>>>>>>
>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>
>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>
>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>
>>>>>> Hmm i did some diging and i think:
>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>
>>>>> You're right. Thought this was added in 4.20 already.
>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>> Does the issue occur under specific circumstances like very high load?
>>>>
>>>> Yep, the box is already quite contented with the Xen VM's and if I remember correctly it occurred while kernel compiling
>>>> on the host.
>>>>
>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>> as author of the underlying changes.
>>>>
>>>> It could also be the barriers weren't that unneeded as assumed.
>>>
>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>> test also with only 
>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>> removed.
>>>
>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>
>>> Sure, thanks.
>>>
>>>> BTW am i correct these patches are merely optimizations ?
>>>
>>> Yes
>>>
>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>
>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>
>>
>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>
>> You could try :
>>
>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>> --- a/drivers/net/ethernet/realtek/r8169.c
>> +++ b/drivers/net/ethernet/realtek/r8169.c
>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>         dma_addr_t mapping;
>>         u32 opts[2], len;
>>         bool stop_queue;
>> +       bool door_bell;
>>         int frags;
>>  
>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>         /* Force memory writes to complete before releasing descriptor */
>>         dma_wmb();
>>  
>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>> +
>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>  
>>         /* Force all memory writes to complete before notifying device */
>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>         if (unlikely(stop_queue))
>>                 netif_stop_queue(dev);
>>  
>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>> +       if (door_bell) {
>>                 RTL_W8(tp, TxPoll, NPQ);
>>                 mmiowb();
>>         }
>>
> Thanks a lot for checking and for the proposed fix.
> Sander, can you try with this patch on top of 5.0-rc5 w/o removing two two commits?

I already did that overnight. The results:
- I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit that triggers the BUG_ON in lib/dynamic_queue_limits.c
  (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).

- Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
  The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c: it ran overnight and I have done a few kernel compiles
  this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
  compile kernels in the same way as I do now. What I can't say is whether that is due to this change or something else again,
  which makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that.

  If I can, it is a separate issue.
  If I can't, then even with the patch it still looks like a regression compared to 4.20.x, for which
  a revert would be the right thing to do (since, as you indicated, these are merely optimizations).
  That would give us more time in the 5.1 cycle to try to solve things on top of the 5.0-release-to-be
  (especially since I still seem to have other issues which need to be sorted out, and time is limited).
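
To illustrate why ordering matters for that BUG_ON, here is a toy userspace model, not kernel code. It assumes the check in dql_completed() amounts to "cannot complete more bytes than are currently outstanding"; struct toy_dql and all toy_* functions are made up for this sketch, and the race between CPUs is modelled simply as two call orders.

```c
#include <assert.h>	/* only needed for the usage checks mentioned below */
#include <stdbool.h>

/* Toy stand-in for the BQL byte counters; not the kernel's struct dql. */
struct toy_dql {
	unsigned int num_queued;	/* bytes the xmit path has accounted */
	unsigned int num_completed;	/* bytes the completion path has retired */
};

/* Xmit side: account bytes that were handed to the NIC. */
static void toy_queued(struct toy_dql *q, unsigned int bytes)
{
	q->num_queued += bytes;
}

/*
 * Completion side: returns false where this model assumes the real code
 * would trip its BUG_ON, i.e. more bytes completed than are outstanding.
 */
static bool toy_completed(struct toy_dql *q, unsigned int bytes)
{
	if (bytes > q->num_queued - q->num_completed)
		return false;
	q->num_completed += bytes;
	return true;
}

/* Accounting first, completion second: the invariant holds. */
static bool toy_account_then_complete(unsigned int bytes)
{
	struct toy_dql q = { 0, 0 };

	toy_queued(&q, bytes);
	return toy_completed(&q, bytes);
}

/* Completion runs before the accounting has happened: the check fails. */
static bool toy_complete_then_account(unsigned int bytes)
{
	struct toy_dql q = { 0, 0 };
	bool ok = toy_completed(&q, bytes);

	toy_queued(&q, bytes);
	return ok;
}
```

In this model toy_account_then_complete(66) returns true and toy_complete_then_account(66) returns false, which is the situation I would expect if the completion path can run before the xmit path has done its accounting.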

  The timeout in question:
        [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
        [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
        [28336.893358] Modules linked in:
        [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
        [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
        [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
        [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
        [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
        [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
        [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
        [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
        [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
        [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
        [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
        [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
        [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
        [28337.090052] Call Trace:
        [28337.103615]  <IRQ>
        [28337.116587]  ? qdisc_destroy+0x120/0x120
        [28337.128905]  call_timer_fn+0x19/0x90
        [28337.141892]  expire_timers+0x8b/0xa0
        [28337.153354]  run_timer_softirq+0x7e/0x160
        [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
        [28337.176548]  ? handle_percpu_irq+0x32/0x50
        [28337.186734]  __do_softirq+0xed/0x229
        [28337.196404]  ? hypervisor_callback+0xa/0x20
        [28337.207822]  irq_exit+0xb7/0xc0
        [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
        [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
        [28337.241261]  </IRQ>
        [28337.253283] RIP: e033:0xff7e62
        [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
        [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
        [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
        [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
        [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
        [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
        [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
        [28337.353977] ---[ end trace 6ff49f09286816b7 ]---

--
Sander

 
>>
>> .
>>
> Heiner
> 



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-09  9:34               ` Sander Eikelenboom
@ 2019-02-09  9:59                 ` Heiner Kallweit
  2019-02-09 10:07                   ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-09  9:59 UTC (permalink / raw)
  To: Sander Eikelenboom, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 09.02.2019 10:34, Sander Eikelenboom wrote:
> On 09/02/2019 10:02, Heiner Kallweit wrote:
>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>
>>>
>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>> L.S.,
>>>>>>>>>
>>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen I got the nasty splat below,
>>>>>>>>> that I haven't encountered with Linux 4.20.x.
>>>>>>>>>
>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>
>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>
>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>
>>>>>>> Hmm, I did some digging and I think:
>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>
>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>
>>>>> Yep, the box is already quite contended with the Xen VMs and if I remember correctly it occurred while kernel compiling
>>>>> on the host.
>>>>>
>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>> as author of the underlying changes.
>>>>>
>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>
>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>> test also with only 
>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>> removed.
>>>>
>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>
>>>> Sure, thanks.
>>>>
>>>>> BTW am i correct these patches are merely optimizations ?
>>>>
>>>> Yes
>>>>
>>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>>
>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>
>>>
>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>
>>> You could try :
>>>
>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>         dma_addr_t mapping;
>>>         u32 opts[2], len;
>>>         bool stop_queue;
>>> +       bool door_bell;
>>>         int frags;
>>>  
>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>         /* Force memory writes to complete before releasing descriptor */
>>>         dma_wmb();
>>>  
>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>> +
>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>  
>>>         /* Force all memory writes to complete before notifying device */
>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>         if (unlikely(stop_queue))
>>>                 netif_stop_queue(dev);
>>>  
>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>> +       if (door_bell) {
>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>                 mmiowb();
>>>         }
>>>
>> Thanks a lot for checking and for the proposed fix.
>> Sander, can you try with this patch on top of 5.0-rc5 w/o removing the two commits?
> 
> I already did that overnight. The results:
> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit that triggers the BUG_ON in lib/dynamic_queue_limits.c
>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
> 
> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c: it ran overnight and I have done a few kernel compiles
>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>   compile kernels in the same way as I do now. What I can't say is whether that is due to this change or something else again,
>   which makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that.
> 
>   If I can, it is a separate issue.
>   If I can't, then even with the patch it still looks like a regression compared to 4.20.x, for which
>   a revert would be the right thing to do (since, as you indicated, these are merely optimizations).
>   That would give us more time in the 5.1 cycle to try to solve things on top of the 5.0-release-to-be
>   (especially since I still seem to have other issues which need to be sorted out, and time is limited).
> 
>   The timeout in question:
>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>         [28336.893358] Modules linked in:
>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>         [28337.090052] Call Trace:
>         [28337.103615]  <IRQ>
>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>         [28337.128905]  call_timer_fn+0x19/0x90
>         [28337.141892]  expire_timers+0x8b/0xa0
>         [28337.153354]  run_timer_softirq+0x7e/0x160
>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>         [28337.186734]  __do_softirq+0xed/0x229
>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>         [28337.207822]  irq_exit+0xb7/0xc0
>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>         [28337.241261]  </IRQ>
>         [28337.253283] RIP: e033:0xff7e62
>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
> 
Thanks for your efforts. As usual, this tx timeout trace says basically nothing except
"timeout", and the root cause could be anything. Earlier you reported a memory allocation error;
did that occur again?
If we decide to revert, I'd leave the removal of the mmiowb barriers in (as it doesn't seem to
contribute to the issue) and just submit a patch to effectively revert
2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
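
For reference, the ordering that Eric's patch (quoted above) establishes can be sketched in plain userspace C. This is only an illustration under assumptions: the toy_* types and functions are hypothetical stand-ins rather than r8169 or netdev code, the doorbell decision is simplified to "ring unless more packets follow", and the sketch is single-threaded and leaves out the memory barriers the driver needs.

```c
#include <assert.h>	/* only needed for the usage checks mentioned below */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are r8169 or netdev types. */
struct toy_pkt {
	unsigned int len;
};

struct toy_desc {
	struct toy_pkt *pkt;
	bool device_owns;	/* set last: after this the packet is off limits */
};

struct toy_txq {
	unsigned int bytes_accounted;
};

/* Toy accounting hook: returns true when the caller should ring the doorbell. */
static bool toy_sent_queue(struct toy_txq *q, unsigned int bytes, bool more)
{
	q->bytes_accounted += bytes;
	return !more;
}

/*
 * Do everything that dereferences pkt first, publish the descriptor last.
 * Only the saved doorbell flag is used after ownership has moved.
 */
static bool toy_start_xmit(struct toy_txq *q, struct toy_desc *d,
			   struct toy_pkt *pkt, bool more)
{
	bool door_bell = toy_sent_queue(q, pkt->len, more);

	d->pkt = pkt;
	d->device_owns = true;	/* a completion path may free pkt from here on */

	return door_bell;
}
```

The point of the sketch is that pkt->len is read and accounted while the packet still belongs to the transmit path, so nothing touches the packet once the descriptor has been handed over.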

> --
> Sander
> 
>  
>>>
>>> .
>>>
>> Heiner
>>
> 
> 


* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-09  9:59                 ` Heiner Kallweit
@ 2019-02-09 10:07                   ` Sander Eikelenboom
  2019-02-09 11:50                     ` Heiner Kallweit
  0 siblings, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2019-02-09 10:07 UTC (permalink / raw)
  To: Heiner Kallweit, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 09/02/2019 10:59, Heiner Kallweit wrote:
> On 09.02.2019 10:34, Sander Eikelenboom wrote:
>> On 09/02/2019 10:02, Heiner Kallweit wrote:
>>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>>
>>>>
>>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>>> L.S.,
>>>>>>>>>>
>>>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen I got the nasty splat below,
>>>>>>>>>> that I haven't encountered with Linux 4.20.x.
>>>>>>>>>>
>>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>>
>>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>>
>>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>>
>>>>>>>> Hmm, I did some digging and I think:
>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>>
>>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>>
>>>>>> Yep, the box is already quite contended with the Xen VMs and if I remember correctly it occurred while kernel compiling
>>>>>> on the host.
>>>>>>
>>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>>> as author of the underlying changes.
>>>>>>
>>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>>
>>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>>> test also with only 
>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>> removed.
>>>>>
>>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>>
>>>>> Sure, thanks.
>>>>>
>>>>>> BTW am i correct these patches are merely optimizations ?
>>>>>
>>>>> Yes
>>>>>
>>>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>>>
>>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>>
>>>>
>>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>>
>>>> You could try :
>>>>
>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>         dma_addr_t mapping;
>>>>         u32 opts[2], len;
>>>>         bool stop_queue;
>>>> +       bool door_bell;
>>>>         int frags;
>>>>  
>>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>         /* Force memory writes to complete before releasing descriptor */
>>>>         dma_wmb();
>>>>  
>>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>>> +
>>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>>  
>>>>         /* Force all memory writes to complete before notifying device */
>>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>         if (unlikely(stop_queue))
>>>>                 netif_stop_queue(dev);
>>>>  
>>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>>> +       if (door_bell) {
>>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>>                 mmiowb();
>>>>         }
>>>>
>>> Thanks a lot for checking and for the proposed fix.
>>> Sander, can you try with this patch on top of 5.0-rc5 w/o removing the two commits?
>>
>> I already did that overnight. The results:
>> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit that triggers the BUG_ON in lib/dynamic_queue_limits.c
>>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
>>
>> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c: it ran overnight and I have done a few kernel compiles
>>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>>   compile kernels in the same way as I do now. What I can't say is whether that is due to this change or something else again,
>>   which makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that.
>>
>>   If I can, it is a separate issue.
>>   If I can't, then even with the patch it still looks like a regression compared to 4.20.x, for which
>>   a revert would be the right thing to do (since, as you indicated, these are merely optimizations).
>>   That would give us more time in the 5.1 cycle to try to solve things on top of the 5.0-release-to-be
>>   (especially since I still seem to have other issues which need to be sorted out, and time is limited).
>>
>>   The timeout in question:
>>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>>         [28336.893358] Modules linked in:
>>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>>         [28337.090052] Call Trace:
>>         [28337.103615]  <IRQ>
>>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>>         [28337.128905]  call_timer_fn+0x19/0x90
>>         [28337.141892]  expire_timers+0x8b/0xa0
>>         [28337.153354]  run_timer_softirq+0x7e/0x160
>>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>>         [28337.186734]  __do_softirq+0xed/0x229
>>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>>         [28337.207822]  irq_exit+0xb7/0xc0
>>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>>         [28337.241261]  </IRQ>
>>         [28337.253283] RIP: e033:0xff7e62
>>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
>>
> Thanks for your efforts. As usual, this tx timeout trace says basically nothing except
> "timeout", and the root cause could be anything. Earlier you reported a memory allocation error;
> did that occur again?
> If we decide to revert, I'd leave the removal of the mmiowb barriers in (as it doesn't seem to
> contribute to the issue) and just submit a patch to effectively revert
> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.

I can't say whether that is correct, because I haven't tested it.

Another thing I could test:
 - putting all the r8169 patches (and prerequisites) that went into 5.0
   up to bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 on top of 4.20.7 and seeing what that does.
   If that is feasible (not too many prerequisites outside r8169) and
   you could spare some time to prep such a branch somewhere so I can pull and compile it,
   that would be great.

--
Sander

>> --
>> Sander
>>
>>  
>>>>
>>>> .
>>>>
>>> Heiner
>>>
>>
>>



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-09 10:07                   ` Sander Eikelenboom
@ 2019-02-09 11:50                     ` Heiner Kallweit
  2019-02-10  9:16                       ` Sander Eikelenboom
  0 siblings, 1 reply; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-09 11:50 UTC (permalink / raw)
  To: Sander Eikelenboom, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 09.02.2019 11:07, Sander Eikelenboom wrote:
> On 09/02/2019 10:59, Heiner Kallweit wrote:
>> On 09.02.2019 10:34, Sander Eikelenboom wrote:
>>> On 09/02/2019 10:02, Heiner Kallweit wrote:
>>>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>>>
>>>>>
>>>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>>>> L.S.,
>>>>>>>>>>>
>>>>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen I got the nasty splat below,
>>>>>>>>>>> that I haven't encountered with Linux 4.20.x.
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>>>
>>>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>>>
>>>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>>>
>>>>>>>>> Hmm, I did some digging and I think:
>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>>>
>>>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>>>
>>>>>>> Yep, the box is already quite contended with the Xen VMs and if I remember correctly it occurred while kernel compiling
>>>>>>> on the host.
>>>>>>>
>>>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>>>> as author of the underlying changes.
>>>>>>>
>>>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>>>
>>>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>>>> test also with only 
>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>> removed.
>>>>>>
>>>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>>>
>>>>>> Sure, thanks.
>>>>>>
>>>>>>> BTW am i correct these patches are merely optimizations ?
>>>>>>
>>>>>> Yes
>>>>>>
>>>>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>>>>
>>>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>>>
>>>>>
>>>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>>>
>>>>> You could try :
>>>>>
>>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>         dma_addr_t mapping;
>>>>>         u32 opts[2], len;
>>>>>         bool stop_queue;
>>>>> +       bool door_bell;
>>>>>         int frags;
>>>>>  
>>>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>         /* Force memory writes to complete before releasing descriptor */
>>>>>         dma_wmb();
>>>>>  
>>>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>>>> +
>>>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>>>  
>>>>>         /* Force all memory writes to complete before notifying device */
>>>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>         if (unlikely(stop_queue))
>>>>>                 netif_stop_queue(dev);
>>>>>  
>>>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>>>> +       if (door_bell) {
>>>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>>>                 mmiowb();
>>>>>         }
>>>>>
>>>> Thanks a lot for checking and for the proposed fix.
>>>> Sander, can you try with this patch on top of 5.0-rc5 w/o removing the two commits?
>>>
>>> I already did that overnight. The results:
>>> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit that triggers the BUG_ON in lib/dynamic_queue_limits.c
>>>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
>>>
>>> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>>>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c: it ran overnight and I have done a few kernel compiles
>>>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>>>   compile kernels in the same way as I do now. What I can't say is whether that is due to this change or something else again,
>>>   which makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that.
>>>
>>>   If I can, it is a separate issue.
>>>   If I can't, then even with the patch it still looks like a regression compared to 4.20.x, for which
>>>   a revert would be the right thing to do (since, as you indicated, these are merely optimizations).
>>>   That would give us more time in the 5.1 cycle to try to solve things on top of the 5.0-release-to-be
>>>   (especially since I still seem to have other issues which need to be sorted out, and time is limited).
>>>
>>>   The timeout in question:
>>>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>>>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>>>         [28336.893358] Modules linked in:
>>>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>>>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>>>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>>>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>>>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>>>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>>>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>>>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>>>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>>>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>>>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>>>         [28337.090052] Call Trace:
>>>         [28337.103615]  <IRQ>
>>>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>>>         [28337.128905]  call_timer_fn+0x19/0x90
>>>         [28337.141892]  expire_timers+0x8b/0xa0
>>>         [28337.153354]  run_timer_softirq+0x7e/0x160
>>>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>>>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>>>         [28337.186734]  __do_softirq+0xed/0x229
>>>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>>>         [28337.207822]  irq_exit+0xb7/0xc0
>>>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>>>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>>>         [28337.241261]  </IRQ>
>>>         [28337.253283] RIP: e033:0xff7e62
>>>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>>>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>>>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>>>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>>>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>>>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>>>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>>>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
>>>
>> Thanks for your efforts. As usual this tx timeout trace says basically nothing except
>> "timeout" and root cause could be anything. Earlier you reported a memory allocation error,
>> did that occur again?
>> If we decide to revert, I'd leave removal of the memory barriers in (as it doesn't seem to
>> contribute to the issue) and just submit a patch to effectively revert
>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
> 
> I can't say if that is correct, because I haven't tested that.
> 
> Another thing I could test is:
>  - putting all the r8169 patches (and prerequisites) that went into 5.0
>    up to bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 onto 4.20.7 and seeing what that does.
>    If that would be feasible (not too many prerequisites needed outside of r8169) and if
>    you could spare some time to prep such a branch somewhere so I can pull and compile it,
>    that would be great.
> 

Unfortunately there's quite a number of changes. Regarding __netdev_tx_sent_queue()
and watchdog timeout I found the following comment in drivers/net/ethernet/sfc/tx.c,
efx_enqueue_skb():

	if (__netdev_tx_sent_queue(tx_queue->core_txq, skb_len, xmit_more)) {
		struct efx_tx_queue *txq2 = efx_tx_queue_partner(tx_queue);

		/* There could be packets left on the partner queue if those
		 * SKBs had skb->xmit_more set. If we do not push those they
		 * could be left for a long time and cause a netdev watchdog.
		 */
		if (txq2->xmit_more_available)
			efx_nic_push_buffers(txq2);

But I'm not sure whether the situation in r8169 is comparable. The following patch
implements what I mentioned earlier: It leaves all other 5.0 changes in place and
effectively reverts 2e6eedb4813e34d8d84ac0eb3afb668966f3f356. Would be great if
you could give it a try.


diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index e8a112149..3cca2ffb2 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6192,7 +6192,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
 	struct device *d = tp_to_dev(tp);
 	dma_addr_t mapping;
 	u32 opts[2], len;
-	bool stop_queue;
 	int frags;
 
 	if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
@@ -6234,6 +6233,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
 
 	txd->opts2 = cpu_to_le32(opts[1]);
 
+	netdev_sent_queue(dev, skb->len);
+
 	skb_tx_timestamp(skb);
 
 	/* Force memory writes to complete before releasing descriptor */
@@ -6246,14 +6247,14 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
 
 	tp->cur_tx += frags + 1;
 
-	stop_queue = !rtl_tx_slots_avail(tp, MAX_SKB_FRAGS);
-	if (unlikely(stop_queue))
-		netif_stop_queue(dev);
-
-	if (__netdev_sent_queue(dev, skb->len, skb->xmit_more))
-		RTL_W8(tp, TxPoll, NPQ);
+	RTL_W8(tp, TxPoll, NPQ);
 
-	if (unlikely(stop_queue)) {
+	if (!rtl_tx_slots_avail(tp, MAX_SKB_FRAGS)) {
+		/* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
+		 * not miss a ring update when it notices a stopped queue.
+		 */
+		smp_wmb();
+		netif_stop_queue(dev);
 		/* Sync with rtl_tx:
 		 * - publish queue status and cur_tx ring index (write barrier)
 		 * - refresh dirty_tx ring index (read barrier).
-- 
2.20.1
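
A note for readers following the thread: both patches above move the BQL
"sent" accounting ahead of the point where the descriptor is released to the
hardware. The following is a minimal user-space sketch of why that ordering
matters, assuming the BUG_ON in dql_completed() fires when more bytes are
reported as completed than are currently accounted as queued. The toy_* names
and the two ordering helpers are hypothetical and only model the counters;
they are not the kernel's API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the BQL state: only the two counters being compared.
 * Hypothetical names, not the kernel's struct dql. */
struct toy_dql {
	unsigned int num_queued;    /* bytes reported as handed to hardware */
	unsigned int num_completed; /* bytes reported as finished */
};

/* Models the "sent" side of the accounting. */
static void toy_sent(struct toy_dql *q, unsigned int bytes)
{
	q->num_queued += bytes;
}

/* Models the "completed" side. Returns false where the real code would
 * BUG: more bytes completed than are accounted as in flight. */
static bool toy_completed(struct toy_dql *q, unsigned int bytes)
{
	if (bytes > q->num_queued - q->num_completed)
		return false;
	q->num_completed += bytes;
	return true;
}

/* Ordering A: account the packet first, then let completion run. */
static bool account_then_publish(unsigned int len)
{
	struct toy_dql q = { 0, 0 };

	toy_sent(&q, len);
	return toy_completed(&q, len);
}

/* Ordering B: the descriptor is visible to hardware first, so the
 * completion path (e.g. on another CPU) can run before the accounting. */
static bool publish_then_account(unsigned int len)
{
	struct toy_dql q = { 0, 0 };
	bool ok = toy_completed(&q, len);

	toy_sent(&q, len);
	return ok;
}

/* Small self-check exercising both orderings. */
static void toy_self_check(void)
{
	assert(account_then_publish(66));
	assert(!publish_then_account(66));
}
```

In this model only ordering B can trip the check, which is consistent with
the BUG_ON disappearing once the accounting call is moved earlier. It does not
model the second hazard Eric points out, namely reading skb->len after the skb
may already have been freed.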


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-09 11:50                     ` Heiner Kallweit
@ 2019-02-10  9:16                       ` Sander Eikelenboom
  2019-02-10  9:32                         ` Heiner Kallweit
  2019-02-10 11:44                         ` Heiner Kallweit
  0 siblings, 2 replies; 20+ messages in thread
From: Sander Eikelenboom @ 2019-02-10  9:16 UTC (permalink / raw)
  To: Heiner Kallweit, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 09/02/2019 12:50, Heiner Kallweit wrote:
> On 09.02.2019 11:07, Sander Eikelenboom wrote:
>> On 09/02/2019 10:59, Heiner Kallweit wrote:
>>> On 09.02.2019 10:34, Sander Eikelenboom wrote:
>>>> On 09/02/2019 10:02, Heiner Kallweit wrote:
>>>>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>>>>
>>>>>>
>>>>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>>>>> L.S.,
>>>>>>>>>>>>
>>>>>>>>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen I hit the nasty splat below,
>>>>>>>>>>>> that I haven't encountered with Linux 4.20.x.
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>>>>
>>>>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>>>>
>>>>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>>>>
>>>>>>>>>> Hmm, I did some digging and I think:
>>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>>>>
>>>>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>>>>
>>>>>>>> Yep, the box is already quite contended with the Xen VMs and, if I remember correctly, it occurred while compiling a kernel
>>>>>>>> on the host.
>>>>>>>>
>>>>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>>>>> as author of the underlying changes.
>>>>>>>>
>>>>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>>>>
>>>>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>>>>> test also with only 
>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>> removed.
>>>>>>>
>>>>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>>>>
>>>>>>> Sure, thanks.
>>>>>>>
>>>>>>>> BTW am i correct these patches are merely optimizations ?
>>>>>>>
>>>>>>> Yes
>>>>>>>
>>>>>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>>>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>>>>>
>>>>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>>>>
>>>>>>
>>>>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>>>>
>>>>>> You could try :
>>>>>>
>>>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>         dma_addr_t mapping;
>>>>>>         u32 opts[2], len;
>>>>>>         bool stop_queue;
>>>>>> +       bool door_bell;
>>>>>>         int frags;
>>>>>>  
>>>>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>         /* Force memory writes to complete before releasing descriptor */
>>>>>>         dma_wmb();
>>>>>>  
>>>>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>>>>> +
>>>>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>>>>  
>>>>>>         /* Force all memory writes to complete before notifying device */
>>>>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>         if (unlikely(stop_queue))
>>>>>>                 netif_stop_queue(dev);
>>>>>>  
>>>>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>>>>> +       if (door_bell) {
>>>>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>>>>                 mmiowb();
>>>>>>         }
>>>>>>
>>>>> Thanks a lot for checking and for the proposed fix.
>>>>> Sander, can you try with this patch on top of 5.0-rc5 w/o removing the two commits?
>>>>
>>>> I have done that already during the night .. the results:
>>>> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit which causes hitting the BUG_ON in lib/dynamic_queue_limits.c.
>>>>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
>>>>
>>>> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>>>>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c; it has run overnight and I have done a few kernel compiles
>>>>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>>>>   compile kernels in the same way as I do now. The only thing is that I can't say whether that is due to this change or to something else again,
>>>>   which makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that or not.
>>>>
>>>>   If I can, it is a separate issue.
>>>>   If I can't, then even with the patch it still looks like a regression compared with 4.20.x, for which
>>>>   a revert would be the right thing to do (since, as you indicated, these are merely optimizations),
>>>>   which would give us more time for 5.1 to try to solve things on top of the 5.0-release-to-be
>>>>   (especially since I still seem to have other issues which need to be sorted out and time is limited).
>>>>
>>>>   The timeout in question:
>>>>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>>>>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>>>>         [28336.893358] Modules linked in:
>>>>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>>>>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>>>>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>>>>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>>>>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>>>>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>>>>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>>>>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>>>>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>>>>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>>>>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>>>>         [28337.090052] Call Trace:
>>>>         [28337.103615]  <IRQ>
>>>>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>>>>         [28337.128905]  call_timer_fn+0x19/0x90
>>>>         [28337.141892]  expire_timers+0x8b/0xa0
>>>>         [28337.153354]  run_timer_softirq+0x7e/0x160
>>>>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>>>>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>>>>         [28337.186734]  __do_softirq+0xed/0x229
>>>>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>>>>         [28337.207822]  irq_exit+0xb7/0xc0
>>>>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>>>>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>>>>         [28337.241261]  </IRQ>
>>>>         [28337.253283] RIP: e033:0xff7e62
>>>>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>>>>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>>>>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>>>>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>>>>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>>>>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>>>>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>>>>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
>>>>
>>> Thanks for your efforts. As usual this tx timeout trace says basically nothing except
>>> "timeout" and root cause could be anything. Earlier you reported a memory allocation error,
>>> did that occur again?
>>> If we decide to revert, I'd leave removal of the memory barriers in (as it doesn't seem to
>>> contribute to the issue) and just submit a patch to effectively revert
>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
>>
>> I can't say if that is correct, because I haven't tested that.
>>
>> Another thing I could test is:
>>  - putting all the r8169 patches (and prerequisites) that went into 5.0
>>    up to bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 onto 4.20.7 and seeing what that does.
>>    If that would be feasible (not too many prerequisites needed outside of r8169) and if
>>    you could spare some time to prep such a branch somewhere so I can pull and compile it,
>>    that would be great.
>>
> 
> Unfortunately there's quite a number of changes. Regarding __netdev_tx_sent_queue()
> and watchdog timeout I found the following comment in drivers/net/ethernet/sfc/tx.c,
> efx_enqueue_skb():
> 
> 	if (__netdev_tx_sent_queue(tx_queue->core_txq, skb_len, xmit_more)) {
> 		struct efx_tx_queue *txq2 = efx_tx_queue_partner(tx_queue);
> 
> 		/* There could be packets left on the partner queue if those
> 		 * SKBs had skb->xmit_more set. If we do not push those they
> 		 * could be left for a long time and cause a netdev watchdog.
> 		 */
> 		if (txq2->xmit_more_available)
> 			efx_nic_push_buffers(txq2);
> 
> But I'm not sure whether the situation in r8169 is comparable. The following patch
> implements what I mentioned earlier: It leaves all other 5.0 changes in place and
> effectively reverts 2e6eedb4813e34d8d84ac0eb3afb668966f3f356. Would be great if
> you could give it a try.

Hi Heiner,

It took some time to respond because I had another issue with 5.0 which interfered with proper testing,
but fortunately I could pinpoint it without doing a full bisect and revert that commit for further testing.

So there is still time left and I could do a more proper run with your patch below.
Unfortunately I still get a splat (see below) with it, although I'm not sure it is related;
I just can't tell.

Perhaps Linus, as Oops-decoding guru, has an idea?

--
Sander

[39041.689007] dpkg-deb: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
[39041.689016] CPU: 4 PID: 14078 Comm: dpkg-deb Not tainted 5.0.0-rc5-20190209-kallweit+ #1
[39041.689017] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
[39041.689018] Call Trace:
[39041.689022]  <IRQ>
[39041.689030]  dump_stack+0x5c/0x7b
[39041.689033]  warn_alloc+0x103/0x190
[39041.689036]  __alloc_pages_nodemask+0xe3d/0xe80
[39041.689039]  ? ip_rcv+0x48/0xc0
[39041.689040]  ? ip_rcv_finish_core.isra.0+0x360/0x360
[39041.689042]  page_frag_alloc+0x117/0x150
[39041.689044]  __napi_alloc_skb+0x83/0xd0
[39041.689048]  rtl8169_poll+0x210/0x640
[39041.689051]  net_rx_action+0x23d/0x370
[39041.689054]  __do_softirq+0xed/0x229
[39041.689058]  irq_exit+0xb7/0xc0
[39041.689061]  xen_evtchn_do_upcall+0x27/0x40
[39041.689063]  xen_do_hypervisor_callback+0x29/0x40
[39041.689064]  </IRQ>
[39041.689066] RIP: e030:_atomic_dec_and_lock+0x2/0x40
[39041.689068] Code: ff 39 05 c5 c1 c9 00 89 c7 89 c6 76 0f 83 eb 01 83 fb ff 75 d9 5b 89 f8 5d 41 5c c3 0f 0b 90 90 90 90 90 90 90 90 90 90 8b 07 <83> f8 01 74 0c 8d 50 ff f0 0f b1 17 75 f2 31 c0 c3 55 53 48 89 fb
[39041.689069] RSP: e02b:ffffc9000705b990 EFLAGS: 00000246
[39041.689071] RAX: 0000000000000001 RBX: ffff888017082640 RCX: 0000000000000000
[39041.689071] RDX: 0000000000000000 RSI: ffff8880170826c0 RDI: ffff888017082788
[39041.689072] RBP: ffff8880170826c0 R08: ffffc9000705bb00 R09: ffffc9000705bb00
[39041.689073] R10: ffffc9000705bb58 R11: ffff88807fc17000 R12: ffff888017082788
[39041.689073] R13: ffff88806cc8cf58 R14: ffff888017082640 R15: ffff888009990240
[39041.689077]  iput+0x63/0x1a0
[39041.689079]  __dentry_kill+0xc5/0x170
[39041.689080]  shrink_dentry_list+0x93/0x1c0
[39041.689082]  prune_dcache_sb+0x4d/0x70
[39041.689084]  super_cache_scan+0x104/0x190
[39041.689087]  do_shrink_slab+0x12c/0x1e0
[39041.689089]  shrink_slab+0xdf/0x2b0
[39041.689091]  shrink_node+0x158/0x470
[39041.689093]  do_try_to_free_pages+0xd1/0x380
[39041.689095]  try_to_free_pages+0xb2/0xe0
[39041.689097]  __alloc_pages_nodemask+0x603/0xe80
[39041.689099]  ? __pagevec_lru_add_fn+0x1b1/0x290
[39041.689102]  alloc_pages_vma+0x7b/0x1c0
[39041.689106]  __handle_mm_fault+0xdb3/0x1060
[39041.689109]  ? xen_mc_flush+0xc0/0x190
[39041.689110]  handle_mm_fault+0xf8/0x200
[39041.689113]  __do_page_fault+0x231/0x4a0
[39041.689115]  ? page_fault+0x8/0x30
[39041.689116]  page_fault+0x1e/0x30
[39041.689118] RIP: e033:0x7fb9851d012e
[39041.689119] Code: 29 c2 48 3b 15 7b a3 31 00 0f 87 af 00 00 00 0f 10 01 0f 10 49 f0 0f 10 51 e0 0f 10 59 d0 48 83 e9 40 48 83 ea 40 41 0f 29 01 <41> 0f 29 49 f0 41 0f 29 51 e0 41 0f 29 59 d0 49 83 e9 40 48 83 fa
[39041.689119] RSP: e02b:00007fb958b36d38 EFLAGS: 00010202
[39041.689120] RAX: 00007fb97a617f0e RBX: 000000000000f004 RCX: 00007fb948008be3
[39041.689121] RDX: 00000000000080c2 RSI: 00007fb948000b31 RDI: 00007fb97a617f0e
[39041.689122] RBP: 00000000000ff062 R08: 0000000000000002 R09: 00007fb97a620000
[39041.689123] R10: 0000000000000004 R11: 00007fb97a626f02 R12: 000000000000f005
[39041.689123] R13: 00007fb948000b28 R14: 0000562d76b63710 R15: 0000000000000003
[39041.689125] Mem-Info:
[39041.689130] active_anon:78775 inactive_anon:49211 isolated_anon:0
                active_file:106409 inactive_file:107531 isolated_file:0
                unevictable:552 dirty:175 writeback:0 unstable:0
                slab_reclaimable:13739 slab_unreclaimable:16454
                mapped:1605 shmem:23 pagetables:2900 bounce:0
                free:3681 free_pcp:935 free_cma:0
[39041.689132] Node 0 active_anon:315100kB inactive_anon:196844kB active_file:425636kB inactive_file:430124kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:6420kB dirty:700kB writeback:0kB shmem:92kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[39041.689133] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:0kB inactive_anon:7832kB active_file:472kB inactive_file:4kB unevictable:0kB writepending:0kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:12kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[39041.689136] lowmem_reserve[]: 0 1865 1865 1865
[39041.689138] Node 0 DMA32 free:7244kB min:19472kB low:21380kB high:23288kB active_anon:315360kB inactive_anon:188144kB active_file:425164kB inactive_file:430120kB unevictable:2208kB writepending:700kB present:2080768kB managed:1674968kB mlocked:2208kB kernel_stack:9632kB pagetables:11588kB bounce:0kB free_pcp:3740kB local_pcp:528kB free_cma:0kB
[39041.689140] lowmem_reserve[]: 0 0 0 0
[39041.689142] Node 0 DMA: 6*4kB (UME) 6*8kB (UE) 7*16kB (UME) 6*32kB (ME) 5*64kB (UME) 3*128kB (UE) 5*256kB (UME) 2*512kB (ME) 2*1024kB (UE) 1*2048kB (M) 0*4096kB = 7480kB
[39041.689148] Node 0 DMA32: 69*4kB (U) 315*8kB (UE) 138*16kB (UE) 70*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7244kB
[39041.689153] 214701 total pagecache pages
[39041.689155] 273 pages in swap cache
[39041.689156] Swap cache stats: add 100978, delete 100706, find 1158/1257
[39041.689156] Free swap  = 3790588kB
[39041.689157] Total swap = 4194300kB
[39041.689157] 524181 pages RAM
[39041.689158] 0 pages HighMem/MovableOnly
[39041.689158] 101471 pages reserved
[39041.689159] 0 pages cma reserved
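
A note on reading the Mem-Info dump above: both zones report free memory at
or below what an atomic order-0 request would need, which would explain the
allocation failure independently of the r8169 changes. The sketch below
redoes that arithmetic. It assumes 4 kB pages and assumes the allocator
lowers the min watermark by one half and then by a further quarter for
atomic requests; toy_watermark_ok() and KB_TO_PAGES() are hypothetical
helpers for illustration, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

#define KB_TO_PAGES(kb) ((kb) / 4)	/* assumes 4 kB pages */

/* Hypothetical, simplified order-0 watermark check; all values in pages. */
static bool toy_watermark_ok(long free_pages, long min_pages,
			     long lowmem_reserve, bool atomic)
{
	long min = min_pages;

	if (atomic) {
		min -= min / 2;	/* assumed relief for high-priority requests */
		min -= min / 4;	/* assumed extra relief for atomic requests */
	}
	/* The request passes only if free memory stays strictly above the
	 * lowered watermark plus the reserve kept for lower zones. */
	return free_pages > min + lowmem_reserve;
}

/* Plugs in the values printed in the log above. */
static void toy_check_log_values(void)
{
	/* Node 0 DMA32: free:7244kB min:19472kB, lowmem_reserve 0 */
	assert(!toy_watermark_ok(KB_TO_PAGES(7244), KB_TO_PAGES(19472),
				 0, true));
	/* Node 0 DMA: free:7480kB min:44kB, lowmem_reserve 1865 pages */
	assert(!toy_watermark_ok(KB_TO_PAGES(7480), KB_TO_PAGES(44),
				 1865, true));
}
```

Under these assumptions neither zone can satisfy the request, so the failure
in __napi_alloc_skb() looks like plain memory pressure in the dom0 rather
than a direct consequence of the patch under test.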





> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index e8a112149..3cca2ffb2 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -6192,7 +6192,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>  	struct device *d = tp_to_dev(tp);
>  	dma_addr_t mapping;
>  	u32 opts[2], len;
> -	bool stop_queue;
>  	int frags;
>  
>  	if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
> @@ -6234,6 +6233,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>  
>  	txd->opts2 = cpu_to_le32(opts[1]);
>  
> +	netdev_sent_queue(dev, skb->len);
> +
>  	skb_tx_timestamp(skb);
>  
>  	/* Force memory writes to complete before releasing descriptor */
> @@ -6246,14 +6247,14 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>  
>  	tp->cur_tx += frags + 1;
>  
> -	stop_queue = !rtl_tx_slots_avail(tp, MAX_SKB_FRAGS);
> -	if (unlikely(stop_queue))
> -		netif_stop_queue(dev);
> -
> -	if (__netdev_sent_queue(dev, skb->len, skb->xmit_more))
> -		RTL_W8(tp, TxPoll, NPQ);
> +	RTL_W8(tp, TxPoll, NPQ);
>  
> -	if (unlikely(stop_queue)) {
> +	if (!rtl_tx_slots_avail(tp, MAX_SKB_FRAGS)) {
> +		/* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
> +		 * not miss a ring update when it notices a stopped queue.
> +		 */
> +		smp_wmb();
> +		netif_stop_queue(dev);
>  		/* Sync with rtl_tx:
>  		 * - publish queue status and cur_tx ring index (write barrier)
>  		 * - refresh dirty_tx ring index (read barrier).
> 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-10  9:16                       ` Sander Eikelenboom
@ 2019-02-10  9:32                         ` Heiner Kallweit
  2019-02-10 11:44                         ` Heiner Kallweit
  1 sibling, 0 replies; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-10  9:32 UTC (permalink / raw)
  To: Sander Eikelenboom, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 10.02.2019 10:16, Sander Eikelenboom wrote:
> On 09/02/2019 12:50, Heiner Kallweit wrote:
>> On 09.02.2019 11:07, Sander Eikelenboom wrote:
>>> On 09/02/2019 10:59, Heiner Kallweit wrote:
>>>> On 09.02.2019 10:34, Sander Eikelenboom wrote:
>>>>> On 09/02/2019 10:02, Heiner Kallweit wrote:
>>>>>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>>>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>>>>>> L.S.,
>>>>>>>>>>>>>
>>>>>>>>>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen I hit the nasty splat below,
>>>>>>>>>>>>> that I haven't encountered with Linux 4.20.x.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>>>>>
>>>>>>>>>>> Hmm, I did some digging and I think:
>>>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>>>>>
>>>>>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>>>>>
>>>>>>>>> Yep, the box is already quite contended with the Xen VMs and, if I remember correctly, it occurred while compiling a kernel
>>>>>>>>> on the host.
>>>>>>>>>
>>>>>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>>>>>> as author of the underlying changes.
>>>>>>>>>
>>>>>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>>>>>
>>>>>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>>>>>> test also with only 
>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>> removed.
>>>>>>>>
>>>>>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>>>>>
>>>>>>>> Sure, thanks.
>>>>>>>>
>>>>>>>>> BTW am i correct these patches are merely optimizations ?
>>>>>>>>
>>>>>>>> Yes
>>>>>>>>
>>>>>>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>>>>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>>>>>>
>>>>>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>>>>>
>>>>>>>
>>>>>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>>>>>
>>>>>>> You could try :
>>>>>>>
>>>>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>>>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>>>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>>>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>         dma_addr_t mapping;
>>>>>>>         u32 opts[2], len;
>>>>>>>         bool stop_queue;
>>>>>>> +       bool door_bell;
>>>>>>>         int frags;
>>>>>>>  
>>>>>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>>>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>         /* Force memory writes to complete before releasing descriptor */
>>>>>>>         dma_wmb();
>>>>>>>  
>>>>>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>>>>>> +
>>>>>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>>>>>  
>>>>>>>         /* Force all memory writes to complete before notifying device */
>>>>>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>         if (unlikely(stop_queue))
>>>>>>>                 netif_stop_queue(dev);
>>>>>>>  
>>>>>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>>>>>> +       if (door_bell) {
>>>>>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>>>>>                 mmiowb();
>>>>>>>         }
>>>>>>>
>>>>>> Thanks a lot for checking and for the proposed fix.
>>>>>> Sander, can you try with this patch on top of 5.0-rc5 w/o removing the two commits?
>>>>>
>>>>> I have done that already during the night .. the results:
>>>>> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit which causes hitting the BUG_ON in lib/dynamic_queue_limits.c.
>>>>>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
>>>>>
>>>>> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>>>>>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c; it has run this night and I have done a few kernel compiles
>>>>>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>>>>>   compile kernels in the same way as I do now. The only thing is that I can't say whether that is due to this change, or whether it's again something else.
>>>>>   Which makes me somewhat inclined to go testing the complete revert some more and see if I can trigger the queue timeout on that or not.
>>>>>
>>>>>   If I can, it is a separate issue.
>>>>>   If I can't, then even with the patch it still looks like a regression in comparison with 4.20.x, for which
>>>>>   a revert would be the right thing to do (since as you indicated these are merely optimizations),
>>>>>   which would give us more time for 5.1 to try to solve things on top of the 5.0-release-to-be.
>>>>>   (especially since I seem to still have other issues which need to be sorted out and time is limited)
>>>>>
>>>>>   The timeout in question:
>>>>>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>>>>>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>>>>>         [28336.893358] Modules linked in:
>>>>>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>>>>>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>>>>>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>>>>>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>>>>>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>>>>>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>>>>>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>>>>>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>>>>>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>>>>>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>>>>>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>>>>>         [28337.090052] Call Trace:
>>>>>         [28337.103615]  <IRQ>
>>>>>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>>>>>         [28337.128905]  call_timer_fn+0x19/0x90
>>>>>         [28337.141892]  expire_timers+0x8b/0xa0
>>>>>         [28337.153354]  run_timer_softirq+0x7e/0x160
>>>>>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>>>>>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>>>>>         [28337.186734]  __do_softirq+0xed/0x229
>>>>>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>>>>>         [28337.207822]  irq_exit+0xb7/0xc0
>>>>>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>>>>>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>>>>>         [28337.241261]  </IRQ>
>>>>>         [28337.253283] RIP: e033:0xff7e62
>>>>>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>>>>>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>>>>>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>>>>>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>>>>>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>>>>>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>>>>>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>>>>>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
>>>>>
>>>> Thanks for your efforts. As usual this tx timeout trace says basically nothing except
>>>> "timeout" and root cause could be anything. Earlier you reported a memory allocation error,
>>>> did that occur again?
>>>> If we decide to revert, I'd leave removal of the memory barriers in (as it doesn't seem to
>>>> contribute to the issue) and just submit a patch to effectively revert
>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
>>>
>>> I can't say if that is correct, because i haven't tested that.
>>>
>>> Another thing I could test is:
>>>  - putting all the r8169 patches (and prerequisites) that went into 5.0 
>>>    up to bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3, onto 4.20.7 and see what that does.
>>>    If that would be feasible (not too many needed prerequisites out of r8169) and if 
>>>    you could spare me some time and prep such a branch somewhere so i can pull and compile that,
>>>    that would be great.
>>>
>>
>> Unfortunately there's quite a number of changes. Regarding __netdev_tx_sent_queue()
>> and watchdog timeout I found the following comment in drivers/net/ethernet/sfc/tx.c,
>> efx_enqueue_skb():
>>
>> 	if (__netdev_tx_sent_queue(tx_queue->core_txq, skb_len, xmit_more)) {
>> 		struct efx_tx_queue *txq2 = efx_tx_queue_partner(tx_queue);
>>
>> 		/* There could be packets left on the partner queue if those
>> 		 * SKBs had skb->xmit_more set. If we do not push those they
>> 		 * could be left for a long time and cause a netdev watchdog.
>> 		 */
>> 		if (txq2->xmit_more_available)
>> 			efx_nic_push_buffers(txq2);
>>
>> But I'm not sure whether the situation in r8169 is comparable. The following patch
>> implements what I mentioned earlier: It leaves all other 5.0 changes in place and
>> effectively reverts 2e6eedb4813e34d8d84ac0eb3afb668966f3f356. Would be great if
>> you could give it a try.
> 
> Hi Heiner,
> 
> It took some time to respond, because I had another issue with 5.0 which interfered with proper testing,
> but fortunately I could pinpoint it without doing a full bisect and revert that commit for further testing.
> 
> So there is still time left and I could do a more proper run with your patch below.
> Unfortunately I still get a splat (see below) with this, although I can't tell
> whether it is related.
> 
This is the nasty memory allocation problem in napi_alloc_skb() you've seen before.
If you can't reproduce it with 4.20 then a potential cause may be here:
5317d5c6d47e ("r8169: use napi_consume_skb where possible")
Even at a second glance I don't see how the usage of both calls
could be wrong, so the root cause may be somewhere deeper inside.
Still, it would be worth trying with the mentioned commit reverted.

> Perhaps Linus as Oops-decoding-guru has an idea ?
> 
> --
> Sander
> 
Heiner

> [39041.689007] dpkg-deb: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
> [39041.689016] CPU: 4 PID: 14078 Comm: dpkg-deb Not tainted 5.0.0-rc5-20190209-kallweit+ #1
> [39041.689017] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
> [39041.689018] Call Trace:
> [39041.689022]  <IRQ>
> [39041.689030]  dump_stack+0x5c/0x7b
> [39041.689033]  warn_alloc+0x103/0x190
> [39041.689036]  __alloc_pages_nodemask+0xe3d/0xe80
> [39041.689039]  ? ip_rcv+0x48/0xc0
> [39041.689040]  ? ip_rcv_finish_core.isra.0+0x360/0x360
> [39041.689042]  page_frag_alloc+0x117/0x150
> [39041.689044]  __napi_alloc_skb+0x83/0xd0
> [39041.689048]  rtl8169_poll+0x210/0x640
> [39041.689051]  net_rx_action+0x23d/0x370
> [39041.689054]  __do_softirq+0xed/0x229
> [39041.689058]  irq_exit+0xb7/0xc0
> [39041.689061]  xen_evtchn_do_upcall+0x27/0x40
> [39041.689063]  xen_do_hypervisor_callback+0x29/0x40
> [39041.689064]  </IRQ>
> [39041.689066] RIP: e030:_atomic_dec_and_lock+0x2/0x40
> [39041.689068] Code: ff 39 05 c5 c1 c9 00 89 c7 89 c6 76 0f 83 eb 01 83 fb ff 75 d9 5b 89 f8 5d 41 5c c3 0f 0b 90 90 90 90 90 90 90 90 90 90 8b 07 <83> f8 01 74 0c 8d 50 ff f0 0f b1 17 75 f2 31 c0 c3 55 53 48 89 fb
> [39041.689069] RSP: e02b:ffffc9000705b990 EFLAGS: 00000246
> [39041.689071] RAX: 0000000000000001 RBX: ffff888017082640 RCX: 0000000000000000
> [39041.689071] RDX: 0000000000000000 RSI: ffff8880170826c0 RDI: ffff888017082788
> [39041.689072] RBP: ffff8880170826c0 R08: ffffc9000705bb00 R09: ffffc9000705bb00
> [39041.689073] R10: ffffc9000705bb58 R11: ffff88807fc17000 R12: ffff888017082788
> [39041.689073] R13: ffff88806cc8cf58 R14: ffff888017082640 R15: ffff888009990240
> [39041.689077]  iput+0x63/0x1a0
> [39041.689079]  __dentry_kill+0xc5/0x170
> [39041.689080]  shrink_dentry_list+0x93/0x1c0
> [39041.689082]  prune_dcache_sb+0x4d/0x70
> [39041.689084]  super_cache_scan+0x104/0x190
> [39041.689087]  do_shrink_slab+0x12c/0x1e0
> [39041.689089]  shrink_slab+0xdf/0x2b0
> [39041.689091]  shrink_node+0x158/0x470
> [39041.689093]  do_try_to_free_pages+0xd1/0x380
> [39041.689095]  try_to_free_pages+0xb2/0xe0
> [39041.689097]  __alloc_pages_nodemask+0x603/0xe80
> [39041.689099]  ? __pagevec_lru_add_fn+0x1b1/0x290
> [39041.689102]  alloc_pages_vma+0x7b/0x1c0
> [39041.689106]  __handle_mm_fault+0xdb3/0x1060
> [39041.689109]  ? xen_mc_flush+0xc0/0x190
> [39041.689110]  handle_mm_fault+0xf8/0x200
> [39041.689113]  __do_page_fault+0x231/0x4a0
> [39041.689115]  ? page_fault+0x8/0x30
> [39041.689116]  page_fault+0x1e/0x30
> [39041.689118] RIP: e033:0x7fb9851d012e
> [39041.689119] Code: 29 c2 48 3b 15 7b a3 31 00 0f 87 af 00 00 00 0f 10 01 0f 10 49 f0 0f 10 51 e0 0f 10 59 d0 48 83 e9 40 48 83 ea 40 41 0f 29 01 <41> 0f 29 49 f0 41 0f 29 51 e0 41 0f 29 59 d0 49 83 e9 40 48 83 fa
> [39041.689119] RSP: e02b:00007fb958b36d38 EFLAGS: 00010202
> [39041.689120] RAX: 00007fb97a617f0e RBX: 000000000000f004 RCX: 00007fb948008be3
> [39041.689121] RDX: 00000000000080c2 RSI: 00007fb948000b31 RDI: 00007fb97a617f0e
> [39041.689122] RBP: 00000000000ff062 R08: 0000000000000002 R09: 00007fb97a620000
> [39041.689123] R10: 0000000000000004 R11: 00007fb97a626f02 R12: 000000000000f005
> [39041.689123] R13: 00007fb948000b28 R14: 0000562d76b63710 R15: 0000000000000003
> [39041.689125] Mem-Info:
> [39041.689130] active_anon:78775 inactive_anon:49211 isolated_anon:0
>                 active_file:106409 inactive_file:107531 isolated_file:0
>                 unevictable:552 dirty:175 writeback:0 unstable:0
>                 slab_reclaimable:13739 slab_unreclaimable:16454
>                 mapped:1605 shmem:23 pagetables:2900 bounce:0
>                 free:3681 free_pcp:935 free_cma:0
> [39041.689132] Node 0 active_anon:315100kB inactive_anon:196844kB active_file:425636kB inactive_file:430124kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:6420kB dirty:700kB writeback:0kB shmem:92kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> [39041.689133] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:0kB inactive_anon:7832kB active_file:472kB inactive_file:4kB unevictable:0kB writepending:0kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:12kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [39041.689136] lowmem_reserve[]: 0 1865 1865 1865
> [39041.689138] Node 0 DMA32 free:7244kB min:19472kB low:21380kB high:23288kB active_anon:315360kB inactive_anon:188144kB active_file:425164kB inactive_file:430120kB unevictable:2208kB writepending:700kB present:2080768kB managed:1674968kB mlocked:2208kB kernel_stack:9632kB pagetables:11588kB bounce:0kB free_pcp:3740kB local_pcp:528kB free_cma:0kB
> [39041.689140] lowmem_reserve[]: 0 0 0 0
> [39041.689142] Node 0 DMA: 6*4kB (UME) 6*8kB (UE) 7*16kB (UME) 6*32kB (ME) 5*64kB (UME) 3*128kB (UE) 5*256kB (UME) 2*512kB (ME) 2*1024kB (UE) 1*2048kB (M) 0*4096kB = 7480kB
> [39041.689148] Node 0 DMA32: 69*4kB (U) 315*8kB (UE) 138*16kB (UE) 70*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7244kB
> [39041.689153] 214701 total pagecache pages
> [39041.689155] 273 pages in swap cache
> [39041.689156] Swap cache stats: add 100978, delete 100706, find 1158/1257
> [39041.689156] Free swap  = 3790588kB
> [39041.689157] Total swap = 4194300kB
> [39041.689157] 524181 pages RAM
> [39041.689158] 0 pages HighMem/MovableOnly
> [39041.689158] 101471 pages reserved
> [39041.689159] 0 pages cma reserved
> 
> 
> 
> 
> 
>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>> index e8a112149..3cca2ffb2 100644
>> --- a/drivers/net/ethernet/realtek/r8169.c
>> +++ b/drivers/net/ethernet/realtek/r8169.c
>> @@ -6192,7 +6192,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>  	struct device *d = tp_to_dev(tp);
>>  	dma_addr_t mapping;
>>  	u32 opts[2], len;
>> -	bool stop_queue;
>>  	int frags;
>>  
>>  	if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>> @@ -6234,6 +6233,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>  
>>  	txd->opts2 = cpu_to_le32(opts[1]);
>>  
>> +	netdev_sent_queue(dev, skb->len);
>> +
>>  	skb_tx_timestamp(skb);
>>  
>>  	/* Force memory writes to complete before releasing descriptor */
>> @@ -6246,14 +6247,14 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>  
>>  	tp->cur_tx += frags + 1;
>>  
>> -	stop_queue = !rtl_tx_slots_avail(tp, MAX_SKB_FRAGS);
>> -	if (unlikely(stop_queue))
>> -		netif_stop_queue(dev);
>> -
>> -	if (__netdev_sent_queue(dev, skb->len, skb->xmit_more))
>> -		RTL_W8(tp, TxPoll, NPQ);
>> +	RTL_W8(tp, TxPoll, NPQ);
>>  
>> -	if (unlikely(stop_queue)) {
>> +	if (!rtl_tx_slots_avail(tp, MAX_SKB_FRAGS)) {
>> +		/* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
>> +		 * not miss a ring update when it notices a stopped queue.
>> +		 */
>> +		smp_wmb();
>> +		netif_stop_queue(dev);
>>  		/* Sync with rtl_tx:
>>  		 * - publish queue status and cur_tx ring index (write barrier)
>>  		 * - refresh dirty_tx ring index (read barrier).
>>
> 
> 



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-10  9:16                       ` Sander Eikelenboom
  2019-02-10  9:32                         ` Heiner Kallweit
@ 2019-02-10 11:44                         ` Heiner Kallweit
  2019-02-10 13:05                           ` Sander Eikelenboom
  2019-02-10 15:50                           ` Sander Eikelenboom
  1 sibling, 2 replies; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-10 11:44 UTC (permalink / raw)
  To: Sander Eikelenboom, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 10.02.2019 10:16, Sander Eikelenboom wrote:
> On 09/02/2019 12:50, Heiner Kallweit wrote:
>> On 09.02.2019 11:07, Sander Eikelenboom wrote:
>>> On 09/02/2019 10:59, Heiner Kallweit wrote:
>>>> On 09.02.2019 10:34, Sander Eikelenboom wrote:
>>>>> On 09/02/2019 10:02, Heiner Kallweit wrote:
>>>>>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>>>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>>>>>> L.S.,
>>>>>>>>>>>>>
>>>>>>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen I got the nasty splat below,
>>>>>>>>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>>>>>
>>>>>>>>>>> Hmm, I did some digging and I think:
>>>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>>>>>
>>>>>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>>>>>
>>>>>>>>> Yep, the box is already quite contended with the Xen VMs and if I remember correctly it occurred while compiling a kernel
>>>>>>>>> on the host.
>>>>>>>>>
>>>>>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>>>>>> as author of the underlying changes.
>>>>>>>>>
>>>>>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>>>>>
>>>>>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>>>>>> test also with only 
>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>> removed.
>>>>>>>>
>>>>>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>>>>>
>>>>>>>> Sure, thanks.
>>>>>>>>
>>>>>>>>> BTW am i correct these patches are merely optimizations ?
>>>>>>>>
>>>>>>>> Yes
>>>>>>>>
>>>>>>>>> If so, and assuming they revert cleanly, perhaps it should be considered at this point in the RCs
>>>>>>>>> to revert them for 5.0 and try again for 5.1?
>>>>>>>>>
>>>>>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>>>>>
>>>>>>>
>>>>>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>>>>>
>>>>>>> You could try :
>>>>>>>
>>>>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>>>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>>>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>>>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>         dma_addr_t mapping;
>>>>>>>         u32 opts[2], len;
>>>>>>>         bool stop_queue;
>>>>>>> +       bool door_bell;
>>>>>>>         int frags;
>>>>>>>  
>>>>>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>>>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>         /* Force memory writes to complete before releasing descriptor */
>>>>>>>         dma_wmb();
>>>>>>>  
>>>>>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>>>>>> +
>>>>>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>>>>>  
>>>>>>>         /* Force all memory writes to complete before notifying device */
>>>>>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>         if (unlikely(stop_queue))
>>>>>>>                 netif_stop_queue(dev);
>>>>>>>  
>>>>>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>>>>>> +       if (door_bell) {
>>>>>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>>>>>                 mmiowb();
>>>>>>>         }
>>>>>>>
>>>>>> Thanks a lot for checking and for the proposed fix.
>>>>>> Sander, can you try with this patch on top of 5.0-rc5 w/o removing the two commits?
>>>>>
>>>>> I have done that already during the night .. the results:
>>>>> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit which causes hitting the BUG_ON in lib/dynamic_queue_limits.c.
>>>>>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
>>>>>
>>>>> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>>>>>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c; it has run this night and I have done a few kernel compiles
>>>>>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>>>>>   compile kernels in the same way as I do now. The only thing is that I can't say whether that is due to this change, or whether it's again something else.
>>>>>   Which makes me somewhat inclined to go testing the complete revert some more and see if I can trigger the queue timeout on that or not.
>>>>>
>>>>>   If I can, it is a separate issue.
>>>>>   If I can't, then even with the patch it still looks like a regression in comparison with 4.20.x, for which
>>>>>   a revert would be the right thing to do (since as you indicated these are merely optimizations),
>>>>>   which would give us more time for 5.1 to try to solve things on top of the 5.0-release-to-be.
>>>>>   (especially since I seem to still have other issues which need to be sorted out and time is limited)
>>>>>
>>>>>   The timeout in question:
>>>>>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>>>>>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>>>>>         [28336.893358] Modules linked in:
>>>>>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>>>>>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>>>>>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>>>>>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>>>>>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>>>>>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>>>>>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>>>>>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>>>>>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>>>>>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>>>>>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>>>>>         [28337.090052] Call Trace:
>>>>>         [28337.103615]  <IRQ>
>>>>>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>>>>>         [28337.128905]  call_timer_fn+0x19/0x90
>>>>>         [28337.141892]  expire_timers+0x8b/0xa0
>>>>>         [28337.153354]  run_timer_softirq+0x7e/0x160
>>>>>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>>>>>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>>>>>         [28337.186734]  __do_softirq+0xed/0x229
>>>>>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>>>>>         [28337.207822]  irq_exit+0xb7/0xc0
>>>>>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>>>>>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>>>>>         [28337.241261]  </IRQ>
>>>>>         [28337.253283] RIP: e033:0xff7e62
>>>>>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>>>>>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>>>>>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>>>>>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>>>>>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>>>>>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>>>>>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>>>>>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
>>>>>
>>>> Thanks for your efforts. As usual this tx timeout trace says basically nothing except
>>>> "timeout" and root cause could be anything. Earlier you reported a memory allocation error,
>>>> did that occur again?
>>>> If we decide to revert, I'd leave removal of the memory barriers in (as it doesn't seem to
>>>> contribute to the issue) and just submit a patch to effectively revert
>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
>>>
>>> I can't say if that is correct, because i haven't tested that.
>>>
>>> Another thing I could test is:
>>>  - putting all the r8169 patches (and prerequisites) that went into 5.0 
>>>    up to bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3, onto 4.20.7 and see what that does.
>>>    If that would be feasible (not too many needed prerequisites out of r8169) and if 
>>>    you could spare me some time and prep such a branch somewhere so i can pull and compile that,
>>>    that would be great.
>>>
>>
>> Unfortunately there's quite a number of changes. Regarding __netdev_tx_sent_queue()
>> and watchdog timeout I found the following comment in drivers/net/ethernet/sfc/tx.c,
>> efx_enqueue_skb():
>>
>> 	if (__netdev_tx_sent_queue(tx_queue->core_txq, skb_len, xmit_more)) {
>> 		struct efx_tx_queue *txq2 = efx_tx_queue_partner(tx_queue);
>>
>> 		/* There could be packets left on the partner queue if those
>> 		 * SKBs had skb->xmit_more set. If we do not push those they
>> 		 * could be left for a long time and cause a netdev watchdog.
>> 		 */
>> 		if (txq2->xmit_more_available)
>> 			efx_nic_push_buffers(txq2);
>>
>> But I'm not sure whether the situation in r8169 is comparable. The following patch
>> implements what I mentioned earlier: It leaves all other 5.0 changes in place and
>> effectively reverts 2e6eedb4813e34d8d84ac0eb3afb668966f3f356. Would be great if
>> you could give it a try.
> 
> Hi Heiner,
> 
> It took some time to respond, because I had another issue with 5.0 which interfered with proper testing,
> but fortunately I could pinpoint it without doing a full bisect and revert that commit for further testing.
> 
> So there is still time left and I could do a more proper run with your patch below.
> Unfortunately I still get a splat (see below) with this, although I can't tell
> whether it is related.
> 
I checked further and there are a handful of network drivers using __napi_alloc_skb() with __GFP_NOWARN,
maybe to avoid such splats. Did the splat impact functionality? Judging from the code in r8169, the
affected packet would just be dropped.

> Perhaps Linus as Oops-decoding-guru has an idea ?
> 
> --
> Sander
> 
> [39041.689007] dpkg-deb: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
> [39041.689016] CPU: 4 PID: 14078 Comm: dpkg-deb Not tainted 5.0.0-rc5-20190209-kallweit+ #1
> [39041.689017] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
> [39041.689018] Call Trace:
> [39041.689022]  <IRQ>
> [39041.689030]  dump_stack+0x5c/0x7b
> [39041.689033]  warn_alloc+0x103/0x190
> [39041.689036]  __alloc_pages_nodemask+0xe3d/0xe80
> [39041.689039]  ? ip_rcv+0x48/0xc0
> [39041.689040]  ? ip_rcv_finish_core.isra.0+0x360/0x360
> [39041.689042]  page_frag_alloc+0x117/0x150
> [39041.689044]  __napi_alloc_skb+0x83/0xd0
> [39041.689048]  rtl8169_poll+0x210/0x640
> [39041.689051]  net_rx_action+0x23d/0x370
> [39041.689054]  __do_softirq+0xed/0x229
> [39041.689058]  irq_exit+0xb7/0xc0
> [39041.689061]  xen_evtchn_do_upcall+0x27/0x40
> [39041.689063]  xen_do_hypervisor_callback+0x29/0x40
> [39041.689064]  </IRQ>
> [39041.689066] RIP: e030:_atomic_dec_and_lock+0x2/0x40
> [39041.689068] Code: ff 39 05 c5 c1 c9 00 89 c7 89 c6 76 0f 83 eb 01 83 fb ff 75 d9 5b 89 f8 5d 41 5c c3 0f 0b 90 90 90 90 90 90 90 90 90 90 8b 07 <83> f8 01 74 0c 8d 50 ff f0 0f b1 17 75 f2 31 c0 c3 55 53 48 89 fb
> [39041.689069] RSP: e02b:ffffc9000705b990 EFLAGS: 00000246
> [39041.689071] RAX: 0000000000000001 RBX: ffff888017082640 RCX: 0000000000000000
> [39041.689071] RDX: 0000000000000000 RSI: ffff8880170826c0 RDI: ffff888017082788
> [39041.689072] RBP: ffff8880170826c0 R08: ffffc9000705bb00 R09: ffffc9000705bb00
> [39041.689073] R10: ffffc9000705bb58 R11: ffff88807fc17000 R12: ffff888017082788
> [39041.689073] R13: ffff88806cc8cf58 R14: ffff888017082640 R15: ffff888009990240
> [39041.689077]  iput+0x63/0x1a0
> [39041.689079]  __dentry_kill+0xc5/0x170
> [39041.689080]  shrink_dentry_list+0x93/0x1c0
> [39041.689082]  prune_dcache_sb+0x4d/0x70
> [39041.689084]  super_cache_scan+0x104/0x190
> [39041.689087]  do_shrink_slab+0x12c/0x1e0
> [39041.689089]  shrink_slab+0xdf/0x2b0
> [39041.689091]  shrink_node+0x158/0x470
> [39041.689093]  do_try_to_free_pages+0xd1/0x380
> [39041.689095]  try_to_free_pages+0xb2/0xe0
> [39041.689097]  __alloc_pages_nodemask+0x603/0xe80
> [39041.689099]  ? __pagevec_lru_add_fn+0x1b1/0x290
> [39041.689102]  alloc_pages_vma+0x7b/0x1c0
> [39041.689106]  __handle_mm_fault+0xdb3/0x1060
> [39041.689109]  ? xen_mc_flush+0xc0/0x190
> [39041.689110]  handle_mm_fault+0xf8/0x200
> [39041.689113]  __do_page_fault+0x231/0x4a0
> [39041.689115]  ? page_fault+0x8/0x30
> [39041.689116]  page_fault+0x1e/0x30
> [39041.689118] RIP: e033:0x7fb9851d012e
> [39041.689119] Code: 29 c2 48 3b 15 7b a3 31 00 0f 87 af 00 00 00 0f 10 01 0f 10 49 f0 0f 10 51 e0 0f 10 59 d0 48 83 e9 40 48 83 ea 40 41 0f 29 01 <41> 0f 29 49 f0 41 0f 29 51 e0 41 0f 29 59 d0 49 83 e9 40 48 83 fa
> [39041.689119] RSP: e02b:00007fb958b36d38 EFLAGS: 00010202
> [39041.689120] RAX: 00007fb97a617f0e RBX: 000000000000f004 RCX: 00007fb948008be3
> [39041.689121] RDX: 00000000000080c2 RSI: 00007fb948000b31 RDI: 00007fb97a617f0e
> [39041.689122] RBP: 00000000000ff062 R08: 0000000000000002 R09: 00007fb97a620000
> [39041.689123] R10: 0000000000000004 R11: 00007fb97a626f02 R12: 000000000000f005
> [39041.689123] R13: 00007fb948000b28 R14: 0000562d76b63710 R15: 0000000000000003
> [39041.689125] Mem-Info:
> [39041.689130] active_anon:78775 inactive_anon:49211 isolated_anon:0
>                 active_file:106409 inactive_file:107531 isolated_file:0
>                 unevictable:552 dirty:175 writeback:0 unstable:0
>                 slab_reclaimable:13739 slab_unreclaimable:16454
>                 mapped:1605 shmem:23 pagetables:2900 bounce:0
>                 free:3681 free_pcp:935 free_cma:0
> [39041.689132] Node 0 active_anon:315100kB inactive_anon:196844kB active_file:425636kB inactive_file:430124kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:6420kB dirty:700kB writeback:0kB shmem:92kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
> [39041.689133] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:0kB inactive_anon:7832kB active_file:472kB inactive_file:4kB unevictable:0kB writepending:0kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:12kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [39041.689136] lowmem_reserve[]: 0 1865 1865 1865
> [39041.689138] Node 0 DMA32 free:7244kB min:19472kB low:21380kB high:23288kB active_anon:315360kB inactive_anon:188144kB active_file:425164kB inactive_file:430120kB unevictable:2208kB writepending:700kB present:2080768kB managed:1674968kB mlocked:2208kB kernel_stack:9632kB pagetables:11588kB bounce:0kB free_pcp:3740kB local_pcp:528kB free_cma:0kB
> [39041.689140] lowmem_reserve[]: 0 0 0 0
> [39041.689142] Node 0 DMA: 6*4kB (UME) 6*8kB (UE) 7*16kB (UME) 6*32kB (ME) 5*64kB (UME) 3*128kB (UE) 5*256kB (UME) 2*512kB (ME) 2*1024kB (UE) 1*2048kB (M) 0*4096kB = 7480kB
> [39041.689148] Node 0 DMA32: 69*4kB (U) 315*8kB (UE) 138*16kB (UE) 70*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7244kB
> [39041.689153] 214701 total pagecache pages
> [39041.689155] 273 pages in swap cache
> [39041.689156] Swap cache stats: add 100978, delete 100706, find 1158/1257
> [39041.689156] Free swap  = 3790588kB
> [39041.689157] Total swap = 4194300kB
> [39041.689157] 524181 pages RAM
> [39041.689158] 0 pages HighMem/MovableOnly
> [39041.689158] 101471 pages reserved
> [39041.689159] 0 pages cma reserved
> 
> 
> 
> 
> 
>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>> index e8a112149..3cca2ffb2 100644
>> --- a/drivers/net/ethernet/realtek/r8169.c
>> +++ b/drivers/net/ethernet/realtek/r8169.c
>> @@ -6192,7 +6192,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>  	struct device *d = tp_to_dev(tp);
>>  	dma_addr_t mapping;
>>  	u32 opts[2], len;
>> -	bool stop_queue;
>>  	int frags;
>>  
>>  	if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>> @@ -6234,6 +6233,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>  
>>  	txd->opts2 = cpu_to_le32(opts[1]);
>>  
>> +	netdev_sent_queue(dev, skb->len);
>> +
>>  	skb_tx_timestamp(skb);
>>  
>>  	/* Force memory writes to complete before releasing descriptor */
>> @@ -6246,14 +6247,14 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>  
>>  	tp->cur_tx += frags + 1;
>>  
>> -	stop_queue = !rtl_tx_slots_avail(tp, MAX_SKB_FRAGS);
>> -	if (unlikely(stop_queue))
>> -		netif_stop_queue(dev);
>> -
>> -	if (__netdev_sent_queue(dev, skb->len, skb->xmit_more))
>> -		RTL_W8(tp, TxPoll, NPQ);
>> +	RTL_W8(tp, TxPoll, NPQ);
>>  
>> -	if (unlikely(stop_queue)) {
>> +	if (!rtl_tx_slots_avail(tp, MAX_SKB_FRAGS)) {
>> +		/* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
>> +		 * not miss a ring update when it notices a stopped queue.
>> +		 */
>> +		smp_wmb();
>> +		netif_stop_queue(dev);
>>  		/* Sync with rtl_tx:
>>  		 * - publish queue status and cur_tx ring index (write barrier)
>>  		 * - refresh dirty_tx ring index (read barrier).
>>
> 
> 



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-10 11:44                         ` Heiner Kallweit
@ 2019-02-10 13:05                           ` Sander Eikelenboom
  2019-02-10 13:57                             ` Heiner Kallweit
  2019-02-10 15:50                           ` Sander Eikelenboom
  1 sibling, 1 reply; 20+ messages in thread
From: Sander Eikelenboom @ 2019-02-10 13:05 UTC (permalink / raw)
  To: Heiner Kallweit, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 10/02/2019 12:44, Heiner Kallweit wrote:
> On 10.02.2019 10:16, Sander Eikelenboom wrote:
>> On 09/02/2019 12:50, Heiner Kallweit wrote:
>>> On 09.02.2019 11:07, Sander Eikelenboom wrote:
>>>> On 09/02/2019 10:59, Heiner Kallweit wrote:
>>>>> On 09.02.2019 10:34, Sander Eikelenboom wrote:
>>>>>> On 09/02/2019 10:02, Heiner Kallweit wrote:
>>>>>>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>>>>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>>>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>>>>>>> L.S.,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen i the nasty splat below, 
>>>>>>>>>>>>>> that I haven encountered with Linux 4.20.x.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>>>>>>
>>>>>>>>>>>> Hmm i did some diging and i think:
>>>>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>>>>>>
>>>>>>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>>>>>>
>>>>>>>>>> Yep, the box is already quite contented with the Xen VM's and if I remember correctly it occurred while kernel compiling
>>>>>>>>>> on the host.
>>>>>>>>>>
>>>>>>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>>>>>>> as author of the underlying changes.
>>>>>>>>>>
>>>>>>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>>>>>>
>>>>>>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>>>>>>> test also with only 
>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>> removed.
>>>>>>>>>
>>>>>>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>>>>>>
>>>>>>>>> Sure, thanks.
>>>>>>>>>
>>>>>>>>>> BTW am i correct these patches are merely optimizations ?
>>>>>>>>>
>>>>>>>>> Yes
>>>>>>>>>
>>>>>>>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>>>>>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>>>>>>>
>>>>>>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>>>>>>
>>>>>>>> You could try :
>>>>>>>>
>>>>>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>>>>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>>>>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>>>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>>>>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>>         dma_addr_t mapping;
>>>>>>>>         u32 opts[2], len;
>>>>>>>>         bool stop_queue;
>>>>>>>> +       bool door_bell;
>>>>>>>>         int frags;
>>>>>>>>  
>>>>>>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>>>>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>>         /* Force memory writes to complete before releasing descriptor */
>>>>>>>>         dma_wmb();
>>>>>>>>  
>>>>>>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>>>>>>> +
>>>>>>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>>>>>>  
>>>>>>>>         /* Force all memory writes to complete before notifying device */
>>>>>>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>>         if (unlikely(stop_queue))
>>>>>>>>                 netif_stop_queue(dev);
>>>>>>>>  
>>>>>>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>>>>>>> +       if (door_bell) {
>>>>>>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>>>>>>                 mmiowb();
>>>>>>>>         }
>>>>>>>>
>>>>>>> Thanks a lot for checking and for the proposed fix.
>>>>>>> Sander, can you try with this patch on top of 5.0-rc5 w/o removing two two commits?
>>>>>>
>>>>>> I have done that already during the night .. the results:
>>>>>> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit which causes hitting the BUG_ON in lib/dynamic_queue_limits.c.
>>>>>>   (in other word, with only reverting bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 it still blows up).
>>>>>>
>>>>>> - The Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>>>>>>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c, it has run this night and I gave done a few kernel compiles
>>>>>>   this morning. How ever during these kernel compiles i'm getting a transmit queue timeout which i haven't seen with 4.20.x, although i regularly
>>>>>>   compile kernels in the same way as I do now. The only thing I can't say if that is due to this change, or if it's again something else.
>>>>>>   Which makes me somewhat inclined to go testing the complete revert some more and see if I can trigger the queue timeout on that or not.
>>>>>>
>>>>>>   If I can, it is a separate issue.
>>>>>>   If I can't it seems even with a patch it still seems as a regression in comparison with 4.20.x, for which
>>>>>>   a revert would be the right thing to do (since as you indicated these are merely optimizations), 
>>>>>>   which would give us more time for 5.1 to try to solve things on top of the 5.0-release-to-be.
>>>>>>   (especially since I seem to still have other issues which need to be sorted out and time is limited)
>>>>>>
>>>>>>   The timeout in question:
>>>>>>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>>>>>>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>>>>>>         [28336.893358] Modules linked in:
>>>>>>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>>>>>>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>>>>>>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>>>>>>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>>>>>>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>>>>>>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>>>>>>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>>>>>>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>>>>>>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>>>>>>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>>>>>>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>>>>>>         [28337.090052] Call Trace:
>>>>>>         [28337.103615]  <IRQ>
>>>>>>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>>>>>>         [28337.128905]  call_timer_fn+0x19/0x90
>>>>>>         [28337.141892]  expire_timers+0x8b/0xa0
>>>>>>         [28337.153354]  run_timer_softirq+0x7e/0x160
>>>>>>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>>>>>>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>>>>>>         [28337.186734]  __do_softirq+0xed/0x229
>>>>>>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>>>>>>         [28337.207822]  irq_exit+0xb7/0xc0
>>>>>>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>>>>>>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>>>>>>         [28337.241261]  </IRQ>
>>>>>>         [28337.253283] RIP: e033:0xff7e62
>>>>>>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>>>>>>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>>>>>>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>>>>>>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>>>>>>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>>>>>>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>>>>>>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>>>>>>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
>>>>>>
>>>>> Thanks for your efforts. As usual this tx timeout trace says basically nothing except
>>>>> "timeout" and root cause could be anything. Earlier you reported a memory allocation error,
>>>>> did that occur again?
>>>>> If we decide to revert, I'd leave removal of the memory barriers in (as it doesn't seem to
>>>>> contribute to the issue) and just submit a patch to effectively revert
>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
>>>>
>>>> I can't say if that is correct, because i haven't tested that.
>>>>
>>>> Another thing I could test is:
>>>>  - putting all the r8169 patches (and prerequisites) that went into 5.0 
>>>>    up to bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3, onto 4.20.7 and see what that does.
>>>>    If that would be feasible (not too many needed prerequisites out of r8169) and if 
>>>>    you could spare me some time and prep such a branch somewhere so i can pull and compile that,
>>>>    that would be great.
>>>>
>>>
>>> Unfortunately there's quite a number of changes. Regarding __netdev_tx_sent_queue()
>>> and watchdog timeout I found the following comment in drivers/net/ethernet/sfc/tx.c,
>>> efx_enqueue_skb():
>>>
>>> 	if (__netdev_tx_sent_queue(tx_queue->core_txq, skb_len, xmit_more)) {
>>> 		struct efx_tx_queue *txq2 = efx_tx_queue_partner(tx_queue);
>>>
>>> 		/* There could be packets left on the partner queue if those
>>> 		 * SKBs had skb->xmit_more set. If we do not push those they
>>> 		 * could be left for a long time and cause a netdev watchdog.
>>> 		 */
>>> 		if (txq2->xmit_more_available)
>>> 			efx_nic_push_buffers(txq2);
>>>
>>> But I'm not sure whether the situation in r8169 is comparable. The following patch
>>> implements what I mentioned earlier: It leaves all other 5.0 changes in place and
>>> effectively reverts 2e6eedb4813e34d8d84ac0eb3afb668966f3f356. Would be great if
>>> you could give it a try.
>>
>> Hi Heiner,
>>
>> It took some time to respond, because I had another issue with 5.0 which intervened with proper testing, 
>> but fortunately I could pinpoint without doing a full bisect and revert that commit for further testing.
>>
>> So there is still time left and I could do a more proper run with your patch below.
>> Unfortunately i still get a splat (see below) with this, although i'm not sure it is related, 
>> just that I can't tell.
>>
> I checked further and there's a handful of network drivers using __napi_alloc_skb() with __GFP_NOWARN,
> maybe to avoid such splats. Did the splat impact functionality? When checking the code in r8169 the
> affected packet would just be dropped.

It doesn't permanently or noticeably impact functionality, and indeed seems to drop packets:

eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.1.1  netmask 255.255.0.0  broadcast 172.16.255.255
        ether 40:61:86:f4:67:d8  txqueuelen 1000  (Ethernet)
        RX packets 11563913  bytes 16724445852 (15.5 GiB)
        RX errors 0  dropped 6  overruns 0  frame 0
        TX packets 4301515  bytes 1210966808 (1.1 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Reverting 5317d5c6d47e ("r8169: use napi_consume_skb where possible") doesn't suffice; it still gives the page allocation failure.

I think at this point in time we should at least get the
reverts into 5.0 (probably too late for rc6 since DaveM's pull request is already in) for:
    bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
    2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue

Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the one which caused the BUG_ON() to be hit.

While we could use your patch to revert only 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 and leave
bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 in:
    - I haven't been able to test it properly.
    - It's dealing with barriers, which can be tedious and give subtle breakage.
    - I don't see any compelling argument to keep it in (it's no fix).
    - It's RC-6 time ...

Then we can focus on the page allocation issue and hopefully find a stable
baseline before 5.0-final is cut. While I appreciate a forward-looking approach,
I think we are at the point in time where we should revert when in doubt
(and when the change doesn't clearly fix another issue).

After establishing a stable baseline again, we can start incrementally re-applying and testing the changes.

--
Sander



>> Perhaps Linus as Oops-decoding-guru has an idea ?
>>
>> --
>> Sander
>>
>> [39041.689007] dpkg-deb: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
>> [39041.689016] CPU: 4 PID: 14078 Comm: dpkg-deb Not tainted 5.0.0-rc5-20190209-kallweit+ #1
>> [39041.689017] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>> [39041.689018] Call Trace:
>> [39041.689022]  <IRQ>
>> [39041.689030]  dump_stack+0x5c/0x7b
>> [39041.689033]  warn_alloc+0x103/0x190
>> [39041.689036]  __alloc_pages_nodemask+0xe3d/0xe80
>> [39041.689039]  ? ip_rcv+0x48/0xc0
>> [39041.689040]  ? ip_rcv_finish_core.isra.0+0x360/0x360
>> [39041.689042]  page_frag_alloc+0x117/0x150
>> [39041.689044]  __napi_alloc_skb+0x83/0xd0
>> [39041.689048]  rtl8169_poll+0x210/0x640
>> [39041.689051]  net_rx_action+0x23d/0x370
>> [39041.689054]  __do_softirq+0xed/0x229
>> [39041.689058]  irq_exit+0xb7/0xc0
>> [39041.689061]  xen_evtchn_do_upcall+0x27/0x40
>> [39041.689063]  xen_do_hypervisor_callback+0x29/0x40
>> [39041.689064]  </IRQ>
>> [39041.689066] RIP: e030:_atomic_dec_and_lock+0x2/0x40
>> [39041.689068] Code: ff 39 05 c5 c1 c9 00 89 c7 89 c6 76 0f 83 eb 01 83 fb ff 75 d9 5b 89 f8 5d 41 5c c3 0f 0b 90 90 90 90 90 90 90 90 90 90 8b 07 <83> f8 01 74 0c 8d 50 ff f0 0f b1 17 75 f2 31 c0 c3 55 53 48 89 fb
>> [39041.689069] RSP: e02b:ffffc9000705b990 EFLAGS: 00000246
>> [39041.689071] RAX: 0000000000000001 RBX: ffff888017082640 RCX: 0000000000000000
>> [39041.689071] RDX: 0000000000000000 RSI: ffff8880170826c0 RDI: ffff888017082788
>> [39041.689072] RBP: ffff8880170826c0 R08: ffffc9000705bb00 R09: ffffc9000705bb00
>> [39041.689073] R10: ffffc9000705bb58 R11: ffff88807fc17000 R12: ffff888017082788
>> [39041.689073] R13: ffff88806cc8cf58 R14: ffff888017082640 R15: ffff888009990240
>> [39041.689077]  iput+0x63/0x1a0
>> [39041.689079]  __dentry_kill+0xc5/0x170
>> [39041.689080]  shrink_dentry_list+0x93/0x1c0
>> [39041.689082]  prune_dcache_sb+0x4d/0x70
>> [39041.689084]  super_cache_scan+0x104/0x190
>> [39041.689087]  do_shrink_slab+0x12c/0x1e0
>> [39041.689089]  shrink_slab+0xdf/0x2b0
>> [39041.689091]  shrink_node+0x158/0x470
>> [39041.689093]  do_try_to_free_pages+0xd1/0x380
>> [39041.689095]  try_to_free_pages+0xb2/0xe0
>> [39041.689097]  __alloc_pages_nodemask+0x603/0xe80
>> [39041.689099]  ? __pagevec_lru_add_fn+0x1b1/0x290
>> [39041.689102]  alloc_pages_vma+0x7b/0x1c0
>> [39041.689106]  __handle_mm_fault+0xdb3/0x1060
>> [39041.689109]  ? xen_mc_flush+0xc0/0x190
>> [39041.689110]  handle_mm_fault+0xf8/0x200
>> [39041.689113]  __do_page_fault+0x231/0x4a0
>> [39041.689115]  ? page_fault+0x8/0x30
>> [39041.689116]  page_fault+0x1e/0x30
>> [39041.689118] RIP: e033:0x7fb9851d012e
>> [39041.689119] Code: 29 c2 48 3b 15 7b a3 31 00 0f 87 af 00 00 00 0f 10 01 0f 10 49 f0 0f 10 51 e0 0f 10 59 d0 48 83 e9 40 48 83 ea 40 41 0f 29 01 <41> 0f 29 49 f0 41 0f 29 51 e0 41 0f 29 59 d0 49 83 e9 40 48 83 fa
>> [39041.689119] RSP: e02b:00007fb958b36d38 EFLAGS: 00010202
>> [39041.689120] RAX: 00007fb97a617f0e RBX: 000000000000f004 RCX: 00007fb948008be3
>> [39041.689121] RDX: 00000000000080c2 RSI: 00007fb948000b31 RDI: 00007fb97a617f0e
>> [39041.689122] RBP: 00000000000ff062 R08: 0000000000000002 R09: 00007fb97a620000
>> [39041.689123] R10: 0000000000000004 R11: 00007fb97a626f02 R12: 000000000000f005
>> [39041.689123] R13: 00007fb948000b28 R14: 0000562d76b63710 R15: 0000000000000003
>> [39041.689125] Mem-Info:
>> [39041.689130] active_anon:78775 inactive_anon:49211 isolated_anon:0
>>                 active_file:106409 inactive_file:107531 isolated_file:0
>>                 unevictable:552 dirty:175 writeback:0 unstable:0
>>                 slab_reclaimable:13739 slab_unreclaimable:16454
>>                 mapped:1605 shmem:23 pagetables:2900 bounce:0
>>                 free:3681 free_pcp:935 free_cma:0
>> [39041.689132] Node 0 active_anon:315100kB inactive_anon:196844kB active_file:425636kB inactive_file:430124kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:6420kB dirty:700kB writeback:0kB shmem:92kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
>> [39041.689133] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:0kB inactive_anon:7832kB active_file:472kB inactive_file:4kB unevictable:0kB writepending:0kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:12kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> [39041.689136] lowmem_reserve[]: 0 1865 1865 1865
>> [39041.689138] Node 0 DMA32 free:7244kB min:19472kB low:21380kB high:23288kB active_anon:315360kB inactive_anon:188144kB active_file:425164kB inactive_file:430120kB unevictable:2208kB writepending:700kB present:2080768kB managed:1674968kB mlocked:2208kB kernel_stack:9632kB pagetables:11588kB bounce:0kB free_pcp:3740kB local_pcp:528kB free_cma:0kB
>> [39041.689140] lowmem_reserve[]: 0 0 0 0
>> [39041.689142] Node 0 DMA: 6*4kB (UME) 6*8kB (UE) 7*16kB (UME) 6*32kB (ME) 5*64kB (UME) 3*128kB (UE) 5*256kB (UME) 2*512kB (ME) 2*1024kB (UE) 1*2048kB (M) 0*4096kB = 7480kB
>> [39041.689148] Node 0 DMA32: 69*4kB (U) 315*8kB (UE) 138*16kB (UE) 70*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7244kB
>> [39041.689153] 214701 total pagecache pages
>> [39041.689155] 273 pages in swap cache
>> [39041.689156] Swap cache stats: add 100978, delete 100706, find 1158/1257
>> [39041.689156] Free swap  = 3790588kB
>> [39041.689157] Total swap = 4194300kB
>> [39041.689157] 524181 pages RAM
>> [39041.689158] 0 pages HighMem/MovableOnly
>> [39041.689158] 101471 pages reserved
>> [39041.689159] 0 pages cma reserved
>>
>>
>>
>>
>>
>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>> index e8a112149..3cca2ffb2 100644
>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>> @@ -6192,7 +6192,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>  	struct device *d = tp_to_dev(tp);
>>>  	dma_addr_t mapping;
>>>  	u32 opts[2], len;
>>> -	bool stop_queue;
>>>  	int frags;
>>>  
>>>  	if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>> @@ -6234,6 +6233,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>  
>>>  	txd->opts2 = cpu_to_le32(opts[1]);
>>>  
>>> +	netdev_sent_queue(dev, skb->len);
>>> +
>>>  	skb_tx_timestamp(skb);
>>>  
>>>  	/* Force memory writes to complete before releasing descriptor */
>>> @@ -6246,14 +6247,14 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>  
>>>  	tp->cur_tx += frags + 1;
>>>  
>>> -	stop_queue = !rtl_tx_slots_avail(tp, MAX_SKB_FRAGS);
>>> -	if (unlikely(stop_queue))
>>> -		netif_stop_queue(dev);
>>> -
>>> -	if (__netdev_sent_queue(dev, skb->len, skb->xmit_more))
>>> -		RTL_W8(tp, TxPoll, NPQ);
>>> +	RTL_W8(tp, TxPoll, NPQ);
>>>  
>>> -	if (unlikely(stop_queue)) {
>>> +	if (!rtl_tx_slots_avail(tp, MAX_SKB_FRAGS)) {
>>> +		/* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
>>> +		 * not miss a ring update when it notices a stopped queue.
>>> +		 */
>>> +		smp_wmb();
>>> +		netif_stop_queue(dev);
>>>  		/* Sync with rtl_tx:
>>>  		 * - publish queue status and cur_tx ring index (write barrier)
>>>  		 * - refresh dirty_tx ring index (read barrier).
>>>
>>
>>
> 



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-10 13:05                           ` Sander Eikelenboom
@ 2019-02-10 13:57                             ` Heiner Kallweit
  0 siblings, 0 replies; 20+ messages in thread
From: Heiner Kallweit @ 2019-02-10 13:57 UTC (permalink / raw)
  To: Sander Eikelenboom, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 10.02.2019 14:05, Sander Eikelenboom wrote:
> On 10/02/2019 12:44, Heiner Kallweit wrote:
>> On 10.02.2019 10:16, Sander Eikelenboom wrote:
>>> On 09/02/2019 12:50, Heiner Kallweit wrote:
>>>> On 09.02.2019 11:07, Sander Eikelenboom wrote:
>>>>> On 09/02/2019 10:59, Heiner Kallweit wrote:
>>>>>> On 09.02.2019 10:34, Sander Eikelenboom wrote:
>>>>>>> On 09/02/2019 10:02, Heiner Kallweit wrote:
>>>>>>>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>>>>>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>>>>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>>>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>>>>>>>> L.S.,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> While testing a linux 5.0-rc5 kernel (with some patches on top but they don't seem related) under Xen i the nasty splat below, 
>>>>>>>>>>>>>>> that I haven encountered with Linux 4.20.x.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hmm i did some diging and i think:
>>>>>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>>>>>>>
>>>>>>>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>>>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>>>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>>>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>>>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>>>>>>>
>>>>>>>>>>> Yep, the box is already quite contented with the Xen VM's and if I remember correctly it occurred while kernel compiling
>>>>>>>>>>> on the host.
>>>>>>>>>>>
>>>>>>>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>>>>>>>> as author of the underlying changes.
>>>>>>>>>>>
>>>>>>>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>>>>>>>
>>>>>>>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>>>>>>>> test also with only 
>>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>>> removed.
>>>>>>>>>>
>>>>>>>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>>>>>>>
>>>>>>>>>> Sure, thanks.
>>>>>>>>>>
>>>>>>>>>>> BTW am i correct these patches are merely optimizations ?
>>>>>>>>>>
>>>>>>>>>> Yes
>>>>>>>>>>
>>>>>>>>>>> If so and concluding they revert cleanly, perhaps it should be considered at this point in the RC's
>>>>>>>>>>> to revert them for 5.0 and try again for 5.1 ?
>>>>>>>>>>>
>>>>>>>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>>>>>>>
>>>>>>>>> You could try :
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>>>>>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>>>>>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>>>>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>>>>>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>>>         dma_addr_t mapping;
>>>>>>>>>         u32 opts[2], len;
>>>>>>>>>         bool stop_queue;
>>>>>>>>> +       bool door_bell;
>>>>>>>>>         int frags;
>>>>>>>>>  
>>>>>>>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>>>>>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>>>         /* Force memory writes to complete before releasing descriptor */
>>>>>>>>>         dma_wmb();
>>>>>>>>>  
>>>>>>>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>>>>>>>> +
>>>>>>>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>>>>>>>  
>>>>>>>>>         /* Force all memory writes to complete before notifying device */
>>>>>>>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>>>         if (unlikely(stop_queue))
>>>>>>>>>                 netif_stop_queue(dev);
>>>>>>>>>  
>>>>>>>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>>>>>>>> +       if (door_bell) {
>>>>>>>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>>>>>>>                 mmiowb();
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>> Thanks a lot for checking and for the proposed fix.
>>>>>>>> Sander, can you try with this patch on top of 5.0-rc5 w/o removing two two commits?
>>>>>>>
>>>>>>> I have done that already during the night .. the results:
>>>>>>> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit which causes hitting the BUG_ON in lib/dynamic_queue_limits.c.
>>>>>>>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
>>>>>>>
>>>>>>> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>>>>>>>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c; it ran overnight and I have done a few kernel compiles
>>>>>>>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>>>>>>>   compile kernels in the same way as I do now. The only thing is that I can't say whether that is due to this change or something else again,
>>>>>>>   which makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that or not.
>>>>>>>
>>>>>>>   If I can, it is a separate issue.
>>>>>>>   If I can't, then even with the patch it still looks like a regression compared with 4.20.x, for which
>>>>>>>   a revert would be the right thing to do (since as you indicated these are merely optimizations), 
>>>>>>>   which would give us more time for 5.1 to try to solve things on top of the 5.0-release-to-be.
>>>>>>>   (especially since I seem to still have other issues which need to be sorted out and time is limited)
>>>>>>>
>>>>>>>   The timeout in question:
>>>>>>>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>>>>>>>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>>>>>>>         [28336.893358] Modules linked in:
>>>>>>>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>>>>>>>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>>>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>>>>>>>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>>>>>>>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>>>>>>>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>>>>>>>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>>>>>>>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>>>>>>>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>>>>>>>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>>>>>>>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>>>>>>>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>>>>>>>         [28337.090052] Call Trace:
>>>>>>>         [28337.103615]  <IRQ>
>>>>>>>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>>>>>>>         [28337.128905]  call_timer_fn+0x19/0x90
>>>>>>>         [28337.141892]  expire_timers+0x8b/0xa0
>>>>>>>         [28337.153354]  run_timer_softirq+0x7e/0x160
>>>>>>>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>>>>>>>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>>>>>>>         [28337.186734]  __do_softirq+0xed/0x229
>>>>>>>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>>>>>>>         [28337.207822]  irq_exit+0xb7/0xc0
>>>>>>>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>>>>>>>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>>>>>>>         [28337.241261]  </IRQ>
>>>>>>>         [28337.253283] RIP: e033:0xff7e62
>>>>>>>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>>>>>>>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>>>>>>>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>>>>>>>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>>>>>>>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>>>>>>>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>>>>>>>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>>>>>>>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
>>>>>>>
>>>>>> Thanks for your efforts. As usual this tx timeout trace says basically nothing except
>>>>>> "timeout" and root cause could be anything. Earlier you reported a memory allocation error,
>>>>>> did that occur again?
>>>>>> If we decide to revert, I'd leave removal of the memory barriers in (as it doesn't seem to
>>>>>> contribute to the issue) and just submit a patch to effectively revert
>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
>>>>>
>>>>> I can't say if that is correct, because i haven't tested that.
>>>>>
>>>>> Another thing I could test is:
>>>>>  - putting all the r8169 patches (and prerequisites) that went into 5.0 
>>>>>    up to bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3, onto 4.20.7 and see what that does.
>>>>>    If that would be feasible (not too many needed prerequisites out of r8169) and if 
>>>>>    you could spare me some time and prep such a branch somewhere so i can pull and compile that,
>>>>>    that would be great.
>>>>>
>>>>
>>>> Unfortunately there's quite a number of changes. Regarding __netdev_tx_sent_queue()
>>>> and watchdog timeout I found the following comment in drivers/net/ethernet/sfc/tx.c,
>>>> efx_enqueue_skb():
>>>>
>>>> 	if (__netdev_tx_sent_queue(tx_queue->core_txq, skb_len, xmit_more)) {
>>>> 		struct efx_tx_queue *txq2 = efx_tx_queue_partner(tx_queue);
>>>>
>>>> 		/* There could be packets left on the partner queue if those
>>>> 		 * SKBs had skb->xmit_more set. If we do not push those they
>>>> 		 * could be left for a long time and cause a netdev watchdog.
>>>> 		 */
>>>> 		if (txq2->xmit_more_available)
>>>> 			efx_nic_push_buffers(txq2);
>>>>
>>>> But I'm not sure whether the situation in r8169 is comparable. The following patch
>>>> implements what I mentioned earlier: It leaves all other 5.0 changes in place and
>>>> effectively reverts 2e6eedb4813e34d8d84ac0eb3afb668966f3f356. Would be great if
>>>> you could give it a try.
>>>
>>> Hi Heiner,
>>>
>>> It took some time to respond, because I had another issue with 5.0 which interfered with proper testing,
>>> but fortunately I could pinpoint it without doing a full bisect and revert that commit for further testing.
>>>
>>> So there is still time left and I could do a more proper run with your patch below.
>>> Unfortunately I still get a splat (see below) with this, although I can't tell
>>> whether it is related.
>>>
>> I checked further and there's a handful of network drivers using __napi_alloc_skb() with __GFP_NOWARN,
>> maybe to avoid such splats. Did the splat impact functionality? When checking the code in r8169 the
>> affected packet would just be dropped.
> 
> It doesn't permanently or noticeably impact functionality, and indeed seems to drop packets:
> 
> eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
>         inet 172.16.1.1  netmask 255.255.0.0  broadcast 172.16.255.255
>         ether 40:61:86:f4:67:d8  txqueuelen 1000  (Ethernet)
>         RX packets 11563913  bytes 16724445852 (15.5 GiB)
>         RX errors 0  dropped 6  overruns 0  frame 0
>         TX packets 4301515  bytes 1210966808 (1.1 GiB)
>         TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> 
> Reverting 5317d5c6d47e ("r8169: use napi_consume_skb where possible") doesn't suffice; it still gives the page allocation failure.
> 
> I think at this point in time we should at least get the
> reverts into 5.0 (probably too late for rc6 since DaveM's pull request is already in) for:
>     bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>     2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
> 
OK, I just sent the reverts for both patches.

Heiner

> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the one which caused the BUG_ON() to be hit.
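For anyone following along, the ordering problem behind that BUG_ON() can be
modelled in user space. This is only a simplified sketch, not the kernel's
code: struct dql_model and all function names below are made up for
illustration, and the check in model_completed() is assumed to mirror the
condition tested in dql_completed() (completing more bytes than were reported
as queued). It also assumes the TX completion path can run on another CPU as
soon as the descriptor has been released to the NIC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of BQL byte accounting. */
struct dql_model {
	unsigned int num_queued;    /* bytes reported by the xmit path */
	unsigned int num_completed; /* bytes reported by the completion path */
};

/* Models the xmit path reporting bytes handed to the hardware. */
static void model_queued(struct dql_model *q, unsigned int bytes)
{
	q->num_queued += bytes;
}

/*
 * Models the completion path. Returns false in the situation where the
 * kernel is assumed to hit the BUG_ON(): more bytes are completed than
 * are currently accounted as queued.
 */
static bool model_completed(struct dql_model *q, unsigned int bytes)
{
	if (bytes > q->num_queued - q->num_completed)
		return false;
	q->num_completed += bytes;
	return true;
}

/*
 * Ordering as in the problematic commit: the descriptor is released
 * first, so the completion may be accounted before the xmit path has
 * reported the bytes as queued.
 */
static bool released_before_accounting(unsigned int len)
{
	struct dql_model q = { 0, 0 };
	bool ok;

	ok = model_completed(&q, len); /* completion races ahead */
	model_queued(&q, len);         /* accounting arrives too late */
	return ok;
}

/*
 * Ordering as in the proposed fix: the bytes are reported as queued
 * before the descriptor is released, so the completion always finds them.
 */
static bool accounting_before_release(unsigned int len)
{
	struct dql_model q = { 0, 0 };

	model_queued(&q, len);
	return model_completed(&q, len);
}
```

In this model released_before_accounting(66) reports the violation, while
accounting_before_release(66) does not. That matches the reordering in the
proposed patch, which calls __netdev_sent_queue() before the opts1 write that
releases the descriptor. The model does not cover the second hazard mentioned
in the thread, reading skb->len after the skb may already have been freed.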
> 
> While we could use your patch to revert only 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 and leave
> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 in:
>     - I haven't been able to test it properly.
>     - It's dealing with barriers, which can be tedious and give subtle breakage.
>     - I don't see any compelling argument to keep it in (it's no fix).
>     - It's RC-6 time ...
> 
> So then we can focus on the page allocation issue and hopefully find some stable
> baseline before 5.0-final is cut. While I appreciate having a forward looking approach,
> I think we are at the point in time where we should revert when in doubt
> (and it doesn't clearly fix another issue).
> 
> After establishing a stable baseline again, we can start incrementally re-applying and testing things.
> 
> --
> Sander
> 
> 
> 
>>> Perhaps Linus as Oops-decoding-guru has an idea ?
>>>
>>> --
>>> Sander
>>>
>>> [39041.689007] dpkg-deb: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
>>> [39041.689016] CPU: 4 PID: 14078 Comm: dpkg-deb Not tainted 5.0.0-rc5-20190209-kallweit+ #1
>>> [39041.689017] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>> [39041.689018] Call Trace:
>>> [39041.689022]  <IRQ>
>>> [39041.689030]  dump_stack+0x5c/0x7b
>>> [39041.689033]  warn_alloc+0x103/0x190
>>> [39041.689036]  __alloc_pages_nodemask+0xe3d/0xe80
>>> [39041.689039]  ? ip_rcv+0x48/0xc0
>>> [39041.689040]  ? ip_rcv_finish_core.isra.0+0x360/0x360
>>> [39041.689042]  page_frag_alloc+0x117/0x150
>>> [39041.689044]  __napi_alloc_skb+0x83/0xd0
>>> [39041.689048]  rtl8169_poll+0x210/0x640
>>> [39041.689051]  net_rx_action+0x23d/0x370
>>> [39041.689054]  __do_softirq+0xed/0x229
>>> [39041.689058]  irq_exit+0xb7/0xc0
>>> [39041.689061]  xen_evtchn_do_upcall+0x27/0x40
>>> [39041.689063]  xen_do_hypervisor_callback+0x29/0x40
>>> [39041.689064]  </IRQ>
>>> [39041.689066] RIP: e030:_atomic_dec_and_lock+0x2/0x40
>>> [39041.689068] Code: ff 39 05 c5 c1 c9 00 89 c7 89 c6 76 0f 83 eb 01 83 fb ff 75 d9 5b 89 f8 5d 41 5c c3 0f 0b 90 90 90 90 90 90 90 90 90 90 8b 07 <83> f8 01 74 0c 8d 50 ff f0 0f b1 17 75 f2 31 c0 c3 55 53 48 89 fb
>>> [39041.689069] RSP: e02b:ffffc9000705b990 EFLAGS: 00000246
>>> [39041.689071] RAX: 0000000000000001 RBX: ffff888017082640 RCX: 0000000000000000
>>> [39041.689071] RDX: 0000000000000000 RSI: ffff8880170826c0 RDI: ffff888017082788
>>> [39041.689072] RBP: ffff8880170826c0 R08: ffffc9000705bb00 R09: ffffc9000705bb00
>>> [39041.689073] R10: ffffc9000705bb58 R11: ffff88807fc17000 R12: ffff888017082788
>>> [39041.689073] R13: ffff88806cc8cf58 R14: ffff888017082640 R15: ffff888009990240
>>> [39041.689077]  iput+0x63/0x1a0
>>> [39041.689079]  __dentry_kill+0xc5/0x170
>>> [39041.689080]  shrink_dentry_list+0x93/0x1c0
>>> [39041.689082]  prune_dcache_sb+0x4d/0x70
>>> [39041.689084]  super_cache_scan+0x104/0x190
>>> [39041.689087]  do_shrink_slab+0x12c/0x1e0
>>> [39041.689089]  shrink_slab+0xdf/0x2b0
>>> [39041.689091]  shrink_node+0x158/0x470
>>> [39041.689093]  do_try_to_free_pages+0xd1/0x380
>>> [39041.689095]  try_to_free_pages+0xb2/0xe0
>>> [39041.689097]  __alloc_pages_nodemask+0x603/0xe80
>>> [39041.689099]  ? __pagevec_lru_add_fn+0x1b1/0x290
>>> [39041.689102]  alloc_pages_vma+0x7b/0x1c0
>>> [39041.689106]  __handle_mm_fault+0xdb3/0x1060
>>> [39041.689109]  ? xen_mc_flush+0xc0/0x190
>>> [39041.689110]  handle_mm_fault+0xf8/0x200
>>> [39041.689113]  __do_page_fault+0x231/0x4a0
>>> [39041.689115]  ? page_fault+0x8/0x30
>>> [39041.689116]  page_fault+0x1e/0x30
>>> [39041.689118] RIP: e033:0x7fb9851d012e
>>> [39041.689119] Code: 29 c2 48 3b 15 7b a3 31 00 0f 87 af 00 00 00 0f 10 01 0f 10 49 f0 0f 10 51 e0 0f 10 59 d0 48 83 e9 40 48 83 ea 40 41 0f 29 01 <41> 0f 29 49 f0 41 0f 29 51 e0 41 0f 29 59 d0 49 83 e9 40 48 83 fa
>>> [39041.689119] RSP: e02b:00007fb958b36d38 EFLAGS: 00010202
>>> [39041.689120] RAX: 00007fb97a617f0e RBX: 000000000000f004 RCX: 00007fb948008be3
>>> [39041.689121] RDX: 00000000000080c2 RSI: 00007fb948000b31 RDI: 00007fb97a617f0e
>>> [39041.689122] RBP: 00000000000ff062 R08: 0000000000000002 R09: 00007fb97a620000
>>> [39041.689123] R10: 0000000000000004 R11: 00007fb97a626f02 R12: 000000000000f005
>>> [39041.689123] R13: 00007fb948000b28 R14: 0000562d76b63710 R15: 0000000000000003
>>> [39041.689125] Mem-Info:
>>> [39041.689130] active_anon:78775 inactive_anon:49211 isolated_anon:0
>>>                 active_file:106409 inactive_file:107531 isolated_file:0
>>>                 unevictable:552 dirty:175 writeback:0 unstable:0
>>>                 slab_reclaimable:13739 slab_unreclaimable:16454
>>>                 mapped:1605 shmem:23 pagetables:2900 bounce:0
>>>                 free:3681 free_pcp:935 free_cma:0
>>> [39041.689132] Node 0 active_anon:315100kB inactive_anon:196844kB active_file:425636kB inactive_file:430124kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:6420kB dirty:700kB writeback:0kB shmem:92kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
>>> [39041.689133] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:0kB inactive_anon:7832kB active_file:472kB inactive_file:4kB unevictable:0kB writepending:0kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:12kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>>> [39041.689136] lowmem_reserve[]: 0 1865 1865 1865
>>> [39041.689138] Node 0 DMA32 free:7244kB min:19472kB low:21380kB high:23288kB active_anon:315360kB inactive_anon:188144kB active_file:425164kB inactive_file:430120kB unevictable:2208kB writepending:700kB present:2080768kB managed:1674968kB mlocked:2208kB kernel_stack:9632kB pagetables:11588kB bounce:0kB free_pcp:3740kB local_pcp:528kB free_cma:0kB
>>> [39041.689140] lowmem_reserve[]: 0 0 0 0
>>> [39041.689142] Node 0 DMA: 6*4kB (UME) 6*8kB (UE) 7*16kB (UME) 6*32kB (ME) 5*64kB (UME) 3*128kB (UE) 5*256kB (UME) 2*512kB (ME) 2*1024kB (UE) 1*2048kB (M) 0*4096kB = 7480kB
>>> [39041.689148] Node 0 DMA32: 69*4kB (U) 315*8kB (UE) 138*16kB (UE) 70*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7244kB
>>> [39041.689153] 214701 total pagecache pages
>>> [39041.689155] 273 pages in swap cache
>>> [39041.689156] Swap cache stats: add 100978, delete 100706, find 1158/1257
>>> [39041.689156] Free swap  = 3790588kB
>>> [39041.689157] Total swap = 4194300kB
>>> [39041.689157] 524181 pages RAM
>>> [39041.689158] 0 pages HighMem/MovableOnly
>>> [39041.689158] 101471 pages reserved
>>> [39041.689159] 0 pages cma reserved
>>>
>>>
>>>
>>>
>>>
>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>> index e8a112149..3cca2ffb2 100644
>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>> @@ -6192,7 +6192,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>  	struct device *d = tp_to_dev(tp);
>>>>  	dma_addr_t mapping;
>>>>  	u32 opts[2], len;
>>>> -	bool stop_queue;
>>>>  	int frags;
>>>>  
>>>>  	if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>> @@ -6234,6 +6233,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>  
>>>>  	txd->opts2 = cpu_to_le32(opts[1]);
>>>>  
>>>> +	netdev_sent_queue(dev, skb->len);
>>>> +
>>>>  	skb_tx_timestamp(skb);
>>>>  
>>>>  	/* Force memory writes to complete before releasing descriptor */
>>>> @@ -6246,14 +6247,14 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>  
>>>>  	tp->cur_tx += frags + 1;
>>>>  
>>>> -	stop_queue = !rtl_tx_slots_avail(tp, MAX_SKB_FRAGS);
>>>> -	if (unlikely(stop_queue))
>>>> -		netif_stop_queue(dev);
>>>> -
>>>> -	if (__netdev_sent_queue(dev, skb->len, skb->xmit_more))
>>>> -		RTL_W8(tp, TxPoll, NPQ);
>>>> +	RTL_W8(tp, TxPoll, NPQ);
>>>>  
>>>> -	if (unlikely(stop_queue)) {
>>>> +	if (!rtl_tx_slots_avail(tp, MAX_SKB_FRAGS)) {
>>>> +		/* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
>>>> +		 * not miss a ring update when it notices a stopped queue.
>>>> +		 */
>>>> +		smp_wmb();
>>>> +		netif_stop_queue(dev);
>>>>  		/* Sync with rtl_tx:
>>>>  		 * - publish queue status and cur_tx ring index (write barrier)
>>>>  		 * - refresh dirty_tx ring index (read barrier).
>>>>
>>>
>>>
>>
> 
> 



* Re: Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27!
  2019-02-10 11:44                         ` Heiner Kallweit
  2019-02-10 13:05                           ` Sander Eikelenboom
@ 2019-02-10 15:50                           ` Sander Eikelenboom
  1 sibling, 0 replies; 20+ messages in thread
From: Sander Eikelenboom @ 2019-02-10 15:50 UTC (permalink / raw)
  To: Heiner Kallweit, Eric Dumazet, Realtek linux nic maintainers,
	Eric Dumazet
  Cc: Linus Torvalds, linux-kernel, netdev

On 10/02/2019 12:44, Heiner Kallweit wrote:
> On 10.02.2019 10:16, Sander Eikelenboom wrote:
>> On 09/02/2019 12:50, Heiner Kallweit wrote:
>>> On 09.02.2019 11:07, Sander Eikelenboom wrote:
>>>> On 09/02/2019 10:59, Heiner Kallweit wrote:
>>>>> On 09.02.2019 10:34, Sander Eikelenboom wrote:
>>>>>> On 09/02/2019 10:02, Heiner Kallweit wrote:
>>>>>>> On 09.02.2019 00:09, Eric Dumazet wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 02/08/2019 01:50 PM, Heiner Kallweit wrote:
>>>>>>>>> On 08.02.2019 22:45, Sander Eikelenboom wrote:
>>>>>>>>>> On 08/02/2019 22:22, Heiner Kallweit wrote:
>>>>>>>>>>> On 08.02.2019 21:55, Sander Eikelenboom wrote:
>>>>>>>>>>>> On 08/02/2019 19:52, Heiner Kallweit wrote:
>>>>>>>>>>>>> On 08.02.2019 19:29, Sander Eikelenboom wrote:
>>>>>>>>>>>>>> L.S.,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> While testing a Linux 5.0-rc5 kernel (with some patches on top, but they don't seem related) under Xen I got the nasty splat below,
>>>>>>>>>>>>>> which I haven't encountered with Linux 4.20.x.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Unfortunately I haven't got a clear reproducer for this and bisecting could be nasty due to another (networking related) kernel bug.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you need more info, want me to run a debug patch etc., please feel free to ask.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the report. However I see no change in the r8169 driver between
>>>>>>>>>>>>> 4.20 and 5.0 with regard to BQL code. Having said that the root cause could
>>>>>>>>>>>>> be somewhere else. Therefore I'm afraid a bisect will be needed.
>>>>>>>>>>>>
>>>>>>>>>>>> Hmm, I did some digging and I think these are the relevant commits:
>>>>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>>>>>> 620344c43edfa020bbadfd81a144ebe5181fc94f net: core: add __netdev_sent_queue as variant of __netdev_tx_sent_queue
>>>>>>>>>>>>
>>>>>>>>>>> You're right. Thought this was added in 4.20 already.
>>>>>>>>>>> The BQL code pattern I copied from the mlx4 driver and so far I haven't heard about
>>>>>>>>>>> this issue from any user of physical hw. And due to the fact that a lot of mainboards
>>>>>>>>>>> have onboard Realtek network I have quite a few testers out there.
>>>>>>>>>>> Does the issue occur under specific circumstances like very high load?
>>>>>>>>>>
>>>>>>>>>> Yep, the box is already under quite some contention from the Xen VMs, and if I remember correctly it occurred while compiling a kernel
>>>>>>>>>> on the host.
>>>>>>>>>>
>>>>>>>>>>> If indeed the xmit_more patch causes the issue, I think we have to involve Eric Dumazet
>>>>>>>>>>> as author of the underlying changes.
>>>>>>>>>>
>>>>>>>>>> It could also be the barriers weren't that unneeded as assumed.
>>>>>>>>>
>>>>>>>>> The barriers were removed after adding xmit_more handling. Therefore it would be good to
>>>>>>>>> test also with only 
>>>>>>>>> bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 r8169: remove unneeded mmiowb barriers
>>>>>>>>> removed.
>>>>>>>>>
>>>>>>>>>> Since we are almost at RC6 i took the liberty to CC Eric now.
>>>>>>>>>>
>>>>>>>>> Sure, thanks.
>>>>>>>>>
>>>>>>>>>> BTW am i correct these patches are merely optimizations ?
>>>>>>>>>
>>>>>>>>> Yes
>>>>>>>>>
>>>>>>>>>> If so, and provided they revert cleanly, perhaps we should consider at this point in the RCs
>>>>>>>>>> reverting them for 5.0 and trying again for 5.1?
>>>>>>>>>>
>>>>>>>>> Before removing both it would be good to test with only the barrier-removal removed.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Commit 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 r8169: make use of xmit_more and __netdev_sent_queue
>>>>>>>> looks buggy to me, since the skb might have been freed already on another cpu when you call
>>>>>>>>
>>>>>>>> You could try :
>>>>>>>>
>>>>>>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>>>>>>> index 3624e67aef72c92ed6e908e2c99ac2d381210126..f907d484165d9fd775e81bf2bfb9aa4ddedb1c93 100644
>>>>>>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>>>>>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>>>>>>> @@ -6070,6 +6070,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>>         dma_addr_t mapping;
>>>>>>>>         u32 opts[2], len;
>>>>>>>>         bool stop_queue;
>>>>>>>> +       bool door_bell;
>>>>>>>>         int frags;
>>>>>>>>  
>>>>>>>>         if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>>>>>>> @@ -6116,6 +6117,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>>         /* Force memory writes to complete before releasing descriptor */
>>>>>>>>         dma_wmb();
>>>>>>>>  
>>>>>>>> +       door_bell = __netdev_sent_queue(dev, skb->len, skb->xmit_more);
>>>>>>>> +
>>>>>>>>         txd->opts1 = rtl8169_get_txd_opts1(opts[0], len, entry);
>>>>>>>>  
>>>>>>>>         /* Force all memory writes to complete before notifying device */
>>>>>>>> @@ -6127,7 +6130,7 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>>>>>>         if (unlikely(stop_queue))
>>>>>>>>                 netif_stop_queue(dev);
>>>>>>>>  
>>>>>>>> -       if (__netdev_sent_queue(dev, skb->len, skb->xmit_more)) {
>>>>>>>> +       if (door_bell) {
>>>>>>>>                 RTL_W8(tp, TxPoll, NPQ);
>>>>>>>>                 mmiowb();
>>>>>>>>         }
>>>>>>>>
>>>>>>> Thanks a lot for checking and for the proposed fix.
>>>>>>> Sander, can you try with this patch on top of 5.0-rc5 w/o removing the two commits?
>>>>>>
>>>>>> I have done that already during the night .. the results:
>>>>>> - I can confirm 2e6eedb4813e34d8d84ac0eb3afb668966f3f356 is the first commit which causes hitting the BUG_ON in lib/dynamic_queue_limits.c.
>>>>>>   (in other words, with only bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted it still blows up).
>>>>>>
>>>>>> - Eric's patch only applies cleanly with bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3 reverted, so that's what I tested.
>>>>>>   The patch seems to prevent hitting the BUG_ON in lib/dynamic_queue_limits.c; it ran overnight and I have done a few kernel compiles
>>>>>>   this morning. However, during these kernel compiles I'm getting a transmit queue timeout which I haven't seen with 4.20.x, although I regularly
>>>>>>   compile kernels in the same way as I do now. The only thing is that I can't say whether that is due to this change or something else again,
>>>>>>   which makes me somewhat inclined to test the complete revert some more and see whether I can trigger the queue timeout on that or not.
>>>>>>
>>>>>>   If I can, it is a separate issue.
>>>>>>   If I can't, then even with the patch it still looks like a regression compared with 4.20.x, for which
>>>>>>   a revert would be the right thing to do (since as you indicated these are merely optimizations), 
>>>>>>   which would give us more time for 5.1 to try to solve things on top of the 5.0-release-to-be.
>>>>>>   (especially since I seem to still have other issues which need to be sorted out and time is limited)
>>>>>>
>>>>>>   The timeout in question:
>>>>>>         [28336.869479] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
>>>>>>         [28336.881498] WARNING: CPU: 0 PID: 6925 at net/sched/sch_generic.c:461 dev_watchdog+0x20b/0x210
>>>>>>         [28336.893358] Modules linked in:
>>>>>>         [28336.904106] CPU: 0 PID: 6925 Comm: cc1 Tainted: G      D           5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
>>>>>>         [28336.917385] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>>>>>>         [28336.928988] RIP: e030:dev_watchdog+0x20b/0x210
>>>>>>         [28336.940623] Code: 00 49 63 4e e0 eb 90 4c 89 e7 c6 05 ad d8 f1 00 01 e8 a9 32 fd ff 89 d9 48 89 c2 4c 89 e6 48 c7 c7 50 59 89 82 e8 e5 92 4d ff <0f> 0b eb c0 90 48 c7 47 08 00 00 00 00 48 c7 07 00 00 00 00 0f b7
>>>>>>         [28336.965265] RSP: e02b:ffff88807d403ea0 EFLAGS: 00010286
>>>>>>         [28336.977465] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff82a69db8
>>>>>>         [28336.991265] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 0000000000000200
>>>>>>         [28337.008865] RBP: ffff88807936e41c R08: 0000000000000000 R09: 0000000000000819
>>>>>>         [28337.022250] R10: 0000000000000202 R11: ffffffff8247ca80 R12: ffff88807936e000
>>>>>>         [28337.035204] R13: 0000000000000000 R14: ffff88807936e440 R15: 0000000000000001
>>>>>>         [28337.049832] FS:  00007f53e9bf3840(0000) GS:ffff88807d400000(0000) knlGS:0000000000000000
>>>>>>         [28337.062524] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>         [28337.075086] CR2: 00007f53e60c4000 CR3: 000000001a0be000 CR4: 0000000000000660
>>>>>>         [28337.090052] Call Trace:
>>>>>>         [28337.103615]  <IRQ>
>>>>>>         [28337.116587]  ? qdisc_destroy+0x120/0x120
>>>>>>         [28337.128905]  call_timer_fn+0x19/0x90
>>>>>>         [28337.141892]  expire_timers+0x8b/0xa0
>>>>>>         [28337.153354]  run_timer_softirq+0x7e/0x160
>>>>>>         [28337.165931]  ? handle_irq_event_percpu+0x4c/0x70
>>>>>>         [28337.176548]  ? handle_percpu_irq+0x32/0x50
>>>>>>         [28337.186734]  __do_softirq+0xed/0x229
>>>>>>         [28337.196404]  ? hypervisor_callback+0xa/0x20
>>>>>>         [28337.207822]  irq_exit+0xb7/0xc0
>>>>>>         [28337.218978]  xen_evtchn_do_upcall+0x27/0x40
>>>>>>         [28337.230763]  xen_do_hypervisor_callback+0x29/0x40
>>>>>>         [28337.241261]  </IRQ>
>>>>>>         [28337.253283] RIP: e033:0xff7e62
>>>>>>         [28337.264899] Code: 35 43 0f c7 00 4c 89 ef e8 8b 6d 67 ff 0f 1f 00 44 89 e0 44 89 e2 c1 e8 06 83 e2 3f 48 8b 0c c5 40 8d c6 01 48 0f a3 d1 72 0e <48> 8b 04 c5 50 8d c6 01 48 0f a3 d0 73 0b 44 89 e6 4c 89 ef e8 b5
>>>>>>         [28337.288677] RSP: e02b:00007fff0fc6a340 EFLAGS: 00000202
>>>>>>         [28337.299234] RAX: 0000000000000000 RBX: 00007f53e60c3580 RCX: 0000000000000000
>>>>>>         [28337.309577] RDX: 0000000000000034 RSI: 0000000001e71a98 RDI: 00007fff0fc6a538
>>>>>>         [28337.320724] RBP: 00007fff0fc6a4b0 R08: 0000000000000000 R09: 0000000000000000
>>>>>>         [28337.331829] R10: 0000000000000001 R11: 00000000020cb3d0 R12: 0000000000000034
>>>>>>         [28337.343900] R13: 00007fff0fc6a538 R14: 0000000000000000 R15: 0000000000000001
>>>>>>         [28337.353977] ---[ end trace 6ff49f09286816b7 ]---
>>>>>>
>>>>> Thanks for your efforts. As usual this tx timeout trace says basically nothing except
>>>>> "timeout" and root cause could be anything. Earlier you reported a memory allocation error,
>>>>> did that occur again?
>>>>> If we decide to revert, I'd leave removal of the memory barriers in (as it doesn't seem to
>>>>> contribute to the issue) and just submit a patch to effectively revert
>>>>> 2e6eedb4813e34d8d84ac0eb3afb668966f3f356.
>>>>
>>>> I can't say if that is correct, because i haven't tested that.
>>>>
>>>> Another thing I could test is:
>>>>  - putting all the r8169 patches (and prerequisites) that went into 5.0 
>>>>    up to bd7153bd83b806bfcc2e79b7a6f43aa653d06ef3, onto 4.20.7 and see what that does.
>>>>    If that would be feasible (not too many needed prerequisites out of r8169) and if 
>>>>    you could spare me some time and prep such a branch somewhere so i can pull and compile that,
>>>>    that would be great.
>>>>
>>>
>>> Unfortunately there's quite a number of changes. Regarding __netdev_tx_sent_queue()
>>> and watchdog timeout I found the following comment in drivers/net/ethernet/sfc/tx.c,
>>> efx_enqueue_skb():
>>>
>>> 	if (__netdev_tx_sent_queue(tx_queue->core_txq, skb_len, xmit_more)) {
>>> 		struct efx_tx_queue *txq2 = efx_tx_queue_partner(tx_queue);
>>>
>>> 		/* There could be packets left on the partner queue if those
>>> 		 * SKBs had skb->xmit_more set. If we do not push those they
>>> 		 * could be left for a long time and cause a netdev watchdog.
>>> 		 */
>>> 		if (txq2->xmit_more_available)
>>> 			efx_nic_push_buffers(txq2);
>>>
>>> But I'm not sure whether the situation in r8169 is comparable. The following patch
>>> implements what I mentioned earlier: It leaves all other 5.0 changes in place and
>>> effectively reverts 2e6eedb4813e34d8d84ac0eb3afb668966f3f356. Would be great if
>>> you could give it a try.
>>
>> Hi Heiner,
>>
>> It took some time to respond, because I had another issue with 5.0 which interfered with proper testing,
>> but fortunately I could pinpoint it without doing a full bisect and revert that commit for further testing.
>>
>> So there is still time left and I could do a more proper run with your patch below.
>> Unfortunately I still get a splat (see below) with this, although I can't tell
>> whether it is related.
>>
> I checked further and there's a handful of network drivers using __napi_alloc_skb() with __GFP_NOWARN,
> maybe to avoid such splats. Did the splat impact functionality? When checking the code in r8169 the
> affected packet would just be dropped.

Hmm, __GFP_NOWARN would merely hide it.

But I took a good look with some more testing, and it seems to occur when
the system has to use a little bit of swap during a kernel compile.

So it's probably not a problem with the r8169 code.

--
Sander


>> Perhaps Linus as Oops-decoding-guru has an idea ?
>>
>> --
>> Sander
>>
>> [39041.689007] dpkg-deb: page allocation failure: order:0, mode:0x480020(GFP_ATOMIC), nodemask=(null),cpuset=/,mems_allowed=0
>> [39041.689016] CPU: 4 PID: 14078 Comm: dpkg-deb Not tainted 5.0.0-rc5-20190209-kallweit+ #1
>> [39041.689017] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640)  , BIOS V1.8B1 09/13/2010
>> [39041.689018] Call Trace:
>> [39041.689022]  <IRQ>
>> [39041.689030]  dump_stack+0x5c/0x7b
>> [39041.689033]  warn_alloc+0x103/0x190
>> [39041.689036]  __alloc_pages_nodemask+0xe3d/0xe80
>> [39041.689039]  ? ip_rcv+0x48/0xc0
>> [39041.689040]  ? ip_rcv_finish_core.isra.0+0x360/0x360
>> [39041.689042]  page_frag_alloc+0x117/0x150
>> [39041.689044]  __napi_alloc_skb+0x83/0xd0
>> [39041.689048]  rtl8169_poll+0x210/0x640
>> [39041.689051]  net_rx_action+0x23d/0x370
>> [39041.689054]  __do_softirq+0xed/0x229
>> [39041.689058]  irq_exit+0xb7/0xc0
>> [39041.689061]  xen_evtchn_do_upcall+0x27/0x40
>> [39041.689063]  xen_do_hypervisor_callback+0x29/0x40
>> [39041.689064]  </IRQ>
>> [39041.689066] RIP: e030:_atomic_dec_and_lock+0x2/0x40
>> [39041.689068] Code: ff 39 05 c5 c1 c9 00 89 c7 89 c6 76 0f 83 eb 01 83 fb ff 75 d9 5b 89 f8 5d 41 5c c3 0f 0b 90 90 90 90 90 90 90 90 90 90 8b 07 <83> f8 01 74 0c 8d 50 ff f0 0f b1 17 75 f2 31 c0 c3 55 53 48 89 fb
>> [39041.689069] RSP: e02b:ffffc9000705b990 EFLAGS: 00000246
>> [39041.689071] RAX: 0000000000000001 RBX: ffff888017082640 RCX: 0000000000000000
>> [39041.689071] RDX: 0000000000000000 RSI: ffff8880170826c0 RDI: ffff888017082788
>> [39041.689072] RBP: ffff8880170826c0 R08: ffffc9000705bb00 R09: ffffc9000705bb00
>> [39041.689073] R10: ffffc9000705bb58 R11: ffff88807fc17000 R12: ffff888017082788
>> [39041.689073] R13: ffff88806cc8cf58 R14: ffff888017082640 R15: ffff888009990240
>> [39041.689077]  iput+0x63/0x1a0
>> [39041.689079]  __dentry_kill+0xc5/0x170
>> [39041.689080]  shrink_dentry_list+0x93/0x1c0
>> [39041.689082]  prune_dcache_sb+0x4d/0x70
>> [39041.689084]  super_cache_scan+0x104/0x190
>> [39041.689087]  do_shrink_slab+0x12c/0x1e0
>> [39041.689089]  shrink_slab+0xdf/0x2b0
>> [39041.689091]  shrink_node+0x158/0x470
>> [39041.689093]  do_try_to_free_pages+0xd1/0x380
>> [39041.689095]  try_to_free_pages+0xb2/0xe0
>> [39041.689097]  __alloc_pages_nodemask+0x603/0xe80
>> [39041.689099]  ? __pagevec_lru_add_fn+0x1b1/0x290
>> [39041.689102]  alloc_pages_vma+0x7b/0x1c0
>> [39041.689106]  __handle_mm_fault+0xdb3/0x1060
>> [39041.689109]  ? xen_mc_flush+0xc0/0x190
>> [39041.689110]  handle_mm_fault+0xf8/0x200
>> [39041.689113]  __do_page_fault+0x231/0x4a0
>> [39041.689115]  ? page_fault+0x8/0x30
>> [39041.689116]  page_fault+0x1e/0x30
>> [39041.689118] RIP: e033:0x7fb9851d012e
>> [39041.689119] Code: 29 c2 48 3b 15 7b a3 31 00 0f 87 af 00 00 00 0f 10 01 0f 10 49 f0 0f 10 51 e0 0f 10 59 d0 48 83 e9 40 48 83 ea 40 41 0f 29 01 <41> 0f 29 49 f0 41 0f 29 51 e0 41 0f 29 59 d0 49 83 e9 40 48 83 fa
>> [39041.689119] RSP: e02b:00007fb958b36d38 EFLAGS: 00010202
>> [39041.689120] RAX: 00007fb97a617f0e RBX: 000000000000f004 RCX: 00007fb948008be3
>> [39041.689121] RDX: 00000000000080c2 RSI: 00007fb948000b31 RDI: 00007fb97a617f0e
>> [39041.689122] RBP: 00000000000ff062 R08: 0000000000000002 R09: 00007fb97a620000
>> [39041.689123] R10: 0000000000000004 R11: 00007fb97a626f02 R12: 000000000000f005
>> [39041.689123] R13: 00007fb948000b28 R14: 0000562d76b63710 R15: 0000000000000003
>> [39041.689125] Mem-Info:
>> [39041.689130] active_anon:78775 inactive_anon:49211 isolated_anon:0
>>                 active_file:106409 inactive_file:107531 isolated_file:0
>>                 unevictable:552 dirty:175 writeback:0 unstable:0
>>                 slab_reclaimable:13739 slab_unreclaimable:16454
>>                 mapped:1605 shmem:23 pagetables:2900 bounce:0
>>                 free:3681 free_pcp:935 free_cma:0
>> [39041.689132] Node 0 active_anon:315100kB inactive_anon:196844kB active_file:425636kB inactive_file:430124kB unevictable:2208kB isolated(anon):0kB isolated(file):0kB mapped:6420kB dirty:700kB writeback:0kB shmem:92kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
>> [39041.689133] Node 0 DMA free:7480kB min:44kB low:56kB high:68kB active_anon:0kB inactive_anon:7832kB active_file:472kB inactive_file:4kB unevictable:0kB writepending:0kB present:15956kB managed:15872kB mlocked:0kB kernel_stack:0kB pagetables:12kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> [39041.689136] lowmem_reserve[]: 0 1865 1865 1865
>> [39041.689138] Node 0 DMA32 free:7244kB min:19472kB low:21380kB high:23288kB active_anon:315360kB inactive_anon:188144kB active_file:425164kB inactive_file:430120kB unevictable:2208kB writepending:700kB present:2080768kB managed:1674968kB mlocked:2208kB kernel_stack:9632kB pagetables:11588kB bounce:0kB free_pcp:3740kB local_pcp:528kB free_cma:0kB
>> [39041.689140] lowmem_reserve[]: 0 0 0 0
>> [39041.689142] Node 0 DMA: 6*4kB (UME) 6*8kB (UE) 7*16kB (UME) 6*32kB (ME) 5*64kB (UME) 3*128kB (UE) 5*256kB (UME) 2*512kB (ME) 2*1024kB (UE) 1*2048kB (M) 0*4096kB = 7480kB
>> [39041.689148] Node 0 DMA32: 69*4kB (U) 315*8kB (UE) 138*16kB (UE) 70*32kB (UE) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7244kB
>> [39041.689153] 214701 total pagecache pages
>> [39041.689155] 273 pages in swap cache
>> [39041.689156] Swap cache stats: add 100978, delete 100706, find 1158/1257
>> [39041.689156] Free swap  = 3790588kB
>> [39041.689157] Total swap = 4194300kB
>> [39041.689157] 524181 pages RAM
>> [39041.689158] 0 pages HighMem/MovableOnly
>> [39041.689158] 101471 pages reserved
>> [39041.689159] 0 pages cma reserved
>>
>>
>>
>>
>>
>>> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
>>> index e8a112149..3cca2ffb2 100644
>>> --- a/drivers/net/ethernet/realtek/r8169.c
>>> +++ b/drivers/net/ethernet/realtek/r8169.c
>>> @@ -6192,7 +6192,6 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>  	struct device *d = tp_to_dev(tp);
>>>  	dma_addr_t mapping;
>>>  	u32 opts[2], len;
>>> -	bool stop_queue;
>>>  	int frags;
>>>  
>>>  	if (unlikely(!rtl_tx_slots_avail(tp, skb_shinfo(skb)->nr_frags))) {
>>> @@ -6234,6 +6233,8 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>  
>>>  	txd->opts2 = cpu_to_le32(opts[1]);
>>>  
>>> +	netdev_sent_queue(dev, skb->len);
>>> +
>>>  	skb_tx_timestamp(skb);
>>>  
>>>  	/* Force memory writes to complete before releasing descriptor */
>>> @@ -6246,14 +6247,14 @@ static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
>>>  
>>>  	tp->cur_tx += frags + 1;
>>>  
>>> -	stop_queue = !rtl_tx_slots_avail(tp, MAX_SKB_FRAGS);
>>> -	if (unlikely(stop_queue))
>>> -		netif_stop_queue(dev);
>>> -
>>> -	if (__netdev_sent_queue(dev, skb->len, skb->xmit_more))
>>> -		RTL_W8(tp, TxPoll, NPQ);
>>> +	RTL_W8(tp, TxPoll, NPQ);
>>>  
>>> -	if (unlikely(stop_queue)) {
>>> +	if (!rtl_tx_slots_avail(tp, MAX_SKB_FRAGS)) {
>>> +		/* Avoid wrongly optimistic queue wake-up: rtl_tx thread must
>>> +		 * not miss a ring update when it notices a stopped queue.
>>> +		 */
>>> +		smp_wmb();
>>> +		netif_stop_queue(dev);
>>>  		/* Sync with rtl_tx:
>>>  		 * - publish queue status and cur_tx ring index (write barrier)
>>>  		 * - refresh dirty_tx ring index (read barrier).
>>>
>>
>>
> 


Thread overview: 20+ messages
2019-02-08 18:29 Linux 5.0 regression: rtl8169 / kernel BUG at lib/dynamic_queue_limits.c:27! Sander Eikelenboom
2019-02-08 18:52 ` Heiner Kallweit
2019-02-08 20:55   ` Sander Eikelenboom
2019-02-08 21:22     ` Heiner Kallweit
2019-02-08 21:45       ` Sander Eikelenboom
2019-02-08 21:50         ` Heiner Kallweit
2019-02-08 23:09           ` Eric Dumazet
2019-02-09  9:02             ` Heiner Kallweit
2019-02-09  9:34               ` Sander Eikelenboom
2019-02-09  9:59                 ` Heiner Kallweit
2019-02-09 10:07                   ` Sander Eikelenboom
2019-02-09 11:50                     ` Heiner Kallweit
2019-02-10  9:16                       ` Sander Eikelenboom
2019-02-10  9:32                         ` Heiner Kallweit
2019-02-10 11:44                         ` Heiner Kallweit
2019-02-10 13:05                           ` Sander Eikelenboom
2019-02-10 13:57                             ` Heiner Kallweit
2019-02-10 15:50                           ` Sander Eikelenboom
2019-02-08 23:34           ` Sander Eikelenboom
2019-02-09  9:10             ` Heiner Kallweit
