All of lore.kernel.org
 help / color / mirror / Atom feed
* [bug, bisected] pfifo_fast causes packet reordering
@ 2018-03-13 18:24 Jakob Unterwurzacher
  2018-03-13 18:35 ` Dave Taht
  0 siblings, 1 reply; 16+ messages in thread
From: Jakob Unterwurzacher @ 2018-03-13 18:24 UTC (permalink / raw)
  To: netdev, linux-kernel, John Fastabend, David S. Miller
  Cc: linux-can, Martin Elshuber

During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on 
Linux v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of 
packets are delivered out-of-order.

We have tracked the problem down to the driver interface level, and it 
seems that the driver's net_device_ops.ndo_start_xmit() function gets 
the packets handed over in the wrong order.

This behavior was not observed on Linux v4.15 and I have bisected the 
problem down to this patch:

> commit c5ad119fb6c09b0297446be05bd66602fa564758
> Author: John Fastabend <john.fastabend@gmail.com>
> Date:   Thu Dec 7 09:58:19 2017 -0800
> 
>    net: sched: pfifo_fast use skb_array
> 
>    This converts the pfifo_fast qdisc to use the skb_array data structure
>    and set the lockless qdisc bit. pfifo_fast is the first qdisc to support
>    the lockless bit that can be a child of a qdisc requiring locking. So
>    we add logic to clear the lock bit on initialization in these cases when
>    the qdisc graft operation occurs.
> 
>    This also removes the logic used to pick the next band to dequeue from
>    and instead just checks a per priority array for packets from top priority
>    to lowest. This might need to be a bit more clever but seems to work
>    for now.
> 
>    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>    Signed-off-by: David S. Miller <davem@davemloft.net>

The patch does not revert cleanly, but moving to one commit earlier 
makes the problem go away.

Selecting the "fq" scheduler instead of "pfifo_fast" makes the problem 
go away as well.

Is this an unintended side-effect of the patch or is there something the 
driver has to do to request in-order delivery?

Thanks,
Jakob

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-13 18:24 [bug, bisected] pfifo_fast causes packet reordering Jakob Unterwurzacher
@ 2018-03-13 18:35 ` Dave Taht
  2018-03-14  4:03   ` John Fastabend
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Taht @ 2018-03-13 18:35 UTC (permalink / raw)
  To: Jakob Unterwurzacher
  Cc: netdev, linux-kernel, John Fastabend, David S. Miller, linux-can,
	Martin Elshuber

On Tue, Mar 13, 2018 at 11:24 AM, Jakob Unterwurzacher
<jakob.unterwurzacher@theobroma-systems.com> wrote:
> During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux
> v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are
> delivered out-of-order.
>
> We have tracked the problem down to the driver interface level, and it seems
> that the driver's net_device_ops.ndo_start_xmit() function gets the packets
> handed over in the wrong order.
>
> This behavior was not observed on Linux v4.15 and I have bisected the
> problem down to this patch:
>
>> commit c5ad119fb6c09b0297446be05bd66602fa564758
>> Author: John Fastabend <john.fastabend@gmail.com>
>> Date:   Thu Dec 7 09:58:19 2017 -0800
>>
>>    net: sched: pfifo_fast use skb_array
>>
>>    This converts the pfifo_fast qdisc to use the skb_array data structure
>>    and set the lockless qdisc bit. pfifo_fast is the first qdisc to
>> support
>>    the lockless bit that can be a child of a qdisc requiring locking. So
>>    we add logic to clear the lock bit on initialization in these cases
>> when
>>    the qdisc graft operation occurs.
>>
>>    This also removes the logic used to pick the next band to dequeue from
>>    and instead just checks a per priority array for packets from top
>> priority
>>    to lowest. This might need to be a bit more clever but seems to work
>>    for now.
>>
>>    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>>    Signed-off-by: David S. Miller <davem@davemloft.net>
>
>
> The patch does not revert cleanly, but moving to one commit earlier makes
> the problem go away.
>
> Selecting the "fq" scheduler instead of "pfifo_fast" makes the problem go
> away as well.

I am of course, a fan of obsoleting pfifo_fast. There's no good reason
for it anymore.

>
> Is this an unintended side-effect of the patch or is there something the
> driver has to do to request in-order delivery?
>
> Thanks,
> Jakob



-- 

Dave Täht
CEO, TekLibre, LLC
http://www.teklibre.com
Tel: 1-669-226-2619

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-13 18:35 ` Dave Taht
@ 2018-03-14  4:03   ` John Fastabend
  2018-03-14 10:09     ` Jakob Unterwurzacher
  2018-03-15 18:08     ` Jakob Unterwurzacher
  0 siblings, 2 replies; 16+ messages in thread
From: John Fastabend @ 2018-03-14  4:03 UTC (permalink / raw)
  To: Dave Taht, Jakob Unterwurzacher
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 03/13/2018 11:35 AM, Dave Taht wrote:
> On Tue, Mar 13, 2018 at 11:24 AM, Jakob Unterwurzacher
> <jakob.unterwurzacher@theobroma-systems.com> wrote:
>> During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux
>> v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are
>> delivered out-of-order.
>>

Is the stress-testing tool available somewhere? What type of packets
are being sent?


>> We have tracked the problem down to the driver interface level, and it seems
>> that the driver's net_device_ops.ndo_start_xmit() function gets the packets
>> handed over in the wrong order.
>>
>> This behavior was not observed on Linux v4.15 and I have bisected the
>> problem down to this patch:
>>
>>> commit c5ad119fb6c09b0297446be05bd66602fa564758
>>> Author: John Fastabend <john.fastabend@gmail.com>
>>> Date:   Thu Dec 7 09:58:19 2017 -0800
>>>
>>>    net: sched: pfifo_fast use skb_array
>>>
>>>    This converts the pfifo_fast qdisc to use the skb_array data structure
>>>    and set the lockless qdisc bit. pfifo_fast is the first qdisc to
>>> support
>>>    the lockless bit that can be a child of a qdisc requiring locking. So
>>>    we add logic to clear the lock bit on initialization in these cases
>>> when
>>>    the qdisc graft operation occurs.
>>>
>>>    This also removes the logic used to pick the next band to dequeue from
>>>    and instead just checks a per priority array for packets from top
>>> priority
>>>    to lowest. This might need to be a bit more clever but seems to work
>>>    for now.
>>>
>>>    Signed-off-by: John Fastabend <john.fastabend@gmail.com>
>>>    Signed-off-by: David S. Miller <davem@davemloft.net>
>>
>>
>> The patch does not revert cleanly, but moving to one commit earlier makes
>> the problem go away.
>>
>> Selecting the "fq" scheduler instead of "pfifo_fast" makes the problem go
>> away as well.
> 

Is this a single queue device or a multiqueue device? Running
'tc -s qdisc show dev foo' would help some.

> I am of course, a fan of obsoleting pfifo_fast. There's no good reason
> for it anymore.
> 
>>
>> Is this an unintended side-effect of the patch or is there something the
>> driver has to do to request in-order delivery?
>>
If we introduced a OOO edge case somewhere that was not
intended so I'll take a look into it. But, if you can provide
a bit more details on how stress testing is done to cause the
issue that would help.

Thanks,
John

>> Thanks,
>> Jakob
> 
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-14  4:03   ` John Fastabend
@ 2018-03-14 10:09     ` Jakob Unterwurzacher
  2018-03-15 18:08     ` Jakob Unterwurzacher
  1 sibling, 0 replies; 16+ messages in thread
From: Jakob Unterwurzacher @ 2018-03-14 10:09 UTC (permalink / raw)
  To: John Fastabend, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 14.03.18 05:03, John Fastabend wrote:
>>> During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux
>>> v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are
>>> delivered out-of-order.
>>>
> 
> Is the stress-testing tool available somewhere? What type of packets
> are being sent?

Not public, no, the problem is that you'd need a CAN adapter as well.

The test is simple, sending CAN frames with an increasing counter and a 
random length payload:

[  tx thread                      rx thread    ]
        |                              ^
        v                              |
[ interface 0 ] ---- cable ----> [ interface 1 ]

I'll see if I can come up with a UDP testcase that works with normal 
ethernet interfaces.

> Is this a single queue device or a multiqueue device? Running
> 'tc -s qdisc show dev foo' would help some.

Here you go:

root@rk3399-q7:~# tc -s qdisc show dev can0
qdisc pfifo_fast 0: root refcnt 2 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 
1 1 1 1 1
  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
  backlog 0b 0p requeues 0

> If we introduced a OOO edge case somewhere that was not
> intended so I'll take a look into it. But, if you can provide
> a bit more details on how stress testing is done to cause the
> issue that would help.

Will do.

Thanks,
Jakob

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-14  4:03   ` John Fastabend
  2018-03-14 10:09     ` Jakob Unterwurzacher
@ 2018-03-15 18:08     ` Jakob Unterwurzacher
  2018-03-15 22:30       ` John Fastabend
  1 sibling, 1 reply; 16+ messages in thread
From: Jakob Unterwurzacher @ 2018-03-15 18:08 UTC (permalink / raw)
  To: John Fastabend, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 14.03.18 05:03, John Fastabend wrote:
> On 03/13/2018 11:35 AM, Dave Taht wrote:
>> On Tue, Mar 13, 2018 at 11:24 AM, Jakob Unterwurzacher
>> <jakob.unterwurzacher@theobroma-systems.com> wrote:
>>> During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux
>>> v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are
>>> delivered out-of-order.
>>>
> 
> Is the stress-testing tool available somewhere? What type of packets
> are being sent?


I have reproduced it using two USB network cards connected to each 
other. The test tool sends UDP packets containing a counter and listens 
on the other interface, it is available at
https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py

Here is what I get:

root@rk3399-q7:~# ./pfifo_stress.py
[...]
expected ctr 0xcdc, received 0xcdd
expected ctr 0xcde, received 0xcdc
expected ctr 0xcdd, received 0xcde
expected ctr 0xe3c, received 0xe3d
expected ctr 0xe3e, received 0xe3c
expected ctr 0xe3d, received 0xe3e
expected ctr 0x1097, received 0x1098
expected ctr 0x1099, received 0x1097
expected ctr 0x1098, received 0x1099
expected ctr 0x17c0, received 0x17c1
expected ctr 0x17c2, received 0x17c0
[...]

Best regards,
Jakob

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-15 18:08     ` Jakob Unterwurzacher
@ 2018-03-15 22:30       ` John Fastabend
  2018-03-16 10:26         ` Jakob Unterwurzacher
  0 siblings, 1 reply; 16+ messages in thread
From: John Fastabend @ 2018-03-15 22:30 UTC (permalink / raw)
  To: Jakob Unterwurzacher, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 03/15/2018 11:08 AM, Jakob Unterwurzacher wrote:
> On 14.03.18 05:03, John Fastabend wrote:
>> On 03/13/2018 11:35 AM, Dave Taht wrote:
>>> On Tue, Mar 13, 2018 at 11:24 AM, Jakob Unterwurzacher
>>> <jakob.unterwurzacher@theobroma-systems.com> wrote:
>>>> During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux
>>>> v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are
>>>> delivered out-of-order.
>>>>
>>
>> Is the stress-testing tool available somewhere? What type of packets
>> are being sent?
> 
> 
> I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at
> https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py
> 

Great thanks, can you also run this with taskset to bind to
a single CPU,

 # taskset 0x1 ./pifof_stress.py

And let me know if you still see the OOO.

> Here is what I get:
> 
> root@rk3399-q7:~# ./pfifo_stress.py
> [...]
> expected ctr 0xcdc, received 0xcdd
> expected ctr 0xcde, received 0xcdc
> expected ctr 0xcdd, received 0xcde
> expected ctr 0xe3c, received 0xe3d
> expected ctr 0xe3e, received 0xe3c
> expected ctr 0xe3d, received 0xe3e
> expected ctr 0x1097, received 0x1098
> expected ctr 0x1099, received 0x1097
> expected ctr 0x1098, received 0x1099
> expected ctr 0x17c0, received 0x17c1
> expected ctr 0x17c2, received 0x17c0
> [...]
> 
> Best regards,
> Jakob

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-15 22:30       ` John Fastabend
@ 2018-03-16 10:26         ` Jakob Unterwurzacher
  2018-03-19  6:07           ` Alexander Stein
                             ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Jakob Unterwurzacher @ 2018-03-16 10:26 UTC (permalink / raw)
  To: John Fastabend, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 15.03.18 23:30, John Fastabend wrote:
>> I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at
>> https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py
>>
> 
> Great thanks, can you also run this with taskset to bind to
> a single CPU,
> 
>   # taskset 0x1 ./pifof_stress.py
> 
> And let me know if you still see the OOO.

Interesting. Looks like it depends on which core it runs on. CPU0 is 
clean, CPU1 is not.

Clean:	taskset --cpu-list 0 ./pfifo_stress.py

Broken:	taskset --cpu-list 1 ./pfifo_stress.py

Maybe related: CPU0 is where USB interrupts are handled:

> root@rk3399-q7:~# cat /proc/interrupts           
>            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> 217:    2175353          0          0          0          0          0     GICv3 142 Level     xhci-hcd:usb5
Thanks,
Jakob

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-16 10:26         ` Jakob Unterwurzacher
@ 2018-03-19  6:07           ` Alexander Stein
  2018-03-19 12:32           ` Paolo Abeni
  2018-03-21 10:01           ` Jakob Unterwurzacher
  2 siblings, 0 replies; 16+ messages in thread
From: Alexander Stein @ 2018-03-19  6:07 UTC (permalink / raw)
  To: Jakob Unterwurzacher
  Cc: John Fastabend, Dave Taht, netdev, linux-kernel, David S. Miller,
	linux-can, Martin Elshuber

On Friday, March 16, 2018, 11:26:47 AM CET Jakob Unterwurzacher wrote:
> On 15.03.18 23:30, John Fastabend wrote:
> >> I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at
> >> https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py
> >>
> > 
> > Great thanks, can you also run this with taskset to bind to
> > a single CPU,
> > 
> >   # taskset 0x1 ./pifof_stress.py
> > 
> > And let me know if you still see the OOO.
> 
> Interesting. Looks like it depends on which core it runs on. CPU0 is 
> clean, CPU1 is not.
> 
> Clean:	taskset --cpu-list 0 ./pfifo_stress.py
> 
> Broken:	taskset --cpu-list 1 ./pfifo_stress.py
> 
> Maybe related: CPU0 is where USB interrupts are handled:
> 
> > root@rk3399-q7:~# cat /proc/interrupts           
> >            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> > 217:    2175353          0          0          0          0          0     GICv3 142 Level     xhci-hcd:usb5

This reminds me somewhat of this thread: https://marc.info/?l=linux-can&m=148007442317274&w=2

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-16 10:26         ` Jakob Unterwurzacher
  2018-03-19  6:07           ` Alexander Stein
@ 2018-03-19 12:32           ` Paolo Abeni
  2018-03-19 12:56             ` Jakob Unterwurzacher
  2018-03-21 10:01           ` Jakob Unterwurzacher
  2 siblings, 1 reply; 16+ messages in thread
From: Paolo Abeni @ 2018-03-19 12:32 UTC (permalink / raw)
  To: Jakob Unterwurzacher, John Fastabend, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

Hi,

On Fri, 2018-03-16 at 11:26 +0100, Jakob Unterwurzacher wrote:
> On 15.03.18 23:30, John Fastabend wrote:
> > > I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at
> > > https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py
> > > 
> > 
> > Great thanks, can you also run this with taskset to bind to
> > a single CPU,
> > 
> >   # taskset 0x1 ./pifof_stress.py
> > 
> > And let me know if you still see the OOO.
> 
> Interesting. Looks like it depends on which core it runs on. CPU0 is 
> clean, CPU1 is not.
> 
> Clean:	taskset --cpu-list 0 ./pfifo_stress.py
> 
> Broken:	taskset --cpu-list 1 ./pfifo_stress.py
> 
> Maybe related: CPU0 is where USB interrupts are handled:
> 
> > root@rk3399-q7:~# cat /proc/interrupts           
> >            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
> > 217:    2175353          0          0          0          0          0     GICv3 142 Level     xhci-hcd:usb5


Is not clear to me if you can reproduce the bug with the vanilla
kernel, or if  you need some out-of-tree nic driver. Can you please
clarify which NIC/driver are you using?

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-19 12:32           ` Paolo Abeni
@ 2018-03-19 12:56             ` Jakob Unterwurzacher
  0 siblings, 0 replies; 16+ messages in thread
From: Jakob Unterwurzacher @ 2018-03-19 12:56 UTC (permalink / raw)
  To: Paolo Abeni, John Fastabend, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 19.03.18 13:32, Paolo Abeni wrote:
> Is not clear to me if you can reproduce the bug with the vanilla
> kernel, or if  you need some out-of-tree nic driver. Can you please
> clarify which NIC/driver are you using?

Yes I reproduced it with a vanilla kernel. I use two off-the-shelf USB 
NICs, lsusb says:

   Bus 005 Device 013: ID 0b95:7720 ASIX Electronics Corp. AX88772
   Bus 005 Device 014: ID 0b95:772b ASIX Electronics Corp. AX88772B

Photo of the setup:
https://github.com/jakob-tsd/pfifo_stress/blob/master/usb_nics.jpg

Best regards,
Jakob

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-16 10:26         ` Jakob Unterwurzacher
  2018-03-19  6:07           ` Alexander Stein
  2018-03-19 12:32           ` Paolo Abeni
@ 2018-03-21 10:01           ` Jakob Unterwurzacher
  2018-03-21 18:43             ` John Fastabend
  2 siblings, 1 reply; 16+ messages in thread
From: Jakob Unterwurzacher @ 2018-03-21 10:01 UTC (permalink / raw)
  To: John Fastabend, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 16.03.18 11:26, Jakob Unterwurzacher wrote:
> On 15.03.18 23:30, John Fastabend wrote:
>>> I have reproduced it using two USB network cards connected to each 
>>> other. The test tool sends UDP packets containing a counter and 
>>> listens on the other interface, it is available at
>>> https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py
>>
>> Great thanks, can you also run this with taskset to bind to
>> a single CPU,
>>
>>   # taskset 0x1 ./pifof_stress.py
>>
>> And let me know if you still see the OOO.
> 
> Interesting. Looks like it depends on which core it runs on. CPU0 is 
> clean, CPU1 is not.

So we are at v4.16-rc6 now - have you managed to reproduce this is or 
should I try to get the revert correct?

Best regards,
Jakob

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-21 10:01           ` Jakob Unterwurzacher
@ 2018-03-21 18:43             ` John Fastabend
  2018-03-21 19:44               ` Jakob Unterwurzacher
  0 siblings, 1 reply; 16+ messages in thread
From: John Fastabend @ 2018-03-21 18:43 UTC (permalink / raw)
  To: Jakob Unterwurzacher, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 03/21/2018 03:01 AM, Jakob Unterwurzacher wrote:
> On 16.03.18 11:26, Jakob Unterwurzacher wrote:
>> On 15.03.18 23:30, John Fastabend wrote:
>>>> I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at
>>>> https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py
>>>
>>> Great thanks, can you also run this with taskset to bind to
>>> a single CPU,
>>>
>>>   # taskset 0x1 ./pifof_stress.py
>>>
>>> And let me know if you still see the OOO.
>>
>> Interesting. Looks like it depends on which core it runs on. CPU0 is clean, CPU1 is not.
> 
> So we are at v4.16-rc6 now - have you managed to reproduce this is or should I try to get the revert correct?

I have a theory on what is going on here. Because we now run without
locks we can have multiple qdisc_run() calls running in parallel. Possible
if we send packets using multiple cores.

   --- application ---
     cpu0    cpu1
        |     |
        |     |
    enqueue enqueue   
        |     |
      pfifo_fast
        |     |
    dequeue   dequeue
         \    /
        ndo_xmit

The skb->ooo_okay flag will keep the enqueue side packets in
order. So that is covered.

But on the dequeue side if two cores dequeue in parallel we will
race to ndo ops to ensure packets are in order. Rarely, I guess the
second dequeue could actually call ndo hook before first dequeued
packet. Because usually the dequeue happens on the same queue the
enqueue happened on we don't see this very often. But there seems
to be a case where the driver calls netif_tx_wake_queue() on a
different core (from the rx interrupt context). The wake queue call
then eventually runs the dequeue on a different core. So when taskset
is aligned with the interrupt everything is in-order when it is moved
to a different core we see the OOO.

Thats my theory at least. Are you able to test a patch if I generate
one to fix this?

FWIW the revert on this is trivial, but I think we can fix this
without too much work. Also, if you had a driver tx queue per core
this would not be an issue. 

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 190570f..171f470 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -792,7 +792,6 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
        .dump           =       pfifo_fast_dump,
        .change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
        .owner          =       THIS_MODULE,
-       .static_flags   =       TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
 };
 EXPORT_SYMBOL(pfifo_fast_ops);

> 
> Best regards,
> Jakob

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-21 18:43             ` John Fastabend
@ 2018-03-21 19:44               ` Jakob Unterwurzacher
  2018-03-21 20:52                 ` John Fastabend
  0 siblings, 1 reply; 16+ messages in thread
From: Jakob Unterwurzacher @ 2018-03-21 19:44 UTC (permalink / raw)
  To: John Fastabend, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 21.03.18 19:43, John Fastabend wrote:
> Thats my theory at least. Are you able to test a patch if I generate
> one to fix this?

Yes, no problem.

I just tested with the flag change you suggested (see below, I had to 
keep TCQ_F_CPUSTATS to prevent a crash) and I have NOT seen OOO so far.

Thanks,
Jakob


diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 190570f21b20..51b68ef4977b 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -792,7 +792,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
         .dump           =       pfifo_fast_dump,
         .change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
         .owner          =       THIS_MODULE,
-       .static_flags   =       TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
+       .static_flags   =       TCQ_F_CPUSTATS,
  };
  EXPORT_SYMBOL(pfifo_fast_ops);

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-21 19:44               ` Jakob Unterwurzacher
@ 2018-03-21 20:52                 ` John Fastabend
  2018-03-22 10:16                   ` Jakob Unterwurzacher
  0 siblings, 1 reply; 16+ messages in thread
From: John Fastabend @ 2018-03-21 20:52 UTC (permalink / raw)
  To: Jakob Unterwurzacher, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 03/21/2018 12:44 PM, Jakob Unterwurzacher wrote:
> On 21.03.18 19:43, John Fastabend wrote:
>> Thats my theory at least. Are you able to test a patch if I generate
>> one to fix this?
> 
> Yes, no problem.

Can you try this,

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index d4907b5..1e596bd 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -30,6 +30,7 @@ struct qdisc_rate_table {
 enum qdisc_state_t {
        __QDISC_STATE_SCHED,
        __QDISC_STATE_DEACTIVATED,
+       __QDISC_STATE_RUNNING,
 };
 
 struct qdisc_size_table {
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 190570f..cf7c37d 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -377,20 +377,26 @@ static inline bool qdisc_restart(struct Qdisc *q, int *packets)
        struct netdev_queue *txq;
        struct net_device *dev;
        struct sk_buff *skb;
-       bool validate;
+       bool more, validate;
 
        /* Dequeue packet */
+       if (test_and_set_bit(__QDISC_STATE_RUNNING, &q->state))
+               return false;
+
        skb = dequeue_skb(q, &validate, packets);
-       if (unlikely(!skb))
+       if (unlikely(!skb)) {
+               clear_bit(__QDISC_STATE_RUNNING, &q->state);
                return false;
+       }
 
        if (!(q->flags & TCQ_F_NOLOCK))
                root_lock = qdisc_lock(q);
 
        dev = qdisc_dev(q);
        txq = skb_get_tx_queue(dev, skb);
-
-       return sch_direct_xmit(skb, q, dev, txq, root_lock, validate);
+       more = sch_direct_xmit(skb, q, dev, txq, root_lock, validate);
+       clear_bit(__QDISC_STATE_RUNNING, &q->state);
+       return more;
 }


> 
> I just tested with the flag change you suggested (see below, I had to keep TCQ_F_CPUSTATS to prevent a crash) and I have NOT seen OOO so far.
> 

Right because the code expects per cpu stats if the CPUSTATS flag is
removed it will crash.

> Thanks,
> Jakob
> 
> 
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 190570f21b20..51b68ef4977b 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -792,7 +792,7 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
>         .dump           =       pfifo_fast_dump,
>         .change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
>         .owner          =       THIS_MODULE,
> -       .static_flags   =       TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
> +       .static_flags   =       TCQ_F_CPUSTATS,
>  };
>  EXPORT_SYMBOL(pfifo_fast_ops);

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-21 20:52                 ` John Fastabend
@ 2018-03-22 10:16                   ` Jakob Unterwurzacher
  2018-03-24 14:26                     ` John Fastabend
  0 siblings, 1 reply; 16+ messages in thread
From: Jakob Unterwurzacher @ 2018-03-22 10:16 UTC (permalink / raw)
  To: John Fastabend, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 21.03.18 21:52, John Fastabend wrote:
> Can you try this,
> 
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index d4907b5..1e596bd 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -30,6 +30,7 @@ struct qdisc_rate_table {
>   enum qdisc_state_t {
>          __QDISC_STATE_SCHED,
>          __QDISC_STATE_DEACTIVATED,
> +       __QDISC_STATE_RUNNING,
>   };
> [...]

Tested, looks good. No OOO observed, no side effects observed, iperf 
numbers on Gigabit Ethernet look the same.

Thanks,
Jakob

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [bug, bisected] pfifo_fast causes packet reordering
  2018-03-22 10:16                   ` Jakob Unterwurzacher
@ 2018-03-24 14:26                     ` John Fastabend
  0 siblings, 0 replies; 16+ messages in thread
From: John Fastabend @ 2018-03-24 14:26 UTC (permalink / raw)
  To: Jakob Unterwurzacher, Dave Taht
  Cc: netdev, linux-kernel, David S. Miller, linux-can, Martin Elshuber

On 03/22/2018 03:16 AM, Jakob Unterwurzacher wrote:
> On 21.03.18 21:52, John Fastabend wrote:
>> Can you try this,
>>
>> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
>> index d4907b5..1e596bd 100644
>> --- a/include/net/sch_generic.h
>> +++ b/include/net/sch_generic.h
>> @@ -30,6 +30,7 @@ struct qdisc_rate_table {
>>   enum qdisc_state_t {
>>          __QDISC_STATE_SCHED,
>>          __QDISC_STATE_DEACTIVATED,
>> +       __QDISC_STATE_RUNNING,
>>   };
>> [...]
> 
> Tested, looks good. No OOO observed, no side effects observed, iperf
> numbers on Gigabit Ethernet look the same.
> 
> Thanks,
> Jakob

Thanks, I'll send out an official patch today.

.John

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-03-24 14:26 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-13 18:24 [bug, bisected] pfifo_fast causes packet reordering Jakob Unterwurzacher
2018-03-13 18:35 ` Dave Taht
2018-03-14  4:03   ` John Fastabend
2018-03-14 10:09     ` Jakob Unterwurzacher
2018-03-15 18:08     ` Jakob Unterwurzacher
2018-03-15 22:30       ` John Fastabend
2018-03-16 10:26         ` Jakob Unterwurzacher
2018-03-19  6:07           ` Alexander Stein
2018-03-19 12:32           ` Paolo Abeni
2018-03-19 12:56             ` Jakob Unterwurzacher
2018-03-21 10:01           ` Jakob Unterwurzacher
2018-03-21 18:43             ` John Fastabend
2018-03-21 19:44               ` Jakob Unterwurzacher
2018-03-21 20:52                 ` John Fastabend
2018-03-22 10:16                   ` Jakob Unterwurzacher
2018-03-24 14:26                     ` John Fastabend

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.