All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Jakob Unterwurzacher <jakob.unterwurzacher@theobroma-systems.com>,
	Dave Taht <dave.taht@gmail.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	"David S. Miller" <davem@davemloft.net>,
	"linux-can@vger.kernel.org" <linux-can@vger.kernel.org>,
	Martin Elshuber <martin.elshuber@theobroma-systems.com>
Subject: Re: [bug, bisected] pfifo_fast causes packet reordering
Date: Wed, 21 Mar 2018 11:43:04 -0700	[thread overview]
Message-ID: <4e33aae4-9e87-22b4-7f09-008183ea553a@gmail.com> (raw)
In-Reply-To: <00cc2d41-6861-9a9c-603f-ba8013b2e2ce@theobroma-systems.com>

On 03/21/2018 03:01 AM, Jakob Unterwurzacher wrote:
> On 16.03.18 11:26, Jakob Unterwurzacher wrote:
>> On 15.03.18 23:30, John Fastabend wrote:
>>>> I have reproduced it using two USB network cards connected to each other. The test tool sends UDP packets containing a counter and listens on the other interface, it is available at
>>>> https://github.com/jakob-tsd/pfifo_stress/blob/master/pfifo_stress.py
>>>
>>> Great thanks, can you also run this with taskset to bind to
>>> a single CPU,
>>>
>>>   # taskset 0x1 ./pifof_stress.py
>>>
>>> And let me know if you still see the OOO.
>>
>> Interesting. Looks like it depends on which core it runs on. CPU0 is clean, CPU1 is not.
> 
> So we are at v4.16-rc6 now - have you managed to reproduce this is or should I try to get the revert correct?

I have a theory on what is going on here. Because we now run without
locks we can have multiple qdisc_run() calls running in parallel. Possible
if we send packets using multiple cores.

   --- application ---
     cpu0    cpu1
        |     |
        |     |
    enqueue enqueue   
        |     |
      pfifo_fast
        |     |
    dequeue   dequeue
         \    /
        ndo_xmit

The skb->ooo_okay flag will keep the enqueue side packets in
order. So that is covered.

But on the dequeue side if two cores dequeue in parallel we will
race to ndo ops to ensure packets are in order. Rarely, I guess the
second dequeue could actually call ndo hook before first dequeued
packet. Because usually the dequeue happens on the same queue the
enqueue happened on we don't see this very often. But there seems
to be a case where the driver calls netif_tx_wake_queue() on a
different core (from the rx interrupt context). The wake queue call
then eventually runs the dequeue on a different core. So when taskset
is aligned with the interrupt everything is in-order when it is moved
to a different core we see the OOO.

Thats my theory at least. Are you able to test a patch if I generate
one to fix this?

FWIW the revert on this is trivial, but I think we can fix this
without too much work. Also, if you had a driver tx queue per core
this would not be an issue. 

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 190570f..171f470 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -792,7 +792,6 @@ struct Qdisc_ops pfifo_fast_ops __read_mostly = {
        .dump           =       pfifo_fast_dump,
        .change_tx_queue_len =  pfifo_fast_change_tx_queue_len,
        .owner          =       THIS_MODULE,
-       .static_flags   =       TCQ_F_NOLOCK | TCQ_F_CPUSTATS,
 };
 EXPORT_SYMBOL(pfifo_fast_ops);

> 
> Best regards,
> Jakob

  reply	other threads:[~2018-03-21 18:43 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-13 18:24 [bug, bisected] pfifo_fast causes packet reordering Jakob Unterwurzacher
2018-03-13 18:35 ` Dave Taht
2018-03-14  4:03   ` John Fastabend
2018-03-14 10:09     ` Jakob Unterwurzacher
2018-03-15 18:08     ` Jakob Unterwurzacher
2018-03-15 22:30       ` John Fastabend
2018-03-16 10:26         ` Jakob Unterwurzacher
2018-03-19  6:07           ` Alexander Stein
2018-03-19 12:32           ` Paolo Abeni
2018-03-19 12:56             ` Jakob Unterwurzacher
2018-03-21 10:01           ` Jakob Unterwurzacher
2018-03-21 18:43             ` John Fastabend [this message]
2018-03-21 19:44               ` Jakob Unterwurzacher
2018-03-21 20:52                 ` John Fastabend
2018-03-22 10:16                   ` Jakob Unterwurzacher
2018-03-24 14:26                     ` John Fastabend

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4e33aae4-9e87-22b4-7f09-008183ea553a@gmail.com \
    --to=john.fastabend@gmail.com \
    --cc=dave.taht@gmail.com \
    --cc=davem@davemloft.net \
    --cc=jakob.unterwurzacher@theobroma-systems.com \
    --cc=linux-can@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=martin.elshuber@theobroma-systems.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.