From: Alexander Duyck <alexander.duyck@gmail.com>
To: Jakub Kicinski <kuba@kernel.org>
Cc: Heiner Kallweit <hkallweit1@gmail.com>,
	davem@davemloft.net, netdev@vger.kernel.org, edumazet@google.com,
	pabeni@redhat.com, Herbert Xu <herbert@gondor.apana.org.au>,
	"Paul E. McKenney" <paulmck@kernel.org>
Subject: Re: [PATCH net-next 1/3] net: provide macros for commonly copied lockless queue stop/wake code
Date: Mon, 3 Apr 2023 13:27:44 -0700
Message-ID: <CAKgT0Ue-hEycSyYvVJt0L5Z=373MyNPbgPjFZMA5j2v0hWg0zg@mail.gmail.com>
In-Reply-To: <20230403120345.0c02232c@kernel.org>

On Mon, Apr 3, 2023 at 12:03 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 3 Apr 2023 11:11:35 -0700 Alexander Duyck wrote:
> > On Mon, Apr 3, 2023 at 8:56 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > I don't think in terms of flushes. Let me add line numbers to the
> > > producer and the consumer.
> > >
> > >  c1. WRITE cons
> > >  c2. mb()  # A
> > >  c3. READ stopped
> > >  c4. rmb() # C
> > >  c5. READ prod, cons
> > >
> > >  p1. WRITE prod
> > >  p2. READ prod, cons
> > >  p3. mb()  # B
> > >  p4. WRITE stopped
> > >  p5. READ prod, cons
> > >
> > > The way I think the mb() orders c1 and c3 vs p2 and p4. The rmb()
> > > orders c3 and c5 vs p1 and p4. Let me impenitently add Paul..
> >
> > So which function is supposed to be consumer vs producer here?
>
> producer is xmit consumer is NAPI
>
> > I think your write stopped is on the wrong side of the memory barrier.
> > It should be writing prod and stopped both before the barrier.
>
> Indeed, Paul pointed out over chat that we need two barriers there
> to be correct :( Should be fine in practice, first one is BQL,
> second one is on the slow path.
>
> > The maybe/try stop should essentially be:
> > 1. write tail
> > 2. read prod/cons
> > 3. if unused >= 1x packet
> > 3.a return
> >
> > 4. set stop
> > 5. mb()
> > 6. Re-read prod/cons
> > 7. if unused >= 1x packet
> > 7.a. test_and_clear stop
> >
> > The maybe/try wake would be:
> > 1. write head
> > 2. read prod/cons
> > 3. if consumed == 0 || unused < 2x packet
> > 3.a. return
> >
> > 4. mb()
> > 5. test_and_clear stop
> >
> > > > One other thing to keep in mind is that the wake gives itself a pretty
> > > > good runway. We are talking about enough to transmit at least 2
> > > > frames. So if another consumer is stopping it we aren't waking it
> > > > unless there is enough space for yet another frame after the current
> > > > consumer.
> > >
> > > Ack, the race is very unlikely, basically the completing CPU would have
> > > to take an expensive IRQ between checking the descriptor count and
> > > checking if stopped -- to let the sending CPU queue multiple frames.
> > >
> > > But in theory the race is there, right?
> >
> > I don't think this is so much a race as a skid. Specifically when we
> > wake the queue it will only run for one more packet in such a
> > scenario. I think it is being run more like a flow control threshold
> > rather than some sort of lock.
> >
> > I think I see what you are getting at though. Basically if the xmit
> > function were to cycle several times between steps 3.a and 4 in the
> > maybe/try wake it could fill the queue and then trigger the wake even
> > though the queue is full and the unused space was already consumed.
>
> Yup, exactly. So we either need to sprinkle a couple more barriers
> and tests in, or document that the code is only 99.999999% safe
> against false positive restarts and drivers need to check for ring
> full at the beginning of xmit.
>
> I'm quite tempted to add the barriers, because on the NAPI/consumer
> side we could use this as an opportunity to start piggy backing on
> the BQL barrier.

The thing is, the more barriers we add, the more it will hurt
performance. I'd be tempted to just increase the runway instead: we
could afford a 1-packet skid if we had a 2-packet runway for the
start/stop thresholds.

I suspect that is why we haven't seen any issues: DESC_NEEDED is
pretty generous, since it assumes worst-case scenarios.
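
For reference, the maybe/try stop and maybe/try wake steps quoted above
can be sketched as a small user-space model. This is only an
illustration under stated assumptions, not the macros from the patch:
struct ring, ring_unused(), ring_maybe_stop(), ring_maybe_wake() and
RING_SIZE are hypothetical names, DESC_NEEDED is given an arbitrary
value, and a C11 seq_cst fence stands in for the kernel's mb(). It does
not close the skid discussed above; it only shows where the thresholds
and barriers sit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE   16u  /* hypothetical ring size, in descriptors */
#define DESC_NEEDED 4u   /* assumed worst-case descriptors per packet */

struct ring {
	atomic_uint prod;    /* advanced by xmit (producer) */
	atomic_uint cons;    /* advanced by NAPI completion (consumer) */
	atomic_bool stopped; /* models the queue "stopped" bit */
};

/* Read prod/cons and return the number of free descriptors. */
static unsigned int ring_unused(struct ring *r)
{
	unsigned int prod = atomic_load_explicit(&r->prod, memory_order_relaxed);
	unsigned int cons = atomic_load_explicit(&r->cons, memory_order_relaxed);

	assert(prod - cons <= RING_SIZE);
	return RING_SIZE - (prod - cons);
}

/* xmit side; returns true if the queue is left stopped. */
static bool ring_maybe_stop(struct ring *r, unsigned int queued)
{
	/* 1. write tail */
	atomic_fetch_add_explicit(&r->prod, queued, memory_order_relaxed);

	/* 2-3. room for at least one more packet: nothing to do */
	if (ring_unused(r) >= DESC_NEEDED)
		return false;

	/* 4. set stop */
	atomic_store_explicit(&r->stopped, true, memory_order_relaxed);
	/* 5. mb(), modelled here as a sequentially consistent fence */
	atomic_thread_fence(memory_order_seq_cst);

	/* 6-7. re-read; the consumer may have freed space meanwhile */
	if (ring_unused(r) >= DESC_NEEDED) {
		/* 7.a. test_and_clear stop */
		atomic_exchange(&r->stopped, false);
		return false;
	}
	return true;
}

/* NAPI side; returns true if this call cleared the stopped bit. */
static bool ring_maybe_wake(struct ring *r, unsigned int completed)
{
	/* 1. write head */
	atomic_fetch_add_explicit(&r->cons, completed, memory_order_relaxed);

	/* 2-3. nothing completed, or less than a 2x packet runway */
	if (completed == 0 || ring_unused(r) < 2 * DESC_NEEDED)
		return false;

	/* 4. mb() */
	atomic_thread_fence(memory_order_seq_cst);
	/* 5. test_and_clear stop */
	return atomic_exchange(&r->stopped, false);
}
```

With these values the stop threshold is one packet (4 descriptors) and
the wake threshold is two packets (8 descriptors), so a queue woken at
the threshold still has room for one packet after the one that raced
with the wake.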
