All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org, tariqt@mellanox.com, saeedm@mellanox.com
Subject: Re: [PATCH v2 net] net: solve a NAPI race
Date: Mon, 27 Feb 2017 08:44:14 -0800	[thread overview]
Message-ID: <1488213854.9415.198.camel@edumazet-glaptop3.roam.corp.google.com> (raw)
In-Reply-To: <20170227.111944.1725806340309799464.davem@davemloft.net>

On Mon, 2017-02-27 at 11:19 -0500, David Miller wrote:

> Various rules were meant to protect these sequences, and make sure
> nothing like this race could happen.
> 
> Can you show the specific sequence that fails?
> 
> One of the basic protections is that the device IRQ is not re-enabled
> until napi_complete_done() is finished, most drivers do something like
> this:
> 
> 	napi_complete_done();
> 		- sets NAPI_STATE_SCHED
> 	enable device IRQ
> 
> So I don't understand how it is possible that "later an IRQ firing and
> finding this bit set, right before napi_complete_done() clears it".
> 
> While napi_complete_done() is running, the device's IRQ is still
> disabled, so there cannot be an IRQ firing before napi_complete_done()
> is finished.


Any point doing a napi_schedule() not from device hard irq handler
is subject to the race for NIC using some kind of edge trigger interrupts.

Since we do not provide a ndo to disable device interrupts,
the following can happen.

thread 1                                 thread 2 (could be on same cpu)

// busy polling or napi_watchdog()
napi_schedule();
...
napi->poll()

device polling:
read 2 packets from ring buffer
                                          Additional 3rd packet is available.
                                          device hard irq

                                          // does nothing because NAPI_STATE_SCHED bit is owned by thread 1
                                          napi_schedule();
                                          
napi_complete_done(napi, 2);
rearm_irq();


Note that rearm_irq() will not force the device to send an additional IRQ
for the packet it already signaled (3rd packet in my example)

At least for mlx4, only 4th packet will trigger the IRQ again.

In the old days, the race would not happen since napi->poll() was called
in direct response to a prior device IRQ :
Edge triggered hard irqs from the device for this queue were already disabled.

  reply	other threads:[~2017-02-27 16:44 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-25 14:22 [PATCH net] net/mlx4_en: reception NAPI/IRQ race breaker Eric Dumazet
2017-02-26 16:32 ` Saeed Mahameed
2017-02-26 17:34   ` Eric Dumazet
2017-02-26 17:40     ` Eric Dumazet
2017-02-26 18:03       ` Eric Dumazet
2017-02-27  3:31 ` [PATCH net] net: solve a NAPI race Eric Dumazet
2017-02-27 14:06   ` Eric Dumazet
2017-02-27 14:21   ` [PATCH v2 " Eric Dumazet
2017-02-27 16:19     ` David Miller
2017-02-27 16:44       ` Eric Dumazet [this message]
2017-02-27 17:10         ` Eric Dumazet
2017-02-28  2:08         ` David Miller
2017-03-01  0:22           ` Francois Romieu
2017-03-01  1:04             ` Stephen Hemminger
2017-02-27 21:00       ` Alexander Duyck
2017-02-27 20:18     ` [PATCH v3 " Eric Dumazet
2017-02-27 22:14       ` Stephen Hemminger
2017-02-27 22:35         ` Eric Dumazet
2017-02-27 22:44           ` Stephen Hemminger
2017-02-27 22:48             ` David Miller
2017-02-27 23:23               ` Stephen Hemminger
2017-02-28 10:14           ` David Laight
2017-02-28 13:04             ` Eric Dumazet
2017-02-28 13:20             ` Eric Dumazet
2017-02-28 16:17       ` Stephen Hemminger
2017-02-28 16:57         ` Eric Dumazet
2017-02-28 18:34       ` [PATCH v4 " Eric Dumazet
2017-03-01 17:53         ` David Miller
2017-02-28 17:20     ` [PATCH v2 " Alexander Duyck
2017-02-28 17:47       ` Eric Dumazet
2017-03-01 10:41       ` David Laight
2017-03-01 16:14         ` Alexander Duyck
2017-03-01 17:32           ` Eric Dumazet
2017-03-02 10:24             ` David Laight

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1488213854.9415.198.camel@edumazet-glaptop3.roam.corp.google.com \
    --to=eric.dumazet@gmail.com \
    --cc=davem@davemloft.net \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@mellanox.com \
    --cc=tariqt@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.