bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
	Martin KaFai Lau <kafai@fb.com>,
	Hangbin Liu <liuhangbin@gmail.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Magnus Karlsson <magnus.karlsson@gmail.com>,
	"Paul E. McKenney" <paulmck@kernel.org>
Subject: Re: [PATCH RFC bpf-next 4/4] i40e: remove rcu_read_lock() around XDP program invocation
Date: Fri, 23 Apr 2021 15:57:04 +0200	[thread overview]
Message-ID: <20210423135704.GC64904@ranger.igk.intel.com> (raw)
In-Reply-To: <161917591996.102337.9559803697014955421.stgit@toke.dk>

On Fri, Apr 23, 2021 at 01:05:20PM +0200, Toke Høiland-Jørgensen wrote:
> From: Toke Høiland-Jørgensen <toke@redhat.com>
> 
> The i40e driver has rcu_read_lock()/rcu_read_unlock() pairs around XDP
> program invocations. However, the actual lifetime of the objects referred
> by the XDP program invocation is longer, all the way through to the call to
> xdp_do_flush(), making the scope of the rcu_read_lock() too small. This
> turns out to be harmless because it all happens in a single NAPI poll
> cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock()
> misleading.

Okay, but what about the lifetime of the xdp_prog itself? Can xdp_prog
change within a single NAPI poll? After reading previous discussions I
would say it can't, right?

There are drivers that have a big RCU critical section in NAPI poll, but it
seems that some read a xdp_prog a single time whereas others read it per
processed frame.

If we are sure that xdp_prog can't change on-the-fly then first low
hanging fruit, at least for the Intel drivers, is to skip a test against
NULL and read it only once at the beginning of NAPI poll. There might be
also other micro-optimizations specific to each drivers that could be done
based on that (that of course read the xdp_prog per each frame).

Or am I nuts?

> 
> Rather than extend the scope of the rcu_read_lock(), just get rid of it
> entirely. With the addition of RCU annotations to the XDP_REDIRECT map
> types that take bh execution into account, lockdep even understands this to
> be safe, so there's really no reason to keep it around.
> 
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_txrx.c |    2 --
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c  |    6 +-----
>  2 files changed, 1 insertion(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> index fc20afc23bfa..3f4c947a5185 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> @@ -2303,7 +2303,6 @@ static struct sk_buff *i40e_run_xdp(struct i40e_ring *rx_ring,
>  	struct bpf_prog *xdp_prog;
>  	u32 act;
>  
> -	rcu_read_lock();
>  	xdp_prog = READ_ONCE(rx_ring->xdp_prog);
>  
>  	if (!xdp_prog)
> @@ -2334,7 +2333,6 @@ static struct sk_buff *i40e_run_xdp(struct i40e_ring *rx_ring,
>  		break;
>  	}
>  xdp_out:
> -	rcu_read_unlock();
>  	return ERR_PTR(-result);
>  }
>  
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index d89c22347d9d..93b349f11d3b 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -153,7 +153,6 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
>  	struct bpf_prog *xdp_prog;
>  	u32 act;
>  
> -	rcu_read_lock();
>  	/* NB! xdp_prog will always be !NULL, due to the fact that
>  	 * this path is enabled by setting an XDP program.
>  	 */
> @@ -162,9 +161,7 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
>  
>  	if (likely(act == XDP_REDIRECT)) {
>  		err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
> -		result = !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED;
> -		rcu_read_unlock();
> -		return result;
> +		return !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED;
>  	}
>  
>  	switch (act) {
> @@ -184,7 +181,6 @@ static int i40e_run_xdp_zc(struct i40e_ring *rx_ring, struct xdp_buff *xdp)
>  		result = I40E_XDP_CONSUMED;
>  		break;
>  	}
> -	rcu_read_unlock();
>  	return result;
>  }
>  
> 

  reply	other threads:[~2021-04-23 14:11 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-23 11:05 [PATCH RFC bpf-next 0/4] Clean up and document RCU-based object protection for XDP_REDIRECT Toke Høiland-Jørgensen
2021-04-23 11:05 ` [PATCH RFC bpf-next 1/4] rcu: Create an unrcu_pointer() to remove __rcu from a pointer Toke Høiland-Jørgensen
2021-04-23 11:05 ` [PATCH RFC bpf-next 2/4] dev: add rcu_read_lock_bh_held() as a valid check when getting a RCU dev ref Toke Høiland-Jørgensen
2021-04-23 11:05 ` [PATCH RFC bpf-next 3/4] xdp: add proper __rcu annotations to redirect map entries Toke Høiland-Jørgensen
2021-04-23 11:05 ` [PATCH RFC bpf-next 4/4] i40e: remove rcu_read_lock() around XDP program invocation Toke Høiland-Jørgensen
2021-04-23 13:57   ` Maciej Fijalkowski [this message]
2021-04-23 20:33     ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210423135704.GC64904@ranger.igk.intel.com \
    --to=maciej.fijalkowski@intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=hawk@kernel.org \
    --cc=kafai@fb.com \
    --cc=liuhangbin@gmail.com \
    --cc=magnus.karlsson@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=paulmck@kernel.org \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).