bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: bjorn.topel@gmail.com, bpf@vger.kernel.org, toke@redhat.com,
	toshiaki.makita1@gmail.com, netdev@vger.kernel.org,
	ast@kernel.org, daniel@iogearbox.net
Subject: Re: [bpf-next PATCH v2 2/2] bpf: xdp, remove no longer required rcu_read_{un}lock()
Date: Tue, 14 Jan 2020 01:31:58 +0100	[thread overview]
Message-ID: <20200114003158.GA6804@ranger.igk.intel.com> (raw)
In-Reply-To: <5e1d34bfbf1f5_78752af1940225b41c@john-XPS-13-9370.notmuch>

On Mon, Jan 13, 2020 at 07:25:51PM -0800, John Fastabend wrote:
> Maciej Fijalkowski wrote:
> > On Sat, Jan 11, 2020 at 06:37:42PM -0800, John Fastabend wrote:
> > > Now that we depend on rcu_call() and synchronize_rcu() to also wait

one last thing since you'll be sending a v3, s/rcu_call/call_rcu.

> > > for preempt_disabled region to complete the rcu read critical section
> > > in __dev_map_flush() is no longer required. Except in a few special
> > > cases in drivers that need it for other reasons.
> > > 
> > > These originally ensured the map reference was safe while a map was
> > > also being free'd. And additionally that bpf program updates via
> > > ndo_bpf did not happen while flush updates were in flight. But flush
> > > by new rules can only be called from preempt-disabled NAPI context.
> > > The synchronize_rcu from the map free path and the rcu_call from the
> > > delete path will ensure the reference there is safe. So lets remove
> > > the rcu_read_lock and rcu_read_unlock pair to avoid any confusion
> > > around how this is being protected.
> > > 
> > > If the rcu_read_lock was required it would mean errors in the above
> > > logic and the original patch would also be wrong.
> > > 
> > > Now that we have done above we put the rcu_read_lock in the driver
> > > code where it is needed in a driver dependent way. I think this
> > > helps readability of the code so we know where and why we are
> > > taking read locks. Most drivers will not need rcu_read_locks here
> > > and further XDP drivers already have rcu_read_locks in their code
> > > paths for reading xdp programs on RX side so this makes it symmetric
> > > where we don't have half of rcu critical sections define in driver
> > > and the other half in devmap.
> > > 
> > > Fixes: 0536b85239b84 ("xdp: Simplify devmap cleanup")
> > > Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> > > ---
> 
> [...]
> 
> > > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > > index 4d7d5434..2c11f82 100644
> > > --- a/drivers/net/virtio_net.c
> > > +++ b/drivers/net/virtio_net.c
> > > @@ -498,12 +498,16 @@ static int virtnet_xdp_xmit(struct net_device *dev,
> > >  	void *ptr;
> > >  	int i;
> > >  
> > > +	rcu_read_lock();
> > > +
> > >  	/* Only allow ndo_xdp_xmit if XDP is loaded on dev, as this
> > >  	 * indicate XDP resources have been successfully allocated.
> > >  	 */
> > >  	xdp_prog = rcu_dereference(rq->xdp_prog);
> > 
> > We could convert that rcu_dereference to rcu_access_pointer so that we
> > don't need the rcu critical section here at all. Actually this was
> > suggested some time ago by David Ahern during the initial discussion
> > around this code. Not sure why we didn't change it.
> 
> Makes sense I'll send a v3 with a middle patch to do this and then drop
> this segment.

Great :)

> 
> > 
> > Veth is also checking the xdp prog presence and it is doing that via
> > rcu_access_pointer so such conversion would make it more common, no?
> 
> veth derefernces rcv netdevice and this accesses it. The logic to
> drop the rcu here is less obvious to me. At least I would have to
> study it closely.

Veth does two rcu derefs in the veth_xmit, one for netdev and one for
xdp_prog and I was referring to a xdp_prog deref which is done via
rcu_access_pointer. So yeah we need to keep the rcu section in there but I
was just making an argument for having the rcu_access_pointer on the
virtio_net side.

> 
> > 
> > xdp_prog is only check against NULL, so quoting the part of comment from
> > rcu_access_pointer:
> > "This is useful when the value of this pointer is accessed, but the pointer
> > is not dereferenced, for example, when testing an RCU-protected pointer
> > against NULL."
> 
> +1 thanks it does make the cleanup nicer.

  reply	other threads:[~2020-01-14  7:39 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-12  2:36 [bpf-next PATCH v2 0/2] xdp devmap improvements cleanup John Fastabend
2020-01-12  2:37 ` [bpf-next PATCH v2 1/2] bpf: xdp, update devmap comments to reflect napi/rcu usage John Fastabend
2020-01-12  7:47   ` Maciej Fijalkowski
2020-01-14  3:27     ` John Fastabend
2020-01-12 11:21   ` Toke Høiland-Jørgensen
2020-01-12  2:37 ` [bpf-next PATCH v2 2/2] bpf: xdp, remove no longer required rcu_read_{un}lock() John Fastabend
2020-01-12  6:49   ` Maciej Fijalkowski
2020-01-14  3:25     ` John Fastabend
2020-01-14  0:31       ` Maciej Fijalkowski [this message]
2020-01-12 11:22   ` Toke Høiland-Jørgensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200114003158.GA6804@ranger.igk.intel.com \
    --to=maciej.fijalkowski@intel.com \
    --cc=ast@kernel.org \
    --cc=bjorn.topel@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=netdev@vger.kernel.org \
    --cc=toke@redhat.com \
    --cc=toshiaki.makita1@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).