bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Jakub Sitnicki <jakub@cloudflare.com>,
	John Fastabend <john.fastabend@gmail.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, kernel-team@cloudflare.com
Subject: Re: [PATCH bpf 0/3] Fix locking order and synchronization on sockmap/sockhash tear-down
Date: Tue, 10 Mar 2020 09:44:24 -0700	[thread overview]
Message-ID: <5e67c3e83fb25_1e8a2b0e88e0a5bc84@john-XPS-13-9370.notmuch> (raw)
In-Reply-To: <87zhdun0ay.fsf@cloudflare.com>

Jakub Sitnicki wrote:
> On Thu, Feb 06, 2020 at 08:43 PM CET, John Fastabend wrote:
> > Jakub Sitnicki wrote:
> >> Couple of fixes that came from recent discussion [0] on commit
> >> 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down").
> >>
> >> This series doesn't address the sleeping while holding a spinlock
> >> problem. We're still trying to decide how to fix that [1].
> >>
> >> Until then sockmap users might see the following warnings:
> >>
> >> | BUG: sleeping function called from invalid context at net/core/sock.c:2935
> 

[...]

> Hey John,

Patch sent.

> 
> > Untested at the moment, but this should also be fine per your suggestion
> > (if I read it correctly).  The reason we have stab->lock and bucket->locks
> > here is to handle checking EEXIST in update/delete cases. We need to
> > be careful that when an update happens and we check for EEXIST that the
> > socket is added/removed during this check. So both map_update_common and
> > sock_map_delete need to guard from being run together potentially deleting
> > an entry we are checking, etc.
> 
> Okay, thanks for explanation. IOW, we're serializing map writers.
> 
> > But by the time we get here we just did a synchronize_rcu() in the
> > line above so no updates/deletes should be in flight. So it seems safe
> > to drop these locks because of the condition no updates in flight.
> 
> This part is not clear to me. I might be missing something.
> 
> Here's my thinking - for any map writes (update/delete) to start,
> map->refcnt needs to be > 0, and the ref is not dropped until the write
> operation has finished.
> 
> Map FDs hold a ref to map until the FD gets released. And BPF progs hold
> refs to maps until the prog gets unloaded.
> 
> This would mean that map_free will get scheduled from __bpf_map_put only
> when no one is holding a map ref, and could start a write that would be
> happening concurrently with sock_{map,hash}_free:

Sorry bringing back this old thread I'm not sure I followed the couple
paragraphs here. Is this with regards to the lock or the rcu? II didn't
want to just drop this thanks.

We can't have new updates/lookups/deletes happening while we are free'ing
a map that would cause all sorts of problems, use after free's, etc.

> 
> /* decrement map refcnt and schedule it for freeing via workqueue
>  * (unrelying map implementation ops->map_free() might sleep)
>  */
> static void __bpf_map_put(struct bpf_map *map, bool do_idr_lock)
> {
> 	if (atomic64_dec_and_test(&map->refcnt)) {
> 		/* bpf_map_free_id() must be called first */
> 		bpf_map_free_id(map, do_idr_lock);
> 		btf_put(map->btf);
> 		INIT_WORK(&map->work, bpf_map_free_deferred);
> 		schedule_work(&map->work);
> 	}
> }
> 
> > So with patch below we keep the sync rcu but that is fine IMO these
> > map free's are rare. Take a look and make sure it seems sane to you
> > as well.
> 
> I can't vouch for the need to keep synchronize_rcu here because I don't
> understand that part, but otherwise the change LGTM.
> 
> -jkbs
>

[...]

  reply	other threads:[~2020-03-10 16:44 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-06 11:16 [PATCH bpf 0/3] Fix locking order and synchronization on sockmap/sockhash tear-down Jakub Sitnicki
2020-02-06 11:16 ` [PATCH bpf 1/3] bpf, sockmap: Don't sleep while holding RCU lock on tear-down Jakub Sitnicki
2020-02-06 18:59   ` John Fastabend
2020-02-06 11:16 ` [PATCH bpf 2/3] bpf, sockhash: synchronize_rcu before free'ing map Jakub Sitnicki
2020-02-06 19:01   ` John Fastabend
2020-02-06 11:16 ` [PATCH bpf 3/3] selftests/bpf: Test freeing sockmap/sockhash with a socket in it Jakub Sitnicki
2020-02-06 19:03   ` John Fastabend
2020-02-09  2:41   ` Alexei Starovoitov
2020-02-09  4:12     ` Yonghong Song
2020-02-09 15:29     ` Jakub Sitnicki
2020-02-10  3:55       ` John Fastabend
2020-02-10 11:52         ` Jakub Sitnicki
2020-02-10 23:38           ` Daniel Borkmann
2020-02-06 19:43 ` [PATCH bpf 0/3] Fix locking order and synchronization on sockmap/sockhash tear-down John Fastabend
2020-02-07 10:45   ` Jakub Sitnicki
2020-03-10 16:44     ` John Fastabend [this message]
2020-03-11 11:51       ` Jakub Sitnicki
2020-03-10 11:30   ` Jakub Sitnicki
2020-02-07 21:56 ` Daniel Borkmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5e67c3e83fb25_1e8a2b0e88e0a5bc84@john-XPS-13-9370.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=jakub@cloudflare.com \
    --cc=kernel-team@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).