All of lore.kernel.org
 help / color / mirror / Atom feed
From: Martin KaFai Lau <kafai@fb.com>
To: John Fastabend <john.fastabend@gmail.com>
Cc: <jakub@cloudflare.com>, <daniel@iogearbox.net>, <ast@kernel.org>,
	<netdev@vger.kernel.org>, <bpf@vger.kernel.org>
Subject: Re: [bpf PATCH v2 2/3] bpf, sockmap: RCU dereferenced psock may be used outside RCU block
Date: Thu, 25 Jun 2020 23:13:50 -0700	[thread overview]
Message-ID: <20200626061350.seduf7s7cthqbnov@kafai-mbp.dhcp.thefacebook.com> (raw)
In-Reply-To: <159312679888.18340.15248924071966273998.stgit@john-XPS-13-9370>

On Thu, Jun 25, 2020 at 04:13:18PM -0700, John Fastabend wrote:
> If an ingress verdict program specifies message sizes greater than
> skb->len and there is an ENOMEM error due to memory pressure we
> may call the rcv_msg handler outside the strp_data_ready() caller
> context. This is because on an ENOMEM error the strparser will
> retry from a workqueue. The caller currently protects the use of
> psock by calling the strp_data_ready() inside a rcu_read_lock/unlock
> block.
> 
> But, in above workqueue error case the psock is accessed outside
> the read_lock/unlock block of the caller. So instead of using
> psock directly we must do a look up against the sk again to
> ensure the psock is available.
> 
> There is an an ugly piece here where we must handle
> the case where we paused the strp and removed the psock. On
> psock removal we first pause the strparser and then remove
> the psock. If the strparser is paused while an skb is
> scheduled on the workqueue the skb will be dropped on the
> flow and kfree_skb() is called. If the workqueue manages
> to get called before we pause the strparser but runs the rcvmsg
> callback after the psock is removed we will hit the unlikely
> case where we run the sockmap rcvmsg handler but do not have
> a psock. For now we will follow strparser logic and drop the
> skb on the floor with skb_kfree(). This is ugly because the
> data is dropped. To date this has not caused problems in practice
> because either the application controlling the sockmap is
> coordinating with the datapath so that skbs are "flushed"
> before removal or we simply wait for the sock to be closed before
> removing it.
> 
> This patch fixes the describe RCU bug and dropping the skb doesn't
> make things worse. Future patches will improve this by allowing
> the normal case where skbs are not merged to skip the strparser
> altogether. In practice many (most?) use cases have no need to
> merge skbs so its both a code complexity hit as seen above and
> a performance issue. For example, in the Cilium case we always
> set the strparser up to return sbks 1:1 without any merging and
> have avoided above issues.
Thanks for the details explanation.  I have to admit that I cannot
fully comprehend the concurrency situation in skmsg and psock.
The change makes sense to me after reading the description though.

Acked-by: Martin KaFai Lau <kafai@fb.com>

  reply	other threads:[~2020-06-26  6:14 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-25 23:12 [bpf PATCH v2 0/3] Sockmap RCU splat fix John Fastabend
2020-06-25 23:12 ` [bpf PATCH v2 1/3] bpf, sockmap: RCU splat with redirect and strparser error or TLS John Fastabend
2020-06-26  6:10   ` Martin KaFai Lau
2020-06-26 11:11   ` Jakub Sitnicki
2020-06-25 23:13 ` [bpf PATCH v2 2/3] bpf, sockmap: RCU dereferenced psock may be used outside RCU block John Fastabend
2020-06-26  6:13   ` Martin KaFai Lau [this message]
2020-06-29  7:42   ` Jakub Sitnicki
2020-06-25 23:13 ` [bpf PATCH v2 3/3] bpf, sockmap: Add ingres skb tests that utilize merge skbs John Fastabend
2020-06-26  6:14   ` Martin KaFai Lau
2020-06-28 15:40 ` [bpf PATCH v2 0/3] Sockmap RCU splat fix Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200626061350.seduf7s7cthqbnov@kafai-mbp.dhcp.thefacebook.com \
    --to=kafai@fb.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=jakub@cloudflare.com \
    --cc=john.fastabend@gmail.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.