netdev.vger.kernel.org archive mirror
* [Patch net v2 1/2] lwt: disable BH too in run_lwt_bpf()
@ 2020-12-05  7:59 Cong Wang
  2020-12-05  7:59 ` [Patch net v2 2/2] lwt_bpf: replace preempt_disable() with migrate_disable() Cong Wang
  2020-12-07 20:10 ` [Patch net v2 1/2] lwt: disable BH too in run_lwt_bpf() patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Cong Wang @ 2020-12-05  7:59 UTC (permalink / raw)
  To: netdev; +Cc: bpf, Dongdong Wang, Thomas Graf, Alexei Starovoitov, Cong Wang

From: Dongdong Wang <wangdongdong.6@bytedance.com>

The per-cpu bpf_redirect_info is shared among all skb_do_redirect()
callers and BPF redirect helpers. Callers on the RX path all run in BH
context, so disabling preemption alone is not sufficient to keep a BH
from interrupting the LWT path on the same CPU.

In production, we observed strange packet drops because of the race
condition between LWT xmit and TC ingress, and we verified this issue
is fixed after we disable BH.

Although this bug has technically existed from the beginning, that is,
since commit 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure"),
at that time call_rcu() had to be call_rcu_bh() to match the RCU context.
So this patch may not work well on kernels that predate the RCU flavor
consolidation, which was completed around v5.0.

Update the comments above the code too, as call_rcu() is now BH friendly.

Cc: Thomas Graf <tgraf@suug.ch>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Dongdong Wang <wangdongdong.6@bytedance.com>
---
 net/core/lwt_bpf.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index 7d3438215f32..4f3cb7c15ddf 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -39,12 +39,11 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
 {
 	int ret;
 
-	/* Preempt disable is needed to protect per-cpu redirect_info between
-	 * BPF prog and skb_do_redirect(). The call_rcu in bpf_prog_put() and
-	 * access to maps strictly require a rcu_read_lock() for protection,
-	 * mixing with BH RCU lock doesn't work.
+	/* Preempt disable and BH disable are needed to protect per-cpu
+	 * redirect_info between BPF prog and skb_do_redirect().
 	 */
 	preempt_disable();
+	local_bh_disable();
 	bpf_compute_data_pointers(skb);
 	ret = bpf_prog_run_save_cb(lwt->prog, skb);
 
@@ -78,6 +77,7 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
 		break;
 	}
 
+	local_bh_enable();
 	preempt_enable();
 
 	return ret;
-- 
2.25.1
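
For readers of the archive, the race described in the commit message can
be sketched as a small userspace model. This is only an illustration under
stated assumptions, not kernel code: redirect_info_model, tc_ingress_model(),
rx_packet_arrives() and lwt_xmit_model() are hypothetical names, the
bh_disabled flag merely imitates the effect of local_bh_disable(), and the
model assumes that the TC ingress path sets and then clears the shared
target, so a target of 0 stands for a dropped packet.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: stands in for one CPU's bpf_redirect_info. */
struct redirect_info_model {
	int tgt_index;	/* 0 means "no redirect target set" */
};

static struct redirect_info_model ri;
static bool bh_disabled;	/* imitates local_bh_disable() being in effect */
static bool bh_pending;		/* a softirq was raised while BH was disabled */

/* Hypothetical TC ingress path: sets its own target, then consumes it. */
static void tc_ingress_model(void)
{
	ri.tgt_index = 42;	/* redirect helper called by the TC program */
	ri.tgt_index = 0;	/* redirect is consumed and the slot cleared */
}

/* Models a packet arriving: run the softirq now, or defer it. */
static void rx_packet_arrives(void)
{
	if (bh_disabled)
		bh_pending = true;
	else
		tc_ingress_model();
}

/* Returns the target the LWT path ends up using (0 = packet dropped). */
static int lwt_xmit_model(bool disable_bh, int ifindex)
{
	int seen;

	if (disable_bh)
		bh_disabled = true;

	ri.tgt_index = ifindex;	/* redirect helper called by the LWT program */
	rx_packet_arrives();	/* worst case: RX traffic hits the window */
	seen = ri.tgt_index;	/* the redirect step reads the shared target */
	ri.tgt_index = 0;

	if (disable_bh) {
		bh_disabled = false;
		if (bh_pending) {	/* deferred softirq runs once BH is enabled */
			bh_pending = false;
			tc_ingress_model();
		}
	}
	return seen;
}
```

In this model, lwt_xmit_model(false, 7) loses its target because the
simulated softirq overwrites the shared slot, while lwt_xmit_model(true, 7)
still sees 7 because the softirq is deferred until after the slot is read.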



* [Patch net v2 2/2] lwt_bpf: replace preempt_disable() with migrate_disable()
  2020-12-05  7:59 [Patch net v2 1/2] lwt: disable BH too in run_lwt_bpf() Cong Wang
@ 2020-12-05  7:59 ` Cong Wang
  2020-12-07 20:10 ` [Patch net v2 1/2] lwt: disable BH too in run_lwt_bpf() patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Cong Wang @ 2020-12-05  7:59 UTC (permalink / raw)
  To: netdev; +Cc: bpf, Cong Wang, Alexei Starovoitov

From: Cong Wang <cong.wang@bytedance.com>

On non-RT kernels, migrate_disable() is just a wrapper for
preempt_disable(), so the replacement is safe there, and RT kernels
will benefit.

Note that migrate_disable() was only introduced in Feb 2020.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 net/core/lwt_bpf.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index 4f3cb7c15ddf..2f7940bcf715 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -39,10 +39,10 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
 {
 	int ret;
 
-	/* Preempt disable and BH disable are needed to protect per-cpu
+	/* Migration disable and BH disable are needed to protect per-cpu
 	 * redirect_info between BPF prog and skb_do_redirect().
 	 */
-	preempt_disable();
+	migrate_disable();
 	local_bh_disable();
 	bpf_compute_data_pointers(skb);
 	ret = bpf_prog_run_save_cb(lwt->prog, skb);
@@ -78,7 +78,7 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt,
 	}
 
 	local_bh_enable();
-	preempt_enable();
+	migrate_enable();
 
 	return ret;
 }
-- 
2.25.1



* Re: [Patch net v2 1/2] lwt: disable BH too in run_lwt_bpf()
  2020-12-05  7:59 [Patch net v2 1/2] lwt: disable BH too in run_lwt_bpf() Cong Wang
  2020-12-05  7:59 ` [Patch net v2 2/2] lwt_bpf: replace preempt_disable() with migrate_disable() Cong Wang
@ 2020-12-07 20:10 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2020-12-07 20:10 UTC (permalink / raw)
  To: Cong Wang
  Cc: netdev, bpf, wangdongdong.6, tgraf, alexei.starovoitov, cong.wang

Hello:

This series was applied to bpf/bpf.git (refs/heads/master):

On Fri,  4 Dec 2020 23:59:45 -0800 you wrote:
> From: Dongdong Wang <wangdongdong.6@bytedance.com>
> 
> The per-cpu bpf_redirect_info is shared among all skb_do_redirect()
> and BPF redirect helpers. Callers on RX path are all in BH context,
> disabling preemption is not sufficient to prevent BH interruption.
> 
> In production, we observed strange packet drops because of the race
> condition between LWT xmit and TC ingress, and we verified this issue
> is fixed after we disable BH.
> 
> [...]

Here is the summary with links:
  - [net,v2,1/2] lwt: disable BH too in run_lwt_bpf()
    https://git.kernel.org/bpf/bpf/c/d9054a1ff585
  - [net,v2,2/2] lwt_bpf: replace preempt_disable() with migrate_disable()
    https://git.kernel.org/bpf/bpf/c/e3366884b383

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



