bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 net] bpf: do not return NET_XMIT_xxx values on bpf_redirect
@ 2023-07-19  3:30 Yan Zhai
  2023-07-19  3:42 ` Alexei Starovoitov
  0 siblings, 1 reply; 3+ messages in thread
From: Yan Zhai @ 2023-07-19  3:30 UTC (permalink / raw)
  To: netdev
  Cc: kernel-team, Martin KaFai Lau, Daniel Borkmann, John Fastabend,
	Alexei Starovoitov, Andrii Nakryiko, Song Liu, Yonghong Song,
	KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf,
	netdev, linux-kernel, Jordan Griege

skb_do_redirect handles returns error code from both rx and tx path. The
tx path codes are special, e.g. NET_XMIT_CN: they are non-negative, and
can conflict with LWTUNNEL_XMIT_xxx values. Directly returning such code
can cause unexpected behavior. We found at least one bug that will panic
the kernel through KASAN report when we are redirecting packets to a
down or carrier-down device at lwt xmit hook:

https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48

Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the
down device, and it propagates from dev_queue_xmit all way to the lwt
logic. The result is skb that has been freed by the qdisc continues to
neighbor subsystem and triggers the bug.

This change converts the tx code to proper errors that lwt can consume.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Reported-by: Jordan Griege <jgriege@cloudflare.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
---
v2: coding style fix; sent to netdev instead of bpf for bug fixing.

---
 net/core/filter.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 06ba0e56e369..8738c7a4701d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2129,6 +2129,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
 	ret = dev_queue_xmit(skb);
 	dev_xmit_recursion_dec();
 
+	if (unlikely(ret > 0))
+		ret = net_xmit_errno(ret);
+
 	return ret;
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2 net] bpf: do not return NET_XMIT_xxx values on bpf_redirect
  2023-07-19  3:30 [PATCH v2 net] bpf: do not return NET_XMIT_xxx values on bpf_redirect Yan Zhai
@ 2023-07-19  3:42 ` Alexei Starovoitov
  2023-07-20  2:48   ` Yan Zhai
  0 siblings, 1 reply; 3+ messages in thread
From: Alexei Starovoitov @ 2023-07-19  3:42 UTC (permalink / raw)
  To: Yan Zhai
  Cc: Network Development, kernel-team, Martin KaFai Lau,
	Daniel Borkmann, John Fastabend, Alexei Starovoitov,
	Andrii Nakryiko, Song Liu, Yonghong Song, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf, LKML,
	Jordan Griege

On Tue, Jul 18, 2023 at 8:30 PM Yan Zhai <yan@cloudflare.com> wrote:
>
> skb_do_redirect handles returns error code from both rx and tx path. The
> tx path codes are special, e.g. NET_XMIT_CN: they are non-negative, and
> can conflict with LWTUNNEL_XMIT_xxx values. Directly returning such code
> can cause unexpected behavior. We found at least one bug that will panic
> the kernel through KASAN report when we are redirecting packets to a
> down or carrier-down device at lwt xmit hook:
>
> https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48
>
> Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the
> down device, and it propagates from dev_queue_xmit all way to the lwt
> logic. The result is skb that has been freed by the qdisc continues to
> neighbor subsystem and triggers the bug.

I'm struggling to parse the above paragraph.
Where bpf prog is installed?
Is this lwt bpf prog that returns BPF_REDIRECT ?
that redirects to netdev with noop_qdisc ?
What is the topology?

Please add a selftest to make sure we don't regress.

Also pls mark your patch as [PATCH v3 bpf] when you respin.

> This change converts the tx code to proper errors that lwt can consume.
>
> Suggested-by: Stanislav Fomichev <sdf@google.com>
> Reported-by: Jordan Griege <jgriege@cloudflare.com>
> Signed-off-by: Yan Zhai <yan@cloudflare.com>
> ---
> v2: coding style fix; sent to netdev instead of bpf for bug fixing.
>
> ---
>  net/core/filter.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 06ba0e56e369..8738c7a4701d 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2129,6 +2129,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
>         ret = dev_queue_xmit(skb);
>         dev_xmit_recursion_dec();
>
> +       if (unlikely(ret > 0))
> +               ret = net_xmit_errno(ret);
> +
>         return ret;
>  }
>
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2 net] bpf: do not return NET_XMIT_xxx values on bpf_redirect
  2023-07-19  3:42 ` Alexei Starovoitov
@ 2023-07-20  2:48   ` Yan Zhai
  0 siblings, 0 replies; 3+ messages in thread
From: Yan Zhai @ 2023-07-20  2:48 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Network Development, kernel-team, Martin KaFai Lau,
	Daniel Borkmann, John Fastabend, Alexei Starovoitov,
	Andrii Nakryiko, Song Liu, Yonghong Song, KP Singh,
	Stanislav Fomichev, Hao Luo, Jiri Olsa, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, bpf, LKML,
	Jordan Griege

On Tue, Jul 18, 2023 at 10:42 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jul 18, 2023 at 8:30 PM Yan Zhai <yan@cloudflare.com> wrote:
> >
> > skb_do_redirect handles returns error code from both rx and tx path. The
> > tx path codes are special, e.g. NET_XMIT_CN: they are non-negative, and
> > can conflict with LWTUNNEL_XMIT_xxx values. Directly returning such code
> > can cause unexpected behavior. We found at least one bug that will panic
> > the kernel through KASAN report when we are redirecting packets to a
> > down or carrier-down device at lwt xmit hook:
> >
> > https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48
> >
> > Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the
> > down device, and it propagates from dev_queue_xmit all way to the lwt
> > logic. The result is skb that has been freed by the qdisc continues to
> > neighbor subsystem and triggers the bug.
>
> I'm struggling to parse the above paragraph.
> Where bpf prog is installed?
> Is this lwt bpf prog that returns BPF_REDIRECT ?
> that redirects to netdev with noop_qdisc ?
> What is the topology?
>
Sorry for the confusion. Mentioning noop_qdisc is an explanation of
what happened. The actual trigger is simple: install a bpf program on
lwt route at xmit hook. It bpf_redirect packets to a device FOO. If
FOO is down or carrier-down, redirected packets will crash the kernel.

> Please add a selftest to make sure we don't regress.
>
> Also pls mark your patch as [PATCH v3 bpf] when you respin.
>
Ack

> > This change converts the tx code to proper errors that lwt can consume.
> >
> > Suggested-by: Stanislav Fomichev <sdf@google.com>
> > Reported-by: Jordan Griege <jgriege@cloudflare.com>
> > Signed-off-by: Yan Zhai <yan@cloudflare.com>
> > ---
> > v2: coding style fix; sent to netdev instead of bpf for bug fixing.
> >
> > ---
> >  net/core/filter.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 06ba0e56e369..8738c7a4701d 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -2129,6 +2129,9 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
> >         ret = dev_queue_xmit(skb);
> >         dev_xmit_recursion_dec();
> >
> > +       if (unlikely(ret > 0))
> > +               ret = net_xmit_errno(ret);
> > +
> >         return ret;
> >  }
> >
> > --
> > 2.30.2
> >



-- 

Yan

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-07-20  2:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-19  3:30 [PATCH v2 net] bpf: do not return NET_XMIT_xxx values on bpf_redirect Yan Zhai
2023-07-19  3:42 ` Alexei Starovoitov
2023-07-20  2:48   ` Yan Zhai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).