bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
@ 2022-07-15 11:55 Zhengchao Shao
  2022-07-15 23:30 ` Stanislav Fomichev
                   ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Zhengchao Shao @ 2022-07-15 11:55 UTC (permalink / raw)
  To: bpf, netdev, linux-kernel, davem, edumazet, kuba, pabeni, hawk
  Cc: ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, bigeasy, imagedong, petrm, arnd, dsahern,
	talalahmad, keescook, haoluo, jolsa, weiyongjun1, yuehaibing,
	shaozhengchao

Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
skbs, that is, the flow->head is null.
The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
run a bpf prog which redirects empty skbs.
So we should determine whether the length of the packet modified by bpf
prog or others like bpf_prog_test is valid before forwarding it directly.

LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html

Reported-by: syzbot+7a12909485b94426aceb@syzkaller.appspotmail.com
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
---
v3: modify debug print
v2: need move checking to convert___skb_to_skb and add debug info
v1: should not check len in fast path

 include/linux/skbuff.h | 8 ++++++++
 net/bpf/test_run.c     | 3 +++
 net/core/dev.c         | 1 +
 3 files changed, 12 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f6a27ab19202..82e8368ba6e6 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2459,6 +2459,14 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
 
 #endif /* NET_SKBUFF_DATA_USES_OFFSET */
 
+static inline void skb_assert_len(struct sk_buff *skb)
+{
+#ifdef CONFIG_DEBUG_NET
+	if (WARN_ONCE(!skb->len, "%s\n", __func__))
+		DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
+#endif /* CONFIG_DEBUG_NET */
+}
+
 /*
  *	Add data to an sk_buff
  */
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 2ca96acbc50a..dc9dc0bedca0 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -955,6 +955,9 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
 {
 	struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb;
 
+	if (!skb->len)
+		return -EINVAL;
+
 	if (!__skb)
 		return 0;
 
diff --git a/net/core/dev.c b/net/core/dev.c
index d588fd0a54ce..716df64fcfa5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4168,6 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
 	bool again = false;
 
 	skb_reset_mac_header(skb);
+	skb_assert_len(skb);
 
 	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
 		__skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-07-15 11:55 [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len Zhengchao Shao
@ 2022-07-15 23:30 ` Stanislav Fomichev
  2022-07-19 17:00 ` patchwork-bot+netdevbpf
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: Stanislav Fomichev @ 2022-07-15 23:30 UTC (permalink / raw)
  To: Zhengchao Shao
  Cc: bpf, netdev, linux-kernel, davem, edumazet, kuba, pabeni, hawk,
	ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, bigeasy, imagedong, petrm, arnd, dsahern, talalahmad,
	keescook, haoluo, jolsa, weiyongjun1, yuehaibing

On Fri, Jul 15, 2022 at 4:51 AM Zhengchao Shao <shaozhengchao@huawei.com> wrote:
>
> Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
> skbs, that is, the flow->head is null.
> The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
> run a bpf prog which redirects empty skbs.
> So we should determine whether the length of the packet modified by bpf
> prog or others like bpf_prog_test is valid before forwarding it directly.
>
> LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
>
> Reported-by: syzbot+7a12909485b94426aceb@syzkaller.appspotmail.com
> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>

Reviewed-by: Stanislav Fomichev <sdf@google.com>

Looks like it addresses everything?

> ---
> v3: modify debug print
> v2: need move checking to convert___skb_to_skb and add debug info
> v1: should not check len in fast path
>
>  include/linux/skbuff.h | 8 ++++++++
>  net/bpf/test_run.c     | 3 +++
>  net/core/dev.c         | 1 +
>  3 files changed, 12 insertions(+)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f6a27ab19202..82e8368ba6e6 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2459,6 +2459,14 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
>
>  #endif /* NET_SKBUFF_DATA_USES_OFFSET */
>
> +static inline void skb_assert_len(struct sk_buff *skb)
> +{
> +#ifdef CONFIG_DEBUG_NET
> +       if (WARN_ONCE(!skb->len, "%s\n", __func__))
> +               DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
> +#endif /* CONFIG_DEBUG_NET */
> +}
> +
>  /*
>   *     Add data to an sk_buff
>   */
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 2ca96acbc50a..dc9dc0bedca0 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -955,6 +955,9 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
>  {
>         struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb;
>
> +       if (!skb->len)
> +               return -EINVAL;
> +
>         if (!__skb)
>                 return 0;
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index d588fd0a54ce..716df64fcfa5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4168,6 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>         bool again = false;
>
>         skb_reset_mac_header(skb);
> +       skb_assert_len(skb);
>
>         if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
>                 __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-07-15 11:55 [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len Zhengchao Shao
  2022-07-15 23:30 ` Stanislav Fomichev
@ 2022-07-19 17:00 ` patchwork-bot+netdevbpf
  2022-09-14 11:19 ` Lorenz Bauer
  2022-11-03 21:07 ` Martin KaFai Lau
  3 siblings, 0 replies; 19+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-07-19 17:00 UTC (permalink / raw)
  To: Zhengchao Shao
  Cc: bpf, netdev, linux-kernel, davem, edumazet, kuba, pabeni, hawk,
	ast, daniel, andrii, martin.lau, song, yhs, john.fastabend,
	kpsingh, sdf, bigeasy, imagedong, petrm, arnd, dsahern,
	talalahmad, keescook, haoluo, jolsa, weiyongjun1, yuehaibing

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Fri, 15 Jul 2022 19:55:59 +0800 you wrote:
> Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
> skbs, that is, the flow->head is null.
> The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
> run a bpf prog which redirects empty skbs.
> So we should determine whether the length of the packet modified by bpf
> prog or others like bpf_prog_test is valid before forwarding it directly.
> 
> [...]

Here is the summary with links:
  - [v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
    https://git.kernel.org/bpf/bpf-next/c/fd1894224407

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-07-15 11:55 [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len Zhengchao Shao
  2022-07-15 23:30 ` Stanislav Fomichev
  2022-07-19 17:00 ` patchwork-bot+netdevbpf
@ 2022-09-14 11:19 ` Lorenz Bauer
  2022-09-17 15:46   ` Stanislav Fomichev
  2022-11-03 21:07 ` Martin KaFai Lau
  3 siblings, 1 reply; 19+ messages in thread
From: Lorenz Bauer @ 2022-09-14 11:19 UTC (permalink / raw)
  To: shaozhengchao, ast, daniel
  Cc: bpf, linux-kernel, sdf, yuehaibing, Lorenz Bauer

Hi,

I think this patch is causing user-space breakage, see [0].

The gist is that we do BPF_PROG_RUN of a socket filter with 14 byte input to determine whether
BPF_PROG_RUN is available or not. I'll fix this in cilium/ebpf, but I think this patch
needs more work since users may be doing the same thing in their code.


Thanks,
Lorenz

0: https://github.com/cilium/ebpf/pull/788

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-09-14 11:19 ` Lorenz Bauer
@ 2022-09-17 15:46   ` Stanislav Fomichev
  2022-09-19 10:55     ` shaozhengchao
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Fomichev @ 2022-09-17 15:46 UTC (permalink / raw)
  To: Lorenz Bauer; +Cc: shaozhengchao, ast, daniel, bpf, linux-kernel, yuehaibing

On Wed, Sep 14, 2022 at 4:20 AM Lorenz Bauer <oss@lmb.io> wrote:
>
> Hi,
>
> I think this patch is causing user-space breakage, see [0].
>
> The gist is that we do BPF_PROG_RUN of a socket filter with 14 byte input to determine whether
> BPF_PROG_RUN is available or not. I'll fix this in cilium/ebpf, but I think this patch
> needs more work since users may be doing the same thing in their code.

Ooops, sorry about that.

Instead of rejecting len=0 data, we might accept the packet but add
some safe header? I think that should be more backwards compatible?
Zhengchao, something you can look into?


> Thanks,
> Lorenz
>
> 0: https://github.com/cilium/ebpf/pull/788

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-09-17 15:46   ` Stanislav Fomichev
@ 2022-09-19 10:55     ` shaozhengchao
  2022-09-20 14:42       ` Lorenz Bauer
  0 siblings, 1 reply; 19+ messages in thread
From: shaozhengchao @ 2022-09-19 10:55 UTC (permalink / raw)
  To: Stanislav Fomichev, Lorenz Bauer
  Cc: ast, daniel, bpf, linux-kernel, yuehaibing



On 2022/9/17 23:46, Stanislav Fomichev wrote:
> On Wed, Sep 14, 2022 at 4:20 AM Lorenz Bauer <oss@lmb.io> wrote:
>>
>> Hi,
>>
>> I think this patch is causing user-space breakage, see [0].
>>
>> The gist is that we do BPF_PROG_RUN of a socket filter with 14 byte input to determine whether
>> BPF_PROG_RUN is available or not. I'll fix this in cilium/ebpf, but I think this patch
>> needs more work since users may be doing the same thing in their code.
> 
> Ooops, sorry about that.
> 
> Instead of rejecting len=0 data, we might accept the packet but add
> some safe header? I think that should be more backwards compatible?
> Zhengchao, something you can look into?
> 
> 
Sorry for the delay. I'm busy testing the TC module recently. I'm very 
sorry for the user-space breakage.

The root cause of this problem is that eth_type_trans() is called when
the protocol type of the SKB is parsed. The len value of the SKB is
reduced to 0. If the user mode requires that the forwarding succeed, or
  if the MAC header is added again after the MAC header is subtracted, 
is this appropriate?

Zhengchao Shao
>> Thanks,
>> Lorenz
>>
>> 0: https://github.com/cilium/ebpf/pull/788

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-09-19 10:55     ` shaozhengchao
@ 2022-09-20 14:42       ` Lorenz Bauer
  2022-09-21  8:48         ` shaozhengchao
  0 siblings, 1 reply; 19+ messages in thread
From: Lorenz Bauer @ 2022-09-20 14:42 UTC (permalink / raw)
  To: shaozhengchao, Stanislav Fomichev
  Cc: ast, daniel, bpf, linux-kernel, yuehaibing

On Mon, 19 Sep 2022, at 11:55, shaozhengchao wrote:
> Sorry for the delay. I'm busy testing the TC module recently. I'm very 
> sorry for the user-space breakage.
>
> The root cause of this problem is that eth_type_trans() is called when
> the protocol type of the SKB is parsed. The len value of the SKB is
> reduced to 0. If the user mode requires that the forwarding succeed, or
>   if the MAC header is added again after the MAC header is subtracted, 
> is this appropriate?

We don't require forwarding to succeed with a 14 byte input buffer. We also don't look at the MAC header.

I think refusing to forward 0 length packets would be OK. Not 100% certain I understood you correctly, let me know if this helps.

Best
Lorenz

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-09-20 14:42       ` Lorenz Bauer
@ 2022-09-21  8:48         ` shaozhengchao
  2022-09-21 20:59           ` Stanislav Fomichev
  2022-10-13  9:36           ` Lorenz Bauer
  0 siblings, 2 replies; 19+ messages in thread
From: shaozhengchao @ 2022-09-21  8:48 UTC (permalink / raw)
  To: Lorenz Bauer, Stanislav Fomichev
  Cc: ast, daniel, bpf, linux-kernel, yuehaibing



On 2022/9/20 22:42, Lorenz Bauer wrote:
> On Mon, 19 Sep 2022, at 11:55, shaozhengchao wrote:
>> Sorry for the delay. I'm busy testing the TC module recently. I'm very
>> sorry for the user-space breakage.
>>
>> The root cause of this problem is that eth_type_trans() is called when
>> the protocol type of the SKB is parsed. The len value of the SKB is
>> reduced to 0. If the user mode requires that the forwarding succeed, or
>>    if the MAC header is added again after the MAC header is subtracted,
>> is this appropriate?
> 
> We don't require forwarding to succeed with a 14 byte input buffer. We also don't look at the MAC header.
> 
> I think refusing to forward 0 length packets would be OK. Not 100% certain I understood you correctly, let me know if this helps.
> 
> Best
> Lorenz
Hi Lorenz
	Sorry. But how does the rejection of the 0 length affect the
test case? Is the return value abnormal, send packet failure or some
others?

Zhengchao Shao

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-09-21  8:48         ` shaozhengchao
@ 2022-09-21 20:59           ` Stanislav Fomichev
  2022-10-13  9:36           ` Lorenz Bauer
  1 sibling, 0 replies; 19+ messages in thread
From: Stanislav Fomichev @ 2022-09-21 20:59 UTC (permalink / raw)
  To: shaozhengchao; +Cc: Lorenz Bauer, ast, daniel, bpf, linux-kernel, yuehaibing

On Wed, Sep 21, 2022 at 1:48 AM shaozhengchao <shaozhengchao@huawei.com> wrote:
>
>
>
> On 2022/9/20 22:42, Lorenz Bauer wrote:
> > On Mon, 19 Sep 2022, at 11:55, shaozhengchao wrote:
> >> Sorry for the delay. I'm busy testing the TC module recently. I'm very
> >> sorry for the user-space breakage.
> >>
> >> The root cause of this problem is that eth_type_trans() is called when
> >> the protocol type of the SKB is parsed. The len value of the SKB is
> >> reduced to 0. If the user mode requires that the forwarding succeed, or
> >>    if the MAC header is added again after the MAC header is subtracted,
> >> is this appropriate?
> >
> > We don't require forwarding to succeed with a 14 byte input buffer. We also don't look at the MAC header.
> >
> > I think refusing to forward 0 length packets would be OK. Not 100% certain I understood you correctly, let me know if this helps.
> >
> > Best
> > Lorenz
> Hi Lorenz
>         Sorry. But how does the rejection of the 0 length affect the
> test case? Is the return value abnormal, send packet failure or some
> others?

Sorry for the late reply, I'm still traveling.
What we want to avoid is creating invalid skbs via prog_test_run.
We can reject <14 byte packets with ENIVAL, but that might break some
of the existing apps (as Lorenz reported).
So it seems like a safer alternative would be to accept those packets,
but just append missing bytes in the kernel to create the valid
packets? Padding with zeros seems fine?

> Zhengchao Shao

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-09-21  8:48         ` shaozhengchao
  2022-09-21 20:59           ` Stanislav Fomichev
@ 2022-10-13  9:36           ` Lorenz Bauer
  2022-10-13 10:44             ` shaozhengchao
  1 sibling, 1 reply; 19+ messages in thread
From: Lorenz Bauer @ 2022-10-13  9:36 UTC (permalink / raw)
  To: shaozhengchao, Stanislav Fomichev
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, linux-kernel, yuehaibing

On Wed, 21 Sep 2022, at 09:48, shaozhengchao wrote:
> Hi Lorenz
> 	Sorry. But how does the rejection of the 0 length affect the
> test case? Is the return value abnormal, send packet failure or some
> others?

Hi Zhengchao,

Did you get around to submitting another fix for this?

Best
Lorenz

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-10-13  9:36           ` Lorenz Bauer
@ 2022-10-13 10:44             ` shaozhengchao
  2022-10-14 16:29               ` Lorenz Bauer
  0 siblings, 1 reply; 19+ messages in thread
From: shaozhengchao @ 2022-10-13 10:44 UTC (permalink / raw)
  To: Lorenz Bauer, Stanislav Fomichev
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, linux-kernel, yuehaibing



On 2022/10/13 17:36, Lorenz Bauer wrote:
> Did you get around to submitting another fix for this?

Hi Bauer:
	Sorry, I haven't fully understood your intentions yet.
Can you explain it more detail?

Zhengchao Shao

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-10-13 10:44             ` shaozhengchao
@ 2022-10-14 16:29               ` Lorenz Bauer
  2022-10-14 16:55                 ` sdf
  0 siblings, 1 reply; 19+ messages in thread
From: Lorenz Bauer @ 2022-10-14 16:29 UTC (permalink / raw)
  To: shaozhengchao, Stanislav Fomichev
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, linux-kernel, yuehaibing

On Thu, 13 Oct 2022, at 11:44, shaozhengchao wrote:
> 	Sorry, I haven't fully understood your intentions yet.
> Can you explain it more detail?

I'll try! Roughly, we do the following:

1. Create a BPF_PROG_TYPE_SOCKET_FILTER program that just returns 0
2. Load the program into the kernel
3. Call BPF_PROG_RUN with data_size_in == 14

After your bugfix, it seems like step 3 is rejected due to data_size_in == 14. We had to increase data_size_in to 15 to
avoid this, see [0].

This breaks user space, so it would be great if you could fix this in a way that doesn't refuse BPF_PROG_RUN with
data_size_in == 14. Since I don't understand the original problem very well I can't tell you what the best fix is however.

0: https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0

Thanks
Lorenz

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-10-14 16:29               ` Lorenz Bauer
@ 2022-10-14 16:55                 ` sdf
  2022-10-15  2:36                   ` shaozhengchao
  0 siblings, 1 reply; 19+ messages in thread
From: sdf @ 2022-10-14 16:55 UTC (permalink / raw)
  To: Lorenz Bauer
  Cc: shaozhengchao, Alexei Starovoitov, Daniel Borkmann, bpf,
	linux-kernel, yuehaibing

On 10/14, Lorenz Bauer wrote:
> On Thu, 13 Oct 2022, at 11:44, shaozhengchao wrote:
> > 	Sorry, I haven't fully understood your intentions yet.
> > Can you explain it more detail?

> I'll try! Roughly, we do the following:

> 1. Create a BPF_PROG_TYPE_SOCKET_FILTER program that just returns 0
> 2. Load the program into the kernel
> 3. Call BPF_PROG_RUN with data_size_in == 14

> After your bugfix, it seems like step 3 is rejected due to data_size_in  
> == 14. We had to increase data_size_in to 15 to
> avoid this, see [0].

> This breaks user space, so it would be great if you could fix this in a  
> way that doesn't refuse BPF_PROG_RUN with

[..]

> data_size_in == 14. Since I don't understand the original problem very  
> well I can't tell you what the best fix is however.

The problem was that we were able to generate skb with len=0 via
BPF_PROG_RUN. Prohibiting those cases breaks backwards compatibility, so
we either have to:

a) (preferred?) accept inputs with <14, but maybe internally pad to 14
bytes to make the core stack happy
b) revert the patch and instead have length checks at runtime; doesn't seem  
to
be worth the penalty in the forwarding path because of some corner cases
like these ?


> 0:  
> https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0

> Thanks
> Lorenz

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-10-14 16:55                 ` sdf
@ 2022-10-15  2:36                   ` shaozhengchao
  0 siblings, 0 replies; 19+ messages in thread
From: shaozhengchao @ 2022-10-15  2:36 UTC (permalink / raw)
  To: sdf, Lorenz Bauer
  Cc: Alexei Starovoitov, Daniel Borkmann, bpf, linux-kernel, yuehaibing



On 2022/10/15 0:55, sdf@google.com wrote:
> On 10/14, Lorenz Bauer wrote:
>> On Thu, 13 Oct 2022, at 11:44, shaozhengchao wrote:
>> >     Sorry, I haven't fully understood your intentions yet.
>> > Can you explain it more detail?
> 
>> I'll try! Roughly, we do the following:
> 
>> 1. Create a BPF_PROG_TYPE_SOCKET_FILTER program that just returns 0
>> 2. Load the program into the kernel
>> 3. Call BPF_PROG_RUN with data_size_in == 14
> 
>> After your bugfix, it seems like step 3 is rejected due to 
>> data_size_in == 14. We had to increase data_size_in to 15 to
>> avoid this, see [0].
> 
>> This breaks user space, so it would be great if you could fix this in 
>> a way that doesn't refuse BPF_PROG_RUN with
> 
> [..]
> 
>> data_size_in == 14. Since I don't understand the original problem very 
>> well I can't tell you what the best fix is however.
> 
> The problem was that we were able to generate skb with len=0 via
> BPF_PROG_RUN. Prohibiting those cases breaks backwards compatibility, so
> we either have to:
> 
> a) (preferred?) accept inputs with <14, but maybe internally pad to 14
> bytes to make the core stack happy
> b) revert the patch and instead have length checks at runtime; doesn't 
> seem to
> be worth the penalty in the forwarding path because of some corner cases
> like these ?
> 
Hi sdf:
	a) looks better and I'll put up a patch as soon as possible to
fix it.

Zhengchao Shao
> 
>> 0: 
>> https://github.com/cilium/ebpf/commit/a38fb6b5a46ab3b5639ea4d421232a10013596c0
> 
>> Thanks
>> Lorenz

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-07-15 11:55 [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len Zhengchao Shao
                   ` (2 preceding siblings ...)
  2022-09-14 11:19 ` Lorenz Bauer
@ 2022-11-03 21:07 ` Martin KaFai Lau
  2022-11-03 21:36   ` Stanislav Fomichev
  3 siblings, 1 reply; 19+ messages in thread
From: Martin KaFai Lau @ 2022-11-03 21:07 UTC (permalink / raw)
  To: Zhengchao Shao
  Cc: ast, daniel, andrii, song, yhs, john.fastabend, kpsingh, sdf,
	bigeasy, imagedong, petrm, arnd, dsahern, talalahmad, keescook,
	haoluo, jolsa, weiyongjun1, yuehaibing, bpf, netdev,
	linux-kernel, davem, edumazet, kuba, pabeni, hawk

On 7/15/22 4:55 AM, Zhengchao Shao wrote:
> Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
> skbs, that is, the flow->head is null.
> The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
> run a bpf prog which redirects empty skbs.
> So we should determine whether the length of the packet modified by bpf
> prog or others like bpf_prog_test is valid before forwarding it directly.
> 
> LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
> 
> Reported-by: syzbot+7a12909485b94426aceb@syzkaller.appspotmail.com
> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> ---
> v3: modify debug print
> v2: need move checking to convert___skb_to_skb and add debug info
> v1: should not check len in fast path
> 
>   include/linux/skbuff.h | 8 ++++++++
>   net/bpf/test_run.c     | 3 +++
>   net/core/dev.c         | 1 +
>   3 files changed, 12 insertions(+)
> 
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f6a27ab19202..82e8368ba6e6 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2459,6 +2459,14 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
>   
>   #endif /* NET_SKBUFF_DATA_USES_OFFSET */
>   
> +static inline void skb_assert_len(struct sk_buff *skb)
> +{
> +#ifdef CONFIG_DEBUG_NET
> +	if (WARN_ONCE(!skb->len, "%s\n", __func__))
> +		DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
> +#endif /* CONFIG_DEBUG_NET */
> +}
> +
>   /*
>    *	Add data to an sk_buff
>    */
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 2ca96acbc50a..dc9dc0bedca0 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -955,6 +955,9 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
>   {
>   	struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb;
>   
> +	if (!skb->len)
> +		return -EINVAL;

 From another recent report [0], I don't think this change is fixing the report 
from syzbot.  It probably makes sense to revert this patch.

afaict, This '!skb->len' test is done after
	if (is_l2)
		__skb_push(skb, hh_len);

Hence, skb->len is not zero in convert___skb_to_skb().  The proper place to test 
skb->len is before __skb_push() to ensure there is some network header after the 
mac or may as well ensure "data_size_in > ETH_HLEN" at the beginning.

The fix in [0] is applied.  If it turns out there are other cases caused by the 
skb generated by test_run that needs extra fixes in bpf_redirect_*,  it needs to 
revisit an earlier !skb->len check mentioned above and the existing test cases 
outside of test_progs would have to adjust accordingly.

[0]: https://lore.kernel.org/bpf/20221027225537.353077-1-sdf@google.com/

> +
>   	if (!__skb)
>   		return 0;
>   
> diff --git a/net/core/dev.c b/net/core/dev.c
> index d588fd0a54ce..716df64fcfa5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4168,6 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>   	bool again = false;
>   
>   	skb_reset_mac_header(skb);
> +	skb_assert_len(skb);
>   
>   	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
>   		__skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-11-03 21:07 ` Martin KaFai Lau
@ 2022-11-03 21:36   ` Stanislav Fomichev
  2022-11-03 22:42     ` Martin KaFai Lau
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Fomichev @ 2022-11-03 21:36 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Zhengchao Shao, ast, daniel, andrii, song, yhs, john.fastabend,
	kpsingh, bigeasy, imagedong, petrm, arnd, dsahern, talalahmad,
	keescook, haoluo, jolsa, weiyongjun1, yuehaibing, bpf, netdev,
	linux-kernel, davem, edumazet, kuba, pabeni, hawk

On Thu, Nov 3, 2022 at 2:07 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 7/15/22 4:55 AM, Zhengchao Shao wrote:
> > Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
> > skbs, that is, the flow->head is null.
> > The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
> > run a bpf prog which redirects empty skbs.
> > So we should determine whether the length of the packet modified by bpf
> > prog or others like bpf_prog_test is valid before forwarding it directly.
> >
> > LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> > LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
> >
> > Reported-by: syzbot+7a12909485b94426aceb@syzkaller.appspotmail.com
> > Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> > ---
> > v3: modify debug print
> > v2: need move checking to convert___skb_to_skb and add debug info
> > v1: should not check len in fast path
> >
> >   include/linux/skbuff.h | 8 ++++++++
> >   net/bpf/test_run.c     | 3 +++
> >   net/core/dev.c         | 1 +
> >   3 files changed, 12 insertions(+)
> >
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index f6a27ab19202..82e8368ba6e6 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -2459,6 +2459,14 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
> >
> >   #endif /* NET_SKBUFF_DATA_USES_OFFSET */
> >
> > +static inline void skb_assert_len(struct sk_buff *skb)
> > +{
> > +#ifdef CONFIG_DEBUG_NET
> > +     if (WARN_ONCE(!skb->len, "%s\n", __func__))
> > +             DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
> > +#endif /* CONFIG_DEBUG_NET */
> > +}
> > +
> >   /*
> >    *  Add data to an sk_buff
> >    */
> > diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > index 2ca96acbc50a..dc9dc0bedca0 100644
> > --- a/net/bpf/test_run.c
> > +++ b/net/bpf/test_run.c
> > @@ -955,6 +955,9 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
> >   {
> >       struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb;
> >
> > +     if (!skb->len)
> > +             return -EINVAL;
>
>  From another recent report [0], I don't think this change is fixing the report
> from syzbot.  It probably makes sense to revert this patch.
>
> afaict, This '!skb->len' test is done after
>         if (is_l2)
>                 __skb_push(skb, hh_len);
>
> Hence, skb->len is not zero in convert___skb_to_skb().  The proper place to test
> skb->len is before __skb_push() to ensure there is some network header after the
> mac or may as well ensure "data_size_in > ETH_HLEN" at the beginning.

When is_l2==true, __skb_push will result in non-zero skb->len, so we
should be good, right?
The only issue is when we do bpf_redirect into a tunneling device and
do __skb_pull, but that's now fixed by [0].

When is_l2==false, the existing check in convert___skb_to_skb will
make sure there is something in the l3 headers.

So it seems like this patch is still needed. Or am I missing something?

> The fix in [0] is applied.  If it turns out there are other cases caused by the
> skb generated by test_run that needs extra fixes in bpf_redirect_*,  it needs to
> revisit an earlier !skb->len check mentioned above and the existing test cases
> outside of test_progs would have to adjust accordingly.
>
> [0]: https://lore.kernel.org/bpf/20221027225537.353077-1-sdf@google.com/
>
> > +
> >       if (!__skb)
> >               return 0;
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index d588fd0a54ce..716df64fcfa5 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -4168,6 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
> >       bool again = false;
> >
> >       skb_reset_mac_header(skb);
> > +     skb_assert_len(skb);
> >
> >       if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
> >               __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-11-03 21:36   ` Stanislav Fomichev
@ 2022-11-03 22:42     ` Martin KaFai Lau
  2022-11-03 22:58       ` Stanislav Fomichev
  0 siblings, 1 reply; 19+ messages in thread
From: Martin KaFai Lau @ 2022-11-03 22:42 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Zhengchao Shao, ast, daniel, andrii, song, yhs, john.fastabend,
	kpsingh, bigeasy, imagedong, petrm, arnd, dsahern, talalahmad,
	keescook, haoluo, jolsa, weiyongjun1, yuehaibing, bpf, netdev,
	linux-kernel, davem, edumazet, kuba, pabeni, hawk

On 11/3/22 2:36 PM, Stanislav Fomichev wrote:
> On Thu, Nov 3, 2022 at 2:07 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>>
>> On 7/15/22 4:55 AM, Zhengchao Shao wrote:
>>> Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
>>> skbs, that is, the flow->head is null.
>>> The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
>>> run a bpf prog which redirects empty skbs.
>>> So we should determine whether the length of the packet modified by bpf
>>> prog or others like bpf_prog_test is valid before forwarding it directly.
>>>
>>> LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
>>> LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
>>>
>>> Reported-by: syzbot+7a12909485b94426aceb@syzkaller.appspotmail.com
>>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
>>> ---
>>> v3: modify debug print
>>> v2: need move checking to convert___skb_to_skb and add debug info
>>> v1: should not check len in fast path
>>>
>>>    include/linux/skbuff.h | 8 ++++++++
>>>    net/bpf/test_run.c     | 3 +++
>>>    net/core/dev.c         | 1 +
>>>    3 files changed, 12 insertions(+)
>>>
>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>>> index f6a27ab19202..82e8368ba6e6 100644
>>> --- a/include/linux/skbuff.h
>>> +++ b/include/linux/skbuff.h
>>> @@ -2459,6 +2459,14 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
>>>
>>>    #endif /* NET_SKBUFF_DATA_USES_OFFSET */
>>>
>>> +static inline void skb_assert_len(struct sk_buff *skb)
>>> +{
>>> +#ifdef CONFIG_DEBUG_NET
>>> +     if (WARN_ONCE(!skb->len, "%s\n", __func__))
>>> +             DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
>>> +#endif /* CONFIG_DEBUG_NET */
>>> +}
>>> +
>>>    /*
>>>     *  Add data to an sk_buff
>>>     */
>>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>>> index 2ca96acbc50a..dc9dc0bedca0 100644
>>> --- a/net/bpf/test_run.c
>>> +++ b/net/bpf/test_run.c
>>> @@ -955,6 +955,9 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
>>>    {
>>>        struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb;
>>>
>>> +     if (!skb->len)
>>> +             return -EINVAL;
>>
>>   From another recent report [0], I don't think this change is fixing the report
>> from syzbot.  It probably makes sense to revert this patch.
>>
>> afaict, This '!skb->len' test is done after
>>          if (is_l2)
>>                  __skb_push(skb, hh_len);
>>
>> Hence, skb->len is not zero in convert___skb_to_skb().  The proper place to test
>> skb->len is before __skb_push() to ensure there is some network header after the
>> mac or may as well ensure "data_size_in > ETH_HLEN" at the beginning.
> 
> When is_l2==true, __skb_push will result in non-zero skb->len, so we
> should be good, right?
> The only issue is when we do bpf_redirect into a tunneling device and
> do __skb_pull, but that's now fixed by [0].
> 
> When is_l2==false, the existing check in convert___skb_to_skb will
> make sure there is something in the l3 headers.
> 
> So it seems like this patch is still needed. Or am I missing something?

Replied in [0].  I think a small change in [0] will make this patch obsolete.

My thinking is the !skb->len test in this patch does not address all cases, at 
least not the most common one (the sch_cls prog where is_l2 == true) and then it 
needs another change in __bpf_redirect_no_mac [0].  Then, instead of breaking 
the existing test cases,  may as well solely depend on the change in 
__bpf_redirect_no_mac which seems to be the only redirect function that does not 
have the len check now.

> 
>> The fix in [0] is applied.  If it turns out there are other cases caused by the
>> skb generated by test_run that needs extra fixes in bpf_redirect_*,  it needs to
>> revisit an earlier !skb->len check mentioned above and the existing test cases
>> outside of test_progs would have to adjust accordingly.
>>
>> [0]: https://lore.kernel.org/bpf/20221027225537.353077-1-sdf@google.com/
>>
>>> +
>>>        if (!__skb)
>>>                return 0;
>>>
>>> diff --git a/net/core/dev.c b/net/core/dev.c
>>> index d588fd0a54ce..716df64fcfa5 100644
>>> --- a/net/core/dev.c
>>> +++ b/net/core/dev.c
>>> @@ -4168,6 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>>>        bool again = false;
>>>
>>>        skb_reset_mac_header(skb);
>>> +     skb_assert_len(skb);
>>>
>>>        if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
>>>                __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
>>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-11-03 22:42     ` Martin KaFai Lau
@ 2022-11-03 22:58       ` Stanislav Fomichev
  2022-11-09 21:43         ` Stanislav Fomichev
  0 siblings, 1 reply; 19+ messages in thread
From: Stanislav Fomichev @ 2022-11-03 22:58 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Zhengchao Shao, ast, daniel, andrii, song, yhs, john.fastabend,
	kpsingh, bigeasy, imagedong, petrm, arnd, dsahern, talalahmad,
	keescook, haoluo, jolsa, weiyongjun1, yuehaibing, bpf, netdev,
	linux-kernel, davem, edumazet, kuba, pabeni, hawk

On Thu, Nov 3, 2022 at 3:42 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 11/3/22 2:36 PM, Stanislav Fomichev wrote:
> > On Thu, Nov 3, 2022 at 2:07 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >>
> >> On 7/15/22 4:55 AM, Zhengchao Shao wrote:
> >>> Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
> >>> skbs, that is, the flow->head is null.
> >>> The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
> >>> run a bpf prog which redirects empty skbs.
> >>> So we should determine whether the length of the packet modified by bpf
> >>> prog or others like bpf_prog_test is valid before forwarding it directly.
> >>>
> >>> LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> >>> LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
> >>>
> >>> Reported-by: syzbot+7a12909485b94426aceb@syzkaller.appspotmail.com
> >>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> >>> ---
> >>> v3: modify debug print
> >>> v2: need move checking to convert___skb_to_skb and add debug info
> >>> v1: should not check len in fast path
> >>>
> >>>    include/linux/skbuff.h | 8 ++++++++
> >>>    net/bpf/test_run.c     | 3 +++
> >>>    net/core/dev.c         | 1 +
> >>>    3 files changed, 12 insertions(+)
> >>>
> >>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> >>> index f6a27ab19202..82e8368ba6e6 100644
> >>> --- a/include/linux/skbuff.h
> >>> +++ b/include/linux/skbuff.h
> >>> @@ -2459,6 +2459,14 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
> >>>
> >>>    #endif /* NET_SKBUFF_DATA_USES_OFFSET */
> >>>
> >>> +static inline void skb_assert_len(struct sk_buff *skb)
> >>> +{
> >>> +#ifdef CONFIG_DEBUG_NET
> >>> +     if (WARN_ONCE(!skb->len, "%s\n", __func__))
> >>> +             DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
> >>> +#endif /* CONFIG_DEBUG_NET */
> >>> +}
> >>> +
> >>>    /*
> >>>     *  Add data to an sk_buff
> >>>     */
> >>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> >>> index 2ca96acbc50a..dc9dc0bedca0 100644
> >>> --- a/net/bpf/test_run.c
> >>> +++ b/net/bpf/test_run.c
> >>> @@ -955,6 +955,9 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
> >>>    {
> >>>        struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb;
> >>>
> >>> +     if (!skb->len)
> >>> +             return -EINVAL;
> >>
> >>   From another recent report [0], I don't think this change is fixing the report
> >> from syzbot.  It probably makes sense to revert this patch.
> >>
> >> afaict, This '!skb->len' test is done after
> >>          if (is_l2)
> >>                  __skb_push(skb, hh_len);
> >>
> >> Hence, skb->len is not zero in convert___skb_to_skb().  The proper place to test
> >> skb->len is before __skb_push() to ensure there is some network header after the
> >> mac or may as well ensure "data_size_in > ETH_HLEN" at the beginning.
> >
> > When is_l2==true, __skb_push will result in non-zero skb->len, so we
> > should be good, right?
> > The only issue is when we do bpf_redirect into a tunneling device and
> > do __skb_pull, but that's now fixed by [0].
> >
> > When is_l2==false, the existing check in convert___skb_to_skb will
> > make sure there is something in the l3 headers.
> >
> > So it seems like this patch is still needed. Or am I missing something?
>
> Replied in [0].  I think a small change in [0] will make this patch obsolete.
>
> My thinking is the !skb->len test in this patch does not address all cases, at
> least not the most common one (the sch_cls prog where is_l2 == true) and then it
> needs another change in __bpf_redirect_no_mac [0].  Then, instead of breaking
> the existing test cases,  may as well solely depend on the change in
> __bpf_redirect_no_mac which seems to be the only redirect function that does not
> have the len check now.

Removing this check in convert___skb_to_skb and moving the new one in
__bpf_redirect_no_mac out of (mlen) SGTM.
Can follow up unless you or Zhengchao prefer to do it.
There were some concerns about doing this len check at runtime per
packet, but not sure whether it really affects anything..

> >> The fix in [0] is applied.  If it turns out there are other cases caused by the
> >> skb generated by test_run that needs extra fixes in bpf_redirect_*,  it needs to
> >> revisit an earlier !skb->len check mentioned above and the existing test cases
> >> outside of test_progs would have to adjust accordingly.
> >>
> >> [0]: https://lore.kernel.org/bpf/20221027225537.353077-1-sdf@google.com/
> >>
> >>> +
> >>>        if (!__skb)
> >>>                return 0;
> >>>
> >>> diff --git a/net/core/dev.c b/net/core/dev.c
> >>> index d588fd0a54ce..716df64fcfa5 100644
> >>> --- a/net/core/dev.c
> >>> +++ b/net/core/dev.c
> >>> @@ -4168,6 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
> >>>        bool again = false;
> >>>
> >>>        skb_reset_mac_header(skb);
> >>> +     skb_assert_len(skb);
> >>>
> >>>        if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
> >>>                __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
> >>
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len
  2022-11-03 22:58       ` Stanislav Fomichev
@ 2022-11-09 21:43         ` Stanislav Fomichev
  0 siblings, 0 replies; 19+ messages in thread
From: Stanislav Fomichev @ 2022-11-09 21:43 UTC (permalink / raw)
  To: Martin KaFai Lau
  Cc: Zhengchao Shao, ast, daniel, andrii, song, yhs, john.fastabend,
	kpsingh, bigeasy, imagedong, petrm, arnd, dsahern, talalahmad,
	keescook, haoluo, jolsa, weiyongjun1, yuehaibing, bpf, netdev,
	linux-kernel, davem, edumazet, kuba, pabeni, hawk

"

On Thu, Nov 3, 2022 at 3:58 PM Stanislav Fomichev <sdf@google.com> wrote:
>
> On Thu, Nov 3, 2022 at 3:42 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> >
> > On 11/3/22 2:36 PM, Stanislav Fomichev wrote:
> > > On Thu, Nov 3, 2022 at 2:07 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
> > >>
> > >> On 7/15/22 4:55 AM, Zhengchao Shao wrote:
> > >>> Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
> > >>> skbs, that is, the flow->head is null.
> > >>> The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
> > >>> run a bpf prog which redirects empty skbs.
> > >>> So we should determine whether the length of the packet modified by bpf
> > >>> prog or others like bpf_prog_test is valid before forwarding it directly.
> > >>>
> > >>> LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> > >>> LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
> > >>>
> > >>> Reported-by: syzbot+7a12909485b94426aceb@syzkaller.appspotmail.com
> > >>> Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
> > >>> ---
> > >>> v3: modify debug print
> > >>> v2: need move checking to convert___skb_to_skb and add debug info
> > >>> v1: should not check len in fast path
> > >>>
> > >>>    include/linux/skbuff.h | 8 ++++++++
> > >>>    net/bpf/test_run.c     | 3 +++
> > >>>    net/core/dev.c         | 1 +
> > >>>    3 files changed, 12 insertions(+)
> > >>>
> > >>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > >>> index f6a27ab19202..82e8368ba6e6 100644
> > >>> --- a/include/linux/skbuff.h
> > >>> +++ b/include/linux/skbuff.h
> > >>> @@ -2459,6 +2459,14 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
> > >>>
> > >>>    #endif /* NET_SKBUFF_DATA_USES_OFFSET */
> > >>>
> > >>> +static inline void skb_assert_len(struct sk_buff *skb)
> > >>> +{
> > >>> +#ifdef CONFIG_DEBUG_NET
> > >>> +     if (WARN_ONCE(!skb->len, "%s\n", __func__))
> > >>> +             DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
> > >>> +#endif /* CONFIG_DEBUG_NET */
> > >>> +}
> > >>> +
> > >>>    /*
> > >>>     *  Add data to an sk_buff
> > >>>     */
> > >>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> > >>> index 2ca96acbc50a..dc9dc0bedca0 100644
> > >>> --- a/net/bpf/test_run.c
> > >>> +++ b/net/bpf/test_run.c
> > >>> @@ -955,6 +955,9 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
> > >>>    {
> > >>>        struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb;
> > >>>
> > >>> +     if (!skb->len)
> > >>> +             return -EINVAL;
> > >>
> > >>   From another recent report [0], I don't think this change is fixing the report
> > >> from syzbot.  It probably makes sense to revert this patch.
> > >>
> > >> afaict, This '!skb->len' test is done after
> > >>          if (is_l2)
> > >>                  __skb_push(skb, hh_len);
> > >>
> > >> Hence, skb->len is not zero in convert___skb_to_skb().  The proper place to test
> > >> skb->len is before __skb_push() to ensure there is some network header after the
> > >> mac or may as well ensure "data_size_in > ETH_HLEN" at the beginning.
> > >
> > > When is_l2==true, __skb_push will result in non-zero skb->len, so we
> > > should be good, right?
> > > The only issue is when we do bpf_redirect into a tunneling device and
> > > do __skb_pull, but that's now fixed by [0].
> > >
> > > When is_l2==false, the existing check in convert___skb_to_skb will
> > > make sure there is something in the l3 headers.
> > >
> > > So it seems like this patch is still needed. Or am I missing something?
> >
> > Replied in [0].  I think a small change in [0] will make this patch obsolete.
> >
> > My thinking is the !skb->len test in this patch does not address all cases, at
> > least not the most common one (the sch_cls prog where is_l2 == true) and then it
> > needs another change in __bpf_redirect_no_mac [0].  Then, instead of breaking
> > the existing test cases,  may as well solely depend on the change in
> > __bpf_redirect_no_mac which seems to be the only redirect function that does not
> > have the len check now.
>
> Removing this check in convert___skb_to_skb and moving the new one in
> __bpf_redirect_no_mac out of (mlen) SGTM.
> Can follow up unless you or Zhengchao prefer to do it.
> There were some concerns about doing this len check at runtime per
> packet, but not sure whether it really affects anything..

I've implemented a simple selftest for both mac/no-mac cases, and I
feel like this explicit len==0 has to happen in both cases.
That "skb->mac_header >= skb->network_header" check is not enough :-(
I'll send the series early next week after another round of xdp metadata..

> > >> The fix in [0] is applied.  If it turns out there are other cases caused by the
> > >> skb generated by test_run that needs extra fixes in bpf_redirect_*,  it needs to
> > >> revisit an earlier !skb->len check mentioned above and the existing test cases
> > >> outside of test_progs would have to adjust accordingly.
> > >>
> > >> [0]: https://lore.kernel.org/bpf/20221027225537.353077-1-sdf@google.com/
> > >>
> > >>> +
> > >>>        if (!__skb)
> > >>>                return 0;
> > >>>
> > >>> diff --git a/net/core/dev.c b/net/core/dev.c
> > >>> index d588fd0a54ce..716df64fcfa5 100644
> > >>> --- a/net/core/dev.c
> > >>> +++ b/net/core/dev.c
> > >>> @@ -4168,6 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
> > >>>        bool again = false;
> > >>>
> > >>>        skb_reset_mac_header(skb);
> > >>> +     skb_assert_len(skb);
> > >>>
> > >>>        if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
> > >>>                __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
> > >>
> >

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-11-09 21:43 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-15 11:55 [PATCH v4,bpf-next] bpf: Don't redirect packets with invalid pkt_len Zhengchao Shao
2022-07-15 23:30 ` Stanislav Fomichev
2022-07-19 17:00 ` patchwork-bot+netdevbpf
2022-09-14 11:19 ` Lorenz Bauer
2022-09-17 15:46   ` Stanislav Fomichev
2022-09-19 10:55     ` shaozhengchao
2022-09-20 14:42       ` Lorenz Bauer
2022-09-21  8:48         ` shaozhengchao
2022-09-21 20:59           ` Stanislav Fomichev
2022-10-13  9:36           ` Lorenz Bauer
2022-10-13 10:44             ` shaozhengchao
2022-10-14 16:29               ` Lorenz Bauer
2022-10-14 16:55                 ` sdf
2022-10-15  2:36                   ` shaozhengchao
2022-11-03 21:07 ` Martin KaFai Lau
2022-11-03 21:36   ` Stanislav Fomichev
2022-11-03 22:42     ` Martin KaFai Lau
2022-11-03 22:58       ` Stanislav Fomichev
2022-11-09 21:43         ` Stanislav Fomichev

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).