bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RE: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
       [not found] <db5393a3-d4b3-45c1-8219-f23b43a8d2ab.anny.hu@linux.alibaba.com>
@ 2020-05-26 21:10 ` John Fastabend
  2020-05-29  9:05   ` dihu
  0 siblings, 1 reply; 7+ messages in thread
From: John Fastabend @ 2020-05-26 21:10 UTC (permalink / raw)
  To: dihu, Eric Dumazet, John Fastabend, Daniel Borkmann,
	Jakub Sitnicki, Lorenz Bauer, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, netdev, bpf, linux-kernel

dihu wrote:
> From 865a45747de6b68fd02a0ff128a69a5c8feb73c3 Mon Sep 17 00:00:00 2001
> From: dihu <anny.hu@linux.alibaba.com>
> Date: Mon, 25 May 2020 17:23:16 +0800
> Subject: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
> 
> When user application calls read() with MSG_PEEK flag to read data
> of bpf sockmap socket, kernel panic happens at
> __tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
> queue after read out under MSG_PEEK flag is set. Because it's not
> judged whether sk_msg is the last msg of ingress_msg queue, the next
> sk_msg may be the head of ingress_msg queue, whose memory address of
> sg page is invalid. So it's necessary to add check codes to prevent
> this problem.
> 
> [20759.125457] BUG: kernel NULL pointer dereference, address:
> 0000000000000008
> [20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G            E
> 5.4.32 #1
> [20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
> 4.1.12 06/18/2017
> [20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
> [20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
> [20759.276099] tcp_bpf_recvmsg+0x113/0x370
> [20759.281137] inet_recvmsg+0x55/0xc0
> [20759.285734] __sys_recvfrom+0xc8/0x130
> [20759.290566] ? __audit_syscall_entry+0x103/0x130
> [20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
> [20759.301700] ? __audit_syscall_exit+0x1e4/0x290
> [20759.307235] __x64_sys_recvfrom+0x24/0x30
> [20759.312226] do_syscall_64+0x55/0x1b0
> [20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Signed-off-by: dihu <anny.hu@linux.alibaba.com>
> ---
>  net/ipv4/tcp_bpf.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index 5a05327..c0d4624 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -64,6 +64,9 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
>    } while (i != msg_rx->sg.end);
> 
>    if (unlikely(peek)) {
> +   if (msg_rx == list_last_entry(&psock->ingress_msg,
> +       struct sk_msg, list))
> +    break;


Thanks. Change looks good but spacing is a bit off . Can we
turn those spaces into tabs? Otherwise adding fixes tag and
my ack would be great.

Fixes: 02c558b2d5d67 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
  2020-05-26 21:10 ` [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg John Fastabend
@ 2020-05-29  9:05   ` dihu
  2020-06-02  9:03     ` Jakub Sitnicki
  0 siblings, 1 reply; 7+ messages in thread
From: dihu @ 2020-05-29  9:05 UTC (permalink / raw)
  To: John Fastabend, Eric Dumazet, Daniel Borkmann, Jakub Sitnicki,
	Lorenz Bauer, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, Jakub Kicinski, Alexei Starovoitov,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, netdev, bpf, linux-kernel


On 2020/5/27 5:10, John Fastabend wrote:
> dihu wrote:
>>  From 865a45747de6b68fd02a0ff128a69a5c8feb73c3 Mon Sep 17 00:00:00 2001
>> From: dihu <anny.hu@linux.alibaba.com>
>> Date: Mon, 25 May 2020 17:23:16 +0800
>> Subject: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
>>
>> When user application calls read() with MSG_PEEK flag to read data
>> of bpf sockmap socket, kernel panic happens at
>> __tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
>> queue after read out under MSG_PEEK flag is set. Because it's not
>> judged whether sk_msg is the last msg of ingress_msg queue, the next
>> sk_msg may be the head of ingress_msg queue, whose memory address of
>> sg page is invalid. So it's necessary to add check codes to prevent
>> this problem.
>>
>> [20759.125457] BUG: kernel NULL pointer dereference, address:
>> 0000000000000008
>> [20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G            E
>> 5.4.32 #1
>> [20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
>> 4.1.12 06/18/2017
>> [20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
>> [20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
>> [20759.276099] tcp_bpf_recvmsg+0x113/0x370
>> [20759.281137] inet_recvmsg+0x55/0xc0
>> [20759.285734] __sys_recvfrom+0xc8/0x130
>> [20759.290566] ? __audit_syscall_entry+0x103/0x130
>> [20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
>> [20759.301700] ? __audit_syscall_exit+0x1e4/0x290
>> [20759.307235] __x64_sys_recvfrom+0x24/0x30
>> [20759.312226] do_syscall_64+0x55/0x1b0
>> [20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> Signed-off-by: dihu <anny.hu@linux.alibaba.com>
>> ---
>>   net/ipv4/tcp_bpf.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>> index 5a05327..c0d4624 100644
>> --- a/net/ipv4/tcp_bpf.c
>> +++ b/net/ipv4/tcp_bpf.c
>> @@ -64,6 +64,9 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
>>     } while (i != msg_rx->sg.end);
>>
>>     if (unlikely(peek)) {
>> +   if (msg_rx == list_last_entry(&psock->ingress_msg,
>> +       struct sk_msg, list))
>> +    break;
>
> Thanks. Change looks good but spacing is a bit off . Can we
> turn those spaces into tabs? Otherwise adding fixes tag and
> my ack would be great.
>
> Fixes: 02c558b2d5d67 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
> Acked-by: John Fastabend <john.fastabend@gmail.com>


 From 127a334fa5e5d029353ceb1a0414886c527f4be5 Mon Sep 17 00:00:00 2001
From: dihu <anny.hu@linux.alibaba.com>
Date: Fri, 29 May 2020 16:38:50 +0800
Subject: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg

When user application calls read() with MSG_PEEK flag to read data
of bpf sockmap socket, kernel panic happens at
__tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
queue after read out under MSG_PEEK flag is set. Because it's not
judged whether sk_msg is the last msg of ingress_msg queue, the next
sk_msg may be the head of ingress_msg queue, whose memory address of
sg page is invalid. So it's necessary to add check codes to prevent
this problem.

[20759.125457] BUG: kernel NULL pointer dereference, address:
0000000000000008
[20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G            E
5.4.32 #1
[20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
4.1.12 06/18/2017
[20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
[20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
[20759.276099] tcp_bpf_recvmsg+0x113/0x370
[20759.281137] inet_recvmsg+0x55/0xc0
[20759.285734] __sys_recvfrom+0xc8/0x130
[20759.290566] ? __audit_syscall_entry+0x103/0x130
[20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
[20759.301700] ? __audit_syscall_exit+0x1e4/0x290
[20759.307235] __x64_sys_recvfrom+0x24/0x30
[20759.312226] do_syscall_64+0x55/0x1b0
[20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: dihu <anny.hu@linux.alibaba.com>
---
  net/ipv4/tcp_bpf.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 5a05327..b82e4c3 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -64,6 +64,9 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock 
*psock,
          } while (i != msg_rx->sg.end);

          if (unlikely(peek)) {
+            if (msg_rx == list_last_entry(&psock->ingress_msg,
+                              struct sk_msg, list))
+                break;
              msg_rx = list_next_entry(msg_rx, list);
              continue;
          }
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
  2020-05-29  9:05   ` dihu
@ 2020-06-02  9:03     ` Jakub Sitnicki
  0 siblings, 0 replies; 7+ messages in thread
From: Jakub Sitnicki @ 2020-06-02  9:03 UTC (permalink / raw)
  To: dihu
  Cc: John Fastabend, Eric Dumazet, Daniel Borkmann, Lorenz Bauer,
	David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Jakub Kicinski, Alexei Starovoitov, Martin KaFai Lau, Song Liu,
	Yonghong Song, Andrii Nakryiko, KP Singh, netdev, bpf,
	linux-kernel

On Fri, May 29, 2020 at 11:05 AM CEST, dihu wrote:
> On 2020/5/27 5:10, John Fastabend wrote:
>> dihu wrote:
>>>  From 865a45747de6b68fd02a0ff128a69a5c8feb73c3 Mon Sep 17 00:00:00 2001
>>> From: dihu <anny.hu@linux.alibaba.com>
>>> Date: Mon, 25 May 2020 17:23:16 +0800
>>> Subject: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
>>>
>>> When user application calls read() with MSG_PEEK flag to read data
>>> of bpf sockmap socket, kernel panic happens at
>>> __tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
>>> queue after read out under MSG_PEEK flag is set. Because it's not
>>> judged whether sk_msg is the last msg of ingress_msg queue, the next
>>> sk_msg may be the head of ingress_msg queue, whose memory address of
>>> sg page is invalid. So it's necessary to add check codes to prevent
>>> this problem.
>>>
>>> [20759.125457] BUG: kernel NULL pointer dereference, address:
>>> 0000000000000008
>>> [20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G            E
>>> 5.4.32 #1
>>> [20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
>>> 4.1.12 06/18/2017
>>> [20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
>>> [20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
>>> [20759.276099] tcp_bpf_recvmsg+0x113/0x370
>>> [20759.281137] inet_recvmsg+0x55/0xc0
>>> [20759.285734] __sys_recvfrom+0xc8/0x130
>>> [20759.290566] ? __audit_syscall_entry+0x103/0x130
>>> [20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
>>> [20759.301700] ? __audit_syscall_exit+0x1e4/0x290
>>> [20759.307235] __x64_sys_recvfrom+0x24/0x30
>>> [20759.312226] do_syscall_64+0x55/0x1b0
>>> [20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> Signed-off-by: dihu <anny.hu@linux.alibaba.com>
>>> ---
>>>   net/ipv4/tcp_bpf.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>>> index 5a05327..c0d4624 100644
>>> --- a/net/ipv4/tcp_bpf.c
>>> +++ b/net/ipv4/tcp_bpf.c
>>> @@ -64,6 +64,9 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
>>>     } while (i != msg_rx->sg.end);
>>>
>>>     if (unlikely(peek)) {
>>> +   if (msg_rx == list_last_entry(&psock->ingress_msg,
>>> +       struct sk_msg, list))
>>> +    break;
>>
>> Thanks. Change looks good but spacing is a bit off . Can we
>> turn those spaces into tabs? Otherwise adding fixes tag and
>> my ack would be great.
>>
>> Fixes: 02c558b2d5d67 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
>> Acked-by: John Fastabend <john.fastabend@gmail.com>
>
>
> From 127a334fa5e5d029353ceb1a0414886c527f4be5 Mon Sep 17 00:00:00 2001
> From: dihu <anny.hu@linux.alibaba.com>
> Date: Fri, 29 May 2020 16:38:50 +0800
> Subject: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
>
> When user application calls read() with MSG_PEEK flag to read data
> of bpf sockmap socket, kernel panic happens at
> __tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
> queue after read out under MSG_PEEK flag is set. Because it's not
> judged whether sk_msg is the last msg of ingress_msg queue, the next
> sk_msg may be the head of ingress_msg queue, whose memory address of
> sg page is invalid. So it's necessary to add check codes to prevent
> this problem.
>
> [20759.125457] BUG: kernel NULL pointer dereference, address:
> 0000000000000008
> [20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G E
> 5.4.32 #1
> [20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
> 4.1.12 06/18/2017
> [20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
> [20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
> [20759.276099] tcp_bpf_recvmsg+0x113/0x370
> [20759.281137] inet_recvmsg+0x55/0xc0
> [20759.285734] __sys_recvfrom+0xc8/0x130
> [20759.290566] ? __audit_syscall_entry+0x103/0x130
> [20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
> [20759.301700] ? __audit_syscall_exit+0x1e4/0x290
> [20759.307235] __x64_sys_recvfrom+0x24/0x30
> [20759.312226] do_syscall_64+0x55/0x1b0
> [20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Signed-off-by: dihu <anny.hu@linux.alibaba.com>
> ---
> net/ipv4/tcp_bpf.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index 5a05327..b82e4c3 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -64,6 +64,9 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
>   } while (i != msg_rx->sg.end);
>
>   if (unlikely(peek)) {
> +   if (msg_rx == list_last_entry(&psock->ingress_msg,
> +       struct sk_msg, list))
> +    break;
>    msg_rx = list_next_entry(msg_rx, list);
>    continue;
>   }

Looks like the patch is garbled. I suspect due to copy-paste to an
e-mail client. Context line got wrapped and there are non-breaking
spaces instead of tabs in the body.

Crash fix is important so could you resend it with `git send-email`?

  git send-email --to bpf@vger.kernel.org --cc netdev@vger.kernel.org file.patch

You might find the documentation below helpful:

  https://www.kernel.org/doc/html/latest/process/email-clients.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
  2020-06-09  9:03 ` Jakub Sitnicki
@ 2020-06-09 17:58   ` Alexei Starovoitov
  0 siblings, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2020-06-09 17:58 UTC (permalink / raw)
  To: Jakub Sitnicki; +Cc: dihu, bpf, Network Development

On Tue, Jun 9, 2020 at 2:04 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>
> On Fri, Jun 05, 2020 at 10:46 AM CEST, dihu wrote:
> > When user application calls read() with MSG_PEEK flag to read data
> > of bpf sockmap socket, kernel panic happens at
> > __tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
> > queue after read out under MSG_PEEK flag is set. Because it's not
> > judged whether sk_msg is the last msg of ingress_msg queue, the next
> > sk_msg may be the head of ingress_msg queue, whose memory address of
> > sg page is invalid. So it's necessary to add check codes to prevent
> > this problem.
> >
> > [20759.125457] BUG: kernel NULL pointer dereference, address:
> > 0000000000000008
> > [20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G            E
> > 5.4.32 #1
> > [20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
> > 4.1.12 06/18/2017
> > [20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
> > [20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
> > [20759.276099] tcp_bpf_recvmsg+0x113/0x370
> > [20759.281137] inet_recvmsg+0x55/0xc0
> > [20759.285734] __sys_recvfrom+0xc8/0x130
> > [20759.290566] ? __audit_syscall_entry+0x103/0x130
> > [20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
> > [20759.301700] ? __audit_syscall_exit+0x1e4/0x290
> > [20759.307235] __x64_sys_recvfrom+0x24/0x30
> > [20759.312226] do_syscall_64+0x55/0x1b0
> > [20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > Signed-off-by: dihu <anny.hu@linux.alibaba.com>
> > ---
> >  net/ipv4/tcp_bpf.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> > index 5a05327..b82e4c3 100644
> > --- a/net/ipv4/tcp_bpf.c
> > +++ b/net/ipv4/tcp_bpf.c
> > @@ -64,6 +64,9 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
> >               } while (i != msg_rx->sg.end);
> >
> >               if (unlikely(peek)) {
> > +                     if (msg_rx == list_last_entry(&psock->ingress_msg,
> > +                                                   struct sk_msg, list))
> > +                             break;
> >                       msg_rx = list_next_entry(msg_rx, list);
> >                       continue;
> >               }
>
> Acked-by: Jakub Sitnicki <jakub@cloudflare.com>

Applied. Thanks

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
  2020-06-05  8:46 dihu
  2020-06-08 16:06 ` John Fastabend
@ 2020-06-09  9:03 ` Jakub Sitnicki
  2020-06-09 17:58   ` Alexei Starovoitov
  1 sibling, 1 reply; 7+ messages in thread
From: Jakub Sitnicki @ 2020-06-09  9:03 UTC (permalink / raw)
  To: dihu; +Cc: bpf, netdev

On Fri, Jun 05, 2020 at 10:46 AM CEST, dihu wrote:
> When user application calls read() with MSG_PEEK flag to read data
> of bpf sockmap socket, kernel panic happens at
> __tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
> queue after read out under MSG_PEEK flag is set. Because it's not
> judged whether sk_msg is the last msg of ingress_msg queue, the next
> sk_msg may be the head of ingress_msg queue, whose memory address of
> sg page is invalid. So it's necessary to add check codes to prevent
> this problem.
>
> [20759.125457] BUG: kernel NULL pointer dereference, address:
> 0000000000000008
> [20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G            E
> 5.4.32 #1
> [20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
> 4.1.12 06/18/2017
> [20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
> [20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
> [20759.276099] tcp_bpf_recvmsg+0x113/0x370
> [20759.281137] inet_recvmsg+0x55/0xc0
> [20759.285734] __sys_recvfrom+0xc8/0x130
> [20759.290566] ? __audit_syscall_entry+0x103/0x130
> [20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
> [20759.301700] ? __audit_syscall_exit+0x1e4/0x290
> [20759.307235] __x64_sys_recvfrom+0x24/0x30
> [20759.312226] do_syscall_64+0x55/0x1b0
> [20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Signed-off-by: dihu <anny.hu@linux.alibaba.com>
> ---
>  net/ipv4/tcp_bpf.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index 5a05327..b82e4c3 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -64,6 +64,9 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
>  		} while (i != msg_rx->sg.end);
>  
>  		if (unlikely(peek)) {
> +			if (msg_rx == list_last_entry(&psock->ingress_msg,
> +						      struct sk_msg, list))
> +				break;
>  			msg_rx = list_next_entry(msg_rx, list);
>  			continue;
>  		}

Acked-by: Jakub Sitnicki <jakub@cloudflare.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
  2020-06-05  8:46 dihu
@ 2020-06-08 16:06 ` John Fastabend
  2020-06-09  9:03 ` Jakub Sitnicki
  1 sibling, 0 replies; 7+ messages in thread
From: John Fastabend @ 2020-06-08 16:06 UTC (permalink / raw)
  To: dihu, bpf; +Cc: netdev, dihu

dihu wrote:
> When user application calls read() with MSG_PEEK flag to read data
> of bpf sockmap socket, kernel panic happens at
> __tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
> queue after read out under MSG_PEEK flag is set. Because it's not
> judged whether sk_msg is the last msg of ingress_msg queue, the next
> sk_msg may be the head of ingress_msg queue, whose memory address of
> sg page is invalid. So it's necessary to add check codes to prevent
> this problem.
> 
> [20759.125457] BUG: kernel NULL pointer dereference, address:
> 0000000000000008
> [20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G            E
> 5.4.32 #1
> [20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
> 4.1.12 06/18/2017
> [20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
> [20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
> [20759.276099] tcp_bpf_recvmsg+0x113/0x370
> [20759.281137] inet_recvmsg+0x55/0xc0
> [20759.285734] __sys_recvfrom+0xc8/0x130
> [20759.290566] ? __audit_syscall_entry+0x103/0x130
> [20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
> [20759.301700] ? __audit_syscall_exit+0x1e4/0x290
> [20759.307235] __x64_sys_recvfrom+0x24/0x30
> [20759.312226] do_syscall_64+0x55/0x1b0
> [20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Signed-off-by: dihu <anny.hu@linux.alibaba.com>
> ---
>  net/ipv4/tcp_bpf.c | 3 +++
>  1 file changed, 3 insertions(+)
> 

Thanks, looks good to me.

Acked-by: John Fastabend <john.fastabend@gmail.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg
@ 2020-06-05  8:46 dihu
  2020-06-08 16:06 ` John Fastabend
  2020-06-09  9:03 ` Jakub Sitnicki
  0 siblings, 2 replies; 7+ messages in thread
From: dihu @ 2020-06-05  8:46 UTC (permalink / raw)
  To: bpf; +Cc: netdev, dihu

When user application calls read() with MSG_PEEK flag to read data
of bpf sockmap socket, kernel panic happens at
__tcp_bpf_recvmsg+0x12c/0x350. sk_msg is not removed from ingress_msg
queue after read out under MSG_PEEK flag is set. Because it's not
judged whether sk_msg is the last msg of ingress_msg queue, the next
sk_msg may be the head of ingress_msg queue, whose memory address of
sg page is invalid. So it's necessary to add check codes to prevent
this problem.

[20759.125457] BUG: kernel NULL pointer dereference, address:
0000000000000008
[20759.132118] CPU: 53 PID: 51378 Comm: envoy Tainted: G            E
5.4.32 #1
[20759.140890] Hardware name: Inspur SA5212M4/YZMB-00370-109, BIOS
4.1.12 06/18/2017
[20759.149734] RIP: 0010:copy_page_to_iter+0xad/0x300
[20759.270877] __tcp_bpf_recvmsg+0x12c/0x350
[20759.276099] tcp_bpf_recvmsg+0x113/0x370
[20759.281137] inet_recvmsg+0x55/0xc0
[20759.285734] __sys_recvfrom+0xc8/0x130
[20759.290566] ? __audit_syscall_entry+0x103/0x130
[20759.296227] ? syscall_trace_enter+0x1d2/0x2d0
[20759.301700] ? __audit_syscall_exit+0x1e4/0x290
[20759.307235] __x64_sys_recvfrom+0x24/0x30
[20759.312226] do_syscall_64+0x55/0x1b0
[20759.316852] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: dihu <anny.hu@linux.alibaba.com>
---
 net/ipv4/tcp_bpf.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 5a05327..b82e4c3 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -64,6 +64,9 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
 		} while (i != msg_rx->sg.end);
 
 		if (unlikely(peek)) {
+			if (msg_rx == list_last_entry(&psock->ingress_msg,
+						      struct sk_msg, list))
+				break;
 			msg_rx = list_next_entry(msg_rx, list);
 			continue;
 		}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-06-09 17:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <db5393a3-d4b3-45c1-8219-f23b43a8d2ab.anny.hu@linux.alibaba.com>
2020-05-26 21:10 ` [PATCH] bpf/sockmap: fix kernel panic at __tcp_bpf_recvmsg John Fastabend
2020-05-29  9:05   ` dihu
2020-06-02  9:03     ` Jakub Sitnicki
2020-06-05  8:46 dihu
2020-06-08 16:06 ` John Fastabend
2020-06-09  9:03 ` Jakub Sitnicki
2020-06-09 17:58   ` Alexei Starovoitov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).