linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Review request 0/1: fs/proc: Fix NULL pointer dereference in
@ 2020-12-09 11:20 Yahu Gao
  2020-12-09 11:21 ` [PATCH] fs/proc: Fix NULL pointer dereference in pid_delete_dentry Yahu Gao
  2020-12-09 21:54 ` Review request 0/1: fs/proc: Fix NULL pointer dereference in Eric W. Biederman
  0 siblings, 2 replies; 8+ messages in thread
From: Yahu Gao @ 2020-12-09 11:20 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: linux-kernel, linux-fsdevel, yahu.gao

There is a kernel NULL pointer dereference was found in Linux system.
The details of kernel NULL is shown at bellow.

Currently, we do not have a way to provoke this fault on purpose, but
the reproduction rate in out CI loops is high enough that we could go
for a trace patch in black or white UP and get a reproduction in few
weeks.

Our kernel version is 4.1.21, but via analyzing the source code of the
call trace. The upstream version should be affected. Really sorry for
havn't reproduced this in upstream version. But it's easier to be safe
than to prove it can't happen, right?

Details of kernel crash:
----------------------------------------------------------------------
[1446.285834] Unable to handle kernel NULL pointer dereference at
virtual address 00000008
[ 1446.293943] pgd = e4af0880
[ 1446.296656] [00000008] *pgd=10cc3003, *pmd=04153003, *pte=00000000
[ 1446.302898] Internal error: Oops: 207 1 PREEMPT SMP ARM
[ 1446.302950] Modules linked in: adkNetD ncp
lttng_ring_buffer_client_mmap_overwrite(C)
lttng_ring_buffer_client_mmap_discard(C)
lttng_ring_buffer_client_discard(C)
lttng_ring_buffer_metadata_mmap_client(C) lttng_probe_printk(C)
lttng_probe_irq(C) lttng_ring_buffer_metadata_client(C)
lttng_ring_buffer_client_overwrite(C) lttng_probe_signal(C)
lttng_probe_sched(C) lttng_tracer(C) lttng_statedump(C)
lttng_lib_ring_buffer(C) lttng_clock_plugin_arm_cntpct(C) lttng_clock(C)
[ 1446.302963] CPU: 0 PID: 12086 Comm: netstat Tainted: G C
4.1.21-rt13-* #1
[ 1446.302967] Hardware name: Ericsson CPM1
[ 1446.302972] task: cbd75480 ti: c4a68000 task.ti: c4a68000
[ 1446.302984] PC is at pid_delete_dentry+0x8/0x18
[ 1446.302992] LR is at dput+0x1a8/0x2b4
[ 1446.303003] pc : [] lr : [] psr: 20070013
[ 1446.303003] sp : c4a69e88 ip : 00000000 fp : 00000000
[ 1446.303007] r10: 000218cc r9 : cd228000 r8 : e5f44320
[ 1446.303011] r7 : 00000001 r6 : 00080040 r5 : c4aa97d0 r4 : c4aa9780
[ 1446.303015] r3 : 00000000 r2 : cbd75480 r1 : 00000000 r0 : c4aa9780
[ 1446.303020] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment
user
[ 1446.303026] Control: 30c5387d Table: 24af0880 DAC: 000000fd
[ 1446.303033] Process netstat (pid: 12086, stack limit = 0xc4a68218)
[ 1446.303039] Stack: (0xc4a69e88 to 0xc4a6a000)
[ 1446.303052] 9e80: c4a69f70 0000a1c0 c4a69f13 00000002 e5f44320
cd228000
[ 1446.303059] 9ea0: 000218cc c0571604 c0a60bcc 00000000 00000000
00000000 c4a69f20 c4a69f15
[ 1446.303065] 9ec0: 00003133 00000002 c4a69f13 00000000 0000001f
c4a69f70 c35de800 0000007c
[ 1446.303072] 9ee0: ce2b1c00 cd228000 00000001 c05747b8 c05745cc
c35de800 0000001f 00000000
[ 1446.303078] 9f00: 00000004 cd228008 00020000 c05745cc 33000004
c0400031 c4a68000 00000400
[ 1446.303086] 9f20: beb78c2c cd228000 c4a69f70 00000000 cd228008
c0ffca90 c4a68000 00000400
[ 1446.303103] 9f40: beb78c2c c052cd0c bf08a774 00000400 01480080
00008000 cd228000 cd228000
[ 1446.303114] 9f60: c040f7c8 c4a68000 00000400 c052d22c c052cd8c
00000000 00000021 00000000
[ 1446.303127] 9f80: 01480290 01480280 00007df0 ffffffea 01480060
01480060 01480064 b6e424c0
[ 1446.303143] 9fa0: 0000008d c040f794 01480060 01480064 00000004
01480080 00008000 00000000
[ 1446.303150] 9fc0: 01480060 01480064 b6e424c0 0000008d 01480080
01480060 00035440 beb78c2c
[ 1446.303156] 9fe0: 01480080 beb78160 b6ede59c b6edea3c 60070010
00000004 00000000 00000000
[ 1446.303167] [] (pid_delete_dentry) from [] (dput+0x1a8/0x2b4)
[ 1446.303176] [] (dput) from [] (proc_fill_cache+0x54/0x10c)
[ 1446.303189] [] (proc_fill_cache) from []
(proc_readfd_common+0xd8/0x238)
[ 1446.303203] [] (proc_readfd_common) from [] (iterate_dir+0x98/0x118)
[ 1446.303217] [] (iterate_dir) from [] (SyS_getdents+0x7c/0xf0)
[ 1446.303233] [] (SyS_getdents) from [] (__sys_trace_return+0x0/0x2c)
[ 1446.303243] Code: e8bd0030 e12fff1e e5903028 e5133020 (e5930008) 


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] fs/proc: Fix NULL pointer dereference in pid_delete_dentry
  2020-12-09 11:20 Review request 0/1: fs/proc: Fix NULL pointer dereference in Yahu Gao
@ 2020-12-09 11:21 ` Yahu Gao
  2020-12-09 13:16   ` Al Viro
  2020-12-09 21:54 ` Review request 0/1: fs/proc: Fix NULL pointer dereference in Eric W. Biederman
  1 sibling, 1 reply; 8+ messages in thread
From: Yahu Gao @ 2020-12-09 11:21 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: linux-kernel, linux-fsdevel, yahu.gao

Get the staus of task from the pointer of proc inode directly is not
safe. The function get_proc_task make it happen in RCU protection.

Signed-off-by: Yahu Gao <yahu.gao@windriver.com>
---
 fs/proc/base.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 1bc9bcdef09f..05f33bb35067 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1994,7 +1994,7 @@ static int pid_revalidate(struct dentry *dentry, unsigned int flags)
 
 static inline bool proc_inode_is_dead(struct inode *inode)
 {
-	return !proc_pid(inode)->tasks[PIDTYPE_PID].first;
+	return !get_proc_task(inode);
 }
 
 int pid_delete_dentry(const struct dentry *dentry)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] fs/proc: Fix NULL pointer dereference in pid_delete_dentry
  2020-12-09 11:21 ` [PATCH] fs/proc: Fix NULL pointer dereference in pid_delete_dentry Yahu Gao
@ 2020-12-09 13:16   ` Al Viro
  2021-01-07  9:41     ` [PATCH v2] " Gao Yahu
  0 siblings, 1 reply; 8+ messages in thread
From: Al Viro @ 2020-12-09 13:16 UTC (permalink / raw)
  To: Yahu Gao; +Cc: Alexey Dobriyan, linux-kernel, linux-fsdevel

On Wed, Dec 09, 2020 at 07:21:00PM +0800, Yahu Gao wrote:
> Get the staus of task from the pointer of proc inode directly is not
> safe. The function get_proc_task make it happen in RCU protection.

This is completely broken,  get_proc_task() acquires a reference to
task_struct; your patch is an instant leak.

> Signed-off-by: Yahu Gao <yahu.gao@windriver.com>
> ---
>  fs/proc/base.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 1bc9bcdef09f..05f33bb35067 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1994,7 +1994,7 @@ static int pid_revalidate(struct dentry *dentry, unsigned int flags)
>  
>  static inline bool proc_inode_is_dead(struct inode *inode)
>  {
> -	return !proc_pid(inode)->tasks[PIDTYPE_PID].first;
> +	return !get_proc_task(inode);
>  }
>  
>  int pid_delete_dentry(const struct dentry *dentry)
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review request 0/1: fs/proc: Fix NULL pointer dereference in
  2020-12-09 11:20 Review request 0/1: fs/proc: Fix NULL pointer dereference in Yahu Gao
  2020-12-09 11:21 ` [PATCH] fs/proc: Fix NULL pointer dereference in pid_delete_dentry Yahu Gao
@ 2020-12-09 21:54 ` Eric W. Biederman
  2021-01-07  9:02   ` Gao Yahu
                     ` (2 more replies)
  1 sibling, 3 replies; 8+ messages in thread
From: Eric W. Biederman @ 2020-12-09 21:54 UTC (permalink / raw)
  To: Yahu Gao; +Cc: Alexey Dobriyan, linux-kernel, linux-fsdevel, Al Viro

Yahu Gao <yahu.gao@windriver.com> writes:

> There is a kernel NULL pointer dereference was found in Linux system.
> The details of kernel NULL is shown at bellow.
>
> Currently, we do not have a way to provoke this fault on purpose, but
> the reproduction rate in out CI loops is high enough that we could go
> for a trace patch in black or white UP and get a reproduction in few
> weeks.
>
> Our kernel version is 4.1.21, but via analyzing the source code of the
> call trace. The upstream version should be affected. Really sorry for
> havn't reproduced this in upstream version. But it's easier to be safe
> than to prove it can't happen, right?

Except I there are strong invariants that suggests that it takes
a memory stomp to get a NULL pointer deference here.

For the life of a proc inode PROC_I(inode)->pid should be non-NULL.

For a non-NULL pid pointer ->tasks[PIDTYPE_PID].first simply reads
an entry out of the struct pid.  Only pid needs to be non-NULL.

So I don't see how you are getting a NULL pointer derference.

Have you decoded the oops, looked at the assembly and seen which field
is NULL?  I expec that will help you track down what is wrong.


> Details of kernel crash:
> ----------------------------------------------------------------------
> [1446.285834] Unable to handle kernel NULL pointer dereference at
> virtual address 00000008
> [ 1446.293943] pgd = e4af0880
> [ 1446.296656] [00000008] *pgd=10cc3003, *pmd=04153003, *pte=00000000
> [ 1446.302898] Internal error: Oops: 207 1 PREEMPT SMP ARM
> [ 1446.302950] Modules linked in: adkNetD ncp
> lttng_ring_buffer_client_mmap_overwrite(C)
> lttng_ring_buffer_client_mmap_discard(C)
> lttng_ring_buffer_client_discard(C)
> lttng_ring_buffer_metadata_mmap_client(C) lttng_probe_printk(C)
> lttng_probe_irq(C) lttng_ring_buffer_metadata_client(C)
> lttng_ring_buffer_client_overwrite(C) lttng_probe_signal(C)
> lttng_probe_sched(C) lttng_tracer(C) lttng_statedump(C)
> lttng_lib_ring_buffer(C) lttng_clock_plugin_arm_cntpct(C) lttng_clock(C)
> [ 1446.302963] CPU: 0 PID: 12086 Comm: netstat Tainted: G C
> 4.1.21-rt13-* #1
> [ 1446.302967] Hardware name: Ericsson CPM1
> [ 1446.302972] task: cbd75480 ti: c4a68000 task.ti: c4a68000
> [ 1446.302984] PC is at pid_delete_dentry+0x8/0x18
> [ 1446.302992] LR is at dput+0x1a8/0x2b4
> [ 1446.303003] pc : [] lr : [] psr: 20070013
> [ 1446.303003] sp : c4a69e88 ip : 00000000 fp : 00000000
> [ 1446.303007] r10: 000218cc r9 : cd228000 r8 : e5f44320
> [ 1446.303011] r7 : 00000001 r6 : 00080040 r5 : c4aa97d0 r4 : c4aa9780
> [ 1446.303015] r3 : 00000000 r2 : cbd75480 r1 : 00000000 r0 : c4aa9780
> [ 1446.303020] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment
> user
> [ 1446.303026] Control: 30c5387d Table: 24af0880 DAC: 000000fd
> [ 1446.303033] Process netstat (pid: 12086, stack limit = 0xc4a68218)
> [ 1446.303039] Stack: (0xc4a69e88 to 0xc4a6a000)
> [ 1446.303052] 9e80: c4a69f70 0000a1c0 c4a69f13 00000002 e5f44320
> cd228000
> [ 1446.303059] 9ea0: 000218cc c0571604 c0a60bcc 00000000 00000000
> 00000000 c4a69f20 c4a69f15
> [ 1446.303065] 9ec0: 00003133 00000002 c4a69f13 00000000 0000001f
> c4a69f70 c35de800 0000007c
> [ 1446.303072] 9ee0: ce2b1c00 cd228000 00000001 c05747b8 c05745cc
> c35de800 0000001f 00000000
> [ 1446.303078] 9f00: 00000004 cd228008 00020000 c05745cc 33000004
> c0400031 c4a68000 00000400
> [ 1446.303086] 9f20: beb78c2c cd228000 c4a69f70 00000000 cd228008
> c0ffca90 c4a68000 00000400
> [ 1446.303103] 9f40: beb78c2c c052cd0c bf08a774 00000400 01480080
> 00008000 cd228000 cd228000
> [ 1446.303114] 9f60: c040f7c8 c4a68000 00000400 c052d22c c052cd8c
> 00000000 00000021 00000000
> [ 1446.303127] 9f80: 01480290 01480280 00007df0 ffffffea 01480060
> 01480060 01480064 b6e424c0
> [ 1446.303143] 9fa0: 0000008d c040f794 01480060 01480064 00000004
> 01480080 00008000 00000000
> [ 1446.303150] 9fc0: 01480060 01480064 b6e424c0 0000008d 01480080
> 01480060 00035440 beb78c2c
> [ 1446.303156] 9fe0: 01480080 beb78160 b6ede59c b6edea3c 60070010
> 00000004 00000000 00000000
> [ 1446.303167] [] (pid_delete_dentry) from [] (dput+0x1a8/0x2b4)
> [ 1446.303176] [] (dput) from [] (proc_fill_cache+0x54/0x10c)
> [ 1446.303189] [] (proc_fill_cache) from []
> (proc_readfd_common+0xd8/0x238)
> [ 1446.303203] [] (proc_readfd_common) from [] (iterate_dir+0x98/0x118)
> [ 1446.303217] [] (iterate_dir) from [] (SyS_getdents+0x7c/0xf0)
> [ 1446.303233] [] (SyS_getdents) from [] (__sys_trace_return+0x0/0x2c)
> [ 1446.303243] Code: e8bd0030 e12fff1e e5903028 e5133020 (e5930008) 

Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review request 0/1: fs/proc: Fix NULL pointer dereference in
  2020-12-09 21:54 ` Review request 0/1: fs/proc: Fix NULL pointer dereference in Eric W. Biederman
@ 2021-01-07  9:02   ` Gao Yahu
  2021-01-07  9:03   ` Gao, Yahu
  2021-01-07  9:07   ` Gao Yahu
  2 siblings, 0 replies; 8+ messages in thread
From: Gao Yahu @ 2021-01-07  9:02 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Yahu Gao, Alexey Dobriyan, linux-kernel, linux-fsdevel, Al Viro


Eric W. Biederman <ebiederm@xmission.com> writes:

> [Please note this e-mail is from an EXTERNAL e-mail address]
>
> Yahu Gao <yahu.gao@windriver.com> writes:
>
>> There is a kernel NULL pointer dereference was found in Linux system.
>> The details of kernel NULL is shown at bellow.
>>
>> Currently, we do not have a way to provoke this fault on purpose, but
>> the reproduction rate in out CI loops is high enough that we could go
>> for a trace patch in black or white UP and get a reproduction in few
>> weeks.
>>
>> Our kernel version is 4.1.21, but via analyzing the source code of the
>> call trace. The upstream version should be affected. Really sorry for
>> havn't reproduced this in upstream version. But it's easier to be safe
>> than to prove it can't happen, right?
>
> Except I there are strong invariants that suggests that it takes
> a memory stomp to get a NULL pointer deference here.
>

Sorry for late reply. I took a long time to find root cause of memory stomp,
but got nothing for now:(

> For the life of a proc inode PROC_I(inode)->pid should be non-NULL.
>
> For a non-NULL pid pointer ->tasks[PIDTYPE_PID].first simply reads
> an entry out of the struct pid.  Only pid needs to be non-NULL.
>
> So I don't see how you are getting a NULL pointer derference.
>
> Have you decoded the oops, looked at the assembly and seen which field
> is NULL?  I expec that will help you track down what is wrong.
>
There are two messages I had found:
1. 'Unable to handle kernel NULL pointer dereference at virtual address
 00000008'.
2. In kernel 4.1.21, the members of 'struct pid' likes follow:
(Apologize for missing this message at first)
...
struct pid
{
        atomic_t count;
        unsigned int level;
        /* lists of tasks that use this pid */
        struct hlist_head tasks[PIDTYPE_MAX];
        struct rcu_head rcu;
        struct upid numbers[1];
};
...
The offset of the member *tasks* from *struct pid* is 0x00000008.

Based on the above message, I thought we got a NULL pointer of struct
pid.

Regards,
Yahu

>
>> Details of kernel crash:
>> ----------------------------------------------------------------------
>> [1446.285834] Unable to handle kernel NULL pointer dereference at
>> virtual address 00000008
>> [ 1446.293943] pgd = e4af0880
>> [ 1446.296656] [00000008] *pgd=10cc3003, *pmd=04153003, *pte=00000000
>> [ 1446.302898] Internal error: Oops: 207 1 PREEMPT SMP ARM
>> [ 1446.302950] Modules linked in: adkNetD ncp
>> lttng_ring_buffer_client_mmap_overwrite(C)
>> lttng_ring_buffer_client_mmap_discard(C)
>> lttng_ring_buffer_client_discard(C)
>> lttng_ring_buffer_metadata_mmap_client(C) lttng_probe_printk(C)
>> lttng_probe_irq(C) lttng_ring_buffer_metadata_client(C)
>> lttng_ring_buffer_client_overwrite(C) lttng_probe_signal(C)
>> lttng_probe_sched(C) lttng_tracer(C) lttng_statedump(C)
>> lttng_lib_ring_buffer(C) lttng_clock_plugin_arm_cntpct(C) lttng_clock(C)
>> [ 1446.302963] CPU: 0 PID: 12086 Comm: netstat Tainted: G C
>> 4.1.21-rt13-* #1
>> [ 1446.302967] Hardware name: Ericsson CPM1
>> [ 1446.302972] task: cbd75480 ti: c4a68000 task.ti: c4a68000
>> [ 1446.302984] PC is at pid_delete_dentry+0x8/0x18
>> [ 1446.302992] LR is at dput+0x1a8/0x2b4
>> [ 1446.303003] pc : [] lr : [] psr: 20070013
>> [ 1446.303003] sp : c4a69e88 ip : 00000000 fp : 00000000
>> [ 1446.303007] r10: 000218cc r9 : cd228000 r8 : e5f44320
>> [ 1446.303011] r7 : 00000001 r6 : 00080040 r5 : c4aa97d0 r4 : c4aa9780
>> [ 1446.303015] r3 : 00000000 r2 : cbd75480 r1 : 00000000 r0 : c4aa9780
>> [ 1446.303020] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment
>> user
>> [ 1446.303026] Control: 30c5387d Table: 24af0880 DAC: 000000fd
>> [ 1446.303033] Process netstat (pid: 12086, stack limit = 0xc4a68218)
>> [ 1446.303039] Stack: (0xc4a69e88 to 0xc4a6a000)
>> [ 1446.303052] 9e80: c4a69f70 0000a1c0 c4a69f13 00000002 e5f44320
>> cd228000
>> [ 1446.303059] 9ea0: 000218cc c0571604 c0a60bcc 00000000 00000000
>> 00000000 c4a69f20 c4a69f15
>> [ 1446.303065] 9ec0: 00003133 00000002 c4a69f13 00000000 0000001f
>> c4a69f70 c35de800 0000007c
>> [ 1446.303072] 9ee0: ce2b1c00 cd228000 00000001 c05747b8 c05745cc
>> c35de800 0000001f 00000000
>> [ 1446.303078] 9f00: 00000004 cd228008 00020000 c05745cc 33000004
>> c0400031 c4a68000 00000400
>> [ 1446.303086] 9f20: beb78c2c cd228000 c4a69f70 00000000 cd228008
>> c0ffca90 c4a68000 00000400
>> [ 1446.303103] 9f40: beb78c2c c052cd0c bf08a774 00000400 01480080
>> 00008000 cd228000 cd228000
>> [ 1446.303114] 9f60: c040f7c8 c4a68000 00000400 c052d22c c052cd8c
>> 00000000 00000021 00000000
>> [ 1446.303127] 9f80: 01480290 01480280 00007df0 ffffffea 01480060
>> 01480060 01480064 b6e424c0
>> [ 1446.303143] 9fa0: 0000008d c040f794 01480060 01480064 00000004
>> 01480080 00008000 00000000
>> [ 1446.303150] 9fc0: 01480060 01480064 b6e424c0 0000008d 01480080
>> 01480060 00035440 beb78c2c
>> [ 1446.303156] 9fe0: 01480080 beb78160 b6ede59c b6edea3c 60070010
>> 00000004 00000000 00000000
>> [ 1446.303167] [] (pid_delete_dentry) from [] (dput+0x1a8/0x2b4)
>> [ 1446.303176] [] (dput) from [] (proc_fill_cache+0x54/0x10c)
>> [ 1446.303189] [] (proc_fill_cache) from []
>> (proc_readfd_common+0xd8/0x238)
>> [ 1446.303203] [] (proc_readfd_common) from [] (iterate_dir+0x98/0x118)
>> [ 1446.303217] [] (iterate_dir) from [] (SyS_getdents+0x7c/0xf0)
>> [ 1446.303233] [] (SyS_getdents) from [] (__sys_trace_return+0x0/0x2c)
>> [ 1446.303243] Code: e8bd0030 e12fff1e e5903028 e5133020 (e5930008)
>
> Eric


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review request 0/1: fs/proc: Fix NULL pointer dereference in
  2020-12-09 21:54 ` Review request 0/1: fs/proc: Fix NULL pointer dereference in Eric W. Biederman
  2021-01-07  9:02   ` Gao Yahu
@ 2021-01-07  9:03   ` Gao, Yahu
  2021-01-07  9:07   ` Gao Yahu
  2 siblings, 0 replies; 8+ messages in thread
From: Gao, Yahu @ 2021-01-07  9:03 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Gao, Yahu, Alexey Dobriyan, linux-kernel, linux-fsdevel, Al Viro


Eric W. Biederman <ebiederm@xmission.com> writes:

> [Please note this e-mail is from an EXTERNAL e-mail address]
>
> Yahu Gao <yahu.gao@windriver.com> writes:
>
>> There is a kernel NULL pointer dereference was found in Linux system.
>> The details of kernel NULL is shown at bellow.
>>
>> Currently, we do not have a way to provoke this fault on purpose, but
>> the reproduction rate in out CI loops is high enough that we could go
>> for a trace patch in black or white UP and get a reproduction in few
>> weeks.
>>
>> Our kernel version is 4.1.21, but via analyzing the source code of the
>> call trace. The upstream version should be affected. Really sorry for
>> havn't reproduced this in upstream version. But it's easier to be safe
>> than to prove it can't happen, right?
>
> Except I there are strong invariants that suggests that it takes
> a memory stomp to get a NULL pointer deference here.
>

Sorry for late reply. I took a long time to find root cause of memory stomp,
but got nothing for now:(

> For the life of a proc inode PROC_I(inode)->pid should be non-NULL.
>
> For a non-NULL pid pointer ->tasks[PIDTYPE_PID].first simply reads
> an entry out of the struct pid.  Only pid needs to be non-NULL.
>
> So I don't see how you are getting a NULL pointer derference.
>
> Have you decoded the oops, looked at the assembly and seen which field
> is NULL?  I expec that will help you track down what is wrong.
>
There are two messages I had found:
1. 'Unable to handle kernel NULL pointer dereference at virtual address
 00000008'.
2. In kernel 4.1.21, the members of 'struct pid' likes follow:
(Apologize for missing this message at first)
...
struct pid
{
        atomic_t count;
        unsigned int level;
        /* lists of tasks that use this pid */
        struct hlist_head tasks[PIDTYPE_MAX];
        struct rcu_head rcu;
        struct upid numbers[1];
};
...
The offset of the member *tasks* from *struct pid* is 0x00000008.

Based on the above message, I thought we got a NULL pointer of struct
pid.

Regards,
Yahu

>
>> Details of kernel crash:
>> ----------------------------------------------------------------------
>> [1446.285834] Unable to handle kernel NULL pointer dereference at
>> virtual address 00000008
>> [ 1446.293943] pgd = e4af0880
>> [ 1446.296656] [00000008] *pgd=10cc3003, *pmd=04153003, *pte=00000000
>> [ 1446.302898] Internal error: Oops: 207 1 PREEMPT SMP ARM
>> [ 1446.302950] Modules linked in: adkNetD ncp
>> lttng_ring_buffer_client_mmap_overwrite(C)
>> lttng_ring_buffer_client_mmap_discard(C)
>> lttng_ring_buffer_client_discard(C)
>> lttng_ring_buffer_metadata_mmap_client(C) lttng_probe_printk(C)
>> lttng_probe_irq(C) lttng_ring_buffer_metadata_client(C)
>> lttng_ring_buffer_client_overwrite(C) lttng_probe_signal(C)
>> lttng_probe_sched(C) lttng_tracer(C) lttng_statedump(C)
>> lttng_lib_ring_buffer(C) lttng_clock_plugin_arm_cntpct(C) lttng_clock(C)
>> [ 1446.302963] CPU: 0 PID: 12086 Comm: netstat Tainted: G C
>> 4.1.21-rt13-* #1
>> [ 1446.302967] Hardware name: Ericsson CPM1
>> [ 1446.302972] task: cbd75480 ti: c4a68000 task.ti: c4a68000
>> [ 1446.302984] PC is at pid_delete_dentry+0x8/0x18
>> [ 1446.302992] LR is at dput+0x1a8/0x2b4
>> [ 1446.303003] pc : [] lr : [] psr: 20070013
>> [ 1446.303003] sp : c4a69e88 ip : 00000000 fp : 00000000
>> [ 1446.303007] r10: 000218cc r9 : cd228000 r8 : e5f44320
>> [ 1446.303011] r7 : 00000001 r6 : 00080040 r5 : c4aa97d0 r4 : c4aa9780
>> [ 1446.303015] r3 : 00000000 r2 : cbd75480 r1 : 00000000 r0 : c4aa9780
>> [ 1446.303020] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment
>> user
>> [ 1446.303026] Control: 30c5387d Table: 24af0880 DAC: 000000fd
>> [ 1446.303033] Process netstat (pid: 12086, stack limit = 0xc4a68218)
>> [ 1446.303039] Stack: (0xc4a69e88 to 0xc4a6a000)
>> [ 1446.303052] 9e80: c4a69f70 0000a1c0 c4a69f13 00000002 e5f44320
>> cd228000
>> [ 1446.303059] 9ea0: 000218cc c0571604 c0a60bcc 00000000 00000000
>> 00000000 c4a69f20 c4a69f15
>> [ 1446.303065] 9ec0: 00003133 00000002 c4a69f13 00000000 0000001f
>> c4a69f70 c35de800 0000007c
>> [ 1446.303072] 9ee0: ce2b1c00 cd228000 00000001 c05747b8 c05745cc
>> c35de800 0000001f 00000000
>> [ 1446.303078] 9f00: 00000004 cd228008 00020000 c05745cc 33000004
>> c0400031 c4a68000 00000400
>> [ 1446.303086] 9f20: beb78c2c cd228000 c4a69f70 00000000 cd228008
>> c0ffca90 c4a68000 00000400
>> [ 1446.303103] 9f40: beb78c2c c052cd0c bf08a774 00000400 01480080
>> 00008000 cd228000 cd228000
>> [ 1446.303114] 9f60: c040f7c8 c4a68000 00000400 c052d22c c052cd8c
>> 00000000 00000021 00000000
>> [ 1446.303127] 9f80: 01480290 01480280 00007df0 ffffffea 01480060
>> 01480060 01480064 b6e424c0
>> [ 1446.303143] 9fa0: 0000008d c040f794 01480060 01480064 00000004
>> 01480080 00008000 00000000
>> [ 1446.303150] 9fc0: 01480060 01480064 b6e424c0 0000008d 01480080
>> 01480060 00035440 beb78c2c
>> [ 1446.303156] 9fe0: 01480080 beb78160 b6ede59c b6edea3c 60070010
>> 00000004 00000000 00000000
>> [ 1446.303167] [] (pid_delete_dentry) from [] (dput+0x1a8/0x2b4)
>> [ 1446.303176] [] (dput) from [] (proc_fill_cache+0x54/0x10c)
>> [ 1446.303189] [] (proc_fill_cache) from []
>> (proc_readfd_common+0xd8/0x238)
>> [ 1446.303203] [] (proc_readfd_common) from [] (iterate_dir+0x98/0x118)
>> [ 1446.303217] [] (iterate_dir) from [] (SyS_getdents+0x7c/0xf0)
>> [ 1446.303233] [] (SyS_getdents) from [] (__sys_trace_return+0x0/0x2c)
>> [ 1446.303243] Code: e8bd0030 e12fff1e e5903028 e5133020 (e5930008)
>
> Eric


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Review request 0/1: fs/proc: Fix NULL pointer dereference in
  2020-12-09 21:54 ` Review request 0/1: fs/proc: Fix NULL pointer dereference in Eric W. Biederman
  2021-01-07  9:02   ` Gao Yahu
  2021-01-07  9:03   ` Gao, Yahu
@ 2021-01-07  9:07   ` Gao Yahu
  2 siblings, 0 replies; 8+ messages in thread
From: Gao Yahu @ 2021-01-07  9:07 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Yahu Gao, Alexey Dobriyan, linux-kernel, linux-fsdevel, Al Viro


Eric W. Biederman <ebiederm@xmission.com> writes:

> [Please note this e-mail is from an EXTERNAL e-mail address]
>
> Yahu Gao <yahu.gao@windriver.com> writes:
>
>> There is a kernel NULL pointer dereference was found in Linux system.
>> The details of kernel NULL is shown at bellow.
>>
>> Currently, we do not have a way to provoke this fault on purpose, but
>> the reproduction rate in out CI loops is high enough that we could go
>> for a trace patch in black or white UP and get a reproduction in few
>> weeks.
>>
>> Our kernel version is 4.1.21, but via analyzing the source code of the
>> call trace. The upstream version should be affected. Really sorry for
>> havn't reproduced this in upstream version. But it's easier to be safe
>> than to prove it can't happen, right?
>
> Except I there are strong invariants that suggests that it takes
> a memory stomp to get a NULL pointer deference here.
>

Sorry for late reply. I took a long time to find root cause of memory stomp,
but got nothing for now:(

> For the life of a proc inode PROC_I(inode)->pid should be non-NULL.
>
> For a non-NULL pid pointer ->tasks[PIDTYPE_PID].first simply reads
> an entry out of the struct pid.  Only pid needs to be non-NULL.
>
> So I don't see how you are getting a NULL pointer derference.
>
> Have you decoded the oops, looked at the assembly and seen which field
> is NULL?  I expec that will help you track down what is wrong.
>

There are two messages I had found:
1. 'Unable to handle kernel NULL pointer dereference at virtual address
 00000008'.
2. In kernel 4.1.21, the members of 'struct pid' likes follow:
(Apologize for missing this message at first)
...
struct pid
{
        atomic_t count;
        unsigned int level;
        /* lists of tasks that use this pid */
        struct hlist_head tasks[PIDTYPE_MAX];
        struct rcu_head rcu;
        struct upid numbers[1];
};
...
The offset of the member *tasks* from *struct pid* is 0x00000008.

Based on the above message, I thought we got a NULL pointer of struct
pid.

Regards,
Yahu

>
>> Details of kernel crash:
>> ----------------------------------------------------------------------
>> [1446.285834] Unable to handle kernel NULL pointer dereference at
>> virtual address 00000008
>> [ 1446.293943] pgd = e4af0880
>> [ 1446.296656] [00000008] *pgd=10cc3003, *pmd=04153003, *pte=00000000
>> [ 1446.302898] Internal error: Oops: 207 1 PREEMPT SMP ARM
>> [ 1446.302950] Modules linked in: adkNetD ncp
>> lttng_ring_buffer_client_mmap_overwrite(C)
>> lttng_ring_buffer_client_mmap_discard(C)
>> lttng_ring_buffer_client_discard(C)
>> lttng_ring_buffer_metadata_mmap_client(C) lttng_probe_printk(C)
>> lttng_probe_irq(C) lttng_ring_buffer_metadata_client(C)
>> lttng_ring_buffer_client_overwrite(C) lttng_probe_signal(C)
>> lttng_probe_sched(C) lttng_tracer(C) lttng_statedump(C)
>> lttng_lib_ring_buffer(C) lttng_clock_plugin_arm_cntpct(C) lttng_clock(C)
>> [ 1446.302963] CPU: 0 PID: 12086 Comm: netstat Tainted: G C
>> 4.1.21-rt13-* #1
>> [ 1446.302967] Hardware name: Ericsson CPM1
>> [ 1446.302972] task: cbd75480 ti: c4a68000 task.ti: c4a68000
>> [ 1446.302984] PC is at pid_delete_dentry+0x8/0x18
>> [ 1446.302992] LR is at dput+0x1a8/0x2b4
>> [ 1446.303003] pc : [] lr : [] psr: 20070013
>> [ 1446.303003] sp : c4a69e88 ip : 00000000 fp : 00000000
>> [ 1446.303007] r10: 000218cc r9 : cd228000 r8 : e5f44320
>> [ 1446.303011] r7 : 00000001 r6 : 00080040 r5 : c4aa97d0 r4 : c4aa9780
>> [ 1446.303015] r3 : 00000000 r2 : cbd75480 r1 : 00000000 r0 : c4aa9780
>> [ 1446.303020] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment
>> user
>> [ 1446.303026] Control: 30c5387d Table: 24af0880 DAC: 000000fd
>> [ 1446.303033] Process netstat (pid: 12086, stack limit = 0xc4a68218)
>> [ 1446.303039] Stack: (0xc4a69e88 to 0xc4a6a000)
>> [ 1446.303052] 9e80: c4a69f70 0000a1c0 c4a69f13 00000002 e5f44320
>> cd228000
>> [ 1446.303059] 9ea0: 000218cc c0571604 c0a60bcc 00000000 00000000
>> 00000000 c4a69f20 c4a69f15
>> [ 1446.303065] 9ec0: 00003133 00000002 c4a69f13 00000000 0000001f
>> c4a69f70 c35de800 0000007c
>> [ 1446.303072] 9ee0: ce2b1c00 cd228000 00000001 c05747b8 c05745cc
>> c35de800 0000001f 00000000
>> [ 1446.303078] 9f00: 00000004 cd228008 00020000 c05745cc 33000004
>> c0400031 c4a68000 00000400
>> [ 1446.303086] 9f20: beb78c2c cd228000 c4a69f70 00000000 cd228008
>> c0ffca90 c4a68000 00000400
>> [ 1446.303103] 9f40: beb78c2c c052cd0c bf08a774 00000400 01480080
>> 00008000 cd228000 cd228000
>> [ 1446.303114] 9f60: c040f7c8 c4a68000 00000400 c052d22c c052cd8c
>> 00000000 00000021 00000000
>> [ 1446.303127] 9f80: 01480290 01480280 00007df0 ffffffea 01480060
>> 01480060 01480064 b6e424c0
>> [ 1446.303143] 9fa0: 0000008d c040f794 01480060 01480064 00000004
>> 01480080 00008000 00000000
>> [ 1446.303150] 9fc0: 01480060 01480064 b6e424c0 0000008d 01480080
>> 01480060 00035440 beb78c2c
>> [ 1446.303156] 9fe0: 01480080 beb78160 b6ede59c b6edea3c 60070010
>> 00000004 00000000 00000000
>> [ 1446.303167] [] (pid_delete_dentry) from [] (dput+0x1a8/0x2b4)
>> [ 1446.303176] [] (dput) from [] (proc_fill_cache+0x54/0x10c)
>> [ 1446.303189] [] (proc_fill_cache) from []
>> (proc_readfd_common+0xd8/0x238)
>> [ 1446.303203] [] (proc_readfd_common) from [] (iterate_dir+0x98/0x118)
>> [ 1446.303217] [] (iterate_dir) from [] (SyS_getdents+0x7c/0xf0)
>> [ 1446.303233] [] (SyS_getdents) from [] (__sys_trace_return+0x0/0x2c)
>> [ 1446.303243] Code: e8bd0030 e12fff1e e5903028 e5133020 (e5930008)
>
> Eric


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] fs/proc: Fix NULL pointer dereference in pid_delete_dentry
  2020-12-09 13:16   ` Al Viro
@ 2021-01-07  9:41     ` Gao Yahu
  0 siblings, 0 replies; 8+ messages in thread
From: Gao Yahu @ 2021-01-07  9:41 UTC (permalink / raw)
  To: Al Viro; +Cc: Yahu Gao, Alexey Dobriyan, linux-kernel, linux-fsdevel


Get the staus of task from the pointer of proc inode directly is not
safe. Make it happen in RCU protection.

Signed-off-by: Yahu Gao <yahu.gao@windriver.com>
---
v2 changes:
  - Use RCU lock to avoid NULL dereference of pid.
---
 fs/proc/base.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index b3422cda2a91..d44b5f2414a6 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1,3 +1,4 @@
+
 // SPDX-License-Identifier: GPL-2.0
 /*
  *  linux/fs/proc/base.c
@@ -1994,7 +1995,16 @@ static int pid_revalidate(struct dentry *dentry, unsigned int flags)
 
 static inline bool proc_inode_is_dead(struct inode *inode)
 {
-	return !proc_pid(inode)->tasks[PIDTYPE_PID].first;
+	struct pid *pid = NULL;
+	struct hlist_node *first = NULL;
+
+	rcu_read_lock();
+	pid = proc_pid(inode);
+	if (likely(pid))
+		first = rcu_dereference_check(hlist_first_rcu(&pid->tasks[PIDTYPE_PID]),
+					      lockdep_tasklist_lock_is_held());
+	rcu_read_unlock();
+	return !first;
 }
 
 int pid_delete_dentry(const struct dentry *dentry)
-- 
2.25.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-01-27 13:36 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-09 11:20 Review request 0/1: fs/proc: Fix NULL pointer dereference in Yahu Gao
2020-12-09 11:21 ` [PATCH] fs/proc: Fix NULL pointer dereference in pid_delete_dentry Yahu Gao
2020-12-09 13:16   ` Al Viro
2021-01-07  9:41     ` [PATCH v2] " Gao Yahu
2020-12-09 21:54 ` Review request 0/1: fs/proc: Fix NULL pointer dereference in Eric W. Biederman
2021-01-07  9:02   ` Gao Yahu
2021-01-07  9:03   ` Gao, Yahu
2021-01-07  9:07   ` Gao Yahu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).