linux-kernel.vger.kernel.org archive mirror
* fs/proc: Crash observed in next_tgid (fs/proc/base.c)
@ 2019-04-15 12:58 Jitendra Sharma
  2019-04-17  3:38 ` Kees Cook
  2019-04-17 12:11 ` Marc Gonzalez
  0 siblings, 2 replies; 7+ messages in thread
From: Jitendra Sharma @ 2019-04-15 12:58 UTC
  To: keescook, mcgrof; +Cc: linux-kernel, linux-fsdevel, linux-arm-msm

Hi Kees Cook/Luis,

We are observing a kernel crash in the next_tgid() function, reached
through the getdents64 path. The call stack is shown below:

-000|has_group_leader_pid(inline)
-000|next_tgid(
| [X20] ns = 0xFFFFFF87CABB1AC0,
| [locdesc] iter = (
| [locdesc] tgid = 424,
| [locdesc] task = ?))
| [X21] p = 0xFFFFFFD0FFFFF948
| [X21] task = 0xFFFFFFD0FFFFF948
-001|proc_pid_readdir(
| [X20] file = 0xFFFFFFD1AC60FC40,
| [X19] ctx = 0xFFFFFF8027363E40)
| [X21] ns = 0xFFFFFF87CABB1AC0
-002|proc_root_readdir(
| [X20] file = 0xFFFFFFD1AC60FC40,
| [X19] ctx = 0xFFFFFF8027363E40)
-003|iterate_dir(
| [X19] file = 0xFFFFFFD1AC60FC40,
| [X22] ctx = 0xFFFFFF8027363E40)
| [X23] inode = 0xFFFFFFD1F20246D0
-004|SYSC_getdents64(inline)
-004|sys_getdents64(
| ?,
| ?,
| [X19] count = 4200)
| [X19] count = 4200
| [X20] f = ([X20] file = 0xAC60FC43AC60FC40, [X20] flags = 1207898624)
| [X0] error = -1720
-005|el0_svc_naked(asm)
-->|exception
-006|NUX:0x78C5AD7D38(asm)
---|end of frame


From this call stack, the task pointer 0xFFFFFFD0FFFFF948 appears to be
invalid: in the ramdumps it does not have any valid fields. The abort
happens when has_group_leader_pid() tries to access the fields of this
task struct.

From the dumps, it is not clear why the task struct is invalid (possibly
the task has already exited). The issue is observed during normal monkey
testing over long hours.

Could you please provide some pointers that could help in debugging
this issue further?


Thanks,

Jitendra

-- 

QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation


* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
  2019-04-15 12:58 fs/proc: Crash observed in next_tgid (fs/proc/base.c) Jitendra Sharma
@ 2019-04-17  3:38 ` Kees Cook
  2019-04-17 11:21   ` Oleg Nesterov
  2019-04-17 12:11 ` Marc Gonzalez
  1 sibling, 1 reply; 7+ messages in thread
From: Kees Cook @ 2019-04-17  3:38 UTC
  To: Jitendra Sharma
  Cc: Luis R. Rodriguez, LKML, linux-fsdevel, linux-arm-msm, Oleg Nesterov

On Mon, Apr 15, 2019 at 7:58 AM Jitendra Sharma <shajit@codeaurora.org> wrote:
>
> Hi Kees Cook/Luis,
>
> We are observing a kernel crash in the next_tgid() function, reached
> through the getdents64 path. The call stack is shown below:
>
> -000|has_group_leader_pid(inline)
> -000|next_tgid(
> | [X20] ns = 0xFFFFFF87CABB1AC0,
> | [locdesc] iter = (
> | [locdesc] tgid = 424,
> | [locdesc] task = ?))
> | [X21] p = 0xFFFFFFD0FFFFF948
> | [X21] task = 0xFFFFFFD0FFFFF948
> -001|proc_pid_readdir(
> | [X20] file = 0xFFFFFFD1AC60FC40,
> | [X19] ctx = 0xFFFFFF8027363E40)
> | [X21] ns = 0xFFFFFF87CABB1AC0
> -002|proc_root_readdir(
> | [X20] file = 0xFFFFFFD1AC60FC40,
> | [X19] ctx = 0xFFFFFF8027363E40)
> -003|iterate_dir(
> | [X19] file = 0xFFFFFFD1AC60FC40,
> | [X22] ctx = 0xFFFFFF8027363E40)
> | [X23] inode = 0xFFFFFFD1F20246D0
> -004|SYSC_getdents64(inline)
> -004|sys_getdents64(
> | ?,
> | ?,
> | [X19] count = 4200)
> | [X19] count = 4200
> | [X20] f = ([X20] file = 0xAC60FC43AC60FC40, [X20] flags = 1207898624)
> | [X0] error = -1720
> -005|el0_svc_naked(asm)
> -->|exception
> -006|NUX:0x78C5AD7D38(asm)
> ---|end of frame
>
>
> From this call stack, the task pointer 0xFFFFFFD0FFFFF948 appears to be
> invalid: in the ramdumps it does not have any valid fields. The abort
> happens when has_group_leader_pid() tries to access the fields of this
> task struct.
>
> From the dumps, it is not clear why the task struct is invalid (possibly
> the task has already exited). The issue is observed during normal monkey
> testing over long hours.
>
> Could you please provide some pointers that could help in debugging
> this issue further?

Do you have any hints on how to reproduce this? I assume something is
missing proper locking or RCU handling, but I don't see anything
obvious in the surrounding code yet...

-- 
Kees Cook


* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
  2019-04-17  3:38 ` Kees Cook
@ 2019-04-17 11:21   ` Oleg Nesterov
  0 siblings, 0 replies; 7+ messages in thread
From: Oleg Nesterov @ 2019-04-17 11:21 UTC
  To: Kees Cook
  Cc: Jitendra Sharma, Luis R. Rodriguez, LKML, linux-fsdevel, linux-arm-msm

On 04/16, Kees Cook wrote:
>
> Do you have any hints on how to reproduce this? I assume something is
> missing proper locking or RCU handling,

or we simply have an unbalanced put_task_struct() somewhere else ...

> but I don't see anything
> obvious in the surrounding code yet...

I do not see anything wrong in the proc_pid_readdir() paths either.

Oleg.
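
To illustrate the "unbalanced put_task_struct()" hypothesis above, here is
a minimal userspace sketch of how one extra put releases an object while
another holder still believes its reference is valid. It is a toy model,
not kernel code: struct toy_task, toy_get(), toy_put() and
reader_sees_freed_task() are hypothetical names invented for this example.

```c
#include <stdbool.h>

/* Toy stand-in for a refcounted object; NOT the kernel's task_struct. */
struct toy_task {
	int usage;	/* reference count */
	bool freed;	/* set once the count has dropped to zero */
};

static void toy_get(struct toy_task *t)
{
	t->usage++;
}

static void toy_put(struct toy_task *t)
{
	if (--t->usage == 0)
		t->freed = true;	/* real code would free the memory here */
}

/*
 * Model a reader that holds its own reference while the owner exits.
 * extra_puts is the number of puts issued without a matching get.
 * Returns true if the object was released while the reader still
 * held its reference.
 */
bool reader_sees_freed_task(int extra_puts)
{
	struct toy_task t = { .usage = 1, .freed = false };	/* owner's ref */
	int i;

	toy_get(&t);		/* reader takes its own reference */
	toy_put(&t);		/* owner drops its reference (task exits) */
	for (i = 0; i < extra_puts; i++)
		toy_put(&t);	/* buggy path: put with no matching get */

	return t.freed;		/* reader has not dropped its reference yet */
}
```

With balanced gets and puts (extra_puts == 0) the reader's reference keeps
the object alive; a single unmatched put is enough to release it early,
which would be consistent with a task pointer that no longer has valid
fields.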



* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
  2019-04-15 12:58 fs/proc: Crash observed in next_tgid (fs/proc/base.c) Jitendra Sharma
  2019-04-17  3:38 ` Kees Cook
@ 2019-04-17 12:11 ` Marc Gonzalez
  2019-04-17 16:58   ` Jitendra Sharma
  1 sibling, 1 reply; 7+ messages in thread
From: Marc Gonzalez @ 2019-04-17 12:11 UTC
  To: Jitendra Sharma; +Cc: Kees Cook, LKML

On 15/04/2019 14:58, Jitendra Sharma wrote:

> We are observing a kernel crash in the next_tgid() function, reached
> through the getdents64 path. The call stack is shown below:

It might help if you specify the exact kernel version you are discussing,
as in which tag or commit hash you are running.

Also, what are you doing to trigger the issue?

Regards.


* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
  2019-04-17 12:11 ` Marc Gonzalez
@ 2019-04-17 16:58   ` Jitendra Sharma
  2019-04-18  9:24     ` Marc Gonzalez
  0 siblings, 1 reply; 7+ messages in thread
From: Jitendra Sharma @ 2019-04-17 16:58 UTC
  To: Marc Gonzalez; +Cc: Kees Cook, LKML

Thanks, Marc and Kees, for replying.
Answers to your queries:
Kernel version: 4.14.83
Test case: No specific test case. The issue was reproduced during
monkey testing over very long hours.

Thanks,
Jitendra

On 4/17/2019 5:41 PM, Marc Gonzalez wrote:
> On 15/04/2019 14:58, Jitendra Sharma wrote:
> 
>> We are observing a kernel crash in the next_tgid() function, reached
>> through the getdents64 path. The call stack is shown below:
> 
> It might help if you specify the exact kernel version you are discussing,
> as in which tag or commit hash you are running.
> 
> Also, what are you doing to trigger the issue?
> 
> Regards.
> 



* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
  2019-04-17 16:58   ` Jitendra Sharma
@ 2019-04-18  9:24     ` Marc Gonzalez
  0 siblings, 0 replies; 7+ messages in thread
From: Marc Gonzalez @ 2019-04-18  9:24 UTC
  To: Jitendra Sharma; +Cc: Kees Cook, LKML

NB: it is preferable to avoid top-posting here.

On 17/04/2019 18:58, Jitendra Sharma wrote:

> Kernel version: 4.14.83

NB2: 4.14.83 is obsolete, as it stands 2255 patches behind the tip of
linux-4.14.y (though only 4 patches in fs/proc).

NB3: 4.14 is 108499(!!) patches behind v5.1-rc5 (latest release). It makes
sense to test on a recent kernel, in case someone has solved the issue along
the way (then the bisecting fun can begin).

> Test case: No specific test case. The issue was reproduced during
> monkey testing over very long hours.

I'm not sure what "monkey testing" entails...

Any relation to the infinite monkey theorem?
https://en.wikipedia.org/wiki/Infinite_monkey_theorem

Regards.


* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
@ 2019-04-17 18:14 Alexey Dobriyan
  0 siblings, 0 replies; 7+ messages in thread
From: Alexey Dobriyan @ 2019-04-17 18:14 UTC
  To: shajit; +Cc: linux-kernel

> Test case: Not some specific test case.

Write two programs: one that forks, clones, and execs randomly (without
going out of control), and another that does open("/proc") and then calls
getdents in a loop. Don't use shell scripts or readdir, as they only add
overhead.
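
A minimal sketch of the second program, assuming Linux and the raw
getdents64 syscall invoked through syscall(2). The function name
scan_dir_once() is hypothetical, and the 4200-byte buffer is only chosen
to mirror the count seen in the reported call stack.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Read every entry of the directory at `path` once, using raw
 * getdents64 calls (no readdir, no libc buffering).
 * Returns the total number of bytes of directory entries read,
 * or -1 if the directory cannot be opened or a call fails.
 */
long scan_dir_once(const char *path)
{
	char buf[4200];	/* same count as in the reported call stack */
	long total = 0;
	long n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	for (;;) {
		n = syscall(SYS_getdents64, fd, buf, sizeof(buf));
		if (n < 0) {
			close(fd);
			return -1;
		}
		if (n == 0)	/* end of directory */
			break;
		total += n;
	}

	close(fd);
	return total;
}
```

A reproducer would call scan_dir_once("/proc") in an endless loop from
main(); reopening the directory on each pass restarts the walk over the
PID entries from the beginning.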

Run the getdents loop alongside several forkers and see what happens.
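
A minimal sketch of one forker, under the assumption that creating and
reaping one short-lived child at a time is enough to keep the process
count bounded. The function name churn_processes() is hypothetical, the
clone() variant is left out for brevity, and exec'ing "true" is just a
convenient short-lived target.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Create and reap `rounds` short-lived children, one at a time, so the
 * number of live processes stays bounded. Each child randomly either
 * exits directly or tries to exec "true" first.
 * Returns the number of children successfully reaped.
 */
int churn_processes(int rounds)
{
	int reaped = 0;
	int i;

	for (i = 0; i < rounds; i++) {
		int do_exec = rand() & 1;	/* decided before fork() */
		pid_t pid = fork();

		if (pid < 0)
			break;		/* fork failed: stop rather than spin */

		if (pid == 0) {
			if (do_exec)
				execlp("true", "true", (char *)NULL);
			_exit(0);	/* direct exit, or exec failed */
		}

		if (waitpid(pid, NULL, 0) == pid)
			reaped++;
	}

	return reaped;
}
```

Several instances of a program that calls churn_processes() in a loop
can run next to the getdents loop, so that PIDs keep appearing and
disappearing while /proc is being walked.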

