* fs/proc: Crash observed in next_tgid (fs/proc/base.c)
From: Jitendra Sharma @ 2019-04-15 12:58 UTC
To: keescook, mcgrof; +Cc: linux-kernel, linux-fsdevel, linux-arm-msm
Hi Kees Cook/Luis,
We are observing one kernel crash in next_tgid function through
getdents64 path. Call stack is as shown below:
-000|has_group_leader_pid(inline)
-000|next_tgid(
| [X20] ns = 0xFFFFFF87CABB1AC0,
| [locdesc] iter = (
| [locdesc] tgid = 424,
| [locdesc] task = ?))
| [X21] p = 0xFFFFFFD0FFFFF948
| [X21] task = 0xFFFFFFD0FFFFF948
-001|proc_pid_readdir(
| [X20] file = 0xFFFFFFD1AC60FC40,
| [X19] ctx = 0xFFFFFF8027363E40)
| [X21] ns = 0xFFFFFF87CABB1AC0
-002|proc_root_readdir(
| [X20] file = 0xFFFFFFD1AC60FC40,
| [X19] ctx = 0xFFFFFF8027363E40)
-003|iterate_dir(
| [X19] file = 0xFFFFFFD1AC60FC40,
| [X22] ctx = 0xFFFFFF8027363E40)
| [X23] inode = 0xFFFFFFD1F20246D0
-004|SYSC_getdents64(inline)
-004|sys_getdents64(
| ?,
| ?,
| [X19] count = 4200)
| [X19] count = 4200
| [X20] f = ([X20] file = 0xAC60FC43AC60FC40, [X20] flags = 1207898624)
| [X0] error = -1720
-005|el0_svc_naked(asm)
-->|exception
-006|NUX:0x78C5AD7D38(asm)
---|end of frame
From this call stack, the task pointer 0xFFFFFFD0FFFFF948 seems to be
invalid: in the ramdumps it does not have any valid fields, and the abort
happens while has_group_leader_pid() is accessing the fields of this
task struct.
From the dumps, it's not clear why the task struct is invalid (possibly
the task has already exited). This issue is observed during normal
monkey testing over long hours.
Could you please provide some pointers that could help in debugging
this issue further?
Thanks,
Jitendra
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
From: Kees Cook @ 2019-04-17 3:38 UTC
To: Jitendra Sharma
Cc: Luis R. Rodriguez, LKML, linux-fsdevel, linux-arm-msm, Oleg Nesterov
On Mon, Apr 15, 2019 at 7:58 AM Jitendra Sharma <shajit@codeaurora.org> wrote:
>
> Hi Kees Cook/Luis,
>
> We are observing one kernel crash in next_tgid function through
> getdents64 path. Call stack is as shown below:
>
> -000|has_group_leader_pid(inline)
> -000|next_tgid(
> | [X20] ns = 0xFFFFFF87CABB1AC0,
> | [locdesc] iter = (
> | [locdesc] tgid = 424,
> | [locdesc] task = ?))
> | [X21] p = 0xFFFFFFD0FFFFF948
> | [X21] task = 0xFFFFFFD0FFFFF948
> -001|proc_pid_readdir(
> | [X20] file = 0xFFFFFFD1AC60FC40,
> | [X19] ctx = 0xFFFFFF8027363E40)
> | [X21] ns = 0xFFFFFF87CABB1AC0
> -002|proc_root_readdir(
> | [X20] file = 0xFFFFFFD1AC60FC40,
> | [X19] ctx = 0xFFFFFF8027363E40)
> -003|iterate_dir(
> | [X19] file = 0xFFFFFFD1AC60FC40,
> | [X22] ctx = 0xFFFFFF8027363E40)
> | [X23] inode = 0xFFFFFFD1F20246D0
> -004|SYSC_getdents64(inline)
> -004|sys_getdents64(
> | ?,
> | ?,
> | [X19] count = 4200)
> | [X19] count = 4200
> | [X20] f = ([X20] file = 0xAC60FC43AC60FC40, [X20] flags = 1207898624)
> | [X0] error = -1720
> -005|el0_svc_naked(asm)
> -->|exception
> -006|NUX:0x78C5AD7D38(asm)
> ---|end of frame
>
>
> From this call stack, the task pointer 0xFFFFFFD0FFFFF948 seems to be
> invalid: in the ramdumps it does not have any valid fields, and the abort
> happens while has_group_leader_pid() is accessing the fields of this
> task struct.
>
> From the dumps, it's not clear why the task struct is invalid (possibly
> the task has already exited). This issue is observed during normal
> monkey testing over long hours.
>
> Could you please provide some pointers that could help in debugging
> this issue further?
Do you have any hints on how to reproduce this? I assume something is
missing proper locking or RCU handling, but I don't see anything
obvious in the surrounding code yet...
--
Kees Cook
* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
From: Oleg Nesterov @ 2019-04-17 11:21 UTC
To: Kees Cook
Cc: Jitendra Sharma, Luis R. Rodriguez, LKML, linux-fsdevel, linux-arm-msm
On 04/16, Kees Cook wrote:
>
> Do you have any hints on how to reproduce this? I assume something is
> missing proper locking or RCU handling,
or we simply have an unbalanced put_task_struct() somewhere else ...
> but I don't see anything
> obvious in the surrounding code yet...
I too do not see anything wrong in the proc_pid_readdir() paths.
Oleg.
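To make the "unbalanced put" hypothesis concrete, here is a minimal userspace sketch. It is not kernel code: struct toy_task and the toy_* helpers are hypothetical stand-ins for task_struct, get_task_struct() and put_task_struct(), and the "release" is only a flag so that nothing is actually freed. It shows how one extra put elsewhere can release an object while a reader still holds a reference it took correctly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object; NOT the kernel's task_struct. */
struct toy_task {
    atomic_int usage;  /* reference count */
    bool released;     /* set when the count reaches zero (stands in for free) */
};

static void toy_init(struct toy_task *t)
{
    atomic_init(&t->usage, 1);  /* the owner's initial reference */
    t->released = false;
}

/* Analogue of taking a reference. */
static void toy_get(struct toy_task *t)
{
    atomic_fetch_add(&t->usage, 1);
}

/* Analogue of dropping a reference; the last put releases the object. */
static void toy_put(struct toy_task *t)
{
    if (atomic_fetch_sub(&t->usage, 1) == 1)
        t->released = true;
}

/*
 * Returns true if the object is already released while a reader that
 * took its own reference correctly has not yet dropped it.
 * extra_puts models buggy code paths dropping references they never took.
 */
static bool reader_sees_released(int extra_puts)
{
    struct toy_task t;
    bool premature;

    assert(extra_puts >= 0);
    toy_init(&t);                  /* usage == 1 (owner) */
    toy_get(&t);                   /* usage == 2 (reader, e.g. a /proc walk) */
    for (int i = 0; i < extra_puts; i++)
        toy_put(&t);               /* unbalanced puts from unrelated code */
    toy_put(&t);                   /* owner drops its reference on exit */
    premature = t.released;        /* reader still believes it holds a ref */
    return premature;
}
```

With balanced counting the reader's reference keeps the object alive; with a single stray put the object is gone before the reader is done, which would look like a task pointer with no valid fields even though the reader's own code is correct.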
* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
From: Marc Gonzalez @ 2019-04-17 12:11 UTC
To: Jitendra Sharma; +Cc: Kees Cook, LKML
On 15/04/2019 14:58, Jitendra Sharma wrote:
> We are observing one kernel crash in next_tgid function through
> getdents64 path. Call stack is as shown below:
It might help if you specify the exact kernel version you are discussing,
as in which tag or commit hash you are running.
Also, what are you doing to trigger the issue?
Regards.
* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
From: Jitendra Sharma @ 2019-04-17 16:58 UTC
To: Marc Gonzalez; +Cc: Kees Cook, LKML
Thanks Marc and Kees for replying.
Answers to your queries:
Kernel version: 4.14.83
Test case: Not some specific test case. Issue reproduced while doing
monkey testing for very long hours.
Thanks,
Jitendra
On 4/17/2019 5:41 PM, Marc Gonzalez wrote:
> On 15/04/2019 14:58, Jitendra Sharma wrote:
>
>> We are observing one kernel crash in next_tgid function through
>> getdents64 path. Call stack is as shown below:
>
> It might help if you specify the exact kernel version you are discussing,
> as in which tag or commit hash you are running.
>
> Also, what are you doing to trigger the issue?
>
> Regards.
>
* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
From: Marc Gonzalez @ 2019-04-18 9:24 UTC
To: Jitendra Sharma; +Cc: Kees Cook, LKML
NB: it is preferable to avoid top-posting here.
On 17/04/2019 18:58, Jitendra Sharma wrote:
> Kernel version: 4.14.83
NB2: 4.14.83 is obsolete, as it stands 2255 patches behind the tip of
linux-4.14.y (though only 4 patches in fs/proc).
NB3: 4.14 is 108499(!!) patches behind v5.1-rc5 (latest release). It makes
sense to test on a recent kernel, in case someone has solved the issue along
the way (then the bisecting fun can begin).
> Test case: Not some specific test case. Issue reproduced while doing
> monkey testing for very long hours.
I'm not sure what "monkey testing" entails...
Any relation to the infinite monkey theorem?
https://en.wikipedia.org/wiki/Infinite_monkey_theorem
Regards.
* Re: fs/proc: Crash observed in next_tgid (fs/proc/base.c)
From: Alexey Dobriyan @ 2019-04-17 18:14 UTC
To: shajit; +Cc: linux-kernel
> Test case: Not some specific test case.
Write 2 programs: one forks, clones, and execs randomly (but not going
out of control), another does open("/proc"), getdents in a loop.
Don't use shell scripts or readdir as they only bring overhead.
Run getdents in a loop and several forkers, see what happens.
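A rough sketch of such a stress test might look like the following. It is simplified: both roles live in one file, the forkers only fork() and exit (no clone or exec), and every loop is bounded. The function names, MAX_FORKERS and the 4200-byte buffer (chosen to match the count seen in the reported call stack) are assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_FORKERS 16  /* arbitrary cap for this sketch */

/* One full pass over /proc using the raw getdents64 syscall (no readdir).
 * Returns the total number of bytes of directory entries, or -1 on error. */
static long scan_proc_once(void)
{
    char buf[4200];
    long total = 0;
    int fd = open("/proc", O_RDONLY | O_DIRECTORY);

    if (fd < 0)
        return -1;
    for (;;) {
        long n = syscall(SYS_getdents64, fd, buf, sizeof(buf));
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)  /* end of directory */
            break;
        total += n;
    }
    close(fd);
    return total;
}

/* Forker: create and reap short-lived children, bounded by 'rounds'. */
static void fork_churn(int rounds)
{
    for (int i = 0; i < rounds; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);              /* child exits immediately */
        if (pid > 0)
            waitpid(pid, NULL, 0); /* reap so tgids keep getting recycled */
    }
}

/* Start 'nforkers' churning processes, scan /proc 'passes' times while
 * they run, then reap the forkers. Returns the number of successful scans. */
static int stress_proc(int nforkers, int rounds, int passes)
{
    pid_t kids[MAX_FORKERS];
    int started = 0, ok = 0;

    assert(nforkers >= 0 && rounds >= 0 && passes >= 0);
    if (nforkers > MAX_FORKERS)
        nforkers = MAX_FORKERS;
    for (int i = 0; i < nforkers; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            fork_churn(rounds);
            _exit(0);
        }
        if (pid > 0)
            kids[started++] = pid;
    }
    for (int i = 0; i < passes; i++)
        if (scan_proc_once() > 0)
            ok++;
    for (int i = 0; i < started; i++)
        waitpid(kids[i], NULL, 0);
    return ok;
}
```

Calling stress_proc() from a main() with several forkers and large rounds/passes values (the right numbers are a guess and depend on the machine) keeps tasks appearing and disappearing while /proc is being walked, which is the window the crash report points at.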