From: Ben Dooks <ben.dooks@codethink.co.uk>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: syzbot <syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Albert Ou <aou@eecs.berkeley.edu>,
	linux-riscv <linux-riscv@lists.infradead.org>,
	Daniel Bristot de Oliveira <bristot@redhat.com>,
	Benjamin Segall <bsegall@google.com>,
	dietmar.eggemann@arm.com, Juri Lelli <juri.lelli@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>, Mel Gorman <mgorman@suse.de>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	syzkaller-bugs <syzkaller-bugs@googlegroups.com>,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: Re: [syzbot] BUG: unable to handle kernel access to user memory in schedule_tail
Date: Fri, 12 Mar 2021 16:30:00 +0000
Message-ID: <aa801bc7-cf6f-b77a-bbb0-28b0ff36e8ba@codethink.co.uk>
In-Reply-To: <CACT4Y+ZsSRdQ5LzYMsgjrBAukgP-Vv8WSQsSoxguYjWvB1QnrA@mail.gmail.com>

On 12/03/2021 15:12, Dmitry Vyukov wrote:
> On Fri, Mar 12, 2021 at 2:50 PM Ben Dooks <ben.dooks@codethink.co.uk> wrote:
>>
>> On 10/03/2021 17:16, Dmitry Vyukov wrote:
>>> On Wed, Mar 10, 2021 at 5:46 PM syzbot
>>> <syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> syzbot found the following issue on:
>>>>
>>>> HEAD commit:    0d7588ab riscv: process: Fix no prototype for arch_dup_tas..
>>>> git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git fixes
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1212c6e6d00000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=e3c595255fb2d136
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=e74b94fe601ab9552d69
>>>> userspace arch: riscv64
>>>>
>>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>>
>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>> Reported-by: syzbot+e74b94fe601ab9552d69@syzkaller.appspotmail.com
>>>
>>> +riscv maintainers
>>>
>>> This is riscv64-specific.
>>> I've seen similar crashes in put_user in other places. It looks like
>>> put_user crashes if the user address is not mapped/protected (?).
>>
>> I've been having a look, and this seems to be down to access of the
>> tsk->set_child_tid variable. I assume the fuzzing here is to pass a
>> bad address to clone?
>>
>> From looking at the code, the put_user() code should have set the
>> relevant SR_SUM bit (the value for this, which is 1<<18, is in the
>> s2 register in the crash report), and from looking at the compiler
>> output from my gcc-10, the code looks to be doing the relevant csrs
>> and then csrc around the put_user.
>>
>> So currently I do not understand how the above could have happened
>> other than something re-tried the code sequence and ended up retrying
>> the faulting instruction without the SR_SUM bit set.
> 
> I would maybe blame qemu for randomly resetting SR_SUM, but it's
> strange that 99% of these crashes are in schedule_tail. If it were
> qemu, then they would be more evenly distributed...
> 
> Another observation: looking at a dozen crash logs, in none of
> these cases was the fuzzer actually trying to fuzz clone with some insane
> arguments. So it looks like completely normal clones (e.g. coming
> from pthread_create) result in this crash.
> 
> I also wonder why there is ret_from_exception, is it normal? I see
> handle_exception disables SR_SUM:
> https://elixir.bootlin.com/linux/v5.12-rc2/source/arch/riscv/kernel/entry.S#L73

So I think that if SR_SUM is set, then an access to user memory faults,
and the _user() routines clear it to allow themselves access.

I'm thinking there is at least one issue here:

- the test in the fault handler is the wrong way around for the die-kernel path
- the handler only catches this if the page has yet to be mapped.

So I think the test should be:

         if (!user_mode(regs) && addr < TASK_SIZE &&
                         unlikely(regs->status & SR_SUM))

This then should continue on and allow the rest of the handler to
complete mapping the page if it is not there.
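
To make the suggested condition easier to poke at outside the kernel, here is a rough user-space sketch of it. This is not the kernel code: struct mock_regs, mock_user_mode(), proposed_test(), the SR_SPP bit position and MOCK_TASK_SIZE are stand-ins made up for illustration, and only SR_SUM = 1<<18 is taken from the thread. It only mirrors the predicate as written above; it makes no claim about whether that is the right way around for the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SR_SUM is 1 << 18, as quoted earlier in this thread. */
#define SR_SUM         (UINT64_C(1) << 18)
/* Assumptions for this sketch only, not the kernel's definitions. */
#define SR_SPP         (UINT64_C(1) << 8)   /* previous privilege was supervisor */
#define MOCK_TASK_SIZE (UINT64_C(1) << 38)  /* arbitrary user/kernel boundary */

/* Hypothetical stand-in for the saved register state. */
struct mock_regs {
	uint64_t status;
};

/* Trap came from user mode if the previous-privilege bit is clear. */
static bool mock_user_mode(const struct mock_regs *regs)
{
	return (regs->status & SR_SPP) == 0;
}

/* The predicate as suggested in this message (unlikely() dropped). */
static bool proposed_test(const struct mock_regs *regs, uint64_t addr)
{
	return !mock_user_mode(regs) && addr < MOCK_TASK_SIZE &&
	       (regs->status & SR_SUM) != 0;
}

/* Walk the four interesting combinations of mode, SR_SUM and address. */
static void check_proposed_test(void)
{
	struct mock_regs kernel_sum   = { .status = SR_SPP | SR_SUM };
	struct mock_regs kernel_nosum = { .status = SR_SPP };
	struct mock_regs user_sum     = { .status = SR_SUM };

	assert(proposed_test(&kernel_sum, 0x1000));          /* kernel, SUM set, user addr */
	assert(!proposed_test(&kernel_nosum, 0x1000));       /* kernel, SUM clear */
	assert(!proposed_test(&user_sum, 0x1000));           /* fault taken from user mode */
	assert(!proposed_test(&kernel_sum, MOCK_TASK_SIZE)); /* address not below boundary */
}
```

Calling check_proposed_test() from a small main() should run through without tripping an assert; flipping the SR_SUM comparison in proposed_test() is a quick way to see what the opposite sense of the test would catch.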

I have been trying to create a very simple clone test, but so far it
has yet to actually trigger anything.

-- 
Ben Dooks				http://www.codethink.co.uk/
Senior Engineer				Codethink - Providing Genius

https://www.codethink.co.uk/privacy.html



