* Re: [syzbot] general protection fault in __device_attach
       [not found] <20220603033532.5154-1-hdanton@sina.com>
@ 2022-06-03  3:55 ` syzbot
  0 siblings, 0 replies; 17+ messages in thread
From: syzbot @ 2022-06-03  3:55 UTC
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in __device_attach

usb usb9: device_add((null)) --> -22
general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
CPU: 1 PID: 4084 Comm: syz-executor.0 Not tainted 5.18.0-syzkaller-11972-gd1dc87763f40-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:948
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90003347b98 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff92000668f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000002 RDI: 0000000000000108
RBP: ffff888021878030 R08: 0000000000000000 R09: ffffffff8dbb1097
R10: fffffbfff1b76212 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff8880218780b0
FS:  00007f8da1571700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8da059d090 CR3: 000000006b626000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 proc_ioctl.part.0+0x48e/0x560 drivers/usb/core/devio.c:2356
 proc_ioctl drivers/usb/core/devio.c:182 [inline]
 proc_ioctl_default drivers/usb/core/devio.c:2391 [inline]
 usbdev_do_ioctl drivers/usb/core/devio.c:2747 [inline]
 usbdev_ioctl+0x2c08/0x36f0 drivers/usb/core/devio.c:2807
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:870 [inline]
 __se_sys_ioctl fs/ioctl.c:856 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:856
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f8da0489109
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f8da1571168 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f8da059bf60 RCX: 00007f8da0489109
RDX: 0000000020000040 RSI: 00000000c0105512 RDI: 0000000000000005
RBP: 00007f8da04e308d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff3b94a64f R14: 00007f8da1571300 R15: 0000000000022000
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:948
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90003347b98 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff92000668f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000002 RDI: 0000000000000108
RBP: ffff888021878030 R08: 0000000000000000 R09: ffffffff8dbb1097
R10: fffffbfff1b76212 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff8880218780b0
FS:  00007f8da1571700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8da059d090 CR3: 000000006b626000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
   0:	e8 03 42 80 3c       	callq  0x3c804208
   5:	20 00                	and    %al,(%rax)
   7:	0f 85 a3 03 00 00    	jne    0x3b0
   d:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  14:	fc ff df
  17:	4c 8b 65 48          	mov    0x48(%rbp),%r12
  1b:	49 8d bc 24 08 01 00 	lea    0x108(%r12),%rdi
  22:	00
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	0f b6 04 02          	movzbl (%rdx,%rax,1),%eax <-- trapping instruction
  2e:	84 c0                	test   %al,%al
  30:	74 06                	je     0x38
  32:	0f 8e 6e 03 00 00    	jle    0x3a6
  38:	45                   	rex.RB
  39:	0f                   	.byte 0xf
  3a:	b6 b4                	mov    $0xb4,%dh
  3c:	24 08                	and    $0x8,%al
  3e:	01 00                	add    %eax,(%rax)
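
The faulting address in the report can be reproduced from the register dump. The following is a minimal sketch, assuming the usual KASAN mapping of one shadow byte per 8 bytes of memory, with the offset taken from the movabs in the disassembly; the helper name and macro names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Offset loaded into %rax by the movabs in the disassembly above. */
#define SHADOW_OFFSET 0xdffffc0000000000ULL
/* One shadow byte covers 8 bytes of memory (the "shr $0x3"). */
#define SHADOW_SHIFT 3

/* Hypothetical helper: address of the shadow byte checked for addr. */
static uint64_t shadow_addr(uint64_t addr)
{
	return (addr >> SHADOW_SHIFT) + SHADOW_OFFSET;
}
```

With R12 = 0, the lea yields RDI = 0x108, and shadow_addr(0x108) evaluates to 0xdffffc0000000021, the non-canonical address in the GPF line. The range 0x108-0x10f that KASAN prints is the 8-byte granule sharing that shadow byte, so the pointer loaded from 0x48(%rbp) was NULL.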


Tested on:

commit:         d1dc8776 assoc_array: Fix BUG_ON during garbage collect
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=113f50ddf00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=c51cd24814bb5665
dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=12390667f00000



* Re: [syzbot] general protection fault in __device_attach
  2022-06-08  8:20                     ` Dmitry Vyukov
@ 2022-06-08  8:24                       ` Dmitry Vyukov
  0 siblings, 0 replies; 17+ messages in thread
From: Dmitry Vyukov @ 2022-06-08  8:24 UTC
  To: Matthew Wilcox
  Cc: Dan Carpenter, Greg KH, Alan Stern, Andy Shevchenko, syzbot,
	hdanton, lenb, linux-acpi, linux-kernel, rafael.j.wysocki,
	rafael, rjw, syzkaller-bugs, linux-usb, Linux-MM

On Wed, 8 Jun 2022 at 10:20, Dmitry Vyukov <dvyukov@google.com> wrote:
> > On Tue, Jun 07, 2022 at 09:15:09AM +0200, Dmitry Vyukov wrote:
> > > On Mon, 6 Jun 2022 at 14:39, Dan Carpenter <dan.carpenter@oracle.com> wrote:
> > > >
> > > > On Sat, Jun 04, 2022 at 10:32:46AM +0200, 'Dmitry Vyukov' via syzkaller-bugs wrote:
> > > > > On Fri, 3 Jun 2022 at 18:12, Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > >
> > > > > > But again, is this a "real and able to be triggered from userspace"
> > > > > > problem, or just fault-injection-induced?
> > > > >
> > > > > Then this is something to fix in the fault injection subsystem.
> > > > > Testing systems shouldn't be reporting false positives.
> > > > > What allocations cannot fail in real life? Is it <=page_size?
> > > > >
> > > >
> > > > Apparently in 2014, anything less than *EIGHT?!!* pages succeeded!
> > > >
> > > > https://lwn.net/Articles/627419/
> > > >
> > > > I have been on the lookout since that article and never seen anyone
> > > > mention it changing.  I think we should ignore that and say that
> > > > anything over PAGE_SIZE can fail.  Possibly we could go smaller than
> > > > PAGE_SIZE...
> > >
> > > +linux-mm for GFP expertise re what allocations cannot possibly fail
> > > and should be excluded from fault injection.
> > >
> > > Interesting, thanks for the link.
> > >
> > > PAGE_SIZE looks like a good start. Once we have the predicate in
> > > place, we can refine it later when/if we have more inputs.
> > >
> > > But I wonder about GFP flags. They definitely have some impact on allocations.
> > > If GFP_ACCOUNT is set, all allocations can fail, right?
> > > If GFP_DMA/DMA32 is set, allocations can fail, right? What about other zones?
> > > If GFP_NORETRY is set, allocations can fail?
> > > What about GFP_NOMEMALLOC and GFP_ATOMIC?
> > > What about GFP_IO/GFP_FS/GFP_DIRECT_RECLAIM/GFP_KSWAPD_RECLAIM? At
> > > least some of these need to be set for allocations to not fail? Which
> > > ones?
> > > Any other flags are required to be set/unset for allocations to not fail?
> >
> > I'm not the expert on page allocation, but ...
> >
> > I don't think GFP_ACCOUNT makes allocations fail.  It might make reclaim
> > happen from within that cgroup, and it might cause an OOM kill for
> > something in that cgroup.  But I don't think it makes a (low order)
> > allocation more likely to fail.
>
> Interesting.
> I was thinking of some malicious specifically crafted configurations
> with very low limit and particular pattern of allocations. Also what
> if there is just 1 process (current)? Is it possible to kill and
> reclaim the current process when a thread is stuck in the middle of
> the kernel on a kmalloc?
> Also I see e.g.:
>         Tasks with the OOM protection (oom_score_adj set to -1000)
>         are treated as an exception and are never killed.
>
> I am not an expert on this either, but I think it may be hard to fight
> with a specifically crafted attack.
>
>
> > There's usually less memory available in DMA/DMA32 zones, but we have
> > so few allocations from those zones, I question the utility of focusing
> > testing on those allocations.
> >
> > GFP_ATOMIC allows access to emergency pools, so I would say _less_ likely
> > to fail.  KSWAPD_RECLAIM has no effect on whether _this_ allocation
> > succeeds or fails; it kicks kswapd to do reclaim, rather than doing
> > reclaim directly.  DIRECT_RECLAIM definitely makes allocations more likely
> > to succeed.  GFP_FS allows (direct) reclaim to happen from filesystems.
> > GFP_IO allows IO to start (ie writeback can start) in order to clean
> > dirty memory.
> >
> > Anyway, I hope somebody who knows the page allocator better than I do
> > can say smarter things than this.  Even better if they can put it into
> > Documentation/ somewhere ;-)
>
> Even better to put this into code as a predicate function that fault
> injection will use. It will also serve as precise up-to-date
> documentation.

Also at the end of kmalloc as:
WARN_ON(!ret && !cant_fail(size, gfp));
!
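
A minimal user-space sketch of such a predicate and the two places it would be used. Everything here is an assumption for illustration: the flag values are stand-ins rather than the kernel's definitions, the rules (at most one page, direct reclaim allowed, no NORETRY) are only the starting point discussed in this thread, and the check is read as firing when NULL comes back although the predicate said failure was impossible, which may or may not be what the one-liner above intends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for this sketch only; not the kernel's definitions. */
#define SKETCH_PAGE_SIZE      4096u
#define SKETCH_DIRECT_RECLAIM 0x1u	/* models __GFP_DIRECT_RECLAIM */
#define SKETCH_NORETRY        0x2u	/* models __GFP_NORETRY */

/* Hypothetical predicate: true if the allocation is assumed never to fail. */
static bool cant_fail(size_t size, unsigned int gfp)
{
	if (size > SKETCH_PAGE_SIZE)
		return false;		/* larger than one page: may fail */
	if (!(gfp & SKETCH_DIRECT_RECLAIM))
		return false;		/* cannot reclaim directly: may fail */
	if (gfp & SKETCH_NORETRY)
		return false;		/* caller asked not to retry: may fail */
	return true;
}

/* Fault injection would consult the predicate before forcing a failure. */
static bool may_inject_failure(size_t size, unsigned int gfp)
{
	return !cant_fail(size, gfp);
}

/*
 * Models the end-of-kmalloc check: a NULL result is suspicious only
 * when the predicate claimed the allocation could not fail.
 */
static bool result_contradicts_predicate(const void *ret, size_t size,
					 unsigned int gfp)
{
	return ret == NULL && cant_fail(size, gfp);
}
```

Keeping the injection filter and the consistency check on the same predicate means a wrong rule shows up as a warning on a real failure instead of as a false positive from the fuzzer.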

> > https://www.kernel.org/doc/html/latest/core-api/memory-allocation.html
> > exists but isn't quite enough to answer this question.


* Re: [syzbot] general protection fault in __device_attach
  2022-06-08  3:25                   ` Matthew Wilcox
@ 2022-06-08  8:20                     ` Dmitry Vyukov
  2022-06-08  8:24                       ` Dmitry Vyukov
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Vyukov @ 2022-06-08  8:20 UTC
  To: Matthew Wilcox
  Cc: Dan Carpenter, Greg KH, Alan Stern, Andy Shevchenko, syzbot,
	hdanton, lenb, linux-acpi, linux-kernel, rafael.j.wysocki,
	rafael, rjw, syzkaller-bugs, linux-usb, Linux-MM

On Wed, 8 Jun 2022 at 05:25, Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Jun 07, 2022 at 09:15:09AM +0200, Dmitry Vyukov wrote:
> > On Mon, 6 Jun 2022 at 14:39, Dan Carpenter <dan.carpenter@oracle.com> wrote:
> > >
> > > On Sat, Jun 04, 2022 at 10:32:46AM +0200, 'Dmitry Vyukov' via syzkaller-bugs wrote:
> > > > On Fri, 3 Jun 2022 at 18:12, Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > But again, is this a "real and able to be triggered from userspace"
> > > > > problem, or just fault-injection-induced?
> > > >
> > > > Then this is something to fix in the fault injection subsystem.
> > > > Testing systems shouldn't be reporting false positives.
> > > > What allocations cannot fail in real life? Is it <=page_size?
> > > >
> > >
> > > Apparently in 2014, anything less than *EIGHT?!!* pages succeeded!
> > >
> > > https://lwn.net/Articles/627419/
> > >
> > > I have been on the lookout since that article and never seen anyone
> > > mention it changing.  I think we should ignore that and say that
> > > anything over PAGE_SIZE can fail.  Possibly we could go smaller than
> > > PAGE_SIZE...
> >
> > +linux-mm for GFP expertise re what allocations cannot possibly fail
> > and should be excluded from fault injection.
> >
> > Interesting, thanks for the link.
> >
> > PAGE_SIZE looks like a good start. Once we have the predicate in
> > place, we can refine it later when/if we have more inputs.
> >
> > But I wonder about GFP flags. They definitely have some impact on allocations.
> > If GFP_ACCOUNT is set, all allocations can fail, right?
> > If GFP_DMA/DMA32 is set, allocations can fail, right? What about other zones?
> > If GFP_NORETRY is set, allocations can fail?
> > What about GFP_NOMEMALLOC and GFP_ATOMIC?
> > What about GFP_IO/GFP_FS/GFP_DIRECT_RECLAIM/GFP_KSWAPD_RECLAIM? At
> > least some of these need to be set for allocations to not fail? Which
> > ones?
> > Any other flags are required to be set/unset for allocations to not fail?
>
> I'm not the expert on page allocation, but ...
>
> I don't think GFP_ACCOUNT makes allocations fail.  It might make reclaim
> happen from within that cgroup, and it might cause an OOM kill for
> something in that cgroup.  But I don't think it makes a (low order)
> allocation more likely to fail.

Interesting.
I was thinking of some malicious specifically crafted configurations
with very low limit and particular pattern of allocations. Also what
if there is just 1 process (current)? Is it possible to kill and
reclaim the current process when a thread is stuck in the middle of
the kernel on a kmalloc?
Also I see e.g.:
        Tasks with the OOM protection (oom_score_adj set to -1000)
        are treated as an exception and are never killed.

I am not an expert on this either, but I think it may be hard to fight
with a specifically crafted attack.


> There's usually less memory available in DMA/DMA32 zones, but we have
> so few allocations from those zones, I question the utility of focusing
> testing on those allocations.
>
> GFP_ATOMIC allows access to emergency pools, so I would say _less_ likely
> to fail.  KSWAPD_RECLAIM has no effect on whether _this_ allocation
> succeeds or fails; it kicks kswapd to do reclaim, rather than doing
> reclaim directly.  DIRECT_RECLAIM definitely makes allocations more likely
> to succeed.  GFP_FS allows (direct) reclaim to happen from filesystems.
> GFP_IO allows IO to start (ie writeback can start) in order to clean
> dirty memory.
>
> Anyway, I hope somebody who knows the page allocator better than I do
> can say smarter things than this.  Even better if they can put it into
> Documentation/ somewhere ;-)

Even better to put this into code as a predicate function that fault
injection will use. It will also serve as precise up-to-date
documentation.

> https://www.kernel.org/doc/html/latest/core-api/memory-allocation.html
> exists but isn't quite enough to answer this question.


* Re: [syzbot] general protection fault in __device_attach
  2022-06-07  7:15                 ` Dmitry Vyukov
@ 2022-06-08  3:25                   ` Matthew Wilcox
  2022-06-08  8:20                     ` Dmitry Vyukov
  0 siblings, 1 reply; 17+ messages in thread
From: Matthew Wilcox @ 2022-06-08  3:25 UTC
  To: Dmitry Vyukov
  Cc: Dan Carpenter, Greg KH, Alan Stern, Andy Shevchenko, syzbot,
	hdanton, lenb, linux-acpi, linux-kernel, rafael.j.wysocki,
	rafael, rjw, syzkaller-bugs, linux-usb, Linux-MM

On Tue, Jun 07, 2022 at 09:15:09AM +0200, Dmitry Vyukov wrote:
> On Mon, 6 Jun 2022 at 14:39, Dan Carpenter <dan.carpenter@oracle.com> wrote:
> >
> > On Sat, Jun 04, 2022 at 10:32:46AM +0200, 'Dmitry Vyukov' via syzkaller-bugs wrote:
> > > On Fri, 3 Jun 2022 at 18:12, Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > But again, is this a "real and able to be triggered from userspace"
> > > > problem, or just fault-injection-induced?
> > >
> > > Then this is something to fix in the fault injection subsystem.
> > > Testing systems shouldn't be reporting false positives.
> > > What allocations cannot fail in real life? Is it <=page_size?
> > >
> >
> > Apparently in 2014, anything less than *EIGHT?!!* pages succeeded!
> >
> > https://lwn.net/Articles/627419/
> >
> > I have been on the lookout since that article and never seen anyone
> > mention it changing.  I think we should ignore that and say that
> > anything over PAGE_SIZE can fail.  Possibly we could go smaller than
> > PAGE_SIZE...
> 
> +linux-mm for GFP expertise re what allocations cannot possibly fail
> and should be excluded from fault injection.
> 
> Interesting, thanks for the link.
> 
> PAGE_SIZE looks like a good start. Once we have the predicate in
> place, we can refine it later when/if we have more inputs.
> 
> But I wonder about GFP flags. They definitely have some impact on allocations.
> If GFP_ACCOUNT is set, all allocations can fail, right?
> If GFP_DMA/DMA32 is set, allocations can fail, right? What about other zones?
> If GFP_NORETRY is set, allocations can fail?
> What about GFP_NOMEMALLOC and GFP_ATOMIC?
> What about GFP_IO/GFP_FS/GFP_DIRECT_RECLAIM/GFP_KSWAPD_RECLAIM? At
> least some of these need to be set for allocations to not fail? Which
> ones?
> Any other flags are required to be set/unset for allocations to not fail?

I'm not the expert on page allocation, but ...

I don't think GFP_ACCOUNT makes allocations fail.  It might make reclaim
happen from within that cgroup, and it might cause an OOM kill for
something in that cgroup.  But I don't think it makes a (low order)
allocation more likely to fail.

There's usually less memory available in DMA/DMA32 zones, but we have
so few allocations from those zones, I question the utility of focusing
testing on those allocations.

GFP_ATOMIC allows access to emergency pools, so I would say _less_ likely
to fail.  KSWAPD_RECLAIM has no effect on whether _this_ allocation
succeeds or fails; it kicks kswapd to do reclaim, rather than doing
reclaim directly.  DIRECT_RECLAIM definitely makes allocations more likely
to succeed.  GFP_FS allows (direct) reclaim to happen from filesystems.
GFP_IO allows IO to start (ie writeback can start) in order to clean
dirty memory.

Anyway, I hope somebody who knows the page allocator better than I do
can say smarter things than this.  Even better if they can put it into
Documentation/ somewhere ;-)

https://www.kernel.org/doc/html/latest/core-api/memory-allocation.html
exists but isn't quite enough to answer this question.


* Re: [syzbot] general protection fault in __device_attach
  2022-06-06 12:38               ` Dan Carpenter
@ 2022-06-07  7:15                 ` Dmitry Vyukov
  2022-06-08  3:25                   ` Matthew Wilcox
  0 siblings, 1 reply; 17+ messages in thread
From: Dmitry Vyukov @ 2022-06-07  7:15 UTC
  To: Dan Carpenter
  Cc: Greg KH, Alan Stern, Andy Shevchenko, syzbot, hdanton, lenb,
	linux-acpi, linux-kernel, rafael.j.wysocki, rafael, rjw,
	syzkaller-bugs, linux-usb, Linux-MM

On Mon, 6 Jun 2022 at 14:39, Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Sat, Jun 04, 2022 at 10:32:46AM +0200, 'Dmitry Vyukov' via syzkaller-bugs wrote:
> > On Fri, 3 Jun 2022 at 18:12, Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > But again, is this a "real and able to be triggered from userspace"
> > > problem, or just fault-injection-induced?
> >
> > Then this is something to fix in the fault injection subsystem.
> > Testing systems shouldn't be reporting false positives.
> > What allocations cannot fail in real life? Is it <=page_size?
> >
>
> Apparently in 2014, anything less than *EIGHT?!!* pages succeeded!
>
> https://lwn.net/Articles/627419/
>
> I have been on the lookout since that article and never seen anyone
> mention it changing.  I think we should ignore that and say that
> anything over PAGE_SIZE can fail.  Possibly we could go smaller than
> PAGE_SIZE...

+linux-mm for GFP expertise re what allocations cannot possibly fail
and should be excluded from fault injection.

Interesting, thanks for the link.

PAGE_SIZE looks like a good start. Once we have the predicate in
place, we can refine it later when/if we have more inputs.

But I wonder about GFP flags. They definitely have some impact on allocations.
If GFP_ACCOUNT is set, all allocations can fail, right?
If GFP_DMA/DMA32 is set, allocations can fail, right? What about other zones?
If GFP_NORETRY is set, allocations can fail?
What about GFP_NOMEMALLOC and GFP_ATOMIC?
What about GFP_IO/GFP_FS/GFP_DIRECT_RECLAIM/GFP_KSWAPD_RECLAIM? At
least some of these need to be set for allocations to not fail? Which
ones?
Any other flags are required to be set/unset for allocations to not fail?

FTR here is quick link to flags list:
https://elixir.bootlin.com/linux/v5.19-rc1/source/include/linux/gfp.h#L32


* Re: [syzbot] general protection fault in __device_attach
  2022-06-04  8:32             ` Dmitry Vyukov
@ 2022-06-06 12:38               ` Dan Carpenter
  2022-06-07  7:15                 ` Dmitry Vyukov
  0 siblings, 1 reply; 17+ messages in thread
From: Dan Carpenter @ 2022-06-06 12:38 UTC
  To: Dmitry Vyukov
  Cc: Greg KH, Alan Stern, Andy Shevchenko, syzbot, hdanton, lenb,
	linux-acpi, linux-kernel, rafael.j.wysocki, rafael, rjw,
	syzkaller-bugs, linux-usb

On Sat, Jun 04, 2022 at 10:32:46AM +0200, 'Dmitry Vyukov' via syzkaller-bugs wrote:
> On Fri, 3 Jun 2022 at 18:12, Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > But again, is this a "real and able to be triggered from userspace"
> > problem, or just fault-injection-induced?
> 
> Then this is something to fix in the fault injection subsystem.
> Testing systems shouldn't be reporting false positives.
> What allocations cannot fail in real life? Is it <=page_size?
> 

Apparently in 2014, anything less than *EIGHT?!!* pages succeeded!

https://lwn.net/Articles/627419/

I have been on the lookout since that article and never seen anyone
mention it changing.  I think we should ignore that and say that
anything over PAGE_SIZE can fail.  Possibly we could go smaller than
PAGE_SIZE...
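
The two thresholds can be compared with a small user-space sketch. The 4096-byte page size is an assumption, and so is the reading of the article's threshold as order 3 or below (up to eight pages); the exact boundary and all names here are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE    4096u	/* assumed page size */
#define SKETCH_COSTLY_ORDER 3u		/* assumed threshold: 2^3 = 8 pages */

/*
 * Smallest order such that (PAGE_SIZE << order) >= size.
 * size is assumed small enough that span cannot overflow.
 */
static unsigned int size_to_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Models the 2014 behaviour described in the article. */
static bool too_small_to_fail(size_t size)
{
	return size_to_order(size) <= SKETCH_COSTLY_ORDER;
}

/* The stricter rule proposed above: anything over one page can fail. */
static bool over_page_size(size_t size)
{
	return size > SKETCH_PAGE_SIZE;
}
```

Under these assumptions a 16 KiB request is still "too small to fail" by the older rule but counts as failable under the PAGE_SIZE rule, so a fault injector using the stricter rule would keep exercising its error path.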

regards,
dan carpenter



* Re: [syzbot] general protection fault in __device_attach
  2022-06-03 16:11           ` Greg KH
  2022-06-03 16:27             ` Alan Stern
@ 2022-06-04  8:32             ` Dmitry Vyukov
  2022-06-06 12:38               ` Dan Carpenter
  1 sibling, 1 reply; 17+ messages in thread
From: Dmitry Vyukov @ 2022-06-04  8:32 UTC
  To: Greg KH
  Cc: Alan Stern, Andy Shevchenko, syzbot, hdanton, lenb, linux-acpi,
	linux-kernel, rafael.j.wysocki, rafael, rjw, syzkaller-bugs,
	linux-usb

On Fri, 3 Jun 2022 at 18:12, Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > syzbot has bisected this issue to:
> > > > > >
> > > > > > commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> > > > > > Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > > > Date:   Fri Jun 18 13:41:27 2021 +0000
> > > > > >
> > > > > >     ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
> > > > >
> > > > > Hmm... It's not obvious at all how this change can alter the behaviour so
> > > > > drastically. device_add() is called from USB core with intf->dev.name == NULL
> > > > > > for some reason. A-ha, seems like fault injector, which looks like
> > > > >
> > > > >         dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
> > > > >                      dev->devpath, configuration, ifnum);
> > > > >
> > > > > missed the return code check.
> > > > >
> > > > > But I'm not familiar with that code at all, adding Linux USB ML and Alan.
> > > >
> > > > I can't see any connection between this bug and acpi/sysfs.c.  Is it a
> > > > bad bisection?
> > > >
> > > > It looks like you're right about dev_set_name() failing.  In fact, the
> > > > kernel appears to be littered with calls to that routine which do not
> > > > check the return code (the entire subtree below drivers/usb/ contains
> > > > only _one_ call that does check the return code!).  The function doesn't
> > > > have any __must_check annotation, and its kerneldoc doesn't mention the
> > > > return code or the possibility of a failure.
> > > >
> > > > Apparently the assumption is that if dev_set_name() fails then
> > > > device_add() later on will also fail, and the problem will be detected
> > > > then.
> > > >
> > > > So now what should happen when device_add() for an interface fails in
> > > > usb_set_configuration()?
> > >
> > > But how can that really fail on a real system?
> > >
> > > Is this just due to error-injection stuff?  If so, I'm really loath to
> > > rework the world for something that can never happen in real life.
> > >
> > > Or is this a real syzbot-found-with-reproducer issue?
> >
> > Aren't there quite a few reasons why device_add() might fail?  (Although
> > most of them probably are memory allocation errors...)
>
> I was thinking of the dev_set_name() issue further back in the call
> chain.
>
> > Basically, you have to make up your mind.  If a function can fail, you
> > should be prepared to handle the failure.  If it can't fail, there's no
> > point in even checking the return code.
>
> True, ok, we should unwind the mess.  I'll try to look at it after the
> merge window...
>
> But again, is this a "real and able to be triggered from userspace"
> problem, or just fault-injection-induced?

Then this is something to fix in the fault injection subsystem.
Testing systems shouldn't be reporting false positives.
What allocations cannot fail in real life? Is it <=page_size?


* Re: [syzbot] general protection fault in __device_attach
  2022-06-03 16:11           ` Greg KH
@ 2022-06-03 16:27             ` Alan Stern
  2022-06-04  8:32             ` Dmitry Vyukov
  1 sibling, 0 replies; 17+ messages in thread
From: Alan Stern @ 2022-06-03 16:27 UTC
  To: Greg KH
  Cc: Andy Shevchenko, syzbot, hdanton, lenb, linux-acpi, linux-kernel,
	rafael.j.wysocki, rafael, rjw, syzkaller-bugs, linux-usb

On Fri, Jun 03, 2022 at 06:11:55PM +0200, Greg KH wrote:
> On Fri, Jun 03, 2022 at 12:03:32PM -0400, Alan Stern wrote:
> > On Fri, Jun 03, 2022 at 05:52:38PM +0200, Greg KH wrote:
> > > On Fri, Jun 03, 2022 at 11:42:19AM -0400, Alan Stern wrote:
> > > > So now what should happen when device_add() for an interface fails in 
> > > > usb_set_configuration()?
> > > 
> > > But how can that really fail on a real system?
> > > 
> > > Is this just due to error-injection stuff?  If so, I'm really loath to
> > > rework the world for something that can never happen in real life.
> > > 
> > > Or is this a real syzbot-found-with-reproducer issue?
> > 
> > Aren't there quite a few reasons why device_add() might fail?  (Although 
> > most of them probably are memory allocation errors...)
> 
> I was thinking of the dev_set_name() issue further back in the call
> chain.

As far as I know, the only reason for dev_set_name() to fail is -ENOMEM.  
That's not something the user can control directly.
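
The failure mode under discussion can be modelled in user space. This is a sketch under stated assumptions, not kernel code: struct model_dev, model_set_name(), model_register_interface() and the inject_alloc_failure switch are hypothetical stand-ins for struct device, dev_set_name(), the caller in usb_set_configuration() and the fault injector.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for struct device. */
struct model_dev {
	char *name;
};

/* Set to true to simulate an injected allocation failure. */
static bool inject_alloc_failure;

/* Models dev_set_name(): formats a name, returns 0 or a negative errno. */
static int model_set_name(struct model_dev *dev, const char *fmt, ...)
{
	va_list ap;
	int len;
	char *buf;

	va_start(ap, fmt);
	len = vsnprintf(NULL, 0, fmt, ap);	/* measure the formatted name */
	va_end(ap);
	if (len < 0)
		return -EINVAL;

	buf = inject_alloc_failure ? NULL : malloc((size_t)len + 1);
	if (!buf)
		return -ENOMEM;

	va_start(ap, fmt);
	vsnprintf(buf, (size_t)len + 1, fmt, ap);
	va_end(ap);

	free(dev->name);
	dev->name = buf;
	return 0;
}

/* Models the caller: stop before the "device_add" step if naming failed. */
static int model_register_interface(struct model_dev *dev, int busnum,
				    const char *devpath, int config, int ifnum)
{
	int ret = model_set_name(dev, "%d-%s:%d.%d", busnum, devpath,
				 config, ifnum);

	if (ret)
		return ret;	/* a nameless device never reaches the add step */
	return 0;
}
```

Checking the return value at the call site turns the -ENOMEM into an ordinary error return, rather than relying on a later step to notice that the name is still NULL.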

> > Basically, you have to make up your mind.  If a function can fail, you 
> > should be prepared to handle the failure.  If it can't fail, there's no 
> > point in even checking the return code.
> 
> True, ok, we should unwind the mess.  I'll try to look at it after the
> merge window...
> 
> But again, is this a "real and able to be triggered from userspace"
> problem, or just fault-injection-induced?

I don't think any of the failure paths here are controlled by the user.  
They all seem to involve something going wrong internally in the kernel 
(i.e., corruption or memory allocation failure for a small buffer).  
Once that happens, the game is pretty much over anyway.

Is it worth handling this sort of thing, or should we ignore the 
possibility and allow it to escalate to the point where the user can 
potentially trigger a kernel panic?  Another way of putting it is: How 
gracefully do you want the kernel to collapse when this sort of 
corruption happens?

Alan Stern


* Re: [syzbot] general protection fault in __device_attach
  2022-06-03 16:03         ` Alan Stern
@ 2022-06-03 16:11           ` Greg KH
  2022-06-03 16:27             ` Alan Stern
  2022-06-04  8:32             ` Dmitry Vyukov
  0 siblings, 2 replies; 17+ messages in thread
From: Greg KH @ 2022-06-03 16:11 UTC
  To: Alan Stern
  Cc: Andy Shevchenko, syzbot, hdanton, lenb, linux-acpi, linux-kernel,
	rafael.j.wysocki, rafael, rjw, syzkaller-bugs, linux-usb

On Fri, Jun 03, 2022 at 12:03:32PM -0400, Alan Stern wrote:
> On Fri, Jun 03, 2022 at 05:52:38PM +0200, Greg KH wrote:
> > On Fri, Jun 03, 2022 at 11:42:19AM -0400, Alan Stern wrote:
> > > On Fri, Jun 03, 2022 at 02:04:04PM +0300, Andy Shevchenko wrote:
> > > > On Fri, Jun 03, 2022 at 03:02:07AM -0700, syzbot wrote:
> > > > > syzbot has bisected this issue to:
> > > > > 
> > > > > commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> > > > > Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > > Date:   Fri Jun 18 13:41:27 2021 +0000
> > > > > 
> > > > >     ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
> > > > 
> > > > Hmm... It's not obvious at all how this change can alter the behaviour so
> > > > drastically. device_add() is called from USB core with intf->dev.name == NULL
> > > > > for some reason. A-ha, seems like fault injector, which looks like
> > > > 
> > > > 	dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
> > > > 		     dev->devpath, configuration, ifnum);
> > > > 
> > > > missed the return code check.
> > > > 
> > > > But I'm not familiar with that code at all, adding Linux USB ML and Alan.
> > > 
> > > I can't see any connection between this bug and acpi/sysfs.c.  Is it a 
> > > bad bisection?
> > > 
> > > It looks like you're right about dev_set_name() failing.  In fact, the 
> > > kernel appears to be littered with calls to that routine which do not 
> > > check the return code (the entire subtree below drivers/usb/ contains 
> > > only _one_ call that does check the return code!).  The function doesn't 
> > > have any __must_check annotation, and its kerneldoc doesn't mention the 
> > > return code or the possibility of a failure.
> > > 
> > > Apparently the assumption is that if dev_set_name() fails then 
> > > device_add() later on will also fail, and the problem will be detected 
> > > then.
> > > 
> > > So now what should happen when device_add() for an interface fails in 
> > > usb_set_configuration()?
> > 
> > But how can that really fail on a real system?
> > 
> > Is this just due to error-injection stuff?  If so, I'm really loath to
> > rework the world for something that can never happen in real life.
> > 
> > Or is this a real syzbot-found-with-reproducer issue?
> 
> Aren't there quite a few reasons why device_add() might fail?  (Although 
> most of them probably are memory allocation errors...)

I was thinking of the dev_set_name() issue further back in the call
chain.

> Basically, you have to make up your mind.  If a function can fail, you 
> should be prepared to handle the failure.  If it can't fail, there's no 
> point in even checking the return code.

True, ok, we should unwind the mess.  I'll try to look at it after the
merge window...

But again, is this a "real and able to be triggered from userspace"
problem, or just fault-injection-induced?

thanks,

greg k-h


* Re: [syzbot] general protection fault in __device_attach
  2022-06-03 15:52       ` Greg KH
@ 2022-06-03 16:03         ` Alan Stern
  2022-06-03 16:11           ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Stern @ 2022-06-03 16:03 UTC
  To: Greg KH
  Cc: Andy Shevchenko, syzbot, hdanton, lenb, linux-acpi, linux-kernel,
	rafael.j.wysocki, rafael, rjw, syzkaller-bugs, linux-usb

On Fri, Jun 03, 2022 at 05:52:38PM +0200, Greg KH wrote:
> On Fri, Jun 03, 2022 at 11:42:19AM -0400, Alan Stern wrote:
> > On Fri, Jun 03, 2022 at 02:04:04PM +0300, Andy Shevchenko wrote:
> > > On Fri, Jun 03, 2022 at 03:02:07AM -0700, syzbot wrote:
> > > > syzbot has bisected this issue to:
> > > > 
> > > > commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> > > > Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > > Date:   Fri Jun 18 13:41:27 2021 +0000
> > > > 
> > > >     ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
> > > 
> > > > Hmm... It's not at all obvious how this change could alter the behaviour so
> > > > drastically. device_add() is called from the USB core with intf->dev.name == NULL
> > > > for some reason. A-ha, this looks like the fault injector at work: the call
> > > > 
> > > > 	dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
> > > > 		     dev->devpath, configuration, ifnum);
> > > > 
> > > > is missing a return code check.
> > > 
> > > But I'm not familiar with that code at all, adding Linux USB ML and Alan.
> > 
> > I can't see any connection between this bug and acpi/sysfs.c.  Is it a 
> > bad bisection?
> > 
> > It looks like you're right about dev_set_name() failing.  In fact, the 
> > kernel appears to be littered with calls to that routine which do not 
> > check the return code (the entire subtree below drivers/usb/ contains 
> > only _one_ call that does check the return code!).  The function doesn't 
> > have any __must_check annotation, and its kerneldoc doesn't mention the 
> > return code or the possibility of a failure.
> > 
> > Apparently the assumption is that if dev_set_name() fails then 
> > device_add() later on will also fail, and the problem will be detected 
> > then.
> > 
> > So now what should happen when device_add() for an interface fails in 
> > usb_set_configuration()?
> 
> But how can that really fail on a real system?
> 
> Is this just due to error-injection stuff?  If so, I'm really loath to
> rework the world for something that can never happen in real life.
> 
> Or is this a real syzbot-found-with-reproducer issue?

Aren't there quite a few reasons why device_add() might fail?  (Although 
most of them probably are memory allocation errors...)

Basically, you have to make up your mind.  If a function can fail, you 
should be prepared to handle the failure.  If it can't fail, there's no 
point in even checking the return code.

Alan Stern


* Re: [syzbot] general protection fault in __device_attach
  2022-06-03 15:42     ` Alan Stern
@ 2022-06-03 15:52       ` Greg KH
  2022-06-03 16:03         ` Alan Stern
  0 siblings, 1 reply; 17+ messages in thread
From: Greg KH @ 2022-06-03 15:52 UTC (permalink / raw)
  To: Alan Stern
  Cc: Andy Shevchenko, syzbot, hdanton, lenb, linux-acpi, linux-kernel,
	rafael.j.wysocki, rafael, rjw, syzkaller-bugs, linux-usb

On Fri, Jun 03, 2022 at 11:42:19AM -0400, Alan Stern wrote:
> On Fri, Jun 03, 2022 at 02:04:04PM +0300, Andy Shevchenko wrote:
> > On Fri, Jun 03, 2022 at 03:02:07AM -0700, syzbot wrote:
> > > syzbot has bisected this issue to:
> > > 
> > > commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> > > Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > > Date:   Fri Jun 18 13:41:27 2021 +0000
> > > 
> > >     ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
> > 
> > > Hmm... It's not at all obvious how this change could alter the behaviour so
> > > drastically. device_add() is called from the USB core with intf->dev.name == NULL
> > > for some reason. A-ha, this looks like the fault injector at work: the call
> > > 
> > > 	dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
> > > 		     dev->devpath, configuration, ifnum);
> > > 
> > > is missing a return code check.
> > 
> > But I'm not familiar with that code at all, adding Linux USB ML and Alan.
> 
> I can't see any connection between this bug and acpi/sysfs.c.  Is it a 
> bad bisection?
> 
> It looks like you're right about dev_set_name() failing.  In fact, the 
> kernel appears to be littered with calls to that routine which do not 
> check the return code (the entire subtree below drivers/usb/ contains 
> only _one_ call that does check the return code!).  The function doesn't 
> have any __must_check annotation, and its kerneldoc doesn't mention the 
> return code or the possibility of a failure.
> 
> Apparently the assumption is that if dev_set_name() fails then 
> device_add() later on will also fail, and the problem will be detected 
> then.
> 
> So now what should happen when device_add() for an interface fails in 
> usb_set_configuration()?

But how can that really fail on a real system?

Is this just due to error-injection stuff?  If so, I'm really loath to
rework the world for something that can never happen in real life.

Or is this a real syzbot-found-with-reproducer issue?

thanks,

greg k-h


* Re: [syzbot] general protection fault in __device_attach
  2022-06-03 11:04   ` Andy Shevchenko
@ 2022-06-03 15:42     ` Alan Stern
  2022-06-03 15:52       ` Greg KH
  0 siblings, 1 reply; 17+ messages in thread
From: Alan Stern @ 2022-06-03 15:42 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: syzbot, gregkh, hdanton, lenb, linux-acpi, linux-kernel,
	rafael.j.wysocki, rafael, rjw, syzkaller-bugs, linux-usb

On Fri, Jun 03, 2022 at 02:04:04PM +0300, Andy Shevchenko wrote:
> On Fri, Jun 03, 2022 at 03:02:07AM -0700, syzbot wrote:
> > syzbot has bisected this issue to:
> > 
> > commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> > Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> > Date:   Fri Jun 18 13:41:27 2021 +0000
> > 
> >     ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros
> 
> Hmm... It's not at all obvious how this change could alter the behaviour so
> drastically. device_add() is called from the USB core with intf->dev.name == NULL
> for some reason. A-ha, this looks like the fault injector at work: the call
> 
> 	dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
> 		     dev->devpath, configuration, ifnum);
> 
> is missing a return code check.
> 
> But I'm not familiar with that code at all, adding Linux USB ML and Alan.

I can't see any connection between this bug and acpi/sysfs.c.  Is it a 
bad bisection?

It looks like you're right about dev_set_name() failing.  In fact, the 
kernel appears to be littered with calls to that routine which do not 
check the return code (the entire subtree below drivers/usb/ contains 
only _one_ call that does check the return code!).  The function doesn't 
have any __must_check annotation, and its kerneldoc doesn't mention the 
return code or the possibility of a failure.

Apparently the assumption is that if dev_set_name() fails then 
device_add() later on will also fail, and the problem will be detected 
then.
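
To illustrate the pattern described above, here is a minimal userspace
sketch, not kernel code.  Every mock_* and register_* name is a
hypothetical stand-in invented for this example, and the mock_fail_alloc
flag only imitates what fault injection does.  It assumes that a failed
name allocation yields -ENOMEM and that adding a nameless device yields
-EINVAL (the -22 seen in the syzbot log).

```c
/* Userspace mock, NOT kernel code: all mock_* names are hypothetical. */
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct mock_device {
	char *name;	/* stays NULL until a name has been set */
};

/* Stand-in for fault injection: when true, the name allocation "fails". */
static bool mock_fail_alloc;

/* Modeled on dev_set_name(): format a name, return 0 or -ENOMEM. */
static int mock_dev_set_name(struct mock_device *dev, const char *fmt, ...)
{
	char buf[64];
	char *copy;
	va_list ap;
	int len;

	assert(dev != NULL);
	if (mock_fail_alloc)
		return -ENOMEM;

	va_start(ap, fmt);
	len = vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -ENOMEM;

	copy = malloc((size_t)len + 1);
	if (!copy)
		return -ENOMEM;
	memcpy(copy, buf, (size_t)len + 1);
	free(dev->name);
	dev->name = copy;
	return 0;
}

/* Modeled on the name check in device_add(): no name means -EINVAL. */
static int mock_device_add(struct mock_device *dev)
{
	assert(dev != NULL);
	return dev->name ? 0 : -EINVAL;
}

/* Unchecked pattern: the naming error is dropped and resurfaces as -EINVAL. */
static int register_unchecked(struct mock_device *dev, int busnum,
			      const char *devpath, int config, int ifnum)
{
	(void)mock_dev_set_name(dev, "%d-%s:%d.%d", busnum, devpath,
				config, ifnum);
	return mock_device_add(dev);
}

/* Checked pattern: the real cause (-ENOMEM) is reported immediately. */
static int register_checked(struct mock_device *dev, int busnum,
			    const char *devpath, int config, int ifnum)
{
	int ret = mock_dev_set_name(dev, "%d-%s:%d.%d", busnum, devpath,
				    config, ifnum);

	if (ret)
		return ret;
	return mock_device_add(dev);
}
```

With mock_fail_alloc set, register_unchecked() returns -EINVAL while
register_checked() returns -ENOMEM.  Either way the caller still has to
decide what to do with the half-initialized device, which is the harder
question raised next.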

So now what should happen when device_add() for an interface fails in 
usb_set_configuration()?  I guess the interface should be deleted; 
otherwise we have the possibility that people might still try to access 
it via usbfs, as in the syzbot test run.  Same goes for the 
of_device_is_available() check.

Fixing that will be a little painful.  Right now there are plenty of 
places in the USB core that aren't prepared to cope with a non-existent 
interface.

Alan Stern


* Re: [syzbot] general protection fault in __device_attach
  2022-06-03 10:02 ` syzbot
@ 2022-06-03 11:04   ` Andy Shevchenko
  2022-06-03 15:42     ` Alan Stern
  0 siblings, 1 reply; 17+ messages in thread
From: Andy Shevchenko @ 2022-06-03 11:04 UTC (permalink / raw)
  To: syzbot
  Cc: gregkh, hdanton, lenb, linux-acpi, linux-kernel,
	rafael.j.wysocki, rafael, rjw, syzkaller-bugs, linux-usb,
	Alan Stern

On Fri, Jun 03, 2022 at 03:02:07AM -0700, syzbot wrote:
> syzbot has bisected this issue to:
> 
> commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
> Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Date:   Fri Jun 18 13:41:27 2021 +0000
> 
>     ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros

Hmm... It's not at all obvious how this change could alter the behaviour so
drastically. device_add() is called from the USB core with intf->dev.name == NULL
for some reason. A-ha, this looks like the fault injector at work: the call

	dev_set_name(&intf->dev, "%d-%s:%d.%d", dev->bus->busnum,
		     dev->devpath, configuration, ifnum);

is missing a return code check.

But I'm not familiar with that code at all, adding Linux USB ML and Alan.

> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1040b80df00000
> start commit:   d1dc87763f40 assoc_array: Fix BUG_ON during garbage collect
> git tree:       upstream
> final oops:     https://syzkaller.appspot.com/x/report.txt?x=1240b80df00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=1440b80df00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c51cd24814bb5665
> dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15613e2bf00000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15c90adbf00000
> 
> Reported-by: syzbot+dd3c97de244683533381@syzkaller.appspotmail.com
> Fixes: a9c4cf299f5f ("ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros")
> 
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

-- 
With Best Regards,
Andy Shevchenko




* Re: [syzbot] general protection fault in __device_attach
       [not found] <20220603074439.5255-1-hdanton@sina.com>
@ 2022-06-03 10:41 ` syzbot
  0 siblings, 0 replies; 17+ messages in thread
From: syzbot @ 2022-06-03 10:41 UTC (permalink / raw)
  To: hdanton, linux-kernel, syzkaller-bugs

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+dd3c97de244683533381@syzkaller.appspotmail.com

Tested on:

commit:         d1dc8776 assoc_array: Fix BUG_ON during garbage collect
git tree:       https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel config:  https://syzkaller.appspot.com/x/.config?x=c51cd24814bb5665
dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1244e933f00000

Note: testing is done by a robot and is best-effort only.


* Re: [syzbot] general protection fault in __device_attach
  2022-03-14  8:46 syzbot
  2022-06-02 19:49 ` syzbot
@ 2022-06-03 10:02 ` syzbot
  2022-06-03 11:04   ` Andy Shevchenko
  1 sibling, 1 reply; 17+ messages in thread
From: syzbot @ 2022-06-03 10:02 UTC (permalink / raw)
  To: andriy.shevchenko, gregkh, hdanton, lenb, linux-acpi,
	linux-kernel, rafael.j.wysocki, rafael, rjw, syzkaller-bugs

syzbot has bisected this issue to:

commit a9c4cf299f5f79d5016c8a9646fa1fc49381a8c1
Author: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date:   Fri Jun 18 13:41:27 2021 +0000

    ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1040b80df00000
start commit:   d1dc87763f40 assoc_array: Fix BUG_ON during garbage collect
git tree:       upstream
final oops:     https://syzkaller.appspot.com/x/report.txt?x=1240b80df00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1440b80df00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=c51cd24814bb5665
dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15613e2bf00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15c90adbf00000

Reported-by: syzbot+dd3c97de244683533381@syzkaller.appspotmail.com
Fixes: a9c4cf299f5f ("ACPI: sysfs: Use __ATTR_RO() and __ATTR_RW() macros")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


* Re: [syzbot] general protection fault in __device_attach
  2022-03-14  8:46 syzbot
@ 2022-06-02 19:49 ` syzbot
  2022-06-03 10:02 ` syzbot
  1 sibling, 0 replies; 17+ messages in thread
From: syzbot @ 2022-06-02 19:49 UTC (permalink / raw)
  To: gregkh, linux-kernel, rafael, syzkaller-bugs

syzbot has found a reproducer for the following issue on:

HEAD commit:    d1dc87763f40 assoc_array: Fix BUG_ON during garbage collect
git tree:       upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=17d2e7f5f00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=c51cd24814bb5665
dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15613e2bf00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15c90adbf00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+dd3c97de244683533381@syzkaller.appspotmail.com

usb usb9: device_add((null)) --> -22
general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
CPU: 0 PID: 4190 Comm: syz-executor322 Not tainted 5.18.0-syzkaller-11972-gd1dc87763f40 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:948
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90003447b98 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff92000688f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000002 RDI: 0000000000000108
RBP: ffff88807a22f030 R08: 0000000000000000 R09: ffffffff8dbb1097
R10: fffffbfff1b76212 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff88807a22f0b0
FS:  0000555557335300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2b779a90b0 CR3: 000000007a1a7000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 proc_ioctl.part.0+0x48e/0x560 drivers/usb/core/devio.c:2356
 proc_ioctl drivers/usb/core/devio.c:182 [inline]
 proc_ioctl_default drivers/usb/core/devio.c:2391 [inline]
 usbdev_do_ioctl drivers/usb/core/devio.c:2747 [inline]
 usbdev_ioctl+0x2c08/0x36f0 drivers/usb/core/devio.c:2807
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:870 [inline]
 __se_sys_ioctl fs/ioctl.c:856 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:856
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f2b77979779
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe17c6ed98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f2b779bd184 RCX: 00007f2b77979779
RDX: 0000000020000040 RSI: 00000000c0105512 RDI: 0000000000000006
RBP: 00007ffe17c6edb0 R08: 0000000000000001 R09: 0000000000000001
R10: 000000000000ffff R11: 0000000000000246 R12: 0000000000000001
R13: 431bde82d7b634db R14: 0000000000000000 R15: 0000000000000000
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:948
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90003447b98 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: 1ffff92000688f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000002 RDI: 0000000000000108
RBP: ffff88807a22f030 R08: 0000000000000000 R09: ffffffff8dbb1097
R10: fffffbfff1b76212 R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff88807a22f0b0
FS:  0000555557335300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2b779a90b0 CR3: 000000007a1a7000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
   0:	e8 03 42 80 3c       	callq  0x3c804208
   5:	20 00                	and    %al,(%rax)
   7:	0f 85 a3 03 00 00    	jne    0x3b0
   d:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  14:	fc ff df
  17:	4c 8b 65 48          	mov    0x48(%rbp),%r12
  1b:	49 8d bc 24 08 01 00 	lea    0x108(%r12),%rdi
  22:	00
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	0f b6 04 02          	movzbl (%rdx,%rax,1),%eax <-- trapping instruction
  2e:	84 c0                	test   %al,%al
  30:	74 06                	je     0x38
  32:	0f 8e 6e 03 00 00    	jle    0x3a6
  38:	45                   	rex.RB
  39:	0f                   	.byte 0xf
  3a:	b6 b4                	mov    $0xb4,%dh
  3c:	24 08                	and    $0x8,%al
  3e:	01 00                	add    %eax,(%rax)
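
The arithmetic behind the trapping instruction can be reproduced in a few
lines of userspace C.  This is only a sketch based on the values visible in
the report: the shift of 3 and the offset 0xdffffc0000000000 are read from
the disassembly, the function names are made up for this example, and the
canonical-address check assumes 48-bit virtual addresses (4-level paging).
R12 is zero in the register dump, so the pointer loaded from 0x48(%rbp) was
NULL and the address about to be read was 0x108.

```c
/* Userspace sketch of the KASAN shadow lookup seen in the oops above. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Both constants are taken from the disassembly in the report. */
#define KASAN_SHADOW_OFFSET	0xdffffc0000000000ULL	/* movabs into %rax */
#define KASAN_SHADOW_SHIFT	3			/* shr $0x3,%rdx */

/* Address of the shadow byte that is checked before accessing addr. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Assumes 48-bit virtual addresses: bits 63..47 must all be equal. */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 most significant bits */

	return top == 0 || top == 0x1ffff;
}

/* Replays the register values from the report. */
static void check_against_report(void)
{
	uint64_t base = 0;		/* %r12, loaded from 0x48(%rbp) */
	uint64_t addr = base + 0x108;	/* %rdi after the lea */

	assert((addr >> KASAN_SHADOW_SHIFT) == 0x21);	/* %rdx */
	assert(kasan_shadow_addr(addr) == 0xdffffc0000000021ULL);
	assert(!is_canonical_48(kasan_shadow_addr(addr)));
}
```

The shadow address for 0x108 comes out as 0xdffffc0000000021, the
non-canonical address named in the fault line, and one shadow byte covers
the 8-byte range 0x108-0x10f that KASAN prints.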



* [syzbot] general protection fault in __device_attach
@ 2022-03-14  8:46 syzbot
  2022-06-02 19:49 ` syzbot
  2022-06-03 10:02 ` syzbot
  0 siblings, 2 replies; 17+ messages in thread
From: syzbot @ 2022-03-14  8:46 UTC (permalink / raw)
  To: gregkh, linux-kernel, rafael, syzkaller-bugs

Hello,

syzbot found the following issue on:

HEAD commit:    e7e19defa575 Merge tag 'arm64-fixes' of git://git.kernel.o..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13ea76f6700000
kernel config:  https://syzkaller.appspot.com/x/.config?x=442f8ac61e60a75e
dashboard link: https://syzkaller.appspot.com/bug?extid=dd3c97de244683533381
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+dd3c97de244683533381@syzkaller.appspotmail.com

general protection fault, probably for non-canonical address 0xdffffc0000000021: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000108-0x000000000000010f]
CPU: 1 PID: 14569 Comm: syz-executor.4 Not tainted 5.17.0-rc7-syzkaller-00068-ge7e19defa575 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:949
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90010a87b98 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: 1ffff92002150f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000008 RDI: 0000000000000108
RBP: ffff88807829d030 R08: 0000000000000000 R09: ffffc90010a87ad7
R10: fffff52002150f5a R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff88807829d140
FS:  00007f7048b3e700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f704a2dd090 CR3: 0000000074ae3000 CR4: 0000000000350ee0
Call Trace:
 <TASK>
 proc_ioctl.part.0+0x48e/0x560 drivers/usb/core/devio.c:2340
 proc_ioctl drivers/usb/core/devio.c:170 [inline]
 proc_ioctl_compat drivers/usb/core/devio.c:2389 [inline]
 usbdev_do_ioctl drivers/usb/core/devio.c:2705 [inline]
 usbdev_ioctl+0xc01/0x36c0 drivers/usb/core/devio.c:2791
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:874 [inline]
 __se_sys_ioctl fs/ioctl.c:860 [inline]
 __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f704a1c9049
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7048b3e168 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f704a2dbf60 RCX: 00007f704a1c9049
RDX: 0000000020000000 RSI: 00000000c00c5512 RDI: 0000000000000003
RBP: 00007f704a22308d R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffc683ba24f R14: 00007f7048b3e300 R15: 0000000000022000
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__device_attach+0xad/0x4a0 drivers/base/dd.c:949
Code: e8 03 42 80 3c 20 00 0f 85 a3 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 48 49 8d bc 24 08 01 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 6e 03 00 00 45 0f b6 b4 24 08 01 00
RSP: 0018:ffffc90010a87b98 EFLAGS: 00010216
RAX: dffffc0000000000 RBX: 1ffff92002150f74 RCX: 0000000000000000
RDX: 0000000000000021 RSI: 0000000000000008 RDI: 0000000000000108
RBP: ffff88807829d030 R08: 0000000000000000 R09: ffffc90010a87ad7
R10: fffff52002150f5a R11: 0000000000000001 R12: 0000000000000000
R13: 0000000000000000 R14: 00000000fffffff0 R15: ffff88807829d140
FS:  00007f7048b3e700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0b074ee1b8 CR3: 0000000074ae3000 CR4: 0000000000350ef0
----------------
Code disassembly (best guess):
   0:	e8 03 42 80 3c       	callq  0x3c804208
   5:	20 00                	and    %al,(%rax)
   7:	0f 85 a3 03 00 00    	jne    0x3b0
   d:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
  14:	fc ff df
  17:	4c 8b 65 48          	mov    0x48(%rbp),%r12
  1b:	49 8d bc 24 08 01 00 	lea    0x108(%r12),%rdi
  22:	00
  23:	48 89 fa             	mov    %rdi,%rdx
  26:	48 c1 ea 03          	shr    $0x3,%rdx
* 2a:	0f b6 04 02          	movzbl (%rdx,%rax,1),%eax <-- trapping instruction
  2e:	84 c0                	test   %al,%al
  30:	74 06                	je     0x38
  32:	0f 8e 6e 03 00 00    	jle    0x3a6
  38:	45                   	rex.RB
  39:	0f                   	.byte 0xf
  3a:	b6 b4                	mov    $0xb4,%dh
  3c:	24 08                	and    $0x8,%al
  3e:	01 00                	add    %eax,(%rax)


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


Thread overview: 17+ messages
     [not found] <20220603033532.5154-1-hdanton@sina.com>
2022-06-03  3:55 ` [syzbot] general protection fault in __device_attach syzbot
     [not found] <20220603074439.5255-1-hdanton@sina.com>
2022-06-03 10:41 ` syzbot
2022-03-14  8:46 syzbot
2022-06-02 19:49 ` syzbot
2022-06-03 10:02 ` syzbot
2022-06-03 11:04   ` Andy Shevchenko
2022-06-03 15:42     ` Alan Stern
2022-06-03 15:52       ` Greg KH
2022-06-03 16:03         ` Alan Stern
2022-06-03 16:11           ` Greg KH
2022-06-03 16:27             ` Alan Stern
2022-06-04  8:32             ` Dmitry Vyukov
2022-06-06 12:38               ` Dan Carpenter
2022-06-07  7:15                 ` Dmitry Vyukov
2022-06-08  3:25                   ` Matthew Wilcox
2022-06-08  8:20                     ` Dmitry Vyukov
2022-06-08  8:24                       ` Dmitry Vyukov
