linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* WARNING: refcount bug in cdev_get
@ 2019-08-20 22:58 syzbot
  2019-12-04 11:50 ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: syzbot @ 2019-08-20 22:58 UTC (permalink / raw)
  To: linux-fsdevel, linux-kernel, syzkaller-bugs, viro

Hello,

syzbot found the following crash on:

HEAD commit:    2d63ba3e Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pu..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=165d3302600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=3ff364e429585cf2
dashboard link: https://syzkaller.appspot.com/bug?extid=82defefbbd8527e1c2cb
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16c8ab3c600000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16be0c4c600000

Bisection is inconclusive: the bug happens on the oldest tested release.

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11de3622600000
console output: https://syzkaller.appspot.com/x/log.txt?x=15de3622600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+82defefbbd8527e1c2cb@syzkaller.appspotmail.com

------------[ cut here ]------------
refcount_t: increment on 0; use-after-free.
WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156 refcount_inc_checked  
lib/refcount.c:156 [inline]
WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156  
refcount_inc_checked+0x61/0x70 lib/refcount.c:154
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 11828 Comm: syz-executor746 Not tainted 5.3.0-rc4+ #112
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  panic+0x2dc/0x755 kernel/panic.c:219
  __warn.cold+0x20/0x4c kernel/panic.c:576
  report_bug+0x263/0x2b0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:179 [inline]
  fixup_bug arch/x86/kernel/traps.c:174 [inline]
  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
  do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
  invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1028
RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
Code: 1d 8e c6 64 06 31 ff 89 de e8 ab 9c 35 fe 84 db 75 dd e8 62 9b 35 fe  
48 c7 c7 00 05 c6 87 c6 05 6e c6 64 06 01 e8 67 26 07 fe <0f> 0b eb c1 90  
90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
RSP: 0018:ffff8880907d78b8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815c2466 RDI: ffffed10120faf09
RBP: ffff8880907d78c8 R08: ffff8880a771a200 R09: fffffbfff134ae48
R10: fffffbfff134ae47 R11: ffffffff89a5723f R12: ffff88809ea2bb80
R13: 0000000000000000 R14: ffff88809ff6cd40 R15: ffff8880a1c56480
  kref_get include/linux/kref.h:45 [inline]
  kobject_get+0x66/0xc0 lib/kobject.c:644
  cdev_get+0x60/0xb0 fs/char_dev.c:355
  chrdev_open+0xb0/0x6b0 fs/char_dev.c:400
  do_dentry_open+0x4df/0x1250 fs/open.c:797
  vfs_open+0xa0/0xd0 fs/open.c:906
  do_last fs/namei.c:3416 [inline]
  path_openat+0x10e9/0x4630 fs/namei.c:3533
  do_filp_open+0x1a1/0x280 fs/namei.c:3563
  do_sys_open+0x3fe/0x5d0 fs/open.c:1089
  __do_sys_open fs/open.c:1107 [inline]
  __se_sys_open fs/open.c:1102 [inline]
  __x64_sys_open+0x7e/0xc0 fs/open.c:1102
  do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x406311
Code: 75 14 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 a4 18 00 00 c3 48  
83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 02 00 00 00 0f 05 <48> 8b 3c 24 48  
89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007f047e1c0960 EFLAGS: 00000293 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000406311
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007f047e1c0970
RBP: 6666666666666667 R08: 000000000000000f R09: 00007f047e1c1700
R10: 00007f047e1c19d0 R11: 0000000000000293 R12: 00000000006dbc3c
R13: 0000000000000000 R14: 0000000000000000 R15: 00000000317a7973
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING: refcount bug in cdev_get
  2019-08-20 22:58 WARNING: refcount bug in cdev_get syzbot
@ 2019-12-04 11:50 ` Will Deacon
  2019-12-04 12:31   ` Greg KH
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2019-12-04 11:50 UTC (permalink / raw)
  To: syzbot
  Cc: linux-fsdevel, linux-kernel, syzkaller-bugs, viro, hdanton, akpm, gregkh

Hi all,

[+Hillf, +akpm, +Greg]

On Tue, Aug 20, 2019 at 03:58:06PM -0700, syzbot wrote:
> syzbot found the following crash on:
> 
> HEAD commit:    2d63ba3e Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pu..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=165d3302600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=3ff364e429585cf2
> dashboard link: https://syzkaller.appspot.com/bug?extid=82defefbbd8527e1c2cb
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16c8ab3c600000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16be0c4c600000
> 
> Bisection is inconclusive: the bug happens on the oldest tested release.
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11de3622600000
> console output: https://syzkaller.appspot.com/x/log.txt?x=15de3622600000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+82defefbbd8527e1c2cb@syzkaller.appspotmail.com
> 
> ------------[ cut here ]------------
> refcount_t: increment on 0; use-after-free.
> WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156 refcount_inc_checked
> lib/refcount.c:156 [inline]
> WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156
> refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> Kernel panic - not syncing: panic_on_warn set ...

[...]

> RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
> RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> Code: 1d 8e c6 64 06 31 ff 89 de e8 ab 9c 35 fe 84 db 75 dd e8 62 9b 35 fe
> 48 c7 c7 00 05 c6 87 c6 05 6e c6 64 06 01 e8 67 26 07 fe <0f> 0b eb c1 90 90
> 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
> RSP: 0018:ffff8880907d78b8 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff815c2466 RDI: ffffed10120faf09
> RBP: ffff8880907d78c8 R08: ffff8880a771a200 R09: fffffbfff134ae48
> R10: fffffbfff134ae47 R11: ffffffff89a5723f R12: ffff88809ea2bb80
> R13: 0000000000000000 R14: ffff88809ff6cd40 R15: ffff8880a1c56480
>  kref_get include/linux/kref.h:45 [inline]
>  kobject_get+0x66/0xc0 lib/kobject.c:644
>  cdev_get+0x60/0xb0 fs/char_dev.c:355
>  chrdev_open+0xb0/0x6b0 fs/char_dev.c:400
>  do_dentry_open+0x4df/0x1250 fs/open.c:797
>  vfs_open+0xa0/0xd0 fs/open.c:906
>  do_last fs/namei.c:3416 [inline]
>  path_openat+0x10e9/0x4630 fs/namei.c:3533
>  do_filp_open+0x1a1/0x280 fs/namei.c:3563
>  do_sys_open+0x3fe/0x5d0 fs/open.c:1089

FWIW, we've run into this same crash on arm64 so it would be nice to see it
fixed upstream. It looks like Hillf's reply (which included a patch) didn't
make it to the kernel mailing lists for some reason, but it is available
here:

https://groups.google.com/forum/#!original/syzkaller-bugs/PnQNxBrWv_8/X1ygj8d8DgAJ

A simpler fix would just be to use kobject_get_unless_zero() directly in
cdev_get(), but that looks odd in this specific case because chrdev_open()
holds the 'cdev_lock' and you'd hope that finding the kobject in the inode
with that held would mean that it's not being freed at the same time.

Cheers,

Will

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING: refcount bug in cdev_get
  2019-12-04 11:50 ` Will Deacon
@ 2019-12-04 12:31   ` Greg KH
  2019-12-10 11:44     ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2019-12-04 12:31 UTC (permalink / raw)
  To: Will Deacon
  Cc: syzbot, linux-fsdevel, linux-kernel, syzkaller-bugs, viro, hdanton, akpm

On Wed, Dec 04, 2019 at 11:50:56AM +0000, Will Deacon wrote:
> Hi all,
> 
> [+Hillf, +akpm, +Greg]
> 
> On Tue, Aug 20, 2019 at 03:58:06PM -0700, syzbot wrote:
> > syzbot found the following crash on:
> > 
> > HEAD commit:    2d63ba3e Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pu..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=165d3302600000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=3ff364e429585cf2
> > dashboard link: https://syzkaller.appspot.com/bug?extid=82defefbbd8527e1c2cb
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16c8ab3c600000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16be0c4c600000
> > 
> > Bisection is inconclusive: the bug happens on the oldest tested release.
> > 
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11de3622600000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15de3622600000
> > 
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+82defefbbd8527e1c2cb@syzkaller.appspotmail.com
> > 
> > ------------[ cut here ]------------
> > refcount_t: increment on 0; use-after-free.
> > WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156 refcount_inc_checked
> > lib/refcount.c:156 [inline]
> > WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156
> > refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> > Kernel panic - not syncing: panic_on_warn set ...
> 
> [...]
> 
> > RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
> > RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> > Code: 1d 8e c6 64 06 31 ff 89 de e8 ab 9c 35 fe 84 db 75 dd e8 62 9b 35 fe
> > 48 c7 c7 00 05 c6 87 c6 05 6e c6 64 06 01 e8 67 26 07 fe <0f> 0b eb c1 90 90
> > 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
> > RSP: 0018:ffff8880907d78b8 EFLAGS: 00010282
> > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: ffffffff815c2466 RDI: ffffed10120faf09
> > RBP: ffff8880907d78c8 R08: ffff8880a771a200 R09: fffffbfff134ae48
> > R10: fffffbfff134ae47 R11: ffffffff89a5723f R12: ffff88809ea2bb80
> > R13: 0000000000000000 R14: ffff88809ff6cd40 R15: ffff8880a1c56480
> >  kref_get include/linux/kref.h:45 [inline]
> >  kobject_get+0x66/0xc0 lib/kobject.c:644
> >  cdev_get+0x60/0xb0 fs/char_dev.c:355
> >  chrdev_open+0xb0/0x6b0 fs/char_dev.c:400
> >  do_dentry_open+0x4df/0x1250 fs/open.c:797
> >  vfs_open+0xa0/0xd0 fs/open.c:906
> >  do_last fs/namei.c:3416 [inline]
> >  path_openat+0x10e9/0x4630 fs/namei.c:3533
> >  do_filp_open+0x1a1/0x280 fs/namei.c:3563
> >  do_sys_open+0x3fe/0x5d0 fs/open.c:1089
> 
> FWIW, we've run into this same crash on arm64 so it would be nice to see it
> fixed upstream. It looks like Hillf's reply (which included a patch) didn't
> make it to the kernel mailing lists for some reason, but it is available
> here:
> 
> https://groups.google.com/forum/#!original/syzkaller-bugs/PnQNxBrWv_8/X1ygj8d8DgAJ

No one is going to go and dig a patch out of google groups :(

> A simpler fix would just be to use kobject_get_unless_zero() directly in
> cdev_get(), but that looks odd in this specific case because chrdev_open()
> holds the 'cdev_lock' and you'd hope that finding the kobject in the inode
> with that held would mean that it's not being freed at the same time.

When using kref_get_unless_zero() that implies that a lock is not being
used and you are relying on the kobject only instead.

But I thought we had a lock in play here, so why would changing this
actually fix anything?

This code hasn't changed in 15+ years, what suddenly changed that causes
problems here?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING: refcount bug in cdev_get
  2019-12-04 12:31   ` Greg KH
@ 2019-12-10 11:44     ` Will Deacon
  2019-12-18 17:08       ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2019-12-10 11:44 UTC (permalink / raw)
  To: Greg KH
  Cc: syzbot, linux-fsdevel, linux-kernel, syzkaller-bugs, viro, hdanton, akpm

Hi Greg,

On Wed, Dec 04, 2019 at 01:31:48PM +0100, Greg KH wrote:
> On Wed, Dec 04, 2019 at 11:50:56AM +0000, Will Deacon wrote:
> > On Tue, Aug 20, 2019 at 03:58:06PM -0700, syzbot wrote:
> > > syzbot found the following crash on:
> > > 
> > > HEAD commit:    2d63ba3e Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pu..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=165d3302600000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=3ff364e429585cf2
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=82defefbbd8527e1c2cb
> > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16c8ab3c600000
> > > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=16be0c4c600000
> > > 
> > > Bisection is inconclusive: the bug happens on the oldest tested release.
> > > 
> > > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=11de3622600000
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=15de3622600000
> > > 
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+82defefbbd8527e1c2cb@syzkaller.appspotmail.com
> > > 
> > > ------------[ cut here ]------------
> > > refcount_t: increment on 0; use-after-free.
> > > WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156 refcount_inc_checked
> > > lib/refcount.c:156 [inline]
> > > WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156
> > > refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> > > Kernel panic - not syncing: panic_on_warn set ...
> > 
> > [...]
> > 
> > > RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
> > > RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> > > Code: 1d 8e c6 64 06 31 ff 89 de e8 ab 9c 35 fe 84 db 75 dd e8 62 9b 35 fe
> > > 48 c7 c7 00 05 c6 87 c6 05 6e c6 64 06 01 e8 67 26 07 fe <0f> 0b eb c1 90 90
> > > 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
> > > RSP: 0018:ffff8880907d78b8 EFLAGS: 00010282
> > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > > RDX: 0000000000000000 RSI: ffffffff815c2466 RDI: ffffed10120faf09
> > > RBP: ffff8880907d78c8 R08: ffff8880a771a200 R09: fffffbfff134ae48
> > > R10: fffffbfff134ae47 R11: ffffffff89a5723f R12: ffff88809ea2bb80
> > > R13: 0000000000000000 R14: ffff88809ff6cd40 R15: ffff8880a1c56480
> > >  kref_get include/linux/kref.h:45 [inline]
> > >  kobject_get+0x66/0xc0 lib/kobject.c:644
> > >  cdev_get+0x60/0xb0 fs/char_dev.c:355
> > >  chrdev_open+0xb0/0x6b0 fs/char_dev.c:400
> > >  do_dentry_open+0x4df/0x1250 fs/open.c:797
> > >  vfs_open+0xa0/0xd0 fs/open.c:906
> > >  do_last fs/namei.c:3416 [inline]
> > >  path_openat+0x10e9/0x4630 fs/namei.c:3533
> > >  do_filp_open+0x1a1/0x280 fs/namei.c:3563
> > >  do_sys_open+0x3fe/0x5d0 fs/open.c:1089
> > 
> > FWIW, we've run into this same crash on arm64 so it would be nice to see it
> > fixed upstream. It looks like Hillf's reply (which included a patch) didn't
> > make it to the kernel mailing lists for some reason, but it is available
> > here:
> > 
> > https://groups.google.com/forum/#!original/syzkaller-bugs/PnQNxBrWv_8/X1ygj8d8DgAJ
> 
> No one is going to go and dig a patch out of google groups :(

Sure, just thought it was worth mentioning after digging up the history.

> > A simpler fix would just be to use kobject_get_unless_zero() directly in
> > cdev_get(), but that looks odd in this specific case because chrdev_open()
> > holds the 'cdev_lock' and you'd hope that finding the kobject in the inode
> > with that held would mean that it's not being freed at the same time.
> 
> When using kref_get_unless_zero() that implies that a lock is not being
> used and you are relying on the kobject only instead.
> 
> But I thought we had a lock in play here, so why would changing this
> actually fix anything?

I don't think the lock is always used. For example, look at chrdev_open(),
which appears in the backtrace; the locked code is:

	spin_lock(&cdev_lock);
	p = inode->i_cdev;
	if (!p) {
		struct kobject *kobj;
		int idx;
		spin_unlock(&cdev_lock);
		kobj = kobj_lookup(cdev_map, inode->i_rdev, &idx);
		if (!kobj)
			return -ENXIO;
		new = container_of(kobj, struct cdev, kobj);
		spin_lock(&cdev_lock);
		/* Check i_cdev again in case somebody beat us to it while
		   we dropped the lock. */
		p = inode->i_cdev;
		if (!p) {
			inode->i_cdev = p = new;
			list_add(&inode->i_devices, &p->list);
			new = NULL;
		} else if (!cdev_get(p))
			ret = -ENXIO;
	} else if (!cdev_get(p))
		ret = -ENXIO;
	spin_unlock(&cdev_lock);
	cdev_put(new);

So the idea is that multiple threads serialise on the 'cdev_lock' and then
check 'inode->i_cdev' to figure out if the device has already been probed,
taking a reference to it if it's available or probing it via kobj_lookup()
otherwise. I think that's backwards with respect to things like cdev_put(),
where the refcount is dropped *before* 'inode->i_cdev' is cleared to NULL.
In which case, if a concurrent call to cdev_put() can drop the refcount
to zero without 'cdev_lock' held, then you could get a use-after-free on
this path thanks to a dangling pointer in 'inode->i_cdev'..

Looking slightly ahead in this same function, there are error paths which
appear to do exactly that:

	fops = fops_get(p->ops);
	if (!fops)
		goto out_cdev_put;

	replace_fops(filp, fops);
	if (filp->f_op->open) {
		ret = filp->f_op->open(inode, filp);
		if (ret)
			goto out_cdev_put;
	}

	return 0;

 out_cdev_put:
	cdev_put(p);
	return ret;

In which case the thread which installed 'inode->i_cdev' earlier on can
now drop its refcount to zero without the lock held if, for example, the
filp->f_op->open() call fails.

But note, this is purely based on code inspection -- the C reproducer from
syzkaller doesn't work for me, so I've not been able to test any fixes either.
It's also worth noting that cdev_put() is called from __fput(), but I think the
reference counting on the file means we're ok there.

> This code hasn't changed in 15+ years, what suddenly changed that causes
> problems here?

I suppose one thing to consider is that the refcount code is relatively new,
so it could be that the actual use-after-free is extremely rare, but we're
now seeing that it's at least potentially an issue.

Thoughts?

Will

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING: refcount bug in cdev_get
  2019-12-10 11:44     ` Will Deacon
@ 2019-12-18 17:08       ` Will Deacon
  2019-12-18 18:20         ` Greg KH
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2019-12-18 17:08 UTC (permalink / raw)
  To: Greg KH
  Cc: syzbot, linux-fsdevel, linux-kernel, syzkaller-bugs, viro, hdanton, akpm

Hi all,

On Tue, Dec 10, 2019 at 11:44:45AM +0000, Will Deacon wrote:
> On Wed, Dec 04, 2019 at 01:31:48PM +0100, Greg KH wrote:
> > But I thought we had a lock in play here, so why would changing this
> > actually fix anything?
> 
> I don't think the lock is always used. For example, look at chrdev_open(),
> which appears in the backtrace; the locked code is:
> 
> 	spin_lock(&cdev_lock);
> 	p = inode->i_cdev;
> 	if (!p) {
> 		struct kobject *kobj;
> 		int idx;
> 		spin_unlock(&cdev_lock);
> 		kobj = kobj_lookup(cdev_map, inode->i_rdev, &idx);
> 		if (!kobj)
> 			return -ENXIO;
> 		new = container_of(kobj, struct cdev, kobj);
> 		spin_lock(&cdev_lock);
> 		/* Check i_cdev again in case somebody beat us to it while
> 		   we dropped the lock. */
> 		p = inode->i_cdev;
> 		if (!p) {
> 			inode->i_cdev = p = new;
> 			list_add(&inode->i_devices, &p->list);
> 			new = NULL;
> 		} else if (!cdev_get(p))
> 			ret = -ENXIO;
> 	} else if (!cdev_get(p))
> 		ret = -ENXIO;
> 	spin_unlock(&cdev_lock);
> 	cdev_put(new);
> 
> So the idea is that multiple threads serialise on the 'cdev_lock' and then
> check 'inode->i_cdev' to figure out if the device has already been probed,
> taking a reference to it if it's available or probing it via kobj_lookup()
> otherwise. I think that's backwards with respect to things like cdev_put(),
> where the refcount is dropped *before* 'inode->i_cdev' is cleared to NULL.
> In which case, if a concurrent call to cdev_put() can drop the refcount
> to zero without 'cdev_lock' held, then you could get a use-after-free on
> this path thanks to a dangling pointer in 'inode->i_cdev'..
> 
> Looking slightly ahead in this same function, there are error paths which
> appear to do exactly that:
> 
> 	fops = fops_get(p->ops);
> 	if (!fops)
> 		goto out_cdev_put;
> 
> 	replace_fops(filp, fops);
> 	if (filp->f_op->open) {
> 		ret = filp->f_op->open(inode, filp);
> 		if (ret)
> 			goto out_cdev_put;
> 	}
> 
> 	return 0;
> 
>  out_cdev_put:
> 	cdev_put(p);
> 	return ret;
> 
> In which case the thread which installed 'inode->i_cdev' earlier on can
> now drop its refcount to zero without the lock held if, for example, the
> filp->f_op->open() call fails.
> 
> But note, this is purely based on code inspection -- the C reproducer from
> syzkaller doesn't work for me, so I've not been able to test any fixes either.
> It's also worth noting that cdev_put() is called from __fput(), but I think the
> reference counting on the file means we're ok there.
> 
> > This code hasn't changed in 15+ years, what suddenly changed that causes
> > problems here?
> 
> I suppose one thing to consider is that the refcount code is relatively new,
> so it could be that the actual use-after-free is extremely rare, but we're
> now seeing that it's at least potentially an issue.
> 
> Thoughts?

FWIW, I added some mdelay()s to make this race more likely, and I can now
trigger it reasonably reliably. See below.

Will

--->8

[   89.512353] ------------[ cut here ]------------
[   89.513350] refcount_t: addition on 0; use-after-free.
[   89.513977] WARNING: CPU: 2 PID: 6385 at lib/refcount.c:25 refcount_warn_saturate+0x6d/0xf0
[   89.514943] Modules linked in:
[   89.515307] CPU: 2 PID: 6385 Comm: repro Not tainted 5.5.0-rc2+ #22
[   89.516039] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   89.517047] RIP: 0010:refcount_warn_saturate+0x6d/0xf0
[   89.517647] Code: 05 55 9a 15 01 01 e8 9d aa c8 ff 0f 0b c3 80 3d 45 9a 15 01 00 75 ce 48 c7 c7 00 9c 62 b3 c6 08
[   89.519749] RSP: 0018:ffffb524c1b9bc70 EFLAGS: 00010282
[   89.520353] RAX: 0000000000000000 RBX: ffff9e9da1f71390 RCX: 0000000000000000
[   89.521184] RDX: ffff9e9dbbd27618 RSI: ffff9e9dbbd18798 RDI: ffff9e9dbbd18798
[   89.522020] RBP: 0000000000000000 R08: 000000000000095f R09: 0000000000000039
[   89.522854] R10: 0000000000000000 R11: ffffb524c1b9bb20 R12: ffff9e9da1e8c700
[   89.523689] R13: ffffffffb25ee8b0 R14: 0000000000000000 R15: ffff9e9da1e8c700
[   89.524512] FS:  00007f3b87d26700(0000) GS:ffff9e9dbbd00000(0000) knlGS:0000000000000000
[   89.525439] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   89.526105] CR2: 00007fc16909c000 CR3: 000000012df9c000 CR4: 00000000000006e0
[   89.526937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   89.527759] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   89.528587] Call Trace:
[   89.528889]  kobject_get+0x5c/0x60
[   89.529290]  cdev_get+0x2b/0x60
[   89.529656]  chrdev_open+0x55/0x220
[   89.530060]  ? cdev_put.part.3+0x20/0x20
[   89.530515]  do_dentry_open+0x13a/0x390
[   89.530961]  path_openat+0x2c8/0x1470
[   89.531383]  do_filp_open+0x93/0x100
[   89.531797]  ? selinux_file_ioctl+0x17f/0x220
[   89.532297]  do_sys_open+0x186/0x220
[   89.532708]  do_syscall_64+0x48/0x150
[   89.533129]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   89.533704] RIP: 0033:0x7f3b87efcd0e
[   89.534115] Code: 89 54 24 08 e8 a3 f4 ff ff 8b 74 24 0c 48 8b 3c 24 41 89 c0 44 8b 54 24 08 b8 01 01 00 00 89 f4
[   89.536227] RSP: 002b:00007f3b87d259f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[   89.537085] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3b87efcd0e
[   89.537891] RDX: 0000000000000000 RSI: 00007f3b87d25a80 RDI: 00000000ffffff9c
[   89.538693] RBP: 00007f3b87d25e90 R08: 0000000000000000 R09: 0000000000000000
[   89.539493] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe188f504e
[   89.540291] R13: 00007ffe188f504f R14: 00007f3b87d26700 R15: 0000000000000000
[   89.541090] ---[ end trace 24f53ca58db8180a ]---

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING: refcount bug in cdev_get
  2019-12-18 17:08       ` Will Deacon
@ 2019-12-18 18:20         ` Greg KH
  2019-12-19 11:59           ` Will Deacon
  0 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2019-12-18 18:20 UTC (permalink / raw)
  To: Will Deacon
  Cc: syzbot, linux-fsdevel, linux-kernel, syzkaller-bugs, viro, hdanton, akpm

On Wed, Dec 18, 2019 at 05:08:55PM +0000, Will Deacon wrote:
> Hi all,
> 
> On Tue, Dec 10, 2019 at 11:44:45AM +0000, Will Deacon wrote:
> > On Wed, Dec 04, 2019 at 01:31:48PM +0100, Greg KH wrote:
> > > But I thought we had a lock in play here, so why would changing this
> > > actually fix anything?
> > 
> > I don't think the lock is always used. For example, look at chrdev_open(),
> > which appears in the backtrace; the locked code is:
> > 
> > 	spin_lock(&cdev_lock);
> > 	p = inode->i_cdev;
> > 	if (!p) {
> > 		struct kobject *kobj;
> > 		int idx;
> > 		spin_unlock(&cdev_lock);
> > 		kobj = kobj_lookup(cdev_map, inode->i_rdev, &idx);
> > 		if (!kobj)
> > 			return -ENXIO;
> > 		new = container_of(kobj, struct cdev, kobj);
> > 		spin_lock(&cdev_lock);
> > 		/* Check i_cdev again in case somebody beat us to it while
> > 		   we dropped the lock. */
> > 		p = inode->i_cdev;
> > 		if (!p) {
> > 			inode->i_cdev = p = new;
> > 			list_add(&inode->i_devices, &p->list);
> > 			new = NULL;
> > 		} else if (!cdev_get(p))
> > 			ret = -ENXIO;
> > 	} else if (!cdev_get(p))
> > 		ret = -ENXIO;
> > 	spin_unlock(&cdev_lock);
> > 	cdev_put(new);
> > 
> > So the idea is that multiple threads serialise on the 'cdev_lock' and then
> > check 'inode->i_cdev' to figure out if the device has already been probed,
> > taking a reference to it if it's available or probing it via kobj_lookup()
> > otherwise. I think that's backwards with respect to things like cdev_put(),
> > where the refcount is dropped *before* 'inode->i_cdev' is cleared to NULL.
> > In which case, if a concurrent call to cdev_put() can drop the refcount
> > to zero without 'cdev_lock' held, then you could get a use-after-free on
> > this path thanks to a dangling pointer in 'inode->i_cdev'..
> > 
> > Looking slightly ahead in this same function, there are error paths which
> > appear to do exactly that:
> > 
> > 	fops = fops_get(p->ops);
> > 	if (!fops)
> > 		goto out_cdev_put;
> > 
> > 	replace_fops(filp, fops);
> > 	if (filp->f_op->open) {
> > 		ret = filp->f_op->open(inode, filp);
> > 		if (ret)
> > 			goto out_cdev_put;
> > 	}
> > 
> > 	return 0;
> > 
> >  out_cdev_put:
> > 	cdev_put(p);
> > 	return ret;
> > 
> > In which case the thread which installed 'inode->i_cdev' earlier on can
> > now drop its refcount to zero without the lock held if, for example, the
> > filp->f_op->open() call fails.
> > 
> > But note, this is purely based on code inspection -- the C reproducer from
> > syzkaller doesn't work for me, so I've not been able to test any fixes either.
> > It's also worth noting that cdev_put() is called from __fput(), but I think the
> > reference counting on the file means we're ok there.
> > 
> > > This code hasn't changed in 15+ years, what suddenly changed that causes
> > > problems here?
> > 
> > I suppose one thing to consider is that the refcount code is relatively new,
> > so it could be that the actual use-after-free is extremely rare, but we're
> > now seeing that it's at least potentially an issue.
> > 
> > Thoughts?
> 
> FWIW, I added some mdelay()s to make this race more likely, and I can now
> trigger it reasonably reliably. See below.
> 
> Will
> 
> --->8
> 
> [   89.512353] ------------[ cut here ]------------
> [   89.513350] refcount_t: addition on 0; use-after-free.
> [   89.513977] WARNING: CPU: 2 PID: 6385 at lib/refcount.c:25 refcount_warn_saturate+0x6d/0xf0
> [   89.514943] Modules linked in:
> [   89.515307] CPU: 2 PID: 6385 Comm: repro Not tainted 5.5.0-rc2+ #22
> [   89.516039] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
> [   89.517047] RIP: 0010:refcount_warn_saturate+0x6d/0xf0
> [   89.517647] Code: 05 55 9a 15 01 01 e8 9d aa c8 ff 0f 0b c3 80 3d 45 9a 15 01 00 75 ce 48 c7 c7 00 9c 62 b3 c6 08
> [   89.519749] RSP: 0018:ffffb524c1b9bc70 EFLAGS: 00010282
> [   89.520353] RAX: 0000000000000000 RBX: ffff9e9da1f71390 RCX: 0000000000000000
> [   89.521184] RDX: ffff9e9dbbd27618 RSI: ffff9e9dbbd18798 RDI: ffff9e9dbbd18798
> [   89.522020] RBP: 0000000000000000 R08: 000000000000095f R09: 0000000000000039
> [   89.522854] R10: 0000000000000000 R11: ffffb524c1b9bb20 R12: ffff9e9da1e8c700
> [   89.523689] R13: ffffffffb25ee8b0 R14: 0000000000000000 R15: ffff9e9da1e8c700
> [   89.524512] FS:  00007f3b87d26700(0000) GS:ffff9e9dbbd00000(0000) knlGS:0000000000000000
> [   89.525439] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   89.526105] CR2: 00007fc16909c000 CR3: 000000012df9c000 CR4: 00000000000006e0
> [   89.526937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   89.527759] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   89.528587] Call Trace:
> [   89.528889]  kobject_get+0x5c/0x60
> [   89.529290]  cdev_get+0x2b/0x60
> [   89.529656]  chrdev_open+0x55/0x220
> [   89.530060]  ? cdev_put.part.3+0x20/0x20
> [   89.530515]  do_dentry_open+0x13a/0x390
> [   89.530961]  path_openat+0x2c8/0x1470
> [   89.531383]  do_filp_open+0x93/0x100
> [   89.531797]  ? selinux_file_ioctl+0x17f/0x220
> [   89.532297]  do_sys_open+0x186/0x220
> [   89.532708]  do_syscall_64+0x48/0x150
> [   89.533129]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   89.533704] RIP: 0033:0x7f3b87efcd0e
> [   89.534115] Code: 89 54 24 08 e8 a3 f4 ff ff 8b 74 24 0c 48 8b 3c 24 41 89 c0 44 8b 54 24 08 b8 01 01 00 00 89 f4
> [   89.536227] RSP: 002b:00007f3b87d259f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
> [   89.537085] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3b87efcd0e
> [   89.537891] RDX: 0000000000000000 RSI: 00007f3b87d25a80 RDI: 00000000ffffff9c
> [   89.538693] RBP: 00007f3b87d25e90 R08: 0000000000000000 R09: 0000000000000000
> [   89.539493] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffe188f504e
> [   89.540291] R13: 00007ffe188f504f R14: 00007f3b87d26700 R15: 0000000000000000
> [   89.541090] ---[ end trace 24f53ca58db8180a ]---

No hint as to _where_ you put the mdelay()?  :)



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING: refcount bug in cdev_get
  2019-12-18 18:20         ` Greg KH
@ 2019-12-19 11:59           ` Will Deacon
  2019-12-24 12:59             ` Prateek Sood
  0 siblings, 1 reply; 8+ messages in thread
From: Will Deacon @ 2019-12-19 11:59 UTC (permalink / raw)
  To: Greg KH
  Cc: syzbot, linux-fsdevel, linux-kernel, syzkaller-bugs, viro, hdanton, akpm

On Wed, Dec 18, 2019 at 07:20:26PM +0100, Greg KH wrote:
> On Wed, Dec 18, 2019 at 05:08:55PM +0000, Will Deacon wrote:
> > On Tue, Dec 10, 2019 at 11:44:45AM +0000, Will Deacon wrote:
> > > On Wed, Dec 04, 2019 at 01:31:48PM +0100, Greg KH wrote:
> > > > This code hasn't changed in 15+ years, what suddenly changed that causes
> > > > problems here?
> > > 
> > > I suppose one thing to consider is that the refcount code is relatively new,
> > > so it could be that the actual use-after-free is extremely rare, but we're
> > > now seeing that it's at least potentially an issue.
> > > 
> > > Thoughts?
> > 
> > FWIW, I added some mdelay()s to make this race more likely, and I can now
> > trigger it reasonably reliably. See below.
> > 
> > --->8
> > 
> > [   89.512353] ------------[ cut here ]------------
> > [   89.513350] refcount_t: addition on 0; use-after-free.
> > [   89.513977] WARNING: CPU: 2 PID: 6385 at lib/refcount.c:25 refcount_warn_saturate+0x6d/0xf0

[...]

> No hint as to _where_ you put the mdelay()?  :)

I threw it in the release function to maximise the period where the refcount
is 0 but the inode 'i_cdev' pointer is non-NULL. I also hacked chrdev_open()
so that the fops->open() call appears to fail most of the time (I guess
syzkaller uses error injection to do something similar). Nasty hack below.

I'll send a patch, given that I've managed to "reproduce" this.

Will

--->8

diff --git a/fs/char_dev.c b/fs/char_dev.c
index 00dfe17871ac..e2e48fcd0435 100644
--- a/fs/char_dev.c
+++ b/fs/char_dev.c
@@ -375,7 +375,7 @@ static int chrdev_open(struct inode *inode, struct file *filp)
 	const struct file_operations *fops;
 	struct cdev *p;
 	struct cdev *new = NULL;
-	int ret = 0;
+	int ret = 0, first = 0;
 
 	spin_lock(&cdev_lock);
 	p = inode->i_cdev;
@@ -395,6 +395,7 @@ static int chrdev_open(struct inode *inode, struct file *filp)
 			inode->i_cdev = p = new;
 			list_add(&inode->i_devices, &p->list);
 			new = NULL;
+			first = 1;
 		} else if (!cdev_get(p))
 			ret = -ENXIO;
 	} else if (!cdev_get(p))
@@ -411,6 +412,10 @@ static int chrdev_open(struct inode *inode, struct file *filp)
 
 	replace_fops(filp, fops);
 	if (filp->f_op->open) {
+		if (first && (get_cycles() & 0x3)) {
+			ret = -EINTR;
+			goto out_cdev_put;
+		}
 		ret = filp->f_op->open(inode, filp);
 		if (ret)
 			goto out_cdev_put;
@@ -594,12 +599,14 @@ void cdev_del(struct cdev *p)
 	kobject_put(&p->kobj);
 }
 
+#include <linux/delay.h>
 
 static void cdev_default_release(struct kobject *kobj)
 {
 	struct cdev *p = container_of(kobj, struct cdev, kobj);
 	struct kobject *parent = kobj->parent;
 
+	mdelay(50);
 	cdev_purge(p);
 	kobject_put(parent);
 }

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: WARNING: refcount bug in cdev_get
  2019-12-19 11:59           ` Will Deacon
@ 2019-12-24 12:59             ` Prateek Sood
  0 siblings, 0 replies; 8+ messages in thread
From: Prateek Sood @ 2019-12-24 12:59 UTC (permalink / raw)
  To: Will Deacon, Greg KH
  Cc: syzbot, linux-fsdevel, linux-kernel, syzkaller-bugs, viro, hdanton, akpm

Hi Will,

I am facing same issue while syzkaller fault injection code is causing

failure in filp->f_op->open() from chrdev_open().

I believe we need to rely on refcount as cdev_lock() is not sufficient

in this case.

Patch mentioned in 
https://groups.google.com/forum/#!original/syzkaller-bugs/PnQNxBrWv_8/X1ygj8d8DgAJ

seems good.


Please share your opinion the same.


Regards

Prateek



On 12/19/2019 5:29 PM, Will Deacon wrote:
> On Wed, Dec 18, 2019 at 07:20:26PM +0100, Greg KH wrote:
>> On Wed, Dec 18, 2019 at 05:08:55PM +0000, Will Deacon wrote:
>>> On Tue, Dec 10, 2019 at 11:44:45AM +0000, Will Deacon wrote:
>>>> On Wed, Dec 04, 2019 at 01:31:48PM +0100, Greg KH wrote:
>>>>> This code hasn't changed in 15+ years, what suddenly changed that causes
>>>>> problems here?
>>>> I suppose one thing to consider is that the refcount code is relatively new,
>>>> so it could be that the actual use-after-free is extremely rare, but we're
>>>> now seeing that it's at least potentially an issue.
>>>>
>>>> Thoughts?
>>> FWIW, I added some mdelay()s to make this race more likely, and I can now
>>> trigger it reasonably reliably. See below.
>>>
>>> --->8
>>>
>>> [   89.512353] ------------[ cut here ]------------
>>> [   89.513350] refcount_t: addition on 0; use-after-free.
>>> [   89.513977] WARNING: CPU: 2 PID: 6385 at lib/refcount.c:25 refcount_warn_saturate+0x6d/0xf0
> [...]
>
>> No hint as to _where_ you put the mdelay()?  :)
> I threw it in the release function to maximise the period where the refcount
> is 0 but the inode 'i_cdev' pointer is non-NULL. I also hacked chrdev_open()
> so that the fops->open() call appears to fail most of the time (I guess
> syzkaller uses error injection to do something similar). Nasty hack below.
>
> I'll send a patch, given that I've managed to "reproduce" this.
>
> Will
>
> --->8
>
> diff --git a/fs/char_dev.c b/fs/char_dev.c
> index 00dfe17871ac..e2e48fcd0435 100644
> --- a/fs/char_dev.c
> +++ b/fs/char_dev.c
> @@ -375,7 +375,7 @@ static int chrdev_open(struct inode *inode, struct file *filp)
>   	const struct file_operations *fops;
>   	struct cdev *p;
>   	struct cdev *new = NULL;
> -	int ret = 0;
> +	int ret = 0, first = 0;
>   
>   	spin_lock(&cdev_lock);
>   	p = inode->i_cdev;
> @@ -395,6 +395,7 @@ static int chrdev_open(struct inode *inode, struct file *filp)
>   			inode->i_cdev = p = new;
>   			list_add(&inode->i_devices, &p->list);
>   			new = NULL;
> +			first = 1;
>   		} else if (!cdev_get(p))
>   			ret = -ENXIO;
>   	} else if (!cdev_get(p))
> @@ -411,6 +412,10 @@ static int chrdev_open(struct inode *inode, struct file *filp)
>   
>   	replace_fops(filp, fops);
>   	if (filp->f_op->open) {
> +		if (first && (get_cycles() & 0x3)) {
> +			ret = -EINTR;
> +			goto out_cdev_put;
> +		}
>   		ret = filp->f_op->open(inode, filp);
>   		if (ret)
>   			goto out_cdev_put;
> @@ -594,12 +599,14 @@ void cdev_del(struct cdev *p)
>   	kobject_put(&p->kobj);
>   }
>   
> +#include <linux/delay.h>
>   
>   static void cdev_default_release(struct kobject *kobj)
>   {
>   	struct cdev *p = container_of(kobj, struct cdev, kobj);
>   	struct kobject *parent = kobj->parent;
>   
> +	mdelay(50);
>   	cdev_purge(p);
>   	kobject_put(parent);
>   }

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2019-12-24 12:59 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-20 22:58 WARNING: refcount bug in cdev_get syzbot
2019-12-04 11:50 ` Will Deacon
2019-12-04 12:31   ` Greg KH
2019-12-10 11:44     ` Will Deacon
2019-12-18 17:08       ` Will Deacon
2019-12-18 18:20         ` Greg KH
2019-12-19 11:59           ` Will Deacon
2019-12-24 12:59             ` Prateek Sood

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).