Linux-Raid Archives on lore.kernel.org
 help / color / Atom feed
* WARNING in md_ioctl
@ 2020-10-17 11:06 Dae R. Jeong
  2020-10-19  6:18 ` Song Liu
  0 siblings, 1 reply; 8+ messages in thread
From: Dae R. Jeong @ 2020-10-17 11:06 UTC (permalink / raw)
  To: song; +Cc: yjkwon, linux-raid, linux-kernel

Hi,

I looked into the warning "WARNING in md_ioctl" found by Syzkaller.
(https://syzkaller.appspot.com/bug?id=fbf9eaea2e65bfcabb4e2750c3ab0892867edea1)
I suspect that it is caused by a race between two concurrenct ioctl()s as belows.

CPU1 (md_ioctl())                          CPU2 (md_ioctl())
------                                     ------
set_bit(MD_CLOSING, &mddev->flags);
did_set_md_closing = true;
                                           WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));

if(did_set_md_closing)
    clear_bit(MD_CLOSING, &mddev->flags);

If the above is correct, this warning is introduced
in the commit 065e519e("md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop").
Could you please take a look into this?

Best regards,
Dae R. Jeong



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING in md_ioctl
  2020-10-17 11:06 WARNING in md_ioctl Dae R. Jeong
@ 2020-10-19  6:18 ` Song Liu
  2020-10-19  7:03   ` Dae R. Jeong
  0 siblings, 1 reply; 8+ messages in thread
From: Song Liu @ 2020-10-19  6:18 UTC (permalink / raw)
  To: Dae R. Jeong; +Cc: yjkwon, linux-raid, open list

On Sat, Oct 17, 2020 at 4:13 AM Dae R. Jeong <dae.r.jeong@kaist.ac.kr> wrote:
>
> Hi,
>
> I looked into the warning "WARNING in md_ioctl" found by Syzkaller.
> (https://syzkaller.appspot.com/bug?id=fbf9eaea2e65bfcabb4e2750c3ab0892867edea1)
> I suspect that it is caused by a race between two concurrenct ioctl()s as belows.
>
> CPU1 (md_ioctl())                          CPU2 (md_ioctl())
> ------                                     ------
> set_bit(MD_CLOSING, &mddev->flags);
> did_set_md_closing = true;
>                                            WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
>
> if(did_set_md_closing)
>     clear_bit(MD_CLOSING, &mddev->flags);
>
> If the above is correct, this warning is introduced
> in the commit 065e519e("md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop").
> Could you please take a look into this?

This is an interesting case. We try to protect against concurrent
ioctl via mddev->openers:

                if (mddev->pers && atomic_read(&mddev->openers) > 1) {
                        mutex_unlock(&mddev->open_mutex);
                        err = -EBUSY;
                        goto out;
                }

But in this case, we are sending multiple ioctl from the same fd, so
openers == 1.

We can probably do something like:

diff --git i/drivers/md/md.c w/drivers/md/md.c
index 6072782070230..49442a3f4605b 100644
--- i/drivers/md/md.c
+++ w/drivers/md/md.c
@@ -7591,8 +7591,10 @@ static int md_ioctl(struct block_device *bdev,
fmode_t mode,
                        err = -EBUSY;
                        goto out;
                }
-               WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
-               set_bit(MD_CLOSING, &mddev->flags);
+               if (test_and_set_bit(MD_CLOSING, &mddev->flags)) {
+                       err = -EBUSY;
+                       goto out;
+               }
                did_set_md_closing = true;
                mutex_unlock(&mddev->open_mutex);
                sync_blockdev(bdev);

Could you please test whether this fixes the issue?

Thanks,
Song

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING in md_ioctl
  2020-10-19  6:18 ` Song Liu
@ 2020-10-19  7:03   ` Dae R. Jeong
  2020-10-21  5:28     ` Song Liu
  0 siblings, 1 reply; 8+ messages in thread
From: Dae R. Jeong @ 2020-10-19  7:03 UTC (permalink / raw)
  To: Song Liu; +Cc: yjkwon, linux-raid, open list

> diff --git i/drivers/md/md.c w/drivers/md/md.c
> index 6072782070230..49442a3f4605b 100644
> --- i/drivers/md/md.c
> +++ w/drivers/md/md.c
> @@ -7591,8 +7591,10 @@ static int md_ioctl(struct block_device *bdev,
> fmode_t mode,
>                         err = -EBUSY;
>                         goto out;
>                 }
> -               WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
> -               set_bit(MD_CLOSING, &mddev->flags);
> +               if (test_and_set_bit(MD_CLOSING, &mddev->flags)) {
> +                       err = -EBUSY;
> +                       goto out;
> +               }
>                 did_set_md_closing = true;
>                 mutex_unlock(&mddev->open_mutex);
>                 sync_blockdev(bdev);
> 
> Could you please test whether this fixes the issue?

Since &mddev->open_mutex is held when testing a bit of mddev->flags, I
modified the code just a little bit by putting mutex_unlock() as
belows.

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 98bac4f304ae..643f7f5be49b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7590,8 +7590,11 @@ static int md_ioctl(struct block_device *bdev, fmode_t mode,
 			err = -EBUSY;
 			goto out;
 		}
-		WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
-		set_bit(MD_CLOSING, &mddev->flags);
+		if (test_and_set_bit(MD_CLOSING, &mddev->flags)) {
+			mutex_unlock(&mddev->open_mutex);
+			err = -EBUSY;
+			goto out;
+		}
 		did_set_md_closing = true;
 		mutex_unlock(&mddev->open_mutex);
 		sync_blockdev(bdev);

The warning no longer recurs (of course, we removed
WARN_ON_ONCE()). As I am not familiar with this code, I do not see any
other problem.

Best regards,
Dae R. Jeong



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING in md_ioctl
  2020-10-19  7:03   ` Dae R. Jeong
@ 2020-10-21  5:28     ` Song Liu
  2020-10-22  0:24       ` Dae R. Jeong
  0 siblings, 1 reply; 8+ messages in thread
From: Song Liu @ 2020-10-21  5:28 UTC (permalink / raw)
  To: Dae R. Jeong; +Cc: yjkwon, linux-raid, open list

On Mon, Oct 19, 2020 at 12:03 AM Dae R. Jeong <dae.r.jeong@kaist.ac.kr> wrote:
>
> > diff --git i/drivers/md/md.c w/drivers/md/md.c
> > index 6072782070230..49442a3f4605b 100644
> > --- i/drivers/md/md.c
> > +++ w/drivers/md/md.c
> > @@ -7591,8 +7591,10 @@ static int md_ioctl(struct block_device *bdev,
> > fmode_t mode,
> >                         err = -EBUSY;
> >                         goto out;
> >                 }
> > -               WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
> > -               set_bit(MD_CLOSING, &mddev->flags);
> > +               if (test_and_set_bit(MD_CLOSING, &mddev->flags)) {
> > +                       err = -EBUSY;
> > +                       goto out;
> > +               }
> >                 did_set_md_closing = true;
> >                 mutex_unlock(&mddev->open_mutex);
> >                 sync_blockdev(bdev);
> >
> > Could you please test whether this fixes the issue?
>
> Since &mddev->open_mutex is held when testing a bit of mddev->flags, I
> modified the code just a little bit by putting mutex_unlock() as
> belows.

Good catch! The fix looks good. Would you like to submit a patch for it?

Thanks,
Song

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING in md_ioctl
  2020-10-21  5:28     ` Song Liu
@ 2020-10-22  0:24       ` Dae R. Jeong
  0 siblings, 0 replies; 8+ messages in thread
From: Dae R. Jeong @ 2020-10-22  0:24 UTC (permalink / raw)
  To: Song Liu; +Cc: yjkwon, linux-raid, open list

> > > diff --git i/drivers/md/md.c w/drivers/md/md.c
> > > index 6072782070230..49442a3f4605b 100644
> > > --- i/drivers/md/md.c
> > > +++ w/drivers/md/md.c
> > > @@ -7591,8 +7591,10 @@ static int md_ioctl(struct block_device *bdev,
> > > fmode_t mode,
> > >                         err = -EBUSY;
> > >                         goto out;
> > >                 }
> > > -               WARN_ON_ONCE(test_bit(MD_CLOSING, &mddev->flags));
> > > -               set_bit(MD_CLOSING, &mddev->flags);
> > > +               if (test_and_set_bit(MD_CLOSING, &mddev->flags)) {
> > > +                       err = -EBUSY;
> > > +                       goto out;
> > > +               }
> > >                 did_set_md_closing = true;
> > >                 mutex_unlock(&mddev->open_mutex);
> > >                 sync_blockdev(bdev);
> > >
> 
> Good catch! The fix looks good. Would you like to submit a patch for it?

Sure. I will send a patch soon.

Best regards,
Dae R. Jeong.



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING in md_ioctl
  2019-11-25 22:37 ` syzbot
@ 2019-11-26  8:42   ` Daniel Vetter
  0 siblings, 0 replies; 8+ messages in thread
From: Daniel Vetter @ 2019-11-26  8:42 UTC (permalink / raw)
  To: syzbot
  Cc: airlied, intel-gfx, syzkaller-bugs, linux-kernel, linux-raid,
	dri-devel, shli

On Mon, Nov 25, 2019 at 02:37:01PM -0800, syzbot wrote:
> syzbot has bisected this bug to:
> 
> commit 4b6ce6810a5dc0af387a238e8c852e0d3822381f
> Author: Rafael Antognolli <rafael.antognolli@intel.com>
> Date:   Mon Feb 5 23:33:30 2018 +0000
> 
>     drm/i915/cnl: WaPipeControlBefore3DStateSamplePattern

This seems very unlikely, the reproducer doesn't open a drm device, and
I'd be surprised if your gcd instances have an actual i915 device in them
(but I can't check because boot log isn't provided, didn't find it on the
dashboard either).

Since i915 is built-in I suspect this simply moved something else in the
kernel image around which provokes the bug.
-Daniel

> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13aeb522e00000
> start commit:   c61a56ab Merge branch 'x86-urgent-for-linus' of git://git...
> git tree:       upstream
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=106eb522e00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=17aeb522e00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=4013180e7c7a9ff9
> dashboard link: https://syzkaller.appspot.com/bug?extid=1e46a0864c1a6e9bd3d8
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16bca207800000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14819a47800000
> 
> Reported-by: syzbot+1e46a0864c1a6e9bd3d8@syzkaller.appspotmail.com
> Fixes: 4b6ce6810a5d ("drm/i915/cnl:
> WaPipeControlBefore3DStateSamplePattern")
> 
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: WARNING in md_ioctl
  2018-04-30  1:00 syzbot
@ 2019-11-25 22:37 ` syzbot
  2019-11-26  8:42   ` Daniel Vetter
  0 siblings, 1 reply; 8+ messages in thread
From: syzbot @ 2019-11-25 22:37 UTC (permalink / raw)
  To: airlied, chris, dri-devel, intel-gfx, jani.nikula,
	joonas.lahtinen, linux-kernel, linux-raid, rafael.antognolli,
	rodrigo.vivi, shli, syzkaller-bugs

syzbot has bisected this bug to:

commit 4b6ce6810a5dc0af387a238e8c852e0d3822381f
Author: Rafael Antognolli <rafael.antognolli@intel.com>
Date:   Mon Feb 5 23:33:30 2018 +0000

     drm/i915/cnl: WaPipeControlBefore3DStateSamplePattern

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13aeb522e00000
start commit:   c61a56ab Merge branch 'x86-urgent-for-linus' of git://git...
git tree:       upstream
final crash:    https://syzkaller.appspot.com/x/report.txt?x=106eb522e00000
console output: https://syzkaller.appspot.com/x/log.txt?x=17aeb522e00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4013180e7c7a9ff9
dashboard link: https://syzkaller.appspot.com/bug?extid=1e46a0864c1a6e9bd3d8
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16bca207800000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14819a47800000

Reported-by: syzbot+1e46a0864c1a6e9bd3d8@syzkaller.appspotmail.com
Fixes: 4b6ce6810a5d ("drm/i915/cnl:  
WaPipeControlBefore3DStateSamplePattern")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply	[flat|nested] 8+ messages in thread

* WARNING in md_ioctl
@ 2018-04-30  1:00 syzbot
  2019-11-25 22:37 ` syzbot
  0 siblings, 1 reply; 8+ messages in thread
From: syzbot @ 2018-04-30  1:00 UTC (permalink / raw)
  To: linux-kernel, linux-raid, shli, syzkaller-bugs

Hello,

syzbot hit the following crash on upstream commit
c61a56ababa404961fa769a2b24229f18e461961 (Sun Apr 29 17:06:05 2018 +0000)
Merge branch 'x86-urgent-for-linus' of  
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=1e46a0864c1a6e9bd3d8

C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5771999158730752
syzkaller reproducer:  
https://syzkaller.appspot.com/x/repro.syz?id=6399853584187392
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=6389218171420672
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=4617060493481910265
compiler: gcc (GCC) 8.0.1 20180413 (experimental)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+1e46a0864c1a6e9bd3d8@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for  
details.
If you forward the report, please keep this part and the footer.

random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
random: sshd: uninitialized urandom read (32 bytes read)
md: md0 stopped.
md: md0 stopped.
WARNING: CPU: 1 PID: 4511 at drivers/md/md.c:7188 md_ioctl+0x3427/0x6360  
drivers/md/md.c:7188
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 4511 Comm: syz-executor107 Not tainted 4.17.0-rc2+ #24
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
  panic+0x22f/0x4de kernel/panic.c:184
  __warn.cold.8+0x163/0x1b3 kernel/panic.c:536
  report_bug+0x252/0x2d0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:178 [inline]
  do_error_trap+0x1de/0x490 arch/x86/kernel/traps.c:296
  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
  invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992
RIP: 0010:md_ioctl+0x3427/0x6360 drivers/md/md.c:7188
RSP: 0018:ffff8801ad9b7400 EFLAGS: 00010293
RAX: ffff8801acf7a540 RBX: 0000000000000932 RCX: ffffffff8565f354
RDX: 0000000000000000 RSI: ffffffff85661c37 RDI: 0000000000000007
RBP: ffff8801ad9b79c0 R08: ffff8801acf7a540 R09: ffffed0039894699
R10: ffff8801ad9b73f0 R11: ffff8801cc4a34cf R12: 0000000000000932
R13: ffff8801cc4a3328 R14: ffff8801ad9b7998 R15: 0000000000000001
  __blkdev_driver_ioctl block/ioctl.c:303 [inline]
  blkdev_ioctl+0x9b6/0x2020 block/ioctl.c:601
  block_ioctl+0xee/0x130 fs/block_dev.c:1877
  vfs_ioctl fs/ioctl.c:46 [inline]
  file_ioctl fs/ioctl.c:500 [inline]
  do_vfs_ioctl+0x1cf/0x16a0 fs/ioctl.c:684
  ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701
  __do_sys_ioctl fs/ioctl.c:708 [inline]
  __se_sys_ioctl fs/ioctl.c:706 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:706
  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x445a59
RSP: 002b:00007f00dd147da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000445a59
RDX: 0000000020000080 RSI: 0000000000000932 RDI: 0000000000000003
RBP: 00000000006dac3c R08: 00007f00dd148700 R09: 0000000000000000
R10: 00007f00dd148700 R11: 0000000000000246 R12: 00000000006dac38
R13: 30646d2f7665642f R14: 00007f00dd1489c0 R15: 0000000000000003
Dumping ftrace buffer:
    (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkaller@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is  
merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
If you want to test a patch for this bug, please reply with:
#syz test: git://repo/address.git branch
and provide the patch inline or as an attachment.
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug  
report.
Note: all commands must start from beginning of the line in the email body.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, back to index

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-17 11:06 WARNING in md_ioctl Dae R. Jeong
2020-10-19  6:18 ` Song Liu
2020-10-19  7:03   ` Dae R. Jeong
2020-10-21  5:28     ` Song Liu
2020-10-22  0:24       ` Dae R. Jeong
  -- strict thread matches above, loose matches on Subject: below --
2018-04-30  1:00 syzbot
2019-11-25 22:37 ` syzbot
2019-11-26  8:42   ` Daniel Vetter

Linux-Raid Archives on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-raid/0 linux-raid/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-raid linux-raid/ https://lore.kernel.org/linux-raid \
		linux-raid@vger.kernel.org
	public-inbox-index linux-raid

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-raid


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git