All of lore.kernel.org
 help / color / mirror / Atom feed
* blk: NULL ptr deref in blk_dequeue_request()
@ 2012-09-22 20:35 Sasha Levin
  2012-10-07 18:26 ` Sasha Levin
  0 siblings, 1 reply; 14+ messages in thread
From: Sasha Levin @ 2012-09-22 20:35 UTC (permalink / raw)
  To: Tejun Heo, axboe; +Cc: Dave Jones, linux-kernel

Hi all,

While fuzzing with trinity inside a KVM tools guest running the latest linux-next kernel, I've stumbled on the following BUG.

I've also hit a similar trace where the 'BUG_ON(ELV_ON_HASH(rq));' above that list_del_init() gets hit, so I guess it's a race
condition of some sorts.


[    9.900299] BUG: unable to handle kernel NULL pointer dereference at           (null)
[    9.909508] IP: [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
[    9.910191] PGD 0
[    9.910191] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[    9.910191] Dumping ftrace buffer:
[    9.910191]    (ftrace buffer empty)
[    9.910191] CPU 2
[    9.910191] Pid: 3996, comm: kworker/u:2 Tainted: G        W    3.6.0-rc6-next-20120921-sasha-00001-geb77a39-dirty #3
[    9.910191] RIP: 0010:[<ffffffff819ea637>]  [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
[    9.910191] RSP: 0000:ffff880034e11c88  EFLAGS: 00010007
[    9.910191] RAX: 0000000000000000 RBX: ffff880034e3ec00 RCX: dead000000200200
[    9.910191] RDX: 0000000000000000 RSI: ffffffff85366998 RDI: ffff880034e3ec00
[    9.910191] RBP: ffff880034e11c88 R08: 0000000000000000 R09: ffff88001af60928
[    9.910191] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[    9.910191] R13: ffffffff85366360 R14: 0000000000000000 R15: ffffffff85b4edd0
[    9.910191] FS:  0000000000000000(0000) GS:ffff880029800000(0000) knlGS:0000000000000000
[    9.910191] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    9.910191] CR2: 0000000000000000 CR3: 0000000004c26000 CR4: 00000000000406e0
[    9.910191] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    9.910191] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    9.910191] Process kworker/u:2 (pid: 3996, threadinfo ffff880034e10000, task ffff88001af60000)
[    9.910191] Stack:
[    9.910191]  ffff880034e11ca8 ffffffff819a1a45 ffff880034e3ec00 0000000000000000
[    9.910191]  ffff880034e11cc8 ffffffff819a1ae1 0000000000000000 ffff880034e3ec00
[    9.910191]  ffff880034e11ce8 ffffffff819a271e 0000000000000000 0000000000000000
[    9.910191] Call Trace:
[    9.910191]  [<ffffffff819a1a45>] blk_dequeue_request+0x35/0xc0
[    9.910191]  [<ffffffff819a1ae1>] blk_start_request+0x11/0x40
[    9.910191]  [<ffffffff819a271e>] blk_fetch_request+0x1e/0x30
[    9.910191]  [<ffffffff81e5a89d>] redo_fd_request+0x9d/0x3f0
[    9.910191]  [<ffffffff8112a779>] process_one_work+0x3b9/0x770
[    9.910191]  [<ffffffff8112a628>] ? process_one_work+0x268/0x770
[    9.910191]  [<ffffffff81177a22>] ? get_lock_stats+0x22/0x70
[    9.910191]  [<ffffffff81e5a800>] ? start_motor+0x120/0x120
[    9.910191]  [<ffffffff8112b0fa>] worker_thread+0x2ba/0x3f0
[    9.910191]  [<ffffffff8112ae40>] ? rescuer_thread+0x2d0/0x2d0
[    9.910191]  [<ffffffff81135d83>] kthread+0xe3/0xf0
[    9.910191]  [<ffffffff81177aae>] ? put_lock_stats.isra.16+0xe/0x40
[    9.910191]  [<ffffffff81135ca0>] ? insert_kthread_work+0x90/0x90
[    9.910191]  [<ffffffff839f1e45>] kernel_thread_helper+0x5/0x10
[    9.910191]  [<ffffffff81135ca0>] ? insert_kthread_work+0x90/0x90
[    9.910191] Code: 6a 84 be 3e 00 00 00 48 c7 c7 7b d8 6a 84 31 c0 e8 8f c2 71 ff eb 2c 0f 1f 44 00 00 48 b9 00 02 20 00 00 00
ad de 48 39 c8 74 8c <4c> 8b 00 4c 39 c7 75 a6 4c 8b 42 08 4c 39 c7 75 bc 48 89 42 08
[    9.910191] RIP  [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
[    9.910191]  RSP <ffff880034e11c88>
[    9.910191] CR2: 0000000000000000


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-09-22 20:35 blk: NULL ptr deref in blk_dequeue_request() Sasha Levin
@ 2012-10-07 18:26 ` Sasha Levin
  2012-10-08 17:22   ` Jan Kara
  0 siblings, 1 reply; 14+ messages in thread
From: Sasha Levin @ 2012-10-07 18:26 UTC (permalink / raw)
  To: Tejun Heo, axboe; +Cc: Dave Jones, linux-kernel

Ping?

I'm still seeing this on linux-next.

On Sat, Sep 22, 2012 at 4:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> Hi all,
>
> While fuzzing with trinity inside a KVM tools guest running the latest linux-next kernel, I've stumbled on the following BUG.
>
> I've also hit a similar trace where the 'BUG_ON(ELV_ON_HASH(rq));' above that list_del_init() gets hit, so I guess it's a race
> condition of some sorts.
>
>
> [    9.900299] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [    9.909508] IP: [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
> [    9.910191] PGD 0
> [    9.910191] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> [    9.910191] Dumping ftrace buffer:
> [    9.910191]    (ftrace buffer empty)
> [    9.910191] CPU 2
> [    9.910191] Pid: 3996, comm: kworker/u:2 Tainted: G        W    3.6.0-rc6-next-20120921-sasha-00001-geb77a39-dirty #3
> [    9.910191] RIP: 0010:[<ffffffff819ea637>]  [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
> [    9.910191] RSP: 0000:ffff880034e11c88  EFLAGS: 00010007
> [    9.910191] RAX: 0000000000000000 RBX: ffff880034e3ec00 RCX: dead000000200200
> [    9.910191] RDX: 0000000000000000 RSI: ffffffff85366998 RDI: ffff880034e3ec00
> [    9.910191] RBP: ffff880034e11c88 R08: 0000000000000000 R09: ffff88001af60928
> [    9.910191] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
> [    9.910191] R13: ffffffff85366360 R14: 0000000000000000 R15: ffffffff85b4edd0
> [    9.910191] FS:  0000000000000000(0000) GS:ffff880029800000(0000) knlGS:0000000000000000
> [    9.910191] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    9.910191] CR2: 0000000000000000 CR3: 0000000004c26000 CR4: 00000000000406e0
> [    9.910191] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    9.910191] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [    9.910191] Process kworker/u:2 (pid: 3996, threadinfo ffff880034e10000, task ffff88001af60000)
> [    9.910191] Stack:
> [    9.910191]  ffff880034e11ca8 ffffffff819a1a45 ffff880034e3ec00 0000000000000000
> [    9.910191]  ffff880034e11cc8 ffffffff819a1ae1 0000000000000000 ffff880034e3ec00
> [    9.910191]  ffff880034e11ce8 ffffffff819a271e 0000000000000000 0000000000000000
> [    9.910191] Call Trace:
> [    9.910191]  [<ffffffff819a1a45>] blk_dequeue_request+0x35/0xc0
> [    9.910191]  [<ffffffff819a1ae1>] blk_start_request+0x11/0x40
> [    9.910191]  [<ffffffff819a271e>] blk_fetch_request+0x1e/0x30
> [    9.910191]  [<ffffffff81e5a89d>] redo_fd_request+0x9d/0x3f0
> [    9.910191]  [<ffffffff8112a779>] process_one_work+0x3b9/0x770
> [    9.910191]  [<ffffffff8112a628>] ? process_one_work+0x268/0x770
> [    9.910191]  [<ffffffff81177a22>] ? get_lock_stats+0x22/0x70
> [    9.910191]  [<ffffffff81e5a800>] ? start_motor+0x120/0x120
> [    9.910191]  [<ffffffff8112b0fa>] worker_thread+0x2ba/0x3f0
> [    9.910191]  [<ffffffff8112ae40>] ? rescuer_thread+0x2d0/0x2d0
> [    9.910191]  [<ffffffff81135d83>] kthread+0xe3/0xf0
> [    9.910191]  [<ffffffff81177aae>] ? put_lock_stats.isra.16+0xe/0x40
> [    9.910191]  [<ffffffff81135ca0>] ? insert_kthread_work+0x90/0x90
> [    9.910191]  [<ffffffff839f1e45>] kernel_thread_helper+0x5/0x10
> [    9.910191]  [<ffffffff81135ca0>] ? insert_kthread_work+0x90/0x90
> [    9.910191] Code: 6a 84 be 3e 00 00 00 48 c7 c7 7b d8 6a 84 31 c0 e8 8f c2 71 ff eb 2c 0f 1f 44 00 00 48 b9 00 02 20 00 00 00
> ad de 48 39 c8 74 8c <4c> 8b 00 4c 39 c7 75 a6 4c 8b 42 08 4c 39 c7 75 bc 48 89 42 08
> [    9.910191] RIP  [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
> [    9.910191]  RSP <ffff880034e11c88>
> [    9.910191] CR2: 0000000000000000
>
>
> Thanks,
> Sasha

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-07 18:26 ` Sasha Levin
@ 2012-10-08 17:22   ` Jan Kara
  2012-10-08 21:45     ` Jiri Kosina
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kara @ 2012-10-08 17:22 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Tejun Heo, axboe, Dave Jones, linux-kernel, jkosina

On Sun 07-10-12 14:26:42, Sasha Levin wrote:
> Ping?
> 
> I'm still seeing this on linux-next.
  I think this is floppy related (see redo_fd_request() in the stack
trace). And there were quite some changes to the area recently. Adding
maintainer to CC.

									Honza

> On Sat, Sep 22, 2012 at 4:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > Hi all,
> >
> > While fuzzing with trinity inside a KVM tools guest running the latest linux-next kernel, I've stumbled on the following BUG.
> >
> > I've also hit a similar trace where the 'BUG_ON(ELV_ON_HASH(rq));' above that list_del_init() gets hit, so I guess it's a race
> > condition of some sorts.
> >
> >
> > [    9.900299] BUG: unable to handle kernel NULL pointer dereference at           (null)
> > [    9.909508] IP: [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
> > [    9.910191] PGD 0
> > [    9.910191] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > [    9.910191] Dumping ftrace buffer:
> > [    9.910191]    (ftrace buffer empty)
> > [    9.910191] CPU 2
> > [    9.910191] Pid: 3996, comm: kworker/u:2 Tainted: G        W    3.6.0-rc6-next-20120921-sasha-00001-geb77a39-dirty #3
> > [    9.910191] RIP: 0010:[<ffffffff819ea637>]  [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
> > [    9.910191] RSP: 0000:ffff880034e11c88  EFLAGS: 00010007
> > [    9.910191] RAX: 0000000000000000 RBX: ffff880034e3ec00 RCX: dead000000200200
> > [    9.910191] RDX: 0000000000000000 RSI: ffffffff85366998 RDI: ffff880034e3ec00
> > [    9.910191] RBP: ffff880034e11c88 R08: 0000000000000000 R09: ffff88001af60928
> > [    9.910191] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
> > [    9.910191] R13: ffffffff85366360 R14: 0000000000000000 R15: ffffffff85b4edd0
> > [    9.910191] FS:  0000000000000000(0000) GS:ffff880029800000(0000) knlGS:0000000000000000
> > [    9.910191] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    9.910191] CR2: 0000000000000000 CR3: 0000000004c26000 CR4: 00000000000406e0
> > [    9.910191] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [    9.910191] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [    9.910191] Process kworker/u:2 (pid: 3996, threadinfo ffff880034e10000, task ffff88001af60000)
> > [    9.910191] Stack:
> > [    9.910191]  ffff880034e11ca8 ffffffff819a1a45 ffff880034e3ec00 0000000000000000
> > [    9.910191]  ffff880034e11cc8 ffffffff819a1ae1 0000000000000000 ffff880034e3ec00
> > [    9.910191]  ffff880034e11ce8 ffffffff819a271e 0000000000000000 0000000000000000
> > [    9.910191] Call Trace:
> > [    9.910191]  [<ffffffff819a1a45>] blk_dequeue_request+0x35/0xc0
> > [    9.910191]  [<ffffffff819a1ae1>] blk_start_request+0x11/0x40
> > [    9.910191]  [<ffffffff819a271e>] blk_fetch_request+0x1e/0x30
> > [    9.910191]  [<ffffffff81e5a89d>] redo_fd_request+0x9d/0x3f0
> > [    9.910191]  [<ffffffff8112a779>] process_one_work+0x3b9/0x770
> > [    9.910191]  [<ffffffff8112a628>] ? process_one_work+0x268/0x770
> > [    9.910191]  [<ffffffff81177a22>] ? get_lock_stats+0x22/0x70
> > [    9.910191]  [<ffffffff81e5a800>] ? start_motor+0x120/0x120
> > [    9.910191]  [<ffffffff8112b0fa>] worker_thread+0x2ba/0x3f0
> > [    9.910191]  [<ffffffff8112ae40>] ? rescuer_thread+0x2d0/0x2d0
> > [    9.910191]  [<ffffffff81135d83>] kthread+0xe3/0xf0
> > [    9.910191]  [<ffffffff81177aae>] ? put_lock_stats.isra.16+0xe/0x40
> > [    9.910191]  [<ffffffff81135ca0>] ? insert_kthread_work+0x90/0x90
> > [    9.910191]  [<ffffffff839f1e45>] kernel_thread_helper+0x5/0x10
> > [    9.910191]  [<ffffffff81135ca0>] ? insert_kthread_work+0x90/0x90
> > [    9.910191] Code: 6a 84 be 3e 00 00 00 48 c7 c7 7b d8 6a 84 31 c0 e8 8f c2 71 ff eb 2c 0f 1f 44 00 00 48 b9 00 02 20 00 00 00
> > ad de 48 39 c8 74 8c <4c> 8b 00 4c 39 c7 75 a6 4c 8b 42 08 4c 39 c7 75 bc 48 89 42 08
> > [    9.910191] RIP  [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
> > [    9.910191]  RSP <ffff880034e11c88>
> > [    9.910191] CR2: 0000000000000000
> >
> >
> > Thanks,
> > Sasha
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-08 17:22   ` Jan Kara
@ 2012-10-08 21:45     ` Jiri Kosina
  2012-10-09 13:21       ` Sasha Levin
  0 siblings, 1 reply; 14+ messages in thread
From: Jiri Kosina @ 2012-10-08 21:45 UTC (permalink / raw)
  To: Jan Kara; +Cc: Sasha Levin, Tejun Heo, axboe, Dave Jones, linux-kernel

On Mon, 8 Oct 2012, Jan Kara wrote:

> > I'm still seeing this on linux-next.
>   I think this is floppy related (see redo_fd_request() in the stack
> trace). And there were quite some changes to the area recently. Adding
> maintainer to CC.

Hmm ... I don't immediately see how this is happening.

Sasha, could you please do git bisect on drivers/block/floppy.c between 
f6365201d and your git HEAD for starters (assuming that f6365201d works 
well for you?).

> > On Sat, Sep 22, 2012 at 4:35 PM, Sasha Levin <levinsasha928@gmail.com> wrote:
> > > Hi all,
> > >
> > > While fuzzing with trinity inside a KVM tools guest running the latest linux-next kernel, I've stumbled on the following BUG.
> > >
> > > I've also hit a similar trace where the 'BUG_ON(ELV_ON_HASH(rq));' above that list_del_init() gets hit, so I guess it's a race
> > > condition of some sorts.
> > >
> > >
> > > [    9.900299] BUG: unable to handle kernel NULL pointer dereference at           (null)
> > > [    9.909508] IP: [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
> > > [    9.910191] PGD 0
> > > [    9.910191] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > > [    9.910191] Dumping ftrace buffer:
> > > [    9.910191]    (ftrace buffer empty)
> > > [    9.910191] CPU 2
> > > [    9.910191] Pid: 3996, comm: kworker/u:2 Tainted: G        W    3.6.0-rc6-next-20120921-sasha-00001-geb77a39-dirty #3
> > > [    9.910191] RIP: 0010:[<ffffffff819ea637>]  [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
> > > [    9.910191] RSP: 0000:ffff880034e11c88  EFLAGS: 00010007
> > > [    9.910191] RAX: 0000000000000000 RBX: ffff880034e3ec00 RCX: dead000000200200
> > > [    9.910191] RDX: 0000000000000000 RSI: ffffffff85366998 RDI: ffff880034e3ec00
> > > [    9.910191] RBP: ffff880034e11c88 R08: 0000000000000000 R09: ffff88001af60928
> > > [    9.910191] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
> > > [    9.910191] R13: ffffffff85366360 R14: 0000000000000000 R15: ffffffff85b4edd0
> > > [    9.910191] FS:  0000000000000000(0000) GS:ffff880029800000(0000) knlGS:0000000000000000
> > > [    9.910191] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [    9.910191] CR2: 0000000000000000 CR3: 0000000004c26000 CR4: 00000000000406e0
> > > [    9.910191] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > [    9.910191] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > [    9.910191] Process kworker/u:2 (pid: 3996, threadinfo ffff880034e10000, task ffff88001af60000)
> > > [    9.910191] Stack:
> > > [    9.910191]  ffff880034e11ca8 ffffffff819a1a45 ffff880034e3ec00 0000000000000000
> > > [    9.910191]  ffff880034e11cc8 ffffffff819a1ae1 0000000000000000 ffff880034e3ec00
> > > [    9.910191]  ffff880034e11ce8 ffffffff819a271e 0000000000000000 0000000000000000
> > > [    9.910191] Call Trace:

Seems like the queue obtained in set_next_request() through

	q = unit[fdc_queue].gendisk->queue;

is not a proper one. I am currently not sure why.

> > > [    9.910191]  [<ffffffff819a1a45>] blk_dequeue_request+0x35/0xc0
> > > [    9.910191]  [<ffffffff819a1ae1>] blk_start_request+0x11/0x40
> > > [    9.910191]  [<ffffffff819a271e>] blk_fetch_request+0x1e/0x30
> > > [    9.910191]  [<ffffffff81e5a89d>] redo_fd_request+0x9d/0x3f0
> > > [    9.910191]  [<ffffffff8112a779>] process_one_work+0x3b9/0x770
> > > [    9.910191]  [<ffffffff8112a628>] ? process_one_work+0x268/0x770
> > > [    9.910191]  [<ffffffff81177a22>] ? get_lock_stats+0x22/0x70
> > > [    9.910191]  [<ffffffff81e5a800>] ? start_motor+0x120/0x120
> > > [    9.910191]  [<ffffffff8112b0fa>] worker_thread+0x2ba/0x3f0
> > > [    9.910191]  [<ffffffff8112ae40>] ? rescuer_thread+0x2d0/0x2d0
> > > [    9.910191]  [<ffffffff81135d83>] kthread+0xe3/0xf0
> > > [    9.910191]  [<ffffffff81177aae>] ? put_lock_stats.isra.16+0xe/0x40
> > > [    9.910191]  [<ffffffff81135ca0>] ? insert_kthread_work+0x90/0x90
> > > [    9.910191]  [<ffffffff839f1e45>] kernel_thread_helper+0x5/0x10
> > > [    9.910191]  [<ffffffff81135ca0>] ? insert_kthread_work+0x90/0x90
> > > [    9.910191] Code: 6a 84 be 3e 00 00 00 48 c7 c7 7b d8 6a 84 31 c0 e8 8f c2 71 ff eb 2c 0f 1f 44 00 00 48 b9 00 02 20 00 00 00
> > > ad de 48 39 c8 74 8c <4c> 8b 00 4c 39 c7 75 a6 4c 8b 42 08 4c 39 c7 75 bc 48 89 42 08
> > > [    9.910191] RIP  [<ffffffff819ea637>] __list_del_entry+0xb7/0xe0
> > > [    9.910191]  RSP <ffff880034e11c88>
> > > [    9.910191] CR2: 0000000000000000
> > >
> > >
> > > Thanks,
> > > Sasha

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-08 21:45     ` Jiri Kosina
@ 2012-10-09 13:21       ` Sasha Levin
  2012-10-09 13:26         ` Sasha Levin
  0 siblings, 1 reply; 14+ messages in thread
From: Sasha Levin @ 2012-10-09 13:21 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: Jan Kara, Tejun Heo, axboe, Dave Jones, linux-kernel, ben

On 10/08/2012 05:45 PM, Jiri Kosina wrote:
> On Mon, 8 Oct 2012, Jan Kara wrote:
> 
>>> > > I'm still seeing this on linux-next.
>> >   I think this is floppy related (see redo_fd_request() in the stack
>> > trace). And there were quite some changes to the area recently. Adding
>> > maintainer to CC.
> Hmm ... I don't immediately see how this is happening.
> 
> Sasha, could you please do git bisect on drivers/block/floppy.c between 
> f6365201d and your git HEAD for starters (assuming that f6365201d works 
> well for you?).
> 

A bisect on floppy.c yielded the following:

b33d002f4b6bae912463e5a66387c498aa69b6fe is the first bad commit
commit b33d002f4b6bae912463e5a66387c498aa69b6fe
Author: Ben Hutchings <ben@decadent.org.uk>
Date:   Mon Aug 27 20:56:53 2012 -0300

    genhd: Make put_disk() safe for disks that have not been registered



Thanks,
Sasha

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-09 13:21       ` Sasha Levin
@ 2012-10-09 13:26         ` Sasha Levin
  2012-10-10 15:52           ` Ben Hutchings
  0 siblings, 1 reply; 14+ messages in thread
From: Sasha Levin @ 2012-10-09 13:26 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: Jan Kara, Tejun Heo, axboe, Dave Jones, linux-kernel, ben

On 10/09/2012 09:21 AM, Sasha Levin wrote:
> On 10/08/2012 05:45 PM, Jiri Kosina wrote:
>> On Mon, 8 Oct 2012, Jan Kara wrote:
>>
>>>>>> I'm still seeing this on linux-next.
>>>>   I think this is floppy related (see redo_fd_request() in the stack
>>>> trace). And there were quite some changes to the area recently. Adding
>>>> maintainer to CC.
>> Hmm ... I don't immediately see how this is happening.
>>
>> Sasha, could you please do git bisect on drivers/block/floppy.c between 
>> f6365201d and your git HEAD for starters (assuming that f6365201d works 
>> well for you?).
>>
> 
> A bisect on floppy.c yielded the following:
> 
> b33d002f4b6bae912463e5a66387c498aa69b6fe is the first bad commit
> commit b33d002f4b6bae912463e5a66387c498aa69b6fe
> Author: Ben Hutchings <ben@decadent.org.uk>
> Date:   Mon Aug 27 20:56:53 2012 -0300
> 
>     genhd: Make put_disk() safe for disks that have not been registered

2 more things:

 1. The guest vm which I'm testing on doesn't emulate anything which even looks like a floppy.
 2. I'm seeing the following lines before the BUG:

[    9.836604] floppy0: no floppy controllers found
[    9.837246] work still pending
[    9.837743] floppy0: floppy_shutdown: timeout handler died.


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-09 13:26         ` Sasha Levin
@ 2012-10-10 15:52           ` Ben Hutchings
  2012-10-12 14:51             ` Jiri Kosina
  2012-10-12 17:55             ` Sasha Levin
  0 siblings, 2 replies; 14+ messages in thread
From: Ben Hutchings @ 2012-10-10 15:52 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Jiri Kosina, Jan Kara, Tejun Heo, axboe, Dave Jones, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4591 bytes --]

On Tue, 2012-10-09 at 09:26 -0400, Sasha Levin wrote:
> On 10/09/2012 09:21 AM, Sasha Levin wrote:
> > On 10/08/2012 05:45 PM, Jiri Kosina wrote:
> >> On Mon, 8 Oct 2012, Jan Kara wrote:
> >>
> >>>>>> I'm still seeing this on linux-next.
> >>>>   I think this is floppy related (see redo_fd_request() in the stack
> >>>> trace). And there were quite some changes to the area recently. Adding
> >>>> maintainer to CC.
> >> Hmm ... I don't immediately see how this is happening.
> >>
> >> Sasha, could you please do git bisect on drivers/block/floppy.c between 
> >> f6365201d and your git HEAD for starters (assuming that f6365201d works 
> >> well for you?).
> >>
> > 
> > A bisect on floppy.c yielded the following:
> > 
> > b33d002f4b6bae912463e5a66387c498aa69b6fe is the first bad commit
> > commit b33d002f4b6bae912463e5a66387c498aa69b6fe
> > Author: Ben Hutchings <ben@decadent.org.uk>
> > Date:   Mon Aug 27 20:56:53 2012 -0300
> > 
> >     genhd: Make put_disk() safe for disks that have not been registered
> 
> 2 more things:
> 
>  1. The guest vm which I'm testing on doesn't emulate anything which even looks like a floppy.
>  2. I'm seeing the following lines before the BUG:
> 
> [    9.836604] floppy0: no floppy controllers found
> [    9.837246] work still pending
> [    9.837743] floppy0: floppy_shutdown: timeout handler died.

I see two problems:

1. redo_fd_request() races with tear-down of the disks, but because
set_next_request() checks disk->queue before doing anything this was
usually harmless.  Now that do_floppy_init() doesn't clear disk->queue,
the race condition is much easier to hit.  This may fix that problem in
do_floppy_init(), though there appear to be worse bugs in tear-down
order in floppy_module_exit():

--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -4320,13 +4320,13 @@ out_unreg_region:
 out_unreg_blkdev:
 	unregister_blkdev(FLOPPY_MAJOR, "fd");
 out_put_disk:
+	destroy_workqueue(floppy_wq);
 	while (dr--) {
 		del_timer_sync(&motor_off_timer[dr]);
 		if (disks[dr]->queue)
 			blk_cleanup_queue(disks[dr]->queue);
 		put_disk(disks[dr]);
 	}
-	destroy_workqueue(floppy_wq);
 	return err;
 }
 
--- END ---

2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
cleared by del_gendisk().  Incremental patch below, but it should be
squashed into the previous patch if that branch is still rebase-able.

Ben.

---
From: Ben Hutchings <ben@decadent.org.uk>
Date: Wed, 10 Oct 2012 16:17:01 +0100
Subject: [PATCH] genhd: Make put_disk() safe again for disks that *have* been
 registered

Commit b33d002 ('genhd: Make put_disk() safe for disks that have not
been registered') wrongly used the GENHD_FL_UP flag to test whether a
disk held a reference to its queue.  Since this is cleared by
del_gendisk(), queues will not be properly cleaned up if a disk has
been registered and then torn down in the normal way.  Introduce a
new flag for this purpose.

Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 block/genhd.c         |    7 +++++--
 include/linux/genhd.h |    1 +
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/genhd.c b/block/genhd.c
index 633751d..b5f482f 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -617,7 +617,10 @@ void add_disk(struct gendisk *disk)
 	 * Take an extra ref on queue which will be put on disk_release()
 	 * so that it sticks around as long as @disk is there.
 	 */
-	WARN_ON_ONCE(!blk_get_queue(disk->queue));
+	if (blk_get_queue(disk->queue))
+		disk->flags |= GENHD_FL_GOT_QUEUE;
+	else
+		WARN_ON(1);
 
 	retval = sysfs_create_link(&disk_to_dev(disk)->kobj, &bdi->dev->kobj,
 				   "bdi");
@@ -1105,7 +1108,7 @@ static void disk_release(struct device *dev)
 	disk_replace_part_tbl(disk, NULL);
 	free_part_stats(&disk->part0);
 	free_part_info(&disk->part0);
-	if (disk->queue && disk->flags & GENHD_FL_UP)
+	if (disk->queue && disk->flags & GENHD_FL_GOT_QUEUE)
 		blk_put_queue(disk->queue);
 	kfree(disk);
 }
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 4f440b3..7c2560c 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -134,6 +134,7 @@ struct hd_struct {
 #define GENHD_FL_NATIVE_CAPACITY		128
 #define GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE	256
 #define GENHD_FL_NO_PART_SCAN			512
+#define GENHD_FL_GOT_QUEUE			1024
 
 enum {
 	DISK_EVENT_MEDIA_CHANGE			= 1 << 0, /* media changed */


-- 
Ben Hutchings
Who are all these weirdos? - David Bowie, about L-Space IRC channel #afp

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-10 15:52           ` Ben Hutchings
@ 2012-10-12 14:51             ` Jiri Kosina
  2012-10-12 17:55             ` Sasha Levin
  1 sibling, 0 replies; 14+ messages in thread
From: Jiri Kosina @ 2012-10-12 14:51 UTC (permalink / raw)
  To: Ben Hutchings, Sasha Levin
  Cc: Jan Kara, Tejun Heo, Jens Axboe, Dave Jones, linux-kernel

On Wed, 10 Oct 2012, Ben Hutchings wrote:

> > >>>>>> I'm still seeing this on linux-next.
> > >>>>   I think this is floppy related (see redo_fd_request() in the stack
> > >>>> trace). And there were quite some changes to the area recently. Adding
> > >>>> maintainer to CC.
> > >> Hmm ... I don't immediately see how this is happening.
> > >>
> > >> Sasha, could you please do git bisect on drivers/block/floppy.c between 
> > >> f6365201d and your git HEAD for starters (assuming that f6365201d works 
> > >> well for you?).
> > >>
> > > 
> > > A bisect on floppy.c yielded the following:
> > > 
> > > b33d002f4b6bae912463e5a66387c498aa69b6fe is the first bad commit
> > > commit b33d002f4b6bae912463e5a66387c498aa69b6fe
> > > Author: Ben Hutchings <ben@decadent.org.uk>
> > > Date:   Mon Aug 27 20:56:53 2012 -0300
> > > 
> > >     genhd: Make put_disk() safe for disks that have not been registered
> > 
> > 2 more things:
> > 
> >  1. The guest vm which I'm testing on doesn't emulate anything which even looks like a floppy.
> >  2. I'm seeing the following lines before the BUG:
> > 
> > [    9.836604] floppy0: no floppy controllers found
> > [    9.837246] work still pending
> > [    9.837743] floppy0: floppy_shutdown: timeout handler died.
> 
> I see two problems:
> 
> 1. redo_fd_request() races with tear-down of the disks, but because
> set_next_request() checks disk->queue before doing anything this was
> usually harmless.  Now that do_floppy_init() doesn't clear disk->queue,
> the race condition is much easier to hit.  This may fix that problem in
> do_floppy_init(), though there appear to be worse bugs in tear-down
> order in floppy_module_exit():
> 
> --- a/drivers/block/floppy.c
> +++ b/drivers/block/floppy.c
> @@ -4320,13 +4320,13 @@ out_unreg_region:
>  out_unreg_blkdev:
>  	unregister_blkdev(FLOPPY_MAJOR, "fd");
>  out_put_disk:
> +	destroy_workqueue(floppy_wq);
>  	while (dr--) {
>  		del_timer_sync(&motor_off_timer[dr]);
>  		if (disks[dr]->queue)
>  			blk_cleanup_queue(disks[dr]->queue);
>  		put_disk(disks[dr]);
>  	}
> -	destroy_workqueue(floppy_wq);
>  	return err;
>  }
>  
> --- END ---
> 
> 2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
> cleared by del_gendisk().  Incremental patch below, but it should be
> squashed into the previous patch if that branch is still rebase-able.

Sasha,

did you manage to test this to see if it fixes the symptom you are seeing, 
please?

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-10 15:52           ` Ben Hutchings
  2012-10-12 14:51             ` Jiri Kosina
@ 2012-10-12 17:55             ` Sasha Levin
  2012-10-17  1:29               ` Ben Hutchings
  2012-10-17  1:30               ` Ben Hutchings
  1 sibling, 2 replies; 14+ messages in thread
From: Sasha Levin @ 2012-10-12 17:55 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Jiri Kosina, Jan Kara, Tejun Heo, axboe, Dave Jones, linux-kernel

Hi Ben,

On Wed, Oct 10, 2012 at 11:52 AM, Ben Hutchings <ben@decadent.org.uk> wrote:
> On Tue, 2012-10-09 at 09:26 -0400, Sasha Levin wrote:
>> On 10/09/2012 09:21 AM, Sasha Levin wrote:
>> > On 10/08/2012 05:45 PM, Jiri Kosina wrote:
>> >> On Mon, 8 Oct 2012, Jan Kara wrote:
>> >>
>> >>>>>> I'm still seeing this on linux-next.
>> >>>>   I think this is floppy related (see redo_fd_request() in the stack
>> >>>> trace). And there were quite some changes to the area recently. Adding
>> >>>> maintainer to CC.
>> >> Hmm ... I don't immediately see how this is happening.
>> >>
>> >> Sasha, could you please do git bisect on drivers/block/floppy.c between
>> >> f6365201d and your git HEAD for starters (assuming that f6365201d works
>> >> well for you?).
>> >>
>> >
>> > A bisect on floppy.c yielded the following:
>> >
>> > b33d002f4b6bae912463e5a66387c498aa69b6fe is the first bad commit
>> > commit b33d002f4b6bae912463e5a66387c498aa69b6fe
>> > Author: Ben Hutchings <ben@decadent.org.uk>
>> > Date:   Mon Aug 27 20:56:53 2012 -0300
>> >
>> >     genhd: Make put_disk() safe for disks that have not been registered
>>
>> 2 more things:
>>
>>  1. The guest vm which I'm testing on doesn't emulate anything which even looks like a floppy.
>>  2. I'm seeing the following lines before the BUG:
>>
>> [    9.836604] floppy0: no floppy controllers found
>> [    9.837246] work still pending
>> [    9.837743] floppy0: floppy_shutdown: timeout handler died.
>
> I see two problems:
>
> 1. redo_fd_request() races with tear-down of the disks, but because
> set_next_request() checks disk->queue before doing anything this was
> usually harmless.  Now that do_floppy_init() doesn't clear disk->queue,
> the race condition is much easier to hit.  This may fix that problem in
> do_floppy_init(), though there appear to be worse bugs in tear-down
> order in floppy_module_exit():
>
> --- a/drivers/block/floppy.c
> +++ b/drivers/block/floppy.c
> @@ -4320,13 +4320,13 @@ out_unreg_region:
>  out_unreg_blkdev:
>         unregister_blkdev(FLOPPY_MAJOR, "fd");
>  out_put_disk:
> +       destroy_workqueue(floppy_wq);
>         while (dr--) {
>                 del_timer_sync(&motor_off_timer[dr]);
>                 if (disks[dr]->queue)
>                         blk_cleanup_queue(disks[dr]->queue);
>                 put_disk(disks[dr]);
>         }
> -       destroy_workqueue(floppy_wq);
>         return err;
>  }
>
> --- END ---
>
> 2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
> cleared by del_gendisk().  Incremental patch below, but it should be
> squashed into the previous patch if that branch is still rebase-able.
>
> Ben.
>
> ---
> From: Ben Hutchings <ben@decadent.org.uk>
> Date: Wed, 10 Oct 2012 16:17:01 +0100
> Subject: [PATCH] genhd: Make put_disk() safe again for disks that *have* been
>  registered
>
> Commit b33d002 ('genhd: Make put_disk() safe for disks that have not
> been registered') wrongly used the GENHD_FL_UP flag to test whether a
> disk held a reference to its queue.  Since this is cleared by
> del_gendisk(), queues will not be properly cleaned up if a disk has
> been registered and then torn down in the normal way.  Introduce a
> new flag for this purpose.
>
> Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
> ---
>  block/genhd.c         |    7 +++++--
>  include/linux/genhd.h |    1 +
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/block/genhd.c b/block/genhd.c
> index 633751d..b5f482f 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -617,7 +617,10 @@ void add_disk(struct gendisk *disk)
>          * Take an extra ref on queue which will be put on disk_release()
>          * so that it sticks around as long as @disk is there.
>          */
> -       WARN_ON_ONCE(!blk_get_queue(disk->queue));
> +       if (blk_get_queue(disk->queue))
> +               disk->flags |= GENHD_FL_GOT_QUEUE;
> +       else
> +               WARN_ON(1);
>
>         retval = sysfs_create_link(&disk_to_dev(disk)->kobj, &bdi->dev->kobj,
>                                    "bdi");
> @@ -1105,7 +1108,7 @@ static void disk_release(struct device *dev)
>         disk_replace_part_tbl(disk, NULL);
>         free_part_stats(&disk->part0);
>         free_part_info(&disk->part0);
> -       if (disk->queue && disk->flags & GENHD_FL_UP)
> +       if (disk->queue && disk->flags & GENHD_FL_GOT_QUEUE)
>                 blk_put_queue(disk->queue);
>         kfree(disk);
>  }
> diff --git a/include/linux/genhd.h b/include/linux/genhd.h
> index 4f440b3..7c2560c 100644
> --- a/include/linux/genhd.h
> +++ b/include/linux/genhd.h
> @@ -134,6 +134,7 @@ struct hd_struct {
>  #define GENHD_FL_NATIVE_CAPACITY               128
>  #define GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE    256
>  #define GENHD_FL_NO_PART_SCAN                  512
> +#define GENHD_FL_GOT_QUEUE                     1024
>
>  enum {
>         DISK_EVENT_MEDIA_CHANGE                 = 1 << 0, /* media changed */

I'm now seeing these instead:

[   34.823972] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[   34.830888] Dumping ftrace buffer:
[   34.830888]    (ftrace buffer empty)
[   34.830888] CPU 5
[   34.830888] Pid: 6, comm: kworker/u:0 Tainted: G        W
3.6.0-next-20121012-sasha-00002-gae01a05-dirty #650
[   34.830888] RIP: 0010:[<ffffffff8112dfe8>]  [<ffffffff8112dfe8>]
flush_workqueue_prep_cwqs+0xf8/0x260
[   34.830888] RSP: 0000:ffff8801bf059a58  EFLAGS: 00010287
[   34.830888] RAX: 0000000000000000 RBX: ffff100b833d8000 RCX: 0000000000000000
[   34.830888] RDX: 0000000000000000 RSI: 0000000000000078 RDI: 0000000000000078
[   34.830888] RBP: ffff8801bf059aa8 R08: ffffffff858bb800 R09: 0000000000000000
[   34.830888] R10: 2222222222222222 R11: 0000000000000078 R12: ffff8809c1610600
[   34.830888] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000002
[   34.830888] FS:  0000000000000000(0000) GS:ffff8809c4000000(0000)
knlGS:0000000000000000
[   34.830888] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   34.830888] CR2: 00000000ffffffff CR3: 0000000004e25000 CR4: 00000000000006e0
[   34.830888] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   34.830888] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   34.830888] Process kworker/u:0 (pid: 6, threadinfo
ffff8801bf058000, task ffff8801bf053000)
[   34.830888] Stack:
[   34.830888]  ffff8801bf059a88 00ff8809c1610620 0000000000000006
ffff8809c16106d0
[   34.830888]  ffffffff8480a53f ffff8809c1610600 ffff8801bf059ae8
0000000000000003
[   34.830888]  ffff8809c16106f0 ffff8809c1610620 ffff8801bf059c08
ffffffff8112e3ba
[   34.830888] Call Trace:
[   34.830888]  [<ffffffff8112e3ba>] flush_workqueue+0x26a/0x5c0
[   34.991665]  [<ffffffff8112e150>] ? flush_workqueue_prep_cwqs+0x260/0x260
[   34.991665]  [<ffffffff8112f9c0>] drain_workqueue+0x70/0x270
[   34.991665]  [<ffffffff819d1a25>] ? kobject_cleanup+0x145/0x190
[   34.991665]  [<ffffffff8112fbd3>] destroy_workqueue+0x13/0x200
[   34.991665]  [<ffffffff85b038dc>] do_floppy_init+0x672/0x70c
[   34.991665]  [<ffffffff85b0397f>] floppy_async_init+0x9/0xb
[   34.991665]  [<ffffffff81143f5b>] async_run_entry_fn+0xab/0x180
[   34.991665]  [<ffffffff8112ec46>] process_one_work+0x386/0x570
[   34.991665]  [<ffffffff8112eb18>] ? process_one_work+0x258/0x570
[   34.991665]  [<ffffffff81143eb0>] ? async_schedule+0x20/0x20
[   34.991665]  [<ffffffff8113062a>] worker_thread+0x20a/0x340
[   34.991665]  [<ffffffff81130420>] ? manage_workers+0x160/0x160
[   34.991665]  [<ffffffff81139c52>] kthread+0xe2/0xf0
[   34.991665]  [<ffffffff8118386a>] ? __lock_release+0x1ba/0x1d0
[   34.991665]  [<ffffffff81139b70>] ? __init_kthread_worker+0x70/0x70
[   34.991665]  [<ffffffff83a645bc>] ret_from_fork+0x7c/0x90
[   34.991665]  [<ffffffff81139b70>] ? __init_kthread_worker+0x70/0x70
[   34.991665] Code: 5c 24 08 44 89 f0 48 03 1c c5 00 13 8b 85 eb 1b
0f 1f 00 41 81 fe 00 10 00 00 75 07 49 8b 5c 24 08 eb 08 31 db 66 0f
1f 44 00 00 <48> 8b 03 48 8b 08 48 89 cf 48 89 4d b0 e8 46 4c 93 02 45
85 ff
[   34.991665] RIP  [<ffffffff8112dfe8>] flush_workqueue_prep_cwqs+0xf8/0x260
[   34.991665]  RSP <ffff8801bf059a58>
[   35.151058] ---[ end trace 48a38e4c9e8f037d ]---

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-12 17:55             ` Sasha Levin
@ 2012-10-17  1:29               ` Ben Hutchings
  2012-10-17 14:11                 ` Jiri Kosina
  2012-10-17  1:30               ` Ben Hutchings
  1 sibling, 1 reply; 14+ messages in thread
From: Ben Hutchings @ 2012-10-17  1:29 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Jiri Kosina, Jan Kara, Tejun Heo, axboe, Dave Jones, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2244 bytes --]

On Fri, 2012-10-12 at 13:55 -0400, Sasha Levin wrote:
> Hi Ben,
> 
> On Wed, Oct 10, 2012 at 11:52 AM, Ben Hutchings <ben@decadent.org.uk> wrote:
> > On Tue, 2012-10-09 at 09:26 -0400, Sasha Levin wrote:
> >> On 10/09/2012 09:21 AM, Sasha Levin wrote:
> >> > On 10/08/2012 05:45 PM, Jiri Kosina wrote:
> >> >> On Mon, 8 Oct 2012, Jan Kara wrote:
> >> >>
> >> >>>>>> I'm still seeing this on linux-next.
> >> >>>>   I think this is floppy related (see redo_fd_request() in the stack
> >> >>>> trace). And there were quite some changes to the area recently. Adding
> >> >>>> maintainer to CC.
> >> >> Hmm ... I don't immediately see how this is happening.
> >> >>
> >> >> Sasha, could you please do git bisect on drivers/block/floppy.c between
> >> >> f6365201d and your git HEAD for starters (assuming that f6365201d works
> >> >> well for you?).
> >> >>
> >> >
> >> > A bisect on floppy.c yielded the following:
> >> >
> >> > b33d002f4b6bae912463e5a66387c498aa69b6fe is the first bad commit
> >> > commit b33d002f4b6bae912463e5a66387c498aa69b6fe
> >> > Author: Ben Hutchings <ben@decadent.org.uk>
> >> > Date:   Mon Aug 27 20:56:53 2012 -0300
> >> >
> >> >     genhd: Make put_disk() safe for disks that have not been registered
[...]
> > I see two problems:
> >
> > 1. redo_fd_request() races with tear-down of the disks, but because
> > set_next_request() checks disk->queue before doing anything this was
> > usually harmless.  Now that do_floppy_init() doesn't clear disk->queue,
> > the race condition is much easier to hit.  This may fix that problem in
> > do_floppy_init(), though there appear to be worse bugs in tear-down
> > order in floppy_module_exit():
[...]
> > 2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
> > cleared by del_gendisk().  Incremental patch below, but it should be
> > squashed into the previous patch if that branch is still rebase-able.
[...]
> I'm now seeing these instead:
[...]

Sorry, I'm not going to spend more time in the quagmire of the floppy
driver.  Whoever has this commit in their tree, please revert or drop it
as appropriate.

Ben.

-- 
Ben Hutchings
No political challenge can be met by shopping. - George Monbiot

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-12 17:55             ` Sasha Levin
  2012-10-17  1:29               ` Ben Hutchings
@ 2012-10-17  1:30               ` Ben Hutchings
  1 sibling, 0 replies; 14+ messages in thread
From: Ben Hutchings @ 2012-10-17  1:30 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Jiri Kosina, Jan Kara, Tejun Heo, axboe, Dave Jones, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2244 bytes --]

On Fri, 2012-10-12 at 13:55 -0400, Sasha Levin wrote:
> Hi Ben,
> 
> On Wed, Oct 10, 2012 at 11:52 AM, Ben Hutchings <ben@decadent.org.uk> wrote:
> > On Tue, 2012-10-09 at 09:26 -0400, Sasha Levin wrote:
> >> On 10/09/2012 09:21 AM, Sasha Levin wrote:
> >> > On 10/08/2012 05:45 PM, Jiri Kosina wrote:
> >> >> On Mon, 8 Oct 2012, Jan Kara wrote:
> >> >>
> >> >>>>>> I'm still seeing this on linux-next.
> >> >>>>   I think this is floppy related (see redo_fd_request() in the stack
> >> >>>> trace). And there were quite some changes to the area recently. Adding
> >> >>>> maintainer to CC.
> >> >> Hmm ... I don't immediately see how this is happening.
> >> >>
> >> >> Sasha, could you please do git bisect on drivers/block/floppy.c between
> >> >> f6365201d and your git HEAD for starters (assuming that f6365201d works
> >> >> well for you?).
> >> >>
> >> >
> >> > A bisect on floppy.c yielded the following:
> >> >
> >> > b33d002f4b6bae912463e5a66387c498aa69b6fe is the first bad commit
> >> > commit b33d002f4b6bae912463e5a66387c498aa69b6fe
> >> > Author: Ben Hutchings <ben@decadent.org.uk>
> >> > Date:   Mon Aug 27 20:56:53 2012 -0300
> >> >
> >> >     genhd: Make put_disk() safe for disks that have not been registered
[...]
> > I see two problems:
> >
> > 1. redo_fd_request() races with tear-down of the disks, but because
> > set_next_request() checks disk->queue before doing anything this was
> > usually harmless.  Now that do_floppy_init() doesn't clear disk->queue,
> > the race condition is much easier to hit.  This may fix that problem in
> > do_floppy_init(), though there appear to be worse bugs in tear-down
> > order in floppy_module_exit():
[...]
> > 2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
> > cleared by del_gendisk().  Incremental patch below, but it should be
> > squashed into the previous patch if that branch is still rebase-able.
[...]
> I'm now seeing these instead:
[...]

Sorry, I'm not going to spend more time in the quagmire of the floppy
driver.  Whoever has this commit in their tree, please revert or drop it
as appropriate.

Ben.

-- 
Ben Hutchings
No political challenge can be met by shopping. - George Monbiot

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-17  1:29               ` Ben Hutchings
@ 2012-10-17 14:11                 ` Jiri Kosina
  2012-10-17 14:23                   ` Jens Axboe
  0 siblings, 1 reply; 14+ messages in thread
From: Jiri Kosina @ 2012-10-17 14:11 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Sasha Levin, Jan Kara, Tejun Heo, axboe, Dave Jones, linux-kernel

On Wed, 17 Oct 2012, Ben Hutchings wrote:

> > > 1. redo_fd_request() races with tear-down of the disks, but because
> > > set_next_request() checks disk->queue before doing anything this was
> > > usually harmless.  Now that do_floppy_init() doesn't clear disk->queue,
> > > the race condition is much easier to hit.  This may fix that problem in
> > > do_floppy_init(), though there appear to be worse bugs in tear-down
> > > order in floppy_module_exit():
> [...]
> > > 2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
> > > cleared by del_gendisk().  Incremental patch below, but it should be
> > > squashed into the previous patch if that branch is still rebase-able.
> [...]
> > I'm now seeing these instead:
> [...]
> 
> Sorry, I'm not going to spend more time in the quagmire of the floppy
> driver.  Whoever has this commit in their tree, please revert or drop it
> as appropriate.

As far as I can tell, Jens has pulled it from me, but it hasn't made it 
into Linus' tree as of today.

I will do it in my tree and send a new pull request to Jens.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-17 14:11                 ` Jiri Kosina
@ 2012-10-17 14:23                   ` Jens Axboe
  2012-10-26 18:00                     ` Jiri Kosina
  0 siblings, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2012-10-17 14:23 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Ben Hutchings, Sasha Levin, Jan Kara, Tejun Heo, Dave Jones,
	linux-kernel

On 2012-10-17 16:11, Jiri Kosina wrote:
> On Wed, 17 Oct 2012, Ben Hutchings wrote:
> 
>>>> 1. redo_fd_request() races with tear-down of the disks, but because
>>>> set_next_request() checks disk->queue before doing anything this was
>>>> usually harmless.  Now that do_floppy_init() doesn't clear disk->queue,
>>>> the race condition is much easier to hit.  This may fix that problem in
>>>> do_floppy_init(), though there appear to be worse bugs in tear-down
>>>> order in floppy_module_exit():
>> [...]
>>>> 2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
>>>> cleared by del_gendisk().  Incremental patch below, but it should be
>>>> squashed into the previous patch if that branch is still rebase-able.
>> [...]
>>> I'm now seeing these instead:
>> [...]
>>
>> Sorry, I'm not going to spend more time in the quagmire of the floppy
>> driver.  Whoever has this commit in their tree, please revert or drop it
>> as appropriate.
> 
> As far as I can tell, Jens has pulled it from me, but it hasn't made it 
> into Linus' tree as of today.
> 
> I will do it in my tree and send a new pull request to Jens.

I did not add the patch from Ben, as it was reported as not working. My
driver pull is late this time due to travel, but it'll go out start of
next week. So if you have pending floppy updates that are tested at that
time, then please do send them my way.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: blk: NULL ptr deref in blk_dequeue_request()
  2012-10-17 14:23                   ` Jens Axboe
@ 2012-10-26 18:00                     ` Jiri Kosina
  0 siblings, 0 replies; 14+ messages in thread
From: Jiri Kosina @ 2012-10-26 18:00 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Ben Hutchings, Sasha Levin, Jan Kara, Tejun Heo, Dave Jones,
	linux-kernel

On Wed, 17 Oct 2012, Jens Axboe wrote:

> >>>> 1. redo_fd_request() races with tear-down of the disks, but because
> >>>> set_next_request() checks disk->queue before doing anything this was
> >>>> usually harmless.  Now that do_floppy_init() doesn't clear disk->queue,
> >>>> the race condition is much easier to hit.  This may fix that problem in
> >>>> do_floppy_init(), though there appear to be worse bugs in tear-down
> >>>> order in floppy_module_exit():
> >> [...]
> >>>> 2. I made a big mistake in using the existing GENHD_FL_UP flag, as it is
> >>>> cleared by del_gendisk().  Incremental patch below, but it should be
> >>>> squashed into the previous patch if that branch is still rebase-able.
> >> [...]
> >>> I'm now seeing these instead:
> >> [...]
> >>
> >> Sorry, I'm not going to spend more time in the quagmire of the floppy
> >> driver.  Whoever has this commit in their tree, please revert or drop it
> >> as appropriate.
> > 
> > As far as I can tell, Jens has pulled it from me, but it hasn't made it 
> > into Linus' tree as of today.
> > 
> > I will do it in my tree and send a new pull request to Jens.
> 
> I did not add the patch from Ben, as it was reported as not working. My
> driver pull is late this time due to travel, but it'll go out start of
> next week. So if you have pending floppy updates that are tested at that
> time, then please do send them my way.

Sorry for the delay on my side as well. I am now working on reverting 
Ben's original patch and doing some testing. Will be sending you a pull 
request still tonight.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-10-26 18:00 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-22 20:35 blk: NULL ptr deref in blk_dequeue_request() Sasha Levin
2012-10-07 18:26 ` Sasha Levin
2012-10-08 17:22   ` Jan Kara
2012-10-08 21:45     ` Jiri Kosina
2012-10-09 13:21       ` Sasha Levin
2012-10-09 13:26         ` Sasha Levin
2012-10-10 15:52           ` Ben Hutchings
2012-10-12 14:51             ` Jiri Kosina
2012-10-12 17:55             ` Sasha Levin
2012-10-17  1:29               ` Ben Hutchings
2012-10-17 14:11                 ` Jiri Kosina
2012-10-17 14:23                   ` Jens Axboe
2012-10-26 18:00                     ` Jiri Kosina
2012-10-17  1:30               ` Ben Hutchings

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.