linux-block.vger.kernel.org archive mirror
* Re: [blk_mq_register_hctx] 29dee3c03a WARNING: CPU: 0 PID: 5 at lib/refcount.c:114 refcount_inc
       [not found] <20170228041117.hbqqyczf5bnypqmc@wfg-t540p.sh.intel.com>
@ 2017-02-28  7:48 ` Peter Zijlstra
  2017-02-28  8:17 ` Peter Zijlstra
  1 sibling, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2017-02-28  7:48 UTC
  To: Fengguang Wu; +Cc: linux-block, Omar Sandoval, Jens Axboe, LKP

On Tue, Feb 28, 2017 at 12:11:17PM +0800, Fengguang Wu wrote:
> Hello,
> 
> FYI, an old blk_mq bug triggers new warnings on this commit. It's very
> reproducible and you may try the attached reproduce-* script.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> 
> commit 29dee3c03abce04cd527878ef5f9e5f91b7b83f4
> Author:     Peter Zijlstra <peterz@infradead.org>
> AuthorDate: Fri Feb 10 16:27:52 2017 +0100
> Commit:     Ingo Molnar <mingo@kernel.org>
> CommitDate: Fri Feb 24 09:02:10 2017 +0100
> 
>      locking/refcounts: Out-of-line everything
>      
>      Linus asked to please make this real C code.
>      
>      And since size then isn't an issue what so ever anymore, remove the
>      debug knob and make all WARN()s unconditional.
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I suspect that if you had reverted it but enabled CONFIG_DEBUG_REFCOUNT,
you would have gotten the same splats.


I'll go have a look though, something looks buggered.


* Re: [blk_mq_register_hctx] 29dee3c03a WARNING: CPU: 0 PID: 5 at lib/refcount.c:114 refcount_inc
       [not found] <20170228041117.hbqqyczf5bnypqmc@wfg-t540p.sh.intel.com>
  2017-02-28  7:48 ` [blk_mq_register_hctx] 29dee3c03a WARNING: CPU: 0 PID: 5 at lib/refcount.c:114 refcount_inc Peter Zijlstra
@ 2017-02-28  8:17 ` Peter Zijlstra
  2017-02-28  8:35   ` Fengguang Wu
  2017-02-28  8:38   ` Peter Zijlstra
  1 sibling, 2 replies; 6+ messages in thread
From: Peter Zijlstra @ 2017-02-28  8:17 UTC
  To: Fengguang Wu
  Cc: linux-block, Omar Sandoval, Jens Axboe, LKP, Greg Kroah-Hartman

On Tue, Feb 28, 2017 at 12:11:17PM +0800, Fengguang Wu wrote:
> Hello,
> 
> FYI, an old blk_mq bug triggers new warnings on this commit. It's very
> reproducible and you may try the attached reproduce-* script.

> [    4.447772] kobject (ffff88001c041f10): tried to init an initialized object, something is seriously wrong.
> [    4.453395] CPU: 0 PID: 5 Comm: kworker/u2:0 Not tainted 4.10.0-01216-g29dee3c #2
> [    4.455534] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
> [    4.458252] Workqueue: events_unbound async_run_entry_fn
> [    4.459708] Call Trace:
> [    4.460611]  dump_stack+0x19/0x27
> [    4.461652]  kobject_init+0xda/0xf0
> [    4.462731]  blk_mq_register_dev+0x31/0x150
> [    4.463990]  blk_register_queue+0x205/0x250
> [    4.465217]  device_add_disk+0x1ab/0x710
> [    4.466384]  sd_probe_async+0x11c/0x1e0
> [    4.467544]  async_run_entry_fn+0xbd/0x220
> [    4.468760]  process_one_work+0x4a7/0x990
> [    4.469938]  ? process_one_work+0x348/0x990
> [    4.471168]  worker_thread+0x342/0x8a0
> [    4.472300]  ? process_one_work+0x990/0x990
> [    4.473540]  kthread+0x188/0x190
> [    4.474557]  ? kthread_create_on_node+0x40/0x40
> [    4.475850]  ret_from_fork+0x31/0x40

So this was pre-existing wreckage? If so, that needs to be sorted first.
Because if the kobject stuff is broken, there's no way the refcount
stuff can begin to work.
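
The dependency described above can be illustrated with a small user-space
model. This is only a sketch in Python, not kernel code: the RefCount class,
its method names, and the warning strings are hypothetical stand-ins, and the
sequence of events is an assumption, since the thread does not show what
blk_mq actually does between the two initialisations. It shows how a second
init can reset a counter that still has holders, so that a later increment
finds the count at zero and warns.

```python
import warnings


class RefCount:
    """Toy model of a reference counter that warns on misuse (hypothetical)."""

    def __init__(self):
        self.value = 0
        self.initialized = False

    def init(self):
        # Models the "tried to init an initialized object" check in the trace.
        if self.initialized:
            warnings.warn("tried to init an initialized object")
        self.value = 1  # the count is reset even if holders still exist
        self.initialized = True

    def inc(self):
        # Taking a reference on a zero count would revive a released object,
        # so the increment is refused and a warning is issued instead.
        if self.value == 0:
            warnings.warn("increment on 0; use-after-free")
            return
        self.value += 1

    def dec_and_test(self):
        # Returns True when the last reference has been dropped.
        if self.value == 0:
            warnings.warn("decrement on 0; underflow")
            return False
        self.value -= 1
        return self.value == 0


def double_init_scenario():
    """Replay an assumed double-init sequence and collect the warnings."""
    ref = RefCount()
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        ref.init()          # first registration: count = 1
        ref.inc()           # a second holder appears: count = 2
        ref.init()          # buggy second init: warns, count reset to 1
        ref.dec_and_test()  # one holder drops its reference: count = 0
        ref.inc()           # another user takes a reference: warns
    return [str(w.message) for w in caught]


if __name__ == "__main__":
    for message in double_init_scenario():
        print(message)
```

In this model the second warning is only a symptom of the first: once the
double init is gone, the count never reaches zero early and the increment
warning disappears with it.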


* Re: [blk_mq_register_hctx] 29dee3c03a WARNING: CPU: 0 PID: 5 at lib/refcount.c:114 refcount_inc
  2017-02-28  8:17 ` Peter Zijlstra
@ 2017-02-28  8:35   ` Fengguang Wu
  2017-02-28  8:52     ` Omar Sandoval
  2017-02-28  8:38   ` Peter Zijlstra
  1 sibling, 1 reply; 6+ messages in thread
From: Fengguang Wu @ 2017-02-28  8:35 UTC
  To: Peter Zijlstra
  Cc: linux-block, Omar Sandoval, Jens Axboe, LKP, Greg Kroah-Hartman

On Tue, Feb 28, 2017 at 09:17:11AM +0100, Peter Zijlstra wrote:
>On Tue, Feb 28, 2017 at 12:11:17PM +0800, Fengguang Wu wrote:
>> Hello,
>>
>> FYI, an old blk_mq bug triggers new warnings on this commit. It's very
>> reproducible and you may try the attached reproduce-* script.
>
>So this was pre-existing wreckage? If so, that needs to be sorted first.
>Because if the kobject stuff is broken, there's no way the refcount
>stuff can begin to work.

Yeah, it's an old bug that has probably existed for quite some time.
It's not really related to the refcount work; I was just hoping the new
warning might provide new clues to help debug the blk_mq bug.

Thanks,
Fengguang


* Re: [blk_mq_register_hctx] 29dee3c03a WARNING: CPU: 0 PID: 5 at lib/refcount.c:114 refcount_inc
  2017-02-28  8:17 ` Peter Zijlstra
  2017-02-28  8:35   ` Fengguang Wu
@ 2017-02-28  8:38   ` Peter Zijlstra
  2017-02-28  9:45     ` Peter Zijlstra
  1 sibling, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2017-02-28  8:38 UTC
  To: Fengguang Wu
  Cc: linux-block, Omar Sandoval, Jens Axboe, LKP, Greg Kroah-Hartman

On Tue, Feb 28, 2017 at 09:17:11AM +0100, Peter Zijlstra wrote:
> On Tue, Feb 28, 2017 at 12:11:17PM +0800, Fengguang Wu wrote:
> > Hello,
> > 
> > FYI, an old blk_mq bug triggers new warnings on this commit. It's very
> > reproducible and you may try the attached reproduce-* script.
> 
> So this was pre-existing wreckage? If so, that needs to be sorted first.
> Because if the kobject stuff is broken, there's no way the refcount
> stuff can begin to work.

Google just found me:

 https://lkml.kernel.org/r/1487758442-5855-2-git-send-email-tom.leiming@gmail.com

Let me see if that works.


* Re: [blk_mq_register_hctx] 29dee3c03a WARNING: CPU: 0 PID: 5 at lib/refcount.c:114 refcount_inc
  2017-02-28  8:35   ` Fengguang Wu
@ 2017-02-28  8:52     ` Omar Sandoval
  0 siblings, 0 replies; 6+ messages in thread
From: Omar Sandoval @ 2017-02-28  8:52 UTC
  To: Fengguang Wu
  Cc: Peter Zijlstra, linux-block, Omar Sandoval, Jens Axboe, LKP,
	Greg Kroah-Hartman

On Tue, Feb 28, 2017 at 04:35:08PM +0800, Fengguang Wu wrote:
> On Tue, Feb 28, 2017 at 09:17:11AM +0100, Peter Zijlstra wrote:
> > On Tue, Feb 28, 2017 at 12:11:17PM +0800, Fengguang Wu wrote:
> > > Hello,
> > > 
> > > FYI, an old blk_mq bug triggers new warnings on this commit. It's very
> > > reproducible and you may try the attached reproduce-* script.
> > 
> > So this was pre-existing wreckage? If so, that needs to be sorted first.
> > Because if the kobject stuff is broken, there's no way the refcount
> > stuff can begin to work.
> 
> Yeah, it's an old bug that has probably existed for quite some time.
> It's not really related to the refcount work; I was just hoping the new
> warning might provide new clues to help debug the blk_mq bug.
> 
> Thanks,
> Fengguang

Ming Lei posted a series to fix this here [1]. I haven't gotten around
to testing it, but it'd be great if you could try it, too.

[1] http://marc.info/?l=linux-block&m=148775846217069&w=2


* Re: [blk_mq_register_hctx] 29dee3c03a WARNING: CPU: 0 PID: 5 at lib/refcount.c:114 refcount_inc
  2017-02-28  8:38   ` Peter Zijlstra
@ 2017-02-28  9:45     ` Peter Zijlstra
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2017-02-28  9:45 UTC
  To: Fengguang Wu
  Cc: linux-block, Omar Sandoval, Jens Axboe, LKP, Greg Kroah-Hartman

On Tue, Feb 28, 2017 at 09:38:04AM +0100, Peter Zijlstra wrote:
> On Tue, Feb 28, 2017 at 09:17:11AM +0100, Peter Zijlstra wrote:
> > On Tue, Feb 28, 2017 at 12:11:17PM +0800, Fengguang Wu wrote:
> > > Hello,
> > > 
> > > FYI, an old blk_mq bug triggers new warnings on this commit. It's very
> > > reproducible and you may try the attached reproduce-* script.
> > 
> > So this was pre-existing wreckage? If so, that needs to be sorted first.
> > Because if the kobject stuff is broken, there's no way the refcount
> > stuff can begin to work.
> 
> Google just found me:
> 
>  https://lkml.kernel.org/r/1487758442-5855-2-git-send-email-tom.leiming@gmail.com
> 
> Let me see if that works.

Yes, those patches cure the issue.

