linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [syzbot] WARNING in __init_work
       [not found]   ` <163175937144.763609.2073508754264771910@swboyd.mtv.corp.google.com>
@ 2021-09-19 12:41     ` Thomas Gleixner
  2021-09-20  4:03       ` Dave Chinner
       [not found]       ` <163224949689.3714697.17466968510780664239@swboyd.mtv.corp.google.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Thomas Gleixner @ 2021-09-19 12:41 UTC (permalink / raw)
  To: Stephen Boyd, Andrew Morton, syzbot
  Cc: linux-kernel, linux-mm, syzkaller-bugs, Waiman Long,
	Paul E. McKenney, Al Viro, Jens Axboe, linux-block,
	linux-fsdevel

Stephen,

On Wed, Sep 15 2021 at 19:29, Stephen Boyd wrote:
> Quoting Andrew Morton (2021-09-15 16:14:57)
>> On Wed, 15 Sep 2021 10:00:22 -0700 syzbot <syzbot+d6c75f383e01426a40b4@syzkaller.appspotmail.com> wrote:
>> > 
>> > ODEBUG: object ffffc90000fd8bc8 is NOT on stack ffffc900022a0000, but annotated.
>
> This is saying that the object was supposed to be on the stack because
> debug objects was told that, but it isn't on the stack per the
> definition of object_is_on_stack().

Correct.

>> >  <IRQ>
>> >  __init_work+0x2d/0x50 kernel/workqueue.c:519
>> >  synchronize_rcu_expedited+0x392/0x620 kernel/rcu/tree_exp.h:847
>
> This line looks like
>
>   INIT_WORK_ONSTACK(&rew.rew_work, wait_rcu_exp_gp);
>
> inside synchronize_rcu_expedited(). The rew structure is declared on the
> stack
>
>    struct rcu_exp_work rew;

Yes, but object_is_on_stack() checks for task stacks only. And the splat
here is entirely correct:

softirq()
  ...
  synchronize_rcu_expedited()
     INIT_WORK_ONSTACK()
     queue_work()
     wait_event()

is obviously broken. You cannot wait in soft irq context.

synchronize_rcu_expedited() should really have a might_sleep() at the
beginning to make that more obvious.

The splat is clobbered btw:

[  416.415111][    C1] ODEBUG: object ffffc90000fd8bc8 is NOT on stack ffffc900022a0000, but annotated.
[  416.423424][T14850] truncated
[  416.431623][    C1] ------------[ cut here ]------------
[  416.438913][T14850] ------------[ cut here ]------------
[  416.440189][    C1] WARNING: CPU: 1 PID: 2971 at lib/debugobjects.c:548 __debug_object_init.cold+0x252/0x2e5
[  416.455797][T14850] refcount_t: addition on 0; use-after-free.

So there is a refcount_t violation as well.

Nevertheless a hint for finding the culprit is obviously here in that
call chain:

>> >  bdi_remove_from_list mm/backing-dev.c:938 [inline]
>> >  bdi_unregister+0x177/0x5a0 mm/backing-dev.c:946
>> >  release_bdi+0xa1/0xc0 mm/backing-dev.c:968
>> >  kref_put include/linux/kref.h:65 [inline]
>> >  bdi_put+0x72/0xa0 mm/backing-dev.c:976
>> >  bdev_free_inode+0x116/0x220 fs/block_dev.c:819
>> >  i_callback+0x3f/0x70 fs/inode.c:224

The inode code uses RCU for freeing an inode object which then ends up
calling bdi_put() and subsequently in synchronize_rcu_expedited().

>> >  rcu_do_batch kernel/rcu/tree.c:2508 [inline]
>> >  rcu_core+0x7ab/0x1470 kernel/rcu/tree.c:2743
>> >  __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
>> >  invoke_softirq kernel/softirq.c:432 [inline]
>> >  __irq_exit_rcu+0x123/0x180 kernel/softirq.c:636
>> >  irq_exit_rcu+0x5/0x20 kernel/softirq.c:648
>> >  sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097
>> >  </IRQ>
>> 
>> Seems that we have a debugobject in the incorrect state, but it doesn't
>> necessarily mean there's something wrong in the bdi code.  It's just
>> that the bdi code happened to be the place which called
>> synchronize_rcu_expedited().

Again, it cannot do that from a softirq because
synchronize_rcu_expedited() might sleep.

> Is it possible that object_is_on_stack() doesn't work in IRQ context?
> I'm not really following along on x86 but I could see where
> task_stack_page() gets the wrong "stack" pointer because the task has one
> stack and the irq stack is some per-cpu dedicated allocation?

Even if debug objects would support objects on irq stacks, the above is
still bogus. But it does not and will not because the operations here
have to be fully synchronous:

    init() -> queue() or arm() -> wait() -> destroy()

because you obviously cannot queue work or arm a timer which are on stack
and then leave the function without waiting for the operation to complete.

So these operations have to be synchronous which is a NONO when running
in hard or soft interrupt context because waiting for the operation to
complete is not possible there.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] WARNING in __init_work
  2021-09-19 12:41     ` [syzbot] WARNING in __init_work Thomas Gleixner
@ 2021-09-20  4:03       ` Dave Chinner
  2021-09-20 12:28         ` Christoph Hellwig
       [not found]       ` <163224949689.3714697.17466968510780664239@swboyd.mtv.corp.google.com>
  1 sibling, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2021-09-20  4:03 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Stephen Boyd, Andrew Morton, syzbot, linux-kernel, linux-mm,
	syzkaller-bugs, Waiman Long, Paul E. McKenney, Al Viro,
	Jens Axboe, linux-block, linux-fsdevel, hch

On Sun, Sep 19, 2021 at 02:41:18PM +0200, Thomas Gleixner wrote:
> Stephen,
> 
> On Wed, Sep 15 2021 at 19:29, Stephen Boyd wrote:
> > Quoting Andrew Morton (2021-09-15 16:14:57)
> >> On Wed, 15 Sep 2021 10:00:22 -0700 syzbot <syzbot+d6c75f383e01426a40b4@syzkaller.appspotmail.com> wrote:
> >> > 
> >> > ODEBUG: object ffffc90000fd8bc8 is NOT on stack ffffc900022a0000, but annotated.
> >
> > This is saying that the object was supposed to be on the stack because
> > debug objects was told that, but it isn't on the stack per the
> > definition of object_is_on_stack().
> 
> Correct.
> 
> >> >  <IRQ>
> >> >  __init_work+0x2d/0x50 kernel/workqueue.c:519
> >> >  synchronize_rcu_expedited+0x392/0x620 kernel/rcu/tree_exp.h:847
> >
> > This line looks like
> >
> >   INIT_WORK_ONSTACK(&rew.rew_work, wait_rcu_exp_gp);
> >
> > inside synchronize_rcu_expedited(). The rew structure is declared on the
> > stack
> >
> >    struct rcu_exp_work rew;
> 
> Yes, but object_is_on_stack() checks for task stacks only. And the splat
> here is entirely correct:
> 
> softirq()
>   ...
>   synchronize_rcu_expedited()
>      INIT_WORK_ONSTACK()
>      queue_work()
>      wait_event()
> 
> is obviously broken. You cannot wait in soft irq context.
> 
> synchronize_rcu_expedited() should really have a might_sleep() at the
> beginning to make that more obvious.
> 
> The splat is clobbered btw:
> 
> [  416.415111][    C1] ODEBUG: object ffffc90000fd8bc8 is NOT on stack ffffc900022a0000, but annotated.
> [  416.423424][T14850] truncated
> [  416.431623][    C1] ------------[ cut here ]------------
> [  416.438913][T14850] ------------[ cut here ]------------
> [  416.440189][    C1] WARNING: CPU: 1 PID: 2971 at lib/debugobjects.c:548 __debug_object_init.cold+0x252/0x2e5
> [  416.455797][T14850] refcount_t: addition on 0; use-after-free.
> 
> So there is a refcount_t violation as well.
> 
> Nevertheless a hint for finding the culprit is obviously here in that
> call chain:
> 
> >> >  bdi_remove_from_list mm/backing-dev.c:938 [inline]
> >> >  bdi_unregister+0x177/0x5a0 mm/backing-dev.c:946
> >> >  release_bdi+0xa1/0xc0 mm/backing-dev.c:968
> >> >  kref_put include/linux/kref.h:65 [inline]
> >> >  bdi_put+0x72/0xa0 mm/backing-dev.c:976
> >> >  bdev_free_inode+0x116/0x220 fs/block_dev.c:819
> >> >  i_callback+0x3f/0x70 fs/inode.c:224
> 
> The inode code uses RCU for freeing an inode object which then ends up
> calling bdi_put() and subsequently in synchronize_rcu_expedited().

Commit 889c05cc5834 ("block: ensure the bdi is freed after
inode_detach_wb") might be a good place to start looking here. It
moved the release of the bdi from ->evict context to the RCU freeing
of the blockdev inode...

Christoph?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] WARNING in __init_work
  2021-09-20  4:03       ` Dave Chinner
@ 2021-09-20 12:28         ` Christoph Hellwig
  2021-09-20 12:38           ` Paul E. McKenney
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2021-09-20 12:28 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Thomas Gleixner, Stephen Boyd, Andrew Morton, syzbot,
	linux-kernel, linux-mm, syzkaller-bugs, Waiman Long,
	Paul E. McKenney, Al Viro, Jens Axboe, linux-block,
	linux-fsdevel, hch

On Mon, Sep 20, 2021 at 02:03:36PM +1000, Dave Chinner wrote:
> > >> >  bdi_remove_from_list mm/backing-dev.c:938 [inline]
> > >> >  bdi_unregister+0x177/0x5a0 mm/backing-dev.c:946
> > >> >  release_bdi+0xa1/0xc0 mm/backing-dev.c:968
> > >> >  kref_put include/linux/kref.h:65 [inline]
> > >> >  bdi_put+0x72/0xa0 mm/backing-dev.c:976
> > >> >  bdev_free_inode+0x116/0x220 fs/block_dev.c:819
> > >> >  i_callback+0x3f/0x70 fs/inode.c:224
> > 
> > The inode code uses RCU for freeing an inode object which then ends up
> > calling bdi_put() and subsequently in synchronize_rcu_expedited().
> 
> Commit 889c05cc5834 ("block: ensure the bdi is freed after
> inode_detach_wb") might be a good place to start looking here. It
> moved the release of the bdi from ->evict context to the RCU freeing
> of the blockdev inode...

Well, the block code already does a bdi_unregister in del_gendisk.
So if we end up freeing the whole device bdev with a registered bdi
something is badly going wrong.  Unfortunately the log in this report
isn't much help on how we got there.  IIRC syzbot will eventually spew
out a reproducer, so it might be worth to wait for that.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] WARNING in __init_work
  2021-09-20 12:28         ` Christoph Hellwig
@ 2021-09-20 12:38           ` Paul E. McKenney
  2021-09-20 12:45             ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Paul E. McKenney @ 2021-09-20 12:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Chinner, Thomas Gleixner, Stephen Boyd, Andrew Morton,
	syzbot, linux-kernel, linux-mm, syzkaller-bugs, Waiman Long,
	Al Viro, Jens Axboe, linux-block, linux-fsdevel

On Mon, Sep 20, 2021 at 02:28:46PM +0200, Christoph Hellwig wrote:
> On Mon, Sep 20, 2021 at 02:03:36PM +1000, Dave Chinner wrote:
> > > >> >  bdi_remove_from_list mm/backing-dev.c:938 [inline]
> > > >> >  bdi_unregister+0x177/0x5a0 mm/backing-dev.c:946
> > > >> >  release_bdi+0xa1/0xc0 mm/backing-dev.c:968
> > > >> >  kref_put include/linux/kref.h:65 [inline]
> > > >> >  bdi_put+0x72/0xa0 mm/backing-dev.c:976
> > > >> >  bdev_free_inode+0x116/0x220 fs/block_dev.c:819
> > > >> >  i_callback+0x3f/0x70 fs/inode.c:224
> > > 
> > > The inode code uses RCU for freeing an inode object which then ends up
> > > calling bdi_put() and subsequently in synchronize_rcu_expedited().
> > 
> > Commit 889c05cc5834 ("block: ensure the bdi is freed after
> > inode_detach_wb") might be a good place to start looking here. It
> > moved the release of the bdi from ->evict context to the RCU freeing
> > of the blockdev inode...
> 
> Well, the block code already does a bdi_unregister in del_gendisk.
> So if we end up freeing the whole device bdev with a registered bdi
> something is badly going wrong.  Unfortunately the log in this report
> isn't much help on how we got there.  IIRC syzbot will eventually spew
> out a reproducer, so it might be worth to wait for that.

If it does turn out that you need to block in an RCU callback,
queue_rcu_work() can be helpful.  This schedules a workqueue from the RCU
callback, allowing the function passed to the preceding INIT_RCU_WORK()
to block.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] WARNING in __init_work
  2021-09-20 12:38           ` Paul E. McKenney
@ 2021-09-20 12:45             ` Christoph Hellwig
  2021-09-20 12:54               ` Paul E. McKenney
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2021-09-20 12:45 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Christoph Hellwig, Dave Chinner, Thomas Gleixner, Stephen Boyd,
	Andrew Morton, syzbot, linux-kernel, linux-mm, syzkaller-bugs,
	Waiman Long, Al Viro, Jens Axboe, linux-block, linux-fsdevel

On Mon, Sep 20, 2021 at 05:38:59AM -0700, Paul E. McKenney wrote:
> > Well, the block code already does a bdi_unregister in del_gendisk.
> > So if we end up freeing the whole device bdev with a registered bdi
> > something is badly going wrong.  Unfortunately the log in this report
> > isn't much help on how we got there.  IIRC syzbot will eventually spew
> > out a reproducer, so it might be worth to wait for that.
> 
> If it does turn out that you need to block in an RCU callback,
> queue_rcu_work() can be helpful.  This schedules a workqueue from the RCU
> callback, allowing the function passed to the preceding INIT_RCU_WORK()
> to block.

In this case we really should not block here.  The problem is that
we are hitting the strange bdi auto-unregister misfeature due to a bug
elsewhere.  Which reminds that I have a patch series to remove this
auto unregistration which I need to bring bag once this is fixed.

That being said queue_rcu_work would have been really useful in a few
places I touched in that past.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] WARNING in __init_work
  2021-09-20 12:45             ` Christoph Hellwig
@ 2021-09-20 12:54               ` Paul E. McKenney
  0 siblings, 0 replies; 7+ messages in thread
From: Paul E. McKenney @ 2021-09-20 12:54 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Dave Chinner, Thomas Gleixner, Stephen Boyd, Andrew Morton,
	syzbot, linux-kernel, linux-mm, syzkaller-bugs, Waiman Long,
	Al Viro, Jens Axboe, linux-block, linux-fsdevel

On Mon, Sep 20, 2021 at 02:45:57PM +0200, Christoph Hellwig wrote:
> On Mon, Sep 20, 2021 at 05:38:59AM -0700, Paul E. McKenney wrote:
> > > Well, the block code already does a bdi_unregister in del_gendisk.
> > > So if we end up freeing the whole device bdev with a registered bdi
> > > something is badly going wrong.  Unfortunately the log in this report
> > > isn't much help on how we got there.  IIRC syzbot will eventually spew
> > > out a reproducer, so it might be worth to wait for that.
> > 
> > If it does turn out that you need to block in an RCU callback,
> > queue_rcu_work() can be helpful.  This schedules a workqueue from the RCU
> > callback, allowing the function passed to the preceding INIT_RCU_WORK()
> > to block.
> 
> In this case we really should not block here.  The problem is that
> we are hitting the strange bdi auto-unregister misfeature due to a bug
> elsewhere.  Which reminds that I have a patch series to remove this
> auto unregistration which I need to bring bag once this is fixed.
> 
> That being said queue_rcu_work would have been really useful in a few
> places I touched in that past.

Glad it helped elsewhere and apologies for the noise here!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] WARNING in __init_work
       [not found]       ` <163224949689.3714697.17466968510780664239@swboyd.mtv.corp.google.com>
@ 2021-09-21 20:19         ` Thomas Gleixner
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2021-09-21 20:19 UTC (permalink / raw)
  To: Stephen Boyd, Andrew Morton, syzbot
  Cc: linux-kernel, linux-mm, syzkaller-bugs, Waiman Long,
	Paul E. McKenney, Al Viro, Jens Axboe, linux-block,
	linux-fsdevel

Stephen,

On Tue, Sep 21 2021 at 11:38, Stephen Boyd wrote:
> Quoting Thomas Gleixner (2021-09-19 05:41:18)
>> Even if debug objects would support objects on irq stacks, the above is
>> still bogus. But it does not and will not because the operations here
>> have to be fully synchronous:
>> 
>>     init() -> queue() or arm() -> wait() -> destroy()
>> 
>> because you obviously cannot queue work or arm a timer which are on stack
>> and then leave the function without waiting for the operation to complete.
>
> Is there some way to make it more obvious that initializing a timer or
> work on the stack in an irq context is a NONO because we can't wait for
> it? Maybe some sort of debugobjects call to might_sleep() when it's
> being told the object is on the stack, or throwing a might_sleep() into
> the initialization of any stack based timer or workqueue, or both?

Let me have a look.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-09-21 20:19 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <000000000000423e0a05cc0ba2c4@google.com>
     [not found] ` <20210915161457.95ad5c9470efc70196d48410@linux-foundation.org>
     [not found]   ` <163175937144.763609.2073508754264771910@swboyd.mtv.corp.google.com>
2021-09-19 12:41     ` [syzbot] WARNING in __init_work Thomas Gleixner
2021-09-20  4:03       ` Dave Chinner
2021-09-20 12:28         ` Christoph Hellwig
2021-09-20 12:38           ` Paul E. McKenney
2021-09-20 12:45             ` Christoph Hellwig
2021-09-20 12:54               ` Paul E. McKenney
     [not found]       ` <163224949689.3714697.17466968510780664239@swboyd.mtv.corp.google.com>
2021-09-21 20:19         ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).