From: Thomas Gleixner <tglx@linutronix.de>
To: Stephen Boyd <sboyd@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	syzbot <syzbot+d6c75f383e01426a40b4@syzkaller.appspotmail.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	syzkaller-bugs@googlegroups.com, Waiman Long <llong@redhat.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>, Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [syzbot] WARNING in __init_work
Date: Sun, 19 Sep 2021 14:41:18 +0200
Message-ID: <87sfy07n69.ffs@tglx>
In-Reply-To: <163175937144.763609.2073508754264771910@swboyd.mtv.corp.google.com>

Stephen,

On Wed, Sep 15 2021 at 19:29, Stephen Boyd wrote:
> Quoting Andrew Morton (2021-09-15 16:14:57)
>> On Wed, 15 Sep 2021 10:00:22 -0700 syzbot <syzbot+d6c75f383e01426a40b4@syzkaller.appspotmail.com> wrote:
>> > 
>> > ODEBUG: object ffffc90000fd8bc8 is NOT on stack ffffc900022a0000, but annotated.
>
> This is saying that the object was supposed to be on the stack because
> debug objects was told that, but it isn't on the stack per the
> definition of object_is_on_stack().

Correct.

>> >  <IRQ>
>> >  __init_work+0x2d/0x50 kernel/workqueue.c:519
>> >  synchronize_rcu_expedited+0x392/0x620 kernel/rcu/tree_exp.h:847
>
> This line looks like
>
>   INIT_WORK_ONSTACK(&rew.rew_work, wait_rcu_exp_gp);
>
> inside synchronize_rcu_expedited(). The rew structure is declared on the
> stack
>
>    struct rcu_exp_work rew;

Yes, but object_is_on_stack() checks for task stacks only. And the splat
here is entirely correct:

softirq()
  ...
  synchronize_rcu_expedited()
     INIT_WORK_ONSTACK()
     queue_work()
     wait_event()

is obviously broken. You cannot wait in soft irq context.
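
For illustration, the task-stack-only range check can be modeled in userspace
roughly as follows. This is a sketch, not the kernel code: the function name,
the MODEL_THREAD_SIZE value (16 KiB) and the plain integer addresses are
assumptions made for the example. The real THREAD_SIZE depends on architecture
and configuration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed task stack size for this model only; the real THREAD_SIZE
 * depends on the architecture and the kernel configuration. */
#define MODEL_THREAD_SIZE (16u * 1024u)

/*
 * Hypothetical model of a task-stack-only check: returns true only if
 * addr lies inside [task_stack_base, task_stack_base + MODEL_THREAD_SIZE).
 * IRQ or softirq stacks are never consulted, so an object living on one
 * of those is reported as "not on stack".
 */
static bool model_object_is_on_task_stack(uint64_t task_stack_base,
					  uint64_t addr)
{
	return addr >= task_stack_base &&
	       addr < task_stack_base + MODEL_THREAD_SIZE;
}
```

Fed with the two addresses from the ODEBUG line (object ffffc90000fd8bc8,
stack ffffc900022a0000), the model returns false because the object address
lies below the task stack base.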

synchronize_rcu_expedited() should really have a might_sleep() at the
beginning to make that more obvious.
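
A rough userspace model of what such a check at function entry buys. All names
here (struct model_ctx, model_may_sleep(), model_synchronize_expedited()) are
hypothetical, and the model returns an error code where the kernel's
might_sleep() would emit a warning instead:

```c
#include <stdbool.h>

/* Hypothetical execution-context descriptor for this model only. */
struct model_ctx {
	unsigned int softirq_depth;	/* > 0 while in soft interrupt context */
	unsigned int hardirq_depth;	/* > 0 while in hard interrupt context */
};

/* Sleeping is only allowed outside of hard and soft interrupt context. */
static bool model_may_sleep(const struct model_ctx *ctx)
{
	return ctx->softirq_depth == 0 && ctx->hardirq_depth == 0;
}

/*
 * Model of a function that has to wait: the context check sits at the very
 * beginning, so a bad caller is flagged before any work is set up.
 * Returns 0 on success, -1 when called from a context that cannot sleep.
 */
static int model_synchronize_expedited(const struct model_ctx *ctx)
{
	if (!model_may_sleep(ctx))
		return -1;	/* the kernel would warn here, not return an error */

	/* init, queue, wait and destroy would follow here */
	return 0;
}
```

With the check up front, the report points at the offending caller rather than
at a later symptom such as the debug objects warning.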

The splat is clobbered btw:

[  416.415111][    C1] ODEBUG: object ffffc90000fd8bc8 is NOT on stack ffffc900022a0000, but annotated.
[  416.423424][T14850] truncated
[  416.431623][    C1] ------------[ cut here ]------------
[  416.438913][T14850] ------------[ cut here ]------------
[  416.440189][    C1] WARNING: CPU: 1 PID: 2971 at lib/debugobjects.c:548 __debug_object_init.cold+0x252/0x2e5
[  416.455797][T14850] refcount_t: addition on 0; use-after-free.

So there is a refcount_t violation as well.

Nevertheless a hint for finding the culprit is obviously here in that
call chain:

>> >  bdi_remove_from_list mm/backing-dev.c:938 [inline]
>> >  bdi_unregister+0x177/0x5a0 mm/backing-dev.c:946
>> >  release_bdi+0xa1/0xc0 mm/backing-dev.c:968
>> >  kref_put include/linux/kref.h:65 [inline]
>> >  bdi_put+0x72/0xa0 mm/backing-dev.c:976
>> >  bdev_free_inode+0x116/0x220 fs/block_dev.c:819
>> >  i_callback+0x3f/0x70 fs/inode.c:224

The inode code uses RCU for freeing an inode object, which then ends up
calling bdi_put() and subsequently lands in synchronize_rcu_expedited().

>> >  rcu_do_batch kernel/rcu/tree.c:2508 [inline]
>> >  rcu_core+0x7ab/0x1470 kernel/rcu/tree.c:2743
>> >  __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
>> >  invoke_softirq kernel/softirq.c:432 [inline]
>> >  __irq_exit_rcu+0x123/0x180 kernel/softirq.c:636
>> >  irq_exit_rcu+0x5/0x20 kernel/softirq.c:648
>> >  sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1097
>> >  </IRQ>
>> 
>> Seems that we have a debugobject in the incorrect state, but it doesn't
>> necessarily mean there's something wrong in the bdi code.  It's just
>> that the bdi code happened to be the place which called
>> synchronize_rcu_expedited().

Again, it cannot do that from a softirq because
synchronize_rcu_expedited() might sleep.

> Is it possible that object_is_on_stack() doesn't work in IRQ context?
> I'm not really following along on x86 but I could see where
> task_stack_page() gets the wrong "stack" pointer because the task has one
> stack and the irq stack is some per-cpu dedicated allocation?

Even if debug objects supported objects on irq stacks, the above would
still be bogus. But it does not and will not, because the operations here
have to be fully synchronous:

    init() -> queue() or arm() -> wait() -> destroy()

because you obviously cannot queue work or arm a timer which is on the stack
and then leave the function without waiting for the operation to complete.

So these operations have to be synchronous which is a NONO when running
in hard or soft interrupt context because waiting for the operation to
complete is not possible there.
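
The ordering constraint can be sketched as a small single-threaded state
model. This is an illustration under stated assumptions, not how debug objects
or the workqueue code are implemented: the state names, struct model_work and
all model_*() helpers are hypothetical, and the "wait" step simply succeeds or
fails depending on whether the caller is allowed to sleep.

```c
#include <stdbool.h>

/* Hypothetical lifecycle states of an on-stack work item in this model. */
enum model_state {
	MODEL_INIT,
	MODEL_QUEUED,
	MODEL_DONE,
	MODEL_DESTROYED,
};

struct model_work {
	enum model_state state;
};

static void model_init(struct model_work *w)
{
	w->state = MODEL_INIT;
}

/* Queueing is only valid right after initialization. */
static int model_queue(struct model_work *w)
{
	if (w->state != MODEL_INIT)
		return -1;
	w->state = MODEL_QUEUED;
	return 0;
}

/* Waiting for completion requires a context that is allowed to sleep. */
static int model_wait(struct model_work *w, bool can_sleep)
{
	if (w->state != MODEL_QUEUED || !can_sleep)
		return -1;
	w->state = MODEL_DONE;
	return 0;
}

/* A still-queued item must not be destroyed: its stack frame is in use. */
static int model_destroy(struct model_work *w)
{
	if (w->state == MODEL_QUEUED)
		return -1;
	w->state = MODEL_DESTROYED;
	return 0;
}

/* init -> queue -> wait -> destroy, all within one stack frame. */
static int model_run_on_stack(bool can_sleep)
{
	struct model_work w;

	model_init(&w);
	if (model_queue(&w))
		return -1;
	if (model_wait(&w, can_sleep))
		return -1;	/* cannot wait: the sequence is not allowed here */
	return model_destroy(&w);
}
```

In this model model_run_on_stack(true) completes the full sequence, while
model_run_on_stack(false), standing in for hard or soft interrupt context,
fails at the wait step.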

Thanks,

        tglx


