linux-kernel.vger.kernel.org archive mirror
* [BUG] mm: backing-dev: a possible sleep-in-atomic-context bug in cgwb_create()
@ 2018-06-21  3:02 Jia-Ju Bai
  2018-06-21  3:35 ` Matthew Wilcox
  0 siblings, 1 reply; 3+ messages in thread
From: Jia-Ju Bai @ 2018-06-21  3:02 UTC (permalink / raw)
  To: axboe, akpm, jack, zhangweiping, sergey.senozhatsky,
	andriy.shevchenko, christophe.jaillet, aryabinin
  Cc: linux-mm, Linux Kernel Mailing List

The kernel may sleep while holding a spinlock.
The function call path (from bottom to top) in Linux-4.16.7 is:

[FUNC] schedule
lib/percpu-refcount.c, 222:
         schedule in __percpu_ref_switch_mode
lib/percpu-refcount.c, 339:
         __percpu_ref_switch_mode in percpu_ref_kill_and_confirm
./include/linux/percpu-refcount.h, 127:
         percpu_ref_kill_and_confirm in percpu_ref_kill
mm/backing-dev.c, 545:
         percpu_ref_kill in cgwb_kill
mm/backing-dev.c, 576:
         cgwb_kill in cgwb_create
mm/backing-dev.c, 573:
         _raw_spin_lock_irqsave in cgwb_create

This bug was found by my static analysis tool (DSAC-2) and confirmed by my
code review.

I do not know how to fix this bug correctly, so I am only reporting it.
Maybe cgwb_kill() should not be called while holding a spinlock.


Best wishes,
Jia-Ju Bai


* Re: [BUG] mm: backing-dev: a possible sleep-in-atomic-context bug in cgwb_create()
  2018-06-21  3:02 [BUG] mm: backing-dev: a possible sleep-in-atomic-context bug in cgwb_create() Jia-Ju Bai
@ 2018-06-21  3:35 ` Matthew Wilcox
  2018-06-22  8:50   ` Jan Kara
  0 siblings, 1 reply; 3+ messages in thread
From: Matthew Wilcox @ 2018-06-21  3:35 UTC (permalink / raw)
  To: Jia-Ju Bai
  Cc: axboe, akpm, jack, zhangweiping, sergey.senozhatsky,
	andriy.shevchenko, christophe.jaillet, aryabinin, linux-mm,
	Linux Kernel Mailing List

On Thu, Jun 21, 2018 at 11:02:58AM +0800, Jia-Ju Bai wrote:
> The kernel may sleep while holding a spinlock.
> The function call path (from bottom to top) in Linux-4.16.7 is:
> 
> [FUNC] schedule
> lib/percpu-refcount.c, 222:
>         schedule in __percpu_ref_switch_mode
> lib/percpu-refcount.c, 339:
>         __percpu_ref_switch_mode in percpu_ref_kill_and_confirm
> ./include/linux/percpu-refcount.h, 127:
>         percpu_ref_kill_and_confirm in percpu_ref_kill
> mm/backing-dev.c, 545:
>         percpu_ref_kill in cgwb_kill
> mm/backing-dev.c, 576:
>         cgwb_kill in cgwb_create
> mm/backing-dev.c, 573:
>         _raw_spin_lock_irqsave in cgwb_create
> 
> This bug was found by my static analysis tool (DSAC-2) and confirmed by my
> code review.

I disagree with your code review.

         * If the previous ATOMIC switching hasn't finished yet, wait for
         * its completion.  If the caller ensures that ATOMIC switching
         * isn't in progress, this function can be called from any context.

I believe cgwb_kill is always called under the spinlock, so we will never
sleep because the percpu ref will never be switching to atomic mode.

This is complex and subtle, so I could be wrong.


* Re: [BUG] mm: backing-dev: a possible sleep-in-atomic-context bug in cgwb_create()
  2018-06-21  3:35 ` Matthew Wilcox
@ 2018-06-22  8:50   ` Jan Kara
  0 siblings, 0 replies; 3+ messages in thread
From: Jan Kara @ 2018-06-22  8:50 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jia-Ju Bai, axboe, akpm, jack, zhangweiping, sergey.senozhatsky,
	andriy.shevchenko, christophe.jaillet, aryabinin, linux-mm,
	Linux Kernel Mailing List

On Wed 20-06-18 20:35:15, Matthew Wilcox wrote:
> On Thu, Jun 21, 2018 at 11:02:58AM +0800, Jia-Ju Bai wrote:
> > The kernel may sleep while holding a spinlock.
> > The function call path (from bottom to top) in Linux-4.16.7 is:
> > 
> > [FUNC] schedule
> > lib/percpu-refcount.c, 222:
> >         schedule in __percpu_ref_switch_mode
> > lib/percpu-refcount.c, 339:
> >         __percpu_ref_switch_mode in percpu_ref_kill_and_confirm
> > ./include/linux/percpu-refcount.h, 127:
> >         percpu_ref_kill_and_confirm in percpu_ref_kill
> > mm/backing-dev.c, 545:
> >         percpu_ref_kill in cgwb_kill
> > mm/backing-dev.c, 576:
> >         cgwb_kill in cgwb_create
> > mm/backing-dev.c, 573:
> >         _raw_spin_lock_irqsave in cgwb_create
> > 
> > This bug was found by my static analysis tool (DSAC-2) and confirmed by my
> > code review.
> 
> I disagree with your code review.
> 
>          * If the previous ATOMIC switching hasn't finished yet, wait for
>          * its completion.  If the caller ensures that ATOMIC switching
>          * isn't in progress, this function can be called from any context.
> 
> I believe cgwb_kill is always called under the spinlock, so we will never
> sleep because the percpu ref will never be switching to atomic mode.

You are right that the sleep under spinlock never happens. And the reason
is that percpu_ref_kill() never results in blocking - it does call
percpu_ref_kill_and_confirm() but the 'confirm' argument is NULL and thus
even percpu_ref_kill_and_confirm() never blocks.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
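
A minimal sketch of the condition this thread turns on, written as plain
userspace C rather than kernel code. It assumes only what the quoted comment
states: the switch path has to wait when a previous ATOMIC switch has not
finished, and otherwise can be called from any context. struct toy_ref, its
fields, and both functions are hypothetical names made up for illustration;
the real wait in __percpu_ref_switch_mode() is represented here by a boolean
result, not by actual sleeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a percpu ref; the field names are hypothetical. */
struct toy_ref {
	bool atomic_switch_in_progress; /* an earlier ATOMIC switch is unfinished */
	bool dead;                      /* set once the ref has been killed */
};

/*
 * Models the wait described in the quoted comment: the caller would have to
 * wait only if an earlier ATOMIC switch is still in progress.
 * Returns true when the modelled code would need to sleep.
 */
bool toy_switch_mode_would_sleep(const struct toy_ref *ref)
{
	assert(ref != NULL);
	return ref->atomic_switch_in_progress;
}

/* Models the kill path and reports whether it would have had to sleep. */
bool toy_kill_would_sleep(struct toy_ref *ref)
{
	bool would_sleep = toy_switch_mode_would_sleep(ref);

	ref->dead = true;
	return would_sleep;
}
```

In this model, a kill issued while no earlier switch is pending reports that
no sleep is needed, which is the situation both replies argue holds for
cgwb_kill() in cgwb_create(). A static checker that follows only the call
graph cannot see that precondition, so it flags the path.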

