linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited()
@ 2021-10-14  8:24 Zqiang
  2021-10-14 11:24 ` Matthew Wilcox
  2021-10-14 12:06 ` Christoph Hellwig
  0 siblings, 2 replies; 7+ messages in thread
From: Zqiang @ 2021-10-14  8:24 UTC (permalink / raw)
  To: akpm, sunhao.th; +Cc: linux-kernel, linux-mm, Zqiang

<IRQ>
 __init_work+0x2d/0x50
 synchronize_rcu_expedited+0x3af/0x650
 bdi_remove_from_list [inline]
 bdi_unregister+0x17f/0x5c0
 release_bdi+0xa1/0xc0
 kref_put [inline]
 bdi_put+0x72/0xa0
 bdev_free_inode+0x11e/0x220
 i_callback+0x3f/0x70
 rcu_do_batch [inline]
 rcu_core+0x76d/0x16c0
 __do_softirq+0x1d7/0x93b
 invoke_softirq [inline]
 __irq_exit_rcu [inline]
 irq_exit_rcu+0xf2/0x130
 sysvec_apic_timer_interrupt+0x93/0xc0

The bdi_remove_from_list() is called in RCU softirq, however the
synchronize_rcu_expedited() will produce sleep action, use kfree_rcu()
instead of it.

Reported-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
---
 include/linux/backing-dev-defs.h | 1 +
 mm/backing-dev.c                 | 4 +---
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 33207004cfde..35a093384518 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -202,6 +202,7 @@ struct backing_dev_info {
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *debug_dir;
 #endif
+	struct rcu_head rcu;
 };
 
 enum {
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index c878d995af06..45d866a3a4a2 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -935,8 +935,6 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi)
 	rb_erase(&bdi->rb_node, &bdi_tree);
 	list_del_rcu(&bdi->bdi_list);
 	spin_unlock_bh(&bdi_lock);
-
-	synchronize_rcu_expedited();
 }
 
 void bdi_unregister(struct backing_dev_info *bdi)
@@ -969,7 +967,7 @@ static void release_bdi(struct kref *ref)
 		bdi_unregister(bdi);
 	WARN_ON_ONCE(bdi->dev);
 	wb_exit(&bdi->wb);
-	kfree(bdi);
+	kfree_rcu(bdi, rcu);
 }
 
 void bdi_put(struct backing_dev_info *bdi)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited()
  2021-10-14  8:24 [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited() Zqiang
@ 2021-10-14 11:24 ` Matthew Wilcox
  2021-10-15  3:39   ` zhangqiang
       [not found]   ` <CALm+0cUt7iD5zex4-hRv=i1wPd66tz3JYGHz8P8ZFTZcyOCD1A@mail.gmail.com>
  2021-10-14 12:06 ` Christoph Hellwig
  1 sibling, 2 replies; 7+ messages in thread
From: Matthew Wilcox @ 2021-10-14 11:24 UTC (permalink / raw)
  To: Zqiang; +Cc: akpm, sunhao.th, linux-kernel, linux-mm

On Thu, Oct 14, 2021 at 04:24:33PM +0800, Zqiang wrote:
> The bdi_remove_from_list() is called in RCU softirq, however the
> synchronize_rcu_expedited() will produce sleep action, use kfree_rcu()
> instead of it.
> 
> Reported-by: Hao Sun <sunhao.th@gmail.com>
> Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
> ---
>  include/linux/backing-dev-defs.h | 1 +
>  mm/backing-dev.c                 | 4 +---
>  2 files changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
> index 33207004cfde..35a093384518 100644
> --- a/include/linux/backing-dev-defs.h
> +++ b/include/linux/backing-dev-defs.h
> @@ -202,6 +202,7 @@ struct backing_dev_info {
>  #ifdef CONFIG_DEBUG_FS
>  	struct dentry *debug_dir;
>  #endif
> +	struct rcu_head rcu;
>  };

Instead of growing struct backing_dev_info, it seems to me this rcu_head
could be placed in a union with rb_node, since it will have been removed
from the bdi_tree by this point and the tree is never walked under
RCU protection?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited()
  2021-10-14  8:24 [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited() Zqiang
  2021-10-14 11:24 ` Matthew Wilcox
@ 2021-10-14 12:06 ` Christoph Hellwig
  1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2021-10-14 12:06 UTC (permalink / raw)
  To: Zqiang; +Cc: akpm, sunhao.th, linux-kernel, linux-mm

On Thu, Oct 14, 2021 at 04:24:33PM +0800, Zqiang wrote:
> <IRQ>
>  __init_work+0x2d/0x50
>  synchronize_rcu_expedited+0x3af/0x650
>  bdi_remove_from_list [inline]
>  bdi_unregister+0x17f/0x5c0
>  release_bdi+0xa1/0xc0
>  kref_put [inline]
>  bdi_put+0x72/0xa0
>  bdev_free_inode+0x11e/0x220
>  i_callback+0x3f/0x70
>  rcu_do_batch [inline]
>  rcu_core+0x76d/0x16c0
>  __do_softirq+0x1d7/0x93b
>  invoke_softirq [inline]
>  __irq_exit_rcu [inline]
>  irq_exit_rcu+0xf2/0x130
>  sysvec_apic_timer_interrupt+0x93/0xc0
> 
> The bdi_remove_from_list() is called in RCU softirq, however the
> synchronize_rcu_expedited() will produce sleep action, use kfree_rcu()
> instead of it.

What workload is this running?  If we hit the RCU unregister path
from inode freeing we have a lifetime problem somewhere that we need
to fix first.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited()
  2021-10-14 11:24 ` Matthew Wilcox
@ 2021-10-15  3:39   ` zhangqiang
       [not found]   ` <CALm+0cUt7iD5zex4-hRv=i1wPd66tz3JYGHz8P8ZFTZcyOCD1A@mail.gmail.com>
  1 sibling, 0 replies; 7+ messages in thread
From: zhangqiang @ 2021-10-15  3:39 UTC (permalink / raw)
  To: Matthew Wilcox, Zqiang, hch; +Cc: akpm, sunhao.th, linux-kernel, linux-mm


On 2021/10/14 下午7:24, Matthew Wilcox wrote:
> On Thu, Oct 14, 2021 at 04:24:33PM +0800, Zqiang wrote:
>> The bdi_remove_from_list() is called in RCU softirq, however the
>> synchronize_rcu_expedited() will produce sleep action, use kfree_rcu()
>> instead of it.
>>
>> Reported-by: Hao Sun <sunhao.th@gmail.com>
>> Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
>> ---
>>   include/linux/backing-dev-defs.h | 1 +
>>   mm/backing-dev.c                 | 4 +---
>>   2 files changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
>> index 33207004cfde..35a093384518 100644
>> --- a/include/linux/backing-dev-defs.h
>> +++ b/include/linux/backing-dev-defs.h
>> @@ -202,6 +202,7 @@ struct backing_dev_info {
>>   #ifdef CONFIG_DEBUG_FS
>>   	struct dentry *debug_dir;
>>   #endif
>> +	struct rcu_head rcu;
>>   };
> Instead of growing struct backing_dev_info, it seems to me this rcu_head
> could be placed in a union with rb_node, since it will have been removed
> from the bdi_tree by this point and the tree is never walked under
> RCU protection?
>
Thanks for your advice, I find this bdi_tree is traversed under the 
protection of a spin lock, not under the protection of RCU.
I find this modification does not avoid the problem described in patch, 
the flush_delayed_work() may be called in release_bdi()
The same will cause problems.

may be  we can replace queue_rcu_work() of call_rcu(&inode->i_rcu, 
i_callback) or do you have any better suggestions?

Thanks
Zqiang



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited()
       [not found]     ` <d697d61e-27a2-a25c-3ae1-e41457d08705@gmail.com>
@ 2021-10-15 12:35       ` Matthew Wilcox
  2021-10-15 13:19         ` Christoph Hellwig
  2021-10-18  2:15         ` Zqiang
  0 siblings, 2 replies; 7+ messages in thread
From: Matthew Wilcox @ 2021-10-15 12:35 UTC (permalink / raw)
  To: Zqiang
  Cc: hch, akpm, sunhao.th, linux-kernel, linux-mm, Mikulas Patocka,
	Jens Axboe, Tejun Heo

On Fri, Oct 15, 2021 at 01:06:02PM +0800, Zqiang wrote:
> 
> On 2021/10/15 上午10:57, Qiang Zhang wrote:
> > 
> > 
> > Matthew Wilcox <willy@infradead.org <mailto:willy@infradead.org>>
> > 于2021年10月14日周四 下午7:26写道:
> > 
> >     On Thu, Oct 14, 2021 at 04:24:33PM +0800, Zqiang wrote:
> >     > The bdi_remove_from_list() is called in RCU softirq, however the
> >     > synchronize_rcu_expedited() will produce sleep action, use
> >     kfree_rcu()
> >     > instead of it.
> >     >
> >     > Reported-by: Hao Sun <sunhao.th@gmail.com
> >     <mailto:sunhao.th@gmail.com>>
> >     > Signed-off-by: Zqiang <qiang.zhang1211@gmail.com
> >     <mailto:qiang.zhang1211@gmail.com>>
> >     > ---
> >     >  include/linux/backing-dev-defs.h | 1 +
> >     >  mm/backing-dev.c                 | 4 +---
> >     >  2 files changed, 2 insertions(+), 3 deletions(-)
> >     >
> >     > diff --git a/include/linux/backing-dev-defs.h
> >     b/include/linux/backing-dev-defs.h
> >     > index 33207004cfde..35a093384518 100644
> >     > --- a/include/linux/backing-dev-defs.h
> >     > +++ b/include/linux/backing-dev-defs.h
> >     > @@ -202,6 +202,7 @@ struct backing_dev_info {
> >     >  #ifdef CONFIG_DEBUG_FS
> >     >       struct dentry *debug_dir;
> >     >  #endif
> >     > +     struct rcu_head rcu;
> >     >  };
> > 
> >     >Instead of growing struct backing_dev_info, it seems to me this
> >     rcu_head
> >     >could be placed in a union with rb_node, since it will have been
> >     removed
> >     >from the bdi_tree by this point and the tree is never walked under
> >     >RCU protection?
> > 
> > 
> > Thanks for your advice, I find this bdi_tree is traversed under the
> > protection of a spin lock, not under the protection of RCU.
> > I find this modification does not avoid the problem described in patch,
> > the flush_delayed_work() may be called in release_bdi()
> > The same will cause problems.
> > may be  we can replace queue_rcu_work() of call_rcu(&inode->i_rcu,
> > i_callback) or do you have any better suggestions?

What?  All I was suggesting was:

+++ b/include/linux/backing-dev-defs.h
@@ -168,7 +168,10 @@ struct bdi_writeback {
 
 struct backing_dev_info {
        u64 id;
-       struct rb_node rb_node; /* keyed by ->id */
+       union {
+               struct rb_node rb_node; /* keyed by ->id */
+               struct rcu_head rcu;
+       };
        struct list_head bdi_list;
        unsigned long ra_pages; /* max readahead in PAGE_SIZE units */
        unsigned long io_pages; /* max allowed IO size */


Christoph, independent of the inode lifetime problem, this actually seems
like a good approach to take.  I don't see why we should synchronize_rcu()
here?  Adding Jens (original introducer of the synchronize_rcu()), Mikulas
(converted it to use _expedited) and Tejun (worked around a problem when
using _expedited).

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited()
  2021-10-15 12:35       ` Matthew Wilcox
@ 2021-10-15 13:19         ` Christoph Hellwig
  2021-10-18  2:15         ` Zqiang
  1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2021-10-15 13:19 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Zqiang, hch, akpm, sunhao.th, linux-kernel, linux-mm,
	Mikulas Patocka, Jens Axboe, Tejun Heo

On Fri, Oct 15, 2021 at 01:35:56PM +0100, Matthew Wilcox wrote:
>  struct backing_dev_info {
>         u64 id;
> -       struct rb_node rb_node; /* keyed by ->id */
> +       union {
> +               struct rb_node rb_node; /* keyed by ->id */
> +               struct rcu_head rcu;
> +       };
>         struct list_head bdi_list;
>         unsigned long ra_pages; /* max readahead in PAGE_SIZE units */
>         unsigned long io_pages; /* max allowed IO size */
> 
> 
> Christoph, independent of the inode lifetime problem, this actually seems
> like a good approach to take.  I don't see why we should synchronize_rcu()
> here?  Adding Jens (original introducer of the synchronize_rcu()), Mikulas
> (converted it to use _expedited) and Tejun (worked around a problem when
> using _expedited).

The kfree+rcu + your suggestion does seem like a good idea in general to
me.  But I'd still like to fix the actual bug being reported before
optimizing the area in a way that papers over the bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited()
  2021-10-15 12:35       ` Matthew Wilcox
  2021-10-15 13:19         ` Christoph Hellwig
@ 2021-10-18  2:15         ` Zqiang
  1 sibling, 0 replies; 7+ messages in thread
From: Zqiang @ 2021-10-18  2:15 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: hch, akpm, sunhao.th, linux-kernel, linux-mm, Mikulas Patocka,
	Jens Axboe, Tejun Heo


On 2021/10/15 下午8:35, Matthew Wilcox wrote:
> On Fri, Oct 15, 2021 at 01:06:02PM +0800, Zqiang wrote:
>> On 2021/10/15 上午10:57, Qiang Zhang wrote:
>>>
>>> Matthew Wilcox <willy@infradead.org <mailto:willy@infradead.org>>
>>> 于2021年10月14日周四 下午7:26写道:
>>>
>>>      On Thu, Oct 14, 2021 at 04:24:33PM +0800, Zqiang wrote:
>>>      > The bdi_remove_from_list() is called in RCU softirq, however the
>>>      > synchronize_rcu_expedited() will produce sleep action, use
>>>      kfree_rcu()
>>>      > instead of it.
>>>      >
>>>      > Reported-by: Hao Sun <sunhao.th@gmail.com
>>>      <mailto:sunhao.th@gmail.com>>
>>>      > Signed-off-by: Zqiang <qiang.zhang1211@gmail.com
>>>      <mailto:qiang.zhang1211@gmail.com>>
>>>      > ---
>>>      >  include/linux/backing-dev-defs.h | 1 +
>>>      >  mm/backing-dev.c                 | 4 +---
>>>      >  2 files changed, 2 insertions(+), 3 deletions(-)
>>>      >
>>>      > diff --git a/include/linux/backing-dev-defs.h
>>>      b/include/linux/backing-dev-defs.h
>>>      > index 33207004cfde..35a093384518 100644
>>>      > --- a/include/linux/backing-dev-defs.h
>>>      > +++ b/include/linux/backing-dev-defs.h
>>>      > @@ -202,6 +202,7 @@ struct backing_dev_info {
>>>      >  #ifdef CONFIG_DEBUG_FS
>>>      >       struct dentry *debug_dir;
>>>      >  #endif
>>>      > +     struct rcu_head rcu;
>>>      >  };
>>>
>>>      >Instead of growing struct backing_dev_info, it seems to me this
>>>      rcu_head
>>>      >could be placed in a union with rb_node, since it will have been
>>>      removed
>>>      >from the bdi_tree by this point and the tree is never walked under
>>>      >RCU protection?
>>>
>>>
>>> Thanks for your advice, I find this bdi_tree is traversed under the
>>> protection of a spin lock, not under the protection of RCU.
>>> I find this modification does not avoid the problem described in patch,
>>> the flush_delayed_work() may be called in release_bdi()
>>> The same will cause problems.
>>> may be  we can replace queue_rcu_work() of call_rcu(&inode->i_rcu,
>>> i_callback) or do you have any better suggestions?
> What?  All I was suggesting was:
>
> +++ b/include/linux/backing-dev-defs.h
> @@ -168,7 +168,10 @@ struct bdi_writeback {
>   
>   struct backing_dev_info {
>          u64 id;
> -       struct rb_node rb_node; /* keyed by ->id */
> +       union {
> +               struct rb_node rb_node; /* keyed by ->id */
> +               struct rcu_head rcu;
> +       };
>          struct list_head bdi_list;
>          unsigned long ra_pages; /* max readahead in PAGE_SIZE units */
>          unsigned long io_pages; /* max allowed IO size */
>
>
> Christoph, independent of the inode lifetime problem, this actually seems
> like a good approach to take.  I don't see why we should synchronize_rcu()
> here?  Adding Jens (original introducer of the synchronize_rcu()), Mikulas
> (converted it to use _expedited) and Tejun (worked around a problem when
> using _expedited).

Sorry,this my mistake.   this problem and the inode lifetime cycle are 
two different problems

Can this modification which use kfree_rcu() instead of synchronize_rcu() 
be accepted?


Thanks

Zqiang



^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-10-18  2:16 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-14  8:24 [PATCH] mm: backing-dev: use kfree_rcu() instead of synchronize_rcu_expedited() Zqiang
2021-10-14 11:24 ` Matthew Wilcox
2021-10-15  3:39   ` zhangqiang
     [not found]   ` <CALm+0cUt7iD5zex4-hRv=i1wPd66tz3JYGHz8P8ZFTZcyOCD1A@mail.gmail.com>
     [not found]     ` <d697d61e-27a2-a25c-3ae1-e41457d08705@gmail.com>
2021-10-15 12:35       ` Matthew Wilcox
2021-10-15 13:19         ` Christoph Hellwig
2021-10-18  2:15         ` Zqiang
2021-10-14 12:06 ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).