All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock
@ 2019-07-23 15:00 Carlos Maiolino
  2019-07-23 15:11 ` Darrick J. Wong
  2019-07-23 15:13 ` Carlos Maiolino
  0 siblings, 2 replies; 9+ messages in thread
From: Carlos Maiolino @ 2019-07-23 15:00 UTC (permalink / raw)
  To: linux-xfs

xfs_extent_busy_clear_one() calls kmem_free() with the pag spinlock
locked.

Fix this by adding a new temporary list, and, make
xfs_extent_busy_clear_one() to move the extent_busy items to this new
list, instead of freeing them.

Free the objects in the temporary list after we drop the pagb_lock

Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 fs/xfs/xfs_extent_busy.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
index 0ed68379e551..0a7dcf03340b 100644
--- a/fs/xfs/xfs_extent_busy.c
+++ b/fs/xfs/xfs_extent_busy.c
@@ -523,7 +523,8 @@ STATIC void
 xfs_extent_busy_clear_one(
 	struct xfs_mount	*mp,
 	struct xfs_perag	*pag,
-	struct xfs_extent_busy	*busyp)
+	struct xfs_extent_busy	*busyp,
+	struct list_head	*list)
 {
 	if (busyp->length) {
 		trace_xfs_extent_busy_clear(mp, busyp->agno, busyp->bno,
@@ -531,8 +532,7 @@ xfs_extent_busy_clear_one(
 		rb_erase(&busyp->rb_node, &pag->pagb_tree);
 	}
 
-	list_del_init(&busyp->list);
-	kmem_free(busyp);
+	list_move(&busyp->list, list);
 }
 
 static void
@@ -565,6 +565,7 @@ xfs_extent_busy_clear(
 	struct xfs_perag	*pag = NULL;
 	xfs_agnumber_t		agno = NULLAGNUMBER;
 	bool			wakeup = false;
+	LIST_HEAD(busy_list);
 
 	list_for_each_entry_safe(busyp, n, list, list) {
 		if (busyp->agno != agno) {
@@ -580,13 +581,18 @@ xfs_extent_busy_clear(
 		    !(busyp->flags & XFS_EXTENT_BUSY_SKIP_DISCARD)) {
 			busyp->flags = XFS_EXTENT_BUSY_DISCARDED;
 		} else {
-			xfs_extent_busy_clear_one(mp, pag, busyp);
+			xfs_extent_busy_clear_one(mp, pag, busyp, &busy_list);
 			wakeup = true;
 		}
 	}
 
 	if (pag)
 		xfs_extent_busy_put_pag(pag, wakeup);
+
+	list_for_each_entry_safe(busyp, n, &busy_list, list) {
+		list_del_init(&busyp->list);
+		kmem_free(busyp);
+	}
 }
 
 /*
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock
  2019-07-23 15:00 [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock Carlos Maiolino
@ 2019-07-23 15:11 ` Darrick J. Wong
  2019-07-23 15:31   ` Carlos Maiolino
  2019-07-23 15:13 ` Carlos Maiolino
  1 sibling, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2019-07-23 15:11 UTC (permalink / raw)
  To: Carlos Maiolino; +Cc: linux-xfs

On Tue, Jul 23, 2019 at 05:00:17PM +0200, Carlos Maiolino wrote:
> xfs_extent_busy_clear_one() calls kmem_free() with the pag spinlock
> locked.

Er, what problem does this solve?  Does holding on to the pag spinlock
too long while memory freeing causes everything else to stall?  When is
memory freeing slow enough to cause a noticeable impact?

> Fix this by adding a new temporary list, and, make
> xfs_extent_busy_clear_one() to move the extent_busy items to this new
> list, instead of freeing them.
> 
> Free the objects in the temporary list after we drop the pagb_lock
> 
> Reported-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
>  fs/xfs/xfs_extent_busy.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
> index 0ed68379e551..0a7dcf03340b 100644
> --- a/fs/xfs/xfs_extent_busy.c
> +++ b/fs/xfs/xfs_extent_busy.c
> @@ -523,7 +523,8 @@ STATIC void
>  xfs_extent_busy_clear_one(
>  	struct xfs_mount	*mp,
>  	struct xfs_perag	*pag,
> -	struct xfs_extent_busy	*busyp)
> +	struct xfs_extent_busy	*busyp,
> +	struct list_head	*list)
>  {
>  	if (busyp->length) {
>  		trace_xfs_extent_busy_clear(mp, busyp->agno, busyp->bno,
> @@ -531,8 +532,7 @@ xfs_extent_busy_clear_one(
>  		rb_erase(&busyp->rb_node, &pag->pagb_tree);
>  	}
>  
> -	list_del_init(&busyp->list);
> -	kmem_free(busyp);
> +	list_move(&busyp->list, list);
>  }
>  
>  static void
> @@ -565,6 +565,7 @@ xfs_extent_busy_clear(
>  	struct xfs_perag	*pag = NULL;
>  	xfs_agnumber_t		agno = NULLAGNUMBER;
>  	bool			wakeup = false;
> +	LIST_HEAD(busy_list);
>  
>  	list_for_each_entry_safe(busyp, n, list, list) {
>  		if (busyp->agno != agno) {
> @@ -580,13 +581,18 @@ xfs_extent_busy_clear(
>  		    !(busyp->flags & XFS_EXTENT_BUSY_SKIP_DISCARD)) {
>  			busyp->flags = XFS_EXTENT_BUSY_DISCARDED;
>  		} else {
> -			xfs_extent_busy_clear_one(mp, pag, busyp);
> +			xfs_extent_busy_clear_one(mp, pag, busyp, &busy_list);

...and why not just put the busyp on the busy_list here so you don't
have to pass the list_head pointer around?

--D

>  			wakeup = true;
>  		}
>  	}
>  
>  	if (pag)
>  		xfs_extent_busy_put_pag(pag, wakeup);
> +
> +	list_for_each_entry_safe(busyp, n, &busy_list, list) {
> +		list_del_init(&busyp->list);
> +		kmem_free(busyp);
> +	}
>  }
>  
>  /*
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock
  2019-07-23 15:00 [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock Carlos Maiolino
  2019-07-23 15:11 ` Darrick J. Wong
@ 2019-07-23 15:13 ` Carlos Maiolino
  1 sibling, 0 replies; 9+ messages in thread
From: Carlos Maiolino @ 2019-07-23 15:13 UTC (permalink / raw)
  To: linux-xfs; +Cc: jlayton

I forgot to Cc Jeff on the patch.

Sorry Jeff, cc'ing now.

On Tue, Jul 23, 2019 at 05:00:17PM +0200, Carlos Maiolino wrote:
> xfs_extent_busy_clear_one() calls kmem_free() with the pag spinlock
> locked.
> 
> Fix this by adding a new temporary list, and, make
> xfs_extent_busy_clear_one() to move the extent_busy items to this new
> list, instead of freeing them.
> 
> Free the objects in the temporary list after we drop the pagb_lock
> 
> Reported-by: Jeff Layton <jlayton@kernel.org>
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
>  fs/xfs/xfs_extent_busy.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
> index 0ed68379e551..0a7dcf03340b 100644
> --- a/fs/xfs/xfs_extent_busy.c
> +++ b/fs/xfs/xfs_extent_busy.c
> @@ -523,7 +523,8 @@ STATIC void
>  xfs_extent_busy_clear_one(
>  	struct xfs_mount	*mp,
>  	struct xfs_perag	*pag,
> -	struct xfs_extent_busy	*busyp)
> +	struct xfs_extent_busy	*busyp,
> +	struct list_head	*list)
>  {
>  	if (busyp->length) {
>  		trace_xfs_extent_busy_clear(mp, busyp->agno, busyp->bno,
> @@ -531,8 +532,7 @@ xfs_extent_busy_clear_one(
>  		rb_erase(&busyp->rb_node, &pag->pagb_tree);
>  	}
>  
> -	list_del_init(&busyp->list);
> -	kmem_free(busyp);
> +	list_move(&busyp->list, list);
>  }
>  
>  static void
> @@ -565,6 +565,7 @@ xfs_extent_busy_clear(
>  	struct xfs_perag	*pag = NULL;
>  	xfs_agnumber_t		agno = NULLAGNUMBER;
>  	bool			wakeup = false;
> +	LIST_HEAD(busy_list);
>  
>  	list_for_each_entry_safe(busyp, n, list, list) {
>  		if (busyp->agno != agno) {
> @@ -580,13 +581,18 @@ xfs_extent_busy_clear(
>  		    !(busyp->flags & XFS_EXTENT_BUSY_SKIP_DISCARD)) {
>  			busyp->flags = XFS_EXTENT_BUSY_DISCARDED;
>  		} else {
> -			xfs_extent_busy_clear_one(mp, pag, busyp);
> +			xfs_extent_busy_clear_one(mp, pag, busyp, &busy_list);
>  			wakeup = true;
>  		}
>  	}
>  
>  	if (pag)
>  		xfs_extent_busy_put_pag(pag, wakeup);
> +
> +	list_for_each_entry_safe(busyp, n, &busy_list, list) {
> +		list_del_init(&busyp->list);
> +		kmem_free(busyp);
> +	}
>  }
>  
>  /*
> -- 
> 2.20.1
> 

-- 
Carlos

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock
  2019-07-23 15:11 ` Darrick J. Wong
@ 2019-07-23 15:31   ` Carlos Maiolino
  2019-07-23 15:51     ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Carlos Maiolino @ 2019-07-23 15:31 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, jlayton

On Tue, Jul 23, 2019 at 08:11:02AM -0700, Darrick J. Wong wrote:
> On Tue, Jul 23, 2019 at 05:00:17PM +0200, Carlos Maiolino wrote:
> > xfs_extent_busy_clear_one() calls kmem_free() with the pag spinlock
> > locked.
> 

CC'ing Jeff so he can maybe chime in too.


> Er, what problem does this solve?  Does holding on to the pag spinlock
> too long while memory freeing causes everything else to stall?  When is
> memory freeing slow enough to cause a noticeable impact?

Jeff detected it when using this patch:

https://marc.info/?l=linux-mm&m=156388753722881&w=2

At first I don't see any specific problem, but I don't think we are supposed to
use kmem_free() inside interrupt context anyway. So, even though there is no
visible side effect, it should be fixed IMHO. With the patch above, the side
effect is a bunch of warnings :P

> 
> > Fix this by adding a new temporary list, and, make
> > xfs_extent_busy_clear_one() to move the extent_busy items to this new
> > list, instead of freeing them.
> > 
> > Free the objects in the temporary list after we drop the pagb_lock
> > 
> > Reported-by: Jeff Layton <jlayton@kernel.org>
> > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> > ---
> >  fs/xfs/xfs_extent_busy.c | 14 ++++++++++----
> >  1 file changed, 10 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_extent_busy.c b/fs/xfs/xfs_extent_busy.c
> > index 0ed68379e551..0a7dcf03340b 100644
> > --- a/fs/xfs/xfs_extent_busy.c
> > +++ b/fs/xfs/xfs_extent_busy.c
> > @@ -523,7 +523,8 @@ STATIC void
> >  xfs_extent_busy_clear_one(
> >  	struct xfs_mount	*mp,
> >  	struct xfs_perag	*pag,
> > -	struct xfs_extent_busy	*busyp)
> > +	struct xfs_extent_busy	*busyp,
> > +	struct list_head	*list)
> >  {
> >  	if (busyp->length) {
> >  		trace_xfs_extent_busy_clear(mp, busyp->agno, busyp->bno,
> > @@ -531,8 +532,7 @@ xfs_extent_busy_clear_one(
> >  		rb_erase(&busyp->rb_node, &pag->pagb_tree);
> >  	}
> >  
> > -	list_del_init(&busyp->list);
> > -	kmem_free(busyp);
> > +	list_move(&busyp->list, list);
> >  }
> >  
> >  static void
> > @@ -565,6 +565,7 @@ xfs_extent_busy_clear(
> >  	struct xfs_perag	*pag = NULL;
> >  	xfs_agnumber_t		agno = NULLAGNUMBER;
> >  	bool			wakeup = false;
> > +	LIST_HEAD(busy_list);
> >  
> >  	list_for_each_entry_safe(busyp, n, list, list) {
> >  		if (busyp->agno != agno) {
> > @@ -580,13 +581,18 @@ xfs_extent_busy_clear(
> >  		    !(busyp->flags & XFS_EXTENT_BUSY_SKIP_DISCARD)) {
> >  			busyp->flags = XFS_EXTENT_BUSY_DISCARDED;
> >  		} else {
> > -			xfs_extent_busy_clear_one(mp, pag, busyp);
> > +			xfs_extent_busy_clear_one(mp, pag, busyp, &busy_list);
> 
> ...and why not just put the busyp on the busy_list here so you don't
> have to pass the list_head pointer around?

Just left it inside _clear_one to keep manipulation of busyp in a single place.
We already remove it from the rb tree there, so, remove it from the extent
busy list also there, just looked clear to do all the cleanup in the same place.

> 
> --D
> 
> >  			wakeup = true;
> >  		}
> >  	}
> >  
> >  	if (pag)
> >  		xfs_extent_busy_put_pag(pag, wakeup);
> > +
> > +	list_for_each_entry_safe(busyp, n, &busy_list, list) {
> > +		list_del_init(&busyp->list);
> > +		kmem_free(busyp);
> > +	}
> >  }
> >  
> >  /*
> > -- 
> > 2.20.1
> > 

-- 
Carlos

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock
  2019-07-23 15:31   ` Carlos Maiolino
@ 2019-07-23 15:51     ` Christoph Hellwig
  2019-07-23 17:07       ` Jeff Layton
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2019-07-23 15:51 UTC (permalink / raw)
  To: Darrick J. Wong, linux-xfs, jlayton

On Tue, Jul 23, 2019 at 05:31:33PM +0200, Carlos Maiolino wrote:
> CC'ing Jeff so he can maybe chime in too.
> 
> 
> > Er, what problem does this solve?  Does holding on to the pag spinlock
> > too long while memory freeing causes everything else to stall?  When is
> > memory freeing slow enough to cause a noticeable impact?
> 
> Jeff detected it when using this patch:
> 
> https://marc.info/?l=linux-mm&m=156388753722881&w=2
> 
> At first I don't see any specific problem, but I don't think we are supposed to
> use kmem_free() inside interrupt context anyway. So, even though there is no
> visible side effect, it should be fixed IMHO. With the patch above, the side
> effect is a bunch of warnings :P

This is going to break lots of places in xfs.  While we have separate
allocation side wrappers for plain kmalloc vs using a vmalloc fallback we
always use the same free side wrapper.  We could fix this by adding a
kmem_free_large and switch all places that allocated using
kmem_alloc_large to that, but it will require a bit of work.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock
  2019-07-23 15:51     ` Christoph Hellwig
@ 2019-07-23 17:07       ` Jeff Layton
  2019-07-23 17:08         ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff Layton @ 2019-07-23 17:07 UTC (permalink / raw)
  To: Christoph Hellwig, Darrick J. Wong, linux-xfs; +Cc: Al Viro

On Tue, 2019-07-23 at 08:51 -0700, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 05:31:33PM +0200, Carlos Maiolino wrote:
> > CC'ing Jeff so he can maybe chime in too.
> > 
> > 
> > > Er, what problem does this solve?  Does holding on to the pag spinlock
> > > too long while memory freeing causes everything else to stall?  When is
> > > memory freeing slow enough to cause a noticeable impact?
> > 
> > Jeff detected it when using this patch:
> > 
> > https://marc.info/?l=linux-mm&m=156388753722881&w=2
> > 
> > At first I don't see any specific problem, but I don't think we are supposed to
> > use kmem_free() inside interrupt context anyway. So, even though there is no
> > visible side effect, it should be fixed IMHO. With the patch above, the side
> > effect is a bunch of warnings :P
> 
> This is going to break lots of places in xfs.  While we have separate
> allocation side wrappers for plain kmalloc vs using a vmalloc fallback we
> always use the same free side wrapper.  We could fix this by adding a
> kmem_free_large and switch all places that allocated using
> kmem_alloc_large to that, but it will require a bit of work.

(cc'ing Al)

Note that those places are already broken. AIUI, the basic issue is that
vmalloc/vfree have to fix up page tables and that requires being able to
sleep. This patch just makes this situation more evident. If that patch
gets merged, I imagine we'll have a lot of places to clean up (not just
in xfs).

Anyway, in the case of being in an interrupt, we currently queue the
freeing to a workqueue. Al mentioned that we could create a new
kvfree_atomic that we could use from atomic contexts like this. That may
be another option (though Carlos' patch looked reasonable to me and
would probably be more efficient).
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock
  2019-07-23 17:07       ` Jeff Layton
@ 2019-07-23 17:08         ` Christoph Hellwig
  2019-07-23 17:38           ` Jeff Layton
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2019-07-23 17:08 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, Darrick J. Wong, linux-xfs, Al Viro

On Tue, Jul 23, 2019 at 01:07:00PM -0400, Jeff Layton wrote:
> Note that those places are already broken. AIUI, the basic issue is that
> vmalloc/vfree have to fix up page tables and that requires being able to
> sleep. This patch just makes this situation more evident. If that patch
> gets merged, I imagine we'll have a lot of places to clean up (not just
> in xfs).
> 
> Anyway, in the case of being in an interrupt, we currently queue the
> freeing to a workqueue. Al mentioned that we could create a new
> kvfree_atomic that we could use from atomic contexts like this. That may
> be another option (though Carlos' patch looked reasonable to me and
> would probably be more efficient).

The point is for XFS we generally only use kmem_free for pure kmalloc
allocations under spinlocks.  But yes, the interfac is a little
suboptimal and a kmem_free_large would be nicer and then warnings like
this that might be pretty useful could be added.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock
  2019-07-23 17:08         ` Christoph Hellwig
@ 2019-07-23 17:38           ` Jeff Layton
  2019-07-23 17:41             ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Jeff Layton @ 2019-07-23 17:38 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Darrick J. Wong, linux-xfs, Al Viro

On Tue, 2019-07-23 at 10:08 -0700, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 01:07:00PM -0400, Jeff Layton wrote:
> > Note that those places are already broken. AIUI, the basic issue is that
> > vmalloc/vfree have to fix up page tables and that requires being able to
> > sleep. This patch just makes this situation more evident. If that patch
> > gets merged, I imagine we'll have a lot of places to clean up (not just
> > in xfs).
> > 
> > Anyway, in the case of being in an interrupt, we currently queue the
> > freeing to a workqueue. Al mentioned that we could create a new
> > kvfree_atomic that we could use from atomic contexts like this. That may
> > be another option (though Carlos' patch looked reasonable to me and
> > would probably be more efficient).
> 
> The point is for XFS we generally only use kmem_free for pure kmalloc
> allocations under spinlocks.  But yes, the interfac is a little
> suboptimal and a kmem_free_large would be nicer and then warnings like
> this that might be pretty useful could be added.

Ahh ok, I get it now. You're using it as a generic "free this, no matter
what it is" wrapper, and relying on the caller to ensure that it will
never try to free a vmalloc'ed addr from an atomic context.

I wonder how many other places are doing that? I count 858 call sites
for kvfree. If significant portion of those are doing this, then we may
have to re-think my patch. It seems like the right thing to do, but we
there may be more fallout than I expected.
-- 
Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock
  2019-07-23 17:38           ` Jeff Layton
@ 2019-07-23 17:41             ` Christoph Hellwig
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2019-07-23 17:41 UTC (permalink / raw)
  To: Jeff Layton; +Cc: Christoph Hellwig, Darrick J. Wong, linux-xfs, Al Viro

On Tue, Jul 23, 2019 at 01:38:08PM -0400, Jeff Layton wrote:
> Ahh ok, I get it now. You're using it as a generic "free this, no matter
> what it is" wrapper, and relying on the caller to ensure that it will
> never try to free a vmalloc'ed addr from an atomic context.
> 
> I wonder how many other places are doing that? I count 858 call sites
> for kvfree. If significant portion of those are doing this, then we may
> have to re-think my patch. It seems like the right thing to do, but we
> there may be more fallout than I expected.

For xfs we only have 4 direct callers of kmem_alloc_large, and 8 callers
of kmem_zalloc_large, so it they aren't too many, even assuming that due
to error handling we usually have a few more sites that free the
buffers.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-07-23 17:41 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-23 15:00 [PATCH] xfs: Do not free xfs_extent_busy from inside a spinlock Carlos Maiolino
2019-07-23 15:11 ` Darrick J. Wong
2019-07-23 15:31   ` Carlos Maiolino
2019-07-23 15:51     ` Christoph Hellwig
2019-07-23 17:07       ` Jeff Layton
2019-07-23 17:08         ` Christoph Hellwig
2019-07-23 17:38           ` Jeff Layton
2019-07-23 17:41             ` Christoph Hellwig
2019-07-23 15:13 ` Carlos Maiolino

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.