linux-xfs.vger.kernel.org archive mirror
* [PATCH] xfs: fix bogus space reservation in xfs_iomap_write_allocate
@ 2016-08-03 17:33 Christoph Hellwig
  2016-08-05  0:03 ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2016-08-03 17:33 UTC
  To: xfs

The space reservation was added without an explanation in commit

    "Add error reporting calls in error paths that return EFSCORRUPTED"

back in 2003.  There is no reason to reserve disk blocks in the
transaction when allocating blocks for delalloc space as we already
reserved the space when creating the delalloc extent.
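
As a rough illustration of the accounting argument above, here is a toy
model in C.  It is a sketch under stated assumptions, not kernel code: the
struct and the toy_* helpers are hypothetical names, and the model ignores
the reserved pool and XFS_TRANS_RESERVE entirely.  It only shows why
asking for blocks a second time at conversion can fail even though the
space was already set aside when the delalloc extent was created.

```c
#include <assert.h>

/* Toy model only: none of these names are kernel symbols. */
struct toy_fs {
	long free_blocks;	/* blocks not yet promised to anyone */
};

/* Take nblocks out of the free pool; 0 on success, -1 if short. */
static int toy_reserve(struct toy_fs *fs, long nblocks)
{
	if (nblocks > fs->free_blocks)
		return -1;
	fs->free_blocks -= nblocks;
	return 0;
}

/* Buffered write: set aside data blocks plus worst-case indirect blocks. */
static int toy_delalloc_create(struct toy_fs *fs, long data, long indirect)
{
	return toy_reserve(fs, data + indirect);
}

/*
 * Writeback: convert the delalloc extent into a real one.  extra_res
 * stands in for the per-transaction block reservation that the patch
 * sets to 0.
 */
static int toy_delalloc_convert(struct toy_fs *fs, long extra_res)
{
	if (toy_reserve(fs, extra_res) != 0)
		return -1;
	/* The blocks written come out of the earlier delalloc reservation. */
	fs->free_blocks += extra_res;	/* unused extra goes back on commit */
	return 0;
}
```

Starting from 10 free blocks, creating a delalloc extent of 8 data plus 2
indirect blocks leaves 0 free.  In this model a conversion that asks for 1
extra block then fails, while one that asks for 0 succeeds.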

With this fix we stop running out of the reserved pool in generic/229,
which has happened for a long time with small blocksize file systems,
and has increased in severity with the new buffered write path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_iomap.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 2114d53..279353c 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -691,7 +691,6 @@ xfs_iomap_write_allocate(
 	xfs_trans_t	*tp;
 	int		nimaps;
 	int		error = 0;
-	int		nres;
 
 	/*
 	 * Make sure that the dquots are there.
@@ -715,12 +714,15 @@ xfs_iomap_write_allocate(
 		 * is in the delayed allocation extent on which we sit
 		 * but before our buffer starts.
 		 */
-
 		nimaps = 0;
 		while (nimaps == 0) {
-			nres = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
-
-			error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, nres,
+			/*
+			 * We have already reserved space for the extent and any
+			 * indirect blocks when creating the delalloc extent,
+			 * there is no need to reserve space in this transaction
+			 * again.
+			 */
+			error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, 0,
 					0, XFS_TRANS_RESERVE, &tp);
 			if (error)
 				return error;
@@ -783,7 +785,7 @@ xfs_iomap_write_allocate(
 			 */
 			error = xfs_bmapi_write(tp, ip, map_start_fsb,
 						count_fsb, 0, &first_block,
-						nres, imap, &nimaps,
+						0, imap, &nimaps,
 						&dfops);
 			if (error)
 				goto trans_cancel;
-- 
2.1.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* Re: [PATCH] xfs: fix bogus space reservation in xfs_iomap_write_allocate
  2016-08-03 17:33 [PATCH] xfs: fix bogus space reservation in xfs_iomap_write_allocate Christoph Hellwig
@ 2016-08-05  0:03 ` Dave Chinner
  2016-08-11 16:03   ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2016-08-05  0:03 UTC
  To: Christoph Hellwig; +Cc: xfs

On Wed, Aug 03, 2016 at 07:33:06PM +0200, Christoph Hellwig wrote:
> The space reservation was added without an explanation in commit
> 
>     "Add error reporting calls in error paths that return EFSCORRUPTED"
> 
> back in 2003.  There is no reason to reserve disk blocks in the
> transaction when allocating blocks for delalloc space as we already
> reserved the space when creating the delalloc extent.
> 
> With this fix we stop running out of the reserved pool in generic/229,
> which has happened for a long time with small blocksize file systems,
> and has increased in severity with the new buffered write path.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_iomap.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 2114d53..279353c 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -691,7 +691,6 @@ xfs_iomap_write_allocate(
>  	xfs_trans_t	*tp;
>  	int		nimaps;
>  	int		error = 0;
> -	int		nres;
>  
>  	/*
>  	 * Make sure that the dquots are there.
> @@ -715,12 +714,15 @@ xfs_iomap_write_allocate(
>  		 * is in the delayed allocation extent on which we sit
>  		 * but before our buffer starts.
>  		 */
> -
>  		nimaps = 0;
>  		while (nimaps == 0) {
> -			nres = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
> -
> -			error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, nres,
> +			/*
> +			 * We have already reserved space for the extent and any
> +			 * indirect blocks when creating the delalloc extent,
> +			 * there is no need to reserve space in this transaction
> +			 * again.
> +			 */
> +			error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, 0,
>  					0, XFS_TRANS_RESERVE, &tp);
>  			if (error)
>  				return error;
> @@ -783,7 +785,7 @@ xfs_iomap_write_allocate(
>  			 */
>  			error = xfs_bmapi_write(tp, ip, map_start_fsb,
>  						count_fsb, 0, &first_block,
> -						nres, imap, &nimaps,
> +						0, imap, &nimaps,
>  						&dfops);

I don't think this part of the fix is correct. nres feeds into
args->total which is then used during the AGFL fixup checks. If this
is not set correctly, then we'll select AGs where we have enough space
to fix up the AGFL, but not enough space to allocate all the
BMBT blocks we require. That then leads to ABBA deadlocks on AGF
locks near ENOSPC - see commit dbd5c8c ("xfs: pass total block
res. as total xfs_bmapi_write() parameter") for the full details.
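
To make the AG selection concern concrete, here is a deliberately
simplified sketch in C.  It is not the kernel's allocator: struct toy_ag
and toy_pick_ag() are hypothetical names, and the single comparison below
is only an assumed stand-in for the space checks done during AGFL fixup.
The "total" parameter plays the role of args->total.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are illustrative, not kernel symbols. */
struct toy_ag {
	long free_blocks;	/* free space left in this allocation group */
	long agfl_need;		/* blocks needed to refill the AG free list */
};

/*
 * Return the index of the first AG that can cover the free list refill
 * plus "total" further blocks, or -1 if no AG can.
 */
static int toy_pick_ag(const struct toy_ag *ags, size_t count, long total)
{
	for (size_t i = 0; i < count; i++) {
		if (ags[i].free_blocks >= ags[i].agfl_need + total)
			return (int)i;
	}
	return -1;
}
```

With two groups, the first holding 4 free blocks and needing 4 for the
free list, the second holding 20 and needing 4, a total of 0 picks the
first group even though nothing would be left for BMBT blocks.  A total of
5 skips it and picks the second.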

Since you pointed out the problem I've been testing a local version
of this fix that still passes nres into xfs_bmapi_write(), and I
haven't seen any problems, so I think it is correct to keep nres
here. I'm going to drop this hunk from this patch for the moment in
my tree.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [PATCH] xfs: fix bogus space reservation in xfs_iomap_write_allocate
  2016-08-05  0:03 ` Dave Chinner
@ 2016-08-11 16:03   ` Christoph Hellwig
  0 siblings, 0 replies; 3+ messages in thread
From: Christoph Hellwig @ 2016-08-11 16:03 UTC
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs

On Fri, Aug 05, 2016 at 10:03:54AM +1000, Dave Chinner wrote:
> I don't think this part of the fix is correct. nres feeds into
> args->total which is then used during the AGFL fixup checks. If this
> is not set correctly, then we'll select AGs where we have enough space
> to fix up the AGFL, but not enough space to allocate all the
> BMBT blocks we require. That then leads to ABBA deadlocks on AGF
> locks near ENOSPC - see commit dbd5c8c ("xfs: pass total block
> res. as total xfs_bmapi_write() parameter") for the full details.

I've been going back and forth between both versions and both have
tested fine - I couldn't really convince myself which one is more correct.

> Since you pointed out the problem I've been testing a local version
> of this fix that still passes nres into xfs_bmapi_write(), and I
> haven't seen any problems, so I think it is correct to keep nres
> here. I'm going to drop this hunk from this patch for the moment in
> my tree.

Ok, sounds fine.  If you want a real resend let me know.
