All of lore.kernel.org
 help / color / mirror / Atom feed
* minleft fixes V2
@ 2016-12-22 20:00 Christoph Hellwig
  2016-12-22 20:00 ` [PATCH 1/5] xfs: bump up reserved blocks in xfs_alloc_set_aside Christoph Hellwig
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Christoph Hellwig @ 2016-12-22 20:00 UTC (permalink / raw)
  To: linux-xfs; +Cc: eguan, darrick.wong

his is my attempt to fix the problems with rmap and reflink file system
running out of space during delayed extent conversions.  It turns out
the way minleft has been handled has always been bogus, but the rmap
enablement made it even worse..

This has survived xfstests xfs/109 on a 2k fs for a couple hours now
but Eryu was much better than me at reproducing the problems, so it could
use some further testing.

Changes since V1:
 - new patch to fix the noalign case in xfs_bmap_btalloc
 - new patch to fix xfs_alloc_set_aside
 - dropped the patch to increase the minleft value for bmap allocations,
   we only need the current minimal values as we only allocate one
   extent per call
 - minor style fixes per review of the last round

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/5] xfs: bump up reserved blocks in xfs_alloc_set_aside
  2016-12-22 20:00 minleft fixes V2 Christoph Hellwig
@ 2016-12-22 20:00 ` Christoph Hellwig
  2017-01-04 14:33   ` Brian Foster
  2016-12-22 20:00 ` [PATCH 2/5] xfs: fix the alignment fallback in xfs_bmap_btalloc Christoph Hellwig
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2016-12-22 20:00 UTC (permalink / raw)
  To: linux-xfs; +Cc: eguan, darrick.wong

Setting aside 4 blocks globally for bmbt splits isn't all that useful,
as different threads can allocate space in parallel.  Bump it to 4
blocks per AG to allow each thread that is currently doing an
allocation to dip into it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_alloc.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 5050056..0a46f84 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -95,10 +95,7 @@ unsigned int
 xfs_alloc_set_aside(
 	struct xfs_mount	*mp)
 {
-	unsigned int		blocks;
-
-	blocks = 4 + (mp->m_sb.sb_agcount * XFS_ALLOC_AGFL_RESERVE);
-	return blocks;
+	return mp->m_sb.sb_agcount * (XFS_ALLOC_AGFL_RESERVE + 4);
 }
 
 /*
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/5] xfs: fix the alignment fallback in xfs_bmap_btalloc
  2016-12-22 20:00 minleft fixes V2 Christoph Hellwig
  2016-12-22 20:00 ` [PATCH 1/5] xfs: bump up reserved blocks in xfs_alloc_set_aside Christoph Hellwig
@ 2016-12-22 20:00 ` Christoph Hellwig
  2017-01-04 14:34   ` Brian Foster
  2016-12-22 20:00 ` [PATCH 3/5] xfs: fix bogus minleft manipulations Christoph Hellwig
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2016-12-22 20:00 UTC (permalink / raw)
  To: linux-xfs; +Cc: eguan, darrick.wong

The way the alignment field is used the minimum possible alignment is 1.
By setting it to 0 we'll mess up calculations that rely on this.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_bmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 2760bc3..19c05e9 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3796,7 +3796,7 @@ xfs_bmap_btalloc(
 		 */
 		args.type = atype;
 		args.fsbno = ap->blkno;
-		args.alignment = 0;
+		args.alignment = 1;
 		if ((error = xfs_alloc_vextent(&args)))
 			return error;
 	}
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/5] xfs: fix bogus minleft manipulations
  2016-12-22 20:00 minleft fixes V2 Christoph Hellwig
  2016-12-22 20:00 ` [PATCH 1/5] xfs: bump up reserved blocks in xfs_alloc_set_aside Christoph Hellwig
  2016-12-22 20:00 ` [PATCH 2/5] xfs: fix the alignment fallback in xfs_bmap_btalloc Christoph Hellwig
@ 2016-12-22 20:00 ` Christoph Hellwig
  2017-01-04 18:19   ` Brian Foster
  2016-12-22 20:00 ` [PATCH 4/5] xfs: adjust allocation length in xfs_alloc_space_available Christoph Hellwig
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2016-12-22 20:00 UTC (permalink / raw)
  To: linux-xfs; +Cc: eguan, darrick.wong

We can't just set minleft to 0 when we're low on space - that's exactly
what we need minleft for: to protect space in the AG for btree block
allocations when we are low on free space.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_alloc.c      | 24 +++++++-----------------
 fs/xfs/libxfs/xfs_bmap.c       |  3 ---
 fs/xfs/libxfs/xfs_bmap_btree.c | 14 --------------
 3 files changed, 7 insertions(+), 34 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 0a46f84..fe92570 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2635,12 +2635,10 @@ xfs_alloc_vextent(
 	xfs_agblock_t	agsize;	/* allocation group size */
 	int		error;
 	int		flags;	/* XFS_ALLOC_FLAG_... locking flags */
-	xfs_extlen_t	minleft;/* minimum left value, temp copy */
 	xfs_mount_t	*mp;	/* mount structure pointer */
 	xfs_agnumber_t	sagno;	/* starting allocation group number */
 	xfs_alloctype_t	type;	/* input allocation type */
 	int		bump_rotor = 0;
-	int		no_min = 0;
 	xfs_agnumber_t	rotorstep = xfs_rotorstep; /* inode32 agf stepper */
 
 	mp = args->mp;
@@ -2669,7 +2667,6 @@ xfs_alloc_vextent(
 		trace_xfs_alloc_vextent_badargs(args);
 		return 0;
 	}
-	minleft = args->minleft;
 
 	switch (type) {
 	case XFS_ALLOCTYPE_THIS_AG:
@@ -2680,9 +2677,7 @@ xfs_alloc_vextent(
 		 */
 		args->agno = XFS_FSB_TO_AGNO(mp, args->fsbno);
 		args->pag = xfs_perag_get(mp, args->agno);
-		args->minleft = 0;
 		error = xfs_alloc_fix_freelist(args, 0);
-		args->minleft = minleft;
 		if (error) {
 			trace_xfs_alloc_vextent_nofix(args);
 			goto error0;
@@ -2747,9 +2742,7 @@ xfs_alloc_vextent(
 		 */
 		for (;;) {
 			args->pag = xfs_perag_get(mp, args->agno);
-			if (no_min) args->minleft = 0;
 			error = xfs_alloc_fix_freelist(args, flags);
-			args->minleft = minleft;
 			if (error) {
 				trace_xfs_alloc_vextent_nofix(args);
 				goto error0;
@@ -2789,20 +2782,17 @@ xfs_alloc_vextent(
 			 * or switch to non-trylock mode.
 			 */
 			if (args->agno == sagno) {
-				if (no_min == 1) {
+				if (flags == 0) {
 					args->agbno = NULLAGBLOCK;
 					trace_xfs_alloc_vextent_allfailed(args);
 					break;
 				}
-				if (flags == 0) {
-					no_min = 1;
-				} else {
-					flags = 0;
-					if (type == XFS_ALLOCTYPE_START_BNO) {
-						args->agbno = XFS_FSB_TO_AGBNO(mp,
-							args->fsbno);
-						args->type = XFS_ALLOCTYPE_NEAR_BNO;
-					}
+
+				flags = 0;
+				if (type == XFS_ALLOCTYPE_START_BNO) {
+					args->agbno = XFS_FSB_TO_AGBNO(mp,
+						args->fsbno);
+					args->type = XFS_ALLOCTYPE_NEAR_BNO;
 				}
 			}
 			xfs_perag_put(args->pag);
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 19c05e9..62663a2 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3812,7 +3812,6 @@ xfs_bmap_btalloc(
 		args.fsbno = 0;
 		args.type = XFS_ALLOCTYPE_FIRST_AG;
 		args.total = ap->minlen;
-		args.minleft = 0;
 		if ((error = xfs_alloc_vextent(&args)))
 			return error;
 		ap->dfops->dop_low = true;
@@ -4344,8 +4343,6 @@ xfs_bmapi_allocate(
 	if (error)
 		return error;
 
-	if (bma->dfops->dop_low)
-		bma->minleft = 0;
 	if (bma->cur)
 		bma->cur->bc_private.b.firstblock = *bma->firstblock;
 	if (bma->blkno == NULLFSBLOCK)
diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
index d6330c2..9581ee8 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.c
+++ b/fs/xfs/libxfs/xfs_bmap_btree.c
@@ -499,20 +499,6 @@ xfs_bmbt_alloc_block(
 		goto try_another_ag;
 	}
 
-	if (args.fsbno == NULLFSBLOCK && args.minleft) {
-		/*
-		 * Could not find an AG with enough free space to satisfy
-		 * a full btree split.  Try again without minleft and if
-		 * successful activate the lowspace algorithm.
-		 */
-		args.fsbno = 0;
-		args.type = XFS_ALLOCTYPE_FIRST_AG;
-		args.minleft = 0;
-		error = xfs_alloc_vextent(&args);
-		if (error)
-			goto error0;
-		cur->bc_private.b.dfops->dop_low = true;
-	}
 	if (args.fsbno == NULLFSBLOCK) {
 		XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
 		*stat = 0;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 4/5] xfs: adjust allocation length in xfs_alloc_space_available
  2016-12-22 20:00 minleft fixes V2 Christoph Hellwig
                   ` (2 preceding siblings ...)
  2016-12-22 20:00 ` [PATCH 3/5] xfs: fix bogus minleft manipulations Christoph Hellwig
@ 2016-12-22 20:00 ` Christoph Hellwig
  2017-01-04 18:19   ` Brian Foster
  2016-12-22 20:00 ` [PATCH 5/5] xfs: don't rely on ->total " Christoph Hellwig
  2017-01-05  1:21 ` minleft fixes V2 Eryu Guan
  5 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2016-12-22 20:00 UTC (permalink / raw)
  To: linux-xfs; +Cc: eguan, darrick.wong

We must decide in xfs_alloc_fix_freelist if we can perform an
allocation from a given AG is possible or not based on the available
space, and should not fail the allocation past that point on a
healthy file system.

But currently we have two additional places that second-guess
xfs_alloc_fix_freelist: xfs_alloc_ag_vextent tries to adjust the
maxlen parameter to remove the reservation before doing the
allocation (but ignores the various minium freespace requirements),
and xfs_alloc_fix_minleft tries to fix up the allocated length
after we've found an extent, but ignores the reservations and also
doesn't take the AGFL into account (and thus fails allocations
for not matching minlen in some cases).

Remove all these later fixups and just correct the maxlen argument
inside xfs_alloc_fix_freelist once we have the AGF buffer locked.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_alloc.c | 80 ++++++++++-------------------------------------
 fs/xfs/libxfs/xfs_alloc.h |  2 +-
 2 files changed, 17 insertions(+), 65 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index fe92570..6a10aab 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -362,36 +362,12 @@ xfs_alloc_fix_len(
 		return;
 	ASSERT(rlen >= args->minlen && rlen <= args->maxlen);
 	ASSERT(rlen % args->prod == args->mod);
+	ASSERT(args->pag->pagf_freeblks + args->pag->pagf_flcount >=
+		rlen + args->minleft);
 	args->len = rlen;
 }
 
 /*
- * Fix up length if there is too little space left in the a.g.
- * Return 1 if ok, 0 if too little, should give up.
- */
-STATIC int
-xfs_alloc_fix_minleft(
-	xfs_alloc_arg_t	*args)		/* allocation argument structure */
-{
-	xfs_agf_t	*agf;		/* a.g. freelist header */
-	int		diff;		/* free space difference */
-
-	if (args->minleft == 0)
-		return 1;
-	agf = XFS_BUF_TO_AGF(args->agbp);
-	diff = be32_to_cpu(agf->agf_freeblks)
-		- args->len - args->minleft;
-	if (diff >= 0)
-		return 1;
-	args->len += diff;		/* shrink the allocated space */
-	/* casts to (int) catch length underflows */
-	if ((int)args->len >= (int)args->minlen)
-		return 1;
-	args->agbno = NULLAGBLOCK;
-	return 0;
-}
-
-/*
  * Update the two btrees, logically removing from freespace the extent
  * starting at rbno, rlen blocks.  The extent is contained within the
  * actual (current) free extent fbno for flen blocks.
@@ -686,8 +662,6 @@ xfs_alloc_ag_vextent(
 	xfs_alloc_arg_t	*args)	/* argument structure for allocation */
 {
 	int		error=0;
-	xfs_extlen_t	reservation;
-	xfs_extlen_t	oldmax;
 
 	ASSERT(args->minlen > 0);
 	ASSERT(args->maxlen > 0);
@@ -696,20 +670,6 @@ xfs_alloc_ag_vextent(
 	ASSERT(args->alignment > 0);
 
 	/*
-	 * Clamp maxlen to the amount of free space minus any reservations
-	 * that have been made.
-	 */
-	oldmax = args->maxlen;
-	reservation = xfs_ag_resv_needed(args->pag, args->resv);
-	if (args->maxlen > args->pag->pagf_freeblks - reservation)
-		args->maxlen = args->pag->pagf_freeblks - reservation;
-	if (args->maxlen == 0) {
-		args->agbno = NULLAGBLOCK;
-		args->maxlen = oldmax;
-		return 0;
-	}
-
-	/*
 	 * Branch to correct routine based on the type.
 	 */
 	args->wasfromfl = 0;
@@ -728,8 +688,6 @@ xfs_alloc_ag_vextent(
 		/* NOTREACHED */
 	}
 
-	args->maxlen = oldmax;
-
 	if (error || args->agbno == NULLAGBLOCK)
 		return error;
 
@@ -838,9 +796,6 @@ xfs_alloc_ag_vextent_exact(
 	args->len = XFS_AGBLOCK_MIN(tend, args->agbno + args->maxlen)
 						- args->agbno;
 	xfs_alloc_fix_len(args);
-	if (!xfs_alloc_fix_minleft(args))
-		goto not_found;
-
 	ASSERT(args->agbno + args->len <= tend);
 
 	/*
@@ -1146,12 +1101,7 @@ xfs_alloc_ag_vextent_near(
 		XFS_WANT_CORRUPTED_GOTO(args->mp, i == 1, error0);
 		ASSERT(ltbno + ltlen <= be32_to_cpu(XFS_BUF_TO_AGF(args->agbp)->agf_length));
 		args->len = blen;
-		if (!xfs_alloc_fix_minleft(args)) {
-			xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR);
-			trace_xfs_alloc_near_nominleft(args);
-			return 0;
-		}
-		blen = args->len;
+
 		/*
 		 * We are allocating starting at bnew for blen blocks.
 		 */
@@ -1343,12 +1293,6 @@ xfs_alloc_ag_vextent_near(
 	 */
 	args->len = XFS_EXTLEN_MIN(ltlena, args->maxlen);
 	xfs_alloc_fix_len(args);
-	if (!xfs_alloc_fix_minleft(args)) {
-		trace_xfs_alloc_near_nominleft(args);
-		xfs_btree_del_cursor(bno_cur_lt, XFS_BTREE_NOERROR);
-		xfs_btree_del_cursor(cnt_cur, XFS_BTREE_NOERROR);
-		return 0;
-	}
 	rlen = args->len;
 	(void)xfs_alloc_compute_diff(args->agbno, rlen, args->alignment,
 				     args->datatype, ltbnoa, ltlena, &ltnew);
@@ -1550,8 +1494,6 @@ xfs_alloc_ag_vextent_size(
 	}
 	xfs_alloc_fix_len(args);
 
-	if (!xfs_alloc_fix_minleft(args))
-		goto out_nominleft;
 	rlen = args->len;
 	XFS_WANT_CORRUPTED_GOTO(args->mp, rlen <= flen, error0);
 	/*
@@ -2070,10 +2012,19 @@ xfs_alloc_space_available(
 
 	/* do we have enough free space remaining for the allocation? */
 	available = (int)(pag->pagf_freeblks + pag->pagf_flcount -
-			  reservation - min_free - args->total);
-	if (available < (int)args->minleft || available <= 0)
+			  reservation - min_free - args->minleft);
+	if (available < (int)args->total)
 		return false;
 
+	/*
+	 * Clamp maxlen to the amount of free space available for the actual
+	 * extent allocation.
+	 */
+	if (available < (int)args->maxlen && !(flags & XFS_ALLOC_FLAG_CHECK)) {
+		args->maxlen = available;
+		ASSERT(args->maxlen > 0);
+	}
+
 	return true;
 }
 
@@ -2119,7 +2070,8 @@ xfs_alloc_fix_freelist(
 	}
 
 	need = xfs_alloc_min_freelist(mp, pag);
-	if (!xfs_alloc_space_available(args, need, flags))
+	if (!xfs_alloc_space_available(args, need, flags |
+			XFS_ALLOC_FLAG_CHECK))
 		goto out_agbp_relse;
 
 	/*
diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index 7c404a6..1d0f48a 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -56,7 +56,7 @@ typedef unsigned int xfs_alloctype_t;
 #define	XFS_ALLOC_FLAG_FREEING	0x00000002  /* indicate caller is freeing extents*/
 #define	XFS_ALLOC_FLAG_NORMAP	0x00000004  /* don't modify the rmapbt */
 #define	XFS_ALLOC_FLAG_NOSHRINK	0x00000008  /* don't shrink the freelist */
-
+#define	XFS_ALLOC_FLAG_CHECK	0x00000010  /* test only, don't modify args */
 
 /*
  * Argument structure for xfs_alloc routines.
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 5/5] xfs: don't rely on ->total in xfs_alloc_space_available
  2016-12-22 20:00 minleft fixes V2 Christoph Hellwig
                   ` (3 preceding siblings ...)
  2016-12-22 20:00 ` [PATCH 4/5] xfs: adjust allocation length in xfs_alloc_space_available Christoph Hellwig
@ 2016-12-22 20:00 ` Christoph Hellwig
  2017-01-04 18:19   ` Brian Foster
  2017-01-05  1:21 ` minleft fixes V2 Eryu Guan
  5 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2016-12-22 20:00 UTC (permalink / raw)
  To: linux-xfs; +Cc: eguan, darrick.wong

->total is a bit of an odd parameter passed down to the low-level
allocator all the way from the high-level callers.  It's supposed to
contain the maximum number of blocks to be allocated for the whole
transaction [1].

But in xfs_iomap_write_allocate we only convert existing delayed
allocations and thus only have a minimal block reservation for the
current transaction, so xfs_alloc_space_available can't use it for
the allocation decisions.  Use the maximum of args->total and the
calculated block requirement to make a decision.  We probably should
get rid of args->total eventually and instead apply ->minleft more
broadly, but that will require some extensive changes all over.

[1] which creates lots of confusion as most callers don't decrement it
once doing a first allocation.  But that's for a separate series.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_alloc.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 6a10aab..c121407 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -1995,7 +1995,7 @@ xfs_alloc_space_available(
 	int			flags)
 {
 	struct xfs_perag	*pag = args->pag;
-	xfs_extlen_t		longest;
+	xfs_extlen_t		alloc_len, longest;
 	xfs_extlen_t		reservation; /* blocks that are still reserved */
 	int			available;
 
@@ -2005,15 +2005,16 @@ xfs_alloc_space_available(
 	reservation = xfs_ag_resv_needed(pag, args->resv);
 
 	/* do we have enough contiguous free space for the allocation? */
+	alloc_len = args->minlen + (args->alignment - 1) + args->minalignslop;
 	longest = xfs_alloc_longest_free_extent(args->mp, pag, min_free,
 			reservation);
-	if ((args->minlen + args->alignment + args->minalignslop - 1) > longest)
+	if (longest < alloc_len)
 		return false;
 
 	/* do we have enough free space remaining for the allocation? */
 	available = (int)(pag->pagf_freeblks + pag->pagf_flcount -
 			  reservation - min_free - args->minleft);
-	if (available < (int)args->total)
+	if (available < (int)max(args->total, alloc_len))
 		return false;
 
 	/*
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/5] xfs: bump up reserved blocks in xfs_alloc_set_aside
  2016-12-22 20:00 ` [PATCH 1/5] xfs: bump up reserved blocks in xfs_alloc_set_aside Christoph Hellwig
@ 2017-01-04 14:33   ` Brian Foster
  2017-01-08 10:30     ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Foster @ 2017-01-04 14:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, eguan, darrick.wong

On Thu, Dec 22, 2016 at 09:00:18PM +0100, Christoph Hellwig wrote:
> Setting aside 4 blocks globally for bmbt splits isn't all that useful,
> as different threads can allocate space in parallel.  Bump it to 4
> blocks per AG to allow each thread that is currently doing an
> allocation to dip into it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Presumably this patch addresses the potential deadlock issues from the
previous version, but the commit log description makes no mention of it
whatsoever. While the code seems fine, I think the commit log
description needs more information wrt to that situation and the
relationship/dependency with minleft.

>  fs/xfs/libxfs/xfs_alloc.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 5050056..0a46f84 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -95,10 +95,7 @@ unsigned int
>  xfs_alloc_set_aside(
>  	struct xfs_mount	*mp)
>  {
> -	unsigned int		blocks;
> -
> -	blocks = 4 + (mp->m_sb.sb_agcount * XFS_ALLOC_AGFL_RESERVE);
> -	return blocks;
> +	return mp->m_sb.sb_agcount * (XFS_ALLOC_AGFL_RESERVE + 4);

The comment above xfs_alloc_set_aside() already touches on the writeback
situation, but why 4 blocks per ag? Wasn't the intent to use
worst_indlen() since that's the base for minleft?

Also, it looks like this causes a regression in xfs/004. On a quick
look, we might just need a test update however...

Brian

>  }
>  
>  /*
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/5] xfs: fix the alignment fallback in xfs_bmap_btalloc
  2016-12-22 20:00 ` [PATCH 2/5] xfs: fix the alignment fallback in xfs_bmap_btalloc Christoph Hellwig
@ 2017-01-04 14:34   ` Brian Foster
  2017-01-08 10:31     ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Foster @ 2017-01-04 14:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, eguan, darrick.wong

On Thu, Dec 22, 2016 at 09:00:19PM +0100, Christoph Hellwig wrote:
> The way the alignment field is used the minimum possible alignment is 1.
> By setting it to 0 we'll mess up calculations that rely on this.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 2760bc3..19c05e9 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -3796,7 +3796,7 @@ xfs_bmap_btalloc(
>  		 */
>  		args.type = atype;
>  		args.fsbno = ap->blkno;
> -		args.alignment = 0;
> +		args.alignment = 1;

Ok, but we have the following near the top of xfs_alloc_vextent():

        if (args->alignment == 0)
                args->alignment = 1;

... so I'm not sure the commit log description is accurate. That aside:

Reviewed-by: Brian Foster <bfoster@redhat.com>

Brian

>  		if ((error = xfs_alloc_vextent(&args)))
>  			return error;
>  	}
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] xfs: fix bogus minleft manipulations
  2016-12-22 20:00 ` [PATCH 3/5] xfs: fix bogus minleft manipulations Christoph Hellwig
@ 2017-01-04 18:19   ` Brian Foster
  2017-01-08 10:36     ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Foster @ 2017-01-04 18:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, eguan, darrick.wong

On Thu, Dec 22, 2016 at 09:00:20PM +0100, Christoph Hellwig wrote:
> We can't just set minleft to 0 when we're low on space - that's exactly
> what we need minleft for: to protect space in the AG for btree block
> allocations when we are low on free space.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Staring at this some more, I'm still not terribly fond of this, moreso
because I wonder how much more of this can be ripped out and whether the
low space allocator thing is still effective. Aside from that, afaict
the set_aside change should make it robust and addresses my previous
question in that it holds blocks out of all transaction reservations.

I'm also curious whether the set_aside patch alone addresses the
original problem, or setting minleft = 0 in one of these cases was
actually the cause.

>  fs/xfs/libxfs/xfs_alloc.c      | 24 +++++++-----------------
>  fs/xfs/libxfs/xfs_bmap.c       |  3 ---
>  fs/xfs/libxfs/xfs_bmap_btree.c | 14 --------------
>  3 files changed, 7 insertions(+), 34 deletions(-)
> 
...
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 19c05e9..62663a2 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -3812,7 +3812,6 @@ xfs_bmap_btalloc(
>  		args.fsbno = 0;
>  		args.type = XFS_ALLOCTYPE_FIRST_AG;
>  		args.total = ap->minlen;
> -		args.minleft = 0;
>  		if ((error = xfs_alloc_vextent(&args)))
>  			return error;
>  		ap->dfops->dop_low = true;
...
> diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
> index d6330c2..9581ee8 100644
> --- a/fs/xfs/libxfs/xfs_bmap_btree.c
> +++ b/fs/xfs/libxfs/xfs_bmap_btree.c
> @@ -499,20 +499,6 @@ xfs_bmbt_alloc_block(
>  		goto try_another_ag;
>  	}
>  
> -	if (args.fsbno == NULLFSBLOCK && args.minleft) {
> -		/*
> -		 * Could not find an AG with enough free space to satisfy
> -		 * a full btree split.  Try again without minleft and if
> -		 * successful activate the lowspace algorithm.
> -		 */
> -		args.fsbno = 0;
> -		args.type = XFS_ALLOCTYPE_FIRST_AG;
> -		args.minleft = 0;
> -		error = xfs_alloc_vextent(&args);
> -		if (error)
> -			goto error0;
> -		cur->bc_private.b.dfops->dop_low = true;
> -	}

We have a similar retry pattern in xfs_bmap_btalloc() where, in the hunk
just above, we retain the retry that appears analogous to this one (in
that it activates the low space algo) and just drop the minleft = 0 bit.
Here we are dropping the whole thing. Any reason not to be consistent
one way or the other? (Though do note that we don't check firstblock
here...).

Brian

>  	if (args.fsbno == NULLFSBLOCK) {
>  		XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
>  		*stat = 0;
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/5] xfs: adjust allocation length in xfs_alloc_space_available
  2016-12-22 20:00 ` [PATCH 4/5] xfs: adjust allocation length in xfs_alloc_space_available Christoph Hellwig
@ 2017-01-04 18:19   ` Brian Foster
  0 siblings, 0 replies; 27+ messages in thread
From: Brian Foster @ 2017-01-04 18:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, eguan, darrick.wong

On Thu, Dec 22, 2016 at 09:00:21PM +0100, Christoph Hellwig wrote:
> We must decide in xfs_alloc_fix_freelist if we can perform an
> allocation from a given AG is possible or not based on the available
> space, and should not fail the allocation past that point on a
> healthy file system.
> 
> But currently we have two additional places that second-guess
> xfs_alloc_fix_freelist: xfs_alloc_ag_vextent tries to adjust the
> maxlen parameter to remove the reservation before doing the
> allocation (but ignores the various minium freespace requirements),
> and xfs_alloc_fix_minleft tries to fix up the allocated length
> after we've found an extent, but ignores the reservations and also
> doesn't take the AGFL into account (and thus fails allocations
> for not matching minlen in some cases).
> 
> Remove all these later fixups and just correct the maxlen argument
> inside xfs_alloc_fix_freelist once we have the AGF buffer locked.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_alloc.c | 80 ++++++++++-------------------------------------
>  fs/xfs/libxfs/xfs_alloc.h |  2 +-
>  2 files changed, 17 insertions(+), 65 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index fe92570..6a10aab 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
...
> @@ -2070,10 +2012,19 @@ xfs_alloc_space_available(
>  
>  	/* do we have enough free space remaining for the allocation? */
>  	available = (int)(pag->pagf_freeblks + pag->pagf_flcount -
> -			  reservation - min_free - args->total);
> -	if (available < (int)args->minleft || available <= 0)
> +			  reservation - min_free - args->minleft);
> +	if (available < (int)args->total)
>  		return false;
>  
> +	/*
> +	 * Clamp maxlen to the amount of free space available for the actual
> +	 * extent allocation.
> +	 */
> +	if (available < (int)args->maxlen && !(flags & XFS_ALLOC_FLAG_CHECK)) {
> +		args->maxlen = available;
> +		ASSERT(args->maxlen > 0);

Still missing my nit from the previous version. Otherwise:

Reviewed-by: Brian Foster <bfoster@redhat.com>

> +	}
> +
>  	return true;
>  }
>  
> @@ -2119,7 +2070,8 @@ xfs_alloc_fix_freelist(
>  	}
>  
>  	need = xfs_alloc_min_freelist(mp, pag);
> -	if (!xfs_alloc_space_available(args, need, flags))
> +	if (!xfs_alloc_space_available(args, need, flags |
> +			XFS_ALLOC_FLAG_CHECK))
>  		goto out_agbp_relse;
>  
>  	/*
> diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
> index 7c404a6..1d0f48a 100644
> --- a/fs/xfs/libxfs/xfs_alloc.h
> +++ b/fs/xfs/libxfs/xfs_alloc.h
> @@ -56,7 +56,7 @@ typedef unsigned int xfs_alloctype_t;
>  #define	XFS_ALLOC_FLAG_FREEING	0x00000002  /* indicate caller is freeing extents*/
>  #define	XFS_ALLOC_FLAG_NORMAP	0x00000004  /* don't modify the rmapbt */
>  #define	XFS_ALLOC_FLAG_NOSHRINK	0x00000008  /* don't shrink the freelist */
> -
> +#define	XFS_ALLOC_FLAG_CHECK	0x00000010  /* test only, don't modify args */
>  
>  /*
>   * Argument structure for xfs_alloc routines.
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/5] xfs: don't rely on ->total in xfs_alloc_space_available
  2016-12-22 20:00 ` [PATCH 5/5] xfs: don't rely on ->total " Christoph Hellwig
@ 2017-01-04 18:19   ` Brian Foster
  0 siblings, 0 replies; 27+ messages in thread
From: Brian Foster @ 2017-01-04 18:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, eguan, darrick.wong

On Thu, Dec 22, 2016 at 09:00:22PM +0100, Christoph Hellwig wrote:
> ->total is a bit of an odd parameter passed down to the low-level
> allocator all the way from the high-level callers.  It's supposed to
> contain the maximum number of blocks to be allocated for the whole
> transaction [1].
> 
> But in xfs_iomap_write_allocate we only convert existing delayed
> allocations and thus only have a minimal block reservation for the
> current transaction, so xfs_alloc_space_available can't use it for
> the allocation decisions.  Use the maximum of args->total and the
> calculated block requirement to make a decision.  We probably should
> get rid of args->total eventually and instead apply ->minleft more
> broadly, but that will require some extensive changes all over.
> 
> [1] which creates lots of confusion as most callers don't decrement it
> once doing a first allocation.  But that's for a separate series.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_alloc.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 6a10aab..c121407 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -1995,7 +1995,7 @@ xfs_alloc_space_available(
>  	int			flags)
>  {
>  	struct xfs_perag	*pag = args->pag;
> -	xfs_extlen_t		longest;
> +	xfs_extlen_t		alloc_len, longest;
>  	xfs_extlen_t		reservation; /* blocks that are still reserved */
>  	int			available;
>  
> @@ -2005,15 +2005,16 @@ xfs_alloc_space_available(
>  	reservation = xfs_ag_resv_needed(pag, args->resv);
>  
>  	/* do we have enough contiguous free space for the allocation? */
> +	alloc_len = args->minlen + (args->alignment - 1) + args->minalignslop;
>  	longest = xfs_alloc_longest_free_extent(args->mp, pag, min_free,
>  			reservation);
> -	if ((args->minlen + args->alignment + args->minalignslop - 1) > longest)
> +	if (longest < alloc_len)
>  		return false;
>  
>  	/* do we have enough free space remaining for the allocation? */
>  	available = (int)(pag->pagf_freeblks + pag->pagf_flcount -
>  			  reservation - min_free - args->minleft);
> -	if (available < (int)args->total)
> +	if (available < (int)max(args->total, alloc_len))
>  		return false;
>  
>  	/*
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: minleft fixes V2
  2016-12-22 20:00 minleft fixes V2 Christoph Hellwig
                   ` (4 preceding siblings ...)
  2016-12-22 20:00 ` [PATCH 5/5] xfs: don't rely on ->total " Christoph Hellwig
@ 2017-01-05  1:21 ` Eryu Guan
  2017-01-05  2:01   ` Darrick J. Wong
  5 siblings, 1 reply; 27+ messages in thread
From: Eryu Guan @ 2017-01-05  1:21 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, darrick.wong

On Thu, Dec 22, 2016 at 09:00:17PM +0100, Christoph Hellwig wrote:
> his is my attempt to fix the problems with rmap and reflink file system
> running out of space during delayed extent conversions.  It turns out
> the way minleft has been handled has always been bogus, but the rmap
> enablement made it even worse..
> 
> This has survived xfstests xfs/109 on a 2k fs for a couple hours now
> but Eryu was much better than me at reproducing the problems, so it could
> use some further testing.

I applied this patchset on top of v4.10-rc2 kernel and ran xfs/109 with
my xfs_2k_reflink test config for 200 iterations, and all passed.

Thanks,
Eryu

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: minleft fixes V2
  2017-01-05  1:21 ` minleft fixes V2 Eryu Guan
@ 2017-01-05  2:01   ` Darrick J. Wong
  2017-01-08 10:36     ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2017-01-05  2:01 UTC (permalink / raw)
  To: Eryu Guan; +Cc: Christoph Hellwig, linux-xfs

On Thu, Jan 05, 2017 at 09:21:59AM +0800, Eryu Guan wrote:
> On Thu, Dec 22, 2016 at 09:00:17PM +0100, Christoph Hellwig wrote:
> > his is my attempt to fix the problems with rmap and reflink file system
> > running out of space during delayed extent conversions.  It turns out
> > the way minleft has been handled has always been bogus, but the rmap
> > enablement made it even worse..
> > 
> > This has survived xfstests xfs/109 on a 2k fs for a couple hours now
> > but Eryu was much better than me at reproducing the problems, so it could
> > use some further testing.
> 
> I applied this patchset on top of v4.10-rc2 kernel and ran xfs/109 with
> my xfs_2k_reflink test config for 200 iterations, and all passed.

Ok, sounds good.  I just sent out my first attempt at a pull request to Linus,
so I'll queue this series for another pull req. after that.

--D

> 
> Thanks,
> Eryu
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/5] xfs: bump up reserved blocks in xfs_alloc_set_aside
  2017-01-04 14:33   ` Brian Foster
@ 2017-01-08 10:30     ` Christoph Hellwig
  2017-01-08 16:07       ` Brian Foster
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2017-01-08 10:30 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs, eguan, darrick.wong

On Wed, Jan 04, 2017 at 09:33:51AM -0500, Brian Foster wrote:
> Presumably this patch addresses the potential deadlock issues from the
> previous version, but the commit log description makes no mention of it
> whatsoever. While the code seems fine, I think the commit log
> description needs more information wrt to that situation and the
> relationship/dependency with minleft.

Ok.

> The comment above xfs_alloc_set_aside() already touches on the writeback
> situation, but why 4 blocks per ag? Wasn't the intent to use
> worst_indlen() since that's the base for minleft?

No, I've given up on that.  worst_indlen deals with the fact that
for converting a delayed extent of a given length we might need multiple
real extents, possible in different AGs.

This version of the series keeps the previous minleft that is for just
allocating a single extent in the AG - the callers will handle "short"
returns from xfs_bmapi_write and just start a new allocation.  And
except for a corner case in the large directory block allocation code
these are in a new / rolled over transaction.  Fixing the latter also
is on my todo list, but it's another big issue that so far hasn't
trigger in practive, so I'd like to keep it in a separate series.

> Also, it looks like this causes a regression in xfs/004. On a quick
> look, we might just need a test update however...

Yes.  Hard to do in a series for the kernel, though :)

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/5] xfs: fix the alignment fallback in xfs_bmap_btalloc
  2017-01-04 14:34   ` Brian Foster
@ 2017-01-08 10:31     ` Christoph Hellwig
  2017-01-08 16:08       ` Brian Foster
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2017-01-08 10:31 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs, eguan, darrick.wong

On Wed, Jan 04, 2017 at 09:34:05AM -0500, Brian Foster wrote:
> Ok, but we have the following near the top of xfs_alloc_vextent():
> 
>         if (args->alignment == 0)
>                 args->alignment = 1;
> 
> ... so I'm not sure the commit log description is accurate. That aside:
> 
> Reviewed-by: Brian Foster <bfoster@redhat.com>

Well, in that case we might not actually need it.  This was just a
a drive-by patch from investigating what the alignmnet - 1 mean in
the allocator.

Either way this could be cleaned up, but probably shouldn't be in
a minimal bugfix series.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] xfs: fix bogus minleft manipulations
  2017-01-04 18:19   ` Brian Foster
@ 2017-01-08 10:36     ` Christoph Hellwig
  2017-01-08 16:09       ` Brian Foster
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2017-01-08 10:36 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs, eguan, darrick.wong

On Wed, Jan 04, 2017 at 01:19:33PM -0500, Brian Foster wrote:
> Staring at this some more, I'm still not terribly fond of this, moreso
> because I wonder how much more of this can be ripped out

What else do you have in mind?

> and whether the
> low space allocator thing is still effective.

That's a good question.  Another item added to my not critical allocator
TODO list, which keeps growing..

> Aside from that, afaict
> the set_aside change should make it robust and addresses my previous
> question in that it holds blocks out of all transaction reservations.
> 
> I'm also curious whether the set_aside patch alone addresses the
> original problem, or setting minleft = 0 in one of these cases was
> actually the cause.

I think I needed both changes, but it's been a while and I'll have to
retest.

> > -		/*
> > -		 * Could not find an AG with enough free space to satisfy
> > -		 * a full btree split.  Try again without minleft and if
> > -		 * successful activate the lowspace algorithm.
> > -		 */
> > -		args.fsbno = 0;
> > -		args.type = XFS_ALLOCTYPE_FIRST_AG;
> > -		args.minleft = 0;
> > -		error = xfs_alloc_vextent(&args);
> > -		if (error)
> > -			goto error0;
> > -		cur->bc_private.b.dfops->dop_low = true;
> > -	}
> 
> We have a similar retry pattern in xfs_bmap_btalloc() where, in the hunk
> just above, we retain the retry that appears analogous to this one (in
> that it activates the low space algo) and just drop the minleft = 0 bit.
> Here we are dropping the whole thing. Any reason not to be consistent
> one way or the other? (Though do note that we don't check firstblock
> here...).

xfs_bmap_btalloc is a bit different because the alignment question
comes into play as well, in addition to the non-binding AG selection
from the higher level allocator code.  But I suspect that there is a lot
of dead or questionable code in it, and it's been on my todo list
to audit xfs_bmap_btalloc, xfs_alloc_vectent and it's subfunction,
and make the argument passing, allocator modes and things like the
lowspace mode aswell as the firstblock magic a lot cleaner and properly
documented.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: minleft fixes V2
  2017-01-05  2:01   ` Darrick J. Wong
@ 2017-01-08 10:36     ` Christoph Hellwig
  2017-01-08 16:10       ` Brian Foster
  0 siblings, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2017-01-08 10:36 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Eryu Guan, Christoph Hellwig, linux-xfs

On Wed, Jan 04, 2017 at 06:01:05PM -0800, Darrick J. Wong wrote:
> Ok, sounds good.  I just sent out my first attempt at a pull request to Linus,
> so I'll queue this series for another pull req. after that.

I'll resend with a few minor updates based on the comments from Brian
in a few days, so this should be ready for -rc4.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/5] xfs: bump up reserved blocks in xfs_alloc_set_aside
  2017-01-08 10:30     ` Christoph Hellwig
@ 2017-01-08 16:07       ` Brian Foster
  0 siblings, 0 replies; 27+ messages in thread
From: Brian Foster @ 2017-01-08 16:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, eguan, darrick.wong

On Sun, Jan 08, 2017 at 11:30:28AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 04, 2017 at 09:33:51AM -0500, Brian Foster wrote:
> > Presumably this patch addresses the potential deadlock issues from the
> > previous version, but the commit log description makes no mention of it
> > whatsoever. While the code seems fine, I think the commit log
> > description needs more information wrt to that situation and the
> > relationship/dependency with minleft.
> 
> Ok.
> 
> > The comment above xfs_alloc_set_aside() already touches on the writeback
> > situation, but why 4 blocks per ag? Wasn't the intent to use
> > worst_indlen() since that's the base for minleft?
> 
> No, I've given up on that.  worst_indlen deals with the fact that
> for converting a delayed extent of a given length we might need multiple
> real extents, possible in different AGs.
> 
> This version of the series keeps the previous minleft that is for just
> allocating a single extent in the AG - the callers will handle "short"
> returns from xfs_bmapi_write and just start a new allocation.  And
> except for a corner case in the large directory block allocation code
> these are in a new / rolled over transaction.  Fixing the latter also
> is on my todo list, but it's another big issue that so far hasn't
> trigger in practive, so I'd like to keep it in a separate series.
> 

Ok, anything you can include in the commit log and/or comment that helps
clarify that is appreciated.

Brian

> > Also, it looks like this causes a regression in xfs/004. On a quick
> > look, we might just need a test update however...
> 
> Yes.  Hard to do in a series for the kernel, though :)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/5] xfs: fix the alignment fallback in xfs_bmap_btalloc
  2017-01-08 10:31     ` Christoph Hellwig
@ 2017-01-08 16:08       ` Brian Foster
  0 siblings, 0 replies; 27+ messages in thread
From: Brian Foster @ 2017-01-08 16:08 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, eguan, darrick.wong

On Sun, Jan 08, 2017 at 11:31:34AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 04, 2017 at 09:34:05AM -0500, Brian Foster wrote:
> > Ok, but we have the following near the top of xfs_alloc_vextent():
> > 
> >         if (args->alignment == 0)
> >                 args->alignment = 1;
> > 
> > ... so I'm not sure the commit log description is accurate. That aside:
> > 
> > Reviewed-by: Brian Foster <bfoster@redhat.com>
> 
> Well, in that case we might not actually need it.  This was just a
> a drive-by patch from investigating what the alignmnet - 1 mean in
> the allocator.
> 
> Either way this could be cleaned up, but probably shouldn't be in
> a minimal bugfix series.

We could also just replace it with something like ASSERT(args->alignment
> 0) before the use of (alignment - 1), which is kind of
self-documenting (but still probably not worth a patch on its own). But
fair enough either way...

Brian

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] xfs: fix bogus minleft manipulations
  2017-01-08 10:36     ` Christoph Hellwig
@ 2017-01-08 16:09       ` Brian Foster
  2017-01-09 17:56         ` Christoph Hellwig
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Foster @ 2017-01-08 16:09 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, eguan, darrick.wong

On Sun, Jan 08, 2017 at 11:36:11AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 04, 2017 at 01:19:33PM -0500, Brian Foster wrote:
> > Staring at this some more, I'm still not terribly fond of this, moreso
> > because I wonder how much more of this can be ripped out
> 
> What else do you have in mind?
> 

Oh, I don't have anything better to offer atm. :P It's just that we're
messing with this little corner of the low space allocation mechanism
when a broader reassessment of the bigger picture seems more appropriate
(hence the low space allocator question). It sounds like doing that is
on your radar, though..

Sorry, I recognize there's a bug to fix here.. I guess I'm just trying
to convince myself this is as minimal as necessary to fix it.

> > and whether the
> > low space allocator thing is still effective.
> 
> That's a good question.  Another item added to my not critical allocator
> TODO list, which keeps growing..
> 

You're welcome. ;)

> > Aside from that, afaict
> > the set_aside change should make it robust and addresses my previous
> > question in that it holds blocks out of all transaction reservations.
> > 
> > I'm also curious whether the set_aside patch alone addresses the
> > original problem, or setting minleft = 0 in one of these cases was
> > actually the cause.
> 
> I think I needed both changes, but it's been a while and I'll have to
> retest.
> 

Ok, thanks.

> > > -		/*
> > > -		 * Could not find an AG with enough free space to satisfy
> > > -		 * a full btree split.  Try again without minleft and if
> > > -		 * successful activate the lowspace algorithm.
> > > -		 */
> > > -		args.fsbno = 0;
> > > -		args.type = XFS_ALLOCTYPE_FIRST_AG;
> > > -		args.minleft = 0;
> > > -		error = xfs_alloc_vextent(&args);
> > > -		if (error)
> > > -			goto error0;
> > > -		cur->bc_private.b.dfops->dop_low = true;
> > > -	}
> > 
> > We have a similar retry pattern in xfs_bmap_btalloc() where, in the hunk
> > just above, we retain the retry that appears analogous to this one (in
> > that it activates the low space algo) and just drop the minleft = 0 bit.
> > Here we are dropping the whole thing. Any reason not to be consistent
> > one way or the other? (Though do note that we don't check firstblock
> > here...).
> 
> xfs_bmap_btalloc is a bit different because the alignment question
> comes into play as well, in addition to the non-binding AG selection
> from the higher level allocator code.  But I suspect that there is a lot
> of dead or questionable code in it, and it's been on my todo list
> to audit xfs_bmap_btalloc, xfs_alloc_vectent and it's subfunction,
> and make the argument passing, allocator modes and things like the
> lowspace mode aswell as the firstblock magic a lot cleaner and properly
> documented.

I agree that is much needed and may very well kill some of this code
off...

In this particular case, I think it's probably safer to defer the
removal of the entire bmbt_alloc_block() hunk to that audit that will
take into consideration the broader context. IOWs, take the same
approach in bmbt_alloc_block() as you have in xfs_bmap_btalloc().

I'm not even sure the code is correct as it is, but if we're in a
situation where multiple bmbt block allocations are required, afaict
that hunk can affect subsequent bmbt block allocs in terms of how
aggressive the allocation request is (via allocation mode). I think that
also provides some logical separation between minleft changes and all of
this retry logic and low space allocator stuff, fwiw.

Brian

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: minleft fixes V2
  2017-01-08 10:36     ` Christoph Hellwig
@ 2017-01-08 16:10       ` Brian Foster
  2017-01-08 18:10         ` Darrick J. Wong
  0 siblings, 1 reply; 27+ messages in thread
From: Brian Foster @ 2017-01-08 16:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Darrick J. Wong, Eryu Guan, linux-xfs

On Sun, Jan 08, 2017 at 11:36:47AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 04, 2017 at 06:01:05PM -0800, Darrick J. Wong wrote:
> > Ok, sounds good.  I just sent out my first attempt at a pull request to Linus,
> > so I'll queue this series for another pull req. after that.
> 
> I'll resend with a few minor updates based on the comments from Brian
> in a few days, so this should be ready for -rc4.

So I just noticed that this series was merged into for-next (which iiuc
is easily fixable, yes?). Christoph already mentioned he was sending a
new variant... but juuuuuust a note that not all of these patches had a
reviewed-by yet either. ;) 

Brian

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: minleft fixes V2
  2017-01-08 16:10       ` Brian Foster
@ 2017-01-08 18:10         ` Darrick J. Wong
  2017-01-09 15:22           ` Brian Foster
  0 siblings, 1 reply; 27+ messages in thread
From: Darrick J. Wong @ 2017-01-08 18:10 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, Eryu Guan, linux-xfs

On Sun, Jan 08, 2017 at 11:10:00AM -0500, Brian Foster wrote:
> On Sun, Jan 08, 2017 at 11:36:47AM +0100, Christoph Hellwig wrote:
> > On Wed, Jan 04, 2017 at 06:01:05PM -0800, Darrick J. Wong wrote:
> > > Ok, sounds good.  I just sent out my first attempt at a pull request to Linus,
> > > so I'll queue this series for another pull req. after that.
> > 
> > I'll resend with a few minor updates based on the comments from Brian
> > in a few days, so this should be ready for -rc4.
> 
> So I just noticed that this series was merged into for-next (which iiuc
> is easily fixable, yes?).

Yep.  for-next can be rebased.

> Christoph already mentioned he was sending a new variant... but
> juuuuuust a note that not all of these patches had a reviewed-by yet
> either. ;) 

<nod> I'm aware; I'll be more patient next time. :)

TBH I was also curious to have the kbuild robot / linux-next pick up a
changeset that wasn't terribly likely to break anything else, just to
see how they run...

--D

> 
> Brian
> 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: minleft fixes V2
  2017-01-08 18:10         ` Darrick J. Wong
@ 2017-01-09 15:22           ` Brian Foster
  2017-01-09 15:34             ` Christoph Hellwig
  2017-01-10  4:23             ` Darrick J. Wong
  0 siblings, 2 replies; 27+ messages in thread
From: Brian Foster @ 2017-01-09 15:22 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, Eryu Guan, linux-xfs

On Sun, Jan 08, 2017 at 10:10:02AM -0800, Darrick J. Wong wrote:
> On Sun, Jan 08, 2017 at 11:10:00AM -0500, Brian Foster wrote:
> > On Sun, Jan 08, 2017 at 11:36:47AM +0100, Christoph Hellwig wrote:
> > > On Wed, Jan 04, 2017 at 06:01:05PM -0800, Darrick J. Wong wrote:
> > > > Ok, sounds good.  I just sent out my first attempt at a pull request to Linus,
> > > > so I'll queue this series for another pull req. after that.
> > > 
> > > I'll resend with a few minor updates based on the comments from Brian
> > > in a few days, so this should be ready for -rc4.
> > 
> > So I just noticed that this series was merged into for-next (which iiuc
> > is easily fixable, yes?).
> 
> Yep.  for-next can be rebased.
> 
> > Christoph already mentioned he was sending a new variant... but
> > juuuuuust a note that not all of these patches had a reviewed-by yet
> > either. ;) 
> 
> <nod> I'm aware; I'll be more patient next time. :)
> 

Thanks. At risk of stating the obvious, I'd much prefer we continue to
follow the "at least one r-b requirement per patch" development model,
even if that means something slips a release (not that this series is at
such a risk). I'm certainly used to that by now... :P

> TBH I was also curious to have the kbuild robot / linux-next pick up a
> changeset that wasn't terribly likely to break anything else, just to
> see how they run...
> 

Hmm, is this something that can be enabled locally (e.g., on posts to
the list) without having to merge stuff prematurely?

Brian

> --D
> 
> > 
> > Brian
> > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: minleft fixes V2
  2017-01-09 15:22           ` Brian Foster
@ 2017-01-09 15:34             ` Christoph Hellwig
  2017-01-09 15:43               ` Brian Foster
  2017-01-10  4:23             ` Darrick J. Wong
  1 sibling, 1 reply; 27+ messages in thread
From: Christoph Hellwig @ 2017-01-09 15:34 UTC (permalink / raw)
  To: Brian Foster; +Cc: Darrick J. Wong, Christoph Hellwig, Eryu Guan, linux-xfs

On Mon, Jan 09, 2017 at 10:22:06AM -0500, Brian Foster wrote:
> Hmm, is this something that can be enabled locally (e.g., on posts to
> the list) without having to merge stuff prematurely?

Any linux developer can add their repos to the kbuild bot.  I usually
push my non-trivial patchset to one of my repos covered by it before
sending them out.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: minleft fixes V2
  2017-01-09 15:34             ` Christoph Hellwig
@ 2017-01-09 15:43               ` Brian Foster
  0 siblings, 0 replies; 27+ messages in thread
From: Brian Foster @ 2017-01-09 15:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Darrick J. Wong, Eryu Guan, linux-xfs

On Mon, Jan 09, 2017 at 04:34:15PM +0100, Christoph Hellwig wrote:
> On Mon, Jan 09, 2017 at 10:22:06AM -0500, Brian Foster wrote:
> > Hmm, is this something that can be enabled locally (e.g., on posts to
> > the list) without having to merge stuff prematurely?
> 
> Any linux developer can add their repos to the kbuild bot.  I usually
> push my non-trivial patchset to one of my repos covered by it before
> sending them out.

Interesting, thanks. Though I guess that depends on having a publicly
accessible repo.. I suppose we could also define a for-kbuild or
for-testing branch or some such, but that's up to Darrick...

Brian

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/5] xfs: fix bogus minleft manipulations
  2017-01-08 16:09       ` Brian Foster
@ 2017-01-09 17:56         ` Christoph Hellwig
  0 siblings, 0 replies; 27+ messages in thread
From: Christoph Hellwig @ 2017-01-09 17:56 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs, eguan, darrick.wong

On Sun, Jan 08, 2017 at 11:09:35AM -0500, Brian Foster wrote:
> In this particular case, I think it's probably safer to defer the
> removal of the entire bmbt_alloc_block() hunk to that audit that will
> take into consideration the broader context. IOWs, take the same
> approach in bmbt_alloc_block() as you have in xfs_bmap_btalloc().

Ok, will do.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: minleft fixes V2
  2017-01-09 15:22           ` Brian Foster
  2017-01-09 15:34             ` Christoph Hellwig
@ 2017-01-10  4:23             ` Darrick J. Wong
  1 sibling, 0 replies; 27+ messages in thread
From: Darrick J. Wong @ 2017-01-10  4:23 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, Eryu Guan, linux-xfs

On Mon, Jan 09, 2017 at 10:22:06AM -0500, Brian Foster wrote:
> On Sun, Jan 08, 2017 at 10:10:02AM -0800, Darrick J. Wong wrote:
> > On Sun, Jan 08, 2017 at 11:10:00AM -0500, Brian Foster wrote:
> > > On Sun, Jan 08, 2017 at 11:36:47AM +0100, Christoph Hellwig wrote:
> > > > On Wed, Jan 04, 2017 at 06:01:05PM -0800, Darrick J. Wong wrote:
> > > > > Ok, sounds good.  I just sent out my first attempt at a pull request to Linus,
> > > > > so I'll queue this series for another pull req. after that.
> > > > 
> > > > I'll resend with a few minor updates based on the comments from Brian
> > > > in a few days, so this should be ready for -rc4.
> > > 
> > > So I just noticed that this series was merged into for-next (which iiuc
> > > is easily fixable, yes?).
> > 
> > Yep.  for-next can be rebased.
> > 
> > > Christoph already mentioned he was sending a new variant... but
> > > juuuuuust a note that not all of these patches had a reviewed-by yet
> > > either. ;) 
> > 
> > <nod> I'm aware; I'll be more patient next time. :)
> > 
> 
> Thanks. At risk of stating the obvious, I'd much prefer we continue to
> follow the "at least one r-b requirement per patch" development model,
> even if that means something slips a release (not that this series is at
> such a risk). I'm certainly used to that by now... :P

Yeah, again, sorry about that.

> > TBH I was also curious to have the kbuild robot / linux-next pick up a
> > changeset that wasn't terribly likely to break anything else, just to
> > see how they run...
> > 
> 
> Hmm, is this something that can be enabled locally (e.g., on posts to
> the list) without having to merge stuff prematurely?

I (think) I've requested that the kbuild robot do build testing for any branch
that appears in xfs-linux.git (as well as my djwong xfs-linux repo).

--D

> 
> Brian
> 
> > --D
> > 
> > > 
> > > Brian
> > > 
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > > the body of a message to majordomo@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2017-01-10  4:24 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-22 20:00 minleft fixes V2 Christoph Hellwig
2016-12-22 20:00 ` [PATCH 1/5] xfs: bump up reserved blocks in xfs_alloc_set_aside Christoph Hellwig
2017-01-04 14:33   ` Brian Foster
2017-01-08 10:30     ` Christoph Hellwig
2017-01-08 16:07       ` Brian Foster
2016-12-22 20:00 ` [PATCH 2/5] xfs: fix the alignment fallback in xfs_bmap_btalloc Christoph Hellwig
2017-01-04 14:34   ` Brian Foster
2017-01-08 10:31     ` Christoph Hellwig
2017-01-08 16:08       ` Brian Foster
2016-12-22 20:00 ` [PATCH 3/5] xfs: fix bogus minleft manipulations Christoph Hellwig
2017-01-04 18:19   ` Brian Foster
2017-01-08 10:36     ` Christoph Hellwig
2017-01-08 16:09       ` Brian Foster
2017-01-09 17:56         ` Christoph Hellwig
2016-12-22 20:00 ` [PATCH 4/5] xfs: adjust allocation length in xfs_alloc_space_available Christoph Hellwig
2017-01-04 18:19   ` Brian Foster
2016-12-22 20:00 ` [PATCH 5/5] xfs: don't rely on ->total " Christoph Hellwig
2017-01-04 18:19   ` Brian Foster
2017-01-05  1:21 ` minleft fixes V2 Eryu Guan
2017-01-05  2:01   ` Darrick J. Wong
2017-01-08 10:36     ` Christoph Hellwig
2017-01-08 16:10       ` Brian Foster
2017-01-08 18:10         ` Darrick J. Wong
2017-01-09 15:22           ` Brian Foster
2017-01-09 15:34             ` Christoph Hellwig
2017-01-09 15:43               ` Brian Foster
2017-01-10  4:23             ` Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.