All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/5] xfs: allocation alignment for forced alignment
@ 2024-04-02 23:28 Dave Chinner
  2024-04-02 23:28 ` [PATCH 1/5] xfs: only allow minlen allocations when near ENOSPC Dave Chinner
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Dave Chinner @ 2024-04-02 23:28 UTC (permalink / raw)
  To: linux-xfs; +Cc: john.g.garry

This patchset prepares the allocator infrastructure for extent size
hint alignment to guarantee alignment of extents rather than just be
a hint.

Extent alignment is currently driven by multiple variables that come
from different sources and are applied at different times. Stripe
alignment currently defines the extent start alignment, whilst
extent size hints only affect the extent length and not the start
alignment.  There are also assumptions about alignment of allocation
parameters (such as the maximum length of the allocation) and if
these aren't followed the extent size hints don't actually trim
extents properly.

This patch set aims to unify alignment entirely via args->alignment
and args->prod/args->mod. Alignment at both ends of the extent
should always occur when the free space selected allows for start
alignment of an extent that is at least args->minlen in length.

Hence we need to modify args->alignment setup to take into account
extent size hints in addition to stripe alignment, and we need to
ensure that extent length is always aligned to extent size hints,
even if it means we cannot do a args->maxlen allocation.

This means that only when we completely fail to find aligned free
space to allocate from will extent size hints no longer be start
aligned. They will continue to be tail aligned up until we run out
of contiguous free space extents large enough to hold an extent the
size of a hint. Hence there is no real change of behaviour for
extent size hints; they will simply prefer aligned allocations
first, then naturally fall back to the current unaligned hint sized
allocation strategy.

Unifying the allocation alignment stratgies simplifies the code,
too. It means that the only time we don't align allocation is when
trying to do contiguous allocation when extending the file (which
should already be aligned if alignment is required!) or when there
are no alignment constraints to begin with.  As a result, we can
simplify the allocation failure fallback strategies and make the
code easier to understand and follow.

Finally, we introduce a stub for force aligned allocation and add
the logic fail the allocation if aligned allocation cannot be done.

This has run through fstests for some time here without inducing new
failures or causing any obvious near-ENOSPC performance regressions.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/5] xfs: only allow minlen allocations when near ENOSPC
  2024-04-02 23:28 [PATCH 0/5] xfs: allocation alignment for forced alignment Dave Chinner
@ 2024-04-02 23:28 ` Dave Chinner
  2024-04-03 16:31   ` John Garry
  2024-04-02 23:28 ` [PATCH 2/5] xfs: always tail align maxlen allocations Dave Chinner
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2024-04-02 23:28 UTC (permalink / raw)
  To: linux-xfs; +Cc: john.g.garry

From: Dave Chinner <dchinner@redhat.com>

When we are near ENOSPC and don't have enough free
space for an args->maxlen allocation, xfs_alloc_space_available()
will trim args->maxlen to equal the available space. However, this
function has only checked that there is enough contiguous free space
for an aligned args->minlen allocation to succeed. Hence there is no
guarantee that an args->maxlen allocation will succeed, nor that the
available space will allow for correct alignment of an args->maxlen
allocation.

Further, by trimming args->maxlen arbitrarily, it breaks an
assumption made in xfs_alloc_fix_len() that if the caller wants
aligned allocation, then args->maxlen will be set to an aligned
value. It then skips the tail alignment and so we end up with
extents that aren't aligned to extent size hint boundaries as we
approach ENOSPC.

To avoid this problem, don't reduce args->maxlen by some random,
arbitrary amount. If args->maxlen is too large for the available
space, reduce the allocation to a minlen allocation as we know we
have contiguous free space available for this to succeed and always
be correctly aligned.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 9da52e92172a..215265e0f68f 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2411,14 +2411,23 @@ xfs_alloc_space_available(
 	if (available < (int)max(args->total, alloc_len))
 		return false;
 
+	if (flags & XFS_ALLOC_FLAG_CHECK)
+		return true;
+
 	/*
-	 * Clamp maxlen to the amount of free space available for the actual
-	 * extent allocation.
+	 * If we can't do a maxlen allocation, then we must reduce the size of
+	 * the allocation to match the available free space. We know how big
+	 * the largest contiguous free space we can allocate is, so that's our
+	 * upper bound. However, we don't exaclty know what alignment/size
+	 * constraints have been placed on the allocation, so we can't
+	 * arbitrarily select some new max size. Hence make this a minlen
+	 * allocation as we know that will definitely succeed and match the
+	 * callers alignment constraints.
 	 */
-	if (available < (int)args->maxlen && !(flags & XFS_ALLOC_FLAG_CHECK)) {
-		args->maxlen = available;
+	alloc_len = args->maxlen + (args->alignment - 1) + args->minalignslop;
+	if (longest < alloc_len) {
+		args->maxlen = args->minlen;
 		ASSERT(args->maxlen > 0);
-		ASSERT(args->maxlen >= args->minlen);
 	}
 
 	return true;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/5] xfs: always tail align maxlen allocations
  2024-04-02 23:28 [PATCH 0/5] xfs: allocation alignment for forced alignment Dave Chinner
  2024-04-02 23:28 ` [PATCH 1/5] xfs: only allow minlen allocations when near ENOSPC Dave Chinner
@ 2024-04-02 23:28 ` Dave Chinner
  2024-04-02 23:28 ` [PATCH 3/5] xfs: simplify extent allocation alignment Dave Chinner
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Dave Chinner @ 2024-04-02 23:28 UTC (permalink / raw)
  To: linux-xfs; +Cc: john.g.garry

From: Dave Chinner <dchinner@redhat.com>

When we do a large allocation, the core free space allocation code
assumes that args->maxlen is aligned to args->prod/args->mod. hence
if we get a maximum sized extent allocated, it does not do tail
alignment of the extent.

However, this assumes that nothing modifies args->maxlen between the
original allocation context setup and trimming the selected free
space extent to size. This assumption has recently been found to be
invalid - xfs_alloc_space_available() modifies args->maxlen in low
space situations - and there may be more situations we haven't yet
found like this.

Force aligned allocation introduces the requirement that extents are
correctly tail aligned, resulting in this occasional latent
alignment failure to e reclassified from an unimportant curiousity
to a must-fix bug.

Removing the assumption about args->maxlen allocations always being
tail aligned is trivial, and should not impact anything because
args->maxlen for inodes with extent size hints configured are
already aligned. Hence all this change does it avoid weird corner
cases that would have resulted in unaligned extent sizes by always
trimming the extent down to an aligned size.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 215265e0f68f..e21fd5c1f802 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -432,20 +432,18 @@ xfs_alloc_compute_diff(
  * Fix up the length, based on mod and prod.
  * len should be k * prod + mod for some k.
  * If len is too small it is returned unchanged.
- * If len hits maxlen it is left alone.
  */
-STATIC void
+static void
 xfs_alloc_fix_len(
-	xfs_alloc_arg_t	*args)		/* allocation argument structure */
+	struct xfs_alloc_arg	*args)
 {
-	xfs_extlen_t	k;
-	xfs_extlen_t	rlen;
+	xfs_extlen_t		k;
+	xfs_extlen_t		rlen = args->len;
 
 	ASSERT(args->mod < args->prod);
-	rlen = args->len;
 	ASSERT(rlen >= args->minlen);
 	ASSERT(rlen <= args->maxlen);
-	if (args->prod <= 1 || rlen < args->mod || rlen == args->maxlen ||
+	if (args->prod <= 1 || rlen < args->mod ||
 	    (args->mod == 0 && rlen < args->prod))
 		return;
 	k = rlen % args->prod;
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/5] xfs: simplify extent allocation alignment
  2024-04-02 23:28 [PATCH 0/5] xfs: allocation alignment for forced alignment Dave Chinner
  2024-04-02 23:28 ` [PATCH 1/5] xfs: only allow minlen allocations when near ENOSPC Dave Chinner
  2024-04-02 23:28 ` [PATCH 2/5] xfs: always tail align maxlen allocations Dave Chinner
@ 2024-04-02 23:28 ` Dave Chinner
  2024-04-02 23:28 ` [PATCH 4/5] xfs: make EOF allocation simpler Dave Chinner
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Dave Chinner @ 2024-04-02 23:28 UTC (permalink / raw)
  To: linux-xfs; +Cc: john.g.garry

From: Dave Chinner <dchinner@redhat.com>

We currently align extent allocation to stripe unit or stripe width.
That is specified by an external parameter to the allocation code,
which then manipulates the xfs_alloc_args alignment configuration in
interesting ways.

The args->alignment field specifies extent start alignment, but
because we may be attempting non-aligned allocation first there are
also slop variables that allow for those allocation attempts to
account for aligned allocation if they fail.

This gets much more complex as we introduce forced allocation
alignment, where extent size hints are used to generate the extent
start alignment. extent size hints currently only affect extent
lengths (via args->prod and args->mod) and so with this change we
will have two different start alignment conditions.

Avoid this complexity by always using args->alignment to indicate
extent start alignment, and always using args->prod/mod to indicate
extent length adjustment.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.c |  4 +-
 fs/xfs/libxfs/xfs_alloc.h |  2 +-
 fs/xfs/libxfs/xfs_bmap.c  | 96 +++++++++++++++++----------------------
 3 files changed, 45 insertions(+), 57 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index e21fd5c1f802..563599e956a6 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2393,7 +2393,7 @@ xfs_alloc_space_available(
 	reservation = xfs_ag_resv_needed(pag, args->resv);
 
 	/* do we have enough contiguous free space for the allocation? */
-	alloc_len = args->minlen + (args->alignment - 1) + args->minalignslop;
+	alloc_len = args->minlen + (args->alignment - 1) + args->alignslop;
 	longest = xfs_alloc_longest_free_extent(pag, min_free, reservation);
 	if (longest < alloc_len)
 		return false;
@@ -2422,7 +2422,7 @@ xfs_alloc_space_available(
 	 * allocation as we know that will definitely succeed and match the
 	 * callers alignment constraints.
 	 */
-	alloc_len = args->maxlen + (args->alignment - 1) + args->minalignslop;
+	alloc_len = args->maxlen + (args->alignment - 1) + args->alignslop;
 	if (longest < alloc_len) {
 		args->maxlen = args->minlen;
 		ASSERT(args->maxlen > 0);
diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index 0b956f8b9d5a..aa2c103d98f0 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -46,7 +46,7 @@ typedef struct xfs_alloc_arg {
 	xfs_extlen_t	minleft;	/* min blocks must be left after us */
 	xfs_extlen_t	total;		/* total blocks needed in xaction */
 	xfs_extlen_t	alignment;	/* align answer to multiple of this */
-	xfs_extlen_t	minalignslop;	/* slop for minlen+alignment calcs */
+	xfs_extlen_t	alignslop;	/* slop for alignment calcs */
 	xfs_agblock_t	min_agbno;	/* set an agbno range for NEAR allocs */
 	xfs_agblock_t	max_agbno;	/* ... */
 	xfs_extlen_t	len;		/* output: actual size of extent */
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 656c95a22f2e..d56c82c07505 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3295,6 +3295,10 @@ xfs_bmap_select_minlen(
 	xfs_extlen_t		blen)
 {
 
+	/* Adjust best length for extent start alignment. */
+	if (blen > args->alignment)
+		blen -= args->alignment;
+
 	/*
 	 * Since we used XFS_ALLOC_FLAG_TRYLOCK in _longest_free_extent(), it is
 	 * possible that there is enough contiguous free space for this request.
@@ -3310,6 +3314,7 @@ xfs_bmap_select_minlen(
 	if (blen < args->maxlen)
 		return blen;
 	return args->maxlen;
+
 }
 
 static int
@@ -3403,35 +3408,43 @@ xfs_bmap_alloc_account(
 	xfs_trans_mod_dquot_byino(ap->tp, ap->ip, fld, ap->length);
 }
 
-static int
+/*
+ * Calculate the extent start alignment and the extent length adjustments that
+ * constrain this allocation.
+ *
+ * Extent start alignment is currently determined by stripe configuration and is
+ * carried in args->alignment, whilst extent length adjustment is determined by
+ * extent size hints and is carried by args->prod and args->mod.
+ *
+ * Low level allocation code is free to either ignore or override these values
+ * as required.
+ */
+static void
 xfs_bmap_compute_alignments(
 	struct xfs_bmalloca	*ap,
 	struct xfs_alloc_arg	*args)
 {
 	struct xfs_mount	*mp = args->mp;
 	xfs_extlen_t		align = 0; /* minimum allocation alignment */
-	int			stripe_align = 0;
 
 	/* stripe alignment for allocation is determined by mount parameters */
 	if (mp->m_swidth && xfs_has_swalloc(mp))
-		stripe_align = mp->m_swidth;
+		args->alignment = mp->m_swidth;
 	else if (mp->m_dalign)
-		stripe_align = mp->m_dalign;
+		args->alignment = mp->m_dalign;
 
 	if (ap->flags & XFS_BMAPI_COWFORK)
 		align = xfs_get_cowextsz_hint(ap->ip);
 	else if (ap->datatype & XFS_ALLOC_USERDATA)
 		align = xfs_get_extsz_hint(ap->ip);
+
 	if (align) {
 		if (xfs_bmap_extsize_align(mp, &ap->got, &ap->prev, align, 0,
 					ap->eof, 0, ap->conv, &ap->offset,
 					&ap->length))
 			ASSERT(0);
 		ASSERT(ap->length);
-	}
 
-	/* apply extent size hints if obtained earlier */
-	if (align) {
 		args->prod = align;
 		div_u64_rem(ap->offset, args->prod, &args->mod);
 		if (args->mod)
@@ -3446,7 +3459,6 @@ xfs_bmap_compute_alignments(
 			args->mod = args->prod - args->mod;
 	}
 
-	return stripe_align;
 }
 
 static void
@@ -3518,7 +3530,7 @@ xfs_bmap_exact_minlen_extent_alloc(
 	args.total = ap->total;
 
 	args.alignment = 1;
-	args.minalignslop = 0;
+	args.alignslop = 0;
 
 	args.minleft = ap->minleft;
 	args.wasdel = ap->wasdel;
@@ -3558,7 +3570,6 @@ xfs_bmap_btalloc_at_eof(
 	struct xfs_bmalloca	*ap,
 	struct xfs_alloc_arg	*args,
 	xfs_extlen_t		blen,
-	int			stripe_align,
 	bool			ag_only)
 {
 	struct xfs_mount	*mp = args->mp;
@@ -3572,23 +3583,15 @@ xfs_bmap_btalloc_at_eof(
 	 * allocation.
 	 */
 	if (ap->offset) {
-		xfs_extlen_t	nextminlen = 0;
+		xfs_extlen_t	alignment = args->alignment;
 
 		/*
-		 * Compute the minlen+alignment for the next case.  Set slop so
-		 * that the value of minlen+alignment+slop doesn't go up between
-		 * the calls.
+		 * Compute the alignment slop for the fallback path so we ensure
+		 * we account for the potential alignemnt space required by the
+		 * fallback paths before we modify the AGF and AGFL here.
 		 */
 		args->alignment = 1;
-		if (blen > stripe_align && blen <= args->maxlen)
-			nextminlen = blen - stripe_align;
-		else
-			nextminlen = args->minlen;
-		if (nextminlen + stripe_align > args->minlen + 1)
-			args->minalignslop = nextminlen + stripe_align -
-					args->minlen - 1;
-		else
-			args->minalignslop = 0;
+		args->alignslop = alignment - args->alignment;
 
 		if (!caller_pag)
 			args->pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, ap->blkno));
@@ -3606,19 +3609,8 @@ xfs_bmap_btalloc_at_eof(
 		 * Exact allocation failed. Reset to try an aligned allocation
 		 * according to the original allocation specification.
 		 */
-		args->alignment = stripe_align;
-		args->minlen = nextminlen;
-		args->minalignslop = 0;
-	} else {
-		/*
-		 * Adjust minlen to try and preserve alignment if we
-		 * can't guarantee an aligned maxlen extent.
-		 */
-		args->alignment = stripe_align;
-		if (blen > args->alignment &&
-		    blen <= args->maxlen + args->alignment)
-			args->minlen = blen - args->alignment;
-		args->minalignslop = 0;
+		args->alignment = alignment;
+		args->alignslop = 0;
 	}
 
 	if (ag_only) {
@@ -3636,9 +3628,8 @@ xfs_bmap_btalloc_at_eof(
 		return 0;
 
 	/*
-	 * Allocation failed, so turn return the allocation args to their
-	 * original non-aligned state so the caller can proceed on allocation
-	 * failure as if this function was never called.
+	 * Aligned allocation failed, so all fallback paths from here drop the
+	 * start alignment requirement as we know it will not succeed.
 	 */
 	args->alignment = 1;
 	return 0;
@@ -3646,7 +3637,9 @@ xfs_bmap_btalloc_at_eof(
 
 /*
  * We have failed multiple allocation attempts so now are in a low space
- * allocation situation. Try a locality first full filesystem minimum length
+ * allocation situation. We give up on any attempt at aligned allocation here.
+ *
+ * Try a locality first full filesystem minimum length
  * allocation whilst still maintaining necessary total block reservation
  * requirements.
  *
@@ -3663,6 +3656,7 @@ xfs_bmap_btalloc_low_space(
 {
 	int			error;
 
+	args->alignment = 1;
 	if (args->minlen > ap->minlen) {
 		args->minlen = ap->minlen;
 		error = xfs_alloc_vextent_start_ag(args, ap->blkno);
@@ -3682,13 +3676,11 @@ xfs_bmap_btalloc_low_space(
 static int
 xfs_bmap_btalloc_filestreams(
 	struct xfs_bmalloca	*ap,
-	struct xfs_alloc_arg	*args,
-	int			stripe_align)
+	struct xfs_alloc_arg	*args)
 {
 	xfs_extlen_t		blen = 0;
 	int			error = 0;
 
-
 	error = xfs_filestream_select_ag(ap, args, &blen);
 	if (error)
 		return error;
@@ -3707,8 +3699,7 @@ xfs_bmap_btalloc_filestreams(
 
 	args->minlen = xfs_bmap_select_minlen(ap, args, blen);
 	if (ap->aeof)
-		error = xfs_bmap_btalloc_at_eof(ap, args, blen, stripe_align,
-				true);
+		error = xfs_bmap_btalloc_at_eof(ap, args, blen, true);
 
 	if (!error && args->fsbno == NULLFSBLOCK)
 		error = xfs_alloc_vextent_near_bno(args, ap->blkno);
@@ -3732,8 +3723,7 @@ xfs_bmap_btalloc_filestreams(
 static int
 xfs_bmap_btalloc_best_length(
 	struct xfs_bmalloca	*ap,
-	struct xfs_alloc_arg	*args,
-	int			stripe_align)
+	struct xfs_alloc_arg	*args)
 {
 	xfs_extlen_t		blen = 0;
 	int			error;
@@ -3757,8 +3747,7 @@ xfs_bmap_btalloc_best_length(
 	 * trying.
 	 */
 	if (ap->aeof && !(ap->tp->t_flags & XFS_TRANS_LOWMODE)) {
-		error = xfs_bmap_btalloc_at_eof(ap, args, blen, stripe_align,
-				false);
+		error = xfs_bmap_btalloc_at_eof(ap, args, blen, false);
 		if (error || args->fsbno != NULLFSBLOCK)
 			return error;
 	}
@@ -3785,27 +3774,26 @@ xfs_bmap_btalloc(
 		.resv		= XFS_AG_RESV_NONE,
 		.datatype	= ap->datatype,
 		.alignment	= 1,
-		.minalignslop	= 0,
+		.alignslop	= 0,
 	};
 	xfs_fileoff_t		orig_offset;
 	xfs_extlen_t		orig_length;
 	int			error;
-	int			stripe_align;
 
 	ASSERT(ap->length);
 	orig_offset = ap->offset;
 	orig_length = ap->length;
 
-	stripe_align = xfs_bmap_compute_alignments(ap, &args);
+	xfs_bmap_compute_alignments(ap, &args);
 
 	/* Trim the allocation back to the maximum an AG can fit. */
 	args.maxlen = min(ap->length, mp->m_ag_max_usable);
 
 	if ((ap->datatype & XFS_ALLOC_USERDATA) &&
 	    xfs_inode_is_filestream(ap->ip))
-		error = xfs_bmap_btalloc_filestreams(ap, &args, stripe_align);
+		error = xfs_bmap_btalloc_filestreams(ap, &args);
 	else
-		error = xfs_bmap_btalloc_best_length(ap, &args, stripe_align);
+		error = xfs_bmap_btalloc_best_length(ap, &args);
 	if (error)
 		return error;
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/5] xfs: make EOF allocation simpler
  2024-04-02 23:28 [PATCH 0/5] xfs: allocation alignment for forced alignment Dave Chinner
                   ` (2 preceding siblings ...)
  2024-04-02 23:28 ` [PATCH 3/5] xfs: simplify extent allocation alignment Dave Chinner
@ 2024-04-02 23:28 ` Dave Chinner
  2024-04-02 23:28 ` [PATCH 5/5] xfs: introduce forced allocation alignment Dave Chinner
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Dave Chinner @ 2024-04-02 23:28 UTC (permalink / raw)
  To: linux-xfs; +Cc: john.g.garry

From: Dave Chinner <dchinner@redhat.com>

Currently the allocation at EOF is broken into two cases - when the
offset is zero and when the offset is non-zero. When the offset is
non-zero, we try to do exact block allocation for contiguous
extent allocation. When the offset is zero, the allocation is simply
an aligned allocation.

We want aligned allocation as the fallback when exact block
allocation fails, but that complicates the EOF allocation in that it
now has to handle two different allocation cases. The
caller also has to handle allocation when not at EOF, and for the
upcoming forced alignment changes we need that to also be aligned
allocation.

To simplify all this, pull the aligned allocation cases back into
the callers and leave the EOF allocation path for exact block
allocation only. This means that the EOF exact block allocation
fallback path is the normal aligned allocation path and that ends up
making things a lot simpler when forced alignment is introduced.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_bmap.c   | 131 +++++++++++++++----------------------
 fs/xfs/libxfs/xfs_ialloc.c |   8 +--
 fs/xfs/xfs_trace.h         |   8 +--
 3 files changed, 62 insertions(+), 85 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index d56c82c07505..c2ddf1875e52 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3320,12 +3320,12 @@ xfs_bmap_select_minlen(
 static int
 xfs_bmap_btalloc_select_lengths(
 	struct xfs_bmalloca	*ap,
-	struct xfs_alloc_arg	*args,
-	xfs_extlen_t		*blen)
+	struct xfs_alloc_arg	*args)
 {
 	struct xfs_mount	*mp = args->mp;
 	struct xfs_perag	*pag;
 	xfs_agnumber_t		agno, startag;
+	xfs_extlen_t		blen = 0;
 	int			error = 0;
 
 	if (ap->tp->t_flags & XFS_TRANS_LOWMODE) {
@@ -3339,19 +3339,18 @@ xfs_bmap_btalloc_select_lengths(
 	if (startag == NULLAGNUMBER)
 		startag = 0;
 
-	*blen = 0;
 	for_each_perag_wrap(mp, startag, agno, pag) {
-		error = xfs_bmap_longest_free_extent(pag, args->tp, blen);
+		error = xfs_bmap_longest_free_extent(pag, args->tp, &blen);
 		if (error && error != -EAGAIN)
 			break;
 		error = 0;
-		if (*blen >= args->maxlen)
+		if (blen >= args->maxlen)
 			break;
 	}
 	if (pag)
 		xfs_perag_rele(pag);
 
-	args->minlen = xfs_bmap_select_minlen(ap, args, *blen);
+	args->minlen = xfs_bmap_select_minlen(ap, args, blen);
 	return error;
 }
 
@@ -3561,78 +3560,40 @@ xfs_bmap_exact_minlen_extent_alloc(
  * If we are not low on available data blocks and we are allocating at
  * EOF, optimise allocation for contiguous file extension and/or stripe
  * alignment of the new extent.
- *
- * NOTE: ap->aeof is only set if the allocation length is >= the
- * stripe unit and the allocation offset is at the end of file.
  */
 static int
 xfs_bmap_btalloc_at_eof(
 	struct xfs_bmalloca	*ap,
-	struct xfs_alloc_arg	*args,
-	xfs_extlen_t		blen,
-	bool			ag_only)
+	struct xfs_alloc_arg	*args)
 {
 	struct xfs_mount	*mp = args->mp;
 	struct xfs_perag	*caller_pag = args->pag;
+	xfs_extlen_t		alignment = args->alignment;
 	int			error;
 
+	ASSERT(ap->aeof && ap->offset);
+	ASSERT(args->alignment >= 1);
+
 	/*
-	 * If there are already extents in the file, try an exact EOF block
-	 * allocation to extend the file as a contiguous extent. If that fails,
-	 * or it's the first allocation in a file, just try for a stripe aligned
-	 * allocation.
+	 * Compute the alignment slop for the fallback path so we ensure
+	 * we account for the potential alignemnt space required by the
+	 * fallback paths before we modify the AGF and AGFL here.
 	 */
-	if (ap->offset) {
-		xfs_extlen_t	alignment = args->alignment;
-
-		/*
-		 * Compute the alignment slop for the fallback path so we ensure
-		 * we account for the potential alignemnt space required by the
-		 * fallback paths before we modify the AGF and AGFL here.
-		 */
-		args->alignment = 1;
-		args->alignslop = alignment - args->alignment;
-
-		if (!caller_pag)
-			args->pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, ap->blkno));
-		error = xfs_alloc_vextent_exact_bno(args, ap->blkno);
-		if (!caller_pag) {
-			xfs_perag_put(args->pag);
-			args->pag = NULL;
-		}
-		if (error)
-			return error;
-
-		if (args->fsbno != NULLFSBLOCK)
-			return 0;
-		/*
-		 * Exact allocation failed. Reset to try an aligned allocation
-		 * according to the original allocation specification.
-		 */
-		args->alignment = alignment;
-		args->alignslop = 0;
-	}
-
-	if (ag_only) {
-		error = xfs_alloc_vextent_near_bno(args, ap->blkno);
-	} else {
+	args->alignment = 1;
+	args->alignslop = alignment - args->alignment;
+
+	if (!caller_pag)
+		args->pag = xfs_perag_get(mp, XFS_FSB_TO_AGNO(mp, ap->blkno));
+	error = xfs_alloc_vextent_exact_bno(args, ap->blkno);
+	if (!caller_pag) {
+		xfs_perag_put(args->pag);
 		args->pag = NULL;
-		error = xfs_alloc_vextent_start_ag(args, ap->blkno);
-		ASSERT(args->pag == NULL);
-		args->pag = caller_pag;
 	}
-	if (error)
-		return error;
 
-	if (args->fsbno != NULLFSBLOCK)
-		return 0;
-
-	/*
-	 * Aligned allocation failed, so all fallback paths from here drop the
-	 * start alignment requirement as we know it will not succeed.
-	 */
-	args->alignment = 1;
-	return 0;
+	/* Reset alignment to original specifications.  */
+	args->alignment = alignment;
+	args->alignslop = 0;
+	return error;
 }
 
 /*
@@ -3698,12 +3659,19 @@ xfs_bmap_btalloc_filestreams(
 	}
 
 	args->minlen = xfs_bmap_select_minlen(ap, args, blen);
-	if (ap->aeof)
-		error = xfs_bmap_btalloc_at_eof(ap, args, blen, true);
+	if (ap->aeof && ap->offset)
+		error = xfs_bmap_btalloc_at_eof(ap, args);
 
+	/* This may be an aligned allocation attempt. */
 	if (!error && args->fsbno == NULLFSBLOCK)
 		error = xfs_alloc_vextent_near_bno(args, ap->blkno);
 
+	/* Attempt non-aligned allocation if we haven't already. */
+	if (!error && args->fsbno == NULLFSBLOCK && args->alignment > 1)  {
+		args->alignment = 1;
+		error = xfs_alloc_vextent_near_bno(args, ap->blkno);
+	}
+
 out_low_space:
 	/*
 	 * We are now done with the perag reference for the filestreams
@@ -3725,7 +3693,6 @@ xfs_bmap_btalloc_best_length(
 	struct xfs_bmalloca	*ap,
 	struct xfs_alloc_arg	*args)
 {
-	xfs_extlen_t		blen = 0;
 	int			error;
 
 	ap->blkno = XFS_INO_TO_FSB(args->mp, ap->ip->i_ino);
@@ -3736,23 +3703,33 @@ xfs_bmap_btalloc_best_length(
 	 * the request.  If one isn't found, then adjust the minimum allocation
 	 * size to the largest space found.
 	 */
-	error = xfs_bmap_btalloc_select_lengths(ap, args, &blen);
+	error = xfs_bmap_btalloc_select_lengths(ap, args);
 	if (error)
 		return error;
 
 	/*
-	 * Don't attempt optimal EOF allocation if previous allocations barely
-	 * succeeded due to being near ENOSPC. It is highly unlikely we'll get
-	 * optimal or even aligned allocations in this case, so don't waste time
-	 * trying.
+	 * If we are in low space mode, then optimal allocation will fail so
+	 * prepare for minimal allocation and run the low space algorithm
+	 * immediately.
 	 */
-	if (ap->aeof && !(ap->tp->t_flags & XFS_TRANS_LOWMODE)) {
-		error = xfs_bmap_btalloc_at_eof(ap, args, blen, false);
-		if (error || args->fsbno != NULLFSBLOCK)
-			return error;
+	if (ap->tp->t_flags & XFS_TRANS_LOWMODE) {
+		ASSERT(args->fsbno == NULLFSBLOCK);
+		return xfs_bmap_btalloc_low_space(ap, args);
+	}
+
+	if (ap->aeof && ap->offset)
+		error = xfs_bmap_btalloc_at_eof(ap, args);
+
+	/* This may be an aligned allocation attempt. */
+	if (!error && args->fsbno == NULLFSBLOCK)
+		error = xfs_alloc_vextent_start_ag(args, ap->blkno);
+
+	/* Attempt non-aligned allocation if we haven't already. */
+	if (!error && args->fsbno == NULLFSBLOCK && args->alignment > 1)  {
+		args->alignment = 1;
+		error = xfs_alloc_vextent_start_ag(args, ap->blkno);
 	}
 
-	error = xfs_alloc_vextent_start_ag(args, ap->blkno);
 	if (error || args->fsbno != NULLFSBLOCK)
 		return error;
 
diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index e5ac3e5430c4..11116838ed71 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -758,12 +758,12 @@ xfs_ialloc_ag_alloc(
 		 *
 		 * For an exact allocation, alignment must be 1,
 		 * however we need to take cluster alignment into account when
-		 * fixing up the freelist. Use the minalignslop field to
+		 * fixing up the freelist. Use the alignslop field to
 		 * indicate that extra blocks might be required for alignment,
 		 * but not to use them in the actual exact allocation.
 		 */
 		args.alignment = 1;
-		args.minalignslop = igeo->cluster_align - 1;
+		args.alignslop = igeo->cluster_align - 1;
 
 		/* Allow space for the inode btree to split. */
 		args.minleft = igeo->inobt_maxlevels;
@@ -780,10 +780,10 @@ xfs_ialloc_ag_alloc(
 		 * the exact agbno requirement and increase the alignment
 		 * instead. It is critical that the total size of the request
 		 * (len + alignment + slop) does not increase from this point
-		 * on, so reset minalignslop to ensure it is not included in
+		 * on, so reset alignslop to ensure it is not included in
 		 * subsequent requests.
 		 */
-		args.minalignslop = 0;
+		args.alignslop = 0;
 	}
 
 	if (unlikely(args.fsbno == NULLFSBLOCK)) {
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index aea97fc074f8..14679d64558a 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1800,7 +1800,7 @@ DECLARE_EVENT_CLASS(xfs_alloc_class,
 		__field(xfs_extlen_t, minleft)
 		__field(xfs_extlen_t, total)
 		__field(xfs_extlen_t, alignment)
-		__field(xfs_extlen_t, minalignslop)
+		__field(xfs_extlen_t, alignslop)
 		__field(xfs_extlen_t, len)
 		__field(char, wasdel)
 		__field(char, wasfromfl)
@@ -1819,7 +1819,7 @@ DECLARE_EVENT_CLASS(xfs_alloc_class,
 		__entry->minleft = args->minleft;
 		__entry->total = args->total;
 		__entry->alignment = args->alignment;
-		__entry->minalignslop = args->minalignslop;
+		__entry->alignslop = args->alignslop;
 		__entry->len = args->len;
 		__entry->wasdel = args->wasdel;
 		__entry->wasfromfl = args->wasfromfl;
@@ -1828,7 +1828,7 @@ DECLARE_EVENT_CLASS(xfs_alloc_class,
 		__entry->highest_agno = args->tp->t_highest_agno;
 	),
 	TP_printk("dev %d:%d agno 0x%x agbno 0x%x minlen %u maxlen %u mod %u "
-		  "prod %u minleft %u total %u alignment %u minalignslop %u "
+		  "prod %u minleft %u total %u alignment %u alignslop %u "
 		  "len %u wasdel %d wasfromfl %d resv %d "
 		  "datatype 0x%x highest_agno 0x%x",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
@@ -1841,7 +1841,7 @@ DECLARE_EVENT_CLASS(xfs_alloc_class,
 		  __entry->minleft,
 		  __entry->total,
 		  __entry->alignment,
-		  __entry->minalignslop,
+		  __entry->alignslop,
 		  __entry->len,
 		  __entry->wasdel,
 		  __entry->wasfromfl,
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/5] xfs: introduce forced allocation alignment
  2024-04-02 23:28 [PATCH 0/5] xfs: allocation alignment for forced alignment Dave Chinner
                   ` (3 preceding siblings ...)
  2024-04-02 23:28 ` [PATCH 4/5] xfs: make EOF allocation simpler Dave Chinner
@ 2024-04-02 23:28 ` Dave Chinner
  2024-04-10 12:44 ` [PATCH 0/5] xfs: allocation alignment for forced alignment John Garry
  2024-04-29 14:06 ` John Garry
  6 siblings, 0 replies; 13+ messages in thread
From: Dave Chinner @ 2024-04-02 23:28 UTC (permalink / raw)
  To: linux-xfs; +Cc: john.g.garry

From: Dave Chinner <dchinner@redhat.com>

When forced allocation alignment is specified, the extent will
be aligned to the extent size hint size rather than stripe
alignment. If aligned allocation cannot be done, then the allocation
is failed rather than attempting non-aligned fallbacks.

Note: none of the per-inode force align configuration is present
yet, so this just triggers off an "always false" wrapper function
for the moment.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_alloc.h |  1 +
 fs/xfs/libxfs/xfs_bmap.c  | 29 +++++++++++++++++++++++------
 fs/xfs/xfs_inode.h        |  5 +++++
 3 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.h b/fs/xfs/libxfs/xfs_alloc.h
index aa2c103d98f0..7de2e6f64882 100644
--- a/fs/xfs/libxfs/xfs_alloc.h
+++ b/fs/xfs/libxfs/xfs_alloc.h
@@ -66,6 +66,7 @@ typedef struct xfs_alloc_arg {
 #define XFS_ALLOC_USERDATA		(1 << 0)/* allocation is for user data*/
 #define XFS_ALLOC_INITIAL_USER_DATA	(1 << 1)/* special case start of file */
 #define XFS_ALLOC_NOBUSY		(1 << 2)/* Busy extents not allowed */
+#define XFS_ALLOC_FORCEALIGN		(1 << 3)/* forced extent alignment */
 
 /* freespace limit calculations */
 unsigned int xfs_alloc_set_aside(struct xfs_mount *mp);
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index c2ddf1875e52..7a0ef0900097 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3411,9 +3411,10 @@ xfs_bmap_alloc_account(
  * Calculate the extent start alignment and the extent length adjustments that
  * constrain this allocation.
  *
- * Extent start alignment is currently determined by stripe configuration and is
- * carried in args->alignment, whilst extent length adjustment is determined by
- * extent size hints and is carried by args->prod and args->mod.
+ * Extent start alignment is currently determined by forced inode alignment or
+ * stripe configuration and is carried in args->alignment, whilst extent length
+ * adjustment is determined by extent size hints and is carried by args->prod
+ * and args->mod.
  *
  * Low level allocation code is free to either ignore or override these values
  * as required.
@@ -3426,11 +3427,18 @@ xfs_bmap_compute_alignments(
 	struct xfs_mount	*mp = args->mp;
 	xfs_extlen_t		align = 0; /* minimum allocation alignment */
 
-	/* stripe alignment for allocation is determined by mount parameters */
-	if (mp->m_swidth && xfs_has_swalloc(mp))
+	/*
+	 * Forced inode alignment takes preference over stripe alignment.
+	 * Stripe alignment for allocation is determined by mount parameters.
+	 */
+	if (xfs_inode_has_forcealign(ap->ip)) {
+		args->alignment = xfs_get_extsz_hint(ap->ip);
+		args->datatype |= XFS_ALLOC_FORCEALIGN;
+	} else if (mp->m_swidth && xfs_has_swalloc(mp)) {
 		args->alignment = mp->m_swidth;
-	else if (mp->m_dalign)
+	} else if (mp->m_dalign) {
 		args->alignment = mp->m_dalign;
+	}
 
 	if (ap->flags & XFS_BMAPI_COWFORK)
 		align = xfs_get_cowextsz_hint(ap->ip);
@@ -3617,6 +3625,11 @@ xfs_bmap_btalloc_low_space(
 {
 	int			error;
 
+	if (args->alignment > 1 && (args->datatype & XFS_ALLOC_FORCEALIGN)) {
+		args->fsbno = NULLFSBLOCK;
+		return 0;
+	}
+
 	args->alignment = 1;
 	if (args->minlen > ap->minlen) {
 		args->minlen = ap->minlen;
@@ -3668,6 +3681,8 @@ xfs_bmap_btalloc_filestreams(
 
 	/* Attempt non-aligned allocation if we haven't already. */
 	if (!error && args->fsbno == NULLFSBLOCK && args->alignment > 1)  {
+		if (args->datatype & XFS_ALLOC_FORCEALIGN)
+			return error;
 		args->alignment = 1;
 		error = xfs_alloc_vextent_near_bno(args, ap->blkno);
 	}
@@ -3726,6 +3741,8 @@ xfs_bmap_btalloc_best_length(
 
 	/* Attempt non-aligned allocation if we haven't already. */
 	if (!error && args->fsbno == NULLFSBLOCK && args->alignment > 1)  {
+		if (args->datatype & XFS_ALLOC_FORCEALIGN)
+			return error;
 		args->alignment = 1;
 		error = xfs_alloc_vextent_start_ag(args, ap->blkno);
 	}
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 0f9d32cbae72..94fa79ae1591 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -312,6 +312,11 @@ static inline bool xfs_inode_has_large_extent_counts(struct xfs_inode *ip)
 	return ip->i_diflags2 & XFS_DIFLAG2_NREXT64;
 }
 
+static inline bool xfs_inode_has_forcealign(struct xfs_inode *ip)
+{
+	return false;
+}
+
 /*
  * Return the buftarg used for data allocations on a given inode.
  */
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] xfs: only allow minlen allocations when near ENOSPC
  2024-04-02 23:28 ` [PATCH 1/5] xfs: only allow minlen allocations when near ENOSPC Dave Chinner
@ 2024-04-03 16:31   ` John Garry
  2024-04-03 23:15     ` Dave Chinner
  0 siblings, 1 reply; 13+ messages in thread
From: John Garry @ 2024-04-03 16:31 UTC (permalink / raw)
  To: Dave Chinner, linux-xfs

On 03/04/2024 00:28, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> When we are near ENOSPC and don't have enough free
> space for an args->maxlen allocation, xfs_alloc_space_available()
> will trim args->maxlen to equal the available space. However, this
> function has only checked that there is enough contiguous free space
> for an aligned args->minlen allocation to succeed. Hence there is no
> guarantee that an args->maxlen allocation will succeed, nor that the
> available space will allow for correct alignment of an args->maxlen
> allocation.
> 
> Further, by trimming args->maxlen arbitrarily, it breaks an
> assumption made in xfs_alloc_fix_len() that if the caller wants
> aligned allocation, then args->maxlen will be set to an aligned
> value. It then skips the tail alignment and so we end up with
> extents that aren't aligned to extent size hint boundaries as we
> approach ENOSPC.
> 
> To avoid this problem, don't reduce args->maxlen by some random,
> arbitrary amount. If args->maxlen is too large for the available
> space, reduce the allocation to a minlen allocation as we know we
> have contiguous free space available for this to succeed and always
> be correctly aligned.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>

This change seems to cause or at least expose a problem for me - I say 
that because it is the only difference to what I already had from 
https://lore.kernel.org/linux-xfs/ZeeaKrmVEkcXYjbK@dread.disaster.area/T/#me7abe09fe85292ca880f169a4af651eac5ed1424 
and the xfs_alloc_fix_len() fix.

With forcealign extsize=64KB, when I write to the end of a file I get 2x 
new extents, both of which are not a multiple of 64KB in size. Note that 
I am including 
https://lore.kernel.org/linux-xfs/20240304130428.13026-7-john.g.garry@oracle.com/, 
but I don't think it makes a difference.

Before:
  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
    0: [0..383]:        73216..73599      0 (73216..73599)     384
    1: [384..511]:      70528..70655      0 (70528..70655)     128

After:
  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
    0: [0..383]:        73216..73599      0 (73216..73599)     384
    1: [384..511]:      70528..70655      0 (70528..70655)     128
    2: [512..751]:      30336..30575      0 (30336..30575)     240
    3: [752..767]:      48256..48271      0 (48256..48271)      16

> ---
>   fs/xfs/libxfs/xfs_alloc.c | 19 ++++++++++++++-----
>   1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> index 9da52e92172a..215265e0f68f 100644
> --- a/fs/xfs/libxfs/xfs_alloc.c
> +++ b/fs/xfs/libxfs/xfs_alloc.c
> @@ -2411,14 +2411,23 @@ xfs_alloc_space_available(
>   	if (available < (int)max(args->total, alloc_len))
>   		return false;
>   
> +	if (flags & XFS_ALLOC_FLAG_CHECK)
> +		return true;
> +
>   	/*
> -	 * Clamp maxlen to the amount of free space available for the actual
> -	 * extent allocation.
> +	 * If we can't do a maxlen allocation, then we must reduce the size of
> +	 * the allocation to match the available free space. We know how big
> +	 * the largest contiguous free space we can allocate is, so that's our
> +	 * upper bound. However, we don't exaclty know what alignment/siz > +	 * constraints have been placed on the allocation, so we can't
> +	 * arbitrarily select some new max size. Hence make this a minlen
> +	 * allocation as we know that will definitely succeed and match the
> +	 * callers alignment constraints.
>   	 */
> -	if (available < (int)args->maxlen && !(flags & XFS_ALLOC_FLAG_CHECK)) {
> -		args->maxlen = available;
> +	alloc_len = args->maxlen + (args->alignment - 1) + args->minalignslop;

I added some kernel logs to assist debugging, and if I am reading them 
correctly, for ext #2 allocation we had at this point:

longest = 46, alloc_len = 47, args->minlen=30, maxlen=32, alignslop=0 
available=392; longest < alloc_len, so we set args->maxlen = 
args->minlen (= 30)

For ext3:
longest = 32, alloc_len = 17, args->minlen=2, args->maxlen=2, 
alignslop=0, available=362; longest > alloc_len, so do nothing

> +	if (longest < alloc_len) {
> +		args->maxlen = args->minlen;
>   		ASSERT(args->maxlen > 0);
> -		ASSERT(args->maxlen >= args->minlen);
>   	}
>   
>   	return true;


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] xfs: only allow minlen allocations when near ENOSPC
  2024-04-03 16:31   ` John Garry
@ 2024-04-03 23:15     ` Dave Chinner
  2024-04-04 13:54       ` John Garry
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2024-04-03 23:15 UTC (permalink / raw)
  To: John Garry; +Cc: linux-xfs

On Wed, Apr 03, 2024 at 05:31:49PM +0100, John Garry wrote:
> On 03/04/2024 00:28, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > When we are near ENOSPC and don't have enough free
> > space for an args->maxlen allocation, xfs_alloc_space_available()
> > will trim args->maxlen to equal the available space. However, this
> > function has only checked that there is enough contiguous free space
> > for an aligned args->minlen allocation to succeed. Hence there is no
> > guarantee that an args->maxlen allocation will succeed, nor that the
> > available space will allow for correct alignment of an args->maxlen
> > allocation.
> > 
> > Further, by trimming args->maxlen arbitrarily, it breaks an
> > assumption made in xfs_alloc_fix_len() that if the caller wants
> > aligned allocation, then args->maxlen will be set to an aligned
> > value. It then skips the tail alignment and so we end up with
> > extents that aren't aligned to extent size hint boundaries as we
> > approach ENOSPC.
> > 
> > To avoid this problem, don't reduce args->maxlen by some random,
> > arbitrary amount. If args->maxlen is too large for the available
> > space, reduce the allocation to a minlen allocation as we know we
> > have contiguous free space available for this to succeed and always
> > be correctly aligned.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> 
> This change seems to cause or at least expose a problem for me - I say that
> because it is the only difference to what I already had from https://lore.kernel.org/linux-xfs/ZeeaKrmVEkcXYjbK@dread.disaster.area/T/#me7abe09fe85292ca880f169a4af651eac5ed1424
> and the xfs_alloc_fix_len() fix.
> 
> With forcealign extsize=64KB, when I write to the end of a file I get 2x new
> extents, both of which are not a multiple of 64KB in size. Note that I am
> including https://lore.kernel.org/linux-xfs/20240304130428.13026-7-john.g.garry@oracle.com/,
> but I don't think it makes a difference.
> 
> Before:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
>    0: [0..383]:        73216..73599      0 (73216..73599)     384
>    1: [384..511]:      70528..70655      0 (70528..70655)     128
> 
> After:
>  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
>    0: [0..383]:        73216..73599      0 (73216..73599)     384
>    1: [384..511]:      70528..70655      0 (70528..70655)     128
>    2: [512..751]:      30336..30575      0 (30336..30575)     240

So that's a 120kB (30 fsb) extent.

>    3: [752..767]:      48256..48271      0 (48256..48271)      16

And that's the 2FSB tail to make it up to 64kB.

> > ---
> >   fs/xfs/libxfs/xfs_alloc.c | 19 ++++++++++++++-----
> >   1 file changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > index 9da52e92172a..215265e0f68f 100644
> > --- a/fs/xfs/libxfs/xfs_alloc.c
> > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > @@ -2411,14 +2411,23 @@ xfs_alloc_space_available(
> >   	if (available < (int)max(args->total, alloc_len))
> >   		return false;
> > +	if (flags & XFS_ALLOC_FLAG_CHECK)
> > +		return true;
> > +
> >   	/*
> > -	 * Clamp maxlen to the amount of free space available for the actual
> > -	 * extent allocation.
> > +	 * If we can't do a maxlen allocation, then we must reduce the size of
> > +	 * the allocation to match the available free space. We know how big
> > +	 * the largest contiguous free space we can allocate is, so that's our
> > +	 * upper bound. However, we don't exaclty know what alignment/siz > +	 * constraints have been placed on the allocation, so we can't
> > +	 * arbitrarily select some new max size. Hence make this a minlen
> > +	 * allocation as we know that will definitely succeed and match the
> > +	 * callers alignment constraints.
> >   	 */
> > -	if (available < (int)args->maxlen && !(flags & XFS_ALLOC_FLAG_CHECK)) {
> > -		args->maxlen = available;
> > +	alloc_len = args->maxlen + (args->alignment - 1) + args->minalignslop;
> 
> I added some kernel logs to assist debugging, and if I am reading them
> correctly, for ext #2 allocation we had at this point:
> 
> longest = 46, alloc_len = 47, args->minlen=30, maxlen=32, alignslop=0
> available=392; longest < alloc_len, so we set args->maxlen = args->minlen (=
> 30)

Why was args->minlen set to 30? Where did that value come from? i.e.
That is not correctly sized/aligned for force alignment - it should
be rounded down to 16 fsbs, right?

I'm guessing that "30" is (longest - alignment) = 46 - 16 = 30? And
then it wasn't rounded down from there?

> For ext3:
> longest = 32, alloc_len = 17, args->minlen=2, args->maxlen=2, alignslop=0,
> available=362; longest > alloc_len, so do nothing

Same issue here - for a forced align allocation, minlen needs to be
bound to alignment constraints. If minlen is set like this, then the
allocation will always return a 2 block extent if they exist.

IOWs, the allocation constraints are not correctly aligned, but the
allocation is doing exactly what the constraints say it is allowed
to do. Therefore the issue is in the constraint setup (args->minlen)
for forced alignment and not the allocation code that selects a
an args->minlen extent that is not correctly sized near ENOSPC.

I'm guessing that we need something like the (untested) patch below
to ensure args->minlen is properly aligned....

-Dave.
-- 
Dave Chinner
david@fromorbit.com

xfs: align args->minlen for forced allocation alignment

From: Dave Chinner <dchinner@redhat.com>

If args->minlen is not aligned to the constraints of forced
alignment, we may do minlen allocations that are not aligned when we
approach ENOSPC. Avoid this by always aligning args->minlen
appropriately. If alignment of minlen results in a value smaller
than the alignment constraint, fail the allocation immediately.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 45 +++++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 7a0ef0900097..aeebe51550f5 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3288,33 +3288,48 @@ xfs_bmap_longest_free_extent(
 	return 0;
 }
 
-static xfs_extlen_t
+static int
 xfs_bmap_select_minlen(
 	struct xfs_bmalloca	*ap,
 	struct xfs_alloc_arg	*args,
 	xfs_extlen_t		blen)
 {
-
 	/* Adjust best length for extent start alignment. */
 	if (blen > args->alignment)
 		blen -= args->alignment;
 
 	/*
 	 * Since we used XFS_ALLOC_FLAG_TRYLOCK in _longest_free_extent(), it is
-	 * possible that there is enough contiguous free space for this request.
+	 * possible that there is enough contiguous free space for this request
+	 * even if best length is less that the minimum length we need.
+	 *
+	 * If the best length won't satisfy the maximum length we requested,
+	 * then use it as the minimum length so we get as large an allocation
+	 * as possible.
 	 */
 	if (blen < ap->minlen)
-		return ap->minlen;
+		blen = ap->minlen;
+	else if (blen > args->maxlen)
+		 blen = args->maxlen;
 
 	/*
-	 * If the best seen length is less than the request length,
-	 * use the best as the minimum, otherwise we've got the maxlen we
-	 * were asked for.
+	 * If we have alignment constraints, round the minlen down to match the
+	 * constraint so that alignment will be attempted. This may reduce the
+	 * allocation to smaller than was requested, so clamp the minimum to
+	 * ap->minlen to allow unaligned allocation to succeed. If we are forced
+	 * to align the allocation, return ENOSPC at this point because we don't
+	 * have enough contiguous free space to guarantee aligned allocation.
 	 */
-	if (blen < args->maxlen)
-		return blen;
-	return args->maxlen;
-
+	if (args->alignment > 1) {
+		blen = rounddown(blen, args->alignment);
+		if (blen < ap->minlen) {
+			if (args->datatype & XFS_ALLOC_FORCEALIGN)
+				return -ENOSPC;
+			blen = ap->minlen;
+		}
+	}
+	args->minlen = blen;
+	return 0;
 }
 
 static int
@@ -3350,8 +3365,7 @@ xfs_bmap_btalloc_select_lengths(
 	if (pag)
 		xfs_perag_rele(pag);
 
-	args->minlen = xfs_bmap_select_minlen(ap, args, blen);
-	return error;
+	return xfs_bmap_select_minlen(ap, args, blen);
 }
 
 /* Update all inode and quota accounting for the allocation we just did. */
@@ -3671,7 +3685,10 @@ xfs_bmap_btalloc_filestreams(
 		goto out_low_space;
 	}
 
-	args->minlen = xfs_bmap_select_minlen(ap, args, blen);
+	error = xfs_bmap_select_minlen(ap, args, blen);
+	if (error)
+		goto out_low_space;
+
 	if (ap->aeof && ap->offset)
 		error = xfs_bmap_btalloc_at_eof(ap, args);
 


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] xfs: only allow minlen allocations when near ENOSPC
  2024-04-03 23:15     ` Dave Chinner
@ 2024-04-04 13:54       ` John Garry
  0 siblings, 0 replies; 13+ messages in thread
From: John Garry @ 2024-04-04 13:54 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs


>>> -	if (available < (int)args->maxlen && !(flags & XFS_ALLOC_FLAG_CHECK)) {
>>> -		args->maxlen = available;
>>> +	alloc_len = args->maxlen + (args->alignment - 1) + args->minalignslop;
>>
>> I added some kernel logs to assist debugging, and if I am reading them
>> correctly, for ext #2 allocation we had at this point:
>>
>> longest = 46, alloc_len = 47, args->minlen=30, maxlen=32, alignslop=0
>> available=392; longest < alloc_len, so we set args->maxlen = args->minlen (=
>> 30)
> 
> Why was args->minlen set to 30? Where did that value come from? i.e.
> That is not correctly sized/aligned for force alignment - it should
> be rounded down to 16 fsbs, right?

I could not recreate this exact scenario, but another similar one 
occurred which gave:

/root/mnt2/file_2425:
  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
    0: [0..431]:        489600..490031    2 (131200..131631)   432
    1: [432..511]:      529408..529487    2 (171008..171087)    80
#

For allocating EXT #0, in xfs_bmap_select_minlen() we had initially 
args->minlen=0, maxlen=64 blen=70

then blen -= alignment (=16), an finally condition blen < args->maxlen 
passed and have minlen = 54


Then in xfs_alloc_space_available(), for check (longest < alloc_len) 
near the end, at that point we have:
longest=70, alloc_len=79, args->minlen=54, maxlen=64, alignslop=0 
available=155, and so we set maxlen = minlen = 54

Then in xfs_alloc_fix_len(), initially args->mod=0, prod=16, minlen=54, 
maxlen=54, len=54 rlen=54

And we end up with the 54 FSB extent.

> 
> I'm guessing that "30" is (longest - alignment) = 46 - 16 = 30? And
> then it wasn't rounded down from there?
> 
>> For ext3:
>> longest = 32, alloc_len = 17, args->minlen=2, args->maxlen=2, alignslop=0,
>> available=362; longest > alloc_len, so do nothing
> 
> Same issue here - for a forced align allocation, minlen needs to be
> bound to alignment constraints. If minlen is set like this, then the
> allocation will always return a 2 block extent if they exist.
> 
> IOWs, the allocation constraints are not correctly aligned, but the
> allocation is doing exactly what the constraints say it is allowed
> to do. Therefore the issue is in the constraint setup (args->minlen)
> for forced alignment and not the allocation code that selects a
> an args->minlen extent that is not correctly sized near ENOSPC.
> 
> I'm guessing that we need something like the (untested) patch below
> to ensure args->minlen is properly aligned....
> 
> -Dave.

For some reason that diff got excluded or mangled when I tried to reply. 
Anyway, it seems to work ok, but that's based only on a quite limited 
amount of testing.

Thanks,
John

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] xfs: allocation alignment for forced alignment
  2024-04-02 23:28 [PATCH 0/5] xfs: allocation alignment for forced alignment Dave Chinner
                   ` (4 preceding siblings ...)
  2024-04-02 23:28 ` [PATCH 5/5] xfs: introduce forced allocation alignment Dave Chinner
@ 2024-04-10 12:44 ` John Garry
  2024-04-16  0:38   ` Dave Chinner
  2024-04-29 14:06 ` John Garry
  6 siblings, 1 reply; 13+ messages in thread
From: John Garry @ 2024-04-10 12:44 UTC (permalink / raw)
  To: Dave Chinner, linux-xfs

On 03/04/2024 00:28, Dave Chinner wrote:

Hi Dave,

Can we come up with some merging strategy here?

This feature is blocking me sending an updated version of my XFS support 
for block atomic writes series.

I suppose we can transfer all the other FORCEALIGN patches from that 
series into this one, so that this series is a fully complete feature 
which could be merged separately.

Thanks,
John

> This patchset prepares the allocator infrastructure for extent size
> hint alignment to guarantee alignment of extents rather than just be
> a hint.
> 
> Extent alignment is currently driven by multiple variables that come
> from different sources and are applied at different times. Stripe
> alignment currently defines the extent start alignment, whilst
> extent size hints only affect the extent length and not the start
> alignment.  There are also assumptions about alignment of allocation
> parameters (such as the maximum length of the allocation) and if
> these aren't followed the extent size hints don't actually trim
> extents properly.
> 
> This patch set aims to unify alignment entirely via args->alignment
> and args->prod/args->mod. Alignment at both ends of the extent
> should always occur when the free space selected allows for start
> alignment of an extent that is at least args->minlen in length.
> 
> Hence we need to modify args->alignment setup to take into account
> extent size hints in addition to stripe alignment, and we need to
> ensure that extent length is always aligned to extent size hints,
> even if it means we cannot do a args->maxlen allocation.
> 
> This means that only when we completely fail to find aligned free
> space to allocate from will extent size hints no longer be start
> aligned. They will continue to be tail aligned up until we run out
> of contiguous free space extents large enough to hold an extent the
> size of a hint. Hence there is no real change of behaviour for
> extent size hints; they will simply prefer aligned allocations
> first, then naturally fall back to the current unaligned hint sized
> allocation strategy.
> 
> Unifying the allocation alignment stratgies simplifies the code,
> too. It means that the only time we don't align allocation is when
> trying to do contiguous allocation when extending the file (which
> should already be aligned if alignment is required!) or when there
> are no alignment constraints to begin with.  As a result, we can
> simplify the allocation failure fallback strategies and make the
> code easier to understand and follow.
> 
> Finally, we introduce a stub for force aligned allocation and add
> the logic fail the allocation if aligned allocation cannot be done.
> 
> This has run through fstests for some time here without inducing new
> failures or causing any obvious near-ENOSPC performance regressions.
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] xfs: allocation alignment for forced alignment
  2024-04-10 12:44 ` [PATCH 0/5] xfs: allocation alignment for forced alignment John Garry
@ 2024-04-16  0:38   ` Dave Chinner
  2024-04-16  6:51     ` John Garry
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2024-04-16  0:38 UTC (permalink / raw)
  To: John Garry; +Cc: linux-xfs

On Wed, Apr 10, 2024 at 01:44:47PM +0100, John Garry wrote:
> On 03/04/2024 00:28, Dave Chinner wrote:
> 
> Hi Dave,
> 
> Can we come up with some merging strategy here?
> 
> This feature is blocking me sending an updated version of my XFS support for
> block atomic writes series.

So just add the patches to the start of your series until they are
merged...

-Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] xfs: allocation alignment for forced alignment
  2024-04-16  0:38   ` Dave Chinner
@ 2024-04-16  6:51     ` John Garry
  0 siblings, 0 replies; 13+ messages in thread
From: John Garry @ 2024-04-16  6:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On 16/04/2024 01:38, Dave Chinner wrote:
>> This feature is blocking me sending an updated version of my XFS support for
>> block atomic writes series.
> So just add the patches to the start of your series until they are
> merged...

Fine, I can do that.

There was also a couple of patches which I was of unsure of here:
https://lore.kernel.org/linux-xfs/a7de9521-d101-402d-a59b-f7ce936ba383@oracle.com/

I was hoping that you would have a look at those, but I suppose that I 
can include them in the next series also.

Thanks,
John


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] xfs: allocation alignment for forced alignment
  2024-04-02 23:28 [PATCH 0/5] xfs: allocation alignment for forced alignment Dave Chinner
                   ` (5 preceding siblings ...)
  2024-04-10 12:44 ` [PATCH 0/5] xfs: allocation alignment for forced alignment John Garry
@ 2024-04-29 14:06 ` John Garry
  6 siblings, 0 replies; 13+ messages in thread
From: John Garry @ 2024-04-29 14:06 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On 03/04/2024 00:28, Dave Chinner wrote:
> Extent alignment is currently driven by multiple variables that come
> from different sources and are applied at different times. Stripe
> alignment currently defines the extent start alignment, whilst
> extent size hints only affect the extent length and not the start
> alignment.

BTW, I have been testing stripe alignment for this series vs mainline 
(v6.9-rc1), and I find that I more often get stripe-aligned extents with 
this series. I guess that you expected that.

However, we do seem to make more trips into the block allocator for 
that. For example, writing a new file with 6014 blocks and sunit=16:

this series bmap:

File size of /root/mnt2/file_48 is 24630784 (6014 blocks of 4096 bytes)

/root/mnt2/file_48:
  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
    0: [0..48111]:      547456..595567    1 (240768..288879) 48112 000111

The block allocator gave 1x 6014 block

mainline:

/root/mnt2/file_48:
  EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
    0: [0..48111]:      547440..595551    1 (240752..288863) 48112 001111


The block allocator gave 1x 6000 and 1x 14 blocks, which are contiguous.

Let me know if you want more details on this. Obviously more often 
aligned extents is nice, but more trips to the block allocator isn't.

Thanks,
John





^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-04-29 14:06 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-02 23:28 [PATCH 0/5] xfs: allocation alignment for forced alignment Dave Chinner
2024-04-02 23:28 ` [PATCH 1/5] xfs: only allow minlen allocations when near ENOSPC Dave Chinner
2024-04-03 16:31   ` John Garry
2024-04-03 23:15     ` Dave Chinner
2024-04-04 13:54       ` John Garry
2024-04-02 23:28 ` [PATCH 2/5] xfs: always tail align maxlen allocations Dave Chinner
2024-04-02 23:28 ` [PATCH 3/5] xfs: simplify extent allocation alignment Dave Chinner
2024-04-02 23:28 ` [PATCH 4/5] xfs: make EOF allocation simpler Dave Chinner
2024-04-02 23:28 ` [PATCH 5/5] xfs: introduce forced allocation alignment Dave Chinner
2024-04-10 12:44 ` [PATCH 0/5] xfs: allocation alignment for forced alignment John Garry
2024-04-16  0:38   ` Dave Chinner
2024-04-16  6:51     ` John Garry
2024-04-29 14:06 ` John Garry

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.