* [PATCH v3 0/4] buffered write and indlen fixes
@ 2017-02-14 13:03 Brian Foster
  2017-02-14 13:03 ` [PATCH 1/4] xfs: clear delalloc and cache on buffered write failure Brian Foster
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Brian Foster @ 2017-02-14 13:03 UTC (permalink / raw)
  To: linux-xfs

This is just a small update to rebase onto for-next with all previous
debug artifacts removed. No code changes.

Brian

v3:
- Rebase to a clean for-next branch.
v2: http://www.spinics.net/lists/linux-xfs/msg04215.html
- Use do_div() to fix 32-bit builds.
- Prepend patches 1-2 to re-enable indlen testing from xfstests.
v1: http://www.spinics.net/lists/linux-xfs/msg04083.html

Brian Foster (4):
  xfs: clear delalloc and cache on buffered write failure
  xfs: resurrect debug mode drop buffered writes mechanism
  xfs: handle indlen shortage on delalloc extent merge
  xfs: split indlen reservations fairly when under reserved

 fs/xfs/libxfs/xfs_bmap.c | 70 +++++++++++++++++++++++++++++++++---------------
 fs/xfs/xfs_iomap.c       |  9 +++++++
 fs/xfs/xfs_mount.h       | 15 ++++++-----
 fs/xfs/xfs_sysfs.c       | 14 +++++-----
 4 files changed, 73 insertions(+), 35 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/4] xfs: clear delalloc and cache on buffered write failure
  2017-02-14 13:03 [PATCH v3 0/4] buffered write and indlen fixes Brian Foster
@ 2017-02-14 13:03 ` Brian Foster
  2017-02-16 20:24   ` Christoph Hellwig
  2017-02-16 22:02   ` [PATCH v2] " Brian Foster
  2017-02-14 13:03 ` [PATCH 2/4] xfs: resurrect debug mode drop buffered writes mechanism Brian Foster
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: Brian Foster @ 2017-02-14 13:03 UTC (permalink / raw)
  To: linux-xfs

The buffered write failure handling code in
xfs_file_iomap_end_delalloc() has a couple minor problems. First, if
written == 0, start_fsb is not rounded down and it fails to kill off a
delalloc block if the start offset is block unaligned. This results in a
lingering delalloc block and broken delalloc block accounting detected
at unmount time. Fix this by rounding down start_fsb in the unlikely
event that written == 0.

Second, it is possible for a failed overwrite of a delalloc extent to
leave dirty pagecache around over a hole in the file. This is because it
is possible to hit ->iomap_end() on write failure before the iomap code
has attempted to allocate pagecache, and thus has no need to clean it up.
If the targeted delalloc extent was successfully written by a previous
write, however, then it does still have dirty pages when ->iomap_end()
punches out the underlying blocks. This ultimately results in writeback
over a hole. To fix this problem, unconditionally punch out the
pagecache from XFS before punching out the associated delalloc range.
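
[Editorial note: the rounding distinction above can be sketched with simplified stand-ins for the conversion macros. The real XFS_B_TO_FSB()/XFS_B_TO_FSBT() take a struct xfs_mount and derive the shift from its block size log; a fixed 4096-byte block is assumed here.]

```c
#include <assert.h>
#include <stdint.h>

#define BLKLOG 12ULL	/* assume a 4096-byte filesystem block */

/* models XFS_B_TO_FSB: round the byte offset up to the next block boundary */
static inline uint64_t b_to_fsb(uint64_t b)
{
	return (b + (1ULL << BLKLOG) - 1) >> BLKLOG;
}

/* models XFS_B_TO_FSBT: truncate (round down) to the containing block */
static inline uint64_t b_to_fsbt(uint64_t b)
{
	return b >> BLKLOG;
}
```

For a block-unaligned offset of 6144 bytes with written == 0, b_to_fsb(offset + written) yields block 2 and skips past the block backing the failed write, so its delalloc block would never be punched; b_to_fsbt(offset) yields block 1, the block the write actually dirtied.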

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_iomap.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index e0bc290..e2fbed4 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1079,6 +1079,8 @@ xfs_file_iomap_end_delalloc(
 	int			error = 0;
 
 	start_fsb = XFS_B_TO_FSB(mp, offset + written);
+	if (unlikely(!written))
+		start_fsb = XFS_B_TO_FSBT(mp, offset);
 	end_fsb = XFS_B_TO_FSB(mp, offset + length);
 
 	/*
@@ -1090,6 +1092,9 @@ xfs_file_iomap_end_delalloc(
 	 * blocks in the range, they are ours.
 	 */
 	if (start_fsb < end_fsb) {
+		truncate_pagecache_range(VFS_I(ip), XFS_FSB_TO_B(mp, start_fsb),
+					 XFS_FSB_TO_B(mp, end_fsb) - 1);
+
 		xfs_ilock(ip, XFS_ILOCK_EXCL);
 		error = xfs_bmap_punch_delalloc_range(ip, start_fsb,
 					       end_fsb - start_fsb);
-- 
2.7.4



* [PATCH 2/4] xfs: resurrect debug mode drop buffered writes mechanism
  2017-02-14 13:03 [PATCH v3 0/4] buffered write and indlen fixes Brian Foster
  2017-02-14 13:03 ` [PATCH 1/4] xfs: clear delalloc and cache on buffered write failure Brian Foster
@ 2017-02-14 13:03 ` Brian Foster
  2017-02-16 20:24   ` Christoph Hellwig
  2017-02-14 13:03 ` [PATCH 3/4] xfs: handle indlen shortage on delalloc extent merge Brian Foster
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Brian Foster @ 2017-02-14 13:03 UTC (permalink / raw)
  To: linux-xfs

A debug mode write failure mechanism was introduced to XFS in commit
801cc4e17a ("xfs: debug mode forced buffered write failure") to
facilitate targeted testing of delalloc indirect reservation management
from userspace. This code was subsequently rendered ineffective by the
move to iomap based buffered writes in commit 68a9f5e700 ("xfs:
implement iomap based buffered write path"). This likely went unnoticed
because the associated userspace code had not made it into xfstests.

Resurrect this mechanism to facilitate effective indlen reservation
testing from xfstests. The move to iomap based buffered writes relocated
the hook this mechanism needs to return write failure from XFS to
generic code. The failure trigger must remain in XFS. Given that
limitation, convert this from a write failure mechanism to one that
simply drops writes without returning failure to userspace. Rename all
"fail_writes" references to "drop_writes" to illustrate the point. This
is more hacky than preferred, but still triggers the XFS error handling
behavior required to drive the indlen tests. This is only available in
DEBUG mode and for testing purposes only.
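
[Editorial note: the knob's accepted values can be modeled as follows. This is a hypothetical userspace-style sketch of the store handler's parsing, not the kernel code itself: only "0" and "1" flip the per-mount flag, anything else is rejected with -EINVAL, mirroring the kstrtoul plus 0/1 check in the patch.]

```c
#include <errno.h>
#include <stdlib.h>

/* toy model of drop_writes_store(): parse the written string and flip the
 * flag only for the exact values 0 and 1 */
static int drop_writes_store_model(const char *buf, int *drop_writes)
{
	char *end;
	unsigned long val = strtoul(buf, &end, 10);

	if (end == buf)		/* no digits parsed */
		return -EINVAL;
	if (val == 1)
		*drop_writes = 1;
	else if (val == 0)
		*drop_writes = 0;
	else
		return -EINVAL;
	return 0;
}
```

In a DEBUG build the real attribute sits in the per-mount sysfs directory; writing 1 there makes every buffered write behave as if it failed, exercising the delalloc punch-out path that the indlen xfstests depend on.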

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_iomap.c |  4 ++++
 fs/xfs/xfs_mount.h | 15 ++++++++-------
 fs/xfs/xfs_sysfs.c | 14 +++++++-------
 3 files changed, 19 insertions(+), 14 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index e2fbed4..01e8780 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1078,6 +1078,10 @@ xfs_file_iomap_end_delalloc(
 	xfs_fileoff_t		end_fsb;
 	int			error = 0;
 
+	/* behave as if the write failed if drop writes is enabled */
+	if (xfs_mp_drop_writes(mp))
+		written = 0;
+
 	start_fsb = XFS_B_TO_FSB(mp, offset + written);
 	if (unlikely(!written))
 		start_fsb = XFS_B_TO_FSBT(mp, offset);
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 20e2981..6db6fd6 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -200,11 +200,12 @@ typedef struct xfs_mount {
 	/*
 	 * DEBUG mode instrumentation to test and/or trigger delayed allocation
 	 * block killing in the event of failed writes. When enabled, all
-	 * buffered writes are forced to fail. All delalloc blocks in the range
-	 * of the write (including pre-existing delalloc blocks!) are tossed as
-	 * part of the write failure error handling sequence.
+	 * buffered writes are silently dropped and handled as if they failed.
+	 * All delalloc blocks in the range of the write (including pre-existing
+	 * delalloc blocks!) are tossed as part of the write failure error
+	 * handling sequence.
 	 */
-	bool			m_fail_writes;
+	bool			m_drop_writes;
 #endif
 } xfs_mount_t;
 
@@ -325,13 +326,13 @@ xfs_daddr_to_agbno(struct xfs_mount *mp, xfs_daddr_t d)
 
 #ifdef DEBUG
 static inline bool
-xfs_mp_fail_writes(struct xfs_mount *mp)
+xfs_mp_drop_writes(struct xfs_mount *mp)
 {
-	return mp->m_fail_writes;
+	return mp->m_drop_writes;
 }
 #else
 static inline bool
-xfs_mp_fail_writes(struct xfs_mount *mp)
+xfs_mp_drop_writes(struct xfs_mount *mp)
 {
 	return 0;
 }
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index de6195e..80ac15f 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -93,7 +93,7 @@ to_mp(struct kobject *kobject)
 #ifdef DEBUG
 
 STATIC ssize_t
-fail_writes_store(
+drop_writes_store(
 	struct kobject		*kobject,
 	const char		*buf,
 	size_t			count)
@@ -107,9 +107,9 @@ fail_writes_store(
 		return ret;
 
 	if (val == 1)
-		mp->m_fail_writes = true;
+		mp->m_drop_writes = true;
 	else if (val == 0)
-		mp->m_fail_writes = false;
+		mp->m_drop_writes = false;
 	else
 		return -EINVAL;
 
@@ -117,21 +117,21 @@ fail_writes_store(
 }
 
 STATIC ssize_t
-fail_writes_show(
+drop_writes_show(
 	struct kobject		*kobject,
 	char			*buf)
 {
 	struct xfs_mount	*mp = to_mp(kobject);
 
-	return snprintf(buf, PAGE_SIZE, "%d\n", mp->m_fail_writes ? 1 : 0);
+	return snprintf(buf, PAGE_SIZE, "%d\n", mp->m_drop_writes ? 1 : 0);
 }
-XFS_SYSFS_ATTR_RW(fail_writes);
+XFS_SYSFS_ATTR_RW(drop_writes);
 
 #endif /* DEBUG */
 
 static struct attribute *xfs_mp_attrs[] = {
 #ifdef DEBUG
-	ATTR_LIST(fail_writes),
+	ATTR_LIST(drop_writes),
 #endif
 	NULL,
 };
-- 
2.7.4



* [PATCH 3/4] xfs: handle indlen shortage on delalloc extent merge
  2017-02-14 13:03 [PATCH v3 0/4] buffered write and indlen fixes Brian Foster
  2017-02-14 13:03 ` [PATCH 1/4] xfs: clear delalloc and cache on buffered write failure Brian Foster
  2017-02-14 13:03 ` [PATCH 2/4] xfs: resurrect debug mode drop buffered writes mechanism Brian Foster
@ 2017-02-14 13:03 ` Brian Foster
  2017-02-16 20:25   ` Christoph Hellwig
  2017-02-14 13:03 ` [PATCH 4/4] xfs: split indlen reservations fairly when under reserved Brian Foster
  2017-02-15  5:52 ` [PATCH v3 0/4] buffered write and indlen fixes Darrick J. Wong
  4 siblings, 1 reply; 12+ messages in thread
From: Brian Foster @ 2017-02-14 13:03 UTC (permalink / raw)
  To: linux-xfs

When a delalloc extent is created, it can be merged with pre-existing,
contiguous, delalloc extents. When this occurs,
xfs_bmap_add_extent_hole_delay() merges the extents along with the
associated indirect block reservations. The expectation here is that the
combined worst case indlen reservation is always less than or equal to
the indlen reservation for the individual extents.

This is not always the case, however, as an existing extent can hold less
than the expected indlen reservation if it was previously split due
to a hole punch. If a new extent merges with such an extent, the total
indlen requirement may be larger than the sum of the indlen reservations
held by both extents.

xfs_bmap_add_extent_hole_delay() assumes that the worst case indlen
reservation is always available and assigns it to the merged extent
without consideration for the indlen held by the pre-existing extent. As
a result, the subsequent xfs_mod_fdblocks() call can attempt an
unintentional allocation rather than a free (indicated by an ASSERT()
failure). Further, if the allocation happens to fail in this context,
the failure goes unhandled and creates a filesystem wide block
accounting inconsistency.

Fix xfs_bmap_add_extent_hole_delay() to function as designed. Cap the
indlen reservation assigned to the merged extent to the sum of the
indlen reservations held by each of the individual extents.
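
[Editorial note: the fix reduces to taking a minimum, sketched here with a hypothetical helper that mirrors the XFS_FILBLKS_MIN() change, not the kernel function itself.]

```c
#include <assert.h>
#include <stdint.h>

/* Cap the merged extent's indlen at the sum of the reservations actually
 * held by the extents being merged, so the merge can only free blocks,
 * never attempt an allocation. */
static uint64_t merged_indlen(uint64_t worst_case, uint64_t oldlen)
{
	return worst_case < oldlen ? worst_case : oldlen;
}
```

If two previously split extents hold 2 + 1 indlen blocks but the worst case for the merged range is 4, the merge keeps 3 rather than trying to pull an extra block from fdblocks, which is the unintentional allocation the ASSERT() was tripping over.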

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 1cee514..73c9546 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -2805,7 +2805,8 @@ xfs_bmap_add_extent_hole_delay(
 		oldlen = startblockval(left.br_startblock) +
 			startblockval(new->br_startblock) +
 			startblockval(right.br_startblock);
-		newlen = xfs_bmap_worst_indlen(ip, temp);
+		newlen = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp),
+					 oldlen);
 		xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, *idx),
 			nullstartblock((int)newlen));
 		trace_xfs_bmap_post_update(ip, *idx, state, _THIS_IP_);
@@ -2826,7 +2827,8 @@ xfs_bmap_add_extent_hole_delay(
 		xfs_bmbt_set_blockcount(xfs_iext_get_ext(ifp, *idx), temp);
 		oldlen = startblockval(left.br_startblock) +
 			startblockval(new->br_startblock);
-		newlen = xfs_bmap_worst_indlen(ip, temp);
+		newlen = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp),
+					 oldlen);
 		xfs_bmbt_set_startblock(xfs_iext_get_ext(ifp, *idx),
 			nullstartblock((int)newlen));
 		trace_xfs_bmap_post_update(ip, *idx, state, _THIS_IP_);
@@ -2842,7 +2844,8 @@ xfs_bmap_add_extent_hole_delay(
 		temp = new->br_blockcount + right.br_blockcount;
 		oldlen = startblockval(new->br_startblock) +
 			startblockval(right.br_startblock);
-		newlen = xfs_bmap_worst_indlen(ip, temp);
+		newlen = XFS_FILBLKS_MIN(xfs_bmap_worst_indlen(ip, temp),
+					 oldlen);
 		xfs_bmbt_set_allf(xfs_iext_get_ext(ifp, *idx),
 			new->br_startoff,
 			nullstartblock((int)newlen), temp, right.br_state);
-- 
2.7.4



* [PATCH 4/4] xfs: split indlen reservations fairly when under reserved
  2017-02-14 13:03 [PATCH v3 0/4] buffered write and indlen fixes Brian Foster
                   ` (2 preceding siblings ...)
  2017-02-14 13:03 ` [PATCH 3/4] xfs: handle indlen shortage on delalloc extent merge Brian Foster
@ 2017-02-14 13:03 ` Brian Foster
  2017-02-16 20:26   ` Christoph Hellwig
  2017-02-15  5:52 ` [PATCH v3 0/4] buffered write and indlen fixes Darrick J. Wong
  4 siblings, 1 reply; 12+ messages in thread
From: Brian Foster @ 2017-02-14 13:03 UTC (permalink / raw)
  To: linux-xfs

Certain workloads that punch holes into speculative preallocation can
cause delalloc indirect reservation splits when the delalloc extent is
split in two. If further splits occur, an already short-handed extent
can be split into two in a manner that leaves zero indirect blocks for
one of the two new extents. This occurs because the shortage is large
enough that the xfs_bmap_split_indlen() algorithm completely drains the
requested indlen of one of the extents before it honors the existing
reservation.

This ultimately results in a warning from xfs_bmap_del_extent(). This
has been observed during file copies of large, sparse files using
'cp --sparse=always'.

To avoid this problem, update xfs_bmap_split_indlen() to explicitly
apply the reservation shortage fairly between both extents. This smooths
out the overall indlen shortage and defers the situation where we end up
with a delalloc extent with zero indlen reservation to extreme
circumstances.
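
[Editorial note: the fair-split logic can be sketched in userspace. This is a simplified model of the patched xfs_bmap_split_indlen() that ignores the block-stealing step; do_div() becomes plain 64-bit division here.]

```c
#include <assert.h>
#include <stdint.h>

/* Split an old reservation (ores) between two extents that want *len1 and
 * *len2 blocks: give each its percentage of what is available, then hand
 * out the rounding remainder a block at a time, making sure a zero-length
 * reservation gets a block first. */
static void split_indlen_model(uint64_t ores, uint64_t *len1, uint64_t *len2)
{
	uint64_t want1 = *len1, want2 = *len2;
	uint64_t nres = want1 + want2;
	uint64_t resfactor;

	if (ores >= nres)		/* no shortage: both fully satisfied */
		return;

	resfactor = ores * 100 / nres;	/* percent of the demand we can meet */
	*len1 = want1 * resfactor / 100;
	*len2 = want2 * resfactor / 100;

	ores -= *len1 + *len2;		/* distribute the rounding remainder */
	if (ores && !*len2 && want2) {
		(*len2)++;
		ores--;
	}
	while (ores) {
		if (*len1 < want1) {
			(*len1)++;
			ores--;
		}
		if (!ores)
			break;
		if (*len2 < want2) {
			(*len2)++;
			ores--;
		}
	}
}
```

For wants of (10, 1) with only 5 blocks available, the model hands out (4, 1): the short side still receives a block, which is exactly the zero-indlen extent the drain-one-side-first algorithm could produce.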

Reported-by: Patrick Dung <mpatdung@gmail.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 61 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 43 insertions(+), 18 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 73c9546..2ae55db 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4795,34 +4795,59 @@ xfs_bmap_split_indlen(
 	xfs_filblks_t			len2 = *indlen2;
 	xfs_filblks_t			nres = len1 + len2; /* new total res. */
 	xfs_filblks_t			stolen = 0;
+	xfs_filblks_t			resfactor;
 
 	/*
 	 * Steal as many blocks as we can to try and satisfy the worst case
 	 * indlen for both new extents.
 	 */
-	while (nres > ores && avail) {
-		nres--;
-		avail--;
-		stolen++;
-	}
+	if (ores < nres && avail)
+		stolen = XFS_FILBLKS_MIN(nres - ores, avail);
+	ores += stolen;
+
+	 /* nothing else to do if we've satisfied the new reservation */
+	if (ores >= nres)
+		return stolen;
+
+	/*
+	 * We can't meet the total required reservation for the two extents.
+	 * Calculate the percent of the overall shortage between both extents
+	 * and apply this percentage to each of the requested indlen values.
+	 * This distributes the shortage fairly and reduces the chances that one
+	 * of the two extents is left with nothing when extents are repeatedly
+	 * split.
+	 */
+	resfactor = (ores * 100);
+	do_div(resfactor, nres);
+	len1 *= resfactor;
+	do_div(len1, 100);
+	len2 *= resfactor;
+	do_div(len2, 100);
+	ASSERT(len1 + len2 <= ores);
+	ASSERT(len1 < *indlen1 && len2 < *indlen2);
 
 	/*
-	 * The only blocks available are those reserved for the original
-	 * extent and what we can steal from the extent being removed.
-	 * If this still isn't enough to satisfy the combined
-	 * requirements for the two new extents, skim blocks off of each
-	 * of the new reservations until they match what is available.
+	 * Hand out the remainder to each extent. If one of the two reservations
+	 * is zero, we want to make sure that one gets a block first. The loop
+	 * below starts with len1, so hand len2 a block right off the bat if it
+	 * is zero.
 	 */
-	while (nres > ores) {
-		if (len1) {
-			len1--;
-			nres--;
+	ores -= (len1 + len2);
+	ASSERT((*indlen1 - len1) + (*indlen2 - len2) >= ores);
+	if (ores && !len2 && *indlen2) {
+		len2++;
+		ores--;
+	}
+	while (ores) {
+		if (len1 < *indlen1) {
+			len1++;
+			ores--;
 		}
-		if (nres == ores)
+		if (!ores)
 			break;
-		if (len2) {
-			len2--;
-			nres--;
+		if (len2 < *indlen2) {
+			len2++;
+			ores--;
 		}
 	}
 
-- 
2.7.4



* Re: [PATCH v3 0/4] buffered write and indlen fixes
  2017-02-14 13:03 [PATCH v3 0/4] buffered write and indlen fixes Brian Foster
                   ` (3 preceding siblings ...)
  2017-02-14 13:03 ` [PATCH 4/4] xfs: split indlen reservations fairly when under reserved Brian Foster
@ 2017-02-15  5:52 ` Darrick J. Wong
  4 siblings, 0 replies; 12+ messages in thread
From: Darrick J. Wong @ 2017-02-15  5:52 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Feb 14, 2017 at 08:03:05AM -0500, Brian Foster wrote:
> This is just a small update to rebase onto for-next with all previous
> debug artifacts removed. No code changes.
> 
> Brian
> 
> v3:
> - Rebase to a clean for-next branch.

/me sorted out the merge wonkiness by hand, looks like it's testing ok.

Looks good, so
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

[queued for 4.11]

--D

> v2: http://www.spinics.net/lists/linux-xfs/msg04215.html
> - Use do_div() to fix 32-bit builds.
> - Prepend patches 1-2 to re-enable indlen testing from xfstests.
> v1: http://www.spinics.net/lists/linux-xfs/msg04083.html
> 
> Brian Foster (4):
>   xfs: clear delalloc and cache on buffered write failure
>   xfs: resurrect debug mode drop buffered writes mechanism
>   xfs: handle indlen shortage on delalloc extent merge
>   xfs: split indlen reservations fairly when under reserved
> 
>  fs/xfs/libxfs/xfs_bmap.c | 70 +++++++++++++++++++++++++++++++++---------------
>  fs/xfs/xfs_iomap.c       |  9 +++++++
>  fs/xfs/xfs_mount.h       | 15 ++++++-----
>  fs/xfs/xfs_sysfs.c       | 14 +++++-----
>  4 files changed, 73 insertions(+), 35 deletions(-)
> 
> -- 
> 2.7.4
> 


* Re: [PATCH 1/4] xfs: clear delalloc and cache on buffered write failure
  2017-02-14 13:03 ` [PATCH 1/4] xfs: clear delalloc and cache on buffered write failure Brian Foster
@ 2017-02-16 20:24   ` Christoph Hellwig
  2017-02-16 21:52     ` Brian Foster
  2017-02-16 22:02   ` [PATCH v2] " Brian Foster
  1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2017-02-16 20:24 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Tue, Feb 14, 2017 at 08:03:06AM -0500, Brian Foster wrote:
> The buffered write failure handling code in
> xfs_file_iomap_end_delalloc() has a couple minor problems. First, if
> written == 0, start_fsb is not rounded down and it fails to kill off a
> delalloc block if the start offset is block unaligned. This results in a
> lingering delalloc block and broken delalloc block accounting detected
> at unmount time. Fix this by rounding down start_fsb in the unlikely
> event that written == 0.
> 
> Second, it is possible for a failed overwrite of a delalloc extent to
> leave dirty pagecache around over a hole in the file. This is because it
> is possible to hit ->iomap_end() on write failure before the iomap code
> has attempted to allocate pagecache, and thus has no need to clean it up.
> If the targeted delalloc extent was successfully written by a previous
> write, however, then it does still have dirty pages when ->iomap_end()
> punches out the underlying blocks. This ultimately results in writeback
> over a hole. To fix this problem, unconditionally punch out the
> pagecache from XFS before punching out the associated delalloc range.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_iomap.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index e0bc290..e2fbed4 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1079,6 +1079,8 @@ xfs_file_iomap_end_delalloc(
>  	int			error = 0;
>  
>  	start_fsb = XFS_B_TO_FSB(mp, offset + written);
> +	if (unlikely(!written))
> +		start_fsb = XFS_B_TO_FSBT(mp, offset);

nitpick: use if/else here and maybe add a comment?

Otherwise looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 2/4] xfs: resurrect debug mode drop buffered writes mechanism
  2017-02-14 13:03 ` [PATCH 2/4] xfs: resurrect debug mode drop buffered writes mechanism Brian Foster
@ 2017-02-16 20:24   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2017-02-16 20:24 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

Looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 3/4] xfs: handle indlen shortage on delalloc extent merge
  2017-02-14 13:03 ` [PATCH 3/4] xfs: handle indlen shortage on delalloc extent merge Brian Foster
@ 2017-02-16 20:25   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2017-02-16 20:25 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

Looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 4/4] xfs: split indlen reservations fairly when under reserved
  2017-02-14 13:03 ` [PATCH 4/4] xfs: split indlen reservations fairly when under reserved Brian Foster
@ 2017-02-16 20:26   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2017-02-16 20:26 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

Looks fine,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 1/4] xfs: clear delalloc and cache on buffered write failure
  2017-02-16 20:24   ` Christoph Hellwig
@ 2017-02-16 21:52     ` Brian Foster
  0 siblings, 0 replies; 12+ messages in thread
From: Brian Foster @ 2017-02-16 21:52 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Feb 16, 2017 at 12:24:04PM -0800, Christoph Hellwig wrote:
> On Tue, Feb 14, 2017 at 08:03:06AM -0500, Brian Foster wrote:
> > The buffered write failure handling code in
> > xfs_file_iomap_end_delalloc() has a couple minor problems. First, if
> > written == 0, start_fsb is not rounded down and it fails to kill off a
> > delalloc block if the start offset is block unaligned. This results in a
> > lingering delalloc block and broken delalloc block accounting detected
> > at unmount time. Fix this by rounding down start_fsb in the unlikely
> > event that written == 0.
> > 
> > Second, it is possible for a failed overwrite of a delalloc extent to
> > leave dirty pagecache around over a hole in the file. This is because it
> > is possible to hit ->iomap_end() on write failure before the iomap code
> > has attempted to allocate pagecache, and thus has no need to clean it up.
> > If the targeted delalloc extent was successfully written by a previous
> > write, however, then it does still have dirty pages when ->iomap_end()
> > punches out the underlying blocks. This ultimately results in writeback
> > over a hole. To fix this problem, unconditionally punch out the
> > pagecache from XFS before punching out the associated delalloc range.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_iomap.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> > index e0bc290..e2fbed4 100644
> > --- a/fs/xfs/xfs_iomap.c
> > +++ b/fs/xfs/xfs_iomap.c
> > @@ -1079,6 +1079,8 @@ xfs_file_iomap_end_delalloc(
> >  	int			error = 0;
> >  
> >  	start_fsb = XFS_B_TO_FSB(mp, offset + written);
> > +	if (unlikely(!written))
> > +		start_fsb = XFS_B_TO_FSBT(mp, offset);
> 
> nitpick: use if/else here and maybe add a comment?
> 

Sure..

Brian

> Otherwise looks fine:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>


* [PATCH v2] xfs: clear delalloc and cache on buffered write failure
  2017-02-14 13:03 ` [PATCH 1/4] xfs: clear delalloc and cache on buffered write failure Brian Foster
  2017-02-16 20:24   ` Christoph Hellwig
@ 2017-02-16 22:02   ` Brian Foster
  1 sibling, 0 replies; 12+ messages in thread
From: Brian Foster @ 2017-02-16 22:02 UTC (permalink / raw)
  To: linux-xfs

The buffered write failure handling code in
xfs_file_iomap_end_delalloc() has a couple minor problems. First, if
written == 0, start_fsb is not rounded down and it fails to kill off a
delalloc block if the start offset is block unaligned. This results in a
lingering delalloc block and broken delalloc block accounting detected
at unmount time. Fix this by rounding down start_fsb in the unlikely
event that written == 0.

Second, it is possible for a failed overwrite of a delalloc extent to
leave dirty pagecache around over a hole in the file. This is because it
is possible to hit ->iomap_end() on write failure before the iomap code
has attempted to allocate pagecache, and thus has no need to clean it up.
If the targeted delalloc extent was successfully written by a previous
write, however, then it does still have dirty pages when ->iomap_end()
punches out the underlying blocks. This ultimately results in writeback
over a hole. To fix this problem, unconditionally punch out the
pagecache from XFS before punching out the associated delalloc range.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---

v2:
- Used if/else logic and added a comment.

 fs/xfs/xfs_iomap.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index e0bc290..4009e7c 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1078,7 +1078,15 @@ xfs_file_iomap_end_delalloc(
 	xfs_fileoff_t		end_fsb;
 	int			error = 0;
 
-	start_fsb = XFS_B_TO_FSB(mp, offset + written);
+	/*
+	 * start_fsb refers to the first unused block after a short write. If
+	 * nothing was written, round offset down to point at the first block in
+	 * the range.
+	 */
+	if (unlikely(!written))
+		start_fsb = XFS_B_TO_FSBT(mp, offset);
+	else
+		start_fsb = XFS_B_TO_FSB(mp, offset + written);
 	end_fsb = XFS_B_TO_FSB(mp, offset + length);
 
 	/*
@@ -1090,6 +1098,9 @@ xfs_file_iomap_end_delalloc(
 	 * blocks in the range, they are ours.
 	 */
 	if (start_fsb < end_fsb) {
+		truncate_pagecache_range(VFS_I(ip), XFS_FSB_TO_B(mp, start_fsb),
+					 XFS_FSB_TO_B(mp, end_fsb) - 1);
+
 		xfs_ilock(ip, XFS_ILOCK_EXCL);
 		error = xfs_bmap_punch_delalloc_range(ip, start_fsb,
 					       end_fsb - start_fsb);
-- 
2.7.4



