linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out
@ 2021-01-18 22:11 Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 01/11] xfs: refactor messy xfs_inode_free_quota_* functions Darrick J. Wong
                   ` (10 more replies)
  0 siblings, 11 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:11 UTC (permalink / raw)
  To: djwong; +Cc: Christoph Hellwig, linux-xfs

Hi all,

Historically, when users ran out of space or quota when trying to write
to the filesystem, XFS didn't try very hard to reclaim space that it
might have speculatively allocated for the purpose of speeding up
front-end filesystem operations (appending writes, cow staging).  The
upcoming deferred inactivation series will greatly increase the amount
of allocated space that isn't actively being used to store user data.

Therefore, try to reduce the circumstances where we return EDQUOT or
ENOSPC to userspace by teaching the write paths to try to clear space
and retry the operation one time before giving up.

v2: clean up and rebase against 5.11.
v3: restructure the retry loops per dchinner suggestion

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=reclaim-space-harder-5.12
---
 fs/xfs/libxfs/xfs_attr.c |   10 ++
 fs/xfs/libxfs/xfs_bmap.c |   10 ++
 fs/xfs/xfs_bmap_util.c   |   22 +++--
 fs/xfs/xfs_file.c        |   26 ++++--
 fs/xfs/xfs_icache.c      |  189 +++++++++++++++++++++++++++++++---------------
 fs/xfs/xfs_icache.h      |    9 ++
 fs/xfs/xfs_inode.c       |   22 ++++-
 fs/xfs/xfs_ioctl.c       |   18 +++-
 fs/xfs/xfs_iomap.c       |   24 ++++--
 fs/xfs/xfs_iops.c        |   20 +++--
 fs/xfs/xfs_qm.c          |   36 +++++++--
 fs/xfs/xfs_quota.h       |   36 +++++----
 fs/xfs/xfs_reflink.c     |   31 +++++---
 fs/xfs/xfs_symlink.c     |   12 ++-
 fs/xfs/xfs_trace.c       |    1 
 fs/xfs/xfs_trace.h       |   40 ++++++++++
 fs/xfs/xfs_trans.c       |   20 +++++
 fs/xfs/xfs_trans_dquot.c |  108 +++++++++++++++++++++++---
 18 files changed, 466 insertions(+), 168 deletions(-)


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 01/11] xfs: refactor messy xfs_inode_free_quota_* functions
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks Darrick J. Wong
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: Christoph Hellwig, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

The functions to run an eof/cowblocks scan to try to reduce quota usage
are kind of a mess -- the logic repeatedly initializes an eofb structure
and there are logic bugs in the code that result in the cowblocks scan
never actually happening.

Replace all three functions with a single function that fills out an
eofb if we're low on quota and runs both eof and cowblocks scans.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_file.c   |   15 ++++++---------
 fs/xfs/xfs_icache.c |   46 ++++++++++++++++------------------------------
 fs/xfs/xfs_icache.h |    4 ++--
 3 files changed, 24 insertions(+), 41 deletions(-)


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5b0f93f73837..5d5cf25668b5 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -652,7 +652,7 @@ xfs_file_buffered_aio_write(
 	struct inode		*inode = mapping->host;
 	struct xfs_inode	*ip = XFS_I(inode);
 	ssize_t			ret;
-	int			enospc = 0;
+	bool			cleared_space = false;
 	int			iolock;
 
 	if (iocb->ki_flags & IOCB_NOWAIT)
@@ -684,19 +684,16 @@ xfs_file_buffered_aio_write(
 	 * also behaves as a filter to prevent too many eofblocks scans from
 	 * running at the same time.
 	 */
-	if (ret == -EDQUOT && !enospc) {
+	if (ret == -EDQUOT && !cleared_space) {
 		xfs_iunlock(ip, iolock);
-		enospc = xfs_inode_free_quota_eofblocks(ip);
-		if (enospc)
-			goto write_retry;
-		enospc = xfs_inode_free_quota_cowblocks(ip);
-		if (enospc)
+		cleared_space = xfs_inode_free_quota_blocks(ip);
+		if (cleared_space)
 			goto write_retry;
 		iolock = 0;
-	} else if (ret == -ENOSPC && !enospc) {
+	} else if (ret == -ENOSPC && !cleared_space) {
 		struct xfs_eofblocks eofb = {0};
 
-		enospc = 1;
+		cleared_space = true;
 		xfs_flush_inodes(ip->i_mount);
 
 		xfs_iunlock(ip, iolock);
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index deb99300d171..c71eb15e3835 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1397,33 +1397,31 @@ xfs_icache_free_eofblocks(
 }
 
 /*
- * Run eofblocks scans on the quotas applicable to the inode. For inodes with
- * multiple quotas, we don't know exactly which quota caused an allocation
+ * Run cow/eofblocks scans on the quotas applicable to the inode. For inodes
+ * with multiple quotas, we don't know exactly which quota caused an allocation
  * failure. We make a best effort by including each quota under low free space
  * conditions (less than 1% free space) in the scan.
  */
-static int
-__xfs_inode_free_quota_eofblocks(
-	struct xfs_inode	*ip,
-	int			(*execute)(struct xfs_mount *mp,
-					   struct xfs_eofblocks	*eofb))
+bool
+xfs_inode_free_quota_blocks(
+	struct xfs_inode	*ip)
 {
-	int scan = 0;
-	struct xfs_eofblocks eofb = {0};
-	struct xfs_dquot *dq;
+	struct xfs_eofblocks	eofb = {0};
+	struct xfs_dquot	*dq;
+	bool			do_work = false;
 
 	/*
 	 * Run a sync scan to increase effectiveness and use the union filter to
 	 * cover all applicable quotas in a single scan.
 	 */
-	eofb.eof_flags = XFS_EOF_FLAGS_UNION|XFS_EOF_FLAGS_SYNC;
+	eofb.eof_flags = XFS_EOF_FLAGS_UNION | XFS_EOF_FLAGS_SYNC;
 
 	if (XFS_IS_UQUOTA_ENFORCED(ip->i_mount)) {
 		dq = xfs_inode_dquot(ip, XFS_DQTYPE_USER);
 		if (dq && xfs_dquot_lowsp(dq)) {
 			eofb.eof_uid = VFS_I(ip)->i_uid;
 			eofb.eof_flags |= XFS_EOF_FLAGS_UID;
-			scan = 1;
+			do_work = true;
 		}
 	}
 
@@ -1432,21 +1430,16 @@ __xfs_inode_free_quota_eofblocks(
 		if (dq && xfs_dquot_lowsp(dq)) {
 			eofb.eof_gid = VFS_I(ip)->i_gid;
 			eofb.eof_flags |= XFS_EOF_FLAGS_GID;
-			scan = 1;
+			do_work = true;
 		}
 	}
 
-	if (scan)
-		execute(ip->i_mount, &eofb);
+	if (!do_work)
+		return false;
 
-	return scan;
-}
-
-int
-xfs_inode_free_quota_eofblocks(
-	struct xfs_inode *ip)
-{
-	return __xfs_inode_free_quota_eofblocks(ip, xfs_icache_free_eofblocks);
+	xfs_icache_free_eofblocks(ip->i_mount, &eofb);
+	xfs_icache_free_cowblocks(ip->i_mount, &eofb);
+	return true;
 }
 
 static inline unsigned long
@@ -1646,13 +1639,6 @@ xfs_icache_free_cowblocks(
 			XFS_ICI_COWBLOCKS_TAG);
 }
 
-int
-xfs_inode_free_quota_cowblocks(
-	struct xfs_inode *ip)
-{
-	return __xfs_inode_free_quota_eofblocks(ip, xfs_icache_free_cowblocks);
-}
-
 void
 xfs_inode_set_cowblocks_tag(
 	xfs_inode_t	*ip)
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 3a4c8b382cd0..3f7ddbca8638 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -54,17 +54,17 @@ long xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 
+bool xfs_inode_free_quota_blocks(struct xfs_inode *ip);
+
 void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
 void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
 int xfs_icache_free_eofblocks(struct xfs_mount *, struct xfs_eofblocks *);
-int xfs_inode_free_quota_eofblocks(struct xfs_inode *ip);
 void xfs_eofblocks_worker(struct work_struct *);
 void xfs_queue_eofblocks(struct xfs_mount *);
 
 void xfs_inode_set_cowblocks_tag(struct xfs_inode *ip);
 void xfs_inode_clear_cowblocks_tag(struct xfs_inode *ip);
 int xfs_icache_free_cowblocks(struct xfs_mount *, struct xfs_eofblocks *);
-int xfs_inode_free_quota_cowblocks(struct xfs_inode *ip);
 void xfs_cowblocks_worker(struct work_struct *);
 void xfs_queue_cowblocks(struct xfs_mount *);
 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 01/11] xfs: refactor messy xfs_inode_free_quota_* functions Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-19  6:49   ` Christoph Hellwig
  2021-01-18 22:12 ` [PATCH 03/11] xfs: xfs_inode_free_quota_blocks should scan project quota Darrick J. Wong
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Don't stall the cowblocks scan on a locked inode if we possibly can.
We'd much rather the background scanner keep moving.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_icache.c |   21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)


diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index c71eb15e3835..89f9e692fde7 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1605,17 +1605,31 @@ xfs_inode_free_cowblocks(
 	void			*args)
 {
 	struct xfs_eofblocks	*eofb = args;
+	bool			wait;
 	int			ret = 0;
 
+	wait = eofb && (eofb->eof_flags & XFS_EOF_FLAGS_SYNC);
+
 	if (!xfs_prep_free_cowblocks(ip))
 		return 0;
 
 	if (!xfs_inode_matches_eofb(ip, eofb))
 		return 0;
 
-	/* Free the CoW blocks */
-	xfs_ilock(ip, XFS_IOLOCK_EXCL);
-	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
+	/*
+	 * If the caller is waiting, return -EAGAIN to keep the background
+	 * scanner moving and revisit the inode in a subsequent pass.
+	 */
+	if (!xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
+		if (wait)
+			return -EAGAIN;
+		return 0;
+	}
+	if (!xfs_ilock_nowait(ip, XFS_MMAPLOCK_EXCL)) {
+		if (wait)
+			ret = -EAGAIN;
+		goto out_iolock;
+	}
 
 	/*
 	 * Check again, nobody else should be able to dirty blocks or change
@@ -1625,6 +1639,7 @@ xfs_inode_free_cowblocks(
 		ret = xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, false);
 
 	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
+out_iolock:
 	xfs_iunlock(ip, XFS_IOLOCK_EXCL);
 
 	return ret;


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 03/11] xfs: xfs_inode_free_quota_blocks should scan project quota
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 01/11] xfs: refactor messy xfs_inode_free_quota_* functions Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 04/11] xfs: move and rename xfs_inode_free_quota_blocks to avoid conflicts Darrick J. Wong
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: Christoph Hellwig, linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Buffered writers who have run out of quota reservation call
xfs_inode_free_quota_blocks to try to free any space reservations that
might reduce the quota usage.  Unfortunately, the buffered write path
treats "out of project quota" the same as "out of overall space" so this
function has never supported scanning for space that might ease an "out
of project quota" condition.

We're about to start using this function for cases where we actually
/can/ tell if we're out of project quota, so add in this functionality.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_icache.c |    9 +++++++++
 1 file changed, 9 insertions(+)


diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 89f9e692fde7..10c1a0dee17d 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1434,6 +1434,15 @@ xfs_inode_free_quota_blocks(
 		}
 	}
 
+	if (XFS_IS_PQUOTA_ENFORCED(ip->i_mount)) {
+		dq = xfs_inode_dquot(ip, XFS_DQTYPE_PROJ);
+		if (dq && xfs_dquot_lowsp(dq)) {
+			eofb.eof_prid = ip->i_d.di_projid;
+			eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
+			do_work = true;
+		}
+	}
+
 	if (!do_work)
 		return false;
 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 04/11] xfs: move and rename xfs_inode_free_quota_blocks to avoid conflicts
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
                   ` (2 preceding siblings ...)
  2021-01-18 22:12 ` [PATCH 03/11] xfs: xfs_inode_free_quota_blocks should scan project quota Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-19  6:53   ` Christoph Hellwig
  2021-01-18 22:12 ` [PATCH 05/11] xfs: pass flags and return gc errors from xfs_blockgc_free_quota Darrick J. Wong
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Move this function further down in the file so that later cleanups won't
have to declare static functions.  Change the name because we're about
to rework all the code that performs garbage collection of speculatively
allocated file blocks.  No functional changes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_file.c   |    2 -
 fs/xfs/xfs_icache.c |  110 ++++++++++++++++++++++++++-------------------------
 fs/xfs/xfs_icache.h |    2 -
 3 files changed, 57 insertions(+), 57 deletions(-)


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5d5cf25668b5..54c658d0f738 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -686,7 +686,7 @@ xfs_file_buffered_aio_write(
 	 */
 	if (ret == -EDQUOT && !cleared_space) {
 		xfs_iunlock(ip, iolock);
-		cleared_space = xfs_inode_free_quota_blocks(ip);
+		cleared_space = xfs_blockgc_free_quota(ip);
 		if (cleared_space)
 			goto write_retry;
 		iolock = 0;
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 10c1a0dee17d..aba901d5637b 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1396,61 +1396,6 @@ xfs_icache_free_eofblocks(
 			XFS_ICI_EOFBLOCKS_TAG);
 }
 
-/*
- * Run cow/eofblocks scans on the quotas applicable to the inode. For inodes
- * with multiple quotas, we don't know exactly which quota caused an allocation
- * failure. We make a best effort by including each quota under low free space
- * conditions (less than 1% free space) in the scan.
- */
-bool
-xfs_inode_free_quota_blocks(
-	struct xfs_inode	*ip)
-{
-	struct xfs_eofblocks	eofb = {0};
-	struct xfs_dquot	*dq;
-	bool			do_work = false;
-
-	/*
-	 * Run a sync scan to increase effectiveness and use the union filter to
-	 * cover all applicable quotas in a single scan.
-	 */
-	eofb.eof_flags = XFS_EOF_FLAGS_UNION | XFS_EOF_FLAGS_SYNC;
-
-	if (XFS_IS_UQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_USER);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_uid = VFS_I(ip)->i_uid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_UID;
-			do_work = true;
-		}
-	}
-
-	if (XFS_IS_GQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_GROUP);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_gid = VFS_I(ip)->i_gid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_GID;
-			do_work = true;
-		}
-	}
-
-	if (XFS_IS_PQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_PROJ);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_prid = ip->i_d.di_projid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
-			do_work = true;
-		}
-	}
-
-	if (!do_work)
-		return false;
-
-	xfs_icache_free_eofblocks(ip->i_mount, &eofb);
-	xfs_icache_free_cowblocks(ip->i_mount, &eofb);
-	return true;
-}
-
 static inline unsigned long
 xfs_iflag_for_tag(
 	int		tag)
@@ -1699,3 +1644,58 @@ xfs_start_block_reaping(
 	xfs_queue_eofblocks(mp);
 	xfs_queue_cowblocks(mp);
 }
+
+/*
+ * Run cow/eofblocks scans on the quotas applicable to the inode. For inodes
+ * with multiple quotas, we don't know exactly which quota caused an allocation
+ * failure. We make a best effort by including each quota under low free space
+ * conditions (less than 1% free space) in the scan.
+ */
+bool
+xfs_blockgc_free_quota(
+	struct xfs_inode	*ip)
+{
+	struct xfs_eofblocks	eofb = {0};
+	struct xfs_dquot	*dq;
+	bool			do_work = false;
+
+	/*
+	 * Run a sync scan to increase effectiveness and use the union filter to
+	 * cover all applicable quotas in a single scan.
+	 */
+	eofb.eof_flags = XFS_EOF_FLAGS_UNION | XFS_EOF_FLAGS_SYNC;
+
+	if (XFS_IS_UQUOTA_ENFORCED(ip->i_mount)) {
+		dq = xfs_inode_dquot(ip, XFS_DQTYPE_USER);
+		if (dq && xfs_dquot_lowsp(dq)) {
+			eofb.eof_uid = VFS_I(ip)->i_uid;
+			eofb.eof_flags |= XFS_EOF_FLAGS_UID;
+			do_work = true;
+		}
+	}
+
+	if (XFS_IS_GQUOTA_ENFORCED(ip->i_mount)) {
+		dq = xfs_inode_dquot(ip, XFS_DQTYPE_GROUP);
+		if (dq && xfs_dquot_lowsp(dq)) {
+			eofb.eof_gid = VFS_I(ip)->i_gid;
+			eofb.eof_flags |= XFS_EOF_FLAGS_GID;
+			do_work = true;
+		}
+	}
+
+	if (XFS_IS_PQUOTA_ENFORCED(ip->i_mount)) {
+		dq = xfs_inode_dquot(ip, XFS_DQTYPE_PROJ);
+		if (dq && xfs_dquot_lowsp(dq)) {
+			eofb.eof_prid = ip->i_d.di_projid;
+			eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
+			do_work = true;
+		}
+	}
+
+	if (!do_work)
+		return false;
+
+	xfs_icache_free_eofblocks(ip->i_mount, &eofb);
+	xfs_icache_free_cowblocks(ip->i_mount, &eofb);
+	return true;
+}
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 3f7ddbca8638..21b726a05b0d 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -54,7 +54,7 @@ long xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 
-bool xfs_inode_free_quota_blocks(struct xfs_inode *ip);
+bool xfs_blockgc_free_quota(struct xfs_inode *ip);
 
 void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
 void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 05/11] xfs: pass flags and return gc errors from xfs_blockgc_free_quota
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
                   ` (3 preceding siblings ...)
  2021-01-18 22:12 ` [PATCH 04/11] xfs: move and rename xfs_inode_free_quota_blocks to avoid conflicts Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-19  6:59   ` Christoph Hellwig
  2021-01-18 22:12 ` [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks Darrick J. Wong
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Change the signature of xfs_blockgc_free_quota in preparation for the
next few patches.  Callers can now pass EOF_FLAGS into the function to
control scan parameters; and the function will now pass back any
corruption errors seen while scanning, so that we can fail fast.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_file.c   |    7 ++++++-
 fs/xfs/xfs_icache.c |   35 ++++++++++++++++++++++-------------
 fs/xfs/xfs_icache.h |    3 ++-
 3 files changed, 30 insertions(+), 15 deletions(-)


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 54c658d0f738..a318a4749b59 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -685,8 +685,13 @@ xfs_file_buffered_aio_write(
 	 * running at the same time.
 	 */
 	if (ret == -EDQUOT && !cleared_space) {
+		int	ret2;
+
 		xfs_iunlock(ip, iolock);
-		cleared_space = xfs_blockgc_free_quota(ip);
+		ret2 = xfs_blockgc_free_quota(ip, XFS_EOF_FLAGS_SYNC,
+				&cleared_space);
+		if (ret2)
+			return ret2;
 		if (cleared_space)
 			goto write_retry;
 		iolock = 0;
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index aba901d5637b..1e0ffc0fb73c 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1651,26 +1651,30 @@ xfs_start_block_reaping(
  * failure. We make a best effort by including each quota under low free space
  * conditions (less than 1% free space) in the scan.
  */
-bool
+int
 xfs_blockgc_free_quota(
-	struct xfs_inode	*ip)
+	struct xfs_inode	*ip,
+	unsigned int		eof_flags,
+	bool			*found_work)
 {
 	struct xfs_eofblocks	eofb = {0};
 	struct xfs_dquot	*dq;
-	bool			do_work = false;
+	int			error;
+
+	*found_work = false;
 
 	/*
-	 * Run a sync scan to increase effectiveness and use the union filter to
+	 * Run a scan to increase effectiveness and use the union filter to
 	 * cover all applicable quotas in a single scan.
 	 */
-	eofb.eof_flags = XFS_EOF_FLAGS_UNION | XFS_EOF_FLAGS_SYNC;
+	eofb.eof_flags = XFS_EOF_FLAGS_UNION | eof_flags;
 
 	if (XFS_IS_UQUOTA_ENFORCED(ip->i_mount)) {
 		dq = xfs_inode_dquot(ip, XFS_DQTYPE_USER);
 		if (dq && xfs_dquot_lowsp(dq)) {
 			eofb.eof_uid = VFS_I(ip)->i_uid;
 			eofb.eof_flags |= XFS_EOF_FLAGS_UID;
-			do_work = true;
+			*found_work = true;
 		}
 	}
 
@@ -1679,7 +1683,7 @@ xfs_blockgc_free_quota(
 		if (dq && xfs_dquot_lowsp(dq)) {
 			eofb.eof_gid = VFS_I(ip)->i_gid;
 			eofb.eof_flags |= XFS_EOF_FLAGS_GID;
-			do_work = true;
+			*found_work = true;
 		}
 	}
 
@@ -1688,14 +1692,19 @@ xfs_blockgc_free_quota(
 		if (dq && xfs_dquot_lowsp(dq)) {
 			eofb.eof_prid = ip->i_d.di_projid;
 			eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
-			do_work = true;
+			*found_work = true;
 		}
 	}
 
-	if (!do_work)
-		return false;
+	if (*found_work) {
+		error = xfs_icache_free_eofblocks(ip->i_mount, &eofb);
+		if (error)
+			return error;
 
-	xfs_icache_free_eofblocks(ip->i_mount, &eofb);
-	xfs_icache_free_cowblocks(ip->i_mount, &eofb);
-	return true;
+		error = xfs_icache_free_cowblocks(ip->i_mount, &eofb);
+		if (error)
+			return error;
+	}
+
+	return 0;
 }
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 21b726a05b0d..f7b6ead6fc08 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -54,7 +54,8 @@ long xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 
-bool xfs_blockgc_free_quota(struct xfs_inode *ip);
+int xfs_blockgc_free_quota(struct xfs_inode *ip, unsigned int eof_flags,
+		bool *found_work);
 
 void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
 void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
                   ` (4 preceding siblings ...)
  2021-01-18 22:12 ` [PATCH 05/11] xfs: pass flags and return gc errors from xfs_blockgc_free_quota Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-19  3:56   ` Darrick J. Wong
  2021-01-19  7:08   ` Christoph Hellwig
  2021-01-18 22:12 ` [PATCH 07/11] xfs: flush eof/cowblocks if we can't reserve quota for inode creation Darrick J. Wong
                   ` (4 subsequent siblings)
  10 siblings, 2 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If a fs modification (data write, reflink, xattr set, fallocate, etc.)
is unable to reserve enough quota to handle the modification, try
clearing whatever space the filesystem might have been hanging onto in
the hopes of speeding up the filesystem.  The flushing behavior will
become particularly important when we add deferred inode inactivation
because that will increase the amount of space that isn't actively tied
to user data.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_attr.c |   10 +++++--
 fs/xfs/libxfs/xfs_bmap.c |   10 +++++--
 fs/xfs/xfs_bmap_util.c   |   22 ++++++++++------
 fs/xfs/xfs_iomap.c       |   24 ++++++++++++-----
 fs/xfs/xfs_quota.h       |   15 +++++++----
 fs/xfs/xfs_reflink.c     |   31 +++++++++++++++-------
 fs/xfs/xfs_trans_dquot.c |   64 ++++++++++++++++++++++++++++++++++++++--------
 7 files changed, 129 insertions(+), 47 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index fd8e6418a0d3..49da931d0d19 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -395,6 +395,7 @@ xfs_attr_set(
 	struct xfs_mount	*mp = dp->i_mount;
 	struct xfs_trans_res	tres;
 	bool			rsvd = (args->attr_filter & XFS_ATTR_ROOT);
+	bool			quota_retry = false;
 	int			error, local;
 	unsigned int		total;
 
@@ -453,6 +454,7 @@ xfs_attr_set(
 	 * Root fork attributes can use reserved data blocks for this
 	 * operation if necessary
 	 */
+retry:
 	error = xfs_trans_alloc(mp, &tres, total, 0,
 			rsvd ? XFS_TRANS_RESERVE : 0, &args->trans);
 	if (error)
@@ -465,10 +467,12 @@ xfs_attr_set(
 
 		if (rsvd)
 			quota_flags |= XFS_QMOPT_FORCE_RES;
-		error = xfs_trans_reserve_quota_nblks(args->trans, dp,
-				args->total, 0, quota_flags);
+		error = xfs_trans_reserve_quota_nblks(&args->trans, dp,
+				args->total, 0, quota_flags, &quota_retry);
 		if (error)
-			goto out_trans_cancel;
+			return error;
+		if (quota_retry)
+			goto retry;
 
 		error = xfs_has_attr(args);
 		if (error == -EEXIST && (args->attr_flags & XATTR_CREATE))
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index fa6b442eb75f..1bca18a7381b 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1070,6 +1070,7 @@ xfs_bmap_add_attrfork(
 	int			blks;		/* space reservation */
 	int			version = 1;	/* superblock attr version */
 	int			logflags;	/* logging flags */
+	bool			quota_retry = false;
 	int			error;		/* error return value */
 
 	ASSERT(XFS_IFORK_Q(ip) == 0);
@@ -1079,17 +1080,20 @@ xfs_bmap_add_attrfork(
 
 	blks = XFS_ADDAFORK_SPACE_RES(mp);
 
+retry:
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_addafork, blks, 0,
 			rsvd ? XFS_TRANS_RESERVE : 0, &tp);
 	if (error)
 		return error;
 
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
-	error = xfs_trans_reserve_quota_nblks(tp, ip, blks, 0, rsvd ?
+	error = xfs_trans_reserve_quota_nblks(&tp, ip, blks, 0, rsvd ?
 			XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_FORCE_RES :
-			XFS_QMOPT_RES_REGBLKS);
+			XFS_QMOPT_RES_REGBLKS, &quota_retry);
 	if (error)
-		goto trans_cancel;
+		return error;
+	if (quota_retry)
+		goto retry;
 	if (XFS_IFORK_Q(ip))
 		goto trans_cancel;
 
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 671cb104861e..d12c9e9e5c7b 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -761,6 +761,7 @@ xfs_alloc_file_space(
 	 */
 	while (allocatesize_fsb && !error) {
 		xfs_fileoff_t	s, e;
+		bool		quota_retry = false;
 
 		/*
 		 * Determine space reservations for data/realtime.
@@ -803,6 +804,7 @@ xfs_alloc_file_space(
 		/*
 		 * Allocate and setup the transaction.
 		 */
+retry:
 		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks,
 				resrtextents, 0, &tp);
 
@@ -817,10 +819,12 @@ xfs_alloc_file_space(
 			break;
 		}
 		xfs_ilock(ip, XFS_ILOCK_EXCL);
-		error = xfs_trans_reserve_quota_nblks(tp, ip, qblocks,
-						      0, quota_flag);
+		error = xfs_trans_reserve_quota_nblks(&tp, ip, qblocks, 0,
+				quota_flag, &quota_retry);
 		if (error)
-			goto error1;
+			return error;
+		if (quota_retry)
+			goto retry;
 
 		xfs_trans_ijoin(tp, ip, 0);
 
@@ -853,8 +857,6 @@ xfs_alloc_file_space(
 
 error0:	/* unlock inode, unreserve quota blocks, cancel trans */
 	xfs_trans_unreserve_quota_nblks(tp, ip, (long)qblocks, 0, quota_flag);
-
-error1:	/* Just cancel transaction */
 	xfs_trans_cancel(tp);
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
@@ -870,8 +872,10 @@ xfs_unmap_extent(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp;
 	uint			resblks = XFS_DIOSTRAT_SPACE_RES(mp, 0);
+	bool			quota_retry = false;
 	int			error;
 
+retry:
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
 	if (error) {
 		ASSERT(error == -ENOSPC || XFS_FORCED_SHUTDOWN(mp));
@@ -879,10 +883,12 @@ xfs_unmap_extent(
 	}
 
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
-	error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0,
-			XFS_QMOPT_RES_REGBLKS);
+	error = xfs_trans_reserve_quota_nblks(&tp, ip, resblks, 0,
+			XFS_QMOPT_RES_REGBLKS, &quota_retry);
 	if (error)
-		goto out_trans_cancel;
+		return error;
+	if (quota_retry)
+		goto retry;
 
 	xfs_trans_ijoin(tp, ip, 0);
 
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 7b9ff824e82d..d9583f1de2fc 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -27,7 +27,7 @@
 #include "xfs_dquot_item.h"
 #include "xfs_dquot.h"
 #include "xfs_reflink.h"
-
+#include "xfs_icache.h"
 
 #define XFS_ALLOC_ALIGN(mp, off) \
 	(((off) >> mp->m_allocsize_log) << mp->m_allocsize_log)
@@ -197,6 +197,7 @@ xfs_iomap_write_direct(
 	int			quota_flag;
 	uint			qblocks, resblks;
 	unsigned int		resrtextents = 0;
+	bool			quota_retry = false;
 	int			error;
 	int			bmapi_flags = XFS_BMAPI_PREALLOC;
 	uint			tflags = 0;
@@ -239,6 +240,7 @@ xfs_iomap_write_direct(
 			resblks = XFS_DIOSTRAT_SPACE_RES(mp, 0) << 1;
 		}
 	}
+retry:
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, resrtextents,
 			tflags, &tp);
 	if (error)
@@ -246,9 +248,12 @@ xfs_iomap_write_direct(
 
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 
-	error = xfs_trans_reserve_quota_nblks(tp, ip, qblocks, 0, quota_flag);
+	error = xfs_trans_reserve_quota_nblks(&tp, ip, qblocks, 0, quota_flag,
+			&quota_retry);
 	if (error)
-		goto out_trans_cancel;
+		return error;
+	if (quota_retry)
+		goto retry;
 
 	xfs_trans_ijoin(tp, ip, 0);
 
@@ -286,7 +291,6 @@ xfs_iomap_write_direct(
 
 out_res_cancel:
 	xfs_trans_unreserve_quota_nblks(tp, ip, (long)qblocks, 0, quota_flag);
-out_trans_cancel:
 	xfs_trans_cancel(tp);
 	goto out_unlock;
 }
@@ -539,6 +543,8 @@ xfs_iomap_write_unwritten(
 		return error;
 
 	do {
+		bool	quota_retry = false;
+
 		/*
 		 * Set up a transaction to convert the range of extents
 		 * from unwritten to real. Do allocations in a loop until
@@ -548,6 +554,7 @@ xfs_iomap_write_unwritten(
 		 * here as we might be asked to write out the same inode that we
 		 * complete here and might deadlock on the iolock.
 		 */
+retry:
 		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0,
 				XFS_TRANS_RESERVE, &tp);
 		if (error)
@@ -556,10 +563,13 @@ xfs_iomap_write_unwritten(
 		xfs_ilock(ip, XFS_ILOCK_EXCL);
 		xfs_trans_ijoin(tp, ip, 0);
 
-		error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0,
-				XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_FORCE_RES);
+		error = xfs_trans_reserve_quota_nblks(&tp, ip, resblks, 0,
+				XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_FORCE_RES,
+				&quota_retry);
 		if (error)
-			goto error_on_bmapi_transaction;
+			return error;
+		if (quota_retry)
+			goto retry;
 
 		/*
 		 * Modify the unwritten extent state of the buffer.
diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
index 1c37b5be60c3..996046eb1492 100644
--- a/fs/xfs/xfs_quota.h
+++ b/fs/xfs/xfs_quota.h
@@ -81,8 +81,9 @@ extern void xfs_trans_mod_dquot_byino(struct xfs_trans *, struct xfs_inode *,
 		uint, int64_t);
 extern void xfs_trans_apply_dquot_deltas(struct xfs_trans *);
 extern void xfs_trans_unreserve_and_mod_dquots(struct xfs_trans *);
-extern int xfs_trans_reserve_quota_nblks(struct xfs_trans *,
-		struct xfs_inode *, int64_t, long, uint);
+int xfs_trans_reserve_quota_nblks(struct xfs_trans **tpp, struct xfs_inode *ip,
+		int64_t nblocks, long ninos, unsigned int flags,
+		bool *retry);
 extern int xfs_trans_reserve_quota_bydquots(struct xfs_trans *,
 		struct xfs_mount *, struct xfs_dquot *,
 		struct xfs_dquot *, struct xfs_dquot *, int64_t, long, uint);
@@ -128,7 +129,8 @@ xfs_qm_vop_dqalloc(struct xfs_inode *ip, kuid_t kuid, kgid_t kgid,
 #define xfs_trans_apply_dquot_deltas(tp)
 #define xfs_trans_unreserve_and_mod_dquots(tp)
 static inline int xfs_trans_reserve_quota_nblks(struct xfs_trans *tp,
-		struct xfs_inode *ip, int64_t nblks, long ninos, uint flags)
+		struct xfs_inode *ip, int64_t nblks, long ninos,
+		unsigned int flags, bool *retry)
 {
 	return 0;
 }
@@ -170,14 +172,17 @@ static inline int
 xfs_trans_unreserve_quota_nblks(struct xfs_trans *tp, struct xfs_inode *ip,
 		int64_t nblks, long ninos, unsigned int flags)
 {
-	return xfs_trans_reserve_quota_nblks(tp, ip, -nblks, -ninos, flags);
+	return xfs_trans_reserve_quota_nblks(&tp, ip, -nblks, -ninos, flags,
+			NULL);
 }
 
 static inline int
 xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t nblks,
 		unsigned int flags)
 {
-	return xfs_trans_reserve_quota_nblks(NULL, ip, nblks, 0, flags);
+	struct xfs_trans	*tp = NULL;
+
+	return xfs_trans_reserve_quota_nblks(&tp, ip, nblks, 0, flags, NULL);
 }
 
 static inline int
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 1f3270ffaea5..8d768c3c4b25 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -351,13 +351,14 @@ xfs_reflink_allocate_cow(
 	bool			convert_now)
 {
 	struct xfs_mount	*mp = ip->i_mount;
+	struct xfs_trans	*tp;
 	xfs_fileoff_t		offset_fsb = imap->br_startoff;
 	xfs_filblks_t		count_fsb = imap->br_blockcount;
-	struct xfs_trans	*tp;
-	int			nimaps, error = 0;
-	bool			found;
 	xfs_filblks_t		resaligned;
 	xfs_extlen_t		resblks = 0;
+	bool			found;
+	bool			quota_retry = false;
+	int			nimaps, error = 0;
 
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
 	if (!ip->i_cowfp) {
@@ -376,6 +377,7 @@ xfs_reflink_allocate_cow(
 	resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned);
 
 	xfs_iunlock(ip, *lockmode);
+retry:
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
 	*lockmode = XFS_ILOCK_EXCL;
 	xfs_ilock(ip, *lockmode);
@@ -398,10 +400,15 @@ xfs_reflink_allocate_cow(
 		goto convert;
 	}
 
-	error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0,
-			XFS_QMOPT_RES_REGBLKS);
-	if (error)
-		goto out_trans_cancel;
+	error = xfs_trans_reserve_quota_nblks(&tp, ip, resblks, 0,
+			XFS_QMOPT_RES_REGBLKS, &quota_retry);
+	if (error) {
+		/* This function must return with the ILOCK held. */
+		xfs_ilock(ip, *lockmode);
+		return error;
+	}
+	if (quota_retry)
+		goto retry;
 
 	xfs_trans_ijoin(tp, ip, 0);
 
@@ -1000,9 +1007,11 @@ xfs_reflink_remap_extent(
 	unsigned int		resblks;
 	bool			smap_real;
 	bool			dmap_written = xfs_bmap_is_written_extent(dmap);
+	bool			quota_retry = false;
 	int			nimaps;
 	int			error;
 
+retry:
 	/* Start a rolling transaction to switch the mappings */
 	resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
@@ -1087,10 +1096,12 @@ xfs_reflink_remap_extent(
 	if (!smap_real && dmap_written)
 		qres += dmap->br_blockcount;
 	if (qres > 0) {
-		error = xfs_trans_reserve_quota_nblks(tp, ip, qres, 0,
-				XFS_QMOPT_RES_REGBLKS);
+		error = xfs_trans_reserve_quota_nblks(&tp, ip, qres, 0,
+				XFS_QMOPT_RES_REGBLKS, &quota_retry);
 		if (error)
-			goto out_cancel;
+			goto out;
+		if (quota_retry)
+			goto retry;
 	}
 
 	if (smap_real) {
diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
index 3315498a6fa1..c6abe1f1106c 100644
--- a/fs/xfs/xfs_trans_dquot.c
+++ b/fs/xfs/xfs_trans_dquot.c
@@ -16,6 +16,7 @@
 #include "xfs_quota.h"
 #include "xfs_qm.h"
 #include "xfs_trace.h"
+#include "xfs_icache.h"
 
 STATIC void	xfs_trans_alloc_dqinfo(xfs_trans_t *);
 
@@ -770,21 +771,40 @@ xfs_trans_reserve_quota_bydquots(
 	return error;
 }
 
-
 /*
- * Lock the dquot and change the reservation if we can.
- * This doesn't change the actual usage, just the reservation.
- * The inode sent in is locked.
+ * Lock the dquot and change the reservation if we can.  This doesn't change
+ * the actual usage, just the reservation.  The caller must hold ILOCK_EXCL on
+ * the inode.  If @retry is not a NULL pointer, the caller must ensure that
+ * *retry is set to false before the first time this function is called.
+ *
+ * If the quota reservation fails because we hit a quota limit (and retry is
+ * not a NULL pointer, and *retry is true), this function will try to invoke
+ * the speculative preallocation gc scanner to reduce quota usage.  In order to
+ * do that, we cancel the transaction, NULL out tpp, and drop the ILOCK.  If we
+ * actually do some freeing work, *retry will be set to true.
+ *
+ * This function ends one of four ways:
+ *  a) if retry is NULL, returns the result of trying to change the quota
+ *     reservation, with *tpp still set and the inode still locked;
+ *  b) returns an error code with *tpp cancelled and set to NULL, and the inode
+ *     unlocked;
+ *  c) returns zero with *tpp cancelled and set to NULL, the inode unlocked,
+ *     and *retry is true, which means we cleared space and the caller should
+ *     try again with a new transaction;
+ *  d) returns zero with *tpp still set, the inode still locked, and *retry
+ *     is false, which means the reservation succeeded.
  */
 int
 xfs_trans_reserve_quota_nblks(
-	struct xfs_trans	*tp,
+	struct xfs_trans	**tpp,
 	struct xfs_inode	*ip,
 	int64_t			nblks,
 	long			ninos,
-	uint			flags)
+	unsigned int		flags,
+	bool			*retry)
 {
 	struct xfs_mount	*mp = ip->i_mount;
+	int			error, err2;
 
 	if (!XFS_IS_QUOTA_RUNNING(mp) || !XFS_IS_QUOTA_ON(mp))
 		return 0;
@@ -795,13 +815,35 @@ xfs_trans_reserve_quota_nblks(
 	ASSERT((flags & ~(XFS_QMOPT_FORCE_RES)) == XFS_TRANS_DQ_RES_RTBLKS ||
 	       (flags & ~(XFS_QMOPT_FORCE_RES)) == XFS_TRANS_DQ_RES_BLKS);
 
+	/* Reserve nblks against these dquots, with trans as the mediator. */
+	error = xfs_trans_reserve_quota_bydquots(*tpp, mp, ip->i_udquot,
+			ip->i_gdquot, ip->i_pdquot, nblks, ninos, flags);
+
+	/* Exit immediately if the caller did not want retries. */
+	if (retry == NULL)
+		return error;
+
 	/*
-	 * Reserve nblks against these dquots, with trans as the mediator.
+	 * If the caller /can/ handle retries, we always cancel the transaction
+	 * on error, even if we aren't going to attempt a gc scan.
 	 */
-	return xfs_trans_reserve_quota_bydquots(tp, mp,
-						ip->i_udquot, ip->i_gdquot,
-						ip->i_pdquot,
-						nblks, ninos, flags);
+	if (error) {
+		xfs_trans_cancel(*tpp);
+		*tpp = NULL;
+		xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	}
+
+	/* We only allow one retry for EDQUOT/ENOSPC. */
+	if (*retry || (error != -EDQUOT && error != -ENOSPC)) {
+		*retry = false;
+		return error;
+	}
+
+	/* Try to free some quota for this file's dquots. */
+	err2 = xfs_blockgc_free_quota(ip, 0, retry);
+	if (err2)
+		return err2;
+	return *retry ? 0 : error;
 }
 
 /* Change the quota reservations for an inode creation activity. */


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 07/11] xfs: flush eof/cowblocks if we can't reserve quota for inode creation
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
                   ` (5 preceding siblings ...)
  2021-01-18 22:12 ` [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 08/11] xfs: flush eof/cowblocks if we can't reserve quota for chown Darrick J. Wong
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If an inode creation is unable to reserve enough quota to handle the
modification, try clearing whatever space the filesystem might have been
hanging onto in the hopes of speeding up the filesystem.  The flushing
behavior will become particularly important when we add deferred inode
inactivation because that will increase the amount of space that isn't
actively tied to user data.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_icache.c      |   79 +++++++++++++++++++++++++++++-----------------
 fs/xfs/xfs_icache.h      |    3 ++
 fs/xfs/xfs_inode.c       |   22 +++++++++----
 fs/xfs/xfs_quota.h       |   13 ++++----
 fs/xfs/xfs_symlink.c     |   12 +++++--
 fs/xfs/xfs_trans_dquot.c |   44 +++++++++++++++++++++++---
 6 files changed, 123 insertions(+), 50 deletions(-)


diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 1e0ffc0fb73c..9ba1ad69abb7 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1646,65 +1646,84 @@ xfs_start_block_reaping(
 }
 
 /*
- * Run cow/eofblocks scans on the quotas applicable to the inode. For inodes
- * with multiple quotas, we don't know exactly which quota caused an allocation
- * failure. We make a best effort by including each quota under low free space
- * conditions (less than 1% free space) in the scan.
+ * Run cow/eofblocks scans on the supplied dquots.  We don't know exactly which
+ * quota caused an allocation failure, so we make a best effort by including
+ * each quota under low free space conditions (less than 1% free space) in the
+ * scan.
  */
 int
-xfs_blockgc_free_quota(
-	struct xfs_inode	*ip,
+xfs_blockgc_free_dquots(
+	struct xfs_dquot	*udqp,
+	struct xfs_dquot	*gdqp,
+	struct xfs_dquot	*pdqp,
 	unsigned int		eof_flags,
 	bool			*found_work)
 {
 	struct xfs_eofblocks	eofb = {0};
-	struct xfs_dquot	*dq;
+	struct xfs_mount	*mp = NULL;
 	int			error;
 
 	*found_work = false;
 
+	if (!udqp && !gdqp && !pdqp)
+		return 0;
+	if (udqp)
+		mp = udqp->q_mount;
+	if (!mp && gdqp)
+		mp = gdqp->q_mount;
+	if (!mp && pdqp)
+		mp = pdqp->q_mount;
+
 	/*
 	 * Run a scan to increase effectiveness and use the union filter to
 	 * cover all applicable quotas in a single scan.
 	 */
 	eofb.eof_flags = XFS_EOF_FLAGS_UNION | eof_flags;
 
-	if (XFS_IS_UQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_USER);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_uid = VFS_I(ip)->i_uid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_UID;
-			*found_work = true;
-		}
+	if (XFS_IS_UQUOTA_ENFORCED(mp) && udqp && xfs_dquot_lowsp(udqp)) {
+		eofb.eof_uid = make_kuid(mp->m_super->s_user_ns, udqp->q_id);
+		eofb.eof_flags |= XFS_EOF_FLAGS_UID;
+		*found_work = true;
 	}
 
-	if (XFS_IS_GQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_GROUP);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_gid = VFS_I(ip)->i_gid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_GID;
-			*found_work = true;
-		}
+	if (XFS_IS_UQUOTA_ENFORCED(mp) && gdqp && xfs_dquot_lowsp(gdqp)) {
+		eofb.eof_gid = make_kgid(mp->m_super->s_user_ns, gdqp->q_id);
+		eofb.eof_flags |= XFS_EOF_FLAGS_GID;
+		*found_work = true;
 	}
 
-	if (XFS_IS_PQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_PROJ);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_prid = ip->i_d.di_projid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
-			*found_work = true;
-		}
+	if (XFS_IS_PQUOTA_ENFORCED(mp) && pdqp && xfs_dquot_lowsp(pdqp)) {
+		eofb.eof_prid = pdqp->q_id;
+		eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
+		*found_work = true;
 	}
 
 	if (*found_work) {
-		error = xfs_icache_free_eofblocks(ip->i_mount, &eofb);
+		error = xfs_icache_free_eofblocks(mp, &eofb);
 		if (error)
 			return error;
 
-		error = xfs_icache_free_cowblocks(ip->i_mount, &eofb);
+		error = xfs_icache_free_cowblocks(mp, &eofb);
 		if (error)
 			return error;
 	}
 
 	return 0;
 }
+/*
+ * Run cow/eofblocks scans on the quotas applicable to the inode. For inodes
+ * with multiple quotas, we don't know exactly which quota caused an allocation
+ * failure. We make a best effort by including each quota under low free space
+ * conditions (less than 1% free space) in the scan.
+ */
+int
+xfs_blockgc_free_quota(
+	struct xfs_inode	*ip,
+	unsigned int		eof_flags,
+	bool			*found_work)
+{
+	return xfs_blockgc_free_dquots(xfs_inode_dquot(ip, XFS_DQTYPE_USER),
+			xfs_inode_dquot(ip, XFS_DQTYPE_GROUP),
+			xfs_inode_dquot(ip, XFS_DQTYPE_PROJ), eof_flags,
+			found_work);
+}
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index f7b6ead6fc08..8d8e7cabc27f 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -54,6 +54,9 @@ long xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 
+int xfs_blockgc_free_dquots(struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
+		struct xfs_dquot *pdqp, unsigned int eof_flags,
+		bool *found_work);
 int xfs_blockgc_free_quota(struct xfs_inode *ip, unsigned int eof_flags,
 		bool *found_work);
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 43cff78c20c4..596b6d10d2bc 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -990,6 +990,7 @@ xfs_create(
 	struct xfs_dquot	*gdqp = NULL;
 	struct xfs_dquot	*pdqp = NULL;
 	struct xfs_trans_res	*tres;
+	bool			quota_retry = false;
 	uint			resblks;
 
 	trace_xfs_create(dp, name);
@@ -1022,6 +1023,7 @@ xfs_create(
 	 * the case we'll drop the one we have and get a more
 	 * appropriate transaction later.
 	 */
+retry:
 	error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
 	if (error == -ENOSPC) {
 		/* flush outstanding delalloc blocks and retry */
@@ -1037,10 +1039,12 @@ xfs_create(
 	/*
 	 * Reserve disk quota and the inode.
 	 */
-	error = xfs_trans_reserve_quota_icreate(tp, dp, udqp, gdqp, pdqp,
-			resblks);
+	error = xfs_trans_reserve_quota_icreate(&tp, dp, &unlock_dp_on_error,
+			udqp, gdqp, pdqp, resblks, &quota_retry);
 	if (error)
-		goto out_trans_cancel;
+		goto out_release_dquots;
+	if (quota_retry)
+		goto retry;
 
 	/*
 	 * A newly created regular or special file just has one directory
@@ -1117,6 +1121,7 @@ xfs_create(
 		xfs_irele(ip);
 	}
 
+out_release_dquots:
 	xfs_qm_dqrele(udqp);
 	xfs_qm_dqrele(gdqp);
 	xfs_qm_dqrele(pdqp);
@@ -1141,6 +1146,7 @@ xfs_create_tmpfile(
 	struct xfs_dquot	*gdqp = NULL;
 	struct xfs_dquot	*pdqp = NULL;
 	struct xfs_trans_res	*tres;
+	bool			quota_retry = false;
 	uint			resblks;
 
 	if (XFS_FORCED_SHUTDOWN(mp))
@@ -1160,14 +1166,17 @@ xfs_create_tmpfile(
 	resblks = XFS_IALLOC_SPACE_RES(mp);
 	tres = &M_RES(mp)->tr_create_tmpfile;
 
+retry:
 	error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
 	if (error)
 		goto out_release_inode;
 
-	error = xfs_trans_reserve_quota_icreate(tp, dp, udqp, gdqp, pdqp,
-			resblks);
+	error = xfs_trans_reserve_quota_icreate(&tp, dp, NULL, udqp, gdqp, pdqp,
+			resblks, &quota_retry);
 	if (error)
-		goto out_trans_cancel;
+		goto out_release_dquots;
+	if (quota_retry)
+		goto retry;
 
 	error = xfs_dir_ialloc(&tp, dp, mode, 0, 0, prid, &ip);
 	if (error)
@@ -1211,6 +1220,7 @@ xfs_create_tmpfile(
 		xfs_irele(ip);
 	}
 
+out_release_dquots:
 	xfs_qm_dqrele(udqp);
 	xfs_qm_dqrele(gdqp);
 	xfs_qm_dqrele(pdqp);
diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
index 996046eb1492..ec09b38a9687 100644
--- a/fs/xfs/xfs_quota.h
+++ b/fs/xfs/xfs_quota.h
@@ -87,9 +87,10 @@ int xfs_trans_reserve_quota_nblks(struct xfs_trans **tpp, struct xfs_inode *ip,
 extern int xfs_trans_reserve_quota_bydquots(struct xfs_trans *,
 		struct xfs_mount *, struct xfs_dquot *,
 		struct xfs_dquot *, struct xfs_dquot *, int64_t, long, uint);
-int xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_inode *dp,
-		struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
-		struct xfs_dquot *pdqp, int64_t nblks);
+int xfs_trans_reserve_quota_icreate(struct xfs_trans **tpp,
+		struct xfs_inode *dp, bool *dp_locked, struct xfs_dquot *udqp,
+		struct xfs_dquot *gdqp, struct xfs_dquot *pdqp, int64_t nblks,
+		bool *retry);
 
 extern int xfs_qm_vop_dqalloc(struct xfs_inode *, kuid_t, kgid_t,
 		prid_t, uint, struct xfs_dquot **, struct xfs_dquot **,
@@ -147,9 +148,9 @@ static inline int xfs_quota_reserve_blkres(struct xfs_inode *ip,
 	return 0;
 }
 static inline int
-xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_inode *dp,
-		struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
-		struct xfs_dquot *pdqp, int64_t nblks)
+xfs_trans_reserve_quota_icreate(struct xfs_trans **tpp, struct xfs_inode *dp,
+		bool *dp_locked, struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
+		struct xfs_dquot *pdqp, int64_t nblks, bool *retry)
 {
 	return 0;
 }
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index e53f7bc2b820..535585a821d0 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -159,6 +159,7 @@ xfs_symlink(
 	struct xfs_dquot	*udqp = NULL;
 	struct xfs_dquot	*gdqp = NULL;
 	struct xfs_dquot	*pdqp = NULL;
+	bool			quota_retry = false;
 	uint			resblks;
 
 	*ipp = NULL;
@@ -197,6 +198,7 @@ xfs_symlink(
 		fs_blocks = xfs_symlink_blocks(mp, pathlen);
 	resblks = XFS_SYMLINK_SPACE_RES(mp, link_name->len, fs_blocks);
 
+retry:
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_symlink, resblks, 0, 0, &tp);
 	if (error)
 		goto out_release_inode;
@@ -215,10 +217,12 @@ xfs_symlink(
 	/*
 	 * Reserve disk quota : blocks and inode.
 	 */
-	error = xfs_trans_reserve_quota_icreate(tp, dp, udqp, gdqp, pdqp,
-			resblks);
+	error = xfs_trans_reserve_quota_icreate(&tp, dp, &unlock_dp_on_error,
+			udqp, gdqp, pdqp, resblks, &quota_retry);
 	if (error)
-		goto out_trans_cancel;
+		goto out_release_dquots;
+	if (quota_retry)
+		goto retry;
 
 	/*
 	 * Allocate an inode for the symlink.
@@ -342,7 +346,7 @@ xfs_symlink(
 		xfs_finish_inode_setup(ip);
 		xfs_irele(ip);
 	}
-
+out_release_dquots:
 	xfs_qm_dqrele(udqp);
 	xfs_qm_dqrele(gdqp);
 	xfs_qm_dqrele(pdqp);
diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
index c6abe1f1106c..7fc5aa69c1c5 100644
--- a/fs/xfs/xfs_trans_dquot.c
+++ b/fs/xfs/xfs_trans_dquot.c
@@ -846,25 +846,61 @@ xfs_trans_reserve_quota_nblks(
 	return *retry ? 0 : error;
 }
 
-/* Change the quota reservations for an inode creation activity. */
+/*
+ * Change the quota reservations for an inode creation activity.  This function
+ * has the same return behavior as xfs_trans_reserve_quota_nblks but with the
+ * added twist that we update *dp_locked so that the caller can track the ILOCK
+ * state of the parent directory.
+ */
 int
 xfs_trans_reserve_quota_icreate(
-	struct xfs_trans	*tp,
+	struct xfs_trans	**tpp,
 	struct xfs_inode	*dp,
+	bool			*dp_locked,
 	struct xfs_dquot	*udqp,
 	struct xfs_dquot	*gdqp,
 	struct xfs_dquot	*pdqp,
-	int64_t			nblks)
+	int64_t			nblks,
+	bool			*retry)
 {
 	struct xfs_mount	*mp = dp->i_mount;
+	int			error, err2;
 
 	if (!XFS_IS_QUOTA_RUNNING(mp) || !XFS_IS_QUOTA_ON(mp))
 		return 0;
 
 	ASSERT(!xfs_is_quota_inode(&mp->m_sb, dp->i_ino));
 
-	return xfs_trans_reserve_quota_bydquots(tp, dp->i_mount, udqp, gdqp,
+	error = xfs_trans_reserve_quota_bydquots(*tpp, dp->i_mount, udqp, gdqp,
 			pdqp, nblks, 1, XFS_QMOPT_RES_REGBLKS);
+
+	/* Exit immediately if the caller did not want retries. */
+	if (retry == NULL)
+		return error;
+
+	/*
+	 * If the caller /can/ handle retries, we always cancel the transaction
+	 * on error, even if we aren't going to attempt a gc scan.
+	 */
+	if (error) {
+		xfs_trans_cancel(*tpp);
+		*tpp = NULL;
+		if (dp_locked && *dp_locked) {
+			xfs_iunlock(dp, XFS_ILOCK_EXCL);
+			*dp_locked = false;
+		}
+	}
+	/* We only allow one retry for EDQUOT/ENOSPC. */
+	if (*retry || (error != -EDQUOT && error != -ENOSPC)) {
+		*retry = false;
+		return error;
+	}
+
+	/* Try to free some quota for the new file's dquots. */
+	err2 = xfs_blockgc_free_dquots(udqp, gdqp, pdqp, 0, retry);
+	if (err2)
+		return err2;
+	return *retry ? 0 : error;
 }
 
 /*


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 08/11] xfs: flush eof/cowblocks if we can't reserve quota for chown
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
                   ` (6 preceding siblings ...)
  2021-01-18 22:12 ` [PATCH 07/11] xfs: flush eof/cowblocks if we can't reserve quota for inode creation Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 09/11] xfs: add a tracepoint for blockgc scans Darrick J. Wong
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If a file user, group, or project change is unable to reserve enough
quota to handle the modification, try clearing whatever space the
filesystem might have been hanging onto in the hopes of speeding up the
filesystem.  The flushing behavior will become particularly important
when we add deferred inode inactivation because that will increase the
amount of space that isn't actively tied to user data.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_ioctl.c |   16 ++++++++++++----
 fs/xfs/xfs_iops.c  |   20 ++++++++++++--------
 fs/xfs/xfs_qm.c    |   36 +++++++++++++++++++++++++++---------
 fs/xfs/xfs_quota.h |    8 ++++----
 4 files changed, 55 insertions(+), 25 deletions(-)


diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 3fbd98f61ea5..70d9637b7806 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1436,6 +1436,7 @@ xfs_ioctl_setattr(
 	struct xfs_trans	*tp;
 	struct xfs_dquot	*pdqp = NULL;
 	struct xfs_dquot	*olddquot = NULL;
+	bool			quota_retry = false;
 	int			code;
 
 	trace_xfs_ioctl_setattr(ip);
@@ -1462,6 +1463,7 @@ xfs_ioctl_setattr(
 
 	xfs_ioctl_setattr_prepare_dax(ip, fa);
 
+retry:
 	tp = xfs_ioctl_setattr_get_trans(ip);
 	if (IS_ERR(tp)) {
 		code = PTR_ERR(tp);
@@ -1470,10 +1472,16 @@ xfs_ioctl_setattr(
 
 	if (XFS_IS_QUOTA_RUNNING(mp) && XFS_IS_PQUOTA_ON(mp) &&
 	    ip->i_d.di_projid != fa->fsx_projid) {
-		code = xfs_qm_vop_chown_reserve(tp, ip, NULL, NULL, pdqp,
-				capable(CAP_FOWNER) ?  XFS_QMOPT_FORCE_RES : 0);
-		if (code)	/* out of quota */
-			goto error_trans_cancel;
+		unsigned int	flags = 0;
+
+		if (capable(CAP_FOWNER))
+			flags |= XFS_QMOPT_FORCE_RES;
+		code = xfs_qm_vop_chown_reserve(&tp, ip, NULL, NULL, pdqp,
+				flags, &quota_retry);
+		if (code)
+			goto error_free_dquots;
+		if (quota_retry)
+			goto retry;
 	}
 
 	xfs_fill_fsxattr(ip, false, &old_fa);
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 67c8dc9de8aa..3cc6c4aa01c3 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -660,6 +660,7 @@ xfs_setattr_nonsize(
 	kgid_t			gid = GLOBAL_ROOT_GID, igid = GLOBAL_ROOT_GID;
 	struct xfs_dquot	*udqp = NULL, *gdqp = NULL;
 	struct xfs_dquot	*olddquot1 = NULL, *olddquot2 = NULL;
+	bool			quota_retry = false;
 
 	ASSERT((mask & ATTR_SIZE) == 0);
 
@@ -700,6 +701,7 @@ xfs_setattr_nonsize(
 			return error;
 	}
 
+retry:
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_ichange, 0, 0, 0, &tp);
 	if (error)
 		goto out_dqrele;
@@ -729,12 +731,17 @@ xfs_setattr_nonsize(
 		if (XFS_IS_QUOTA_RUNNING(mp) &&
 		    ((XFS_IS_UQUOTA_ON(mp) && !uid_eq(iuid, uid)) ||
 		     (XFS_IS_GQUOTA_ON(mp) && !gid_eq(igid, gid)))) {
+			unsigned int	flags = 0;
+
+			if (capable(CAP_FOWNER))
+				flags |= XFS_QMOPT_FORCE_RES;
 			ASSERT(tp);
-			error = xfs_qm_vop_chown_reserve(tp, ip, udqp, gdqp,
-						NULL, capable(CAP_FOWNER) ?
-						XFS_QMOPT_FORCE_RES : 0);
-			if (error)	/* out of quota */
-				goto out_cancel;
+			error = xfs_qm_vop_chown_reserve(&tp, ip, udqp, gdqp,
+					NULL, flags, &quota_retry);
+			if (error)
+				goto out_dqrele;
+			if (quota_retry)
+				goto retry;
 		}
 
 		/*
@@ -814,9 +821,6 @@ xfs_setattr_nonsize(
 
 	return 0;
 
-out_cancel:
-	xfs_trans_cancel(tp);
-	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 out_dqrele:
 	xfs_qm_dqrele(udqp);
 	xfs_qm_dqrele(gdqp);
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index c134eb4aeaa8..4c095526aaed 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -1795,27 +1795,29 @@ xfs_qm_vop_chown(
 }
 
 /*
- * Quota reservations for setattr(AT_UID|AT_GID|AT_PROJID).
+ * Quota reservations for setattr(AT_UID|AT_GID|AT_PROJID).  This function has
+ * the same return behavior as xfs_trans_reserve_quota_nblks.
  */
 int
 xfs_qm_vop_chown_reserve(
-	struct xfs_trans	*tp,
+	struct xfs_trans	**tpp,
 	struct xfs_inode	*ip,
 	struct xfs_dquot	*udqp,
 	struct xfs_dquot	*gdqp,
 	struct xfs_dquot	*pdqp,
-	uint			flags)
+	unsigned int		flags,
+	bool			*retry)
 {
 	struct xfs_mount	*mp = ip->i_mount;
 	uint64_t		delblks;
 	unsigned int		blkflags;
-	struct xfs_dquot	*udq_unres = NULL;
+	struct xfs_dquot	*udq_unres = NULL; /* old dquots */
 	struct xfs_dquot	*gdq_unres = NULL;
 	struct xfs_dquot	*pdq_unres = NULL;
-	struct xfs_dquot	*udq_delblks = NULL;
+	struct xfs_dquot	*udq_delblks = NULL; /* new dquots */
 	struct xfs_dquot	*gdq_delblks = NULL;
 	struct xfs_dquot	*pdq_delblks = NULL;
-	int			error;
+	int			error, err2;
 
 
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL|XFS_ILOCK_SHARED));
@@ -1856,11 +1858,11 @@ xfs_qm_vop_chown_reserve(
 		}
 	}
 
-	error = xfs_trans_reserve_quota_bydquots(tp, ip->i_mount,
+	error = xfs_trans_reserve_quota_bydquots(*tpp, ip->i_mount,
 				udq_delblks, gdq_delblks, pdq_delblks,
 				ip->i_d.di_nblocks, 1, flags | blkflags);
 	if (error)
-		return error;
+		goto err;
 
 	/*
 	 * Do the delayed blks reservations/unreservations now. Since, these
@@ -1878,13 +1880,29 @@ xfs_qm_vop_chown_reserve(
 			    udq_delblks, gdq_delblks, pdq_delblks,
 			    (xfs_qcnt_t)delblks, 0, flags | blkflags);
 		if (error)
-			return error;
+			goto err;
 		xfs_trans_reserve_quota_bydquots(NULL, ip->i_mount,
 				udq_unres, gdq_unres, pdq_unres,
 				-((xfs_qcnt_t)delblks), 0, blkflags);
 	}
 
 	return 0;
+err:
+	xfs_trans_cancel(*tpp);
+	*tpp = NULL;
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	/* We only allow one retry for EDQUOT/ENOSPC. */
+	if (*retry || (error != -EDQUOT && error != -ENOSPC)) {
+		*retry = false;
+		return error;
+	}
+
+	/* Try to free some quota in the new dquots. */
+	err2 = xfs_blockgc_free_dquots(udq_delblks, gdq_delblks, pdq_delblks,
+			0, retry);
+	if (err2)
+		return err2;
+	return *retry ? 0 : error;
 }
 
 int
diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
index ec09b38a9687..2ec57a65b1db 100644
--- a/fs/xfs/xfs_quota.h
+++ b/fs/xfs/xfs_quota.h
@@ -100,9 +100,9 @@ extern void xfs_qm_vop_create_dqattach(struct xfs_trans *, struct xfs_inode *,
 extern int xfs_qm_vop_rename_dqattach(struct xfs_inode **);
 extern struct xfs_dquot *xfs_qm_vop_chown(struct xfs_trans *,
 		struct xfs_inode *, struct xfs_dquot **, struct xfs_dquot *);
-extern int xfs_qm_vop_chown_reserve(struct xfs_trans *, struct xfs_inode *,
-		struct xfs_dquot *, struct xfs_dquot *,
-		struct xfs_dquot *, uint);
+int xfs_qm_vop_chown_reserve(struct xfs_trans **tpp, struct xfs_inode *ip,
+		struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
+		struct xfs_dquot *pdqp, unsigned int flags, bool *retry);
 extern int xfs_qm_dqattach(struct xfs_inode *);
 extern int xfs_qm_dqattach_locked(struct xfs_inode *ip, bool doalloc);
 extern void xfs_qm_dqdetach(struct xfs_inode *);
@@ -157,7 +157,7 @@ xfs_trans_reserve_quota_icreate(struct xfs_trans **tpp, struct xfs_inode *dp,
 #define xfs_qm_vop_create_dqattach(tp, ip, u, g, p)
 #define xfs_qm_vop_rename_dqattach(it)					(0)
 #define xfs_qm_vop_chown(tp, ip, old, new)				(NULL)
-#define xfs_qm_vop_chown_reserve(tp, ip, u, g, p, fl)			(0)
+#define xfs_qm_vop_chown_reserve(tpp, ip, u, g, p, fl, retry)		(0)
 #define xfs_qm_dqattach(ip)						(0)
 #define xfs_qm_dqattach_locked(ip, fl)					(0)
 #define xfs_qm_dqdetach(ip)


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 09/11] xfs: add a tracepoint for blockgc scans
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
                   ` (7 preceding siblings ...)
  2021-01-18 22:12 ` [PATCH 08/11] xfs: flush eof/cowblocks if we can't reserve quota for chown Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-19  7:11   ` Christoph Hellwig
  2021-01-18 22:12 ` [PATCH 10/11] xfs: refactor xfs_icache_free_{eof,cow}blocks call sites Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 11/11] xfs: flush speculative space allocations when we run out of space Darrick J. Wong
  10 siblings, 1 reply; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Add some tracepoints so that we can observe when the speculative
preallocation garbage collector runs.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_ioctl.c |    2 ++
 fs/xfs/xfs_trace.c |    1 +
 fs/xfs/xfs_trace.h |   39 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 42 insertions(+)


diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 70d9637b7806..ccff1f9ca6e9 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -2356,6 +2356,8 @@ xfs_file_ioctl(
 		if (error)
 			return error;
 
+		trace_xfs_ioc_free_eofblocks(mp, &keofb, _RET_IP_);
+
 		sb_start_write(mp->m_super);
 		error = xfs_icache_free_eofblocks(mp, &keofb);
 		sb_end_write(mp->m_super);
diff --git a/fs/xfs/xfs_trace.c b/fs/xfs/xfs_trace.c
index 120398a37c2a..9b8d703dc9fd 100644
--- a/fs/xfs/xfs_trace.c
+++ b/fs/xfs/xfs_trace.c
@@ -29,6 +29,7 @@
 #include "xfs_filestream.h"
 #include "xfs_fsmap.h"
 #include "xfs_btree_staging.h"
+#include "xfs_icache.h"
 
 /*
  * We include this last to have the helpers above available for the trace
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 5a263ae3d4f0..3f761f33099b 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -37,6 +37,7 @@ struct xfs_trans_res;
 struct xfs_inobt_rec_incore;
 union xfs_btree_ptr;
 struct xfs_dqtrx;
+struct xfs_eofblocks;
 
 #define XFS_ATTR_FILTER_FLAGS \
 	{ XFS_ATTR_ROOT,	"ROOT" }, \
@@ -3888,6 +3889,44 @@ DEFINE_EVENT(xfs_timestamp_range_class, name, \
 DEFINE_TIMESTAMP_RANGE_EVENT(xfs_inode_timestamp_range);
 DEFINE_TIMESTAMP_RANGE_EVENT(xfs_quota_expiry_range);
 
+DECLARE_EVENT_CLASS(xfs_eofblocks_class,
+	TP_PROTO(struct xfs_mount *mp, struct xfs_eofblocks *eofb,
+		 unsigned long caller_ip),
+	TP_ARGS(mp, eofb, caller_ip),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(__u32, flags)
+		__field(uint32_t, uid)
+		__field(uint32_t, gid)
+		__field(prid_t, prid)
+		__field(__u64, min_file_size)
+		__field(unsigned long, caller_ip)
+	),
+	TP_fast_assign(
+		__entry->dev = mp->m_super->s_dev;
+		__entry->flags = eofb->eof_flags;
+		__entry->uid = from_kuid(mp->m_super->s_user_ns, eofb->eof_uid);
+		__entry->gid = from_kgid(mp->m_super->s_user_ns, eofb->eof_gid);
+		__entry->prid = eofb->eof_prid;
+		__entry->min_file_size = eofb->eof_min_file_size;
+		__entry->caller_ip = caller_ip;
+	),
+	TP_printk("dev %d:%d flags 0x%x uid %u gid %u prid %u minsize %llu caller %pS",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->flags,
+		  __entry->uid,
+		  __entry->gid,
+		  __entry->prid,
+		  __entry->min_file_size,
+		  (char *)__entry->caller_ip)
+);
+#define DEFINE_EOFBLOCKS_EVENT(name)	\
+DEFINE_EVENT(xfs_eofblocks_class, name,	\
+	TP_PROTO(struct xfs_mount *mp, struct xfs_eofblocks *eofb, \
+		 unsigned long caller_ip), \
+	TP_ARGS(mp, eofb, caller_ip))
+DEFINE_EOFBLOCKS_EVENT(xfs_ioc_free_eofblocks);
+
 #endif /* _TRACE_XFS_H */
 
 #undef TRACE_INCLUDE_PATH


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 10/11] xfs: refactor xfs_icache_free_{eof,cow}blocks call sites
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
                   ` (8 preceding siblings ...)
  2021-01-18 22:12 ` [PATCH 09/11] xfs: add a tracepoint for blockgc scans Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  2021-01-18 22:12 ` [PATCH 11/11] xfs: flush speculative space allocations when we run out of space Darrick J. Wong
  10 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

In anticipation of more restructuring of the eof/cowblocks gc code,
refactor calling of those two functions into a single internal helper
function, then present a new standard interface to purge speculative
block preallocations and start shifting higher level code to use that.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_file.c   |    6 ++++--
 fs/xfs/xfs_icache.c |   45 +++++++++++++++++++++++++++++++++++----------
 fs/xfs/xfs_icache.h |    1 +
 fs/xfs/xfs_trace.h  |    1 +
 4 files changed, 41 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index a318a4749b59..40b12db17a20 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -697,14 +697,16 @@ xfs_file_buffered_aio_write(
 		iolock = 0;
 	} else if (ret == -ENOSPC && !cleared_space) {
 		struct xfs_eofblocks eofb = {0};
+		int	ret2;
 
 		cleared_space = true;
 		xfs_flush_inodes(ip->i_mount);
 
 		xfs_iunlock(ip, iolock);
 		eofb.eof_flags = XFS_EOF_FLAGS_SYNC;
-		xfs_icache_free_eofblocks(ip->i_mount, &eofb);
-		xfs_icache_free_cowblocks(ip->i_mount, &eofb);
+		ret2 = xfs_blockgc_free_space(ip->i_mount, &eofb);
+		if (ret2)
+			return ret2;
 		goto write_retry;
 	}
 
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 9ba1ad69abb7..acf67384e52f 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1645,6 +1645,38 @@ xfs_start_block_reaping(
 	xfs_queue_cowblocks(mp);
 }
 
+/* Scan all incore inodes for block preallocations that we can remove. */
+static inline int
+xfs_blockgc_scan(
+	struct xfs_mount	*mp,
+	struct xfs_eofblocks	*eofb)
+{
+	int			error;
+
+	error = xfs_icache_free_eofblocks(mp, eofb);
+	if (error)
+		return error;
+
+	error = xfs_icache_free_cowblocks(mp, eofb);
+	if (error)
+		return error;
+
+	return 0;
+}
+
+/*
+ * Try to free space in the filesystem by purging eofblocks and cowblocks.
+ */
+int
+xfs_blockgc_free_space(
+	struct xfs_mount	*mp,
+	struct xfs_eofblocks	*eofb)
+{
+	trace_xfs_blockgc_free_space(mp, eofb, _RET_IP_);
+
+	return xfs_blockgc_scan(mp, eofb);
+}
+
 /*
  * Run cow/eofblocks scans on the supplied dquots.  We don't know exactly which
  * quota caused an allocation failure, so we make a best effort by including
@@ -1661,7 +1693,6 @@ xfs_blockgc_free_dquots(
 {
 	struct xfs_eofblocks	eofb = {0};
 	struct xfs_mount	*mp = NULL;
-	int			error;
 
 	*found_work = false;
 
@@ -1698,18 +1729,12 @@ xfs_blockgc_free_dquots(
 		*found_work = true;
 	}
 
-	if (*found_work) {
-		error = xfs_icache_free_eofblocks(mp, &eofb);
-		if (error)
-			return error;
-
-		error = xfs_icache_free_cowblocks(mp, &eofb);
-		if (error)
-			return error;
-	}
+	if (*found_work)
+		return xfs_blockgc_free_space(mp, &eofb);
 
 	return 0;
 }
+
 /*
  * Run cow/eofblocks scans on the quotas applicable to the inode. For inodes
  * with multiple quotas, we don't know exactly which quota caused an allocation
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 8d8e7cabc27f..56ae668dcdcf 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -59,6 +59,7 @@ int xfs_blockgc_free_dquots(struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
 		bool *found_work);
 int xfs_blockgc_free_quota(struct xfs_inode *ip, unsigned int eof_flags,
 		bool *found_work);
+int xfs_blockgc_free_space(struct xfs_mount *mp, struct xfs_eofblocks *eofb);
 
 void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
 void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 3f761f33099b..7ec9d4d703a6 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -3926,6 +3926,7 @@ DEFINE_EVENT(xfs_eofblocks_class, name,	\
 		 unsigned long caller_ip), \
 	TP_ARGS(mp, eofb, caller_ip))
 DEFINE_EOFBLOCKS_EVENT(xfs_ioc_free_eofblocks);
+DEFINE_EOFBLOCKS_EVENT(xfs_blockgc_free_space);
 
 #endif /* _TRACE_XFS_H */
 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 11/11] xfs: flush speculative space allocations when we run out of space
  2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
                   ` (9 preceding siblings ...)
  2021-01-18 22:12 ` [PATCH 10/11] xfs: refactor xfs_icache_free_{eof,cow}blocks call sites Darrick J. Wong
@ 2021-01-18 22:12 ` Darrick J. Wong
  10 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-18 22:12 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

If a fs modification (creation, file write, reflink, etc.) is unable to
reserve enough space to handle the modification, try clearing whatever
space the filesystem might have been hanging onto in the hopes of
speeding up the filesystem.  The flushing behavior will become
particularly important when we add deferred inode inactivation because
that will increase the amount of space that isn't actively tied to user
data.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_trans.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)


diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index e72730f85af1..2b92a4084bb8 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -20,6 +20,8 @@
 #include "xfs_trace.h"
 #include "xfs_error.h"
 #include "xfs_defer.h"
+#include "xfs_inode.h"
+#include "xfs_icache.h"
 
 kmem_zone_t	*xfs_trans_zone;
 
@@ -256,8 +258,10 @@ xfs_trans_alloc(
 	struct xfs_trans	**tpp)
 {
 	struct xfs_trans	*tp;
+	unsigned int		tries = 1;
 	int			error;
 
+retry:
 	/*
 	 * Allocate the handle before we do our freeze accounting and setting up
 	 * GFP_NOFS allocation context so that we avoid lockdep false positives
@@ -285,6 +289,22 @@ xfs_trans_alloc(
 	tp->t_firstblock = NULLFSBLOCK;
 
 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
+	if (error == -ENOSPC && tries > 0) {
+		xfs_trans_cancel(tp);
+
+		/*
+		 * We weren't able to reserve enough space for the transaction.
+		 * Flush the other speculative space allocations to free space.
+		 * Do not perform a synchronous scan because callers can hold
+		 * other locks.
+		 */
+		error = xfs_blockgc_free_space(mp, NULL);
+		if (error)
+			return error;
+
+		tries--;
+		goto retry;
+	}
 	if (error) {
 		xfs_trans_cancel(tp);
 		return error;


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks
  2021-01-18 22:12 ` [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks Darrick J. Wong
@ 2021-01-19  3:56   ` Darrick J. Wong
  2021-01-19  7:08   ` Christoph Hellwig
  1 sibling, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-19  3:56 UTC (permalink / raw)
  To: linux-xfs

On Mon, Jan 18, 2021 at 02:12:31PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If a fs modification (data write, reflink, xattr set, fallocate, etc.)
> is unable to reserve enough quota to handle the modification, try
> clearing whatever space the filesystem might have been hanging onto in
> the hopes of speeding up the filesystem.  The flushing behavior will
> become particularly important when we add deferred inode inactivation
> because that will increase the amount of space that isn't actively tied
> to user data.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/libxfs/xfs_attr.c |   10 +++++--
>  fs/xfs/libxfs/xfs_bmap.c |   10 +++++--
>  fs/xfs/xfs_bmap_util.c   |   22 ++++++++++------
>  fs/xfs/xfs_iomap.c       |   24 ++++++++++++-----
>  fs/xfs/xfs_quota.h       |   15 +++++++----
>  fs/xfs/xfs_reflink.c     |   31 +++++++++++++++-------
>  fs/xfs/xfs_trans_dquot.c |   64 ++++++++++++++++++++++++++++++++++++++--------
>  7 files changed, 129 insertions(+), 47 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index fd8e6418a0d3..49da931d0d19 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -395,6 +395,7 @@ xfs_attr_set(
>  	struct xfs_mount	*mp = dp->i_mount;
>  	struct xfs_trans_res	tres;
>  	bool			rsvd = (args->attr_filter & XFS_ATTR_ROOT);
> +	bool			quota_retry = false;
>  	int			error, local;
>  	unsigned int		total;
>  
> @@ -453,6 +454,7 @@ xfs_attr_set(
>  	 * Root fork attributes can use reserved data blocks for this
>  	 * operation if necessary
>  	 */
> +retry:
>  	error = xfs_trans_alloc(mp, &tres, total, 0,
>  			rsvd ? XFS_TRANS_RESERVE : 0, &args->trans);
>  	if (error)
> @@ -465,10 +467,12 @@ xfs_attr_set(
>  
>  		if (rsvd)
>  			quota_flags |= XFS_QMOPT_FORCE_RES;
> -		error = xfs_trans_reserve_quota_nblks(args->trans, dp,
> -				args->total, 0, quota_flags);
> +		error = xfs_trans_reserve_quota_nblks(&args->trans, dp,
> +				args->total, 0, quota_flags, &quota_retry);
>  		if (error)
> -			goto out_trans_cancel;
> +			return error;
> +		if (quota_retry)
> +			goto retry;
>  
>  		error = xfs_has_attr(args);
>  		if (error == -EEXIST && (args->attr_flags & XATTR_CREATE))
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index fa6b442eb75f..1bca18a7381b 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -1070,6 +1070,7 @@ xfs_bmap_add_attrfork(
>  	int			blks;		/* space reservation */
>  	int			version = 1;	/* superblock attr version */
>  	int			logflags;	/* logging flags */
> +	bool			quota_retry = false;
>  	int			error;		/* error return value */
>  
>  	ASSERT(XFS_IFORK_Q(ip) == 0);
> @@ -1079,17 +1080,20 @@ xfs_bmap_add_attrfork(
>  
>  	blks = XFS_ADDAFORK_SPACE_RES(mp);
>  
> +retry:
>  	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_addafork, blks, 0,
>  			rsvd ? XFS_TRANS_RESERVE : 0, &tp);
>  	if (error)
>  		return error;
>  
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
> -	error = xfs_trans_reserve_quota_nblks(tp, ip, blks, 0, rsvd ?
> +	error = xfs_trans_reserve_quota_nblks(&tp, ip, blks, 0, rsvd ?
>  			XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_FORCE_RES :
> -			XFS_QMOPT_RES_REGBLKS);
> +			XFS_QMOPT_RES_REGBLKS, &quota_retry);
>  	if (error)
> -		goto trans_cancel;
> +		return error;
> +	if (quota_retry)
> +		goto retry;
>  	if (XFS_IFORK_Q(ip))
>  		goto trans_cancel;
>  
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index 671cb104861e..d12c9e9e5c7b 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -761,6 +761,7 @@ xfs_alloc_file_space(
>  	 */
>  	while (allocatesize_fsb && !error) {
>  		xfs_fileoff_t	s, e;
> +		bool		quota_retry = false;
>  
>  		/*
>  		 * Determine space reservations for data/realtime.
> @@ -803,6 +804,7 @@ xfs_alloc_file_space(
>  		/*
>  		 * Allocate and setup the transaction.
>  		 */
> +retry:
>  		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks,
>  				resrtextents, 0, &tp);
>  
> @@ -817,10 +819,12 @@ xfs_alloc_file_space(
>  			break;
>  		}
>  		xfs_ilock(ip, XFS_ILOCK_EXCL);
> -		error = xfs_trans_reserve_quota_nblks(tp, ip, qblocks,
> -						      0, quota_flag);
> +		error = xfs_trans_reserve_quota_nblks(&tp, ip, qblocks, 0,
> +				quota_flag, &quota_retry);
>  		if (error)
> -			goto error1;
> +			return error;
> +		if (quota_retry)
> +			goto retry;
>  
>  		xfs_trans_ijoin(tp, ip, 0);
>  
> @@ -853,8 +857,6 @@ xfs_alloc_file_space(
>  
>  error0:	/* unlock inode, unreserve quota blocks, cancel trans */
>  	xfs_trans_unreserve_quota_nblks(tp, ip, (long)qblocks, 0, quota_flag);
> -
> -error1:	/* Just cancel transaction */
>  	xfs_trans_cancel(tp);
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return error;
> @@ -870,8 +872,10 @@ xfs_unmap_extent(
>  	struct xfs_mount	*mp = ip->i_mount;
>  	struct xfs_trans	*tp;
>  	uint			resblks = XFS_DIOSTRAT_SPACE_RES(mp, 0);
> +	bool			quota_retry = false;
>  	int			error;
>  
> +retry:
>  	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
>  	if (error) {
>  		ASSERT(error == -ENOSPC || XFS_FORCED_SHUTDOWN(mp));
> @@ -879,10 +883,12 @@ xfs_unmap_extent(
>  	}
>  
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
> -	error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0,
> -			XFS_QMOPT_RES_REGBLKS);
> +	error = xfs_trans_reserve_quota_nblks(&tp, ip, resblks, 0,
> +			XFS_QMOPT_RES_REGBLKS, &quota_retry);
>  	if (error)
> -		goto out_trans_cancel;
> +		return error;
> +	if (quota_retry)
> +		goto retry;
>  
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 7b9ff824e82d..d9583f1de2fc 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -27,7 +27,7 @@
>  #include "xfs_dquot_item.h"
>  #include "xfs_dquot.h"
>  #include "xfs_reflink.h"
> -
> +#include "xfs_icache.h"
>  
>  #define XFS_ALLOC_ALIGN(mp, off) \
>  	(((off) >> mp->m_allocsize_log) << mp->m_allocsize_log)
> @@ -197,6 +197,7 @@ xfs_iomap_write_direct(
>  	int			quota_flag;
>  	uint			qblocks, resblks;
>  	unsigned int		resrtextents = 0;
> +	bool			quota_retry = false;
>  	int			error;
>  	int			bmapi_flags = XFS_BMAPI_PREALLOC;
>  	uint			tflags = 0;
> @@ -239,6 +240,7 @@ xfs_iomap_write_direct(
>  			resblks = XFS_DIOSTRAT_SPACE_RES(mp, 0) << 1;
>  		}
>  	}
> +retry:
>  	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, resrtextents,
>  			tflags, &tp);
>  	if (error)
> @@ -246,9 +248,12 @@ xfs_iomap_write_direct(
>  
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
>  
> -	error = xfs_trans_reserve_quota_nblks(tp, ip, qblocks, 0, quota_flag);
> +	error = xfs_trans_reserve_quota_nblks(&tp, ip, qblocks, 0, quota_flag,
> +			&quota_retry);
>  	if (error)
> -		goto out_trans_cancel;
> +		return error;
> +	if (quota_retry)
> +		goto retry;
>  
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> @@ -286,7 +291,6 @@ xfs_iomap_write_direct(
>  
>  out_res_cancel:
>  	xfs_trans_unreserve_quota_nblks(tp, ip, (long)qblocks, 0, quota_flag);
> -out_trans_cancel:
>  	xfs_trans_cancel(tp);
>  	goto out_unlock;
>  }
> @@ -539,6 +543,8 @@ xfs_iomap_write_unwritten(
>  		return error;
>  
>  	do {
> +		bool	quota_retry = false;
> +
>  		/*
>  		 * Set up a transaction to convert the range of extents
>  		 * from unwritten to real. Do allocations in a loop until
> @@ -548,6 +554,7 @@ xfs_iomap_write_unwritten(
>  		 * here as we might be asked to write out the same inode that we
>  		 * complete here and might deadlock on the iolock.
>  		 */
> +retry:
>  		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0,
>  				XFS_TRANS_RESERVE, &tp);
>  		if (error)
> @@ -556,10 +563,13 @@ xfs_iomap_write_unwritten(
>  		xfs_ilock(ip, XFS_ILOCK_EXCL);
>  		xfs_trans_ijoin(tp, ip, 0);
>  
> -		error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0,
> -				XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_FORCE_RES);
> +		error = xfs_trans_reserve_quota_nblks(&tp, ip, resblks, 0,
> +				XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_FORCE_RES,
> +				&quota_retry);
>  		if (error)
> -			goto error_on_bmapi_transaction;
> +			return error;
> +		if (quota_retry)
> +			goto retry;
>  
>  		/*
>  		 * Modify the unwritten extent state of the buffer.
> diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
> index 1c37b5be60c3..996046eb1492 100644
> --- a/fs/xfs/xfs_quota.h
> +++ b/fs/xfs/xfs_quota.h
> @@ -81,8 +81,9 @@ extern void xfs_trans_mod_dquot_byino(struct xfs_trans *, struct xfs_inode *,
>  		uint, int64_t);
>  extern void xfs_trans_apply_dquot_deltas(struct xfs_trans *);
>  extern void xfs_trans_unreserve_and_mod_dquots(struct xfs_trans *);
> -extern int xfs_trans_reserve_quota_nblks(struct xfs_trans *,
> -		struct xfs_inode *, int64_t, long, uint);
> +int xfs_trans_reserve_quota_nblks(struct xfs_trans **tpp, struct xfs_inode *ip,
> +		int64_t nblocks, long ninos, unsigned int flags,
> +		bool *retry);
>  extern int xfs_trans_reserve_quota_bydquots(struct xfs_trans *,
>  		struct xfs_mount *, struct xfs_dquot *,
>  		struct xfs_dquot *, struct xfs_dquot *, int64_t, long, uint);
> @@ -128,7 +129,8 @@ xfs_qm_vop_dqalloc(struct xfs_inode *ip, kuid_t kuid, kgid_t kgid,
>  #define xfs_trans_apply_dquot_deltas(tp)
>  #define xfs_trans_unreserve_and_mod_dquots(tp)
>  static inline int xfs_trans_reserve_quota_nblks(struct xfs_trans *tp,

Oops, forgot to change this to 'struct xfs_trans **tpp'...

--D

> -		struct xfs_inode *ip, int64_t nblks, long ninos, uint flags)
> +		struct xfs_inode *ip, int64_t nblks, long ninos,
> +		unsigned int flags, bool *retry)
>  {
>  	return 0;
>  }
> @@ -170,14 +172,17 @@ static inline int
>  xfs_trans_unreserve_quota_nblks(struct xfs_trans *tp, struct xfs_inode *ip,
>  		int64_t nblks, long ninos, unsigned int flags)
>  {
> -	return xfs_trans_reserve_quota_nblks(tp, ip, -nblks, -ninos, flags);
> +	return xfs_trans_reserve_quota_nblks(&tp, ip, -nblks, -ninos, flags,
> +			NULL);
>  }
>  
>  static inline int
>  xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t nblks,
>  		unsigned int flags)
>  {
> -	return xfs_trans_reserve_quota_nblks(NULL, ip, nblks, 0, flags);
> +	struct xfs_trans	*tp = NULL;
> +
> +	return xfs_trans_reserve_quota_nblks(&tp, ip, nblks, 0, flags, NULL);
>  }
>  
>  static inline int
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index 1f3270ffaea5..8d768c3c4b25 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -351,13 +351,14 @@ xfs_reflink_allocate_cow(
>  	bool			convert_now)
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
> +	struct xfs_trans	*tp;
>  	xfs_fileoff_t		offset_fsb = imap->br_startoff;
>  	xfs_filblks_t		count_fsb = imap->br_blockcount;
> -	struct xfs_trans	*tp;
> -	int			nimaps, error = 0;
> -	bool			found;
>  	xfs_filblks_t		resaligned;
>  	xfs_extlen_t		resblks = 0;
> +	bool			found;
> +	bool			quota_retry = false;
> +	int			nimaps, error = 0;
>  
>  	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
>  	if (!ip->i_cowfp) {
> @@ -376,6 +377,7 @@ xfs_reflink_allocate_cow(
>  	resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned);
>  
>  	xfs_iunlock(ip, *lockmode);
> +retry:
>  	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
>  	*lockmode = XFS_ILOCK_EXCL;
>  	xfs_ilock(ip, *lockmode);
> @@ -398,10 +400,15 @@ xfs_reflink_allocate_cow(
>  		goto convert;
>  	}
>  
> -	error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0,
> -			XFS_QMOPT_RES_REGBLKS);
> -	if (error)
> -		goto out_trans_cancel;
> +	error = xfs_trans_reserve_quota_nblks(&tp, ip, resblks, 0,
> +			XFS_QMOPT_RES_REGBLKS, &quota_retry);
> +	if (error) {
> +		/* This function must return with the ILOCK held. */
> +		xfs_ilock(ip, *lockmode);
> +		return error;
> +	}
> +	if (quota_retry)
> +		goto retry;
>  
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> @@ -1000,9 +1007,11 @@ xfs_reflink_remap_extent(
>  	unsigned int		resblks;
>  	bool			smap_real;
>  	bool			dmap_written = xfs_bmap_is_written_extent(dmap);
> +	bool			quota_retry = false;
>  	int			nimaps;
>  	int			error;
>  
> +retry:
>  	/* Start a rolling transaction to switch the mappings */
>  	resblks = XFS_EXTENTADD_SPACE_RES(mp, XFS_DATA_FORK);
>  	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
> @@ -1087,10 +1096,12 @@ xfs_reflink_remap_extent(
>  	if (!smap_real && dmap_written)
>  		qres += dmap->br_blockcount;
>  	if (qres > 0) {
> -		error = xfs_trans_reserve_quota_nblks(tp, ip, qres, 0,
> -				XFS_QMOPT_RES_REGBLKS);
> +		error = xfs_trans_reserve_quota_nblks(&tp, ip, qres, 0,
> +				XFS_QMOPT_RES_REGBLKS, &quota_retry);
>  		if (error)
> -			goto out_cancel;
> +			goto out;
> +		if (quota_retry)
> +			goto retry;
>  	}
>  
>  	if (smap_real) {
> diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
> index 3315498a6fa1..c6abe1f1106c 100644
> --- a/fs/xfs/xfs_trans_dquot.c
> +++ b/fs/xfs/xfs_trans_dquot.c
> @@ -16,6 +16,7 @@
>  #include "xfs_quota.h"
>  #include "xfs_qm.h"
>  #include "xfs_trace.h"
> +#include "xfs_icache.h"
>  
>  STATIC void	xfs_trans_alloc_dqinfo(xfs_trans_t *);
>  
> @@ -770,21 +771,40 @@ xfs_trans_reserve_quota_bydquots(
>  	return error;
>  }
>  
> -
>  /*
> - * Lock the dquot and change the reservation if we can.
> - * This doesn't change the actual usage, just the reservation.
> - * The inode sent in is locked.
> + * Lock the dquot and change the reservation if we can.  This doesn't change
> + * the actual usage, just the reservation.  The caller must hold ILOCK_EXCL on
> + * the inode.  If @retry is not a NULL pointer, the caller must ensure that
> + * *retry is set to false before the first time this function is called.
> + *
> + * If the quota reservation fails because we hit a quota limit (and retry is
> + * not a NULL pointer, and *retry is true), this function will try to invoke
> + * the speculative preallocation gc scanner to reduce quota usage.  In order to
> + * do that, we cancel the transaction, NULL out tpp, and drop the ILOCK.  If we
> + * actually do some freeing work, *retry will be set to true.
> + *
> + * This function ends one of four ways:
> + *  a) if retry is NULL, returns the result of trying to change the quota
> + *     reservation, with *tpp still set and the inode still locked;
> + *  b) returns an error code with *tpp cancelled and set to NULL, and the inode
> + *     unlocked;
> + *  c) returns zero with *tpp cancelled and set to NULL, the inode unlocked,
> + *     and *retry is true, which means we cleared space and the caller should
> + *     try again with a new transaction;
> + *  d) returns zero with *tpp still set, the inode still locked, and *retry
> + *     is false, which means the reservation succeeded.
>   */
>  int
>  xfs_trans_reserve_quota_nblks(
> -	struct xfs_trans	*tp,
> +	struct xfs_trans	**tpp,
>  	struct xfs_inode	*ip,
>  	int64_t			nblks,
>  	long			ninos,
> -	uint			flags)
> +	unsigned int		flags,
> +	bool			*retry)
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
> +	int			error, err2;
>  
>  	if (!XFS_IS_QUOTA_RUNNING(mp) || !XFS_IS_QUOTA_ON(mp))
>  		return 0;
> @@ -795,13 +815,35 @@ xfs_trans_reserve_quota_nblks(
>  	ASSERT((flags & ~(XFS_QMOPT_FORCE_RES)) == XFS_TRANS_DQ_RES_RTBLKS ||
>  	       (flags & ~(XFS_QMOPT_FORCE_RES)) == XFS_TRANS_DQ_RES_BLKS);
>  
> +	/* Reserve nblks against these dquots, with trans as the mediator. */
> +	error = xfs_trans_reserve_quota_bydquots(*tpp, mp, ip->i_udquot,
> +			ip->i_gdquot, ip->i_pdquot, nblks, ninos, flags);
> +
> +	/* Exit immediately if the caller did not want retries. */
> +	if (retry == NULL)
> +		return error;
> +
>  	/*
> -	 * Reserve nblks against these dquots, with trans as the mediator.
> +	 * If the caller /can/ handle retries, we always cancel the transaction
> +	 * on error, even if we aren't going to attempt a gc scan.
>  	 */
> -	return xfs_trans_reserve_quota_bydquots(tp, mp,
> -						ip->i_udquot, ip->i_gdquot,
> -						ip->i_pdquot,
> -						nblks, ninos, flags);
> +	if (error) {
> +		xfs_trans_cancel(*tpp);
> +		*tpp = NULL;
> +		xfs_iunlock(ip, XFS_ILOCK_EXCL);
> +	}
> +
> +	/* We only allow one retry for EDQUOT/ENOSPC. */
> +	if (*retry || (error != -EDQUOT && error != -ENOSPC)) {
> +		*retry = false;
> +		return error;
> +	}
> +
> +	/* Try to free some quota for this file's dquots. */
> +	err2 = xfs_blockgc_free_quota(ip, 0, retry);
> +	if (err2)
> +		return err2;
> +	return *retry ? 0 : error;
>  }
>  
>  /* Change the quota reservations for an inode creation activity. */
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks
  2021-01-18 22:12 ` [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks Darrick J. Wong
@ 2021-01-19  6:49   ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-01-19  6:49 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Jan 18, 2021 at 02:12:09PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Don't stall the cowblocks scan on a locked inode if we possibly can.
> We'd much rather the background scanner keep moving.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 04/11] xfs: move and rename xfs_inode_free_quota_blocks to avoid conflicts
  2021-01-18 22:12 ` [PATCH 04/11] xfs: move and rename xfs_inode_free_quota_blocks to avoid conflicts Darrick J. Wong
@ 2021-01-19  6:53   ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-01-19  6:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Mon, Jan 18, 2021 at 02:12:20PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Move this function further down in the file so that later cleanups won't
> have to declare static functions.  Change the name because we're about
> to rework all the code that performs garbage collection of speculatively
> allocated file blocks.  No functional changes.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 05/11] xfs: pass flags and return gc errors from xfs_blockgc_free_quota
  2021-01-18 22:12 ` [PATCH 05/11] xfs: pass flags and return gc errors from xfs_blockgc_free_quota Darrick J. Wong
@ 2021-01-19  6:59   ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-01-19  6:59 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks
  2021-01-18 22:12 ` [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks Darrick J. Wong
  2021-01-19  3:56   ` Darrick J. Wong
@ 2021-01-19  7:08   ` Christoph Hellwig
  2021-01-19 19:43     ` Darrick J. Wong
  1 sibling, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2021-01-19  7:08 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

> @@ -351,13 +351,14 @@ xfs_reflink_allocate_cow(
>  	bool			convert_now)
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
> +	struct xfs_trans	*tp;
>  	xfs_fileoff_t		offset_fsb = imap->br_startoff;
>  	xfs_filblks_t		count_fsb = imap->br_blockcount;
> -	struct xfs_trans	*tp;
> -	int			nimaps, error = 0;
> -	bool			found;
>  	xfs_filblks_t		resaligned;
>  	xfs_extlen_t		resblks = 0;
> +	bool			found;
> +	bool			quota_retry = false;
> +	int			nimaps, error = 0;

Any good reason for reshuffling the declarations here?

> +	if (error) {
> +		/* This function must return with the ILOCK held. */
> +		xfs_ilock(ip, *lockmode);
> +		return error;
> +	}

Ugg.

> +	if (error) {
> +		xfs_trans_cancel(*tpp);
> +		*tpp = NULL;
> +		xfs_iunlock(ip, XFS_ILOCK_EXCL);
> +	}
> +
> +	/* We only allow one retry for EDQUOT/ENOSPC. */
> +	if (*retry || (error != -EDQUOT && error != -ENOSPC)) {
> +		*retry = false;
> +		return error;
> +	}

Id really don't like the semantics where this wrapper unlocks the
ilock.  Keeping all the locking at one layer, which is the callers
makes the code much easier to reason about

> +
> +	/* Try to free some quota for this file's dquots. */
> +	err2 = xfs_blockgc_free_quota(ip, 0, retry);
> +	if (err2)
> +		return err2;
> +	return *retry ? 0 : error;
>  }

Why not have a should_retry helper for the callers and let them call
xfs_blockgc_free_quota?  That is a little more boilerplate code, but
a lot less obsfucated.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 09/11] xfs: add a tracepoint for blockgc scans
  2021-01-18 22:12 ` [PATCH 09/11] xfs: add a tracepoint for blockgc scans Darrick J. Wong
@ 2021-01-19  7:11   ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2021-01-19  7:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks
  2021-01-19  7:08   ` Christoph Hellwig
@ 2021-01-19 19:43     ` Darrick J. Wong
  0 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-19 19:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Tue, Jan 19, 2021 at 08:08:45AM +0100, Christoph Hellwig wrote:
> > @@ -351,13 +351,14 @@ xfs_reflink_allocate_cow(
> >  	bool			convert_now)
> >  {
> >  	struct xfs_mount	*mp = ip->i_mount;
> > +	struct xfs_trans	*tp;
> >  	xfs_fileoff_t		offset_fsb = imap->br_startoff;
> >  	xfs_filblks_t		count_fsb = imap->br_blockcount;
> > -	struct xfs_trans	*tp;
> > -	int			nimaps, error = 0;
> > -	bool			found;
> >  	xfs_filblks_t		resaligned;
> >  	xfs_extlen_t		resblks = 0;
> > +	bool			found;
> > +	bool			quota_retry = false;
> > +	int			nimaps, error = 0;
> 
> Any good reason for reshuffling the declarations here?
> 
> > +	if (error) {
> > +		/* This function must return with the ILOCK held. */
> > +		xfs_ilock(ip, *lockmode);
> > +		return error;
> > +	}
> 
> Ugg.

Yeah.  I can't think of a good (as in non-brain-straining) way to fix
this unusual locking -- this function can be called with the ILOCK held,
and it's possible that we then find what we are looking for due to a
speculative preallocation and can exit without cycling the lock.

I think what we really want is for xfs_direct_write_iomap_begin to call
xfs_find_trim_cow_extent and xfs_reflink_convert_cow_locked directly,
and if the first call doesn't find a cow staging extent then drop the
ILOCK, call xfs_reflink_allocate_cow, and re-take the ILOCK after.

> > +	if (error) {
> > +		xfs_trans_cancel(*tpp);
> > +		*tpp = NULL;
> > +		xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > +	}
> > +
> > +	/* We only allow one retry for EDQUOT/ENOSPC. */
> > +	if (*retry || (error != -EDQUOT && error != -ENOSPC)) {
> > +		*retry = false;
> > +		return error;
> > +	}
> 
> Id really don't like the semantics where this wrapper unlocks the
> ilock.  Keeping all the locking at one layer, which is the callers
> makes the code much easier to reason about
> 
> > +
> > +	/* Try to free some quota for this file's dquots. */
> > +	err2 = xfs_blockgc_free_quota(ip, 0, retry);
> > +	if (err2)
> > +		return err2;
> > +	return *retry ? 0 : error;
> >  }
> 
> Why not have a should_retry helper for the callers and let them call
> xfs_blockgc_free_quota?  That is a little more boilerplate code, but
> a lot less obsfucated.

The previous version of this patchset did that, but Dave complained that
spread the retry calls and error/retry branching all over the code base
and asked for the structure that's in this version:

https://lore.kernel.org/linux-xfs/161040735389.1582114.15084485390769234805.stgit@magnolia/T/#mfcd6786f99791adf771697416fc51d168d3050f8

--D

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 07/11] xfs: flush eof/cowblocks if we can't reserve quota for inode creation
  2021-01-28  6:02 [PATCHSET v5 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
@ 2021-01-28  6:03 ` Darrick J. Wong
  0 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-28  6:03 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, hch, david, bfoster

From: Darrick J. Wong <djwong@kernel.org>

If an inode creation is unable to reserve enough quota to handle the
modification, try clearing whatever space the filesystem might have been
hanging onto in the hopes of speeding up the filesystem.  The flushing
behavior will become particularly important when we add deferred inode
inactivation because that will increase the amount of space that isn't
actively tied to user data.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_icache.c      |   73 +++++++++++++++++++++++++++-------------------
 fs/xfs/xfs_icache.h      |    2 +
 fs/xfs/xfs_quota.h       |   18 ++++++++++-
 fs/xfs/xfs_trans.c       |    9 +++++-
 fs/xfs/xfs_trans_dquot.c |   36 +++++++++++++++++++++--
 5 files changed, 102 insertions(+), 36 deletions(-)


diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 4a074aa12b52..cd369dd48818 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1646,64 +1646,77 @@ xfs_start_block_reaping(
 }
 
 /*
- * Run cow/eofblocks scans on the quotas applicable to the inode. For inodes
- * with multiple quotas, we don't know exactly which quota caused an allocation
- * failure. We make a best effort by including each quota under low free space
- * conditions (less than 1% free space) in the scan.
+ * Run cow/eofblocks scans on the supplied dquots.  We don't know exactly which
+ * quota caused an allocation failure, so we make a best effort by including
+ * each quota under low free space conditions (less than 1% free space) in the
+ * scan.
  *
  * Callers must not hold any inode's ILOCK.  If requesting a synchronous scan
  * (XFS_EOF_FLAGS_SYNC), the caller also must not hold any inode's IOLOCK or
  * MMAPLOCK.
  */
 int
-xfs_blockgc_free_quota(
-	struct xfs_inode	*ip,
+xfs_blockgc_free_dquots(
+	struct xfs_dquot	*udqp,
+	struct xfs_dquot	*gdqp,
+	struct xfs_dquot	*pdqp,
 	unsigned int		eof_flags)
 {
 	struct xfs_eofblocks	eofb = {0};
-	struct xfs_dquot	*dq;
+	struct xfs_mount	*mp = NULL;
 	bool			do_work = false;
 	int			error;
 
+	if (!udqp && !gdqp && !pdqp)
+		return 0;
+	if (udqp)
+		mp = udqp->q_mount;
+	if (!mp && gdqp)
+		mp = gdqp->q_mount;
+	if (!mp && pdqp)
+		mp = pdqp->q_mount;
+
 	/*
 	 * Run a scan to free blocks using the union filter to cover all
 	 * applicable quotas in a single scan.
 	 */
 	eofb.eof_flags = XFS_EOF_FLAGS_UNION | eof_flags;
 
-	if (XFS_IS_UQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_USER);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_uid = VFS_I(ip)->i_uid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_UID;
-			do_work = true;
-		}
+	if (XFS_IS_UQUOTA_ENFORCED(mp) && udqp && xfs_dquot_lowsp(udqp)) {
+		eofb.eof_uid = make_kuid(mp->m_super->s_user_ns, udqp->q_id);
+		eofb.eof_flags |= XFS_EOF_FLAGS_UID;
+		do_work = true;
 	}
 
-	if (XFS_IS_GQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_GROUP);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_gid = VFS_I(ip)->i_gid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_GID;
-			do_work = true;
-		}
+	if (XFS_IS_UQUOTA_ENFORCED(mp) && gdqp && xfs_dquot_lowsp(gdqp)) {
+		eofb.eof_gid = make_kgid(mp->m_super->s_user_ns, gdqp->q_id);
+		eofb.eof_flags |= XFS_EOF_FLAGS_GID;
+		do_work = true;
 	}
 
-	if (XFS_IS_PQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_PROJ);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_prid = ip->i_d.di_projid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
-			do_work = true;
-		}
+	if (XFS_IS_PQUOTA_ENFORCED(mp) && pdqp && xfs_dquot_lowsp(pdqp)) {
+		eofb.eof_prid = pdqp->q_id;
+		eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
+		do_work = true;
 	}
 
 	if (!do_work)
 		return 0;
 
-	error = xfs_icache_free_eofblocks(ip->i_mount, &eofb);
+	error = xfs_icache_free_eofblocks(mp, &eofb);
 	if (error)
 		return error;
 
-	return xfs_icache_free_cowblocks(ip->i_mount, &eofb);
+	return xfs_icache_free_cowblocks(mp, &eofb);
+}
+
+/* Run cow/eofblocks scans on the quotas attached to the inode. */
+int
+xfs_blockgc_free_quota(
+	struct xfs_inode	*ip,
+	unsigned int		eof_flags)
+{
+	return xfs_blockgc_free_dquots(xfs_inode_dquot(ip, XFS_DQTYPE_USER),
+			xfs_inode_dquot(ip, XFS_DQTYPE_GROUP),
+			xfs_inode_dquot(ip, XFS_DQTYPE_PROJ), eof_flags);
 }
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index d64ea8f5c589..5f520de637f6 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -54,6 +54,8 @@ long xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 
+int xfs_blockgc_free_dquots(struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
+		struct xfs_dquot *pdqp, unsigned int eof_flags);
 int xfs_blockgc_free_quota(struct xfs_inode *ip, unsigned int eof_flags);
 
 void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
index dcdc3cd0361a..7d87a1244256 100644
--- a/fs/xfs/xfs_quota.h
+++ b/fs/xfs/xfs_quota.h
@@ -90,7 +90,10 @@ extern int xfs_trans_reserve_quota_bydquots(struct xfs_trans *,
 		struct xfs_dquot *, struct xfs_dquot *, int64_t, long, uint);
 int xfs_trans_reserve_quota_icreate(struct xfs_trans *tp,
 		struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
-		struct xfs_dquot *pdqp, int64_t dblocks);
+		struct xfs_dquot *pdqp, int64_t dblocks, unsigned int *retry);
+void xfs_trans_cancel_qretry_icreate(struct xfs_trans *tp,
+		struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
+		struct xfs_dquot *pdqp);
 
 extern int xfs_qm_vop_dqalloc(struct xfs_inode *, kuid_t, kgid_t,
 		prid_t, uint, struct xfs_dquot **, struct xfs_dquot **,
@@ -157,7 +160,8 @@ xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t dblocks)
 
 static inline int
 xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_dquot *udqp,
-		struct xfs_dquot *gdqp, struct xfs_dquot *pdqp, int64_t dblocks)
+		struct xfs_dquot *gdqp, struct xfs_dquot *pdqp, int64_t dblocks,
+		unsigned int *retry)
 {
 	return 0;
 }
@@ -168,6 +172,16 @@ xfs_trans_cancel_qretry_nblks(struct xfs_trans *tp, struct xfs_inode *ip)
 	ASSERT(0);
 }
 
+static inline void
+xfs_trans_cancel_qretry_icreate(
+	struct xfs_trans	*tp,
+	struct xfs_dquot	*udqp,
+	struct xfs_dquot	*gdqp,
+	struct xfs_dquot	*pdqp)
+{
+	ASSERT(0);
+}
+
 #define xfs_qm_vop_create_dqattach(tp, ip, u, g, p)
 #define xfs_qm_vop_rename_dqattach(it)					(0)
 #define xfs_qm_vop_chown(tp, ip, old, new)				(NULL)
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index 3d1f745b721a..7d4823cbd7c0 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -1099,17 +1099,24 @@ xfs_trans_alloc_icreate(
 	struct xfs_trans	**tpp)
 {
 	struct xfs_trans	*tp;
+	unsigned int		qretry = 0;
 	int			error;
 
+retry:
 	error = xfs_trans_alloc(mp, resv, dblocks, 0, 0, &tp);
 	if (error)
 		return error;
 
-	error = xfs_trans_reserve_quota_icreate(tp, udqp, gdqp, pdqp, dblocks);
+	error = xfs_trans_reserve_quota_icreate(tp, udqp, gdqp, pdqp, dblocks,
+			&qretry);
 	if (error) {
 		xfs_trans_cancel(tp);
 		return error;
 	}
+	if (qretry) {
+		xfs_trans_cancel_qretry_icreate(tp, udqp, gdqp, pdqp);
+		goto retry;
+	}
 
 	*tpp = tp;
 	return 0;
diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
index 229e2a20889f..2358c9e35be8 100644
--- a/fs/xfs/xfs_trans_dquot.c
+++ b/fs/xfs/xfs_trans_dquot.c
@@ -893,22 +893,52 @@ xfs_trans_cancel_qretry_nblks(
 	xfs_blockgc_free_quota(ip, 0);
 }
 
-/* Change the quota reservations for an inode creation activity. */
+/*
+ * Change the quota reservations for an inode creation activity.  This doesn't
+ * change the actual usage, just the reservation.  If @retry is not a NULL
+ * pointer, the caller must ensure that *retry is set to zero before the first
+ * time this function is called.
+ *
+ * If the quota reservation fails because we hit a quota limit (and retry is
+ * not a NULL pointer, and *retry is zero), this function will set *retry to
+ * nonzero and return zero.
+ */
 int
 xfs_trans_reserve_quota_icreate(
 	struct xfs_trans	*tp,
 	struct xfs_dquot	*udqp,
 	struct xfs_dquot	*gdqp,
 	struct xfs_dquot	*pdqp,
-	int64_t			dblocks)
+	int64_t			dblocks,
+	unsigned int		*retry)
 {
 	struct xfs_mount	*mp = tp->t_mountp;
+	int			error;
 
 	if (!XFS_IS_QUOTA_RUNNING(mp) || !XFS_IS_QUOTA_ON(mp))
 		return 0;
 
-	return xfs_trans_reserve_quota_bydquots(tp, mp, udqp, gdqp, pdqp,
+	error = xfs_trans_reserve_quota_bydquots(tp, mp, udqp, gdqp, pdqp,
 			dblocks, 1, XFS_QMOPT_RES_REGBLKS);
+	if (!reservation_success(XFS_QMOPT_RES_REGBLKS, retry, &error))
+		return error;
+
+	return 0;
+}
+
+/*
+ * Cancel a transaction and try to clear some space so that we can reserve some
+ * quota.  When this function returns, the transaction will be cancelled.
+ */
+void
+xfs_trans_cancel_qretry_icreate(
+	struct xfs_trans	*tp,
+	struct xfs_dquot	*udqp,
+	struct xfs_dquot	*gdqp,
+	struct xfs_dquot	*pdqp)
+{
+	xfs_trans_cancel(tp);
+	xfs_blockgc_free_dquots(udqp, gdqp, pdqp, 0);
 }
 
 /*


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 07/11] xfs: flush eof/cowblocks if we can't reserve quota for inode creation
  2021-01-23 18:51 [PATCHSET v4 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
@ 2021-01-23 18:52 ` Darrick J. Wong
  0 siblings, 0 replies; 21+ messages in thread
From: Darrick J. Wong @ 2021-01-23 18:52 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs, hch, david

From: Darrick J. Wong <djwong@kernel.org>

If an inode creation is unable to reserve enough quota to handle the
modification, try clearing whatever space the filesystem might have been
hanging onto in the hopes of speeding up the filesystem.  The flushing
behavior will become particularly important when we add deferred inode
inactivation because that will increase the amount of space that isn't
actively tied to user data.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_icache.c      |   78 ++++++++++++++++++++++++++++------------------
 fs/xfs/xfs_icache.h      |    2 +
 fs/xfs/xfs_inode.c       |   17 ++++++++--
 fs/xfs/xfs_quota.h       |   13 ++++----
 fs/xfs/xfs_symlink.c     |    8 ++++-
 fs/xfs/xfs_trans_dquot.c |   52 ++++++++++++++++++++++++++++---
 6 files changed, 124 insertions(+), 46 deletions(-)


diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 68b6f72593dc..7f999f9dd80a 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1646,60 +1646,78 @@ xfs_start_block_reaping(
 }
 
 /*
- * Run cow/eofblocks scans on the quotas applicable to the inode. For inodes
- * with multiple quotas, we don't know exactly which quota caused an allocation
- * failure. We make a best effort by including each quota under low free space
- * conditions (less than 1% free space) in the scan.
+ * Run cow/eofblocks scans on the supplied dquots.  We don't know exactly which
+ * quota caused an allocation failure, so we make a best effort by including
+ * each quota under low free space conditions (less than 1% free space) in the
+ * scan.
  */
 int
-xfs_blockgc_free_quota(
-	struct xfs_inode	*ip,
+xfs_blockgc_free_dquots(
+	struct xfs_dquot	*udqp,
+	struct xfs_dquot	*gdqp,
+	struct xfs_dquot	*pdqp,
 	unsigned int		eof_flags)
 {
 	struct xfs_eofblocks	eofb = {0};
-	struct xfs_dquot	*dq;
+	struct xfs_mount	*mp = NULL;
 	bool			do_work = false;
 	int			error;
 
+	if (!udqp && !gdqp && !pdqp)
+		return 0;
+	if (udqp)
+		mp = udqp->q_mount;
+	if (!mp && gdqp)
+		mp = gdqp->q_mount;
+	if (!mp && pdqp)
+		mp = pdqp->q_mount;
+
 	/*
 	 * Run a scan to increase effectiveness and use the union filter to
 	 * cover all applicable quotas in a single scan.
 	 */
 	eofb.eof_flags = XFS_EOF_FLAGS_UNION | eof_flags;
 
-	if (XFS_IS_UQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_USER);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_uid = VFS_I(ip)->i_uid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_UID;
-			do_work = true;
-		}
+	if (XFS_IS_UQUOTA_ENFORCED(mp) && udqp && xfs_dquot_lowsp(udqp)) {
+		eofb.eof_uid = make_kuid(mp->m_super->s_user_ns, udqp->q_id);
+		eofb.eof_flags |= XFS_EOF_FLAGS_UID;
+		do_work = true;
 	}
 
-	if (XFS_IS_GQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_GROUP);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_gid = VFS_I(ip)->i_gid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_GID;
-			do_work = true;
-		}
+	if (XFS_IS_UQUOTA_ENFORCED(mp) && gdqp && xfs_dquot_lowsp(gdqp)) {
+		eofb.eof_gid = make_kgid(mp->m_super->s_user_ns, gdqp->q_id);
+		eofb.eof_flags |= XFS_EOF_FLAGS_GID;
+		do_work = true;
 	}
 
-	if (XFS_IS_PQUOTA_ENFORCED(ip->i_mount)) {
-		dq = xfs_inode_dquot(ip, XFS_DQTYPE_PROJ);
-		if (dq && xfs_dquot_lowsp(dq)) {
-			eofb.eof_prid = ip->i_d.di_projid;
-			eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
-			do_work = true;
-		}
+	if (XFS_IS_PQUOTA_ENFORCED(mp) && pdqp && xfs_dquot_lowsp(pdqp)) {
+		eofb.eof_prid = pdqp->q_id;
+		eofb.eof_flags |= XFS_EOF_FLAGS_PRID;
+		do_work = true;
 	}
 
 	if (!do_work)
 		return 0;
 
-	error = xfs_icache_free_eofblocks(ip->i_mount, &eofb);
+	error = xfs_icache_free_eofblocks(mp, &eofb);
 	if (error)
 		return error;
 
-	return xfs_icache_free_cowblocks(ip->i_mount, &eofb);
+	return xfs_icache_free_cowblocks(mp, &eofb);
+}
+
+/*
+ * Run cow/eofblocks scans on the quotas applicable to the inode. For inodes
+ * with multiple quotas, we don't know exactly which quota caused an allocation
+ * failure. We make a best effort by including each quota under low free space
+ * conditions (less than 1% free space) in the scan.
+ */
+int
+xfs_blockgc_free_quota(
+	struct xfs_inode	*ip,
+	unsigned int		eof_flags)
+{
+	return xfs_blockgc_free_dquots(xfs_inode_dquot(ip, XFS_DQTYPE_USER),
+			xfs_inode_dquot(ip, XFS_DQTYPE_GROUP),
+			xfs_inode_dquot(ip, XFS_DQTYPE_PROJ), eof_flags);
 }
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index d64ea8f5c589..5f520de637f6 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -54,6 +54,8 @@ long xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 
+int xfs_blockgc_free_dquots(struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
+		struct xfs_dquot *pdqp, unsigned int eof_flags);
 int xfs_blockgc_free_quota(struct xfs_inode *ip, unsigned int eof_flags);
 
 void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index e909da05cd28..a3072c3f5028 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -990,6 +990,7 @@ xfs_create(
 	struct xfs_dquot	*gdqp = NULL;
 	struct xfs_dquot	*pdqp = NULL;
 	struct xfs_trans_res	*tres;
+	bool			quota_retry = false;
 	uint			resblks;
 
 	trace_xfs_create(dp, name);
@@ -1022,6 +1023,7 @@ xfs_create(
 	 * the case we'll drop the one we have and get a more
 	 * appropriate transaction later.
 	 */
+retry:
 	error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
 	if (error == -ENOSPC) {
 		/* flush outstanding delalloc blocks and retry */
@@ -1037,10 +1039,12 @@ xfs_create(
 	/*
 	 * Reserve disk quota and the inode.
 	 */
-	error = xfs_trans_reserve_quota_icreate(tp, dp, udqp, gdqp, pdqp,
-			resblks);
+	error = xfs_trans_reserve_quota_icreate(&tp, dp, &unlock_dp_on_error,
+			udqp, gdqp, pdqp, resblks, &quota_retry);
 	if (error)
 		goto out_trans_cancel;
+	if (quota_retry)
+		goto retry;
 
 	error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
 			XFS_IEXT_DIR_MANIP_CNT(mp));
@@ -1146,6 +1150,8 @@ xfs_create_tmpfile(
 	struct xfs_dquot	*gdqp = NULL;
 	struct xfs_dquot	*pdqp = NULL;
 	struct xfs_trans_res	*tres;
+	bool			locked = false;
+	bool			quota_retry = false;
 	uint			resblks;
 
 	if (XFS_FORCED_SHUTDOWN(mp))
@@ -1165,14 +1171,17 @@ xfs_create_tmpfile(
 	resblks = XFS_IALLOC_SPACE_RES(mp);
 	tres = &M_RES(mp)->tr_create_tmpfile;
 
+retry:
 	error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
 	if (error)
 		goto out_release_inode;
 
-	error = xfs_trans_reserve_quota_icreate(tp, dp, udqp, gdqp, pdqp,
-			resblks);
+	error = xfs_trans_reserve_quota_icreate(&tp, dp, &locked, udqp, gdqp,
+			pdqp, resblks, &quota_retry);
 	if (error)
 		goto out_trans_cancel;
+	if (quota_retry)
+		goto retry;
 
 	error = xfs_dir_ialloc(&tp, dp, mode, 0, 0, prid, &ip);
 	if (error)
diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
index 1c083b5267d9..c4d02252e36f 100644
--- a/fs/xfs/xfs_quota.h
+++ b/fs/xfs/xfs_quota.h
@@ -87,9 +87,10 @@ int xfs_trans_reserve_quota_nblks(struct xfs_trans **tpp, struct xfs_inode *ip,
 extern int xfs_trans_reserve_quota_bydquots(struct xfs_trans *,
 		struct xfs_mount *, struct xfs_dquot *,
 		struct xfs_dquot *, struct xfs_dquot *, int64_t, long, uint);
-int xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_inode *dp,
-		struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
-		struct xfs_dquot *pdqp, int64_t nblks);
+int xfs_trans_reserve_quota_icreate(struct xfs_trans **tpp,
+		struct xfs_inode *dp, bool *dp_locked, struct xfs_dquot *udqp,
+		struct xfs_dquot *gdqp, struct xfs_dquot *pdqp, int64_t nblks,
+		bool *retry);
 
 extern int xfs_qm_vop_dqalloc(struct xfs_inode *, kuid_t, kgid_t,
 		prid_t, uint, struct xfs_dquot **, struct xfs_dquot **,
@@ -158,9 +159,9 @@ xfs_quota_reserve_blkres(struct xfs_inode *ip, int64_t nblks, bool isrt)
 }
 
 static inline int
-xfs_trans_reserve_quota_icreate(struct xfs_trans *tp, struct xfs_inode *dp,
-		struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
-		struct xfs_dquot *pdqp, int64_t nblks)
+xfs_trans_reserve_quota_icreate(struct xfs_trans **tpp, struct xfs_inode *dp,
+		bool *dp_locked, struct xfs_dquot *udqp, struct xfs_dquot *gdqp,
+		struct xfs_dquot *pdqp, int64_t nblks, bool *retry)
 {
 	return 0;
 }
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index f8bfa51bdeef..20c150ad699f 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -159,6 +159,7 @@ xfs_symlink(
 	struct xfs_dquot	*udqp = NULL;
 	struct xfs_dquot	*gdqp = NULL;
 	struct xfs_dquot	*pdqp = NULL;
+	bool			quota_retry = false;
 	uint			resblks;
 
 	*ipp = NULL;
@@ -197,6 +198,7 @@ xfs_symlink(
 		fs_blocks = xfs_symlink_blocks(mp, pathlen);
 	resblks = XFS_SYMLINK_SPACE_RES(mp, link_name->len, fs_blocks);
 
+retry:
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_symlink, resblks, 0, 0, &tp);
 	if (error)
 		goto out_release_inode;
@@ -215,10 +217,12 @@ xfs_symlink(
 	/*
 	 * Reserve disk quota : blocks and inode.
 	 */
-	error = xfs_trans_reserve_quota_icreate(tp, dp, udqp, gdqp, pdqp,
-			resblks);
+	error = xfs_trans_reserve_quota_icreate(&tp, dp, &unlock_dp_on_error,
+			udqp, gdqp, pdqp, resblks, &quota_retry);
 	if (error)
 		goto out_trans_cancel;
+	if (quota_retry)
+		goto retry;
 
 	error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
 			XFS_IEXT_DIR_MANIP_CNT(mp));
diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
index adc7331ff182..340c066f8ef1 100644
--- a/fs/xfs/xfs_trans_dquot.c
+++ b/fs/xfs/xfs_trans_dquot.c
@@ -835,25 +835,69 @@ xfs_trans_reserve_quota_nblks(
 	return 0;
 }
 
-/* Change the quota reservations for an inode creation activity. */
+/*
+ * Change the quota reservations for an inode creation activity.  This doesn't
+ * change the actual usage, just the reservation.  The caller may hold
+ * ILOCK_EXCL on the inode.  If @retry is not a NULL pointer, the caller must
+ * ensure that *retry is set to false before the first time this function is
+ * called.
+ *
+ * If the quota reservation fails because we hit a quota limit (and retry is
+ * not a NULL pointer, and *retry is false), this function will try to invoke
+ * the speculative preallocation gc scanner to reduce quota usage.  In order to
+ * do that, we cancel the transaction, NULL out tpp, drop the ILOCK and update
+ * *dp_locked if dp_locked is not NULL, and set *retry to true.
+ *
+ * This function ends one of two ways:
+ *
+ *  1) To signal the caller to try again, *retry is set to true; *tpp is
+ *     cancelled and set to NULL; if *dp_locked is true, the inode is unlocked
+ *     and *dp_locked is set to false; and the return value is zero.
+ *
+ *  2) Otherwise, *tpp is still set; the inode is still locked; and the return
+ *     value is zero or the usual negative error code.
+ */
 int
 xfs_trans_reserve_quota_icreate(
-	struct xfs_trans	*tp,
+	struct xfs_trans	**tpp,
 	struct xfs_inode	*dp,
+	bool			*dp_locked,
 	struct xfs_dquot	*udqp,
 	struct xfs_dquot	*gdqp,
 	struct xfs_dquot	*pdqp,
-	int64_t			nblks)
+	int64_t			nblks,
+	bool			*retry)
 {
 	struct xfs_mount	*mp = dp->i_mount;
+	int			error;
 
 	if (!XFS_IS_QUOTA_RUNNING(mp) || !XFS_IS_QUOTA_ON(mp))
 		return 0;
 
 	ASSERT(!xfs_is_quota_inode(&mp->m_sb, dp->i_ino));
 
-	return xfs_trans_reserve_quota_bydquots(tp, dp->i_mount, udqp, gdqp,
+	error = xfs_trans_reserve_quota_bydquots(*tpp, dp->i_mount, udqp, gdqp,
 			pdqp, nblks, 1, XFS_QMOPT_RES_REGBLKS);
+	if (retry == NULL)
+		return error;
+	/* We only allow one retry for EDQUOT/ENOSPC. */
+	if (*retry || (error != -EDQUOT && error != -ENOSPC)) {
+		*retry = false;
+		return error;
+	}
+
+	/* Release resources, prepare for scan. */
+	xfs_trans_cancel(*tpp);
+	*tpp = NULL;
+	if (*dp_locked) {
+		xfs_iunlock(dp, XFS_ILOCK_EXCL);
+		*dp_locked = false;
+	}
+
+	/* Try to free some quota for the supplied dquots. */
+	*retry = true;
+	xfs_blockgc_free_dquots(udqp, gdqp, pdqp, 0);
+	return 0;
 }
 
 /*


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-01-28  6:03 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-18 22:11 [PATCHSET v3 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
2021-01-18 22:12 ` [PATCH 01/11] xfs: refactor messy xfs_inode_free_quota_* functions Darrick J. Wong
2021-01-18 22:12 ` [PATCH 02/11] xfs: don't stall cowblocks scan if we can't take locks Darrick J. Wong
2021-01-19  6:49   ` Christoph Hellwig
2021-01-18 22:12 ` [PATCH 03/11] xfs: xfs_inode_free_quota_blocks should scan project quota Darrick J. Wong
2021-01-18 22:12 ` [PATCH 04/11] xfs: move and rename xfs_inode_free_quota_blocks to avoid conflicts Darrick J. Wong
2021-01-19  6:53   ` Christoph Hellwig
2021-01-18 22:12 ` [PATCH 05/11] xfs: pass flags and return gc errors from xfs_blockgc_free_quota Darrick J. Wong
2021-01-19  6:59   ` Christoph Hellwig
2021-01-18 22:12 ` [PATCH 06/11] xfs: flush eof/cowblocks if we can't reserve quota for file blocks Darrick J. Wong
2021-01-19  3:56   ` Darrick J. Wong
2021-01-19  7:08   ` Christoph Hellwig
2021-01-19 19:43     ` Darrick J. Wong
2021-01-18 22:12 ` [PATCH 07/11] xfs: flush eof/cowblocks if we can't reserve quota for inode creation Darrick J. Wong
2021-01-18 22:12 ` [PATCH 08/11] xfs: flush eof/cowblocks if we can't reserve quota for chown Darrick J. Wong
2021-01-18 22:12 ` [PATCH 09/11] xfs: add a tracepoint for blockgc scans Darrick J. Wong
2021-01-19  7:11   ` Christoph Hellwig
2021-01-18 22:12 ` [PATCH 10/11] xfs: refactor xfs_icache_free_{eof,cow}blocks call sites Darrick J. Wong
2021-01-18 22:12 ` [PATCH 11/11] xfs: flush speculative space allocations when we run out of space Darrick J. Wong
2021-01-23 18:51 [PATCHSET v4 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
2021-01-23 18:52 ` [PATCH 07/11] xfs: flush eof/cowblocks if we can't reserve quota for inode creation Darrick J. Wong
2021-01-28  6:02 [PATCHSET v5 00/11] xfs: try harder to reclaim space when we run out Darrick J. Wong
2021-01-28  6:03 ` [PATCH 07/11] xfs: flush eof/cowblocks if we can't reserve quota for inode creation Darrick J. Wong

This is a public inbox, see mirroring instructions
on how to clone and mirror all data and code used for this inbox