From: "Darrick J. Wong" <djwong@kernel.org>
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Subject: [PATCH 03/11] xfs: don't reclaim dquots with incore reservations
Date: Wed, 10 Mar 2021 19:05:57 -0800
Message-ID: <161543195719.1947934.8218545606940173264.stgit@magnolia>
In-Reply-To: <161543194009.1947934.9910987247994410125.stgit@magnolia>

From: Darrick J. Wong <djwong@kernel.org>

If a dquot has an incore reservation that exceeds the ondisk count, it
by definition has active incore state and must not be reclaimed.  Until
now, every inode with an incore dquot reservation has retained a
reference to the dquot, so xfs_qm_dquot_isolate could never be called on
a dquot with active state and a zero refcount.  This is about to change.

Deferred inode inactivation is about to reorganize how inodes are
inactivated by shunting all that work to a background workqueue.  In
order to avoid deadlocks with the quotaoff inode scan and reduce overall
memory requirements (since inodes can spend a lot of time waiting for
inactivation), inactive inodes will drop their dquot references while
they're waiting to be inactivated.

However, inactive inodes can have delalloc extents in the data fork or
any extents in the CoW fork.  Either of these makes the dquot's incore
reservation larger than the resource count (i.e. they're the reason the
dquot still has active incore state), so we cannot allow the dquot to be
reclaimed.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
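The reclaim rule described in the commit message can be modelled in
userspace.  This is only a sketch, not the kernel code: the struct and
function names prefixed with model_ are hypothetical stand-ins, and the
counters are simplified to plain integers.  It shows that a dquot with
zero references is still not reclaimable while any of its reservations
exceeds the matching count.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one dquot resource (blocks, inodes, rt blocks). */
struct model_res {
	int64_t reserved;	/* incore reservation, includes delalloc/CoW */
	int64_t count;		/* resource count */
};

/* Hypothetical stand-in for the fields of a dquot that matter here. */
struct model_dquot {
	struct model_res blk;
	struct model_res ino;
	struct model_res rtb;
	unsigned int nrefs;	/* reference count */
};

/* True if any reservation exceeds its count, i.e. active incore state. */
static bool model_has_incore_resv(const struct model_dquot *dq)
{
	return dq->blk.reserved > dq->blk.count ||
	       dq->ino.reserved > dq->ino.count ||
	       dq->rtb.reserved > dq->rtb.count;
}

/* A dquot may be reclaimed only with no references and no active state. */
static bool model_can_reclaim(const struct model_dquot *dq)
{
	return dq->nrefs == 0 && !model_has_incore_resv(dq);
}
```

In this model, an inode awaiting inactivation that still holds delalloc
blocks corresponds to blk.reserved > blk.count with nrefs == 0, which
model_can_reclaim() rejects.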
 fs/xfs/xfs_qm.c |   29 ++++++++++++++++++++++++-----
 fs/xfs/xfs_qm.h |   17 +++++++++++++++++
 2 files changed, 41 insertions(+), 5 deletions(-)


diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index bfa4164990b1..b3ce04dec181 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -166,9 +166,14 @@ xfs_qm_dqpurge(
 
 	/*
 	 * We move dquots to the freelist as soon as their reference count
-	 * hits zero, so it really should be on the freelist here.
+	 * hits zero, so it really should be on the freelist here.  If we're
+	 * running quotaoff, it's possible that we're purging a zero-refcount
+	 * dquot with active incore reservation because there are inodes
+	 * awaiting inactivation.  Dquots in this state will not be on the LRU
+	 * but it's quotaoff, so we don't care.
 	 */
-	ASSERT(!list_empty(&dqp->q_lru));
+	ASSERT(!(mp->m_qflags & xfs_quota_active_flag(xfs_dquot_type(dqp))) ||
+	       !list_empty(&dqp->q_lru));
 	list_lru_del(&qi->qi_lru, &dqp->q_lru);
 	XFS_STATS_DEC(mp, xs_qm_dquot_unused);
 
@@ -411,6 +416,15 @@ struct xfs_qm_isolate {
 	struct list_head	dispose;
 };
 
+static inline bool
+xfs_dquot_has_incore_resv(
+	struct xfs_dquot	*dqp)
+{
+	return  dqp->q_blk.reserved > dqp->q_blk.count ||
+		dqp->q_ino.reserved > dqp->q_ino.count ||
+		dqp->q_rtb.reserved > dqp->q_rtb.count;
+}
+
 static enum lru_status
 xfs_qm_dquot_isolate(
 	struct list_head	*item,
@@ -427,10 +441,15 @@ xfs_qm_dquot_isolate(
 		goto out_miss_busy;
 
 	/*
-	 * This dquot has acquired a reference in the meantime remove it from
-	 * the freelist and try again.
+	 * Either this dquot has incore reservations or it has acquired a
+	 * reference.  Remove it from the freelist and try again.
+	 *
+	 * Inodes tagged for inactivation drop their dquot references to avoid
+	 * deadlocks with quotaoff.  If these inodes have delalloc reservations
+	 * in the data fork or any extents in the CoW fork, these contribute
+	 * to the dquot's incore block reservation exceeding the count.
 	 */
-	if (dqp->q_nrefs) {
+	if (xfs_dquot_has_incore_resv(dqp) || dqp->q_nrefs) {
 		xfs_dqunlock(dqp);
 		XFS_STATS_INC(dqp->q_mount, xs_qm_dqwants);
 
diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h
index e3dabab44097..78f90935e91e 100644
--- a/fs/xfs/xfs_qm.h
+++ b/fs/xfs/xfs_qm.h
@@ -105,6 +105,23 @@ xfs_quota_inode(struct xfs_mount *mp, xfs_dqtype_t type)
 	return NULL;
 }
 
+static inline unsigned int
+xfs_quota_active_flag(
+	xfs_dqtype_t		type)
+{
+	switch (type) {
+	case XFS_DQTYPE_USER:
+		return XFS_UQUOTA_ACTIVE;
+	case XFS_DQTYPE_GROUP:
+		return XFS_GQUOTA_ACTIVE;
+	case XFS_DQTYPE_PROJ:
+		return XFS_PQUOTA_ACTIVE;
+	default:
+		ASSERT(0);
+	}
+	return 0;
+}
+
 extern void	xfs_trans_mod_dquot(struct xfs_trans *tp, struct xfs_dquot *dqp,
 				    uint field, int64_t delta);
 extern void	xfs_trans_dqjoin(struct xfs_trans *, struct xfs_dquot *);
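
The relaxed assertion in xfs_qm_dqpurge reads as an implication: if the
quota type is still active, the dquot must be on the LRU.  A minimal
userspace sketch of that logic follows.  The model_ names and the flag
values are assumptions made up for illustration; they are not the
kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical quota types and flag bits, for illustration only. */
enum model_dqtype {
	MODEL_DQTYPE_USER,
	MODEL_DQTYPE_GROUP,
	MODEL_DQTYPE_PROJ,
};

#define MODEL_UQUOTA_ACTIVE	0x1u
#define MODEL_GQUOTA_ACTIVE	0x2u
#define MODEL_PQUOTA_ACTIVE	0x4u

/* Map a quota type to its "active" flag, mirroring the helper's shape. */
static unsigned int model_active_flag(enum model_dqtype type)
{
	switch (type) {
	case MODEL_DQTYPE_USER:
		return MODEL_UQUOTA_ACTIVE;
	case MODEL_DQTYPE_GROUP:
		return MODEL_GQUOTA_ACTIVE;
	case MODEL_DQTYPE_PROJ:
		return MODEL_PQUOTA_ACTIVE;
	}
	return 0;
}

/*
 * The purge-time invariant: either this quota type is no longer active
 * (quotaoff is running), or the dquot is on the LRU.
 */
static bool model_purge_invariant_ok(unsigned int qflags,
				     enum model_dqtype type, bool on_lru)
{
	return !(qflags & model_active_flag(type)) || on_lru;
}
```

Under these assumptions, a dquot that is off the LRU only violates the
invariant while its own quota type is still flagged active.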