All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] xfs: rework log quiesce to cover the log
@ 2021-01-06 17:41 Brian Foster
  2021-01-06 17:41 ` [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts Brian Foster
                   ` (8 more replies)
  0 siblings, 9 replies; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

Hi all,

This series modifies the log quiesce code path to cover the log instead
of mark it clean and separates the latter into a distinct step. This
allows existing users to determine whether to quiesce (i.e. cover) or
clean the log depending on the operation, and thus avoids the need for
contexts like freeze to have to redirty the log after a quiesce.

By covering the log on quiesce, we can also fold final superblock
updates (i.e., lazy superblock counters) into the quiesce sequence
because covering uses the same superblock transaction as the explicit
superblock updates. This same approach can accommodate (probably with
some additional tweaks) future final superblock updates, such as for log
feature compatibility bit management when the log is cleared of
incompatible items.

Patch 1 is a repost of a lazy sb accounting bug fix that was previously
sent (included here as a dependency). Patches 2-3 make some preliminary
cleanups. Patch 4 injects log covering into the log quiesce sequence.
Patches 5-6 fold the existing lazy superblock accounting update into
quiesce. Patches 7-8 make some final refactoring cleanups (these two
patches could probably be squashed now that I look at them again).
Finally, patch 9 updates fs freeze to cover the log instead of cleaning
and redirtying it.

Thoughts, reviews, flames appreciated.

Brian

Brian Foster (9):
  xfs: sync lazy sb accounting on quiesce of read-only mounts
  xfs: lift writable fs check up into log worker task
  xfs: separate log cleaning from log quiesce
  xfs: cover the log during log quiesce
  xfs: don't reset log idle state on covering checkpoints
  xfs: fold sbcount quiesce logging into log covering
  xfs: remove duplicate wq cancel and log force from attr quiesce
  xfs: remove xfs_quiesce_attr()
  xfs: cover the log on freeze instead of cleaning it

 fs/xfs/xfs_log.c   | 122 +++++++++++++++++++++++++++++++++++++--------
 fs/xfs/xfs_log.h   |   4 +-
 fs/xfs/xfs_mount.c |  34 +------------
 fs/xfs/xfs_mount.h |   1 -
 fs/xfs/xfs_super.c |  38 +-------------
 5 files changed, 106 insertions(+), 93 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts
  2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
@ 2021-01-06 17:41 ` Brian Foster
  2021-01-06 22:50   ` Allison Henderson
                     ` (3 more replies)
  2021-01-06 17:41 ` [PATCH 2/9] xfs: lift writable fs check up into log worker task Brian Foster
                   ` (7 subsequent siblings)
  8 siblings, 4 replies; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

xfs_log_sbcount() syncs the superblock specifically to accumulate
the in-core percpu superblock counters and commit them to disk. This
is required to maintain filesystem consistency across quiesce
(freeze, read-only mount/remount) or unmount when lazy superblock
accounting is enabled because individual transactions do not update
the superblock directly.

This mechanism works as expected for writable mounts, but
xfs_log_sbcount() skips the update for read-only mounts. Read-only
mounts otherwise still allow log recovery and write out an unmount
record during log quiesce. If a read-only mount performs log
recovery, it can modify the in-core superblock counters and write an
unmount record when the filesystem unmounts without ever syncing the
in-core counters. This leaves the filesystem with a clean log but in
an inconsistent state with regard to lazy sb counters.

Update xfs_log_sbcount() to use the same logic
xfs_log_unmount_write() uses to determine when to write an unmount
record. We can drop the freeze state check because the update is
already allowed during the freezing process and no context calls
this function on an already frozen fs. This ensures that lazy
accounting is always synced before the log is cleaned. Refactor this
logic into a new helper to distinguish between a writable filesystem
and a writable log. Specifically, the log is writable unless the
filesystem is mounted with the norecovery mount option, the
underlying log device is read-only, or the filesystem is shutdown.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
---
 fs/xfs/xfs_log.c   | 28 ++++++++++++++++++++--------
 fs/xfs/xfs_log.h   |  1 +
 fs/xfs/xfs_mount.c |  3 +--
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index fa2d05e65ff1..b445e63cbc3c 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -347,6 +347,25 @@ xlog_tic_add_region(xlog_ticket_t *tic, uint len, uint type)
 	tic->t_res_num++;
 }
 
+bool
+xfs_log_writable(
+	struct xfs_mount	*mp)
+{
+	/*
+	 * Never write to the log on norecovery mounts, if the block device is
+	 * read-only, or if the filesystem is shutdown. Read-only mounts still
+	 * allow internal writes for log recovery and unmount purposes, so don't
+	 * restrict that case here.
+	 */
+	if (mp->m_flags & XFS_MOUNT_NORECOVERY)
+		return false;
+	if (xfs_readonly_buftarg(mp->m_log->l_targ))
+		return false;
+	if (XFS_FORCED_SHUTDOWN(mp))
+		return false;
+	return true;
+}
+
 /*
  * Replenish the byte reservation required by moving the grant write head.
  */
@@ -886,15 +905,8 @@ xfs_log_unmount_write(
 {
 	struct xlog		*log = mp->m_log;
 
-	/*
-	 * Don't write out unmount record on norecovery mounts or ro devices.
-	 * Or, if we are doing a forced umount (typically because of IO errors).
-	 */
-	if (mp->m_flags & XFS_MOUNT_NORECOVERY ||
-	    xfs_readonly_buftarg(log->l_targ)) {
-		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
+	if (!xfs_log_writable(mp))
 		return;
-	}
 
 	xfs_log_force(mp, XFS_LOG_SYNC);
 
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index 58c3fcbec94a..98c913da7587 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -127,6 +127,7 @@ int	  xfs_log_reserve(struct xfs_mount *mp,
 int	  xfs_log_regrant(struct xfs_mount *mp, struct xlog_ticket *tic);
 void      xfs_log_unmount(struct xfs_mount *mp);
 int	  xfs_log_force_umount(struct xfs_mount *mp, int logerror);
+bool	xfs_log_writable(struct xfs_mount *mp);
 
 struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
 void	  xfs_log_ticket_put(struct xlog_ticket *ticket);
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 7110507a2b6b..a62b8a574409 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1176,8 +1176,7 @@ xfs_fs_writable(
 int
 xfs_log_sbcount(xfs_mount_t *mp)
 {
-	/* allow this to proceed during the freeze sequence... */
-	if (!xfs_fs_writable(mp, SB_FREEZE_COMPLETE))
+	if (!xfs_log_writable(mp))
 		return 0;
 
 	/*
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/9] xfs: lift writable fs check up into log worker task
  2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
  2021-01-06 17:41 ` [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts Brian Foster
@ 2021-01-06 17:41 ` Brian Foster
  2021-01-06 22:50   ` Allison Henderson
                     ` (2 more replies)
  2021-01-06 17:41 ` [PATCH 3/9] xfs: separate log cleaning from log quiesce Brian Foster
                   ` (6 subsequent siblings)
  8 siblings, 3 replies; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

The log covering helper checks whether the filesystem is writable to
determine whether to cover the log. The helper is currently only
called from the background log worker. In preparation to reuse the
helper from freezing contexts, lift the check into xfs_log_worker().

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index b445e63cbc3c..4137ed007111 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1050,13 +1050,11 @@ xfs_log_space_wake(
  * can't start trying to idle the log until both the CIL and AIL are empty.
  */
 static int
-xfs_log_need_covered(xfs_mount_t *mp)
+xfs_log_need_covered(
+	struct xfs_mount	*mp)
 {
-	struct xlog	*log = mp->m_log;
-	int		needed = 0;
-
-	if (!xfs_fs_writable(mp, SB_FREEZE_WRITE))
-		return 0;
+	struct xlog		*log = mp->m_log;
+	int			needed = 0;
 
 	if (!xlog_cil_empty(log))
 		return 0;
@@ -1271,7 +1269,7 @@ xfs_log_worker(
 	struct xfs_mount	*mp = log->l_mp;
 
 	/* dgc: errors ignored - not fatal and nowhere to report them */
-	if (xfs_log_need_covered(mp)) {
+	if (xfs_fs_writable(mp, SB_FREEZE_WRITE) && xfs_log_need_covered(mp)) {
 		/*
 		 * Dump a transaction into the log that contains no real change.
 		 * This is needed to stamp the current tail LSN into the log
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 3/9] xfs: separate log cleaning from log quiesce
  2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
  2021-01-06 17:41 ` [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts Brian Foster
  2021-01-06 17:41 ` [PATCH 2/9] xfs: lift writable fs check up into log worker task Brian Foster
@ 2021-01-06 17:41 ` Brian Foster
  2021-01-06 22:50   ` Allison Henderson
                     ` (2 more replies)
  2021-01-06 17:41 ` [PATCH 4/9] xfs: cover the log during " Brian Foster
                   ` (5 subsequent siblings)
  8 siblings, 3 replies; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

Log quiesce is currently associated with cleaning the log, which is
accomplished by writing an unmount record as the last step of the
quiesce sequence. The quiesce codepath is a bit convoluted in this
regard due to how it is reused from various contexts. In preparation
to create separate log cleaning and log covering interfaces, lift
the write of the unmount record into a new cleaning helper and call
that wherever xfs_log_quiesce() is currently invoked. No functional
changes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c   | 8 +++++++-
 fs/xfs/xfs_log.h   | 1 +
 fs/xfs/xfs_super.c | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 4137ed007111..1b3227a033ad 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -957,7 +957,13 @@ xfs_log_quiesce(
 	xfs_wait_buftarg(mp->m_ddev_targp);
 	xfs_buf_lock(mp->m_sb_bp);
 	xfs_buf_unlock(mp->m_sb_bp);
+}
 
+void
+xfs_log_clean(
+	struct xfs_mount	*mp)
+{
+	xfs_log_quiesce(mp);
 	xfs_log_unmount_write(mp);
 }
 
@@ -972,7 +978,7 @@ void
 xfs_log_unmount(
 	struct xfs_mount	*mp)
 {
-	xfs_log_quiesce(mp);
+	xfs_log_clean(mp);
 
 	xfs_trans_ail_destroy(mp);
 
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index 98c913da7587..b0400589f824 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -139,6 +139,7 @@ bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
 
 void	xfs_log_work_queue(struct xfs_mount *mp);
 void	xfs_log_quiesce(struct xfs_mount *mp);
+void	xfs_log_clean(struct xfs_mount *mp);
 bool	xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t);
 bool	xfs_log_in_recovery(struct xfs_mount *);
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 813be879a5e5..09d956e30fd8 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -897,7 +897,7 @@ xfs_quiesce_attr(
 	if (error)
 		xfs_warn(mp, "xfs_attr_quiesce: failed to log sb changes. "
 				"Frozen image may not be consistent.");
-	xfs_log_quiesce(mp);
+	xfs_log_clean(mp);
 }
 
 /*
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 4/9] xfs: cover the log during log quiesce
  2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
                   ` (2 preceding siblings ...)
  2021-01-06 17:41 ` [PATCH 3/9] xfs: separate log cleaning from log quiesce Brian Foster
@ 2021-01-06 17:41 ` Brian Foster
  2021-01-07 19:04   ` Darrick J. Wong
  2021-01-19 15:35   ` Christoph Hellwig
  2021-01-06 17:41 ` [PATCH 5/9] xfs: don't reset log idle state on covering checkpoints Brian Foster
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

The log quiesce mechanism historically terminates by marking the log
clean with an unmount record. The primary objective is to indicate
that log recovery is no longer required after the quiesce has
flushed all in-core changes and written back filesystem metadata.
While this is perfectly fine, it is somewhat hacky as currently used
in certain contexts. For example, filesystem freeze quiesces (i.e.
cleans) the log and immediately redirties it with a dummy superblock
transaction to ensure that log recovery runs in the event of a
crash.

While this functions correctly, cleaning the log from freeze context
is clearly superfluous given the current redirtying behavior.
Instead, the desired behavior can be achieved by simply covering the
log. This effectively retires all on-disk log items from the active
range of the log by issuing two synchronous and sequential dummy
superblock update transactions that serve to update the on-disk log
head and tail. The subtle difference is that the log technically
remains dirty due to the lack of an unmount record, though recovery
is effectively a no-op due to the content of the checkpoints being
clean (i.e. the unmodified on-disk superblock).

Log covering currently runs in the background and only triggers once
the filesystem and log has idled. The purpose of the background
mechanism is to prevent log recovery from replaying the most
recently logged items long after those items may have been written
back. In the quiesce path, the log has been deliberately idled by
forcing the log and pushing the AIL until empty in a context where
no further mutable filesystem operations are allowed. Therefore, we
can cover the log as the final step in the log quiesce codepath to
reflect that all previously active items have been successfully
written back.

This facilitates selective log covering from certain contexts (i.e.
freeze) that only seek to quiesce, but not necessarily clean the
log. Note that as a side effect of this change, log covering now
occurs when cleaning the log as well. This is harmless, facilitates
subsequent cleanups, and is mostly temporary as various operations
switch to use explicit log covering.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c | 49 +++++++++++++++++++++++++++++++++++++++++++++---
 fs/xfs/xfs_log.h |  2 +-
 2 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 1b3227a033ad..f7b23044723d 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -91,6 +91,9 @@ STATIC int
 xlog_iclogs_empty(
 	struct xlog		*log);
 
+static int
+xfs_log_cover(struct xfs_mount *);
+
 static void
 xlog_grant_sub_space(
 	struct xlog		*log,
@@ -936,10 +939,9 @@ xfs_log_unmount_write(
  * To do this, we first need to shut down the background log work so it is not
  * trying to cover the log as we clean up. We then need to unpin all objects in
  * the log so we can then flush them out. Once they have completed their IO and
- * run the callbacks removing themselves from the AIL, we can write the unmount
- * record.
+ * run the callbacks removing themselves from the AIL, we can cover the log.
  */
-void
+int
 xfs_log_quiesce(
 	struct xfs_mount	*mp)
 {
@@ -957,6 +959,8 @@ xfs_log_quiesce(
 	xfs_wait_buftarg(mp->m_ddev_targp);
 	xfs_buf_lock(mp->m_sb_bp);
 	xfs_buf_unlock(mp->m_sb_bp);
+
+	return xfs_log_cover(mp);
 }
 
 void
@@ -1092,6 +1096,45 @@ xfs_log_need_covered(
 	return needed;
 }
 
+/*
+ * Explicitly cover the log. This is similar to background log covering but
+ * intended for usage in quiesce codepaths. The caller is responsible to ensure
+ * the log is idle and suitable for covering. The CIL, iclog buffers and AIL
+ * must all be empty.
+ */
+static int
+xfs_log_cover(
+	struct xfs_mount	*mp)
+{
+	struct xlog		*log = mp->m_log;
+	int			error = 0;
+
+	ASSERT((xlog_cil_empty(log) && xlog_iclogs_empty(log) &&
+	        !xfs_ail_min_lsn(log->l_ailp)) ||
+	       XFS_FORCED_SHUTDOWN(mp));
+
+	if (!xfs_log_writable(mp))
+		return 0;
+
+	/*
+	 * To cover the log, commit the superblock twice (at most) in
+	 * independent checkpoints. The first serves as a reference for the
+	 * tail pointer. The sync transaction and AIL push empties the AIL and
+	 * updates the in-core tail to the LSN of the first checkpoint. The
+	 * second commit updates the on-disk tail with the in-core LSN,
+	 * covering the log. Push the AIL one more time to leave it empty, as
+	 * we found it.
+	 */
+	while (xfs_log_need_covered(mp)) {
+		error = xfs_sync_sb(mp, true);
+		if (error)
+			break;
+		xfs_ail_push_all_sync(mp->m_ail);
+	}
+
+	return error;
+}
+
 /*
  * We may be holding the log iclog lock upon entering this routine.
  */
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index b0400589f824..044e02cb8921 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -138,7 +138,7 @@ void	xlog_cil_process_committed(struct list_head *list);
 bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
 
 void	xfs_log_work_queue(struct xfs_mount *mp);
-void	xfs_log_quiesce(struct xfs_mount *mp);
+int	xfs_log_quiesce(struct xfs_mount *mp);
 void	xfs_log_clean(struct xfs_mount *mp);
 bool	xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t);
 bool	xfs_log_in_recovery(struct xfs_mount *);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 5/9] xfs: don't reset log idle state on covering checkpoints
  2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
                   ` (3 preceding siblings ...)
  2021-01-06 17:41 ` [PATCH 4/9] xfs: cover the log during " Brian Foster
@ 2021-01-06 17:41 ` Brian Foster
  2021-01-07 19:30   ` Darrick J. Wong
  2021-01-06 17:41 ` [PATCH 6/9] xfs: fold sbcount quiesce logging into log covering Brian Foster
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

Now that log covering occurs on quiesce, we'd like to reuse the
underlying superblock sync for final superblock updates. This
includes things like lazy superblock counter updates, log feature
incompat bits in the future, etc. One quirk to this approach is that
once the log is in the IDLE (i.e. already covered) state, any
subsequent log write resets the state back to NEED. This means that
a final superblock sync to an already covered log requires two more
sb syncs to return the log back to IDLE again.

For example, if a lazy superblock enabled filesystem is mount cycled
without any modifications, the unmount path syncs the superblock
once and writes an unmount record. With the desired log quiesce
covering behavior, we sync the superblock three times at unmount
time: once for the lazy superblock counter update and twice more to
cover the log. By contrast, if the log is active or only partially
covered at unmount time, a final superblock sync would doubly serve
as the one or two remaining syncs required to cover the log.

This duplicate covering sequence is unnecessary because the
filesystem remains consistent if a crash occurs at any point. The
superblock will either be recovered in the event of a crash or
written back before the log is quiesced and potentially cleaned with
an unmount record.

Update the log covering state machine to remain in the IDLE state if
additional covering checkpoints pass through the log. This
facilitates final superblock updates (such as lazy superblock
counters) via a single sb sync without losing covered status. This
provides some consistency with the active and partially covered
cases and also avoids harmless, but spurious checkpoints when
quiescing the log.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index f7b23044723d..9b8564f280b7 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -2597,12 +2597,15 @@ xlog_covered_state(
 	int			iclogs_changed)
 {
 	/*
-	 * We usually go to NEED. But we go to NEED2 if the changed indicates we
-	 * are done writing the dummy record.  If we are done with the second
-	 * dummy recored (DONE2), then we go to IDLE.
+	 * We go to NEED for any non-covering writes. We go to NEED2 if we just
+	 * wrote the first covering record (DONE). We go to IDLE if we just
+	 * wrote the second covering record (DONE2) and remain in IDLE until a
+	 * non-covering write occurs.
 	 */
 	switch (prev_state) {
 	case XLOG_STATE_COVER_IDLE:
+		if (iclogs_changed == 1)
+			return XLOG_STATE_COVER_IDLE;
 	case XLOG_STATE_COVER_NEED:
 	case XLOG_STATE_COVER_NEED2:
 		break;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 6/9] xfs: fold sbcount quiesce logging into log covering
  2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
                   ` (4 preceding siblings ...)
  2021-01-06 17:41 ` [PATCH 5/9] xfs: don't reset log idle state on covering checkpoints Brian Foster
@ 2021-01-06 17:41 ` Brian Foster
  2021-01-07 19:31   ` Darrick J. Wong
  2021-01-06 17:41 ` [PATCH 7/9] xfs: remove duplicate wq cancel and log force from attr quiesce Brian Foster
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

xfs_log_sbcount() calls xfs_sync_sb() to sync superblock counters to
disk when lazy superblock accounting is enabled. This occurs on
unmount, freeze, and read-only (re)mount and ensures the final
values are calculated and persisted to disk before each form of
quiesce completes.

Now that log covering occurs in all of these contexts and uses the
same xfs_sync_sb() mechanism to update log state, there is no need
to log the superblock separately for any reason. Update the log
quiesce path to sync the superblock at least once for any mount
where lazy superblock accounting is enabled. If the log is already
covered, it will remain in the covered state. Otherwise, the next
sync as part of the normal covering sequence will carry the
associated superblock update with it. Remove xfs_log_sbcount() now
that it is no longer needed.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log.c   | 20 ++++++++++++++++++--
 fs/xfs/xfs_mount.c | 31 -------------------------------
 fs/xfs/xfs_mount.h |  1 -
 fs/xfs/xfs_super.c |  8 --------
 4 files changed, 18 insertions(+), 42 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 9b8564f280b7..83e2697d4e38 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -1108,6 +1108,7 @@ xfs_log_cover(
 {
 	struct xlog		*log = mp->m_log;
 	int			error = 0;
+	bool			need_covered;
 
 	ASSERT((xlog_cil_empty(log) && xlog_iclogs_empty(log) &&
 	        !xfs_ail_min_lsn(log->l_ailp)) ||
@@ -1116,6 +1117,21 @@ xfs_log_cover(
 	if (!xfs_log_writable(mp))
 		return 0;
 
+	/*
+	 * xfs_log_need_covered() is not idempotent because it progresses the
+	 * state machine if the log requires covering. Therefore, we must call
+	 * this function once and use the result until we've issued an sb sync.
+	 * Do so first to make that abundantly clear.
+	 *
+	 * Fall into the covering sequence if the log needs covering or the
+	 * mount has lazy superblock accounting to sync to disk. The sb sync
+	 * used for covering accumulates the in-core counters, so covering
+	 * handles this for us.
+	 */
+	need_covered = xfs_log_need_covered(mp);
+	if (!need_covered && !xfs_sb_version_haslazysbcount(&mp->m_sb))
+		return 0;
+
 	/*
 	 * To cover the log, commit the superblock twice (at most) in
 	 * independent checkpoints. The first serves as a reference for the
@@ -1125,12 +1141,12 @@ xfs_log_cover(
 	 * covering the log. Push the AIL one more time to leave it empty, as
 	 * we found it.
 	 */
-	while (xfs_log_need_covered(mp)) {
+	do {
 		error = xfs_sync_sb(mp, true);
 		if (error)
 			break;
 		xfs_ail_push_all_sync(mp->m_ail);
-	}
+	} while (xfs_log_need_covered(mp));
 
 	return error;
 }
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index a62b8a574409..f97b82d0e30f 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1124,12 +1124,6 @@ xfs_unmountfs(
 		xfs_warn(mp, "Unable to free reserved block pool. "
 				"Freespace may not be correct on next mount.");
 
-	error = xfs_log_sbcount(mp);
-	if (error)
-		xfs_warn(mp, "Unable to update superblock counters. "
-				"Freespace may not be correct on next mount.");
-
-
 	xfs_log_unmount(mp);
 	xfs_da_unmount(mp);
 	xfs_uuid_unmount(mp);
@@ -1164,31 +1158,6 @@ xfs_fs_writable(
 	return true;
 }
 
-/*
- * xfs_log_sbcount
- *
- * Sync the superblock counters to disk.
- *
- * Note this code can be called during the process of freezing, so we use the
- * transaction allocator that does not block when the transaction subsystem is
- * in its frozen state.
- */
-int
-xfs_log_sbcount(xfs_mount_t *mp)
-{
-	if (!xfs_log_writable(mp))
-		return 0;
-
-	/*
-	 * we don't need to do this if we are updating the superblock
-	 * counters on every modification.
-	 */
-	if (!xfs_sb_version_haslazysbcount(&mp->m_sb))
-		return 0;
-
-	return xfs_sync_sb(mp, true);
-}
-
 /*
  * Deltas for the block count can vary from 1 to very large, but lock contention
  * only occurs on frequent small block count updates such as in the delayed
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index dfa429b77ee2..452ca7654dc5 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -399,7 +399,6 @@ int xfs_buf_hash_init(xfs_perag_t *pag);
 void xfs_buf_hash_destroy(xfs_perag_t *pag);
 
 extern void	xfs_uuid_table_free(void);
-extern int	xfs_log_sbcount(xfs_mount_t *);
 extern uint64_t xfs_default_resblks(xfs_mount_t *mp);
 extern int	xfs_mountfs(xfs_mount_t *mp);
 extern int	xfs_initialize_perag(xfs_mount_t *mp, xfs_agnumber_t agcount,
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 09d956e30fd8..75ada867c665 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -884,19 +884,11 @@ void
 xfs_quiesce_attr(
 	struct xfs_mount	*mp)
 {
-	int	error = 0;
-
 	cancel_delayed_work_sync(&mp->m_log->l_work);
 
 	/* force the log to unpin objects from the now complete transactions */
 	xfs_log_force(mp, XFS_LOG_SYNC);
 
-
-	/* Push the superblock and write an unmount record */
-	error = xfs_log_sbcount(mp);
-	if (error)
-		xfs_warn(mp, "xfs_attr_quiesce: failed to log sb changes. "
-				"Frozen image may not be consistent.");
 	xfs_log_clean(mp);
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 7/9] xfs: remove duplicate wq cancel and log force from attr quiesce
  2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
                   ` (5 preceding siblings ...)
  2021-01-06 17:41 ` [PATCH 6/9] xfs: fold sbcount quiesce logging into log covering Brian Foster
@ 2021-01-06 17:41 ` Brian Foster
  2021-01-07 19:38   ` Darrick J. Wong
  2021-01-06 17:41 ` [PATCH 8/9] xfs: remove xfs_quiesce_attr() Brian Foster
  2021-01-06 17:41 ` [PATCH 9/9] xfs: cover the log on freeze instead of cleaning it Brian Foster
  8 siblings, 1 reply; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

These two calls are repeated at the beginning of xfs_log_quiesce().
Drop them from xfs_quiesce_attr().

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_super.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 75ada867c665..8fc9044131fc 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -884,11 +884,6 @@ void
 xfs_quiesce_attr(
 	struct xfs_mount	*mp)
 {
-	cancel_delayed_work_sync(&mp->m_log->l_work);
-
-	/* force the log to unpin objects from the now complete transactions */
-	xfs_log_force(mp, XFS_LOG_SYNC);
-
 	xfs_log_clean(mp);
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 8/9] xfs: remove xfs_quiesce_attr()
  2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
                   ` (6 preceding siblings ...)
  2021-01-06 17:41 ` [PATCH 7/9] xfs: remove duplicate wq cancel and log force from attr quiesce Brian Foster
@ 2021-01-06 17:41 ` Brian Foster
  2021-01-07 19:39   ` Darrick J. Wong
  2021-01-06 17:41 ` [PATCH 9/9] xfs: cover the log on freeze instead of cleaning it Brian Foster
  8 siblings, 1 reply; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

xfs_quiesce_attr() is now a wrapper for xfs_log_clean(). Remove it
and call xfs_log_clean() directly.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_mount.c |  2 +-
 fs/xfs/xfs_super.c | 24 ++----------------------
 2 files changed, 3 insertions(+), 23 deletions(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index f97b82d0e30f..4a26b48b18e4 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -946,7 +946,7 @@ xfs_mountfs(
 	 */
 	if ((mp->m_flags & (XFS_MOUNT_RDONLY|XFS_MOUNT_NORECOVERY)) ==
 							XFS_MOUNT_RDONLY) {
-		xfs_quiesce_attr(mp);
+		xfs_log_clean(mp);
 	}
 
 	/*
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 8fc9044131fc..aedf622d221b 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -867,26 +867,6 @@ xfs_restore_resvblks(struct xfs_mount *mp)
 	xfs_reserve_blocks(mp, &resblks, NULL);
 }
 
-/*
- * Trigger writeback of all the dirty metadata in the file system.
- *
- * This ensures that the metadata is written to their location on disk rather
- * than just existing in transactions in the log. This means after a quiesce
- * there is no log replay required to write the inodes to disk - this is the
- * primary difference between a sync and a quiesce.
- *
- * We cancel log work early here to ensure all transactions the log worker may
- * run have finished before we clean up and log the superblock and write an
- * unmount record. The unfreeze process is responsible for restarting the log
- * worker correctly.
- */
-void
-xfs_quiesce_attr(
-	struct xfs_mount	*mp)
-{
-	xfs_log_clean(mp);
-}
-
 /*
  * Second stage of a freeze. The data is already frozen so we only
  * need to take care of the metadata. Once that's done sync the superblock
@@ -909,7 +889,7 @@ xfs_fs_freeze(
 	flags = memalloc_nofs_save();
 	xfs_stop_block_reaping(mp);
 	xfs_save_resvblks(mp);
-	xfs_quiesce_attr(mp);
+	xfs_log_clean(mp);
 	ret = xfs_sync_sb(mp, true);
 	memalloc_nofs_restore(flags);
 	return ret;
@@ -1752,7 +1732,7 @@ xfs_remount_ro(
 	 */
 	xfs_save_resvblks(mp);
 
-	xfs_quiesce_attr(mp);
+	xfs_log_clean(mp);
 	mp->m_flags |= XFS_MOUNT_RDONLY;
 
 	return 0;
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 9/9] xfs: cover the log on freeze instead of cleaning it
  2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
                   ` (7 preceding siblings ...)
  2021-01-06 17:41 ` [PATCH 8/9] xfs: remove xfs_quiesce_attr() Brian Foster
@ 2021-01-06 17:41 ` Brian Foster
  2021-01-07 19:39   ` Darrick J. Wong
  8 siblings, 1 reply; 36+ messages in thread
From: Brian Foster @ 2021-01-06 17:41 UTC (permalink / raw)
  To: linux-xfs

Filesystem freeze cleans the log and immediately redirties it so log
recovery runs if a crash occurs after the filesystem is frozen. Now
that log quiesce covers the log, there is no need to clean the log and
redirty it to trigger log recovery because covering has the same
effect. Update xfs_fs_freeze() to quiesce (and thus cover) the log.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_super.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index aedf622d221b..aed74a3fc787 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -889,8 +889,7 @@ xfs_fs_freeze(
 	flags = memalloc_nofs_save();
 	xfs_stop_block_reaping(mp);
 	xfs_save_resvblks(mp);
-	xfs_log_clean(mp);
-	ret = xfs_sync_sb(mp, true);
+	ret = xfs_log_quiesce(mp);
 	memalloc_nofs_restore(flags);
 	return ret;
 }
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts
  2021-01-06 17:41 ` [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts Brian Foster
@ 2021-01-06 22:50   ` Allison Henderson
  2021-01-07 19:06   ` Darrick J. Wong
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 36+ messages in thread
From: Allison Henderson @ 2021-01-06 22:50 UTC (permalink / raw)
  To: Brian Foster, linux-xfs



On 1/6/21 10:41 AM, Brian Foster wrote:
> xfs_log_sbcount() syncs the superblock specifically to accumulate
> the in-core percpu superblock counters and commit them to disk. This
> is required to maintain filesystem consistency across quiesce
> (freeze, read-only mount/remount) or unmount when lazy superblock
> accounting is enabled because individual transactions do not update
> the superblock directly.
> 
> This mechanism works as expected for writable mounts, but
> xfs_log_sbcount() skips the update for read-only mounts. Read-only
> mounts otherwise still allow log recovery and write out an unmount
> record during log quiesce. If a read-only mount performs log
> recovery, it can modify the in-core superblock counters and write an
> unmount record when the filesystem unmounts without ever syncing the
> in-core counters. This leaves the filesystem with a clean log but in
> an inconsistent state with regard to lazy sb counters.
> 
> Update xfs_log_sbcount() to use the same logic
> xfs_log_unmount_write() uses to determine when to write an unmount
> record. We can drop the freeze state check because the update is
> already allowed during the freezing process and no context calls
> this function on an already frozen fs. This ensures that lazy
> accounting is always synced before the log is cleaned. Refactor this
> logic into a new helper to distinguish between a writable filesystem
> and a writable log. Specifically, the log is writable unless the
> filesystem is mounted with the norecovery mount option, the
> underlying log device is read-only, or the filesystem is shutdown.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> Reviewed-by: Gao Xiang <hsiangkao@redhat.com>

Ok makes sense
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>   fs/xfs/xfs_log.c   | 28 ++++++++++++++++++++--------
>   fs/xfs/xfs_log.h   |  1 +
>   fs/xfs/xfs_mount.c |  3 +--
>   3 files changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index fa2d05e65ff1..b445e63cbc3c 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -347,6 +347,25 @@ xlog_tic_add_region(xlog_ticket_t *tic, uint len, uint type)
>   	tic->t_res_num++;
>   }
>   
> +bool
> +xfs_log_writable(
> +	struct xfs_mount	*mp)
> +{
> +	/*
> +	 * Never write to the log on norecovery mounts, if the block device is
> +	 * read-only, or if the filesystem is shutdown. Read-only mounts still
> +	 * allow internal writes for log recovery and unmount purposes, so don't
> +	 * restrict that case here.
> +	 */
> +	if (mp->m_flags & XFS_MOUNT_NORECOVERY)
> +		return false;
> +	if (xfs_readonly_buftarg(mp->m_log->l_targ))
> +		return false;
> +	if (XFS_FORCED_SHUTDOWN(mp))
> +		return false;
> +	return true;
> +}
> +
>   /*
>    * Replenish the byte reservation required by moving the grant write head.
>    */
> @@ -886,15 +905,8 @@ xfs_log_unmount_write(
>   {
>   	struct xlog		*log = mp->m_log;
>   
> -	/*
> -	 * Don't write out unmount record on norecovery mounts or ro devices.
> -	 * Or, if we are doing a forced umount (typically because of IO errors).
> -	 */
> -	if (mp->m_flags & XFS_MOUNT_NORECOVERY ||
> -	    xfs_readonly_buftarg(log->l_targ)) {
> -		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
> +	if (!xfs_log_writable(mp))
>   		return;
> -	}
>   
>   	xfs_log_force(mp, XFS_LOG_SYNC);
>   
> diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> index 58c3fcbec94a..98c913da7587 100644
> --- a/fs/xfs/xfs_log.h
> +++ b/fs/xfs/xfs_log.h
> @@ -127,6 +127,7 @@ int	  xfs_log_reserve(struct xfs_mount *mp,
>   int	  xfs_log_regrant(struct xfs_mount *mp, struct xlog_ticket *tic);
>   void      xfs_log_unmount(struct xfs_mount *mp);
>   int	  xfs_log_force_umount(struct xfs_mount *mp, int logerror);
> +bool	xfs_log_writable(struct xfs_mount *mp);
>   
>   struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
>   void	  xfs_log_ticket_put(struct xlog_ticket *ticket);
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 7110507a2b6b..a62b8a574409 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1176,8 +1176,7 @@ xfs_fs_writable(
>   int
>   xfs_log_sbcount(xfs_mount_t *mp)
>   {
> -	/* allow this to proceed during the freeze sequence... */
> -	if (!xfs_fs_writable(mp, SB_FREEZE_COMPLETE))
> +	if (!xfs_log_writable(mp))
>   		return 0;
>   
>   	/*
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/9] xfs: lift writable fs check up into log worker task
  2021-01-06 17:41 ` [PATCH 2/9] xfs: lift writable fs check up into log worker task Brian Foster
@ 2021-01-06 22:50   ` Allison Henderson
  2021-01-07 18:34   ` Darrick J. Wong
  2021-01-13 15:24   ` Christoph Hellwig
  2 siblings, 0 replies; 36+ messages in thread
From: Allison Henderson @ 2021-01-06 22:50 UTC (permalink / raw)
  To: Brian Foster, linux-xfs



On 1/6/21 10:41 AM, Brian Foster wrote:
> The log covering helper checks whether the filesystem is writable to
> determine whether to cover the log. The helper is currently only
> called from the background log worker. In preparation to reuse the
> helper from freezing contexts, lift the check into xfs_log_worker().
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Looks straight forward
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>   fs/xfs/xfs_log.c | 12 +++++-------
>   1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index b445e63cbc3c..4137ed007111 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -1050,13 +1050,11 @@ xfs_log_space_wake(
>    * can't start trying to idle the log until both the CIL and AIL are empty.
>    */
>   static int
> -xfs_log_need_covered(xfs_mount_t *mp)
> +xfs_log_need_covered(
> +	struct xfs_mount	*mp)
>   {
> -	struct xlog	*log = mp->m_log;
> -	int		needed = 0;
> -
> -	if (!xfs_fs_writable(mp, SB_FREEZE_WRITE))
> -		return 0;
> +	struct xlog		*log = mp->m_log;
> +	int			needed = 0;
>   
>   	if (!xlog_cil_empty(log))
>   		return 0;
> @@ -1271,7 +1269,7 @@ xfs_log_worker(
>   	struct xfs_mount	*mp = log->l_mp;
>   
>   	/* dgc: errors ignored - not fatal and nowhere to report them */
> -	if (xfs_log_need_covered(mp)) {
> +	if (xfs_fs_writable(mp, SB_FREEZE_WRITE) && xfs_log_need_covered(mp)) {
>   		/*
>   		 * Dump a transaction into the log that contains no real change.
>   		 * This is needed to stamp the current tail LSN into the log
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] xfs: separate log cleaning from log quiesce
  2021-01-06 17:41 ` [PATCH 3/9] xfs: separate log cleaning from log quiesce Brian Foster
@ 2021-01-06 22:50   ` Allison Henderson
  2021-01-07 19:07   ` Darrick J. Wong
  2021-01-13 15:30   ` Christoph Hellwig
  2 siblings, 0 replies; 36+ messages in thread
From: Allison Henderson @ 2021-01-06 22:50 UTC (permalink / raw)
  To: Brian Foster, linux-xfs



On 1/6/21 10:41 AM, Brian Foster wrote:
> Log quiesce is currently associated with cleaning the log, which is
> accomplished by writing an unmount record as the last step of the
> quiesce sequence. The quiesce codepath is a bit convoluted in this
> regard due to how it is reused from various contexts. In preparation
> to create separate log cleaning and log covering interfaces, lift
> the write of the unmount record into a new cleaning helper and call
> that wherever xfs_log_quiesce() is currently invoked. No functional
> changes.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
Looks ok to me
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>   fs/xfs/xfs_log.c   | 8 +++++++-
>   fs/xfs/xfs_log.h   | 1 +
>   fs/xfs/xfs_super.c | 2 +-
>   3 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 4137ed007111..1b3227a033ad 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -957,7 +957,13 @@ xfs_log_quiesce(
>   	xfs_wait_buftarg(mp->m_ddev_targp);
>   	xfs_buf_lock(mp->m_sb_bp);
>   	xfs_buf_unlock(mp->m_sb_bp);
> +}
>   
> +void
> +xfs_log_clean(
> +	struct xfs_mount	*mp)
> +{
> +	xfs_log_quiesce(mp);
>   	xfs_log_unmount_write(mp);
>   }
>   
> @@ -972,7 +978,7 @@ void
>   xfs_log_unmount(
>   	struct xfs_mount	*mp)
>   {
> -	xfs_log_quiesce(mp);
> +	xfs_log_clean(mp);
>   
>   	xfs_trans_ail_destroy(mp);
>   
> diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> index 98c913da7587..b0400589f824 100644
> --- a/fs/xfs/xfs_log.h
> +++ b/fs/xfs/xfs_log.h
> @@ -139,6 +139,7 @@ bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
>   
>   void	xfs_log_work_queue(struct xfs_mount *mp);
>   void	xfs_log_quiesce(struct xfs_mount *mp);
> +void	xfs_log_clean(struct xfs_mount *mp);
>   bool	xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t);
>   bool	xfs_log_in_recovery(struct xfs_mount *);
>   
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 813be879a5e5..09d956e30fd8 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -897,7 +897,7 @@ xfs_quiesce_attr(
>   	if (error)
>   		xfs_warn(mp, "xfs_attr_quiesce: failed to log sb changes. "
>   				"Frozen image may not be consistent.");
> -	xfs_log_quiesce(mp);
> +	xfs_log_clean(mp);
>   }
>   
>   /*
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/9] xfs: lift writable fs check up into log worker task
  2021-01-06 17:41 ` [PATCH 2/9] xfs: lift writable fs check up into log worker task Brian Foster
  2021-01-06 22:50   ` Allison Henderson
@ 2021-01-07 18:34   ` Darrick J. Wong
  2021-01-07 19:53     ` Brian Foster
  2021-01-13 15:24   ` Christoph Hellwig
  2 siblings, 1 reply; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 18:34 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:20PM -0500, Brian Foster wrote:
> The log covering helper checks whether the filesystem is writable to
> determine whether to cover the log. The helper is currently only
> called from the background log worker. In preparation to reuse the
> helper from freezing contexts, lift the check into xfs_log_worker().
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_log.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index b445e63cbc3c..4137ed007111 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -1050,13 +1050,11 @@ xfs_log_space_wake(
>   * can't start trying to idle the log until both the CIL and AIL are empty.
>   */
>  static int

I think this is a predicate, right?  Should this function return a bool
instead of an int?

This function always confuses me slightly since it pushes us through the
covering state machine, and (I think) assumes that someone will force
the CIL and push the AIL if it returns zero. :)

To check my thinking further-- back in that thread I started about
setting and clearing log incompat flags, I think Dave was pushing me to
clear the log incompat flags just before we call xfs_sync_sb when the
log is in NEED2 state, right?

AFAICT the net effect of this series is to reorder the log code so that
xfs_log_quiesce covers the log (force cil, push ail, log two
transactions containing only the superblock), and adds an xfs_log_clean
that quiesces the log and then writes an unmount record after that.

Two callers whose behavior does not change with this series are: 1) The
log worker quiesces the log when it's idle; and 2) unmount quiesces the
log and then writes an unmount record so that the next mount knows it
can skip replay entirely.

The big difference is 3) freeze now only covers the log, whereas before
it would cover, write an unmount record, and immediately redirty the log
to force replay of the snapshot, right?

Assuming I understood all that, my next question is: Eric Sandeen was
working on a patchset to process unlinked inodes unconditionally on
mount so that frozen fs images can be written out with the unmount
record (because I guess people make ro snapshots of live fs images and
then balk when they have to make the snapshot rw to run log recovery.
Any thoughts about /that/? :)

--D

> -xfs_log_need_covered(xfs_mount_t *mp)
> +xfs_log_need_covered(
> +	struct xfs_mount	*mp)
>  {
> -	struct xlog	*log = mp->m_log;
> -	int		needed = 0;
> -
> -	if (!xfs_fs_writable(mp, SB_FREEZE_WRITE))
> -		return 0;
> +	struct xlog		*log = mp->m_log;
> +	int			needed = 0;
>  
>  	if (!xlog_cil_empty(log))
>  		return 0;
> @@ -1271,7 +1269,7 @@ xfs_log_worker(
>  	struct xfs_mount	*mp = log->l_mp;
>  
>  	/* dgc: errors ignored - not fatal and nowhere to report them */
> -	if (xfs_log_need_covered(mp)) {
> +	if (xfs_fs_writable(mp, SB_FREEZE_WRITE) && xfs_log_need_covered(mp)) {
>  		/*
>  		 * Dump a transaction into the log that contains no real change.
>  		 * This is needed to stamp the current tail LSN into the log
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/9] xfs: cover the log during log quiesce
  2021-01-06 17:41 ` [PATCH 4/9] xfs: cover the log during " Brian Foster
@ 2021-01-07 19:04   ` Darrick J. Wong
  2021-01-07 19:53     ` Brian Foster
  2021-01-19 15:35   ` Christoph Hellwig
  1 sibling, 1 reply; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 19:04 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:22PM -0500, Brian Foster wrote:
> The log quiesce mechanism historically terminates by marking the log
> clean with an unmount record. The primary objective is to indicate
> that log recovery is no longer required after the quiesce has
> flushed all in-core changes and written back filesystem metadata.
> While this is perfectly fine, it is somewhat hacky as currently used
> in certain contexts. For example, filesystem freeze quiesces (i.e.
> cleans) the log and immediately redirties it with a dummy superblock
> transaction to ensure that log recovery runs in the event of a
> crash.
> 
> While this functions correctly, cleaning the log from freeze context
> is clearly superfluous given the current redirtying behavior.
> Instead, the desired behavior can be achieved by simply covering the
> log. This effectively retires all on-disk log items from the active
> range of the log by issuing two synchronous and sequential dummy
> superblock update transactions that serve to update the on-disk log
> head and tail. The subtle difference is that the log technically
> remains dirty due to the lack of an unmount record, though recovery
> is effectively a no-op due to the content of the checkpoints being
> clean (i.e. the unmodified on-disk superblock).
> 
> Log covering currently runs in the background and only triggers once
> the filesystem and log has idled. The purpose of the background
> mechanism is to prevent log recovery from replaying the most
> recently logged items long after those items may have been written
> back. In the quiesce path, the log has been deliberately idled by
> forcing the log and pushing the AIL until empty in a context where
> no further mutable filesystem operations are allowed. Therefore, we
> can cover the log as the final step in the log quiesce codepath to
> reflect that all previously active items have been successfully
> written back.
> 
> This facilitates selective log covering from certain contexts (i.e.
> freeze) that only seek to quiesce, but not necessarily clean the
> log. Note that as a side effect of this change, log covering now
> occurs when cleaning the log as well. This is harmless, facilitates
> subsequent cleanups, and is mostly temporary as various operations
> switch to use explicit log covering.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_log.c | 49 +++++++++++++++++++++++++++++++++++++++++++++---
>  fs/xfs/xfs_log.h |  2 +-
>  2 files changed, 47 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 1b3227a033ad..f7b23044723d 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -91,6 +91,9 @@ STATIC int
>  xlog_iclogs_empty(
>  	struct xlog		*log);
>  
> +static int
> +xfs_log_cover(struct xfs_mount *);
> +
>  static void
>  xlog_grant_sub_space(
>  	struct xlog		*log,
> @@ -936,10 +939,9 @@ xfs_log_unmount_write(
>   * To do this, we first need to shut down the background log work so it is not
>   * trying to cover the log as we clean up. We then need to unpin all objects in
>   * the log so we can then flush them out. Once they have completed their IO and
> - * run the callbacks removing themselves from the AIL, we can write the unmount
> - * record.
> + * run the callbacks removing themselves from the AIL, we can cover the log.
>   */
> -void
> +int
>  xfs_log_quiesce(
>  	struct xfs_mount	*mp)
>  {
> @@ -957,6 +959,8 @@ xfs_log_quiesce(
>  	xfs_wait_buftarg(mp->m_ddev_targp);
>  	xfs_buf_lock(mp->m_sb_bp);
>  	xfs_buf_unlock(mp->m_sb_bp);
> +
> +	return xfs_log_cover(mp);
>  }
>  
>  void
> @@ -1092,6 +1096,45 @@ xfs_log_need_covered(
>  	return needed;
>  }
>  
> +/*
> + * Explicitly cover the log. This is similar to background log covering but
> + * intended for usage in quiesce codepaths. The caller is responsible to ensure
> + * the log is idle and suitable for covering. The CIL, iclog buffers and AIL
> + * must all be empty.
> + */
> +static int
> +xfs_log_cover(
> +	struct xfs_mount	*mp)
> +{
> +	struct xlog		*log = mp->m_log;
> +	int			error = 0;
> +
> +	ASSERT((xlog_cil_empty(log) && xlog_iclogs_empty(log) &&
> +	        !xfs_ail_min_lsn(log->l_ailp)) ||
> +	       XFS_FORCED_SHUTDOWN(mp));
> +
> +	if (!xfs_log_writable(mp))
> +		return 0;
> +
> +	/*
> +	 * To cover the log, commit the superblock twice (at most) in
> +	 * independent checkpoints. The first serves as a reference for the
> +	 * tail pointer. The sync transaction and AIL push empties the AIL and
> +	 * updates the in-core tail to the LSN of the first checkpoint. The
> +	 * second commit updates the on-disk tail with the in-core LSN,
> +	 * covering the log. Push the AIL one more time to leave it empty, as
> +	 * we found it.
> +	 */

Hm.  At first I looked at _need_covered and wondered how this could work
properly if we are in state DONE or DONE2, because this not-quite
predicate returns zero in that case.

I think it's the case that the only way the log can end up in DONE state
is if the background log worker had previously been in NEED, written the
first of the dummy transactions, moved the state to DONE, and waited for
xlog_covered_state to move the log from DONE to NEED2.  Similarly, the
log can only be in DONE2 state if the background worker wrote the second
dummy and is now waiting for xlog_covered_state to move the log from
DONE2 to IDLE.

Since xfs_log_quiesce cancelled the log worker and waited for it to
finish before calling xfs_log_cover, the covering state here can only be
IDLE, NEED, or NEED2, right?  And hence the while loop pushes the log to
IDLE no matter where it is now, right?

(I also wondered why this isn't a do-while loop but patch 6 addresses
that.)

--D

> +	while (xfs_log_need_covered(mp)) {
> +		error = xfs_sync_sb(mp, true);
> +		if (error)
> +			break;
> +		xfs_ail_push_all_sync(mp->m_ail);
> +	}
> +
> +	return error;
> +}
> +
>  /*
>   * We may be holding the log iclog lock upon entering this routine.
>   */
> diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> index b0400589f824..044e02cb8921 100644
> --- a/fs/xfs/xfs_log.h
> +++ b/fs/xfs/xfs_log.h
> @@ -138,7 +138,7 @@ void	xlog_cil_process_committed(struct list_head *list);
>  bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
>  
>  void	xfs_log_work_queue(struct xfs_mount *mp);
> -void	xfs_log_quiesce(struct xfs_mount *mp);
> +int	xfs_log_quiesce(struct xfs_mount *mp);
>  void	xfs_log_clean(struct xfs_mount *mp);
>  bool	xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t);
>  bool	xfs_log_in_recovery(struct xfs_mount *);
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts
  2021-01-06 17:41 ` [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts Brian Foster
  2021-01-06 22:50   ` Allison Henderson
@ 2021-01-07 19:06   ` Darrick J. Wong
  2021-01-11 17:38   ` Christoph Hellwig
  2021-01-21 15:08   ` Bill O'Donnell
  3 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 19:06 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:19PM -0500, Brian Foster wrote:
> xfs_log_sbcount() syncs the superblock specifically to accumulate
> the in-core percpu superblock counters and commit them to disk. This
> is required to maintain filesystem consistency across quiesce
> (freeze, read-only mount/remount) or unmount when lazy superblock
> accounting is enabled because individual transactions do not update
> the superblock directly.
> 
> This mechanism works as expected for writable mounts, but
> xfs_log_sbcount() skips the update for read-only mounts. Read-only
> mounts otherwise still allow log recovery and write out an unmount
> record during log quiesce. If a read-only mount performs log
> recovery, it can modify the in-core superblock counters and write an
> unmount record when the filesystem unmounts without ever syncing the
> in-core counters. This leaves the filesystem with a clean log but in
> an inconsistent state with regard to lazy sb counters.
> 
> Update xfs_log_sbcount() to use the same logic
> xfs_log_unmount_write() uses to determine when to write an unmount
> record. We can drop the freeze state check because the update is
> already allowed during the freezing process and no context calls
> this function on an already frozen fs. This ensures that lazy
> accounting is always synced before the log is cleaned. Refactor this
> logic into a new helper to distinguish between a writable filesystem
> and a writable log. Specifically, the log is writable unless the
> filesystem is mounted with the norecovery mount option, the
> underlying log device is read-only, or the filesystem is shutdown.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> Reviewed-by: Gao Xiang <hsiangkao@redhat.com>

Looks good to me,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_log.c   | 28 ++++++++++++++++++++--------
>  fs/xfs/xfs_log.h   |  1 +
>  fs/xfs/xfs_mount.c |  3 +--
>  3 files changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index fa2d05e65ff1..b445e63cbc3c 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -347,6 +347,25 @@ xlog_tic_add_region(xlog_ticket_t *tic, uint len, uint type)
>  	tic->t_res_num++;
>  }
>  
> +bool
> +xfs_log_writable(
> +	struct xfs_mount	*mp)
> +{
> +	/*
> +	 * Never write to the log on norecovery mounts, if the block device is
> +	 * read-only, or if the filesystem is shutdown. Read-only mounts still
> +	 * allow internal writes for log recovery and unmount purposes, so don't
> +	 * restrict that case here.
> +	 */
> +	if (mp->m_flags & XFS_MOUNT_NORECOVERY)
> +		return false;
> +	if (xfs_readonly_buftarg(mp->m_log->l_targ))
> +		return false;
> +	if (XFS_FORCED_SHUTDOWN(mp))
> +		return false;
> +	return true;
> +}
> +
>  /*
>   * Replenish the byte reservation required by moving the grant write head.
>   */
> @@ -886,15 +905,8 @@ xfs_log_unmount_write(
>  {
>  	struct xlog		*log = mp->m_log;
>  
> -	/*
> -	 * Don't write out unmount record on norecovery mounts or ro devices.
> -	 * Or, if we are doing a forced umount (typically because of IO errors).
> -	 */
> -	if (mp->m_flags & XFS_MOUNT_NORECOVERY ||
> -	    xfs_readonly_buftarg(log->l_targ)) {
> -		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
> +	if (!xfs_log_writable(mp))
>  		return;
> -	}
>  
>  	xfs_log_force(mp, XFS_LOG_SYNC);
>  
> diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> index 58c3fcbec94a..98c913da7587 100644
> --- a/fs/xfs/xfs_log.h
> +++ b/fs/xfs/xfs_log.h
> @@ -127,6 +127,7 @@ int	  xfs_log_reserve(struct xfs_mount *mp,
>  int	  xfs_log_regrant(struct xfs_mount *mp, struct xlog_ticket *tic);
>  void      xfs_log_unmount(struct xfs_mount *mp);
>  int	  xfs_log_force_umount(struct xfs_mount *mp, int logerror);
> +bool	xfs_log_writable(struct xfs_mount *mp);
>  
>  struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
>  void	  xfs_log_ticket_put(struct xlog_ticket *ticket);
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 7110507a2b6b..a62b8a574409 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1176,8 +1176,7 @@ xfs_fs_writable(
>  int
>  xfs_log_sbcount(xfs_mount_t *mp)
>  {
> -	/* allow this to proceed during the freeze sequence... */
> -	if (!xfs_fs_writable(mp, SB_FREEZE_COMPLETE))
> +	if (!xfs_log_writable(mp))
>  		return 0;
>  
>  	/*
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] xfs: separate log cleaning from log quiesce
  2021-01-06 17:41 ` [PATCH 3/9] xfs: separate log cleaning from log quiesce Brian Foster
  2021-01-06 22:50   ` Allison Henderson
@ 2021-01-07 19:07   ` Darrick J. Wong
  2021-01-13 15:30   ` Christoph Hellwig
  2 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 19:07 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:21PM -0500, Brian Foster wrote:
> Log quiesce is currently associated with cleaning the log, which is
> accomplished by writing an unmount record as the last step of the
> quiesce sequence. The quiesce codepath is a bit convoluted in this
> regard due to how it is reused from various contexts. In preparation
> to create separate log cleaning and log covering interfaces, lift
> the write of the unmount record into a new cleaning helper and call
> that wherever xfs_log_quiesce() is currently invoked. No functional
> changes.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Assuming the stuff I rambled about in my reply to patch 2 wasn't a
total braino,

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_log.c   | 8 +++++++-
>  fs/xfs/xfs_log.h   | 1 +
>  fs/xfs/xfs_super.c | 2 +-
>  3 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 4137ed007111..1b3227a033ad 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -957,7 +957,13 @@ xfs_log_quiesce(
>  	xfs_wait_buftarg(mp->m_ddev_targp);
>  	xfs_buf_lock(mp->m_sb_bp);
>  	xfs_buf_unlock(mp->m_sb_bp);
> +}
>  
> +void
> +xfs_log_clean(
> +	struct xfs_mount	*mp)
> +{
> +	xfs_log_quiesce(mp);
>  	xfs_log_unmount_write(mp);
>  }
>  
> @@ -972,7 +978,7 @@ void
>  xfs_log_unmount(
>  	struct xfs_mount	*mp)
>  {
> -	xfs_log_quiesce(mp);
> +	xfs_log_clean(mp);
>  
>  	xfs_trans_ail_destroy(mp);
>  
> diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> index 98c913da7587..b0400589f824 100644
> --- a/fs/xfs/xfs_log.h
> +++ b/fs/xfs/xfs_log.h
> @@ -139,6 +139,7 @@ bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
>  
>  void	xfs_log_work_queue(struct xfs_mount *mp);
>  void	xfs_log_quiesce(struct xfs_mount *mp);
> +void	xfs_log_clean(struct xfs_mount *mp);
>  bool	xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t);
>  bool	xfs_log_in_recovery(struct xfs_mount *);
>  
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 813be879a5e5..09d956e30fd8 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -897,7 +897,7 @@ xfs_quiesce_attr(
>  	if (error)
>  		xfs_warn(mp, "xfs_attr_quiesce: failed to log sb changes. "
>  				"Frozen image may not be consistent.");
> -	xfs_log_quiesce(mp);
> +	xfs_log_clean(mp);
>  }
>  
>  /*
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/9] xfs: don't reset log idle state on covering checkpoints
  2021-01-06 17:41 ` [PATCH 5/9] xfs: don't reset log idle state on covering checkpoints Brian Foster
@ 2021-01-07 19:30   ` Darrick J. Wong
  2021-01-07 20:01     ` Brian Foster
  0 siblings, 1 reply; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 19:30 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:23PM -0500, Brian Foster wrote:
> Now that log covering occurs on quiesce, we'd like to reuse the
> underlying superblock sync for final superblock updates. This
> includes things like lazy superblock counter updates, log feature
> incompat bits in the future, etc. One quirk to this approach is that
> once the log is in the IDLE (i.e. already covered) state, any
> subsequent log write resets the state back to NEED. This means that
> a final superblock sync to an already covered log requires two more
> sb syncs to return the log back to IDLE again.
> 
> For example, if a lazy superblock enabled filesystem is mount cycled
> without any modifications, the unmount path syncs the superblock
> once and writes an unmount record. With the desired log quiesce
> covering behavior, we sync the superblock three times at unmount
> time: once for the lazy superblock counter update and twice more to
> cover the log. By contrast, if the log is active or only partially
> covered at unmount time, a final superblock sync would doubly serve
> as the one or two remaining syncs required to cover the log.
> 
> This duplicate covering sequence is unnecessary because the
> filesystem remains consistent if a crash occurs at any point. The
> superblock will either be recovered in the event of a crash or
> written back before the log is quiesced and potentially cleaned with
> an unmount record.
> 
> Update the log covering state machine to remain in the IDLE state if
> additional covering checkpoints pass through the log. This
> facilitates final superblock updates (such as lazy superblock
> counters) via a single sb sync without losing covered status. This
> provides some consistency with the active and partially covered
> cases and also avoids harmless, but spurious checkpoints when
> quiescing the log.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

I /think/ the premise of this is ok.

I found myself wondering if xlog_state_activate_iclog could mistake an
iclog containing 5 log ops from a real update for an iclog containing
just the dummy transaction, since a synchronous inode mtime update
transaction can also produce an iclog with 5 ops.  I /think/ that
doesn't matter because xlog_covered_state only cares about the value of
iclogs_changed if the log worker previously set the log state to DONE or
DONE2, and iclogs_changed won't be 1 here if there were multiple dirty
iclogs or if the sole dirty iclog contains more than just the dummy.

If I got that right,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_log.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index f7b23044723d..9b8564f280b7 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -2597,12 +2597,15 @@ xlog_covered_state(
>  	int			iclogs_changed)
>  {
>  	/*
> -	 * We usually go to NEED. But we go to NEED2 if the changed indicates we
> -	 * are done writing the dummy record.  If we are done with the second
> -	 * dummy recored (DONE2), then we go to IDLE.
> +	 * We go to NEED for any non-covering writes. We go to NEED2 if we just
> +	 * wrote the first covering record (DONE). We go to IDLE if we just
> +	 * wrote the second covering record (DONE2) and remain in IDLE until a
> +	 * non-covering write occurs.
>  	 */
>  	switch (prev_state) {
>  	case XLOG_STATE_COVER_IDLE:
> +		if (iclogs_changed == 1)
> +			return XLOG_STATE_COVER_IDLE;
>  	case XLOG_STATE_COVER_NEED:
>  	case XLOG_STATE_COVER_NEED2:
>  		break;
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 6/9] xfs: fold sbcount quiesce logging into log covering
  2021-01-06 17:41 ` [PATCH 6/9] xfs: fold sbcount quiesce logging into log covering Brian Foster
@ 2021-01-07 19:31   ` Darrick J. Wong
  0 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 19:31 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:24PM -0500, Brian Foster wrote:
> xfs_log_sbcount() calls xfs_sync_sb() to sync superblock counters to
> disk when lazy superblock accounting is enabled. This occurs on
> unmount, freeze, and read-only (re)mount and ensures the final
> values are calculated and persisted to disk before each form of
> quiesce completes.
> 
> Now that log covering occurs in all of these contexts and uses the
> same xfs_sync_sb() mechanism to update log state, there is no need
> to log the superblock separately for any reason. Update the log
> quiesce path to sync the superblock at least once for any mount
> where lazy superblock accounting is enabled. If the log is already
> covered, it will remain in the covered state. Otherwise, the next
> sync as part of the normal covering sequence will carry the
> associated superblock update with it. Remove xfs_log_sbcount() now
> that it is no longer needed.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

If all my previous ramblings were correct,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_log.c   | 20 ++++++++++++++++++--
>  fs/xfs/xfs_mount.c | 31 -------------------------------
>  fs/xfs/xfs_mount.h |  1 -
>  fs/xfs/xfs_super.c |  8 --------
>  4 files changed, 18 insertions(+), 42 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 9b8564f280b7..83e2697d4e38 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -1108,6 +1108,7 @@ xfs_log_cover(
>  {
>  	struct xlog		*log = mp->m_log;
>  	int			error = 0;
> +	bool			need_covered;
>  
>  	ASSERT((xlog_cil_empty(log) && xlog_iclogs_empty(log) &&
>  	        !xfs_ail_min_lsn(log->l_ailp)) ||
> @@ -1116,6 +1117,21 @@ xfs_log_cover(
>  	if (!xfs_log_writable(mp))
>  		return 0;
>  
> +	/*
> +	 * xfs_log_need_covered() is not idempotent because it progresses the
> +	 * state machine if the log requires covering. Therefore, we must call
> +	 * this function once and use the result until we've issued an sb sync.
> +	 * Do so first to make that abundantly clear.
> +	 *
> +	 * Fall into the covering sequence if the log needs covering or the
> +	 * mount has lazy superblock accounting to sync to disk. The sb sync
> +	 * used for covering accumulates the in-core counters, so covering
> +	 * handles this for us.
> +	 */
> +	need_covered = xfs_log_need_covered(mp);
> +	if (!need_covered && !xfs_sb_version_haslazysbcount(&mp->m_sb))
> +		return 0;
> +
>  	/*
>  	 * To cover the log, commit the superblock twice (at most) in
>  	 * independent checkpoints. The first serves as a reference for the
> @@ -1125,12 +1141,12 @@ xfs_log_cover(
>  	 * covering the log. Push the AIL one more time to leave it empty, as
>  	 * we found it.
>  	 */
> -	while (xfs_log_need_covered(mp)) {
> +	do {
>  		error = xfs_sync_sb(mp, true);
>  		if (error)
>  			break;
>  		xfs_ail_push_all_sync(mp->m_ail);
> -	}
> +	} while (xfs_log_need_covered(mp));
>  
>  	return error;
>  }
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index a62b8a574409..f97b82d0e30f 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1124,12 +1124,6 @@ xfs_unmountfs(
>  		xfs_warn(mp, "Unable to free reserved block pool. "
>  				"Freespace may not be correct on next mount.");
>  
> -	error = xfs_log_sbcount(mp);
> -	if (error)
> -		xfs_warn(mp, "Unable to update superblock counters. "
> -				"Freespace may not be correct on next mount.");
> -
> -
>  	xfs_log_unmount(mp);
>  	xfs_da_unmount(mp);
>  	xfs_uuid_unmount(mp);
> @@ -1164,31 +1158,6 @@ xfs_fs_writable(
>  	return true;
>  }
>  
> -/*
> - * xfs_log_sbcount
> - *
> - * Sync the superblock counters to disk.
> - *
> - * Note this code can be called during the process of freezing, so we use the
> - * transaction allocator that does not block when the transaction subsystem is
> - * in its frozen state.
> - */
> -int
> -xfs_log_sbcount(xfs_mount_t *mp)
> -{
> -	if (!xfs_log_writable(mp))
> -		return 0;
> -
> -	/*
> -	 * we don't need to do this if we are updating the superblock
> -	 * counters on every modification.
> -	 */
> -	if (!xfs_sb_version_haslazysbcount(&mp->m_sb))
> -		return 0;
> -
> -	return xfs_sync_sb(mp, true);
> -}
> -
>  /*
>   * Deltas for the block count can vary from 1 to very large, but lock contention
>   * only occurs on frequent small block count updates such as in the delayed
> diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> index dfa429b77ee2..452ca7654dc5 100644
> --- a/fs/xfs/xfs_mount.h
> +++ b/fs/xfs/xfs_mount.h
> @@ -399,7 +399,6 @@ int xfs_buf_hash_init(xfs_perag_t *pag);
>  void xfs_buf_hash_destroy(xfs_perag_t *pag);
>  
>  extern void	xfs_uuid_table_free(void);
> -extern int	xfs_log_sbcount(xfs_mount_t *);
>  extern uint64_t xfs_default_resblks(xfs_mount_t *mp);
>  extern int	xfs_mountfs(xfs_mount_t *mp);
>  extern int	xfs_initialize_perag(xfs_mount_t *mp, xfs_agnumber_t agcount,
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 09d956e30fd8..75ada867c665 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -884,19 +884,11 @@ void
>  xfs_quiesce_attr(
>  	struct xfs_mount	*mp)
>  {
> -	int	error = 0;
> -
>  	cancel_delayed_work_sync(&mp->m_log->l_work);
>  
>  	/* force the log to unpin objects from the now complete transactions */
>  	xfs_log_force(mp, XFS_LOG_SYNC);
>  
> -
> -	/* Push the superblock and write an unmount record */
> -	error = xfs_log_sbcount(mp);
> -	if (error)
> -		xfs_warn(mp, "xfs_attr_quiesce: failed to log sb changes. "
> -				"Frozen image may not be consistent.");
>  	xfs_log_clean(mp);
>  }
>  
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/9] xfs: remove duplicate wq cancel and log force from attr quiesce
  2021-01-06 17:41 ` [PATCH 7/9] xfs: remove duplicate wq cancel and log force from attr quiesce Brian Foster
@ 2021-01-07 19:38   ` Darrick J. Wong
  0 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 19:38 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:25PM -0500, Brian Foster wrote:
> These two calls are repeated at the beginning of xfs_log_quiesce().
> Drop them from xfs_quiesce_attr().
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Makes sense,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_super.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 75ada867c665..8fc9044131fc 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -884,11 +884,6 @@ void
>  xfs_quiesce_attr(
>  	struct xfs_mount	*mp)
>  {
> -	cancel_delayed_work_sync(&mp->m_log->l_work);
> -
> -	/* force the log to unpin objects from the now complete transactions */
> -	xfs_log_force(mp, XFS_LOG_SYNC);
> -
>  	xfs_log_clean(mp);
>  }
>  
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 8/9] xfs: remove xfs_quiesce_attr()
  2021-01-06 17:41 ` [PATCH 8/9] xfs: remove xfs_quiesce_attr() Brian Foster
@ 2021-01-07 19:39   ` Darrick J. Wong
  0 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 19:39 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:26PM -0500, Brian Foster wrote:
> xfs_quiesce_attr() is now a wrapper for xfs_log_clean(). Remove it
> and call xfs_log_clean() directly.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Yay, I always found "quiesce attr" confusing...

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_mount.c |  2 +-
>  fs/xfs/xfs_super.c | 24 ++----------------------
>  2 files changed, 3 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index f97b82d0e30f..4a26b48b18e4 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -946,7 +946,7 @@ xfs_mountfs(
>  	 */
>  	if ((mp->m_flags & (XFS_MOUNT_RDONLY|XFS_MOUNT_NORECOVERY)) ==
>  							XFS_MOUNT_RDONLY) {
> -		xfs_quiesce_attr(mp);
> +		xfs_log_clean(mp);
>  	}
>  
>  	/*
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 8fc9044131fc..aedf622d221b 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -867,26 +867,6 @@ xfs_restore_resvblks(struct xfs_mount *mp)
>  	xfs_reserve_blocks(mp, &resblks, NULL);
>  }
>  
> -/*
> - * Trigger writeback of all the dirty metadata in the file system.
> - *
> - * This ensures that the metadata is written to their location on disk rather
> - * than just existing in transactions in the log. This means after a quiesce
> - * there is no log replay required to write the inodes to disk - this is the
> - * primary difference between a sync and a quiesce.
> - *
> - * We cancel log work early here to ensure all transactions the log worker may
> - * run have finished before we clean up and log the superblock and write an
> - * unmount record. The unfreeze process is responsible for restarting the log
> - * worker correctly.
> - */
> -void
> -xfs_quiesce_attr(
> -	struct xfs_mount	*mp)
> -{
> -	xfs_log_clean(mp);
> -}
> -
>  /*
>   * Second stage of a freeze. The data is already frozen so we only
>   * need to take care of the metadata. Once that's done sync the superblock
> @@ -909,7 +889,7 @@ xfs_fs_freeze(
>  	flags = memalloc_nofs_save();
>  	xfs_stop_block_reaping(mp);
>  	xfs_save_resvblks(mp);
> -	xfs_quiesce_attr(mp);
> +	xfs_log_clean(mp);
>  	ret = xfs_sync_sb(mp, true);
>  	memalloc_nofs_restore(flags);
>  	return ret;
> @@ -1752,7 +1732,7 @@ xfs_remount_ro(
>  	 */
>  	xfs_save_resvblks(mp);
>  
> -	xfs_quiesce_attr(mp);
> +	xfs_log_clean(mp);
>  	mp->m_flags |= XFS_MOUNT_RDONLY;
>  
>  	return 0;
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 9/9] xfs: cover the log on freeze instead of cleaning it
  2021-01-06 17:41 ` [PATCH 9/9] xfs: cover the log on freeze instead of cleaning it Brian Foster
@ 2021-01-07 19:39   ` Darrick J. Wong
  0 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 19:39 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:27PM -0500, Brian Foster wrote:
> Filesystem freeze cleans the log and immediately redirties it so log
> recovery runs if a crash occurs after the filesystem is frozen. Now
> that log quiesce covers the log, there is no need to clean the log and
> redirty it to trigger log recovery because covering has the same
> effect. Update xfs_fs_freeze() to quiesce (and thus cover) the log.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Woot!
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_super.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index aedf622d221b..aed74a3fc787 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -889,8 +889,7 @@ xfs_fs_freeze(
>  	flags = memalloc_nofs_save();
>  	xfs_stop_block_reaping(mp);
>  	xfs_save_resvblks(mp);
> -	xfs_log_clean(mp);
> -	ret = xfs_sync_sb(mp, true);
> +	ret = xfs_log_quiesce(mp);
>  	memalloc_nofs_restore(flags);
>  	return ret;
>  }
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/9] xfs: lift writable fs check up into log worker task
  2021-01-07 18:34   ` Darrick J. Wong
@ 2021-01-07 19:53     ` Brian Foster
  2021-01-07 21:28       ` Darrick J. Wong
  0 siblings, 1 reply; 36+ messages in thread
From: Brian Foster @ 2021-01-07 19:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Jan 07, 2021 at 10:34:22AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 06, 2021 at 12:41:20PM -0500, Brian Foster wrote:
> > The log covering helper checks whether the filesystem is writable to
> > determine whether to cover the log. The helper is currently only
> > called from the background log worker. In preparation to reuse the
> > helper from freezing contexts, lift the check into xfs_log_worker().
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_log.c | 12 +++++-------
> >  1 file changed, 5 insertions(+), 7 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index b445e63cbc3c..4137ed007111 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -1050,13 +1050,11 @@ xfs_log_space_wake(
> >   * can't start trying to idle the log until both the CIL and AIL are empty.
> >   */
> >  static int
> 
> I think this is a predicate, right?  Should this function return a bool
> instead of an int?
> 

Yes, we could change that to return a bool.

> This function always confuses me slightly since it pushes us through the
> covering state machine, and (I think) assumes that someone will force
> the CIL and push the AIL if it returns zero. :)
> 

It basically assumes that the caller will issue a covering commit
(xfs_sync_sb()) if indicated, and so progresses ->l_covered_state along
in anticipation of that (i.e. NEED -> DONE). The log subsystem side
detects that covering commit and makes further state changes (such as
DONE -> NEED2) for the next time around in the background worker.

> To check my thinking further-- back in that thread I started about
> setting and clearing log incompat flags, I think Dave was pushing me to
> clear the log incompat flags just before we call xfs_sync_sb when the
> log is in NEED2 state, right?
> 

In general, I think so. I don't think it technically has to be NEED2 (as
opposed to NEED || NEED2), but in general the idea is to make any such
final superblock updates in-core just before the quiesce completes and
allow the log covering sequence to commit it for us. This is similar to
how this series handles the lazy superblock counters (with the caveat
that that stuff just happened to already be implemented inside
xfs_sync_sb()).

FWIW, we could also enforce that such final superblock updates reset
covered state of the log to NEED2 if we wanted to. I went back and forth
on that a bit but decided to leave out unnecessary complexity for the
first pass.

> AFAICT the net effect of this series is to reorder the log code so that
> xfs_log_quiesce covers the log (force cil, push ail, log two
> transactions containing only the superblock), and adds an xfs_log_clean
> that quiesces the log and then writes an unmount record after that.
> 

Yep.

> Two callers whose behavior does not change with this series are: 1) The
> log worker quiesces the log when it's idle; and 2) unmount quiesces the
> log and then writes an unmount record so that the next mount knows it
> can skip replay entirely.
> 

Right, though just to be clear, quiesce never covered the log before
this series. It effectively drained the log by forcing the log and
pushing the AIL until empty, but then just wrote the unmount record to
mark it clean...

> The big difference is 3) freeze now only covers the log, whereas before
> it would cover, write an unmount record, and immediately redirty the log
> to force replay of the snapshot, right?
> 

Yes. As above, unmount now also does a log cover -> unmount record
instead of just writing the unmount record. This is harmless because we
end up in the clean state either way, but I've tried to point this out
in the commit logs and whatnot so it's apparent to reviewers. We could
technically make the log cover during quiesce optional with a new
parameter or something, but it just didn't seem worth it once we start
overloading the covering sequence to handle things like lazy sb
accounting (or log incompat bits, etc.).

> Assuming I understood all that, my next question is: Eric Sandeen was
> working on a patchset to process unlinked inodes unconditionally on
> mount so that frozen fs images can be written out with the unmount
> record (because I guess people make ro snapshots of live fs images and
> then balk when they have to make the snapshot rw to run log recovery.
> Any thoughts about /that/? :)
> 

Eric had mentioned that to me as well. I don't quite recall what the
impediment to making that change was the last time around (Eric?), but
my view was that is orthogonal to this series. IOW, the primary
motivations for this series are to clean up the whole xfs_quiesce_attr()
-> xfs_log_quiesce() mess and facilitate the reuse of covering for
things like lazy sb accounting and log incompat bit management. We can
decide whether to quiesce or clean the log on freeze independently and
that's really only a single line tweak to the last patch of the series
(i.e., continue to clean the log and just don't redirty it).

Brian

> --D
> 
> > -xfs_log_need_covered(xfs_mount_t *mp)
> > +xfs_log_need_covered(
> > +	struct xfs_mount	*mp)
> >  {
> > -	struct xlog	*log = mp->m_log;
> > -	int		needed = 0;
> > -
> > -	if (!xfs_fs_writable(mp, SB_FREEZE_WRITE))
> > -		return 0;
> > +	struct xlog		*log = mp->m_log;
> > +	int			needed = 0;
> >  
> >  	if (!xlog_cil_empty(log))
> >  		return 0;
> > @@ -1271,7 +1269,7 @@ xfs_log_worker(
> >  	struct xfs_mount	*mp = log->l_mp;
> >  
> >  	/* dgc: errors ignored - not fatal and nowhere to report them */
> > -	if (xfs_log_need_covered(mp)) {
> > +	if (xfs_fs_writable(mp, SB_FREEZE_WRITE) && xfs_log_need_covered(mp)) {
> >  		/*
> >  		 * Dump a transaction into the log that contains no real change.
> >  		 * This is needed to stamp the current tail LSN into the log
> > -- 
> > 2.26.2
> > 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/9] xfs: cover the log during log quiesce
  2021-01-07 19:04   ` Darrick J. Wong
@ 2021-01-07 19:53     ` Brian Foster
  2021-01-19 17:51       ` Darrick J. Wong
  0 siblings, 1 reply; 36+ messages in thread
From: Brian Foster @ 2021-01-07 19:53 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Jan 07, 2021 at 11:04:08AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 06, 2021 at 12:41:22PM -0500, Brian Foster wrote:
> > The log quiesce mechanism historically terminates by marking the log
> > clean with an unmount record. The primary objective is to indicate
> > that log recovery is no longer required after the quiesce has
> > flushed all in-core changes and written back filesystem metadata.
> > While this is perfectly fine, it is somewhat hacky as currently used
> > in certain contexts. For example, filesystem freeze quiesces (i.e.
> > cleans) the log and immediately redirties it with a dummy superblock
> > transaction to ensure that log recovery runs in the event of a
> > crash.
> > 
> > While this functions correctly, cleaning the log from freeze context
> > is clearly superfluous given the current redirtying behavior.
> > Instead, the desired behavior can be achieved by simply covering the
> > log. This effectively retires all on-disk log items from the active
> > range of the log by issuing two synchronous and sequential dummy
> > superblock update transactions that serve to update the on-disk log
> > head and tail. The subtle difference is that the log technically
> > remains dirty due to the lack of an unmount record, though recovery
> > is effectively a no-op due to the content of the checkpoints being
> > clean (i.e. the unmodified on-disk superblock).
> > 
> > Log covering currently runs in the background and only triggers once
> > the filesystem and log has idled. The purpose of the background
> > mechanism is to prevent log recovery from replaying the most
> > recently logged items long after those items may have been written
> > back. In the quiesce path, the log has been deliberately idled by
> > forcing the log and pushing the AIL until empty in a context where
> > no further mutable filesystem operations are allowed. Therefore, we
> > can cover the log as the final step in the log quiesce codepath to
> > reflect that all previously active items have been successfully
> > written back.
> > 
> > This facilitates selective log covering from certain contexts (i.e.
> > freeze) that only seek to quiesce, but not necessarily clean the
> > log. Note that as a side effect of this change, log covering now
> > occurs when cleaning the log as well. This is harmless, facilitates
> > subsequent cleanups, and is mostly temporary as various operations
> > switch to use explicit log covering.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> >  fs/xfs/xfs_log.c | 49 +++++++++++++++++++++++++++++++++++++++++++++---
> >  fs/xfs/xfs_log.h |  2 +-
> >  2 files changed, 47 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index 1b3227a033ad..f7b23044723d 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -91,6 +91,9 @@ STATIC int
> >  xlog_iclogs_empty(
> >  	struct xlog		*log);
> >  
> > +static int
> > +xfs_log_cover(struct xfs_mount *);
> > +
> >  static void
> >  xlog_grant_sub_space(
> >  	struct xlog		*log,
> > @@ -936,10 +939,9 @@ xfs_log_unmount_write(
> >   * To do this, we first need to shut down the background log work so it is not
> >   * trying to cover the log as we clean up. We then need to unpin all objects in
> >   * the log so we can then flush them out. Once they have completed their IO and
> > - * run the callbacks removing themselves from the AIL, we can write the unmount
> > - * record.
> > + * run the callbacks removing themselves from the AIL, we can cover the log.
> >   */
> > -void
> > +int
> >  xfs_log_quiesce(
> >  	struct xfs_mount	*mp)
> >  {
> > @@ -957,6 +959,8 @@ xfs_log_quiesce(
> >  	xfs_wait_buftarg(mp->m_ddev_targp);
> >  	xfs_buf_lock(mp->m_sb_bp);
> >  	xfs_buf_unlock(mp->m_sb_bp);
> > +
> > +	return xfs_log_cover(mp);
> >  }
> >  
> >  void
> > @@ -1092,6 +1096,45 @@ xfs_log_need_covered(
> >  	return needed;
> >  }
> >  
> > +/*
> > + * Explicitly cover the log. This is similar to background log covering but
> > + * intended for usage in quiesce codepaths. The caller is responsible to ensure
> > + * the log is idle and suitable for covering. The CIL, iclog buffers and AIL
> > + * must all be empty.
> > + */
> > +static int
> > +xfs_log_cover(
> > +	struct xfs_mount	*mp)
> > +{
> > +	struct xlog		*log = mp->m_log;
> > +	int			error = 0;
> > +
> > +	ASSERT((xlog_cil_empty(log) && xlog_iclogs_empty(log) &&
> > +	        !xfs_ail_min_lsn(log->l_ailp)) ||
> > +	       XFS_FORCED_SHUTDOWN(mp));
> > +
> > +	if (!xfs_log_writable(mp))
> > +		return 0;
> > +
> > +	/*
> > +	 * To cover the log, commit the superblock twice (at most) in
> > +	 * independent checkpoints. The first serves as a reference for the
> > +	 * tail pointer. The sync transaction and AIL push empties the AIL and
> > +	 * updates the in-core tail to the LSN of the first checkpoint. The
> > +	 * second commit updates the on-disk tail with the in-core LSN,
> > +	 * covering the log. Push the AIL one more time to leave it empty, as
> > +	 * we found it.
> > +	 */
> 
> Hm.  At first I looked at _need_covered and wondered how this could work
> properly if we are in state DONE or DONE2, because this not-quite
> predicate returns zero in that case.
> 
> I think it's the case that the only way the log can end up in DONE state
> is if the background log worker had previously been in NEED, written the
> first of the dummy transactions, moved the state to DONE, and waited for
> xlog_covered_state to move the log from DONE to NEED2.  Similarly, the
> log can only be in DONE2 state if the background worker wrote the second
> dummy and is now waiting for xlog_covered_state to move the log from
> DONE2 to IDLE.
> 
> Since xfs_log_quiesce cancelled the log worker and waited for it to
> finish before calling xfs_log_cover, the covering state here can only be
> IDLE, NEED, or NEED2, right?  And hence the while loop pushes the log to
> IDLE no matter where it is now, right?
> 

Yeah, we're in a quiescent context at this point where no other
transactions are running, the in-core structures should be drained and
the background log worker cancelled, etc. With regard to the background
log worker, I don't think it should ever actually see the DONE or DONE2
states as it sets those states and immediately issues the synchronous sb
transaction. Therefore, the commit should have changed the state from
DONE to NEED2 or NEED (if other items happened to land in the log)
before it returns.

That said, I suppose it wouldn't be that surprising if some odd timing
scenario or combination of external superblock commits could cause the
background log worker to see a DONE state. I haven't fully audited for
that, but regardless it would appropriately do nothing and that
shouldn't be an issue from the quiesce context due to the runtime being
pretty much shut down by this point.

> (I also wondered why this isn't a do-while loop but patch 6 addresses
> that.)
> 

Right, that changes due to the lazy sb counter logic..

Brian

> --D
> 
> > +	while (xfs_log_need_covered(mp)) {
> > +		error = xfs_sync_sb(mp, true);
> > +		if (error)
> > +			break;
> > +		xfs_ail_push_all_sync(mp->m_ail);
> > +	}
> > +
> > +	return error;
> > +}
> > +
> >  /*
> >   * We may be holding the log iclog lock upon entering this routine.
> >   */
> > diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> > index b0400589f824..044e02cb8921 100644
> > --- a/fs/xfs/xfs_log.h
> > +++ b/fs/xfs/xfs_log.h
> > @@ -138,7 +138,7 @@ void	xlog_cil_process_committed(struct list_head *list);
> >  bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
> >  
> >  void	xfs_log_work_queue(struct xfs_mount *mp);
> > -void	xfs_log_quiesce(struct xfs_mount *mp);
> > +int	xfs_log_quiesce(struct xfs_mount *mp);
> >  void	xfs_log_clean(struct xfs_mount *mp);
> >  bool	xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t);
> >  bool	xfs_log_in_recovery(struct xfs_mount *);
> > -- 
> > 2.26.2
> > 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/9] xfs: don't reset log idle state on covering checkpoints
  2021-01-07 19:30   ` Darrick J. Wong
@ 2021-01-07 20:01     ` Brian Foster
  0 siblings, 0 replies; 36+ messages in thread
From: Brian Foster @ 2021-01-07 20:01 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Jan 07, 2021 at 11:30:50AM -0800, Darrick J. Wong wrote:
> On Wed, Jan 06, 2021 at 12:41:23PM -0500, Brian Foster wrote:
> > Now that log covering occurs on quiesce, we'd like to reuse the
> > underlying superblock sync for final superblock updates. This
> > includes things like lazy superblock counter updates, log feature
> > incompat bits in the future, etc. One quirk to this approach is that
> > once the log is in the IDLE (i.e. already covered) state, any
> > subsequent log write resets the state back to NEED. This means that
> > a final superblock sync to an already covered log requires two more
> > sb syncs to return the log back to IDLE again.
> > 
> > For example, if a lazy superblock enabled filesystem is mount cycled
> > without any modifications, the unmount path syncs the superblock
> > once and writes an unmount record. With the desired log quiesce
> > covering behavior, we sync the superblock three times at unmount
> > time: once for the lazy superblock counter update and twice more to
> > cover the log. By contrast, if the log is active or only partially
> > covered at unmount time, a final superblock sync would doubly serve
> > as the one or two remaining syncs required to cover the log.
> > 
> > This duplicate covering sequence is unnecessary because the
> > filesystem remains consistent if a crash occurs at any point. The
> > superblock will either be recovered in the event of a crash or
> > written back before the log is quiesced and potentially cleaned with
> > an unmount record.
> > 
> > Update the log covering state machine to remain in the IDLE state if
> > additional covering checkpoints pass through the log. This
> > facilitates final superblock updates (such as lazy superblock
> > counters) via a single sb sync without losing covered status. This
> > provides some consistency with the active and partially covered
> > cases and also avoids harmless, but spurious checkpoints when
> > quiescing the log.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> 
> I /think/ the premise of this is ok.
> 
> I found myself wondering if xlog_state_activate_iclog could mistake an
> iclog containing 5 log ops from a real update for an iclog containing
> just the dummy transaction, since a synchronous inode mtime update
> transaction can also produce an iclog with 5 ops.  I /think/ that
> doesn't matter because xlog_covered_state only cares about the value of
> iclogs_changed if the log worker previously set the log state to DONE or
> DONE2, and iclogs_changed won't be 1 here if there were multiple dirty
> iclogs or if the sole dirty iclog contains more than just the dummy.
> 

That is my understanding. I had similar questions when first passing
through the covering code and realizing the detection of the covering
commit was sort of implicit in and of itself (hence the additional state
checking and iclogs_changed logic). I think the bigger picture with
regard to this series was that it doesn't really matter because nothing
else is happening during quiesce, but I might revisit that from the
background covering perspective now that I have a better handle on how
the covering mechanism works...

Brian

> If I got that right,
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> --D
> 
> > ---
> >  fs/xfs/xfs_log.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index f7b23044723d..9b8564f280b7 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -2597,12 +2597,15 @@ xlog_covered_state(
> >  	int			iclogs_changed)
> >  {
> >  	/*
> > -	 * We usually go to NEED. But we go to NEED2 if the changed indicates we
> > -	 * are done writing the dummy record.  If we are done with the second
> > -	 * dummy recored (DONE2), then we go to IDLE.
> > +	 * We go to NEED for any non-covering writes. We go to NEED2 if we just
> > +	 * wrote the first covering record (DONE). We go to IDLE if we just
> > +	 * wrote the second covering record (DONE2) and remain in IDLE until a
> > +	 * non-covering write occurs.
> >  	 */
> >  	switch (prev_state) {
> >  	case XLOG_STATE_COVER_IDLE:
> > +		if (iclogs_changed == 1)
> > +			return XLOG_STATE_COVER_IDLE;
> >  	case XLOG_STATE_COVER_NEED:
> >  	case XLOG_STATE_COVER_NEED2:
> >  		break;
> > -- 
> > 2.26.2
> > 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/9] xfs: lift writable fs check up into log worker task
  2021-01-07 19:53     ` Brian Foster
@ 2021-01-07 21:28       ` Darrick J. Wong
  0 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-07 21:28 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Jan 07, 2021 at 02:53:21PM -0500, Brian Foster wrote:
> On Thu, Jan 07, 2021 at 10:34:22AM -0800, Darrick J. Wong wrote:
> > On Wed, Jan 06, 2021 at 12:41:20PM -0500, Brian Foster wrote:
> > > The log covering helper checks whether the filesystem is writable to
> > > determine whether to cover the log. The helper is currently only
> > > called from the background log worker. In preparation to reuse the
> > > helper from freezing contexts, lift the check into xfs_log_worker().
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > >  fs/xfs/xfs_log.c | 12 +++++-------
> > >  1 file changed, 5 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > index b445e63cbc3c..4137ed007111 100644
> > > --- a/fs/xfs/xfs_log.c
> > > +++ b/fs/xfs/xfs_log.c
> > > @@ -1050,13 +1050,11 @@ xfs_log_space_wake(
> > >   * can't start trying to idle the log until both the CIL and AIL are empty.
> > >   */
> > >  static int
> > 
> > I think this is a predicate, right?  Should this function return a bool
> > instead of an int?
> > 
> 
> Yes, we could change that to return a bool.
> 
> > This function always confuses me slightly since it pushes us through the
> > covering state machine, and (I think) assumes that someone will force
> > the CIL and push the AIL if it returns zero. :)
> > 
> 
> It basically assumes that the caller will issue a covering commit
> (xfs_sync_sb()) if indicated, and so progresses ->l_covered_state along
> in anticipation of that (i.e. NEED -> DONE). The log subsystem side
> detects that covering commit and makes further state changes (such as
> DONE -> NEED2) for the next time around in the background worker.
> 
> > To check my thinking further-- back in that thread I started about
> > setting and clearing log incompat flags, I think Dave was pushing me to
> > clear the log incompat flags just before we call xfs_sync_sb when the
> > log is in NEED2 state, right?
> > 
> 
> In general, I think so. I don't think it technically has to be NEED2 (as
> opposed to NEED || NEED2), but in general the idea is to make any such
> final superblock updates in-core just before the quiesce completes and
> allow the log covering sequence to commit it for us. This is similar to
> how this series handles the lazy superblock counters (with the caveat
> that that stuff just happened to already be implemented inside
> xfs_sync_sb()).
> 
> FWIW, we could also enforce that such final superblock updates reset
> covered state of the log to NEED2 if we wanted to. I went back and forth
> on that a bit but decided to leave out unnecessary complexity for the
> first pass.
> 
> > AFAICT the net effect of this series is to reorder the log code so that
> > xfs_log_quiesce covers the log (force cil, push ail, log two
> > transactions containing only the superblock), and adds an xfs_log_clean
> > that quiesces the log and then writes an unmount record after that.
> > 
> 
> Yep.
> 
> > Two callers whose behavior does not change with this series are: 1) The
> > log worker quiesces the log when it's idle; and 2) unmount quiesces the
> > log and then writes an unmount record so that the next mount knows it
> > can skip replay entirely.
> > 
> 
> Right, though just to be clear, quiesce never covered the log before
> this series. It effectively drained the log by forcing the log and
> pushing the AIL until empty, but then just wrote the unmount record to
> mark it clean...

<nod> Right, I should've echoed that old quiesce only did force cil and
push ail, so freeze and unmount do more now.

> > The big difference is 3) freeze now only covers the log, whereas before
> > it would cover, write an unmount record, and immediately redirty the log
> > to force replay of the snapshot, right?
> > 
> 
> Yes. As above, unmount now also does a log cover -> unmount record
> instead of just writing the unmount record. This is harmless because we
> end up in the clean state either way, but I've tried to point this out
> in the commit logs and whatnot so it's apparent to reviewers. We could
> technically make the log cover during quiesce optional with a new
> parameter or something, but it just didn't seem worth it once we start
> overloading the covering sequence to handle things like lazy sb
> accounting (or log incompat bits, etc.).
>
> > Assuming I understood all that, my next question is: Eric Sandeen was
> > working on a patchset to process unlinked inodes unconditionally on
> > mount so that frozen fs images can be written out with the unmount
> > record (because I guess people make ro snapshots of live fs images and
> > then balk when they have to make the snapshot rw to run log recovery.
> > Any thoughts about /that/? :)
> > 
> 
> Eric had mentioned that to me as well. I don't quite recall what the
> impediment to making that change was the last time around (Eric?), but
> my view was that is orthogonal to this series. IOW, the primary
> motivations for this series are to clean up the whole xfs_quiesce_attr()
> -> xfs_log_quiesce() mess and facilitate the reuse of covering for
> things like lazy sb accounting and log incompat bit management. We can
> decide whether to quiesce or clean the log on freeze independently and
> that's really only a single line tweak to the last patch of the series
> (i.e., continue to clean the log and just don't redirty it).

Oh, it's totally orthogonal, but touched some of the same code parts. :)

IIRC I applied it then hit fstests regressions and kicked it out again.

--D

> Brian
> 
> > --D
> > 
> > > -xfs_log_need_covered(xfs_mount_t *mp)
> > > +xfs_log_need_covered(
> > > +	struct xfs_mount	*mp)
> > >  {
> > > -	struct xlog	*log = mp->m_log;
> > > -	int		needed = 0;
> > > -
> > > -	if (!xfs_fs_writable(mp, SB_FREEZE_WRITE))
> > > -		return 0;
> > > +	struct xlog		*log = mp->m_log;
> > > +	int			needed = 0;
> > >  
> > >  	if (!xlog_cil_empty(log))
> > >  		return 0;
> > > @@ -1271,7 +1269,7 @@ xfs_log_worker(
> > >  	struct xfs_mount	*mp = log->l_mp;
> > >  
> > >  	/* dgc: errors ignored - not fatal and nowhere to report them */
> > > -	if (xfs_log_need_covered(mp)) {
> > > +	if (xfs_fs_writable(mp, SB_FREEZE_WRITE) && xfs_log_need_covered(mp)) {
> > >  		/*
> > >  		 * Dump a transaction into the log that contains no real change.
> > >  		 * This is needed to stamp the current tail LSN into the log
> > > -- 
> > > 2.26.2
> > > 
> > 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts
  2021-01-06 17:41 ` [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts Brian Foster
  2021-01-06 22:50   ` Allison Henderson
  2021-01-07 19:06   ` Darrick J. Wong
@ 2021-01-11 17:38   ` Christoph Hellwig
  2021-01-12 14:55     ` Brian Foster
  2021-01-21 15:08   ` Bill O'Donnell
  3 siblings, 1 reply; 36+ messages in thread
From: Christoph Hellwig @ 2021-01-11 17:38 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:19PM -0500, Brian Foster wrote:
> Update xfs_log_sbcount() to use the same logic
> xfs_log_unmount_write() uses to determine when to write an unmount
> record.

But it isn't the same old logic - a shutdown check is added as well.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts
  2021-01-11 17:38   ` Christoph Hellwig
@ 2021-01-12 14:55     ` Brian Foster
  2021-01-12 18:20       ` Christoph Hellwig
  0 siblings, 1 reply; 36+ messages in thread
From: Brian Foster @ 2021-01-12 14:55 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, Jan 11, 2021 at 05:38:51PM +0000, Christoph Hellwig wrote:
> On Wed, Jan 06, 2021 at 12:41:19PM -0500, Brian Foster wrote:
> > Update xfs_log_sbcount() to use the same logic
> > xfs_log_unmount_write() uses to determine when to write an unmount
> > record.
> 
> But it isn't the same old logic - a shutdown check is added as well.
> 

xfs_log_unmount_write() does have a shutdown check, it just doesn't show
up in the diff because I wanted to retain the post-log force check in
that function (in the event that the force triggers a shutdown).

Brian


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts
  2021-01-12 14:55     ` Brian Foster
@ 2021-01-12 18:20       ` Christoph Hellwig
  0 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2021-01-12 18:20 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Tue, Jan 12, 2021 at 09:55:19AM -0500, Brian Foster wrote:
> xfs_log_unmount_write() does have a shutdown check, it just doesn't show
> up in the diff because I wanted to retain the post-log force check in
> that function (in the event that the force triggers a shutdown).

Ok.  Maybe mention you are throwing in another shutdown check just
because we can't have 'nough of them.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/9] xfs: lift writable fs check up into log worker task
  2021-01-06 17:41 ` [PATCH 2/9] xfs: lift writable fs check up into log worker task Brian Foster
  2021-01-06 22:50   ` Allison Henderson
  2021-01-07 18:34   ` Darrick J. Wong
@ 2021-01-13 15:24   ` Christoph Hellwig
  2 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2021-01-13 15:24 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:20PM -0500, Brian Foster wrote:
> The log covering helper checks whether the filesystem is writable to
> determine whether to cover the log. The helper is currently only
> called from the background log worker. In preparation to reuse the
> helper from freezing contexts, lift the check into xfs_log_worker().
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] xfs: separate log cleaning from log quiesce
  2021-01-06 17:41 ` [PATCH 3/9] xfs: separate log cleaning from log quiesce Brian Foster
  2021-01-06 22:50   ` Allison Henderson
  2021-01-07 19:07   ` Darrick J. Wong
@ 2021-01-13 15:30   ` Christoph Hellwig
  2 siblings, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2021-01-13 15:30 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:21PM -0500, Brian Foster wrote:
> Log quiesce is currently associated with cleaning the log, which is
> accomplished by writing an unmount record as the last step of the
> quiesce sequence. The quiesce codepath is a bit convoluted in this
> regard due to how it is reused from various contexts. In preparation
> to create separate log cleaning and log covering interfaces, lift
> the write of the unmount record into a new cleaning helper and call
> that wherever xfs_log_quiesce() is currently invoked. No functional
> changes.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/9] xfs: cover the log during log quiesce
  2021-01-06 17:41 ` [PATCH 4/9] xfs: cover the log during " Brian Foster
  2021-01-07 19:04   ` Darrick J. Wong
@ 2021-01-19 15:35   ` Christoph Hellwig
  1 sibling, 0 replies; 36+ messages in thread
From: Christoph Hellwig @ 2021-01-19 15:35 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/9] xfs: cover the log during log quiesce
  2021-01-07 19:53     ` Brian Foster
@ 2021-01-19 17:51       ` Darrick J. Wong
  0 siblings, 0 replies; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-19 17:51 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Thu, Jan 07, 2021 at 02:53:36PM -0500, Brian Foster wrote:
> On Thu, Jan 07, 2021 at 11:04:08AM -0800, Darrick J. Wong wrote:
> > On Wed, Jan 06, 2021 at 12:41:22PM -0500, Brian Foster wrote:
> > > The log quiesce mechanism historically terminates by marking the log
> > > clean with an unmount record. The primary objective is to indicate
> > > that log recovery is no longer required after the quiesce has
> > > flushed all in-core changes and written back filesystem metadata.
> > > While this is perfectly fine, it is somewhat hacky as currently used
> > > in certain contexts. For example, filesystem freeze quiesces (i.e.
> > > cleans) the log and immediately redirties it with a dummy superblock
> > > transaction to ensure that log recovery runs in the event of a
> > > crash.
> > > 
> > > While this functions correctly, cleaning the log from freeze context
> > > is clearly superfluous given the current redirtying behavior.
> > > Instead, the desired behavior can be achieved by simply covering the
> > > log. This effectively retires all on-disk log items from the active
> > > range of the log by issuing two synchronous and sequential dummy
> > > superblock update transactions that serve to update the on-disk log
> > > head and tail. The subtle difference is that the log technically
> > > remains dirty due to the lack of an unmount record, though recovery
> > > is effectively a no-op due to the content of the checkpoints being
> > > clean (i.e. the unmodified on-disk superblock).
> > > 
> > > Log covering currently runs in the background and only triggers once
> > > the filesystem and log has idled. The purpose of the background
> > > mechanism is to prevent log recovery from replaying the most
> > > recently logged items long after those items may have been written
> > > back. In the quiesce path, the log has been deliberately idled by
> > > forcing the log and pushing the AIL until empty in a context where
> > > no further mutable filesystem operations are allowed. Therefore, we
> > > can cover the log as the final step in the log quiesce codepath to
> > > reflect that all previously active items have been successfully
> > > written back.
> > > 
> > > This facilitates selective log covering from certain contexts (i.e.
> > > freeze) that only seek to quiesce, but not necessarily clean the
> > > log. Note that as a side effect of this change, log covering now
> > > occurs when cleaning the log as well. This is harmless, facilitates
> > > subsequent cleanups, and is mostly temporary as various operations
> > > switch to use explicit log covering.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > >  fs/xfs/xfs_log.c | 49 +++++++++++++++++++++++++++++++++++++++++++++---
> > >  fs/xfs/xfs_log.h |  2 +-
> > >  2 files changed, 47 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > index 1b3227a033ad..f7b23044723d 100644
> > > --- a/fs/xfs/xfs_log.c
> > > +++ b/fs/xfs/xfs_log.c
> > > @@ -91,6 +91,9 @@ STATIC int
> > >  xlog_iclogs_empty(
> > >  	struct xlog		*log);
> > >  
> > > +static int
> > > +xfs_log_cover(struct xfs_mount *);
> > > +
> > >  static void
> > >  xlog_grant_sub_space(
> > >  	struct xlog		*log,
> > > @@ -936,10 +939,9 @@ xfs_log_unmount_write(
> > >   * To do this, we first need to shut down the background log work so it is not
> > >   * trying to cover the log as we clean up. We then need to unpin all objects in
> > >   * the log so we can then flush them out. Once they have completed their IO and
> > > - * run the callbacks removing themselves from the AIL, we can write the unmount
> > > - * record.
> > > + * run the callbacks removing themselves from the AIL, we can cover the log.
> > >   */
> > > -void
> > > +int
> > >  xfs_log_quiesce(
> > >  	struct xfs_mount	*mp)
> > >  {
> > > @@ -957,6 +959,8 @@ xfs_log_quiesce(
> > >  	xfs_wait_buftarg(mp->m_ddev_targp);
> > >  	xfs_buf_lock(mp->m_sb_bp);
> > >  	xfs_buf_unlock(mp->m_sb_bp);
> > > +
> > > +	return xfs_log_cover(mp);
> > >  }
> > >  
> > >  void
> > > @@ -1092,6 +1096,45 @@ xfs_log_need_covered(
> > >  	return needed;
> > >  }
> > >  
> > > +/*
> > > + * Explicitly cover the log. This is similar to background log covering but
> > > + * intended for usage in quiesce codepaths. The caller is responsible to ensure
> > > + * the log is idle and suitable for covering. The CIL, iclog buffers and AIL
> > > + * must all be empty.
> > > + */
> > > +static int
> > > +xfs_log_cover(
> > > +	struct xfs_mount	*mp)
> > > +{
> > > +	struct xlog		*log = mp->m_log;
> > > +	int			error = 0;
> > > +
> > > +	ASSERT((xlog_cil_empty(log) && xlog_iclogs_empty(log) &&
> > > +	        !xfs_ail_min_lsn(log->l_ailp)) ||
> > > +	       XFS_FORCED_SHUTDOWN(mp));
> > > +
> > > +	if (!xfs_log_writable(mp))
> > > +		return 0;
> > > +
> > > +	/*
> > > +	 * To cover the log, commit the superblock twice (at most) in
> > > +	 * independent checkpoints. The first serves as a reference for the
> > > +	 * tail pointer. The sync transaction and AIL push empties the AIL and
> > > +	 * updates the in-core tail to the LSN of the first checkpoint. The
> > > +	 * second commit updates the on-disk tail with the in-core LSN,
> > > +	 * covering the log. Push the AIL one more time to leave it empty, as
> > > +	 * we found it.
> > > +	 */
> > 
> > Hm.  At first I looked at _need_covered and wondered how this could work
> > properly if we are in state DONE or DONE2, because this not-quite
> > predicate returns zero in that case.
> > 
> > I think it's the case that the only way the log can end up in DONE state
> > is if the background log worker had previously been in NEED, written the
> > first of the dummy transactions, moved the state to DONE, and waited for
> > xlog_covered_state to move the log from DONE to NEED2.  Similarly, the
> > log can only be in DONE2 state if the background worker wrote the second
> > dummy and is now waiting for xlog_covered_state to move the log from
> > DONE2 to IDLE.
> > 
> > Since xfs_log_quiesce cancelled the log worker and waited for it to
> > finish before calling xfs_log_cover, the covering state here can only be
> > IDLE, NEED, or NEED2, right?  And hence the while loop pushes the log to
> > IDLE no matter where it is now, right?
> > 
> 
> Yeah, we're in a quiescent context at this point where no other
> transactions are running, the in-core structures should be drained and
> the background log worker cancelled, etc. With regard to the background
> log worker, I don't think it should ever actually see the DONE or DONE2
> states as it sets those states and immediately issues the synchronous sb
> transaction. Therefore, the commit should have changed the state from
> DONE to NEED2 or NEED (if other items happened to land in the log)
> before it returns.
> 
> That said, I suppose it wouldn't be that surprising if some odd timing
> scenario or combination of external superblock commits could cause the
> background log worker to see a DONE state. I haven't fully audited for
> that, but regardless it would appropriately do nothing and that
> shouldn't be an issue from the quiesce context due to the runtime being
> pretty much shut down by this point.

<nod> I think that makes sense.  :)

--D

> > (I also wondered why this isn't a do-while loop but patch 6 addresses
> > that.)
> > 
> 
> Right, that changes due to the lazy sb counter logic..
> 
> Brian
> 
> > --D
> > 
> > > +	while (xfs_log_need_covered(mp)) {
> > > +		error = xfs_sync_sb(mp, true);
> > > +		if (error)
> > > +			break;
> > > +		xfs_ail_push_all_sync(mp->m_ail);
> > > +	}
> > > +
> > > +	return error;
> > > +}
> > > +
> > >  /*
> > >   * We may be holding the log iclog lock upon entering this routine.
> > >   */
> > > diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> > > index b0400589f824..044e02cb8921 100644
> > > --- a/fs/xfs/xfs_log.h
> > > +++ b/fs/xfs/xfs_log.h
> > > @@ -138,7 +138,7 @@ void	xlog_cil_process_committed(struct list_head *list);
> > >  bool	xfs_log_item_in_current_chkpt(struct xfs_log_item *lip);
> > >  
> > >  void	xfs_log_work_queue(struct xfs_mount *mp);
> > > -void	xfs_log_quiesce(struct xfs_mount *mp);
> > > +int	xfs_log_quiesce(struct xfs_mount *mp);
> > >  void	xfs_log_clean(struct xfs_mount *mp);
> > >  bool	xfs_log_check_lsn(struct xfs_mount *, xfs_lsn_t);
> > >  bool	xfs_log_in_recovery(struct xfs_mount *);
> > > -- 
> > > 2.26.2
> > > 
> > 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts
  2021-01-06 17:41 ` [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts Brian Foster
                     ` (2 preceding siblings ...)
  2021-01-11 17:38   ` Christoph Hellwig
@ 2021-01-21 15:08   ` Bill O'Donnell
  2021-01-21 16:49     ` Darrick J. Wong
  3 siblings, 1 reply; 36+ messages in thread
From: Bill O'Donnell @ 2021-01-21 15:08 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Wed, Jan 06, 2021 at 12:41:19PM -0500, Brian Foster wrote:
> xfs_log_sbcount() syncs the superblock specifically to accumulate
> the in-core percpu superblock counters and commit them to disk. This
> is required to maintain filesystem consistency across quiesce
> (freeze, read-only mount/remount) or unmount when lazy superblock
> accounting is enabled because individual transactions do not update
> the superblock directly.
> 
> This mechanism works as expected for writable mounts, but
> xfs_log_sbcount() skips the update for read-only mounts. Read-only
> mounts otherwise still allow log recovery and write out an unmount
> record during log quiesce. If a read-only mount performs log
> recovery, it can modify the in-core superblock counters and write an
> unmount record when the filesystem unmounts without ever syncing the
> in-core counters. This leaves the filesystem with a clean log but in
> an inconsistent state with regard to lazy sb counters.
> 
> Update xfs_log_sbcount() to use the same logic
> xfs_log_unmount_write() uses to determine when to write an unmount
> record. We can drop the freeze state check because the update is
> already allowed during the freezing process and no context calls
> this function on an already frozen fs. This ensures that lazy
> accounting is always synced before the log is cleaned. Refactor this
> logic into a new helper to distinguish between a writable filesystem
> and a writable log. Specifically, the log is writable unless the
> filesystem is mounted with the norecovery mount option, the
> underlying log device is read-only, or the filesystem is shutdown.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> Reviewed-by: Gao Xiang <hsiangkao@redhat.com>

works for me.
Reviewed-by: Bill O'Donnell <billodo@redhat.com>

> ---
>  fs/xfs/xfs_log.c   | 28 ++++++++++++++++++++--------
>  fs/xfs/xfs_log.h   |  1 +
>  fs/xfs/xfs_mount.c |  3 +--
>  3 files changed, 22 insertions(+), 10 deletions(-)
> 
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index fa2d05e65ff1..b445e63cbc3c 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -347,6 +347,25 @@ xlog_tic_add_region(xlog_ticket_t *tic, uint len, uint type)
>  	tic->t_res_num++;
>  }
>  
> +bool
> +xfs_log_writable(
> +	struct xfs_mount	*mp)
> +{
> +	/*
> +	 * Never write to the log on norecovery mounts, if the block device is
> +	 * read-only, or if the filesystem is shutdown. Read-only mounts still
> +	 * allow internal writes for log recovery and unmount purposes, so don't
> +	 * restrict that case here.
> +	 */
> +	if (mp->m_flags & XFS_MOUNT_NORECOVERY)
> +		return false;
> +	if (xfs_readonly_buftarg(mp->m_log->l_targ))
> +		return false;
> +	if (XFS_FORCED_SHUTDOWN(mp))
> +		return false;
> +	return true;
> +}
> +
>  /*
>   * Replenish the byte reservation required by moving the grant write head.
>   */
> @@ -886,15 +905,8 @@ xfs_log_unmount_write(
>  {
>  	struct xlog		*log = mp->m_log;
>  
> -	/*
> -	 * Don't write out unmount record on norecovery mounts or ro devices.
> -	 * Or, if we are doing a forced umount (typically because of IO errors).
> -	 */
> -	if (mp->m_flags & XFS_MOUNT_NORECOVERY ||
> -	    xfs_readonly_buftarg(log->l_targ)) {
> -		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
> +	if (!xfs_log_writable(mp))
>  		return;
> -	}
>  
>  	xfs_log_force(mp, XFS_LOG_SYNC);
>  
> diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> index 58c3fcbec94a..98c913da7587 100644
> --- a/fs/xfs/xfs_log.h
> +++ b/fs/xfs/xfs_log.h
> @@ -127,6 +127,7 @@ int	  xfs_log_reserve(struct xfs_mount *mp,
>  int	  xfs_log_regrant(struct xfs_mount *mp, struct xlog_ticket *tic);
>  void      xfs_log_unmount(struct xfs_mount *mp);
>  int	  xfs_log_force_umount(struct xfs_mount *mp, int logerror);
> +bool	xfs_log_writable(struct xfs_mount *mp);
>  
>  struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
>  void	  xfs_log_ticket_put(struct xlog_ticket *ticket);
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 7110507a2b6b..a62b8a574409 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1176,8 +1176,7 @@ xfs_fs_writable(
>  int
>  xfs_log_sbcount(xfs_mount_t *mp)
>  {
> -	/* allow this to proceed during the freeze sequence... */
> -	if (!xfs_fs_writable(mp, SB_FREEZE_COMPLETE))
> +	if (!xfs_log_writable(mp))
>  		return 0;
>  
>  	/*
> -- 
> 2.26.2
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts
  2021-01-21 15:08   ` Bill O'Donnell
@ 2021-01-21 16:49     ` Darrick J. Wong
  2021-01-21 17:17       ` Bill O'Donnell
  0 siblings, 1 reply; 36+ messages in thread
From: Darrick J. Wong @ 2021-01-21 16:49 UTC (permalink / raw)
  To: Bill O'Donnell; +Cc: Brian Foster, linux-xfs

On Thu, Jan 21, 2021 at 09:08:27AM -0600, Bill O'Donnell wrote:
> On Wed, Jan 06, 2021 at 12:41:19PM -0500, Brian Foster wrote:
> > xfs_log_sbcount() syncs the superblock specifically to accumulate
> > the in-core percpu superblock counters and commit them to disk. This
> > is required to maintain filesystem consistency across quiesce
> > (freeze, read-only mount/remount) or unmount when lazy superblock
> > accounting is enabled because individual transactions do not update
> > the superblock directly.
> > 
> > This mechanism works as expected for writable mounts, but
> > xfs_log_sbcount() skips the update for read-only mounts. Read-only
> > mounts otherwise still allow log recovery and write out an unmount
> > record during log quiesce. If a read-only mount performs log
> > recovery, it can modify the in-core superblock counters and write an
> > unmount record when the filesystem unmounts without ever syncing the
> > in-core counters. This leaves the filesystem with a clean log but in
> > an inconsistent state with regard to lazy sb counters.
> > 
> > Update xfs_log_sbcount() to use the same logic
> > xfs_log_unmount_write() uses to determine when to write an unmount
> > record. We can drop the freeze state check because the update is
> > already allowed during the freezing process and no context calls
> > this function on an already frozen fs. This ensures that lazy
> > accounting is always synced before the log is cleaned. Refactor this
> > logic into a new helper to distinguish between a writable filesystem
> > and a writable log. Specifically, the log is writable unless the
> > filesystem is mounted with the norecovery mount option, the
> > underlying log device is read-only, or the filesystem is shutdown.
> > 
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
> 
> works for me.
> Reviewed-by: Bill O'Donnell <billodo@redhat.com>

Does this also apply to the v3 series that Brian just sent?

--D

> 
> > ---
> >  fs/xfs/xfs_log.c   | 28 ++++++++++++++++++++--------
> >  fs/xfs/xfs_log.h   |  1 +
> >  fs/xfs/xfs_mount.c |  3 +--
> >  3 files changed, 22 insertions(+), 10 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > index fa2d05e65ff1..b445e63cbc3c 100644
> > --- a/fs/xfs/xfs_log.c
> > +++ b/fs/xfs/xfs_log.c
> > @@ -347,6 +347,25 @@ xlog_tic_add_region(xlog_ticket_t *tic, uint len, uint type)
> >  	tic->t_res_num++;
> >  }
> >  
> > +bool
> > +xfs_log_writable(
> > +	struct xfs_mount	*mp)
> > +{
> > +	/*
> > +	 * Never write to the log on norecovery mounts, if the block device is
> > +	 * read-only, or if the filesystem is shutdown. Read-only mounts still
> > +	 * allow internal writes for log recovery and unmount purposes, so don't
> > +	 * restrict that case here.
> > +	 */
> > +	if (mp->m_flags & XFS_MOUNT_NORECOVERY)
> > +		return false;
> > +	if (xfs_readonly_buftarg(mp->m_log->l_targ))
> > +		return false;
> > +	if (XFS_FORCED_SHUTDOWN(mp))
> > +		return false;
> > +	return true;
> > +}
> > +
> >  /*
> >   * Replenish the byte reservation required by moving the grant write head.
> >   */
> > @@ -886,15 +905,8 @@ xfs_log_unmount_write(
> >  {
> >  	struct xlog		*log = mp->m_log;
> >  
> > -	/*
> > -	 * Don't write out unmount record on norecovery mounts or ro devices.
> > -	 * Or, if we are doing a forced umount (typically because of IO errors).
> > -	 */
> > -	if (mp->m_flags & XFS_MOUNT_NORECOVERY ||
> > -	    xfs_readonly_buftarg(log->l_targ)) {
> > -		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
> > +	if (!xfs_log_writable(mp))
> >  		return;
> > -	}
> >  
> >  	xfs_log_force(mp, XFS_LOG_SYNC);
> >  
> > diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> > index 58c3fcbec94a..98c913da7587 100644
> > --- a/fs/xfs/xfs_log.h
> > +++ b/fs/xfs/xfs_log.h
> > @@ -127,6 +127,7 @@ int	  xfs_log_reserve(struct xfs_mount *mp,
> >  int	  xfs_log_regrant(struct xfs_mount *mp, struct xlog_ticket *tic);
> >  void      xfs_log_unmount(struct xfs_mount *mp);
> >  int	  xfs_log_force_umount(struct xfs_mount *mp, int logerror);
> > +bool	xfs_log_writable(struct xfs_mount *mp);
> >  
> >  struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
> >  void	  xfs_log_ticket_put(struct xlog_ticket *ticket);
> > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> > index 7110507a2b6b..a62b8a574409 100644
> > --- a/fs/xfs/xfs_mount.c
> > +++ b/fs/xfs/xfs_mount.c
> > @@ -1176,8 +1176,7 @@ xfs_fs_writable(
> >  int
> >  xfs_log_sbcount(xfs_mount_t *mp)
> >  {
> > -	/* allow this to proceed during the freeze sequence... */
> > -	if (!xfs_fs_writable(mp, SB_FREEZE_COMPLETE))
> > +	if (!xfs_log_writable(mp))
> >  		return 0;
> >  
> >  	/*
> > -- 
> > 2.26.2
> > 
> 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts
  2021-01-21 16:49     ` Darrick J. Wong
@ 2021-01-21 17:17       ` Bill O'Donnell
  0 siblings, 0 replies; 36+ messages in thread
From: Bill O'Donnell @ 2021-01-21 17:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Brian Foster, linux-xfs

On Thu, Jan 21, 2021 at 08:49:33AM -0800, Darrick J. Wong wrote:
> On Thu, Jan 21, 2021 at 09:08:27AM -0600, Bill O'Donnell wrote:
> > On Wed, Jan 06, 2021 at 12:41:19PM -0500, Brian Foster wrote:
> > > xfs_log_sbcount() syncs the superblock specifically to accumulate
> > > the in-core percpu superblock counters and commit them to disk. This
> > > is required to maintain filesystem consistency across quiesce
> > > (freeze, read-only mount/remount) or unmount when lazy superblock
> > > accounting is enabled because individual transactions do not update
> > > the superblock directly.
> > > 
> > > This mechanism works as expected for writable mounts, but
> > > xfs_log_sbcount() skips the update for read-only mounts. Read-only
> > > mounts otherwise still allow log recovery and write out an unmount
> > > record during log quiesce. If a read-only mount performs log
> > > recovery, it can modify the in-core superblock counters and write an
> > > unmount record when the filesystem unmounts without ever syncing the
> > > in-core counters. This leaves the filesystem with a clean log but in
> > > an inconsistent state with regard to lazy sb counters.
> > > 
> > > Update xfs_log_sbcount() to use the same logic
> > > xfs_log_unmount_write() uses to determine when to write an unmount
> > > record. We can drop the freeze state check because the update is
> > > already allowed during the freezing process and no context calls
> > > this function on an already frozen fs. This ensures that lazy
> > > accounting is always synced before the log is cleaned. Refactor this
> > > logic into a new helper to distinguish between a writable filesystem
> > > and a writable log. Specifically, the log is writable unless the
> > > filesystem is mounted with the norecovery mount option, the
> > > underlying log device is read-only, or the filesystem is shutdown.
> > > 
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
> > 
> > works for me.
> > Reviewed-by: Bill O'Donnell <billodo@redhat.com>
> 
> Does this also apply to the v3 series that Brian just sent?

I only used this patch 1/9, as that was all I was focusing on.

> 
> --D
> 
> > 
> > > ---
> > >  fs/xfs/xfs_log.c   | 28 ++++++++++++++++++++--------
> > >  fs/xfs/xfs_log.h   |  1 +
> > >  fs/xfs/xfs_mount.c |  3 +--
> > >  3 files changed, 22 insertions(+), 10 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> > > index fa2d05e65ff1..b445e63cbc3c 100644
> > > --- a/fs/xfs/xfs_log.c
> > > +++ b/fs/xfs/xfs_log.c
> > > @@ -347,6 +347,25 @@ xlog_tic_add_region(xlog_ticket_t *tic, uint len, uint type)
> > >  	tic->t_res_num++;
> > >  }
> > >  
> > > +bool
> > > +xfs_log_writable(
> > > +	struct xfs_mount	*mp)
> > > +{
> > > +	/*
> > > +	 * Never write to the log on norecovery mounts, if the block device is
> > > +	 * read-only, or if the filesystem is shutdown. Read-only mounts still
> > > +	 * allow internal writes for log recovery and unmount purposes, so don't
> > > +	 * restrict that case here.
> > > +	 */
> > > +	if (mp->m_flags & XFS_MOUNT_NORECOVERY)
> > > +		return false;
> > > +	if (xfs_readonly_buftarg(mp->m_log->l_targ))
> > > +		return false;
> > > +	if (XFS_FORCED_SHUTDOWN(mp))
> > > +		return false;
> > > +	return true;
> > > +}
> > > +
> > >  /*
> > >   * Replenish the byte reservation required by moving the grant write head.
> > >   */
> > > @@ -886,15 +905,8 @@ xfs_log_unmount_write(
> > >  {
> > >  	struct xlog		*log = mp->m_log;
> > >  
> > > -	/*
> > > -	 * Don't write out unmount record on norecovery mounts or ro devices.
> > > -	 * Or, if we are doing a forced umount (typically because of IO errors).
> > > -	 */
> > > -	if (mp->m_flags & XFS_MOUNT_NORECOVERY ||
> > > -	    xfs_readonly_buftarg(log->l_targ)) {
> > > -		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
> > > +	if (!xfs_log_writable(mp))
> > >  		return;
> > > -	}
> > >  
> > >  	xfs_log_force(mp, XFS_LOG_SYNC);
> > >  
> > > diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
> > > index 58c3fcbec94a..98c913da7587 100644
> > > --- a/fs/xfs/xfs_log.h
> > > +++ b/fs/xfs/xfs_log.h
> > > @@ -127,6 +127,7 @@ int	  xfs_log_reserve(struct xfs_mount *mp,
> > >  int	  xfs_log_regrant(struct xfs_mount *mp, struct xlog_ticket *tic);
> > >  void      xfs_log_unmount(struct xfs_mount *mp);
> > >  int	  xfs_log_force_umount(struct xfs_mount *mp, int logerror);
> > > +bool	xfs_log_writable(struct xfs_mount *mp);
> > >  
> > >  struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
> > >  void	  xfs_log_ticket_put(struct xlog_ticket *ticket);
> > > diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> > > index 7110507a2b6b..a62b8a574409 100644
> > > --- a/fs/xfs/xfs_mount.c
> > > +++ b/fs/xfs/xfs_mount.c
> > > @@ -1176,8 +1176,7 @@ xfs_fs_writable(
> > >  int
> > >  xfs_log_sbcount(xfs_mount_t *mp)
> > >  {
> > > -	/* allow this to proceed during the freeze sequence... */
> > > -	if (!xfs_fs_writable(mp, SB_FREEZE_COMPLETE))
> > > +	if (!xfs_log_writable(mp))
> > >  		return 0;
> > >  
> > >  	/*
> > > -- 
> > > 2.26.2
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2021-01-21 17:19 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-06 17:41 [PATCH 0/9] xfs: rework log quiesce to cover the log Brian Foster
2021-01-06 17:41 ` [PATCH 1/9] xfs: sync lazy sb accounting on quiesce of read-only mounts Brian Foster
2021-01-06 22:50   ` Allison Henderson
2021-01-07 19:06   ` Darrick J. Wong
2021-01-11 17:38   ` Christoph Hellwig
2021-01-12 14:55     ` Brian Foster
2021-01-12 18:20       ` Christoph Hellwig
2021-01-21 15:08   ` Bill O'Donnell
2021-01-21 16:49     ` Darrick J. Wong
2021-01-21 17:17       ` Bill O'Donnell
2021-01-06 17:41 ` [PATCH 2/9] xfs: lift writable fs check up into log worker task Brian Foster
2021-01-06 22:50   ` Allison Henderson
2021-01-07 18:34   ` Darrick J. Wong
2021-01-07 19:53     ` Brian Foster
2021-01-07 21:28       ` Darrick J. Wong
2021-01-13 15:24   ` Christoph Hellwig
2021-01-06 17:41 ` [PATCH 3/9] xfs: separate log cleaning from log quiesce Brian Foster
2021-01-06 22:50   ` Allison Henderson
2021-01-07 19:07   ` Darrick J. Wong
2021-01-13 15:30   ` Christoph Hellwig
2021-01-06 17:41 ` [PATCH 4/9] xfs: cover the log during " Brian Foster
2021-01-07 19:04   ` Darrick J. Wong
2021-01-07 19:53     ` Brian Foster
2021-01-19 17:51       ` Darrick J. Wong
2021-01-19 15:35   ` Christoph Hellwig
2021-01-06 17:41 ` [PATCH 5/9] xfs: don't reset log idle state on covering checkpoints Brian Foster
2021-01-07 19:30   ` Darrick J. Wong
2021-01-07 20:01     ` Brian Foster
2021-01-06 17:41 ` [PATCH 6/9] xfs: fold sbcount quiesce logging into log covering Brian Foster
2021-01-07 19:31   ` Darrick J. Wong
2021-01-06 17:41 ` [PATCH 7/9] xfs: remove duplicate wq cancel and log force from attr quiesce Brian Foster
2021-01-07 19:38   ` Darrick J. Wong
2021-01-06 17:41 ` [PATCH 8/9] xfs: remove xfs_quiesce_attr() Brian Foster
2021-01-07 19:39   ` Darrick J. Wong
2021-01-06 17:41 ` [PATCH 9/9] xfs: cover the log on freeze instead of cleaning it Brian Foster
2021-01-07 19:39   ` Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.