linux-xfs.vger.kernel.org archive mirror
* [PATCH 0/2] xfs: fix scrub timeout warnings
@ 2019-11-06  4:42 Darrick J. Wong
  2019-11-06  4:43 ` [PATCH 1/2] xfs: add missing early termination checks to record scrubbing functions Darrick J. Wong
  2019-11-06  4:43 ` [PATCH 2/2] xfs: periodically yield scrub threads to the scheduler Darrick J. Wong
  0 siblings, 2 replies; 6+ messages in thread
From: Darrick J. Wong @ 2019-11-06  4:42 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, hch

Hi all,

This patch series makes scrub more responsive when the user aborts a
scrub by sending SIGTERM, and fixes a problem where the stall timeout
warnings trigger when the kernel isn't preemptible.

This has been lightly tested with fstests.  Enjoy!
Comments and questions are, as always, welcome.

--D

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] xfs: add missing early termination checks to record scrubbing functions
  2019-11-06  4:42 [PATCH 0/2] xfs: fix scrub timeout warnings Darrick J. Wong
@ 2019-11-06  4:43 ` Darrick J. Wong
  2019-11-06 14:43   ` Christoph Hellwig
  2019-11-06  4:43 ` [PATCH 2/2] xfs: periodically yield scrub threads to the scheduler Darrick J. Wong
  1 sibling, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2019-11-06  4:43 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs, hch

From: Darrick J. Wong <darrick.wong@oracle.com>

Scrubbing directories, quotas, and fs counters all involve iterating
some collection of metadata items.  The per-item scrub functions for
these three are missing some of the components they need to be able to
check for a fatal signal and terminate early.

Per-item scrub functions need to call xchk_should_terminate to look for
fatal signals, and they need to check the scrub context's corruption
flag because there's no point in continuing a scan once we've decided
the data structure is bad.  Add both of these where missing.
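The shape being added can be sketched in userspace.  Everything below is
a stand-in model, not the real XFS code: the struct layout, the flag
value, and the errno literals are placeholders chosen for the model, and
fake_fatal_signal stands in for fatal_signal_pending(current).

```c
#include <stdbool.h>

/* Placeholder flag value for this model, not the real uapi constant. */
#define XFS_SCRUB_OFLAG_CORRUPT	(1u << 1)

struct xfs_scrub_metadata { unsigned int sm_flags; };
struct xfs_scrub { struct xfs_scrub_metadata *sm; };

static bool fake_fatal_signal;	/* models fatal_signal_pending(current) */

/* Modeled on xchk_should_terminate(): latch -EAGAIN once, report it. */
static bool xchk_should_terminate(struct xfs_scrub *sc, int *error)
{
	(void)sc;
	if (fake_fatal_signal) {
		if (*error == 0)
			*error = -11;	/* -EAGAIN on Linux */
		return true;
	}
	return false;
}

/* Shape of a per-item scrub callback after this patch: bail early on a
 * fatal signal, and stop scanning once the structure is already known
 * to be corrupt instead of checking the remaining items. */
static int scrub_one_item(struct xfs_scrub *sc)
{
	int error = 0;

	if (xchk_should_terminate(sc, &error))
		return error;

	/* ... the real per-item checks would run here ... */

	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
		return -117;	/* -EFSCORRUPTED (EUCLEAN) on Linux */
	return 0;
}
```

The point of the second check is that sm_flags accumulates corruption
reports as the scan runs, so any later item can observe that an earlier
item already condemned the structure.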

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/dir.c        |    3 +++
 fs/xfs/scrub/fscounters.c |    8 ++++++--
 fs/xfs/scrub/quota.c      |    7 +++++++
 3 files changed, 16 insertions(+), 2 deletions(-)


diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c
index 1e2e11721eb9..dca5f159f82a 100644
--- a/fs/xfs/scrub/dir.c
+++ b/fs/xfs/scrub/dir.c
@@ -113,6 +113,9 @@ xchk_dir_actor(
 	offset = xfs_dir2_db_to_da(mp->m_dir_geo,
 			xfs_dir2_dataptr_to_db(mp->m_dir_geo, pos));
 
+	if (xchk_should_terminate(sdc->sc, &error))
+		return error;
+
 	/* Does this inode number make sense? */
 	if (!xfs_verify_dir_ino(mp, ino)) {
 		xchk_fblock_set_corrupt(sdc->sc, XFS_DATA_FORK, offset);
diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c
index 98f82d7c8b40..7251c66a82c9 100644
--- a/fs/xfs/scrub/fscounters.c
+++ b/fs/xfs/scrub/fscounters.c
@@ -104,7 +104,7 @@ xchk_fscount_warmup(
 		pag = NULL;
 		error = 0;
 
-		if (fatal_signal_pending(current))
+		if (xchk_should_terminate(sc, &error))
 			break;
 	}
 
@@ -163,6 +163,7 @@ xchk_fscount_aggregate_agcounts(
 	uint64_t		delayed;
 	xfs_agnumber_t		agno;
 	int			tries = 8;
+	int			error = 0;
 
 retry:
 	fsc->icount = 0;
@@ -196,10 +197,13 @@ xchk_fscount_aggregate_agcounts(
 
 		xfs_perag_put(pag);
 
-		if (fatal_signal_pending(current))
+		if (xchk_should_terminate(sc, &error))
 			break;
 	}
 
+	if (error)
+		return error;
+
 	/*
 	 * The global incore space reservation is taken from the incore
 	 * counters, so leave that out of the computation.
diff --git a/fs/xfs/scrub/quota.c b/fs/xfs/scrub/quota.c
index 0a33b4421c32..905a34558361 100644
--- a/fs/xfs/scrub/quota.c
+++ b/fs/xfs/scrub/quota.c
@@ -93,6 +93,10 @@ xchk_quota_item(
 	unsigned long long	rcount;
 	xfs_ino_t		fs_icount;
 	xfs_dqid_t		id = be32_to_cpu(d->d_id);
+	int			error = 0;
+
+	if (xchk_should_terminate(sc, &error))
+		return error;
 
 	/*
 	 * Except for the root dquot, the actual dquot we got must either have
@@ -178,6 +182,9 @@ xchk_quota_item(
 	if (id != 0 && rhard != 0 && rcount > rhard)
 		xchk_fblock_set_warning(sc, XFS_DATA_FORK, offset);
 
+	if (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)
+		return -EFSCORRUPTED;
+
 	return 0;
 }
 



* [PATCH 2/2] xfs: periodically yield scrub threads to the scheduler
  2019-11-06  4:42 [PATCH 0/2] xfs: fix scrub timeout warnings Darrick J. Wong
  2019-11-06  4:43 ` [PATCH 1/2] xfs: add missing early termination checks to record scrubbing functions Darrick J. Wong
@ 2019-11-06  4:43 ` Darrick J. Wong
  2019-11-06 14:44   ` Christoph Hellwig
  1 sibling, 1 reply; 6+ messages in thread
From: Darrick J. Wong @ 2019-11-06  4:43 UTC (permalink / raw)
  To: darrick.wong; +Cc: Christoph Hellwig, linux-xfs, hch

From: Darrick J. Wong <darrick.wong@oracle.com>

Christoph Hellwig complained about the following soft lockup warning
when running scrub after generic/175 when preemption is disabled and
slub debugging is enabled:

watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [xfs_scrub:161]
Modules linked in:
irq event stamp: 41692326
hardirqs last  enabled at (41692325): [<ffffffff8232c3b7>] _raw_0
hardirqs last disabled at (41692326): [<ffffffff81001c5a>] trace0
softirqs last  enabled at (41684994): [<ffffffff8260031f>] __do_e
softirqs last disabled at (41684987): [<ffffffff81127d8c>] irq_e0
CPU: 3 PID: 16189 Comm: xfs_scrub Not tainted 5.4.0-rc3+ #30
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.124
RIP: 0010:_raw_spin_unlock_irqrestore+0x39/0x40
Code: 89 f3 be 01 00 00 00 e8 d5 3a e5 fe 48 89 ef e8 ed 87 e5 f2
RSP: 0018:ffffc9000233f970 EFLAGS: 00000286 ORIG_RAX: ffffffffff3
RAX: ffff88813b398040 RBX: 0000000000000286 RCX: 0000000000000006
RDX: 0000000000000006 RSI: ffff88813b3988c0 RDI: ffff88813b398040
RBP: ffff888137958640 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffea00042b0c00
R13: 0000000000000001 R14: ffff88810ac32308 R15: ffff8881376fc040
FS:  00007f6113dea700(0000) GS:ffff88813bb80000(0000) knlGS:00000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6113de8ff8 CR3: 000000012f290000 CR4: 00000000000006e0
Call Trace:
 free_debug_processing+0x1dd/0x240
 __slab_free+0x231/0x410
 kmem_cache_free+0x30e/0x360
 xchk_ag_btcur_free+0x76/0xb0
 xchk_ag_free+0x10/0x80
 xchk_bmap_iextent_xref.isra.14+0xd9/0x120
 xchk_bmap_iextent+0x187/0x210
 xchk_bmap+0x2e0/0x3b0
 xfs_scrub_metadata+0x2e7/0x500
 xfs_ioc_scrub_metadata+0x4a/0xa0
 xfs_file_ioctl+0x58a/0xcd0
 do_vfs_ioctl+0xa0/0x6f0
 ksys_ioctl+0x5b/0x90
 __x64_sys_ioctl+0x11/0x20
 do_syscall_64+0x4b/0x1a0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

If preemption is disabled, all the metadata buffers needed to perform
the scrub are already in memory, and there are a lot of records to
check, then it's possible that the scrub thread will run for an extended
period of time without sleeping for IO or any other reason.  Then the
watchdog timer or the RCU stall timeout can trigger, producing the
backtrace above.

To fix this problem, we detect when preemption is disabled and
explicitly schedule() the scrub thread every few seconds.
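The throttle leans on the kernel's wraparound-safe time_after()
comparison.  A userspace sketch of that machinery (the macro body
mirrors the kernel's signed-difference trick; MODEL_HZ is an assumed
tick rate for the model, not a kernel constant):

```c
/* Kernel-style wraparound-safe comparison: true iff a is after b.
 * The cast to signed long is what keeps the comparison correct even
 * after the tick counter wraps past ULONG_MAX. */
#define time_after(a, b)	((long)((b) - (a)) < 0)

#define MODEL_HZ 100UL		/* assumed ticks per second */

/* Convert milliseconds to model ticks, the way the patch computes the
 * next yield deadline with msecs_to_jiffies(5000). */
static unsigned long msecs_to_ticks(unsigned long ms)
{
	return ms * MODEL_HZ / 1000;
}
```

With this, "yield every few seconds" is just: remember a deadline of
now + msecs_to_ticks(5000), and schedule() whenever time_after(now,
deadline) becomes true.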

Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/scrub/common.h |   14 +++++++++++++-
 fs/xfs/scrub/scrub.h  |    9 +++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index 003a772cd26c..597b4d45e990 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -14,8 +14,20 @@
 static inline bool
 xchk_should_terminate(
 	struct xfs_scrub	*sc,
-	int				*error)
+	int			*error)
 {
+#if !IS_ENABLED(CONFIG_PREEMPT)
+	/*
+	 * If preemption is disabled, we need to yield to the scheduler every
+	 * few seconds so that we don't run afoul of the soft lockup watchdog
+	 * or RCU stall detector.
+	 */
+	if (sc->next_yield == 0 || time_after(jiffies, sc->next_yield)) {
+		schedule();
+		sc->next_yield = jiffies + msecs_to_jiffies(5000);
+	}
+#endif
+
 	if (fatal_signal_pending(current)) {
 		if (*error == 0)
 			*error = -EAGAIN;
diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h
index ad1ceb44a628..ada8e4976024 100644
--- a/fs/xfs/scrub/scrub.h
+++ b/fs/xfs/scrub/scrub.h
@@ -75,6 +75,15 @@ struct xfs_scrub {
 
 	/* State tracking for single-AG operations. */
 	struct xchk_ag			sa;
+
+#if !IS_ENABLED(CONFIG_PREEMPT)
+	/*
+	 * This is the time (in jiffies) when this scrub thread needs to
+	 * yield the processor back to the scheduler so that we don't run
+	 * afoul of either the soft lockup watchdog or RCU stall detector.
+	 */
+	unsigned long			next_yield;
+#endif
 };
 
 /* XCHK state flags grow up from zero, XREP state flags grown down from 2^31 */



* Re: [PATCH 1/2] xfs: add missing early termination checks to record scrubbing functions
  2019-11-06  4:43 ` [PATCH 1/2] xfs: add missing early termination checks to record scrubbing functions Darrick J. Wong
@ 2019-11-06 14:43   ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2019-11-06 14:43 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, hch

On Tue, Nov 05, 2019 at 08:43:00PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Scrubbing directories, quotas, and fs counters all involve iterating
> some collection of metadata items.  The per-item scrub functions for
> these three are missing some of the components they need to be able to
> check for a fatal signal and terminate early.
> 
> Per-item scrub functions need to call xchk_should_terminate to look for
> fatal signals, and they need to check the scrub context's corruption
> flag because there's no point in continuing a scan once we've decided
> the data structure is bad.  Add both of these where missing.

Looks sensible, but take this with a grain of salt as I'm not very
familiar with the scrub code:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 2/2] xfs: periodically yield scrub threads to the scheduler
  2019-11-06  4:43 ` [PATCH 2/2] xfs: periodically yield scrub threads to the scheduler Darrick J. Wong
@ 2019-11-06 14:44   ` Christoph Hellwig
  2019-11-06 16:13     ` Darrick J. Wong
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2019-11-06 14:44 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs, hch

On Tue, Nov 05, 2019 at 08:43:06PM -0800, Darrick J. Wong wrote:
> +++ b/fs/xfs/scrub/common.h
> @@ -14,8 +14,20 @@
>  static inline bool
>  xchk_should_terminate(
>  	struct xfs_scrub	*sc,
> -	int				*error)
> +	int			*error)
>  {
> +#if !IS_ENABLED(CONFIG_PREEMPT)
> +	/*
> +	 * If preemption is disabled, we need to yield to the scheduler every
> +	 * few seconds so that we don't run afoul of the soft lockup watchdog
> +	 * or RCU stall detector.
> +	 */
> +	if (sc->next_yield == 0 || time_after(jiffies, sc->next_yield)) {
> +		schedule();
> +		sc->next_yield = jiffies + msecs_to_jiffies(5000);
> +	}
> +#endif

This looks weird.  Can't we just do a cond_resched() here?


* Re: [PATCH 2/2] xfs: periodically yield scrub threads to the scheduler
  2019-11-06 14:44   ` Christoph Hellwig
@ 2019-11-06 16:13     ` Darrick J. Wong
  0 siblings, 0 replies; 6+ messages in thread
From: Darrick J. Wong @ 2019-11-06 16:13 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Christoph Hellwig, linux-xfs

On Wed, Nov 06, 2019 at 03:44:46PM +0100, Christoph Hellwig wrote:
> On Tue, Nov 05, 2019 at 08:43:06PM -0800, Darrick J. Wong wrote:
> > +++ b/fs/xfs/scrub/common.h
> > @@ -14,8 +14,20 @@
> >  static inline bool
> >  xchk_should_terminate(
> >  	struct xfs_scrub	*sc,
> > -	int				*error)
> > +	int			*error)
> >  {
> > +#if !IS_ENABLED(CONFIG_PREEMPT)
> > +	/*
> > +	 * If preemption is disabled, we need to yield to the scheduler every
> > +	 * few seconds so that we don't run afoul of the soft lockup watchdog
> > +	 * or RCU stall detector.
> > +	 */
> > +	if (sc->next_yield == 0 || time_after(jiffies, sc->next_yield)) {
> > +		schedule();
> > +		sc->next_yield = jiffies + msecs_to_jiffies(5000);
> > +	}
> > +#endif
> 
> This looks weird.  Can't we just do a cond_resched() here?

DOH.  Yes, probably.  Dave even suggested it a few nights ago to fix a
similar problem and apparently I forgot.  Will fix. :(
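For context, the direction Christoph is suggesting can be modeled in
userspace.  cond_resched_model() and fake_fatal_signal are stand-ins
for the kernel primitives, and this is a sketch of the suggested shape,
not necessarily the code that was eventually merged:

```c
#include <stdbool.h>

static int resched_calls;	/* counts cond_resched() invocations */
static bool fake_fatal_signal;	/* models fatal_signal_pending(current) */

/* In the kernel, cond_resched() yields only if a reschedule is due and
 * compiles to (almost) nothing on preemptible kernels; here we just
 * count calls so the behavior is observable. */
static void cond_resched_model(void)
{
	resched_calls++;
}

/* xchk_should_terminate() with the open-coded jiffies throttle replaced
 * by an unconditional cond_resched(), per the review suggestion: no
 * per-scrub deadline state is needed at all. */
static bool xchk_should_terminate_model(int *error)
{
	cond_resched_model();
	if (fake_fatal_signal) {
		if (*error == 0)
			*error = -11;	/* -EAGAIN on Linux */
		return true;
	}
	return false;
}
```

This is why the suggestion simplifies the patch: the scheduler already
tracks when a voluntary yield is needed, so the helper doesn't have to
reimplement the timing with jiffies and a next_yield field.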

--D

