From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org
Subject: [PATCH 10/11] xfs_scrub: fix read verify disk error handling strategy
Date: Thu, 05 Sep 2019 20:38:45 -0700
Message-ID: <156774112526.2645135.10744599143310432497.stgit@magnolia>
In-Reply-To: <156774106064.2645135.2756383874064764589.stgit@magnolia>

From: Darrick J. Wong <darrick.wong@oracle.com>

The error handling strategy for media errors is totally bogus.  First of
all, short reads are entirely unhandled -- when we encounter a short
read, we know the disk was able to feed us the beginning of what we
asked for, so we need to single-step through the remainder to try to
capture the exact error that we hit.
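
As a sketch of that idea only (not the patch code itself, which operates
on the read_verify_pool structures in the diff below; report_media_error()
and all parameter names here are made up for illustration):

    #include <errno.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Hypothetical hook to record a bad range; not part of xfs_scrub. */
    void report_media_error(off_t start, size_t len, int error);

    /*
     * After a short read returned @sz of @len bytes, re-read the rest
     * one @blocksize block at a time to pin down the failing block.
     */
    static void
    single_step_rest(int fd, char *buf, off_t start, size_t len,
                     size_t blocksize, size_t sz)
    {
            while (sz < len) {
                    ssize_t ret = pread(fd, buf, blocksize, start + sz);

                    if (ret < 0) {
                            /* Found a failing block; log it, step past it. */
                            report_media_error(start + sz, blocksize, errno);
                            ret = blocksize;
                    } else if (ret == 0) {
                            break;  /* end of device */
                    }
                    sz += ret;
            }
    }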

Second, an actual IO error causes the entire region to be marked bad even
though the damage could be just a few MB of a multi-gigabyte extent.
Therefore, single-step each block in the IO request until we stop getting
IO errors, so that we learn whether every block in the region is bad or
only part of it.
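
Again only as a sketch with assumed names (io_size, blocksize, and
report_media_error() are illustrative, and @buf is assumed to be at least
1MiB), the per-block fallback inside the read loop works like this:

    #include <errno.h>
    #include <sys/types.h>
    #include <unistd.h>

    void report_media_error(off_t start, size_t len, int error);

    /*
     * Verify [pos, pos + length) by reading it; shrink to single-block
     * IOs on error so only blocks that individually fail are reported.
     */
    static void
    verify_range(int fd, char *buf, off_t pos, size_t length,
                 size_t blocksize)
    {
            size_t io_size = 1U << 20;      /* start with large 1MiB reads */

            while (length > 0) {
                    size_t len = io_size < length ? io_size : length;
                    ssize_t sz = pread(fd, buf, len, pos);

                    if (sz < 0) {
                            if (io_size > blocksize) {
                                    /* error in a big read: single-step it */
                                    io_size = blocksize;
                                    continue;
                            }
                            /* single-stepping: just this block is bad */
                            report_media_error(pos, blocksize, errno);
                            sz = blocksize;
                    } else if (sz == 0) {
                            break;  /* end of device */
                    }
                    pos += sz;
                    length -= sz;
            }
    }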

Third, fix the loop so that it updates its counter variables with the
number of bytes the read actually produced, not with the length that was
passed to read(), since after a short read the two need not match.
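
In other words (pos, remaining, len, and sz are illustrative names), the
fix is the difference between

    /* Before: advance by the requested length even after a short read. */
    pos += len;
    remaining -= len;

and

    /* After: advance by the number of bytes the read actually returned. */
    pos += sz;
    remaining -= sz;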

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 scrub/read_verify.c |   86 ++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 74 insertions(+), 12 deletions(-)


diff --git a/scrub/read_verify.c b/scrub/read_verify.c
index 9d9be68d..3dac10ce 100644
--- a/scrub/read_verify.c
+++ b/scrub/read_verify.c
@@ -169,30 +169,92 @@ read_verify(
 	struct read_verify		*rv = arg;
 	struct read_verify_pool		*rvp;
 	unsigned long long		verified = 0;
+	ssize_t				io_max_size;
 	ssize_t				sz;
 	ssize_t				len;
+	int				io_error;
 	int				ret;
 
 	rvp = (struct read_verify_pool *)wq->wq_ctx;
+	if (rvp->errors_seen)
+		return;
+
+	io_max_size = RVP_IO_MAX_SIZE;
+
 	while (rv->io_length > 0) {
-		len = min(rv->io_length, RVP_IO_MAX_SIZE);
+		io_error = 0;
+		len = min(rv->io_length, io_max_size);
 		dbg_printf("diskverify %d %"PRIu64" %zu\n", rvp->disk->d_fd,
 				rv->io_start, len);
 		sz = disk_read_verify(rvp->disk, rvp->readbuf, rv->io_start,
 				len);
-		if (sz < 0) {
-			dbg_printf("IOERR %d %"PRIu64" %zu\n",
-					rvp->disk->d_fd, rv->io_start, len);
-			/* IO error, so try the next logical block. */
-			len = rvp->miniosz;
-			rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start, len,
-					errno, rv->io_end_arg);
+		if (sz == len && io_max_size < rvp->miniosz) {
+			/*
+			 * If the verify request was 100% successful and less
+			 * than a single block in length, we were trying to
+			 * read to the end of a block after a short read.  That
+			 * suggests there's something funny with this device,
+			 * so single-step our way through the rest of the @rv
+			 * range.
+			 */
+			io_max_size = rvp->miniosz;
+		} else if (sz < 0) {
+			io_error = errno;
+
+			/* Runtime error, bail out... */
+			if (io_error != EIO && io_error != EILSEQ) {
+				rvp->errors_seen = io_error;
+				return;
+			}
+
+			/*
+			 * A direct read encountered an error while performing
+			 * a multi-block read.  Reduce the transfer size to a
+			 * single block so that we can identify the exact range
+			 * of bad blocks and good blocks.  We single-step all
+			 * the way to the end of the @rv range, (re)starting
+			 * with the block that just failed.
+			 */
+			if (io_max_size > rvp->miniosz) {
+				io_max_size = rvp->miniosz;
+				continue;
+			}
+
+			/*
+			 * A direct read hit an error while we were stepping
+			 * through single blocks.  Mark everything bad from
+			 * io_start to the next miniosz block.
+			 */
+			sz = rvp->miniosz - (rv->io_start % rvp->miniosz);
+			dbg_printf("IOERR %d @ %"PRIu64" %zu err %d\n",
+					rvp->disk->d_fd, rv->io_start, sz,
+					io_error);
+			rvp->ioerr_fn(rvp->ctx, rvp->disk, rv->io_start, sz,
+					io_error, rv->io_end_arg);
+		} else if (sz < len) {
+			/*
+			 * A short direct read suggests that we might have hit
+			 * an IO error midway through the read but still had to
+			 * return the number of bytes that were actually read.
+			 *
+			 * We need to force an EIO, so try reading the rest of
+			 * the block (if it was a partial block read) or the
+			 * next full block.
+			 */
+			io_max_size = rvp->miniosz - (sz % rvp->miniosz);
+			dbg_printf("SHORT %d READ @ %"PRIu64" %zu try for %zd\n",
+					rvp->disk->d_fd, rv->io_start, sz,
+					io_max_size);
+		} else {
+			/* We should never get back more bytes than we asked for. */
+			assert(sz == len);
 		}
 
-		progress_add(len);
-		verified += len;
-		rv->io_start += len;
-		rv->io_length -= len;
+		progress_add(sz);
+		if (io_error == 0)
+			verified += sz;
+		rv->io_start += sz;
+		rv->io_length -= sz;
 	}
 
 	free(rv);

