linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Cc: neilb@suse.de, akpm@linux-foundation.org, davem@davemloft.net,
	christopher.leech@intel.com, shannon.nelson@intel.com,
	herbert@gondor.apana.org.au, jeff@garzik.org
Subject: [md-accel PATCH 11/19] md: handle_stripe5 - add request/completion logic for async check ops
Date: Tue, 26 Jun 2007 18:51:30 -0700	[thread overview]
Message-ID: <20070627015130.18962.39545.stgit@dwillia2-linux.ch.intel.com> (raw)
In-Reply-To: <20070627014823.18962.96398.stgit@dwillia2-linux.ch.intel.com>

Check operations are scheduled when the array is being resynced or an
explicit 'check/repair' command was sent to the array.  Previously check
operations would destroy the parity block in the cache such that even if
parity turned out to be correct the parity block would be marked
!R5_UPTODATE at the completion of the check.  When the operation can be
carried out by a dma engine the assumption is that it can check parity as a
read-only operation.  If raid5_run_ops notices that the check was handled
by hardware it will preserve the R5_UPTODATE status of the parity disk.

When a check operation determines that the parity needs to be repaired we
reuse the existing compute block infrastructure to carry out the operation.
Repair operations imply an immediate write back of the data, so to
differentiate a repair from a normal compute operation the
STRIPE_OP_MOD_REPAIR_PD flag is added.

Changelog:
* remove test_and_set/test_and_clear BUG_ONs, Neil Brown

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-By: NeilBrown <neilb@suse.de>
---

 drivers/md/raid5.c |   84 ++++++++++++++++++++++++++++++++++++++++------------
 1 files changed, 65 insertions(+), 19 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 38b8167..89d3890 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2464,26 +2464,67 @@ static void handle_parity_checks5(raid5_conf_t *conf, struct stripe_head *sh,
 				struct stripe_head_state *s, int disks)
 {
 	set_bit(STRIPE_HANDLE, &sh->state);
-	if (s->failed == 0) {
-		BUG_ON(s->uptodate != disks);
-		compute_parity5(sh, CHECK_PARITY);
-		s->uptodate--;
-		if (page_is_zero(sh->dev[sh->pd_idx].page)) {
-			/* parity is correct (on disc, not in buffer any more)
-			 */
-			set_bit(STRIPE_INSYNC, &sh->state);
-		} else {
-			conf->mddev->resync_mismatches += STRIPE_SECTORS;
-			if (test_bit(MD_RECOVERY_CHECK, &conf->mddev->recovery))
-				/* don't try to repair!! */
+	/* Take one of the following actions:
+	 * 1/ start a check parity operation if (uptodate == disks)
+	 * 2/ finish a check parity operation and act on the result
+	 * 3/ skip to the writeback section if we previously
+	 *    initiated a recovery operation
+	 */
+	if (s->failed == 0 &&
+	    !test_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending)) {
+		if (!test_and_set_bit(STRIPE_OP_CHECK, &sh->ops.pending)) {
+			BUG_ON(s->uptodate != disks);
+			clear_bit(R5_UPTODATE, &sh->dev[sh->pd_idx].flags);
+			sh->ops.count++;
+			s->uptodate--;
+		} else if (
+		       test_and_clear_bit(STRIPE_OP_CHECK, &sh->ops.complete)) {
+			clear_bit(STRIPE_OP_CHECK, &sh->ops.ack);
+			clear_bit(STRIPE_OP_CHECK, &sh->ops.pending);
+
+			if (sh->ops.zero_sum_result == 0)
+				/* parity is correct (on disc,
+				 * not in buffer any more)
+				 */
 				set_bit(STRIPE_INSYNC, &sh->state);
 			else {
-				compute_block(sh, sh->pd_idx);
-				s->uptodate++;
+				conf->mddev->resync_mismatches +=
+					STRIPE_SECTORS;
+				if (test_bit(
+				     MD_RECOVERY_CHECK, &conf->mddev->recovery))
+					/* don't try to repair!! */
+					set_bit(STRIPE_INSYNC, &sh->state);
+				else {
+					set_bit(STRIPE_OP_COMPUTE_BLK,
+						&sh->ops.pending);
+					set_bit(STRIPE_OP_MOD_REPAIR_PD,
+						&sh->ops.pending);
+					set_bit(R5_Wantcompute,
+						&sh->dev[sh->pd_idx].flags);
+					sh->ops.target = sh->pd_idx;
+					sh->ops.count++;
+					s->uptodate++;
+				}
 			}
 		}
 	}
-	if (!test_bit(STRIPE_INSYNC, &sh->state)) {
+
+	/* check if we can clear a parity disk reconstruct */
+	if (test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.complete) &&
+		test_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending)) {
+
+		clear_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending);
+		clear_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.complete);
+		clear_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.ack);
+		clear_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending);
+	}
+
+	/* Wait for check parity and compute block operations to complete
+	 * before write-back
+	 */
+	if (!test_bit(STRIPE_INSYNC, &sh->state) &&
+		!test_bit(STRIPE_OP_CHECK, &sh->ops.pending) &&
+		!test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending)) {
 		struct r5dev *dev;
 		/* either failed parity check, or recovery is happening */
 		if (s->failed == 0)
@@ -2848,12 +2889,17 @@ static void handle_stripe5(struct stripe_head *sh)
 		handle_issuing_new_write_requests5(conf, sh, &s, disks);
 
 	/* maybe we need to check and possibly fix the parity for this stripe
-	 * Any reads will already have been scheduled, so we just see if enough data
-	 * is available
+	 * Any reads will already have been scheduled, so we just see if enough
+	 * data is available.  The parity check is held off while parity
+	 * dependent operations are in flight.
 	 */
-	if (s.syncing && s.locked == 0 &&
-	    !test_bit(STRIPE_INSYNC, &sh->state))
+	if ((s.syncing && s.locked == 0 &&
+	     !test_bit(STRIPE_OP_COMPUTE_BLK, &sh->ops.pending) &&
+	     !test_bit(STRIPE_INSYNC, &sh->state)) ||
+	      test_bit(STRIPE_OP_CHECK, &sh->ops.pending) ||
+	      test_bit(STRIPE_OP_MOD_REPAIR_PD, &sh->ops.pending))
 		handle_parity_checks5(conf, sh, &s, disks);
+
 	if (s.syncing && s.locked == 0 && test_bit(STRIPE_INSYNC, &sh->state)) {
 		md_done_sync(conf->mddev, STRIPE_SECTORS,1);
 		clear_bit(STRIPE_SYNCING, &sh->state);

  parent reply	other threads:[~2007-06-27  1:55 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-06-27  1:50 [md-accel PATCH 00/19] md raid acceleration and the async_tx api Dan Williams
2007-06-27  1:50 ` [md-accel PATCH 01/19] dmaengine: refactor dmaengine around dma_async_tx_descriptor Dan Williams
2007-06-27  1:50 ` [md-accel PATCH 02/19] dmaengine: make clients responsible for managing channels Dan Williams
2007-06-27  1:50 ` [md-accel PATCH 03/19] xor: make 'xor_blocks' a library routine for use with async_tx Dan Williams
2007-06-27  6:39   ` Satyam Sharma
2007-06-27 16:13     ` Dan Williams
2007-06-27 16:22       ` Herbert Xu
2007-06-27  1:50 ` [md-accel PATCH 04/19] async_tx: add the async_tx api Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 05/19] raid5: refactor handle_stripe5 and handle_stripe6 (v2) Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 06/19] raid5: replace custom debug PRINTKs with standard pr_debug Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 07/19] md: raid5_run_ops - run stripe operations outside sh->lock Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 08/19] md: common infrastructure for running operations with raid5_run_ops Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 09/19] md: handle_stripe5 - add request/completion logic for async write ops Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 10/19] md: handle_stripe5 - add request/completion logic for async compute ops Dan Williams
2007-06-27  1:51 ` Dan Williams [this message]
2007-06-27  1:51 ` [md-accel PATCH 12/19] md: handle_stripe5 - add request/completion logic for async read ops Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 13/19] md: handle_stripe5 - add request/completion logic for async expand ops Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 14/19] md: handle_stripe5 - request io processing in raid5_run_ops Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 15/19] md: remove raid5 compute_block and compute_parity5 Dan Williams
2007-06-27  1:51 ` [md-accel PATCH 16/19] dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines Dan Williams
2007-08-27 13:11   ` saeed bishara
2007-08-27 13:14     ` saeed bishara
2007-08-27 19:31       ` Williams, Dan J
2007-08-30 18:43         ` saeed bishara
2007-08-30 20:41           ` Dan Williams
2007-06-27  1:52 ` [md-accel PATCH 17/19] iop13xx: surface the iop13xx adma units to the iop-adma driver Dan Williams
2007-06-27  1:52 ` [md-accel PATCH 18/19] iop3xx: surface the iop3xx DMA and AAU " Dan Williams
2007-06-27  1:52 ` [md-accel PATCH 19/19] ARM: Add drivers/dma to arch/arm/Kconfig Dan Williams
2007-06-27 16:45 ` [md-accel PATCH 00/19] md raid acceleration and the async_tx api Bill Davidsen
2007-06-27 17:09   ` Williams, Dan J

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070627015130.18962.39545.stgit@dwillia2-linux.ch.intel.com \
    --to=dan.j.williams@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=christopher.leech@intel.com \
    --cc=davem@davemloft.net \
    --cc=herbert@gondor.apana.org.au \
    --cc=jeff@garzik.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=shannon.nelson@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).