* [PATCH 0/5] dm raid: deadlock/corruptor fixes
From: Heinz Mauelshagen @ 2018-09-05 17:36 UTC
  To: heinzm, dm-devel

This series of dm-raid patches fixes the following critical issues:
- a race causing hangs on tiny devices
- a reshape deadlock / potential data corruption
- a superblock update problem when rebuilding individual component devices

In addition, the less critical patches 4 and 5 remove duplicated
code for deciphering the synchronization action and avoid creating
a bitmap when a raid4/5/6 journal device is used.

Fixes pass all lvm2 raid tests.

Heinz Mauelshagen (5):
  dm-raid: fix reshape race on small devices
  dm-raid: fix stripe adding reshape deadlock
  dm-raid: correct explicit superblock update requests
  dm-raid: share decipher_sync_action
  dm-raid: disable bitmap when journaled

 Documentation/device-mapper/dm-raid.txt |   2 +
 drivers/md/dm-raid.c                    | 177 +++++++++---------------
 2 files changed, 71 insertions(+), 108 deletions(-)

-- 
2.17.1

* [PATCH 1/5] dm-raid: fix reshape race on small devices
From: Heinz Mauelshagen @ 2018-09-05 17:36 UTC
  To: heinzm, dm-devel

This race does not occur with usual raid device sizes but
with small ones (e.g. those created by the lvm2 test suite).

Race scenario:

When a new mapping table is loaded, the dm-raid target's constructor
retrieves the volatile reshaping state from the raid superblocks.

When the new table is activated in a subsequent resume, the actual
reshape position is retrieved.  On small and/or fast devices, the
reshape driven by the previous mapping may already have finished by
then and updated the raid superblocks with the new raid layout.

This leaves the actual array state (e.g. a finished stripe size
reshape) inconsistent with the state in the new mapping, causing
hangs with left-behind devices.

Fix by keeping the array frozen until the reloaded table is resumed.

While at it, add/fix comments.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
---
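Note (illustration only, not part of the patch): the fix described
above can be pictured with a small userspace model.  All toy_* names
and the flag value are hypothetical stand-ins, not kernel definitions.
The sketch assumes only what the commit message states: the
constructor leaves the array frozen, and resume lifts the freeze once
the reloaded table is live.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a recovery flag bit; the value is illustrative. */
enum { TOY_RECOVERY_FROZEN = 1u << 0 };

struct toy_array {
	unsigned int recovery;	/* models the recovery flags word */
	bool resumed;		/* true once the reloaded table is live */
};

/* Constructor: the table is loaded, but the array stays frozen. */
static void toy_ctr(struct toy_array *a)
{
	a->recovery = TOY_RECOVERY_FROZEN;
	a->resumed = false;
}

/* A sync/reshape worker may only make progress while not frozen. */
static bool toy_sync_may_run(const struct toy_array *a)
{
	return !(a->recovery & TOY_RECOVERY_FROZEN);
}

/* Resume: the new mapping is active, so the freeze can be lifted. */
static void toy_resume(struct toy_array *a)
{
	a->resumed = true;
	a->recovery &= ~TOY_RECOVERY_FROZEN;
}
```

In this model no reshape progress is possible between toy_ctr() and
toy_resume(), which is the window in which the race was described.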
 Documentation/device-mapper/dm-raid.txt |  1 +
 drivers/md/dm-raid.c                    | 58 +++----------------------
 2 files changed, 8 insertions(+), 51 deletions(-)

diff --git a/Documentation/device-mapper/dm-raid.txt b/Documentation/device-mapper/dm-raid.txt
index 390c145f01d7..f68d06d6f28b 100644
--- a/Documentation/device-mapper/dm-raid.txt
+++ b/Documentation/device-mapper/dm-raid.txt
@@ -348,3 +348,4 @@ Version History
 1.13.1  Fix deadlock caused by early md_stop_writes().  Also fix size an
 	state races.
 1.13.2  Fix raid redundancy validation and avoid keeping raid set frozen
+1.13.3  Fix reshape race on small devices
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index cae689de75fd..ecb7706f7330 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -1,6 +1,6 @@
 /*
  * Copyright (C) 2010-2011 Neil Brown
- * Copyright (C) 2010-2017 Red Hat, Inc. All rights reserved.
+ * Copyright (C) 2010-2018 Red Hat, Inc. All rights reserved.
  *
  * This file is released under the GPL.
  */
@@ -29,9 +29,6 @@
  */
 #define	MIN_RAID456_JOURNAL_SPACE (4*2048)
 
-/* Global list of all raid sets */
-static LIST_HEAD(raid_sets);
-
 static bool devices_handle_discard_safely = false;
 
 /*
@@ -227,7 +224,6 @@ struct rs_layout {
 
 struct raid_set {
 	struct dm_target *ti;
-	struct list_head list;
 
 	uint32_t stripe_cache_entries;
 	unsigned long ctr_flags;
@@ -273,19 +269,6 @@ static void rs_config_restore(struct raid_set *rs, struct rs_layout *l)
 	mddev->new_chunk_sectors = l->new_chunk_sectors;
 }
 
-/* Find any raid_set in active slot for @rs on global list */
-static struct raid_set *rs_find_active(struct raid_set *rs)
-{
-	struct raid_set *r;
-	struct mapped_device *md = dm_table_get_md(rs->ti->table);
-
-	list_for_each_entry(r, &raid_sets, list)
-		if (r != rs && dm_table_get_md(r->ti->table) == md)
-			return r;
-
-	return NULL;
-}
-
 /* raid10 algorithms (i.e. formats) */
 #define	ALGORITHM_RAID10_DEFAULT	0
 #define	ALGORITHM_RAID10_NEAR		1
@@ -764,7 +747,6 @@ static struct raid_set *raid_set_alloc(struct dm_target *ti, struct raid_type *r
 
 	mddev_init(&rs->md);
 
-	INIT_LIST_HEAD(&rs->list);
 	rs->raid_disks = raid_devs;
 	rs->delta_disks = 0;
 
@@ -782,9 +764,6 @@ static struct raid_set *raid_set_alloc(struct dm_target *ti, struct raid_type *r
 	for (i = 0; i < raid_devs; i++)
 		md_rdev_init(&rs->dev[i].rdev);
 
-	/* Add @rs to global list. */
-	list_add(&rs->list, &raid_sets);
-
 	/*
 	 * Remaining items to be initialized by further RAID params:
 	 *  rs->md.persistent
@@ -797,7 +776,7 @@ static struct raid_set *raid_set_alloc(struct dm_target *ti, struct raid_type *r
 	return rs;
 }
 
-/* Free all @rs allocations and remove it from global list. */
+/* Free all @rs allocations */
 static void raid_set_free(struct raid_set *rs)
 {
 	int i;
@@ -815,8 +794,6 @@ static void raid_set_free(struct raid_set *rs)
 			dm_put_device(rs->ti, rs->dev[i].data_dev);
 	}
 
-	list_del(&rs->list);
-
 	kfree(rs);
 }
 
@@ -2649,7 +2626,7 @@ static int rs_adjust_data_offsets(struct raid_set *rs)
 		return 0;
 	}
 
-	/* HM FIXME: get InSync raid_dev? */
+	/* HM FIXME: get In_Sync raid_dev? */
 	rdev = &rs->dev[0].rdev;
 
 	if (rs->delta_disks < 0) {
@@ -3242,6 +3219,8 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	/* Start raid set read-only and assumed clean to change in raid_resume() */
 	rs->md.ro = 1;
 	rs->md.in_sync = 1;
+
+	/* Keep array frozen */
 	set_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);
 
 	/* Has to be held on running the array */
@@ -3265,7 +3244,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	rs->callbacks.congested_fn = raid_is_congested;
 	dm_table_add_target_callbacks(ti->table, &rs->callbacks);
 
-	/* If raid4/5/6 journal mode explictely requested (only possible with journal dev) -> set it */
+	/* If raid4/5/6 journal mode explicitly requested (only possible with journal dev) -> set it */
 	if (test_bit(__CTR_FLAG_JOURNAL_MODE, &rs->ctr_flags)) {
 		r = r5c_journal_mode_set(&rs->md, rs->journal_dev.mode);
 		if (r) {
@@ -3947,29 +3926,6 @@ static int raid_preresume(struct dm_target *ti)
 	if (test_and_set_bit(RT_FLAG_RS_PRERESUMED, &rs->runtime_flags))
 		return 0;
 
-	if (!test_bit(__CTR_FLAG_REBUILD, &rs->ctr_flags)) {
-		struct raid_set *rs_active = rs_find_active(rs);
-
-		if (rs_active) {
-			/*
-			 * In case no rebuilds have been requested
-			 * and an active table slot exists, copy
-			 * current resynchonization completed and
-			 * reshape position pointers across from
-			 * suspended raid set in the active slot.
-			 *
-			 * This resumes the new mapping at current
-			 * offsets to continue recover/reshape without
-			 * necessarily redoing a raid set partially or
-			 * causing data corruption in case of a reshape.
-			 */
-			if (rs_active->md.curr_resync_completed != MaxSector)
-				mddev->curr_resync_completed = rs_active->md.curr_resync_completed;
-			if (rs_active->md.reshape_position != MaxSector)
-				mddev->reshape_position = rs_active->md.reshape_position;
-		}
-	}
-
 	/*
 	 * The superblocks need to be updated on disk if the
 	 * array is new or new devices got added (thus zeroed
@@ -4046,7 +4002,7 @@ static void raid_resume(struct dm_target *ti)
 
 static struct target_type raid_target = {
 	.name = "raid",
-	.version = {1, 13, 2},
+	.version = {1, 13, 3},
 	.module = THIS_MODULE,
 	.ctr = raid_ctr,
 	.dtr = raid_dtr,
-- 
2.17.1

* [PATCH 2/5] dm-raid: fix stripe adding reshape deadlock
From: Heinz Mauelshagen @ 2018-09-05 17:36 UTC
  To: heinzm, dm-devel

When initiating a stripe adding reshape, a deadlock occurs between
md_stop_writes() waiting for the sync thread to stop and the
running sync thread waiting for inactive stripes.  This happens
frequently on single-core systems but rarely on multi-core ones.

Resolve by setting MD_RECOVERY_WAIT to request the main MD
resynchronization thread worker function md_do_sync() to bail
out when initiating the reshape via constructor arguments.
Don't set the flag when reloading without those arguments, and
avoid the superfluous mddev_{suspend,resume} calls when setting
up the reshape.

Passes all lvm2 raid tests.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
---
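Note (illustration only, not part of the patch): a minimal userspace
sketch of the "set a wait flag so the worker bails out" idea from the
commit message.  The toy_* names and the flag value are hypothetical.
How and when the flag is cleared again is deliberately not modeled.

```c
/* Hypothetical stand-in for a recovery flag bit; the value is illustrative. */
enum { TOY_RECOVERY_WAIT = 1u << 0 };

struct toy_mddev {
	unsigned int recovery;	/* models the recovery flags word */
};

/* Models the start of reshape setup: request the worker to hold off first. */
static void toy_start_reshape(struct toy_mddev *m)
{
	m->recovery |= TOY_RECOVERY_WAIT;
	/* the actual reshape setup would follow here */
}

/*
 * Models the sync worker: returns the number of stripes it handled,
 * or 0 if it bailed out because the wait flag was set.
 */
static unsigned int toy_do_sync(const struct toy_mddev *m, unsigned int stripes)
{
	unsigned int done = 0;

	if (m->recovery & TOY_RECOVERY_WAIT)
		return 0;	/* bail out before touching any stripe */

	while (done < stripes)
		done++;

	return done;
}
```

Because the worker returns immediately in this model, nothing is left
running for a later stop request to wait on.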
 Documentation/device-mapper/dm-raid.txt |  1 +
 drivers/md/dm-raid.c                    | 13 ++++---------
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/Documentation/device-mapper/dm-raid.txt b/Documentation/device-mapper/dm-raid.txt
index f68d06d6f28b..efb73f521568 100644
--- a/Documentation/device-mapper/dm-raid.txt
+++ b/Documentation/device-mapper/dm-raid.txt
@@ -349,3 +349,4 @@ Version History
 	state races.
 1.13.2  Fix raid redundancy validation and avoid keeping raid set frozen
 1.13.3  Fix reshape race on small devices
+1.14.0  Fix stripe adding reshape deadlock/potential data corruption
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index ecb7706f7330..03dd915eff9e 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3871,14 +3871,13 @@ static int rs_start_reshape(struct raid_set *rs)
 	struct mddev *mddev = &rs->md;
 	struct md_personality *pers = mddev->pers;
 
+	/* Don't allow the sync thread to work until the table gets reloaded. */
+	set_bit(MD_RECOVERY_WAIT, &mddev->recovery);
+
 	r = rs_setup_reshape(rs);
 	if (r)
 		return r;
 
-	/* Need to be resumed to be able to start reshape, recovery is frozen until raid_resume() though */
-	if (test_and_clear_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags))
-		mddev_resume(mddev);
-
 	/*
 	 * Check any reshape constraints enforced by the personalility
 	 *
@@ -3902,10 +3901,6 @@ static int rs_start_reshape(struct raid_set *rs)
 		}
 	}
 
-	/* Suspend because a resume will happen in raid_resume() */
-	set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags);
-	mddev_suspend(mddev);
-
 	/*
 	 * Now reshape got set up, update superblocks to
 	 * reflect the fact so that a table reload will
@@ -4002,7 +3997,7 @@ static void raid_resume(struct dm_target *ti)
 
 static struct target_type raid_target = {
 	.name = "raid",
-	.version = {1, 13, 3},
+	.version = {1, 14, 0},
 	.module = THIS_MODULE,
 	.ctr = raid_ctr,
 	.dtr = raid_dtr,
-- 
2.17.1

* [PATCH 3/5] dm-raid: correct explicit superblock update requests
From: Heinz Mauelshagen @ 2018-09-05 17:36 UTC
  To: heinzm, dm-devel

Critical:

Request superblock updates when particular devices are requested
to be rebuilt (e.g. via lvconvert --replace ...) to avoid racy
"New device injected into existing raid set without 'delta_disks'
 or 'rebuild' parameter specified" kernel error messages.

Cleanup:

Remove the request to write superblocks in super_load() for new
arrays, because raid_ctr() already requests it via the flag
RT_FLAG_UPDATE_SBS (the update is performed in preresume).

Avoid the explicit superblock update in rs_start_reshape() because
it has already been requested for new reshapes in
rs_prepare_reshape().

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
---
 drivers/md/dm-raid.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 03dd915eff9e..efe035fcb23e 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -2202,9 +2202,6 @@ static int super_load(struct md_rdev *rdev, struct md_rdev *refdev)
 		set_bit(FirstUse, &rdev->flags);
 		sb->compat_features = cpu_to_le32(FEATURE_FLAG_SUPPORTS_V190);
 
-		/* Force writing of superblocks to disk */
-		set_bit(MD_SB_CHANGE_DEVS, &rdev->mddev->sb_flags);
-
 		/* Any superblock is better than none, choose that if given */
 		return refdev ? 0 : 1;
 	}
@@ -3126,6 +3123,11 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		set_bit(RT_FLAG_UPDATE_SBS, &rs->runtime_flags);
 		rs_set_new(rs);
 	} else if (rs_is_recovering(rs)) {
+		/* Rebuild particular devices */
+		if (test_bit(__CTR_FLAG_REBUILD, &rs->ctr_flags)) {
+			set_bit(RT_FLAG_UPDATE_SBS, &rs->runtime_flags);
+			rs_setup_recovery(rs, MaxSector);
+		}
 		/* A recovering raid set may be resized */
 		; /* skip setup rs */
 	} else if (rs_is_reshaping(rs)) {
@@ -3234,7 +3236,6 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	}
 
 	r = md_start(&rs->md);
-
 	if (r) {
 		ti->error = "Failed to start raid array";
 		mddev_unlock(&rs->md);
@@ -3846,7 +3847,7 @@ static int __load_dirty_region_bitmap(struct raid_set *rs)
 	return r;
 }
 
-/* Enforce updating all superblocks */
+/* Enforce updating new superblocks */
 static void rs_update_sbs(struct raid_set *rs)
 {
 	struct mddev *mddev = &rs->md;
@@ -3901,13 +3902,6 @@ static int rs_start_reshape(struct raid_set *rs)
 		}
 	}
 
-	/*
-	 * Now reshape got set up, update superblocks to
-	 * reflect the fact so that a table reload will
-	 * access proper superblock content in the ctr.
-	 */
-	rs_update_sbs(rs);
-
 	return 0;
 }
 
-- 
2.17.1

* [PATCH 4/5] dm-raid: share decipher_sync_action
From: Heinz Mauelshagen @ 2018-09-05 17:36 UTC
  To: heinzm, dm-devel

decipher_sync_action() is only called from raid_status(), so
rs_get_progress() duplicates the tests that determine the current
synchronization action (frozen, check, resync, ...).

Change decipher_sync_action() to return an enum defining the sync
states and share it to avoid the code duplication.  Introduce
sync_str() to translate from enum to string in raid_status().

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
---
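Note (illustration only, not part of the patch): a userspace sketch
of the enum-plus-string-table pattern described above.  The enum and
the strings mirror the diff below; the TOY_* flag bits and
toy_decipher() are hypothetical simplifications that leave out the
DONE/NEEDED/read-only and reshape position checks of the real
function.  As a variation on the patch, the table uses C99 designated
initializers and assert() instead of BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

enum sync_state { st_frozen, st_reshape, st_resync, st_check, st_repair, st_recover, st_idle };

/* Translate a sync state into its status string. */
static const char *sync_str(enum sync_state state)
{
	/* Designated initializers keep each string tied to its enum value. */
	static const char *const sync_strs[] = {
		[st_frozen]  = "frozen",
		[st_reshape] = "reshape",
		[st_resync]  = "resync",
		[st_check]   = "check",
		[st_repair]  = "repair",
		[st_recover] = "recover",
		[st_idle]    = "idle",
	};

	assert((size_t)state < sizeof(sync_strs) / sizeof(sync_strs[0]));

	return sync_strs[state];
}

/* Hypothetical flag bits standing in for the recovery flags. */
enum {
	TOY_FROZEN    = 1u << 0,
	TOY_RUNNING   = 1u << 1,
	TOY_RESHAPE   = 1u << 2,
	TOY_SYNC      = 1u << 3,
	TOY_REQUESTED = 1u << 4,
	TOY_CHECK     = 1u << 5,
	TOY_RECOVER   = 1u << 6,
};

/* Simplified decision tree: frozen wins, then the running action, else idle. */
static enum sync_state toy_decipher(unsigned int flags)
{
	if (flags & TOY_FROZEN)
		return st_frozen;

	if (flags & TOY_RUNNING) {
		if (flags & TOY_RESHAPE)
			return st_reshape;

		if (flags & TOY_SYNC) {
			if (!(flags & TOY_REQUESTED))
				return st_resync;
			if (flags & TOY_CHECK)
				return st_check;
			return st_repair;
		}

		if (flags & TOY_RECOVER)
			return st_recover;
	}

	return st_idle;
}
```

With the state available as an enum, a caller can branch on it
directly (as rs_get_progress() does in the diff) and only convert to
a string where one is reported, e.g. sync_str(toy_decipher(flags)).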
 drivers/md/dm-raid.c | 88 ++++++++++++++++++++++++++------------------
 1 file changed, 52 insertions(+), 36 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index efe035fcb23e..9754ef24fd0b 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3330,32 +3330,55 @@ static int raid_map(struct dm_target *ti, struct bio *bio)
 	return DM_MAPIO_SUBMITTED;
 }
 
-/* Return string describing the current sync action of @mddev */
-static const char *decipher_sync_action(struct mddev *mddev, unsigned long recovery)
+/* Return sync state string for @state */
+enum sync_state { st_frozen, st_reshape, st_resync, st_check, st_repair, st_recover, st_idle };
+static const char *sync_str(enum sync_state state)
+{
+	/* Has to be in above sync_state order! */
+	static const char *sync_strs[] = {
+		"frozen",
+		"reshape",
+		"resync",
+		"check",
+		"repair",
+		"recover",
+		"idle"
+	};
+
+	BUG_ON(!__within_range(state, 0, ARRAY_SIZE(sync_strs) - 1));
+
+	return sync_strs[state];
+};
+
+/* Return enum sync_state for @mddev derived from @recovery flags */
+static const enum sync_state decipher_sync_action(struct mddev *mddev, unsigned long recovery)
 {
 	if (test_bit(MD_RECOVERY_FROZEN, &recovery))
-		return "frozen";
+		return st_frozen;
 
-	/* The MD sync thread can be done with io but still be running */
+	/* The MD sync thread can be done with io or be interrupted but still be running */
 	if (!test_bit(MD_RECOVERY_DONE, &recovery) &&
 	    (test_bit(MD_RECOVERY_RUNNING, &recovery) ||
 	     (!mddev->ro && test_bit(MD_RECOVERY_NEEDED, &recovery)))) {
 		if (test_bit(MD_RECOVERY_RESHAPE, &recovery))
-			return "reshape";
+			return st_reshape;
 
 		if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
 			if (!test_bit(MD_RECOVERY_REQUESTED, &recovery))
-				return "resync";
-			else if (test_bit(MD_RECOVERY_CHECK, &recovery))
-				return "check";
-			return "repair";
+				return st_resync;
+			if (test_bit(MD_RECOVERY_CHECK, &recovery))
+				return st_check;
+			return st_repair;
 		}
 
 		if (test_bit(MD_RECOVERY_RECOVER, &recovery))
-			return "recover";
+			return st_recover;
+
+		if (mddev->reshape_position != MaxSector)
+			return st_reshape;
 	}
 
-	return "idle";
+	return st_idle;
 }
 
 /*
@@ -3389,6 +3412,7 @@ static sector_t rs_get_progress(struct raid_set *rs, unsigned long recovery,
 				sector_t resync_max_sectors)
 {
 	sector_t r;
+	enum sync_state state;
 	struct mddev *mddev = &rs->md;
 
 	clear_bit(RT_FLAG_RS_IN_SYNC, &rs->runtime_flags);
@@ -3399,20 +3423,14 @@ static sector_t rs_get_progress(struct raid_set *rs, unsigned long recovery,
 		set_bit(RT_FLAG_RS_IN_SYNC, &rs->runtime_flags);
 
 	} else {
-		if (!test_bit(__CTR_FLAG_NOSYNC, &rs->ctr_flags) &&
-		    !test_bit(MD_RECOVERY_INTR, &recovery) &&
-		    (test_bit(MD_RECOVERY_NEEDED, &recovery) ||
-		     test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
-		     test_bit(MD_RECOVERY_RUNNING, &recovery)))
-			r = mddev->curr_resync_completed;
-		else
+		state = decipher_sync_action(mddev, recovery);
+
+		if (state == st_idle && !test_bit(MD_RECOVERY_INTR, &recovery))
 			r = mddev->recovery_cp;
+		else
+			r = mddev->curr_resync_completed;
 
-		if (r >= resync_max_sectors &&
-		    (!test_bit(MD_RECOVERY_REQUESTED, &recovery) ||
-		     (!test_bit(MD_RECOVERY_FROZEN, &recovery) &&
-		      !test_bit(MD_RECOVERY_NEEDED, &recovery) &&
-		      !test_bit(MD_RECOVERY_RUNNING, &recovery)))) {
+		if (state == st_idle && r >= resync_max_sectors) {
 			/*
 			 * Sync complete.
 			 */
@@ -3420,24 +3438,20 @@ static sector_t rs_get_progress(struct raid_set *rs, unsigned long recovery,
 			if (test_bit(MD_RECOVERY_RECOVER, &recovery))
 				set_bit(RT_FLAG_RS_IN_SYNC, &rs->runtime_flags);
 
-		} else if (test_bit(MD_RECOVERY_RECOVER, &recovery)) {
+		} else if (state == st_recover)
 			/*
 			 * In case we are recovering, the array is not in sync
 			 * and health chars should show the recovering legs.
 			 */
 			;
-
-		} else if (test_bit(MD_RECOVERY_SYNC, &recovery) &&
-			   !test_bit(MD_RECOVERY_REQUESTED, &recovery)) {
+		else if (state == st_resync)
 			/*
 			 * If "resync" is occurring, the raid set
 			 * is or may be out of sync hence the health
 			 * characters shall be 'a'.
 			 */
 			set_bit(RT_FLAG_RS_RESYNCING, &rs->runtime_flags);
-
-		} else if (test_bit(MD_RECOVERY_RESHAPE, &recovery) &&
-			   !test_bit(MD_RECOVERY_REQUESTED, &recovery)) {
+		else if (state == st_reshape)
 			/*
 			 * If "reshape" is occurring, the raid set
 			 * is or may be out of sync hence the health
@@ -3445,7 +3459,7 @@ static sector_t rs_get_progress(struct raid_set *rs, unsigned long recovery,
 			 */
 			set_bit(RT_FLAG_RS_RESYNCING, &rs->runtime_flags);
 
-		} else if (test_bit(MD_RECOVERY_REQUESTED, &recovery)) {
+		else if (state == st_check || state == st_repair)
 			/*
 			 * If "check" or "repair" is occurring, the raid set has
 			 * undergone an initial sync and the health characters
@@ -3453,12 +3467,12 @@ static sector_t rs_get_progress(struct raid_set *rs, unsigned long recovery,
 			 */
 			set_bit(RT_FLAG_RS_IN_SYNC, &rs->runtime_flags);
 
-		} else {
+		else {
 			struct md_rdev *rdev;
 
 			/*
 			 * We are idle and recovery is needed, prevent 'A' chars race
-			 * caused by components still set to in-sync by constrcuctor.
+			 * caused by components still set to in-sync by constructor.
 			 */
 			if (test_bit(MD_RECOVERY_NEEDED, &recovery))
 				set_bit(RT_FLAG_RS_RESYNCING, &rs->runtime_flags);
@@ -3522,7 +3536,7 @@ static void raid_status(struct dm_target *ti, status_type_t type,
 		progress = rs_get_progress(rs, recovery, resync_max_sectors);
 		resync_mismatches = (mddev->last_sync_action && !strcasecmp(mddev->last_sync_action, "check")) ?
 				    atomic64_read(&mddev->resync_mismatches) : 0;
-		sync_action = decipher_sync_action(&rs->md, recovery);
+		sync_action = sync_str(decipher_sync_action(&rs->md, recovery));
 
 		/* HM FIXME: do we want another state char for raid0? It shows 'D'/'A'/'-' now */
 		for (i = 0; i < rs->raid_disks; i++)
@@ -3739,12 +3753,14 @@ static void raid_postsuspend(struct dm_target *ti)
 	struct raid_set *rs = ti->private;
 
 	if (!test_and_set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) {
+		BUG_ON(rs->md.suspended);
+
 		/* Writes have to be stopped before suspending to avoid deadlocks. */
-		if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
-			md_stop_writes(&rs->md);
+		md_stop_writes(&rs->md);
 
 		mddev_lock_nointr(&rs->md);
 		mddev_suspend(&rs->md);
+		rs->md.ro = 1;
 		mddev_unlock(&rs->md);
 	}
 }
-- 
2.17.1

* [PATCH 5/5] dm-raid: disable bitmap when journaled
From: Heinz Mauelshagen @ 2018-09-05 17:36 UTC
  To: heinzm, dm-devel

When a journal device is used for raid4/5/6, don't create a
write-intent bitmap, because the two are mutually exclusive.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
---
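Note (illustration only, not part of the patch): the one-line change
below boils down to the following selection, sketched in userspace C.
The toy_* names are hypothetical, and the 512-byte sector size (shift
of 9) is an assumption matching the usual device-mapper convention.
An offset of 0 is taken to mean "no bitmap", as the comment next to
the changed line suggests.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_SECTOR_SHIFT 9	/* assumed 512-byte sectors */

/* Convert a byte count into sectors. */
static uint64_t toy_to_sector(uint64_t bytes)
{
	return bytes >> TOY_SECTOR_SHIFT;
}

/*
 * Pick the bitmap offset in sectors: none (0) for raid0 or when a
 * journal device is present, otherwise 4096 bytes into the metadata.
 */
static uint64_t toy_bitmap_offset(bool is_raid0, bool has_journal_dev)
{
	return (is_raid0 || has_journal_dev) ? 0 : toy_to_sector(4096);
}
```

Under these assumptions the non-journaled, non-raid0 case yields an
offset of 8 sectors, and both other cases yield 0.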
 drivers/md/dm-raid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 9754ef24fd0b..1f0e85f6f7e6 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -2472,7 +2472,7 @@ static int super_validate(struct raid_set *rs, struct md_rdev *rdev)
 	}
 
 	/* Enable bitmap creation for RAID levels != 0 */
-	mddev->bitmap_info.offset = rt_is_raid0(rs->raid_type) ? 0 : to_sector(4096);
+	mddev->bitmap_info.offset = (rt_is_raid0(rs->raid_type) || rs->journal_dev.dev) ? 0 : to_sector(4096);
 	mddev->bitmap_info.default_offset = mddev->bitmap_info.offset;
 
 	if (!test_and_clear_bit(FirstUse, &rdev->flags)) {
-- 
2.17.1
