* [dm-devel] [PATCH -next v2 00/28] md: synchronize io with array reconfiguration
@ 2023-08-28  1:59 ` Yu Kuai
  0 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Changes in v2:
 - rebase with latest md-next
 - remove some follow-up cleanup patches; they will be sent later, after
   this patchset.

After the previous four patchsets of preparatory work, this patchset
implements a new version of mddev_suspend(). With the new APIs:
 - reconfig_mutex is not required;
 - the odd logic in which the array is suspended while holding
   'reconfig_mutex', relying on mddev_check_recovery() to update the
   superblock, is no longer needed;
 - the special handling for raid456, 'pers->prepare_suspend', is no longer
   needed;
 - they are safe to call at any time once the mddev is allocated, and are
   designed for the slow path, where the array configuration is changed.

The new API is then used to replace:

mddev_lock
mddev_suspend or not
// array reconfiguration
mddev_resume or not
mddev_unlock

With:

mddev_suspend
mddev_lock
// array reconfiguration
mddev_unlock
mddev_resume

However, the above change is not possible for raid5 and md-cluster in
some corner cases; there, mddev_suspend/resume() is replaced with the
quiesce() callback, which suspends the array as well.
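
The new ordering (suspend, lock, reconfigure, unlock, resume) can be
sketched in plain userspace C. This is only a toy model, not kernel code:
'struct toy_mddev' and all toy_*() helpers are hypothetical names invented
for illustration, and assert() stands in for the lockdep checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct mddev; NOT the kernel structure. */
struct toy_mddev {
	int  suspended;        /* nesting depth of suspend calls */
	bool reconfig_locked;  /* models reconfig_mutex being held */
	int  config;           /* some array setting being changed */
};

/* New-style suspend: must be called before the lock is taken. */
static void toy_suspend(struct toy_mddev *m)
{
	assert(!m->reconfig_locked);
	m->suspended++;
}

/* New-style resume: must be called after the lock is dropped. */
static void toy_resume(struct toy_mddev *m)
{
	assert(!m->reconfig_locked);
	assert(m->suspended > 0);
	m->suspended--;
}

static void toy_lock(struct toy_mddev *m)
{
	assert(!m->reconfig_locked);
	m->reconfig_locked = true;
}

static void toy_unlock(struct toy_mddev *m)
{
	assert(m->reconfig_locked);
	m->reconfig_locked = false;
}

/* suspend -> lock -> reconfigure -> unlock -> resume */
static void toy_reconfigure(struct toy_mddev *m, int new_config)
{
	toy_suspend(m);
	toy_lock(m);
	/* io is stopped and the lock is held: safe to change config */
	assert(m->suspended > 0 && m->reconfig_locked);
	m->config = new_config;
	toy_unlock(m);
	toy_resume(m);
}
```

Calling toy_reconfigure() on a zero-initialized 'struct toy_mddev' leaves
it unsuspended and unlocked, with only 'config' changed.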

This patchset was tested in my VM with the mdadm test suite on loop
devices, except for the 10ddf tests (they always failed even before this
patchset).

Many cleanups will follow after this patchset.

Yu Kuai (28):
  md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
  md: use 'mddev->suspended' for is_md_suspended()
  md: add new helpers to suspend/resume array
  md: add new helpers to suspend/resume and lock/unlock array
  md: use new apis to suspend array for suspend_lo/hi_store()
  md: use new apis to suspend array for level_store()
  md: use new apis to suspend array for serialize_policy_store()
  md/dm-raid: use new apis to suspend array
  md/md-bitmap: use new apis to suspend array for location_store()
  md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
  md/raid5-cache: use new apis to suspend array for
    r5c_disable_writeback_async()
  md/raid5-cache: use new apis to suspend array for
    r5c_journal_mode_store()
  md/raid5: use new apis to suspend array for raid5_store_stripe_size()
  md/raid5: use new apis to suspend array for raid5_store_skip_copy()
  md/raid5: use new apis to suspend array for
    raid5_store_group_thread_cnt()
  md/raid5: use new apis to suspend array for
    raid5_change_consistency_policy()
  md/raid5: replace suspend with quiesce() callback
  md: quiesce before md_kick_rdev_from_array() for md-cluster
  md: use new apis to suspend array for ioctls involved array
    reconfiguration
  md: use new apis to suspend array for adding/removing rdev from
    state_store()
  md: use new apis to suspend array for bind_rdev_to_array()
  md: use new apis to suspend array related to serial pool in
    state_store()
  md: use new apis to suspend array in backlog_store()
  md: suspend array in md_start_sync() if array need reconfiguration
  md: cleanup mddev_create/destroy_serial_pool()
  md/md-linear: cleanup linear_add()
  md: remove old apis to suspend the array
  md: rename __mddev_suspend/resume() back to mddev_suspend/resume()

 drivers/md/dm-raid.c       |   8 +-
 drivers/md/md-autodetect.c |   4 +-
 drivers/md/md-bitmap.c     |  18 ++-
 drivers/md/md-linear.c     |   2 -
 drivers/md/md.c            | 250 ++++++++++++++++++++++---------------
 drivers/md/md.h            |  52 ++++++--
 drivers/md/raid5-cache.c   |  61 +++++----
 drivers/md/raid5.c         |  56 ++++-----
 8 files changed, 253 insertions(+), 198 deletions(-)

-- 
2.39.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel



* [dm-devel] [PATCH -next v2 01/28] md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  1:59   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Reads of 'suspend_lo' and 'suspend_hi' from md_handle_request() are not
protected by any lock, so use READ_ONCE/WRITE_ONCE to prevent reading an
abnormal value.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 46badd13a687..9d8dff9d923c 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -359,11 +359,11 @@ static bool is_suspended(struct mddev *mddev, struct bio *bio)
 		return true;
 	if (bio_data_dir(bio) != WRITE)
 		return false;
-	if (mddev->suspend_lo >= mddev->suspend_hi)
+	if (READ_ONCE(mddev->suspend_lo) >= READ_ONCE(mddev->suspend_hi))
 		return false;
-	if (bio->bi_iter.bi_sector >= mddev->suspend_hi)
+	if (bio->bi_iter.bi_sector >= READ_ONCE(mddev->suspend_hi))
 		return false;
-	if (bio_end_sector(bio) < mddev->suspend_lo)
+	if (bio_end_sector(bio) < READ_ONCE(mddev->suspend_lo))
 		return false;
 	return true;
 }
@@ -5171,7 +5171,8 @@ __ATTR(sync_max, S_IRUGO|S_IWUSR, max_sync_show, max_sync_store);
 static ssize_t
 suspend_lo_show(struct mddev *mddev, char *page)
 {
-	return sprintf(page, "%llu\n", (unsigned long long)mddev->suspend_lo);
+	return sprintf(page, "%llu\n",
+		       (unsigned long long)READ_ONCE(mddev->suspend_lo));
 }
 
 static ssize_t
@@ -5191,7 +5192,7 @@ suspend_lo_store(struct mddev *mddev, const char *buf, size_t len)
 		return err;
 
 	mddev_suspend(mddev);
-	mddev->suspend_lo = new;
+	WRITE_ONCE(mddev->suspend_lo, new);
 	mddev_resume(mddev);
 
 	mddev_unlock(mddev);
@@ -5203,7 +5204,8 @@ __ATTR(suspend_lo, S_IRUGO|S_IWUSR, suspend_lo_show, suspend_lo_store);
 static ssize_t
 suspend_hi_show(struct mddev *mddev, char *page)
 {
-	return sprintf(page, "%llu\n", (unsigned long long)mddev->suspend_hi);
+	return sprintf(page, "%llu\n",
+		       (unsigned long long)READ_ONCE(mddev->suspend_hi));
 }
 
 static ssize_t
@@ -5223,7 +5225,7 @@ suspend_hi_store(struct mddev *mddev, const char *buf, size_t len)
 		return err;
 
 	mddev_suspend(mddev);
-	mddev->suspend_hi = new;
+	WRITE_ONCE(mddev->suspend_hi, new);
 	mddev_resume(mddev);
 
 	mddev_unlock(mddev);
-- 
2.39.2
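
As a rough userspace illustration of the range check touched by the patch
above: the helpers below approximate READ_ONCE/WRITE_ONCE with volatile
accesses. All names here (toy_sector_t, struct toy_range, read_once(),
write_once(), toy_write_is_suspended()) are hypothetical. Unlike the
patch, this sketch snapshots each bound once before comparing; that is a
variation chosen for the example, not what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_sector_t;

struct toy_range {
	toy_sector_t suspend_lo;
	toy_sector_t suspend_hi;
};

/* Volatile load: the compiler must perform exactly one read. */
static toy_sector_t read_once(const toy_sector_t *p)
{
	return *(const volatile toy_sector_t *)p;
}

/* Volatile store: the compiler must perform exactly one write. */
static void write_once(toy_sector_t *p, toy_sector_t v)
{
	*(volatile toy_sector_t *)p = v;
}

/*
 * Same comparisons as the write-range test in is_suspended():
 * a write covering sectors [start, end) is held back only if it
 * overlaps a non-empty [suspend_lo, suspend_hi) window.
 */
static bool toy_write_is_suspended(const struct toy_range *r,
				   toy_sector_t start, toy_sector_t end)
{
	toy_sector_t lo = read_once(&r->suspend_lo);
	toy_sector_t hi = read_once(&r->suspend_hi);

	if (lo >= hi)
		return false;	/* empty window: nothing is suspended */
	if (start >= hi)
		return false;	/* write begins after the window */
	if (end < lo)
		return false;	/* write ends before the window */
	return true;
}
```

The writer side would update the bounds with write_once(), mirroring the
WRITE_ONCE() calls in suspend_lo_store() and suspend_hi_store().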


* [dm-devel] [PATCH -next v2 02/28] md: use 'mddev->suspended' for is_md_suspended()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  1:59   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

'pers->prepare_suspend' was introduced to prevent a deadlock for raid456.
This change prepares to clean it up in later patches while refactoring
mddev_suspend(). Specifically, it allows reshape to make progress while
waiting for 'active_io' to drop to 0.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 2 +-
 drivers/md/md.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 9d8dff9d923c..7fa311a14317 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -355,7 +355,7 @@ static DEFINE_SPINLOCK(all_mddevs_lock);
  */
 static bool is_suspended(struct mddev *mddev, struct bio *bio)
 {
-	if (is_md_suspended(mddev))
+	if (is_md_suspended(mddev) || percpu_ref_is_dying(&mddev->active_io))
 		return true;
 	if (bio_data_dir(bio) != WRITE)
 		return false;
diff --git a/drivers/md/md.h b/drivers/md/md.h
index b628c292506e..fb3b123f16dd 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -584,7 +584,7 @@ static inline bool md_is_rdwr(struct mddev *mddev)
 
 static inline bool is_md_suspended(struct mddev *mddev)
 {
-	return percpu_ref_is_dying(&mddev->active_io);
+	return READ_ONCE(mddev->suspended);
 }
 
 static inline int __must_check mddev_lock(struct mddev *mddev)
-- 
2.39.2


* [dm-devel] [PATCH -next v2 03/28] md: add new helpers to suspend/resume array
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  1:59   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Advantages of the new APIs:
 - reconfig_mutex is not required;
 - the odd logic in which the array is suspended while holding
   'reconfig_mutex', relying on mddev_check_recovery() to update the
   superblock, is no longer needed;
 - the special handling for raid456, 'pers->prepare_suspend', is no longer
   needed;
 - they are safe to call at any time once the mddev is allocated, and are
   designed for the slow path, where the array configuration is changed.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++--
 drivers/md/md.h |  3 ++
 2 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7fa311a14317..6236e2e395c1 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -443,12 +443,22 @@ void mddev_suspend(struct mddev *mddev)
 			lockdep_is_held(&mddev->reconfig_mutex));
 
 	WARN_ON_ONCE(thread && current == thread->tsk);
-	if (mddev->suspended++)
+
+	/* can't run concurrently with __mddev_suspend() and __mddev_resume() */
+	mutex_lock(&mddev->suspend_mutex);
+	if (mddev->suspended++) {
+		mutex_unlock(&mddev->suspend_mutex);
 		return;
+	}
+
 	wake_up(&mddev->sb_wait);
 	set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
 	percpu_ref_kill(&mddev->active_io);
 
+	/*
+	 * TODO: cleanup 'pers->prepare_suspend' after all callers are replaced
+	 * by __mddev_suspend().
+	 */
 	if (mddev->pers && mddev->pers->prepare_suspend)
 		mddev->pers->prepare_suspend(mddev);
 
@@ -459,14 +469,21 @@ void mddev_suspend(struct mddev *mddev)
 	del_timer_sync(&mddev->safemode_timer);
 	/* restrict memory reclaim I/O during raid array is suspend */
 	mddev->noio_flag = memalloc_noio_save();
+
+	mutex_unlock(&mddev->suspend_mutex);
 }
 EXPORT_SYMBOL_GPL(mddev_suspend);
 
 void mddev_resume(struct mddev *mddev)
 {
 	lockdep_assert_held(&mddev->reconfig_mutex);
-	if (--mddev->suspended)
+
+	/* can't run concurrently with __mddev_suspend() and __mddev_resume() */
+	mutex_lock(&mddev->suspend_mutex);
+	if (--mddev->suspended) {
+		mutex_unlock(&mddev->suspend_mutex);
 		return;
+	}
 
 	/* entred the memalloc scope from mddev_suspend() */
 	memalloc_noio_restore(mddev->noio_flag);
@@ -477,9 +494,72 @@ void mddev_resume(struct mddev *mddev)
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 	md_wakeup_thread(mddev->thread);
 	md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
+
+	mutex_unlock(&mddev->suspend_mutex);
 }
 EXPORT_SYMBOL_GPL(mddev_resume);
 
+void __mddev_suspend(struct mddev *mddev)
+{
+
+	/*
+	 * Waiting for normal io while holding reconfig_mutex can deadlock:
+	 * normal io may depend on a super_block update, and no other context
+	 * can update the super_block while reconfig_mutex is held.
+	 */
+	lockdep_assert_not_held(&mddev->reconfig_mutex);
+
+	mutex_lock(&mddev->suspend_mutex);
+
+	if (mddev->suspended) {
+		WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
+		mutex_unlock(&mddev->suspend_mutex);
+		return;
+	}
+
+	percpu_ref_kill(&mddev->active_io);
+	wait_event(mddev->sb_wait, percpu_ref_is_zero(&mddev->active_io));
+
+	/*
+	 * For raid456, io might be waiting for reshape to make progress,
+	 * allow new reshape to start while waiting for io to be done to
+	 * prevent deadlock.
+	 */
+	WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
+
+	del_timer_sync(&mddev->safemode_timer);
+	/* restrict memory reclaim I/O while the raid array is suspended */
+	mddev->noio_flag = memalloc_noio_save();
+
+	mutex_unlock(&mddev->suspend_mutex);
+}
+EXPORT_SYMBOL_GPL(__mddev_suspend);
+
+void __mddev_resume(struct mddev *mddev)
+{
+	lockdep_assert_not_held(&mddev->reconfig_mutex);
+
+	mutex_lock(&mddev->suspend_mutex);
+	WRITE_ONCE(mddev->suspended, mddev->suspended - 1);
+	if (mddev->suspended) {
+		mutex_unlock(&mddev->suspend_mutex);
+		return;
+	}
+
+	/* entered the memalloc scope from __mddev_suspend() */
+	memalloc_noio_restore(mddev->noio_flag);
+
+	percpu_ref_resurrect(&mddev->active_io);
+	wake_up(&mddev->sb_wait);
+
+	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	md_wakeup_thread(mddev->thread);
+	md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
+
+	mutex_unlock(&mddev->suspend_mutex);
+}
+EXPORT_SYMBOL_GPL(__mddev_resume);
+
 /*
  * Generic flush handling for md
  */
@@ -667,6 +747,7 @@ int mddev_init(struct mddev *mddev)
 	mutex_init(&mddev->open_mutex);
 	mutex_init(&mddev->reconfig_mutex);
 	mutex_init(&mddev->sync_mutex);
+	mutex_init(&mddev->suspend_mutex);
 	mutex_init(&mddev->bitmap_info.mutex);
 	INIT_LIST_HEAD(&mddev->disks);
 	INIT_LIST_HEAD(&mddev->all_mddevs);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index fb3b123f16dd..1103e6b08ad9 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -316,6 +316,7 @@ struct mddev {
 	unsigned long			sb_flags;
 
 	int				suspended;
+	struct mutex			suspend_mutex;
 	struct percpu_ref		active_io;
 	int				ro;
 	int				sysfs_active; /* set when sysfs deletes
@@ -811,6 +812,8 @@ extern void md_rdev_clear(struct md_rdev *rdev);
 extern void md_handle_request(struct mddev *mddev, struct bio *bio);
 extern void mddev_suspend(struct mddev *mddev);
 extern void mddev_resume(struct mddev *mddev);
+extern void __mddev_suspend(struct mddev *mddev);
+extern void __mddev_resume(struct mddev *mddev);
 
 extern void md_reload_sb(struct mddev *mddev, int raid_disk);
 extern void md_update_sb(struct mddev *mddev, int force);
-- 
2.39.2
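
The nesting behaviour of __mddev_suspend()/__mddev_resume() in the patch
above can be modelled in userspace as a simple counter: only the first
suspend stops and drains io, and only the last resume re-enables it. This
is a sketch under stated assumptions; 'struct toy_array' and the
toy_array_*() helpers are hypothetical, and plain fields stand in for
percpu_ref_kill(), the wait on 'active_io', and percpu_ref_resurrect().

```c
#include <assert.h>

struct toy_array {
	int suspended;   /* nesting depth, like mddev->suspended */
	int io_enabled;  /* 1 while new io may enter the array */
	int drains;      /* how many times in-flight io was drained */
};

static void toy_array_init(struct toy_array *a)
{
	a->suspended = 0;
	a->io_enabled = 1;
	a->drains = 0;
}

static void toy_array_suspend(struct toy_array *a)
{
	if (a->suspended == 0) {
		a->io_enabled = 0;  /* stands in for percpu_ref_kill() */
		a->drains++;        /* stands in for waiting on active_io */
	}
	a->suspended++;
}

static void toy_array_resume(struct toy_array *a)
{
	assert(a->suspended > 0);
	if (--a->suspended == 0)
		a->io_enabled = 1;  /* stands in for percpu_ref_resurrect() */
}
```

The real helpers additionally serialize on 'suspend_mutex'; the model
omits locking because it is single-threaded.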


* [dm-devel] [PATCH -next v2 04/28] md: add new helpers to suspend/resume and lock/unlock array
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  1:59   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

The new helpers suspend the array first and then lock it.

Prepare to refactor from:

mddev_lock/trylock/lock_nointr
mddev_suspend
...
mddev_resume
mddev_unlock

To:

mddev_suspend_and_lock/trylock/lock_nointr
...
mddev_unlock_and_resume

After all the use cases are refactored, mddev_suspend/resume() will be
removed.

mddev_suspend_and_lock() will also replace mddev_lock() in cases where
the array will be reconfigured, in order to synchronize with io and
prevent problems in many corner cases.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.h | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 1103e6b08ad9..07496179084a 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -858,6 +858,42 @@ static inline void mddev_check_write_zeroes(struct mddev *mddev, struct bio *bio
 		mddev->queue->limits.max_write_zeroes_sectors = 0;
 }
 
+static inline int mddev_suspend_and_lock(struct mddev *mddev)
+{
+	int ret;
+
+	__mddev_suspend(mddev);
+	ret = mddev_lock(mddev);
+	if (ret)
+		__mddev_resume(mddev);
+
+	return ret;
+}
+
+static inline void mddev_suspend_and_lock_nointr(struct mddev *mddev)
+{
+	__mddev_suspend(mddev);
+	mutex_lock(&mddev->reconfig_mutex);
+}
+
+static inline int mddev_suspend_and_trylock(struct mddev *mddev)
+{
+	int ret;
+
+	__mddev_suspend(mddev);
+	ret = mutex_trylock(&mddev->reconfig_mutex);
+	if (!ret)
+		__mddev_resume(mddev);
+
+	return ret;
+}
+
+static inline void mddev_unlock_and_resume(struct mddev *mddev)
+{
+	mddev_unlock(mddev);
+	__mddev_resume(mddev);
+}
+
 struct mdu_array_info_s;
 struct mdu_disk_info_s;
 
-- 
2.39.2
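
The suspend-then-trylock pattern from the patch above can be sketched in
userspace as follows. Note that the kernel's mutex_trylock() returns 1
when the lock is acquired and 0 on contention, so the suspend has to be
undone when the return value is 0. Everything here is a toy model:
'struct toy_dev', toy_trylock() and the other toy_*() helpers are
hypothetical, and toy_trylock() merely copies that return convention.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_dev {
	int  suspended;  /* nesting depth of suspend calls */
	bool locked;     /* models reconfig_mutex being held */
};

/* Returns 1 if the lock was acquired, 0 if it was already held. */
static int toy_trylock(struct toy_dev *d)
{
	if (d->locked)
		return 0;
	d->locked = true;
	return 1;
}

/* Suspend first, then try to lock; undo the suspend on failure. */
static int toy_suspend_and_trylock(struct toy_dev *d)
{
	int ret;

	d->suspended++;
	ret = toy_trylock(d);
	if (!ret)
		d->suspended--;  /* lock not taken: resume again */

	return ret;
}

/* Mirror image of the helper above: unlock first, then resume. */
static void toy_unlock_and_resume(struct toy_dev *d)
{
	assert(d->locked && d->suspended > 0);
	d->locked = false;
	d->suspended--;
}
```

A caller that gets 0 back must not call toy_unlock_and_resume(), since
neither the lock nor the suspend is held at that point.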


* [PATCH -next v2 04/28] md: add new helpers to suspend/resume and lock/unlock array
@ 2023-08-28  1:59   ` Yu Kuai
  0 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

The new helpers suspend the array first and then lock the array,

Prepare to refactor from:

mddev_lock/trylock/lock_nointr
mddev_suspend
...
mddev_resuem
mddev_lock

With:

mddev_suspend_and_lock/trylock/lock_nointr
...
mddev_unlock_and_resume

After all the use cases is refactored, mddev_suspend/resume() will be
removed.

And mddev_suspend_and_lock() will also replace mddev_lock() for the case
that the array will be reconfigured, in order to synchronize with io to
prevent problems in many corner cases.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.h | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 1103e6b08ad9..07496179084a 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -858,6 +858,42 @@ static inline void mddev_check_write_zeroes(struct mddev *mddev, struct bio *bio
 		mddev->queue->limits.max_write_zeroes_sectors = 0;
 }
 
+static inline int mddev_suspend_and_lock(struct mddev *mddev)
+{
+	int ret;
+
+	__mddev_suspend(mddev);
+	ret = mddev_lock(mddev);
+	if (ret)
+		__mddev_resume(mddev);
+
+	return ret;
+}
+
+static inline void mddev_suspend_and_lock_nointr(struct mddev *mddev)
+{
+	__mddev_suspend(mddev);
+	mutex_lock(&mddev->reconfig_mutex);
+}
+
+static inline int mddev_suspend_and_trylock(struct mddev *mddev)
+{
+	int ret;
+
+	__mddev_suspend(mddev);
+	ret = mutex_trylock(&mddev->reconfig_mutex);
+	if (!ret)
+		__mddev_resume(mddev);
+
+	return ret;
+}
+
+static inline void mddev_unlock_and_resume(struct mddev *mddev)
+{
+	mddev_unlock(mddev);
+	__mddev_resume(mddev);
+}
+
 struct mdu_array_info_s;
 struct mdu_disk_info_s;
 
-- 
2.39.2



* [dm-devel] [PATCH -next v2 05/28] md: use new apis to suspend array for suspend_lo/hi_store()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  1:59   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

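Convert to the new apis; the old apis will be removed eventually. The new
apis do not require 'reconfig_mutex', so the mddev_lock()/mddev_unlock()
pair around the update is dropped as well.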

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 6236e2e395c1..84d077110174 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5268,15 +5268,10 @@ suspend_lo_store(struct mddev *mddev, const char *buf, size_t len)
 	if (new != (sector_t)new)
 		return -EINVAL;
 
-	err = mddev_lock(mddev);
-	if (err)
-		return err;
-
-	mddev_suspend(mddev);
+	__mddev_suspend(mddev);
 	WRITE_ONCE(mddev->suspend_lo, new);
-	mddev_resume(mddev);
+	__mddev_resume(mddev);
 
-	mddev_unlock(mddev);
 	return len;
 }
 static struct md_sysfs_entry md_suspend_lo =
@@ -5301,15 +5296,10 @@ suspend_hi_store(struct mddev *mddev, const char *buf, size_t len)
 	if (new != (sector_t)new)
 		return -EINVAL;
 
-	err = mddev_lock(mddev);
-	if (err)
-		return err;
-
-	mddev_suspend(mddev);
+	__mddev_suspend(mddev);
 	WRITE_ONCE(mddev->suspend_hi, new);
-	mddev_resume(mddev);
+	__mddev_resume(mddev);
 
-	mddev_unlock(mddev);
 	return len;
 }
 static struct md_sysfs_entry md_suspend_hi =
-- 
2.39.2


* [PATCH -next v2 05/28] md: use new apis to suspend array for suspend_lo/hi_store()
@ 2023-08-28  1:59   ` Yu Kuai
  0 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

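Convert to the new apis; the old apis will be removed eventually. The new
apis do not require 'reconfig_mutex', so the mddev_lock()/mddev_unlock()
pair around the update is dropped as well.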

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 6236e2e395c1..84d077110174 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5268,15 +5268,10 @@ suspend_lo_store(struct mddev *mddev, const char *buf, size_t len)
 	if (new != (sector_t)new)
 		return -EINVAL;
 
-	err = mddev_lock(mddev);
-	if (err)
-		return err;
-
-	mddev_suspend(mddev);
+	__mddev_suspend(mddev);
 	WRITE_ONCE(mddev->suspend_lo, new);
-	mddev_resume(mddev);
+	__mddev_resume(mddev);
 
-	mddev_unlock(mddev);
 	return len;
 }
 static struct md_sysfs_entry md_suspend_lo =
@@ -5301,15 +5296,10 @@ suspend_hi_store(struct mddev *mddev, const char *buf, size_t len)
 	if (new != (sector_t)new)
 		return -EINVAL;
 
-	err = mddev_lock(mddev);
-	if (err)
-		return err;
-
-	mddev_suspend(mddev);
+	__mddev_suspend(mddev);
 	WRITE_ONCE(mddev->suspend_hi, new);
-	mddev_resume(mddev);
+	__mddev_resume(mddev);
 
-	mddev_unlock(mddev);
 	return len;
 }
 static struct md_sysfs_entry md_suspend_hi =
-- 
2.39.2



* [dm-devel] [PATCH -next v2 06/28] md: use new apis to suspend array for level_store()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  1:59   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new apis; the old apis will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 84d077110174..03b83b56fe16 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3997,7 +3997,7 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
 	if (slen == 0 || slen >= sizeof(clevel))
 		return -EINVAL;
 
-	rv = mddev_lock(mddev);
+	rv = mddev_suspend_and_lock(mddev);
 	if (rv)
 		return rv;
 
@@ -4090,7 +4090,6 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
 	}
 
 	/* Looks like we have a winner */
-	mddev_suspend(mddev);
 	mddev_detach(mddev);
 
 	spin_lock(&mddev->lock);
@@ -4176,14 +4175,13 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
 	blk_set_stacking_limits(&mddev->queue->limits);
 	pers->run(mddev);
 	set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags);
-	mddev_resume(mddev);
 	if (!mddev->thread)
 		md_update_sb(mddev, 1);
 	sysfs_notify_dirent_safe(mddev->sysfs_level);
 	md_new_event();
 	rv = len;
 out_unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return rv;
 }
 
-- 
2.39.2


* [PATCH -next v2 06/28] md: use new apis to suspend array for level_store()
@ 2023-08-28  1:59   ` Yu Kuai
  0 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  1:59 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new apis; the old apis will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 84d077110174..03b83b56fe16 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3997,7 +3997,7 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
 	if (slen == 0 || slen >= sizeof(clevel))
 		return -EINVAL;
 
-	rv = mddev_lock(mddev);
+	rv = mddev_suspend_and_lock(mddev);
 	if (rv)
 		return rv;
 
@@ -4090,7 +4090,6 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
 	}
 
 	/* Looks like we have a winner */
-	mddev_suspend(mddev);
 	mddev_detach(mddev);
 
 	spin_lock(&mddev->lock);
@@ -4176,14 +4175,13 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
 	blk_set_stacking_limits(&mddev->queue->limits);
 	pers->run(mddev);
 	set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags);
-	mddev_resume(mddev);
 	if (!mddev->thread)
 		md_update_sb(mddev, 1);
 	sysfs_notify_dirent_safe(mddev->sysfs_level);
 	md_new_event();
 	rv = len;
 out_unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return rv;
 }
 
-- 
2.39.2



* [dm-devel] [PATCH -next v2 07/28] md: use new apis to suspend array for serialize_policy_store()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new apis; the old apis will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 03b83b56fe16..a3bc4968fa0f 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5545,7 +5545,7 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
 	if (value == mddev->serialize_policy)
 		return len;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	if (mddev->pers == NULL || (mddev->pers->level != 1)) {
@@ -5554,15 +5554,13 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
 		goto unlock;
 	}
 
-	mddev_suspend(mddev);
 	if (value)
 		mddev_create_serial_pool(mddev, NULL, true);
 	else
 		mddev_destroy_serial_pool(mddev, NULL, true);
 	mddev->serialize_policy = value;
-	mddev_resume(mddev);
 unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return err ?: len;
 }
 
-- 
2.39.2


* [PATCH -next v2 07/28] md: use new apis to suspend array for serialize_policy_store()
@ 2023-08-28  2:00   ` Yu Kuai
  0 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new apis; the old apis will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 03b83b56fe16..a3bc4968fa0f 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5545,7 +5545,7 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
 	if (value == mddev->serialize_policy)
 		return len;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	if (mddev->pers == NULL || (mddev->pers->level != 1)) {
@@ -5554,15 +5554,13 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
 		goto unlock;
 	}
 
-	mddev_suspend(mddev);
 	if (value)
 		mddev_create_serial_pool(mddev, NULL, true);
 	else
 		mddev_destroy_serial_pool(mddev, NULL, true);
 	mddev->serialize_policy = value;
-	mddev_resume(mddev);
 unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return err ?: len;
 }
 
-- 
2.39.2



* [dm-devel] [PATCH -next v2 08/28] md/dm-raid: use new apis to suspend array
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new apis; the old apis will be removed eventually.

These are not hot paths, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/dm-raid.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 020598c10db0..2ff33b5d9a1b 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3244,7 +3244,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	set_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);
 
 	/* Has to be held on running the array */
-	mddev_lock_nointr(&rs->md);
+	mddev_suspend_and_lock_nointr(&rs->md);
 	r = md_run(&rs->md);
 	rs->md.in_sync = 0; /* Assume already marked dirty */
 	if (r) {
@@ -3270,7 +3270,6 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		}
 	}
 
-	mddev_suspend(&rs->md);
 	set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags);
 
 	/* Try to adjust the raid4/5/6 stripe cache size to the stripe size */
@@ -3800,9 +3799,7 @@ static void raid_postsuspend(struct dm_target *ti)
 		if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
 			md_stop_writes(&rs->md);
 
-		mddev_lock_nointr(&rs->md);
-		mddev_suspend(&rs->md);
-		mddev_unlock(&rs->md);
+		__mddev_suspend(&rs->md);
 	}
 }
 
@@ -4014,7 +4011,7 @@ static int raid_preresume(struct dm_target *ti)
 	}
 
 	/* Check for any resize/reshape on @rs and adjust/initiate */
-	/* Be prepared for mddev_resume() in raid_resume() */
+	/* Be prepared for __mddev_resume() in raid_resume() */
 	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	if (mddev->recovery_cp && mddev->recovery_cp < MaxSector) {
 		set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
@@ -4061,8 +4058,7 @@ static void raid_resume(struct dm_target *ti)
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		mddev->ro = 0;
 		mddev->in_sync = 0;
-		mddev_resume(mddev);
-		mddev_unlock(mddev);
+		mddev_unlock_and_resume(mddev);
 	}
 }
 
-- 
2.39.2


* [PATCH -next v2 08/28] md/dm-raid: use new apis to suspend array
@ 2023-08-28  2:00   ` Yu Kuai
  0 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new apis; the old apis will be removed eventually.

These are not hot paths, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/dm-raid.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 020598c10db0..2ff33b5d9a1b 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3244,7 +3244,7 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	set_bit(MD_RECOVERY_FROZEN, &rs->md.recovery);
 
 	/* Has to be held on running the array */
-	mddev_lock_nointr(&rs->md);
+	mddev_suspend_and_lock_nointr(&rs->md);
 	r = md_run(&rs->md);
 	rs->md.in_sync = 0; /* Assume already marked dirty */
 	if (r) {
@@ -3270,7 +3270,6 @@ static int raid_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 		}
 	}
 
-	mddev_suspend(&rs->md);
 	set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags);
 
 	/* Try to adjust the raid4/5/6 stripe cache size to the stripe size */
@@ -3800,9 +3799,7 @@ static void raid_postsuspend(struct dm_target *ti)
 		if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
 			md_stop_writes(&rs->md);
 
-		mddev_lock_nointr(&rs->md);
-		mddev_suspend(&rs->md);
-		mddev_unlock(&rs->md);
+		__mddev_suspend(&rs->md);
 	}
 }
 
@@ -4014,7 +4011,7 @@ static int raid_preresume(struct dm_target *ti)
 	}
 
 	/* Check for any resize/reshape on @rs and adjust/initiate */
-	/* Be prepared for mddev_resume() in raid_resume() */
+	/* Be prepared for __mddev_resume() in raid_resume() */
 	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	if (mddev->recovery_cp && mddev->recovery_cp < MaxSector) {
 		set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
@@ -4061,8 +4058,7 @@ static void raid_resume(struct dm_target *ti)
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		mddev->ro = 0;
 		mddev->in_sync = 0;
-		mddev_resume(mddev);
-		mddev_unlock(mddev);
+		mddev_unlock_and_resume(mddev);
 	}
 }
 
-- 
2.39.2



* [dm-devel] [PATCH -next v2 09/28] md/md-bitmap: use new apis to suspend array for location_store()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new apis; the old apis will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md-bitmap.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 0c661e5036bb..7d21e2a5b06e 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2348,11 +2348,10 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
 {
 	int rv;
 
-	rv = mddev_lock(mddev);
+	rv = mddev_suspend_and_lock(mddev);
 	if (rv)
 		return rv;
 
-	mddev_suspend(mddev);
 	if (mddev->pers) {
 		if (mddev->recovery || mddev->sync_thread) {
 			rv = -EBUSY;
@@ -2429,8 +2428,7 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
 	}
 	rv = 0;
 out:
-	mddev_resume(mddev);
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	if (rv)
 		return rv;
 	return len;
-- 
2.39.2


* [PATCH -next v2 09/28] md/md-bitmap: use new apis to suspend array for location_store()
@ 2023-08-28  2:00   ` Yu Kuai
  0 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new apis; the old apis will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md-bitmap.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 0c661e5036bb..7d21e2a5b06e 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2348,11 +2348,10 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
 {
 	int rv;
 
-	rv = mddev_lock(mddev);
+	rv = mddev_suspend_and_lock(mddev);
 	if (rv)
 		return rv;
 
-	mddev_suspend(mddev);
 	if (mddev->pers) {
 		if (mddev->recovery || mddev->sync_thread) {
 			rv = -EBUSY;
@@ -2429,8 +2428,7 @@ location_store(struct mddev *mddev, const char *buf, size_t len)
 	}
 	rv = 0;
 out:
-	mddev_resume(mddev);
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	if (rv)
 		return rv;
 	return len;
-- 
2.39.2



* [PATCH -next v2 10/28] md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

'conf->log' is set with 'reconfig_mutex' held; however, readers are not
protected. Use READ_ONCE/WRITE_ONCE so that readers do not observe an
abnormal value.
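
The reader-side pattern used throughout this patch (load the pointer once
into a local, then use only the local) can be sketched in userspace. This
is an illustration under assumptions, not the kernel implementation:
'struct model_conf', 'struct model_log' and the model_*() functions are
hypothetical, and C11 relaxed atomics are used as a rough analogue of
READ_ONCE/WRITE_ONCE rather than an exact equivalent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct r5l_log and struct r5conf. */
struct model_log {
	int journal_mode;
};

struct model_conf {
	_Atomic(struct model_log *) log;
};

/* Writer side: publish or clear the pointer with a single store. */
static void model_set_log(struct model_conf *conf, struct model_log *log)
{
	atomic_store_explicit(&conf->log, log, memory_order_relaxed);
}

/*
 * Reader side: snapshot the pointer once. The NULL check and the
 * dereference both use the local copy, so a concurrent store between
 * the two cannot turn a checked pointer into NULL.
 */
static int model_journal_mode(struct model_conf *conf)
{
	struct model_log *log =
		atomic_load_explicit(&conf->log, memory_order_relaxed);

	if (!log)
		return -1;	/* no log configured */
	return log->journal_mode;
}

/* Exercises both states; returns 0 when the checks pass. */
int model_log_demo(void)
{
	struct model_log log = { .journal_mode = 1 };
	struct model_conf conf;

	atomic_init(&conf.log, NULL);
	assert(model_journal_mode(&conf) == -1);

	model_set_log(&conf, &log);
	assert(model_journal_mode(&conf) == 1);

	model_set_log(&conf, NULL);
	assert(model_journal_mode(&conf) == -1);
	return 0;
}
```

Note that the snapshot only guarantees a consistent pointer value. It does
not keep the pointed-to object alive, so the lifetime of the log still has
to be guaranteed by some other mechanism.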

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid5-cache.c | 47 +++++++++++++++++++++-------------------
 1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 518b7cfa78b9..889bba60d6ff 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -327,8 +327,9 @@ void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
 void r5c_check_stripe_cache_usage(struct r5conf *conf)
 {
 	int total_cached;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
-	if (!r5c_is_writeback(conf->log))
+	if (!r5c_is_writeback(log))
 		return;
 
 	total_cached = atomic_read(&conf->r5c_cached_partial_stripes) +
@@ -344,7 +345,7 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
 	 */
 	if (total_cached > conf->min_nr_stripes * 1 / 2 ||
 	    atomic_read(&conf->empty_inactive_list_nr) > 0)
-		r5l_wake_reclaim(conf->log, 0);
+		r5l_wake_reclaim(log, 0);
 }
 
 /*
@@ -353,7 +354,9 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
  */
 void r5c_check_cached_full_stripe(struct r5conf *conf)
 {
-	if (!r5c_is_writeback(conf->log))
+	struct r5l_log *log = READ_ONCE(conf->log);
+
+	if (!r5c_is_writeback(log))
 		return;
 
 	/*
@@ -363,7 +366,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
 	if (atomic_read(&conf->r5c_cached_full_stripes) >=
 	    min(R5C_FULL_STRIPE_FLUSH_BATCH(conf),
 		conf->chunk_sectors >> RAID5_STRIPE_SHIFT(conf)))
-		r5l_wake_reclaim(conf->log, 0);
+		r5l_wake_reclaim(log, 0);
 }
 
 /*
@@ -396,7 +399,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
  */
 static sector_t r5c_log_required_to_flush_cache(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	if (!r5c_is_writeback(log))
 		return 0;
@@ -449,7 +452,7 @@ static inline void r5c_update_log_state(struct r5l_log *log)
 void r5c_make_stripe_write_out(struct stripe_head *sh)
 {
 	struct r5conf *conf = sh->raid_conf;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	BUG_ON(!r5c_is_writeback(log));
 
@@ -491,7 +494,7 @@ static void r5c_handle_parity_cached(struct stripe_head *sh)
  */
 static void r5c_finish_cache_stripe(struct stripe_head *sh)
 {
-	struct r5l_log *log = sh->raid_conf->log;
+	struct r5l_log *log = READ_ONCE(sh->raid_conf->log);
 
 	if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH) {
 		BUG_ON(test_bit(STRIPE_R5C_CACHING, &sh->state));
@@ -692,7 +695,7 @@ static void r5c_disable_writeback_async(struct work_struct *work)
 
 	/* wait superblock change before suspend */
 	wait_event(mddev->sb_wait,
-		   conf->log == NULL ||
+		   !READ_ONCE(conf->log) ||
 		   (!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) &&
 		    (locked = mddev_trylock(mddev))));
 	if (locked) {
@@ -1151,7 +1154,7 @@ static void r5l_run_no_space_stripes(struct r5l_log *log)
 static sector_t r5c_calculate_new_cp(struct r5conf *conf)
 {
 	struct stripe_head *sh;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	sector_t new_cp;
 	unsigned long flags;
 
@@ -1159,12 +1162,12 @@ static sector_t r5c_calculate_new_cp(struct r5conf *conf)
 		return log->next_checkpoint;
 
 	spin_lock_irqsave(&log->stripe_in_journal_lock, flags);
-	if (list_empty(&conf->log->stripe_in_journal_list)) {
+	if (list_empty(&log->stripe_in_journal_list)) {
 		/* all stripes flushed */
 		spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
 		return log->next_checkpoint;
 	}
-	sh = list_first_entry(&conf->log->stripe_in_journal_list,
+	sh = list_first_entry(&log->stripe_in_journal_list,
 			      struct stripe_head, r5c);
 	new_cp = sh->log_start;
 	spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
@@ -1399,7 +1402,7 @@ void r5c_flush_cache(struct r5conf *conf, int num)
 	struct stripe_head *sh, *next;
 
 	lockdep_assert_held(&conf->device_lock);
-	if (!conf->log)
+	if (!READ_ONCE(conf->log))
 		return;
 
 	count = 0;
@@ -1420,7 +1423,7 @@ void r5c_flush_cache(struct r5conf *conf, int num)
 
 static void r5c_do_reclaim(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	struct stripe_head *sh;
 	int count = 0;
 	unsigned long flags;
@@ -1549,7 +1552,7 @@ static void r5l_reclaim_thread(struct md_thread *thread)
 {
 	struct mddev *mddev = thread->mddev;
 	struct r5conf *conf = mddev->private;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	if (!log)
 		return;
@@ -1591,7 +1594,7 @@ void r5l_quiesce(struct r5l_log *log, int quiesce)
 
 bool r5l_log_disk_error(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	/* don't allow write if journal disk is missing */
 	if (!log)
@@ -2635,7 +2638,7 @@ int r5c_try_caching_write(struct r5conf *conf,
 			  struct stripe_head_state *s,
 			  int disks)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	int i;
 	struct r5dev *dev;
 	int to_cache = 0;
@@ -2802,7 +2805,7 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 				 struct stripe_head *sh,
 				 struct stripe_head_state *s)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	int i;
 	int do_wakeup = 0;
 	sector_t tree_index;
@@ -2941,7 +2944,7 @@ int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh)
 /* check whether this big stripe is in write back cache. */
 bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	sector_t tree_index;
 	void *slot;
 
@@ -3049,14 +3052,14 @@ int r5l_start(struct r5l_log *log)
 void r5c_update_on_rdev_error(struct mddev *mddev, struct md_rdev *rdev)
 {
 	struct r5conf *conf = mddev->private;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	if (!log)
 		return;
 
 	if ((raid5_calc_degraded(conf) > 0 ||
 	     test_bit(Journal, &rdev->flags)) &&
-	    conf->log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
+	    log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
 		schedule_work(&log->disable_writeback_work);
 }
 
@@ -3145,7 +3148,7 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
 	spin_lock_init(&log->stripe_in_journal_lock);
 	atomic_set(&log->stripe_in_journal_count, 0);
 
-	conf->log = log;
+	WRITE_ONCE(conf->log, log);
 
 	set_bit(MD_HAS_JOURNAL, &conf->mddev->flags);
 	return 0;
@@ -3173,7 +3176,7 @@ void r5l_exit_log(struct r5conf *conf)
 	 * 'reconfig_mutex' is held by caller, set 'confg->log' to NULL to
 	 * ensure disable_writeback_work wakes up and exits.
 	 */
-	conf->log = NULL;
+	WRITE_ONCE(conf->log, NULL);
 	wake_up(&conf->mddev->sb_wait);
 	flush_work(&log->disable_writeback_work);
 
-- 
2.39.2



* [dm-devel] [PATCH -next v2 10/28] md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
@ 2023-08-28  2:00   ` Yu Kuai
  0 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

'conf->log' is set with 'reconfig_mutex' held; however, readers are not
protected. Use READ_ONCE/WRITE_ONCE so that readers do not observe an
abnormal value.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid5-cache.c | 47 +++++++++++++++++++++-------------------
 1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 518b7cfa78b9..889bba60d6ff 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -327,8 +327,9 @@ void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
 void r5c_check_stripe_cache_usage(struct r5conf *conf)
 {
 	int total_cached;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
-	if (!r5c_is_writeback(conf->log))
+	if (!r5c_is_writeback(log))
 		return;
 
 	total_cached = atomic_read(&conf->r5c_cached_partial_stripes) +
@@ -344,7 +345,7 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
 	 */
 	if (total_cached > conf->min_nr_stripes * 1 / 2 ||
 	    atomic_read(&conf->empty_inactive_list_nr) > 0)
-		r5l_wake_reclaim(conf->log, 0);
+		r5l_wake_reclaim(log, 0);
 }
 
 /*
@@ -353,7 +354,9 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
  */
 void r5c_check_cached_full_stripe(struct r5conf *conf)
 {
-	if (!r5c_is_writeback(conf->log))
+	struct r5l_log *log = READ_ONCE(conf->log);
+
+	if (!r5c_is_writeback(log))
 		return;
 
 	/*
@@ -363,7 +366,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
 	if (atomic_read(&conf->r5c_cached_full_stripes) >=
 	    min(R5C_FULL_STRIPE_FLUSH_BATCH(conf),
 		conf->chunk_sectors >> RAID5_STRIPE_SHIFT(conf)))
-		r5l_wake_reclaim(conf->log, 0);
+		r5l_wake_reclaim(log, 0);
 }
 
 /*
@@ -396,7 +399,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
  */
 static sector_t r5c_log_required_to_flush_cache(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	if (!r5c_is_writeback(log))
 		return 0;
@@ -449,7 +452,7 @@ static inline void r5c_update_log_state(struct r5l_log *log)
 void r5c_make_stripe_write_out(struct stripe_head *sh)
 {
 	struct r5conf *conf = sh->raid_conf;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	BUG_ON(!r5c_is_writeback(log));
 
@@ -491,7 +494,7 @@ static void r5c_handle_parity_cached(struct stripe_head *sh)
  */
 static void r5c_finish_cache_stripe(struct stripe_head *sh)
 {
-	struct r5l_log *log = sh->raid_conf->log;
+	struct r5l_log *log = READ_ONCE(sh->raid_conf->log);
 
 	if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH) {
 		BUG_ON(test_bit(STRIPE_R5C_CACHING, &sh->state));
@@ -692,7 +695,7 @@ static void r5c_disable_writeback_async(struct work_struct *work)
 
 	/* wait superblock change before suspend */
 	wait_event(mddev->sb_wait,
-		   conf->log == NULL ||
+		   !READ_ONCE(conf->log) ||
 		   (!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) &&
 		    (locked = mddev_trylock(mddev))));
 	if (locked) {
@@ -1151,7 +1154,7 @@ static void r5l_run_no_space_stripes(struct r5l_log *log)
 static sector_t r5c_calculate_new_cp(struct r5conf *conf)
 {
 	struct stripe_head *sh;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	sector_t new_cp;
 	unsigned long flags;
 
@@ -1159,12 +1162,12 @@ static sector_t r5c_calculate_new_cp(struct r5conf *conf)
 		return log->next_checkpoint;
 
 	spin_lock_irqsave(&log->stripe_in_journal_lock, flags);
-	if (list_empty(&conf->log->stripe_in_journal_list)) {
+	if (list_empty(&log->stripe_in_journal_list)) {
 		/* all stripes flushed */
 		spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
 		return log->next_checkpoint;
 	}
-	sh = list_first_entry(&conf->log->stripe_in_journal_list,
+	sh = list_first_entry(&log->stripe_in_journal_list,
 			      struct stripe_head, r5c);
 	new_cp = sh->log_start;
 	spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
@@ -1399,7 +1402,7 @@ void r5c_flush_cache(struct r5conf *conf, int num)
 	struct stripe_head *sh, *next;
 
 	lockdep_assert_held(&conf->device_lock);
-	if (!conf->log)
+	if (!READ_ONCE(conf->log))
 		return;
 
 	count = 0;
@@ -1420,7 +1423,7 @@ void r5c_flush_cache(struct r5conf *conf, int num)
 
 static void r5c_do_reclaim(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	struct stripe_head *sh;
 	int count = 0;
 	unsigned long flags;
@@ -1549,7 +1552,7 @@ static void r5l_reclaim_thread(struct md_thread *thread)
 {
 	struct mddev *mddev = thread->mddev;
 	struct r5conf *conf = mddev->private;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	if (!log)
 		return;
@@ -1591,7 +1594,7 @@ void r5l_quiesce(struct r5l_log *log, int quiesce)
 
 bool r5l_log_disk_error(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	/* don't allow write if journal disk is missing */
 	if (!log)
@@ -2635,7 +2638,7 @@ int r5c_try_caching_write(struct r5conf *conf,
 			  struct stripe_head_state *s,
 			  int disks)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	int i;
 	struct r5dev *dev;
 	int to_cache = 0;
@@ -2802,7 +2805,7 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 				 struct stripe_head *sh,
 				 struct stripe_head_state *s)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	int i;
 	int do_wakeup = 0;
 	sector_t tree_index;
@@ -2941,7 +2944,7 @@ int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh)
 /* check whether this big stripe is in write back cache. */
 bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 	sector_t tree_index;
 	void *slot;
 
@@ -3049,14 +3052,14 @@ int r5l_start(struct r5l_log *log)
 void r5c_update_on_rdev_error(struct mddev *mddev, struct md_rdev *rdev)
 {
 	struct r5conf *conf = mddev->private;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = READ_ONCE(conf->log);
 
 	if (!log)
 		return;
 
 	if ((raid5_calc_degraded(conf) > 0 ||
 	     test_bit(Journal, &rdev->flags)) &&
-	    conf->log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
+	    log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
 		schedule_work(&log->disable_writeback_work);
 }
 
@@ -3145,7 +3148,7 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
 	spin_lock_init(&log->stripe_in_journal_lock);
 	atomic_set(&log->stripe_in_journal_count, 0);
 
-	conf->log = log;
+	WRITE_ONCE(conf->log, log);
 
 	set_bit(MD_HAS_JOURNAL, &conf->mddev->flags);
 	return 0;
@@ -3173,7 +3176,7 @@ void r5l_exit_log(struct r5conf *conf)
 	 * 'reconfig_mutex' is held by caller, set 'confg->log' to NULL to
 	 * ensure disable_writeback_work wakes up and exits.
 	 */
-	conf->log = NULL;
+	WRITE_ONCE(conf->log, NULL);
 	wake_up(&conf->mddev->sb_wait);
 	flush_work(&log->disable_writeback_work);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread
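
The hunks above replace repeated loads of 'conf->log' with a single
READ_ONCE() snapshot that is checked once and then reused. The point is
presumably that a second load (as in the old 'conf->log->r5c_journal_mode')
could observe NULL after r5l_exit_log() has cleared the pointer. A minimal
userspace sketch of that pattern follows; the macros are simplified
stand-ins for the kernel ones, and every demo_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins (GNU C); the kernel versions do more. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

enum demo_mode { DEMO_WRITE_THROUGH, DEMO_WRITE_BACK };

struct demo_log {
	enum demo_mode mode;
	int work_scheduled;
};

struct demo_conf {
	struct demo_log *log;	/* may be cleared by a concurrent teardown */
};

/* Returns 1 if the "disable write-back" work was scheduled, else 0. */
static int demo_update_on_error(struct demo_conf *conf)
{
	/* Load the pointer exactly once; use only the snapshot below. */
	struct demo_log *log;

	assert(conf != NULL);
	log = READ_ONCE(conf->log);
	if (!log)
		return 0;

	/* Never go back through conf->log here: it may be NULL by now. */
	if (log->mode == DEMO_WRITE_BACK) {
		log->work_scheduled = 1;
		return 1;
	}
	return 0;
}
```

The snapshot only keeps the NULL check and the later use consistent with
each other; keeping the object alive while it is used remains the job of
the surrounding teardown protocol.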

* [dm-devel] [PATCH -next v2 11/28] md/raid5-cache: use new apis to suspend array for r5c_disable_writeback_async()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new APIs; the old APIs will be removed eventually.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid5-cache.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 889bba60d6ff..109367fec7c0 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -686,7 +686,6 @@ static void r5c_disable_writeback_async(struct work_struct *work)
 					   disable_writeback_work);
 	struct mddev *mddev = log->rdev->mddev;
 	struct r5conf *conf = mddev->private;
-	int locked = 0;
 
 	if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH)
 		return;
@@ -696,13 +695,12 @@ static void r5c_disable_writeback_async(struct work_struct *work)
 	/* wait superblock change before suspend */
 	wait_event(mddev->sb_wait,
 		   !READ_ONCE(conf->log) ||
-		   (!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) &&
-		    (locked = mddev_trylock(mddev))));
-	if (locked) {
-		mddev_suspend(mddev);
+		   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags));
+
+	if (READ_ONCE(conf->log)) {
+		__mddev_suspend(mddev);
 		log->r5c_journal_mode = R5C_JOURNAL_MODE_WRITE_THROUGH;
-		mddev_resume(mddev);
-		mddev_unlock(mddev);
+		__mddev_resume(mddev);
 	}
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 12/28] md/raid5-cache: use new apis to suspend array for r5c_journal_mode_store()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

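r5c_journal_mode_set() suspends the array, and it has only two callers.
The other caller, raid_ctl(), already suspends the array with the new
APIs, so drop the suspend from r5c_journal_mode_set() and let
r5c_journal_mode_store() suspend the array instead.

This is not a hot path, so performance is not a concern.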

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
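
The change above hoists the suspend out of the callee and into the
caller, so a single suspend/lock pair covers the whole update. A minimal
userspace sketch of that shape follows, assuming the ordering given in
the cover letter (suspend, lock, reconfigure, unlock, resume). All demo_*
names are hypothetical models, not the md helpers themselves.

```c
#include <assert.h>
#include <errno.h>

struct demo_mddev {
	int suspended;	/* suspend nesting depth */
	int locked;	/* 1 while the reconfiguration lock is held */
	int mode;
	int max_depth;	/* deepest suspend nesting seen so far */
};

/* Suspend first, then take the lock. */
static int demo_suspend_and_lock(struct demo_mddev *m)
{
	m->suspended++;
	if (m->suspended > m->max_depth)
		m->max_depth = m->suspended;
	assert(!m->locked);
	m->locked = 1;
	return 0;
}

/* Undo in reverse order: drop the lock, then resume I/O. */
static void demo_unlock_and_resume(struct demo_mddev *m)
{
	assert(m->locked && m->suspended > 0);
	m->locked = 0;
	m->suspended--;
}

/* Callee: relies on the caller having suspended and locked already. */
static int demo_mode_set(struct demo_mddev *m, int mode)
{
	if (mode < 0 || mode > 1)
		return -EINVAL;
	assert(m->suspended > 0 && m->locked);
	m->mode = mode;
	return 0;
}

/* Caller: one suspend/lock pair around the whole operation. */
static int demo_mode_store(struct demo_mddev *m, int mode)
{
	int ret = demo_suspend_and_lock(m);

	if (ret)
		return ret;
	ret = demo_mode_set(m, mode);
	demo_unlock_and_resume(m);
	return ret;
}
```

In this model the nesting depth never exceeds one, and a rejected mode
still leaves the array unlocked and resumed.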
 drivers/md/raid5-cache.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 109367fec7c0..38d38f2e33bc 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -2584,9 +2584,7 @@ int r5c_journal_mode_set(struct mddev *mddev, int mode)
 	    mode == R5C_JOURNAL_MODE_WRITE_BACK)
 		return -EINVAL;
 
-	mddev_suspend(mddev);
 	conf->log->r5c_journal_mode = mode;
-	mddev_resume(mddev);
 
 	pr_debug("md/raid:%s: setting r5c cache mode to %d: %s\n",
 		 mdname(mddev), mode, r5c_journal_mode_str[mode]);
@@ -2611,11 +2609,11 @@ static ssize_t r5c_journal_mode_store(struct mddev *mddev,
 		if (strlen(r5c_journal_mode_str[mode]) == len &&
 		    !strncmp(page, r5c_journal_mode_str[mode], len))
 			break;
-	ret = mddev_lock(mddev);
+	ret = mddev_suspend_and_lock(mddev);
 	if (ret)
 		return ret;
 	ret = r5c_journal_mode_set(mddev, mode);
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return ret ?: length;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 13/28] md/raid5: use new apis to suspend array for raid5_store_stripe_size()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new APIs; the old APIs will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid5.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 6383723468e5..f1c32b4d190f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7025,7 +7025,7 @@ raid5_store_stripe_size(struct mddev  *mddev, const char *page, size_t len)
 			new != roundup_pow_of_two(new))
 		return -EINVAL;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 
@@ -7049,7 +7049,6 @@ raid5_store_stripe_size(struct mddev  *mddev, const char *page, size_t len)
 		goto out_unlock;
 	}
 
-	mddev_suspend(mddev);
 	mutex_lock(&conf->cache_size_mutex);
 	size = conf->max_nr_stripes;
 
@@ -7064,10 +7063,9 @@ raid5_store_stripe_size(struct mddev  *mddev, const char *page, size_t len)
 		err = -ENOMEM;
 	}
 	mutex_unlock(&conf->cache_size_mutex);
-	mddev_resume(mddev);
 
 out_unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return err ?: len;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 14/28] md/raid5: use new apis to suspend array for raid5_store_skip_copy()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new APIs; the old APIs will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid5.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index f1c32b4d190f..c937716fed01 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7151,7 +7151,7 @@ raid5_store_skip_copy(struct mddev *mddev, const char *page, size_t len)
 		return -EINVAL;
 	new = !!new;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	conf = mddev->private;
@@ -7160,15 +7160,13 @@ raid5_store_skip_copy(struct mddev *mddev, const char *page, size_t len)
 	else if (new != conf->skip_copy) {
 		struct request_queue *q = mddev->queue;
 
-		mddev_suspend(mddev);
 		conf->skip_copy = new;
 		if (new)
 			blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, q);
 		else
 			blk_queue_flag_clear(QUEUE_FLAG_STABLE_WRITES, q);
-		mddev_resume(mddev);
 	}
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return err ?: len;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 15/28] md/raid5: use new apis to suspend array for raid5_store_group_thread_cnt()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new APIs; the old APIs will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid5.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index c937716fed01..8060d29e99d2 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7221,15 +7221,13 @@ raid5_store_group_thread_cnt(struct mddev *mddev, const char *page, size_t len)
 	if (new > 8192)
 		return -EINVAL;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	conf = mddev->private;
 	if (!conf)
 		err = -ENODEV;
 	else if (new != conf->worker_cnt_per_group) {
-		mddev_suspend(mddev);
-
 		old_groups = conf->worker_groups;
 		if (old_groups)
 			flush_workqueue(raid5_wq);
@@ -7246,9 +7244,8 @@ raid5_store_group_thread_cnt(struct mddev *mddev, const char *page, size_t len)
 				kfree(old_groups[0].workers);
 			kfree(old_groups);
 		}
-		mddev_resume(mddev);
 	}
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 
 	return err ?: len;
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH -next v2 16/28] md/raid5: use new apis to suspend array for raid5_change_consistency_policy()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang, yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Convert to the new APIs; the old APIs will be removed eventually.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
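
With the suspend moved to the top of the function, every exit path,
including the early -ENODEV return, has to undo both the lock and the
suspend. A minimal userspace sketch of that invariant follows. It
records the order of steps in a trace string and uses a single exit
label where the patch uses an explicit early return. The demo_* names
and the trace letters are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

struct demo_array {
	int has_conf;	/* 0 models a NULL mddev->private */
	char trace[16];	/* one letter per step, in order */
	size_t n;
};

static void demo_note(struct demo_array *a, char step)
{
	assert(a->n + 1 < sizeof(a->trace));
	a->trace[a->n++] = step;
	a->trace[a->n] = '\0';
}

static int demo_change_policy(struct demo_array *a, const char *buf)
{
	int err = 0;

	demo_note(a, 'S');	/* suspend: no new I/O, in-flight I/O done */
	demo_note(a, 'L');	/* lock: serialize with other reconfiguration */

	if (!a->has_conf) {
		err = -ENODEV;
		goto out;
	}
	if (strncmp(buf, "resync", 6) != 0) {
		err = -EINVAL;
		goto out;
	}
	demo_note(a, 'C');	/* the change itself, with no I/O in flight */
out:
	demo_note(a, 'U');	/* unlock ... */
	demo_note(a, 'R');	/* ... then resume, on every path */
	return err;
}
```

A successful call leaves the trace "SLCUR"; both failure paths leave
"SLUR", so the array is never left locked or suspended.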
 drivers/md/raid5.c | 19 ++++++-------------
 1 file changed, 6 insertions(+), 13 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 8060d29e99d2..e6b8c0145648 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -8967,12 +8967,12 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 	struct r5conf *conf;
 	int err;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	conf = mddev->private;
 	if (!conf) {
-		mddev_unlock(mddev);
+		mddev_unlock_and_resume(mddev);
 		return -ENODEV;
 	}
 
@@ -8982,19 +8982,14 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 			err = log_init(conf, NULL, true);
 			if (!err) {
 				err = resize_stripes(conf, conf->pool_size);
-				if (err) {
-					mddev_suspend(mddev);
+				if (err)
 					log_exit(conf);
-					mddev_resume(mddev);
-				}
 			}
 		} else
 			err = -EINVAL;
 	} else if (strncmp(buf, "resync", 6) == 0) {
 		if (raid5_has_ppl(conf)) {
-			mddev_suspend(mddev);
 			log_exit(conf);
-			mddev_resume(mddev);
 			err = resize_stripes(conf, conf->pool_size);
 		} else if (test_bit(MD_HAS_JOURNAL, &conf->mddev->flags) &&
 			   r5l_log_disk_error(conf)) {
@@ -9007,11 +9002,9 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 					break;
 				}
 
-			if (!journal_dev_exists) {
-				mddev_suspend(mddev);
+			if (!journal_dev_exists)
 				clear_bit(MD_HAS_JOURNAL, &mddev->flags);
-				mddev_resume(mddev);
-			} else  /* need remove journal device first */
+			else  /* need remove journal device first */
 				err = -EBUSY;
 		} else
 			err = -EINVAL;
@@ -9022,7 +9015,7 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 	if (!err)
 		md_update_sb(mddev, 1);
 
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 
 	return err;
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 17/28] md/raid5: replace suspend with quiesce() callback
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

raid5 is the only personality that suspends the array in its
check_reshape() and start_reshape() callbacks. Both suspend and the
quiesce() callback wait for all normal I/O to complete and prevent new
I/O from being dispatched; the difference is that suspend is implemented
in the common layer, while the quiesce() callback is implemented in
raid5.

In order to clean up all users of mddev_suspend(), the new API
__mddev_suspend() needs to be called before 'reconfig_mutex' is held,
and it's not good to affect all the personalities in the common layer
just for raid5. Hence replace suspend with the quiesce() callback, in
preparation for removing all the users of mddev_suspend().

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
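
The commit message describes a two-part contract shared by suspend and
quiesce(): new I/O is held back, and the caller waits for in-flight I/O
to finish. A minimal, non-blocking userspace model of that contract
follows. It is not the raid5_quiesce() implementation: where the kernel
would sleep, the model returns false. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_conf {
	int quiesce;		/* non-zero: new I/O must wait */
	int active_stripes;	/* I/O currently in flight */
};

/* Returns true if the I/O may start; false models "submitter sleeps". */
static bool demo_io_try_start(struct demo_conf *c)
{
	if (c->quiesce)
		return false;
	c->active_stripes++;
	return true;
}

static void demo_io_done(struct demo_conf *c)
{
	assert(c->active_stripes > 0);
	c->active_stripes--;
}

/*
 * Hold back new I/O, then report whether in-flight I/O has drained.
 * A real implementation would sleep here until it has.
 */
static bool demo_quiesce_begin(struct demo_conf *c)
{
	c->quiesce = 1;
	return c->active_stripes == 0;
}

static void demo_quiesce_end(struct demo_conf *c)
{
	c->quiesce = 0;
}
```

A begin/end pair with nothing in between, as in raid5_start_reshape(),
therefore works as a drain barrier: once it returns, all I/O submitted
before the pair has completed.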
 drivers/md/raid5.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e6b8c0145648..d6de084a85e5 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -70,6 +70,8 @@ MODULE_PARM_DESC(devices_handle_discard_safely,
 		 "Set to Y if all devices in each array reliably return zeroes on reads from discarded regions");
 static struct workqueue_struct *raid5_wq;
 
+static void raid5_quiesce(struct mddev *mddev, int quiesce);
+
 static inline struct hlist_head *stripe_hash(struct r5conf *conf, sector_t sect)
 {
 	int hash = (sect >> RAID5_STRIPE_SHIFT(conf)) & HASH_MASK;
@@ -2492,15 +2494,12 @@ static int resize_chunks(struct r5conf *conf, int new_disks, int new_sectors)
 	unsigned long cpu;
 	int err = 0;
 
-	/*
-	 * Never shrink. And mddev_suspend() could deadlock if this is called
-	 * from raid5d. In that case, scribble_disks and scribble_sectors
-	 * should equal to new_disks and new_sectors
-	 */
+	/* Never shrink. */
 	if (conf->scribble_disks >= new_disks &&
 	    conf->scribble_sectors >= new_sectors)
 		return 0;
-	mddev_suspend(conf->mddev);
+
+	raid5_quiesce(conf->mddev, true);
 	cpus_read_lock();
 
 	for_each_present_cpu(cpu) {
@@ -2514,7 +2513,8 @@ static int resize_chunks(struct r5conf *conf, int new_disks, int new_sectors)
 	}
 
 	cpus_read_unlock();
-	mddev_resume(conf->mddev);
+	raid5_quiesce(conf->mddev, false);
+
 	if (!err) {
 		conf->scribble_disks = new_disks;
 		conf->scribble_sectors = new_sectors;
@@ -8551,8 +8551,8 @@ static int raid5_start_reshape(struct mddev *mddev)
 	 * the reshape wasn't running - like Discard or Read - have
 	 * completed.
 	 */
-	mddev_suspend(mddev);
-	mddev_resume(mddev);
+	raid5_quiesce(mddev, true);
+	raid5_quiesce(mddev, false);
 
 	/* Add some new drives, as many as will fit.
 	 * We know there are enough to make the newly sized array work.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 18/28] md: quiesce before md_kick_rdev_from_array() for md-cluster
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

md_kick_rdev_from_array() can be called from md_check_recovery() and
md_reload_sb() for md-cluster, and in this case it is very complicated
to use the new APIs to suspend the array before holding
'reconfig_mutex'.

Fortunately, md-cluster is only supported for raid1 and raid10, and both
implement a quiesce() callback that is safe to call from the daemon
thread. Hence use the quiesce() callback to prevent I/O from running
concurrently with removing an rdev from the array.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
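
Both hunks below follow the same shape: scan first, quiesce only if some
device will actually be removed, remove the devices, then undo the
quiesce. A minimal userspace sketch of that shape follows, assuming a
flat array of devices instead of the kernel's rdev list. All demo_* names
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_rdev {
	bool cluster_remove;	/* another node asked for removal */
	int raid_disk;		/* < 0: not an active member */
	bool in_array;
};

struct demo_mddev {
	struct demo_rdev *rdevs;
	size_t nr;
	int quiesce_calls;	/* how many times I/O was stopped */
	int quiesced;
};

static void demo_quiesce(struct demo_mddev *m, bool on)
{
	if (on)
		m->quiesce_calls++;
	m->quiesced = on;
}

static bool demo_should_kick(const struct demo_rdev *r)
{
	return r->in_array && r->cluster_remove && r->raid_disk < 0;
}

/* Returns the number of devices removed. */
static size_t demo_kick_removed(struct demo_mddev *m)
{
	bool suspended = false;
	size_t i, kicked = 0;

	/* First pass: stop I/O only if something will be removed. */
	for (i = 0; i < m->nr; i++) {
		if (demo_should_kick(&m->rdevs[i])) {
			demo_quiesce(m, true);
			suspended = true;
			break;
		}
	}

	/* Second pass: remove devices while no I/O can reach them. */
	for (i = 0; i < m->nr; i++) {
		if (demo_should_kick(&m->rdevs[i])) {
			assert(m->quiesced);
			m->rdevs[i].in_array = false;
			kicked++;
		}
	}

	if (suspended)
		demo_quiesce(m, false);
	return kicked;
}
```

The extra scan keeps the common case cheap: when no device is flagged,
I/O is never stopped at all.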
 drivers/md/md.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index a3bc4968fa0f..3343767882bb 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9609,6 +9609,21 @@ void md_check_recovery(struct mddev *mddev)
 
 		if (mddev_is_clustered(mddev)) {
 			struct md_rdev *rdev, *tmp;
+			bool suspended = false;
+
+			/*
+			 * md-cluster is used for raid1/raid10, and they both
+			 * implement quiesce() callback that is safe to be
+			 * called from daemon thread.
+			 */
+			rdev_for_each(rdev, mddev)
+				if (test_bit(ClusterRemove, &rdev->flags) &&
+				    rdev->raid_disk < 0) {
+					mddev->pers->quiesce(mddev, true);
+					suspended = true;
+					break;
+				}
+
 			/* kick the device if another node issued a
 			 * remove disk.
 			 */
@@ -9617,6 +9632,9 @@ void md_check_recovery(struct mddev *mddev)
 						rdev->raid_disk < 0)
 					md_kick_rdev_from_array(rdev);
 			}
+
+			if (suspended)
+				mddev->pers->quiesce(mddev, false);
 		}
 
 		if (try_set_sync && !mddev->external && !mddev->in_sync) {
@@ -9904,6 +9922,7 @@ static void check_sb_changes(struct mddev *mddev, struct md_rdev *rdev)
 {
 	struct mdp_superblock_1 *sb = page_address(rdev->sb_page);
 	struct md_rdev *rdev2, *tmp;
+	bool suspended = false;
 	int role, ret;
 
 	/*
@@ -9918,6 +9937,22 @@ static void check_sb_changes(struct mddev *mddev, struct md_rdev *rdev)
 			md_bitmap_update_sb(mddev->bitmap);
 	}
 
+	/*
+	 * md-cluster is used for raid1/raid10, and they both
+	 * implement quiesce() callback.
+	 */
+	rdev_for_each(rdev2, mddev) {
+		if (test_bit(Faulty, &rdev2->flags))
+			continue;
+		role = le16_to_cpu(sb->dev_roles[rdev2->desc_nr]);
+		if (test_bit(Candidate, &rdev2->flags) &&
+		    role == MD_DISK_ROLE_FAULTY) {
+			mddev->pers->quiesce(mddev, true);
+			suspended = true;
+			break;
+		}
+	}
+
 	/* Check for change of roles in the active devices */
 	rdev_for_each_safe(rdev2, tmp, mddev) {
 		if (test_bit(Faulty, &rdev2->flags))
@@ -9966,6 +10001,9 @@ static void check_sb_changes(struct mddev *mddev, struct md_rdev *rdev)
 		}
 	}
 
+	if (suspended)
+		mddev->pers->quiesce(mddev, false);
+
 	if (mddev->raid_disks != le32_to_cpu(sb->raid_disks)) {
 		ret = update_raid_disks(mddev, le32_to_cpu(sb->raid_disks));
 		if (ret)
-- 
2.39.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 19/28] md: use new apis to suspend array for ioctls involed array reconfiguration
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

'reconfig_mutex' is grabbed before these ioctls run. Suspend the array
before taking the lock, so that IO can't run concurrently with array
reconfiguration through ioctls.

This is not a hot path, so performance is not a concern.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
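Note (not part of the commit message): the ordering this patch aims
for is suspend, lock, work, unlock, resume for the reconfiguring
ioctls, and plain lock, work, unlock for the rest. A small userspace
sketch that records the order of steps is below. The enum values and
model_* names are hypothetical; the real command numbers live in the
md uapi header.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical command numbers, for illustration only. */
enum model_cmd {
	MODEL_GET_ARRAY_INFO,
	MODEL_ADD_NEW_DISK,
	MODEL_HOT_REMOVE_DISK,
};

struct model_array {
	char trace[8];	/* one letter per step, in order */
	size_t len;
};

static void step(struct model_array *a, char c)
{
	if (a->len < sizeof(a->trace) - 1)
		a->trace[a->len++] = c;
	a->trace[a->len] = '\0';
}

/* Only commands that reconfigure the array need IO held off. */
static bool model_need_suspend(enum model_cmd cmd)
{
	switch (cmd) {
	case MODEL_ADD_NEW_DISK:
	case MODEL_HOT_REMOVE_DISK:
		return true;
	default:
		return false;
	}
}

static const char *model_ioctl(struct model_array *a, enum model_cmd cmd)
{
	bool suspend = model_need_suspend(cmd);

	a->len = 0;
	if (suspend)
		step(a, 'S');	/* suspend the array first */
	step(a, 'L');		/* then take the reconfiguration lock */
	step(a, 'W');		/* do the ioctl work */
	step(a, 'U');		/* drop the lock */
	if (suspend)
		step(a, 'R');	/* resume last */
	return a->trace;
}
```

Calling model_ioctl() with MODEL_ADD_NEW_DISK yields the trace
"SLWUR"; MODEL_GET_ARRAY_INFO yields "LWU".
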
 drivers/md/md.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 3343767882bb..81c7b9d1cc36 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7179,7 +7179,6 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
 			struct bitmap *bitmap;
 
 			bitmap = md_bitmap_create(mddev, -1);
-			mddev_suspend(mddev);
 			if (!IS_ERR(bitmap)) {
 				mddev->bitmap = bitmap;
 				err = md_bitmap_load(mddev);
@@ -7189,11 +7188,8 @@ static int set_bitmap_file(struct mddev *mddev, int fd)
 				md_bitmap_destroy(mddev);
 				fd = -1;
 			}
-			mddev_resume(mddev);
 		} else if (fd < 0) {
-			mddev_suspend(mddev);
 			md_bitmap_destroy(mddev);
-			mddev_resume(mddev);
 		}
 	}
 	if (fd < 0) {
@@ -7482,7 +7478,6 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 			mddev->bitmap_info.space =
 				mddev->bitmap_info.default_space;
 			bitmap = md_bitmap_create(mddev, -1);
-			mddev_suspend(mddev);
 			if (!IS_ERR(bitmap)) {
 				mddev->bitmap = bitmap;
 				rv = md_bitmap_load(mddev);
@@ -7490,7 +7485,6 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 				rv = PTR_ERR(bitmap);
 			if (rv)
 				md_bitmap_destroy(mddev);
-			mddev_resume(mddev);
 		} else {
 			/* remove the bitmap */
 			if (!mddev->bitmap) {
@@ -7515,9 +7509,7 @@ static int update_array_info(struct mddev *mddev, mdu_array_info_t *info)
 				module_put(md_cluster_mod);
 				mddev->safemode_delay = DEFAULT_SAFEMODE_DELAY;
 			}
-			mddev_suspend(mddev);
 			md_bitmap_destroy(mddev);
-			mddev_resume(mddev);
 			mddev->bitmap_info.offset = 0;
 		}
 	}
@@ -7588,6 +7580,20 @@ static inline bool md_ioctl_valid(unsigned int cmd)
 	}
 }
 
+static bool md_ioctl_need_suspend(unsigned int cmd)
+{
+	switch (cmd) {
+	case ADD_NEW_DISK:
+	case HOT_ADD_DISK:
+	case HOT_REMOVE_DISK:
+	case SET_BITMAP_FILE:
+	case SET_ARRAY_INFO:
+		return true;
+	default:
+		return false;
+	}
+}
+
 static int __md_set_array_info(struct mddev *mddev, void __user *argp)
 {
 	mdu_array_info_t info;
@@ -7720,7 +7726,8 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
 	if (!md_is_rdwr(mddev))
 		flush_work(&mddev->sync_work);
 
-	err = mddev_lock(mddev);
+	err = md_ioctl_need_suspend(cmd) ? mddev_suspend_and_lock(mddev) :
+					   mddev_lock(mddev);
 	if (err) {
 		pr_debug("md: ioctl lock interrupted, reason %d, cmd %d\n",
 			 err, cmd);
@@ -7848,7 +7855,11 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
 	if (mddev->hold_active == UNTIL_IOCTL &&
 	    err != -EINVAL)
 		mddev->hold_active = 0;
+
 	mddev_unlock(mddev);
+	if (md_ioctl_need_suspend(cmd))
+		__mddev_resume(mddev);
+
 out:
 	if(did_set_md_closing)
 		clear_bit(MD_CLOSING, &mddev->flags);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 20/28] md: use new apis to suspend array for adding/removing rdev from state_store()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Users can write 'remove' and 're-add' through sysfs to trigger array
reconfiguration. Suspend the array in these cases so that IO can't run
concurrently with the reconfiguration.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
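Note (not part of the commit message): the decision to suspend is made
from the raw sysfs buffer, before the lock is taken. A minimal
userspace sketch of that decision follows. model_cmd_match() is a
hypothetical helper that is assumed to behave like md's cmd_match():
the strings must be equal, except that the written command may carry
one trailing newline.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of a sysfs command match: cmd equals str, optionally followed
 * by a single trailing newline (as produced by "echo remove > state").
 */
static bool model_cmd_match(const char *cmd, const char *str)
{
	while (*cmd && *str && *cmd == *str) {
		cmd++;
		str++;
	}
	if (*cmd == '\n')
		cmd++;
	return !*cmd && !*str;
}

/* Only the commands that reconfigure the array need IO held off. */
static bool model_state_needs_suspend(const char *page)
{
	return model_cmd_match(page, "remove") ||
	       model_cmd_match(page, "re-add");
}
```

With this model, "remove\n" and "re-add" request a suspend, while
"faulty\n" and "remove2" do not.
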
 drivers/md/md.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 81c7b9d1cc36..0f1006197afd 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2918,11 +2918,7 @@ static int add_bound_rdev(struct md_rdev *rdev)
 		 */
 		super_types[mddev->major_version].
 			validate_super(mddev, rdev);
-		if (add_journal)
-			mddev_suspend(mddev);
 		err = mddev->pers->hot_add_disk(mddev, rdev);
-		if (add_journal)
-			mddev_resume(mddev);
 		if (err) {
 			md_kick_rdev_from_array(rdev);
 			return err;
@@ -3675,6 +3671,7 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
 	struct rdev_sysfs_entry *entry = container_of(attr, struct rdev_sysfs_entry, attr);
 	struct md_rdev *rdev = container_of(kobj, struct md_rdev, kobj);
 	struct kernfs_node *kn = NULL;
+	bool suspended = false;
 	ssize_t rv;
 	struct mddev *mddev = rdev->mddev;
 
@@ -3683,8 +3680,14 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (entry->store == state_store && cmd_match(page, "remove"))
-		kn = sysfs_break_active_protection(kobj, attr);
+	if (entry->store == state_store) {
+		if (cmd_match(page, "remove"))
+			kn = sysfs_break_active_protection(kobj, attr);
+		if (cmd_match(page, "remove") || cmd_match(page, "re-add")) {
+			__mddev_suspend(mddev);
+			suspended = true;
+		}
+	}
 
 	rv = mddev ? mddev_lock(mddev) : -ENODEV;
 	if (!rv) {
@@ -3693,6 +3696,9 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
 		else
 			rv = entry->store(rdev, page, length);
 		mddev_unlock(mddev);
+
+		if (suspended)
+			__mddev_resume(mddev);
 	}
 
 	if (kn)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 21/28] md: use new apis to suspend array for bind_rdev_to_array()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

bind_rdev_to_array() calls mddev_create_serial_pool(), which calls
mddev_suspend() if a serial pool is used.

Suspend the array in the callers of bind_rdev_to_array() instead, in
preparation for removing mddev_suspend() from
mddev_create_serial_pool().

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
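Note (not part of the commit message): the callers converted below do
no cleanup when mddev_suspend_and_lock() fails, which is consistent
with the helper undoing its own suspend on a failed lock. That
behaviour is an assumption here, since the helper is defined in an
earlier patch of the series. A userspace sketch of the pairing, with
hypothetical model_* names and stand-in error values:

```c
#include <assert.h>
#include <stdbool.h>

struct model_array {
	int suspended;		/* nesting count of suspend calls */
	bool locked;
	bool lock_fails;	/* test knob: make the lock attempt fail */
};

static int model_lock(struct model_array *a)
{
	if (a->lock_fails)
		return -4;	/* stand-in for -EINTR */
	a->locked = true;
	return 0;
}

/* Assumed contract: on failure, neither suspended nor locked. */
static int model_suspend_and_lock(struct model_array *a)
{
	int err;

	a->suspended++;
	err = model_lock(a);
	if (err)
		a->suspended--;	/* undo, so callers need no cleanup */
	return err;
}

static void model_unlock_and_resume(struct model_array *a)
{
	a->locked = false;
	a->suspended--;
}

/* Shaped like new_dev_store(): every exit after a successful lock resumes. */
static int model_add_device(struct model_array *a, bool import_fails)
{
	int err = model_suspend_and_lock(a);

	if (err)
		return err;
	if (import_fails) {
		model_unlock_and_resume(a);
		return -22;	/* stand-in for -EINVAL */
	}
	/* ... bind the device while IO is held off ... */
	model_unlock_and_resume(a);
	return 0;
}
```

Whichever path is taken, the array ends up unlocked and with a
suspend count of zero.
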
 drivers/md/md-autodetect.c |  4 ++--
 drivers/md/md.c            | 14 +++++++-------
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/md/md-autodetect.c b/drivers/md/md-autodetect.c
index 6eaa0eab40f9..4b80165afd23 100644
--- a/drivers/md/md-autodetect.c
+++ b/drivers/md/md-autodetect.c
@@ -175,7 +175,7 @@ static void __init md_setup_drive(struct md_setup_args *args)
 		return;
 	}
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err) {
 		pr_err("md: failed to lock array %s\n", name);
 		goto out_mddev_put;
@@ -221,7 +221,7 @@ static void __init md_setup_drive(struct md_setup_args *args)
 	if (err)
 		pr_warn("md: starting %s failed\n", name);
 out_unlock:
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 out_mddev_put:
 	mddev_put(mddev);
 }
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 0f1006197afd..43bd7274b705 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2535,7 +2535,7 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
 	pr_debug("md: bind<%s>\n", b);
 
 	if (mddev->raid_disks)
-		mddev_create_serial_pool(mddev, rdev, false);
+		mddev_create_serial_pool(mddev, rdev, true);
 
 	if ((err = kobject_add(&rdev->kobj, &mddev->kobj, "dev-%s", b)))
 		goto fail;
@@ -4662,7 +4662,7 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len)
 	    minor != MINOR(dev))
 		return -EOVERFLOW;
 
-	err = mddev_lock(mddev);
+	err = mddev_suspend_and_lock(mddev);
 	if (err)
 		return err;
 	if (mddev->persistent) {
@@ -4683,14 +4683,14 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len)
 		rdev = md_import_device(dev, -1, -1);
 
 	if (IS_ERR(rdev)) {
-		mddev_unlock(mddev);
+		mddev_unlock_and_resume(mddev);
 		return PTR_ERR(rdev);
 	}
 	err = bind_rdev_to_array(rdev, mddev);
  out:
 	if (err)
 		export_rdev(rdev, mddev);
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	if (!err)
 		md_new_event();
 	return err ? err : len;
@@ -6619,13 +6619,13 @@ static void autorun_devices(int part)
 		if (IS_ERR(mddev))
 			break;
 
-		if (mddev_lock(mddev))
+		if (mddev_suspend_and_lock(mddev))
 			pr_warn("md: %s locked, cannot run\n", mdname(mddev));
 		else if (mddev->raid_disks || mddev->major_version
 			 || !list_empty(&mddev->disks)) {
 			pr_warn("md: %s already running, cannot run %pg\n",
 				mdname(mddev), rdev0->bdev);
-			mddev_unlock(mddev);
+			mddev_unlock_and_resume(mddev);
 		} else {
 			pr_debug("md: created %s\n", mdname(mddev));
 			mddev->persistent = 1;
@@ -6635,7 +6635,7 @@ static void autorun_devices(int part)
 					export_rdev(rdev, mddev);
 			}
 			autorun_array(mddev);
-			mddev_unlock(mddev);
+			mddev_unlock_and_resume(mddev);
 		}
 		/* on success, candidates will be empty, on error
 		 * it won't...
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 22/28] md: use new apis to suspend array related to serial pool in state_store()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

state_store() calls mddev_create/destroy_serial_pool() when the user
writes 'writemostly'/'-writemostly', and mddev_suspend() is then called
from there.

Prepare to remove the mddev_suspend() from
mddev_create/destroy_serial_pool().

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
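Note (not part of the commit message): flipping the last argument from
false to true tells the serial pool helpers that the caller has
already suspended the array. The sketch below models that flag in
userspace. The parameter name and the "helper suspends on its own when
the flag is false" behaviour are assumptions about the existing
helpers, and all model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct model_array {
	int suspend_calls;	/* times the helper suspended on its own */
	bool pool;		/* models the serial pool being allocated */
};

/*
 * already_suspended == true: the caller holds off IO, so the helper
 * must not suspend again. false: the helper suspends around the change.
 */
static void model_create_pool(struct model_array *a, bool already_suspended)
{
	if (a->pool)
		return;
	if (!already_suspended)
		a->suspend_calls++;	/* helper-side suspend and resume */
	a->pool = true;
}
```

Once every caller passes true, the helper-side suspend is dead code
and can be dropped, which is what the later cleanup relies on.
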
 drivers/md/md.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 43bd7274b705..305694b67fd7 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -3055,11 +3055,11 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
 		}
 	} else if (cmd_match(buf, "writemostly")) {
 		set_bit(WriteMostly, &rdev->flags);
-		mddev_create_serial_pool(rdev->mddev, rdev, false);
+		mddev_create_serial_pool(rdev->mddev, rdev, true);
 		need_update_sb = true;
 		err = 0;
 	} else if (cmd_match(buf, "-writemostly")) {
-		mddev_destroy_serial_pool(rdev->mddev, rdev, false);
+		mddev_destroy_serial_pool(rdev->mddev, rdev, true);
 		clear_bit(WriteMostly, &rdev->flags);
 		need_update_sb = true;
 		err = 0;
@@ -3683,7 +3683,9 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
 	if (entry->store == state_store) {
 		if (cmd_match(page, "remove"))
 			kn = sysfs_break_active_protection(kobj, attr);
-		if (cmd_match(page, "remove") || cmd_match(page, "re-add")) {
+		if (cmd_match(page, "remove") || cmd_match(page, "re-add") ||
+		    cmd_match(page, "writemostly") ||
+		    cmd_match(page, "-writemostly")) {
 			__mddev_suspend(mddev);
 			suspended = true;
 		}
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 23/28] md: use new apis to suspend array in backlog_store()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

backlog_store() calls mddev_create/destroy_serial_pool(), and
mddev_suspend() is then called from there.

Prepare to remove the mddev_suspend() from
mddev_create/destroy_serial_pool().

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md-bitmap.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index 7d21e2a5b06e..b3d701c5c461 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2537,7 +2537,7 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
 	if (backlog > COUNTER_MAX)
 		return -EINVAL;
 
-	rv = mddev_lock(mddev);
+	rv = mddev_suspend_and_lock(mddev);
 	if (rv)
 		return rv;
 
@@ -2562,16 +2562,16 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
 	if (!backlog && mddev->serial_info_pool) {
 		/* serial_info_pool is not needed if backlog is zero */
 		if (!mddev->serialize_policy)
-			mddev_destroy_serial_pool(mddev, NULL, false);
+			mddev_destroy_serial_pool(mddev, NULL, true);
 	} else if (backlog && !mddev->serial_info_pool) {
 		/* serial_info_pool is needed since backlog is not zero */
 		rdev_for_each(rdev, mddev)
-			mddev_create_serial_pool(mddev, rdev, false);
+			mddev_create_serial_pool(mddev, rdev, true);
 	}
 	if (old_mwb != backlog)
 		md_bitmap_update_sb(mddev->bitmap);
 
-	mddev_unlock(mddev);
+	mddev_unlock_and_resume(mddev);
 	return len;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 24/28] md: suspend array in md_start_sync() if array need reconfiguration
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Suspend the array so that io won't run concurrently with array
reconfiguration. It's safe to suspend the array directly because normal
io doesn't rely on md_start_sync().

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
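Note for readers: the control flow added below (decide before taking the
lock whether to suspend, remember the decision in a local flag, resume
only after unlocking) can be sketched in userspace. This is a simplified
model under assumptions: struct model_array, its fields and the model_*()
functions are hypothetical stand-ins, and the reconfiguration step is
reduced to clearing a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an array; the field names are illustrative only. */
struct model_array {
	int suspended;		/* nesting count of suspend calls */
	bool locked;		/* stands in for reconfig_mutex */
	int spares_pending;	/* spares waiting to be added or removed */
};

/* Stand-in for the "does the array need reconfiguration" check. */
static bool model_spares_need_change(const struct model_array *a)
{
	return a->spares_pending > 0;
}

/* Returns true if the array had to be suspended for this run. */
bool model_start_sync(struct model_array *a)
{
	bool suspended = false;

	/* Decide before locking, so suspend never happens under the lock. */
	if (model_spares_need_change(a)) {
		a->suspended++;
		suspended = true;
	}

	a->locked = true;
	if (suspended) {
		/* Reconfiguration runs with io stopped. */
		assert(a->suspended > 0);
		a->spares_pending = 0;
	}
	a->locked = false;

	/* Resume only if this call did the suspend, and only after unlock. */
	if (suspended)
		a->suspended--;

	return suspended;
}
```

Keeping the decision in a local flag means each exit path after the
unlock can pair the resume with the earlier suspend without re-running
the check, whose answer may have changed during reconfiguration.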
 drivers/md/md.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 305694b67fd7..0bb4c59543aa 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9457,6 +9457,12 @@ static void md_start_sync(struct work_struct *ws)
 {
 	struct mddev *mddev = container_of(ws, struct mddev, sync_work);
 	int spares = 0;
+	bool suspended = false;
+
+	if (md_spares_need_change(mddev)) {
+		__mddev_suspend(mddev);
+		suspended = true;
+	}
 
 	mddev_lock_nointr(mddev);
 
@@ -9495,6 +9501,9 @@ static void md_start_sync(struct work_struct *ws)
 	}
 
 	mddev_unlock(mddev);
+	if (suspended)
+		__mddev_resume(mddev);
+
 	md_wakeup_thread(mddev->sync_thread);
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
 	md_new_event();
@@ -9507,6 +9516,8 @@ static void md_start_sync(struct work_struct *ws)
 	clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
 	clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
 	mddev_unlock(mddev);
+	if (suspended)
+		__mddev_resume(mddev);
 
 	wake_up(&resync_wait);
 	if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 25/28] md: cleanup mddev_create/destroy_serial_pool()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Now that all callers, except for the one that stops the array, already
suspend the array, there is no need to suspend it again in
mddev_create/destroy_serial_pool(), hence remove the 'is_suspend'
parameter.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md-bitmap.c |  8 ++++----
 drivers/md/md.c        | 33 ++++++++++-----------------------
 drivers/md/md.h        |  7 +++----
 3 files changed, 17 insertions(+), 31 deletions(-)

diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index b3d701c5c461..9672f75c3050 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -1861,7 +1861,7 @@ void md_bitmap_destroy(struct mddev *mddev)
 
 	md_bitmap_wait_behind_writes(mddev);
 	if (!mddev->serialize_policy)
-		mddev_destroy_serial_pool(mddev, NULL, true);
+		mddev_destroy_serial_pool(mddev, NULL);
 
 	mutex_lock(&mddev->bitmap_info.mutex);
 	spin_lock(&mddev->lock);
@@ -1977,7 +1977,7 @@ int md_bitmap_load(struct mddev *mddev)
 		goto out;
 
 	rdev_for_each(rdev, mddev)
-		mddev_create_serial_pool(mddev, rdev, true);
+		mddev_create_serial_pool(mddev, rdev);
 
 	if (mddev_is_clustered(mddev))
 		md_cluster_ops->load_bitmaps(mddev, mddev->bitmap_info.nodes);
@@ -2562,11 +2562,11 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
 	if (!backlog && mddev->serial_info_pool) {
 		/* serial_info_pool is not needed if backlog is zero */
 		if (!mddev->serialize_policy)
-			mddev_destroy_serial_pool(mddev, NULL, true);
+			mddev_destroy_serial_pool(mddev, NULL);
 	} else if (backlog && !mddev->serial_info_pool) {
 		/* serial_info_pool is needed since backlog is not zero */
 		rdev_for_each(rdev, mddev)
-			mddev_create_serial_pool(mddev, rdev, true);
+			mddev_create_serial_pool(mddev, rdev);
 	}
 	if (old_mwb != backlog)
 		md_bitmap_update_sb(mddev->bitmap);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 0bb4c59543aa..53133b37c9b9 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -206,8 +206,7 @@ static int rdev_need_serial(struct md_rdev *rdev)
  * 1. rdev is the first device which return true from rdev_enable_serial.
  * 2. rdev is NULL, means we want to enable serialization for all rdevs.
  */
-void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
-			      bool is_suspend)
+void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev)
 {
 	int ret = 0;
 
@@ -215,15 +214,12 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 	    !test_bit(CollisionCheck, &rdev->flags))
 		return;
 
-	if (!is_suspend)
-		mddev_suspend(mddev);
-
 	if (!rdev)
 		ret = rdevs_init_serial(mddev);
 	else
 		ret = rdev_init_serial(rdev);
 	if (ret)
-		goto abort;
+		return;
 
 	if (mddev->serial_info_pool == NULL) {
 		/*
@@ -238,10 +234,6 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 			pr_err("can't alloc memory pool for serialization\n");
 		}
 	}
-
-abort:
-	if (!is_suspend)
-		mddev_resume(mddev);
 }
 
 /*
@@ -250,8 +242,7 @@ void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
  * 2. when bitmap is destroyed while policy is not enabled.
  * 3. for disable policy, the pool is destroyed only when no rdev needs it.
  */
-void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
-			       bool is_suspend)
+void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev)
 {
 	if (rdev && !test_bit(CollisionCheck, &rdev->flags))
 		return;
@@ -260,8 +251,6 @@ void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 		struct md_rdev *temp;
 		int num = 0; /* used to track if other rdevs need the pool */
 
-		if (!is_suspend)
-			mddev_suspend(mddev);
 		rdev_for_each(temp, mddev) {
 			if (!rdev) {
 				if (!mddev->serialize_policy ||
@@ -283,8 +272,6 @@ void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
 			mempool_destroy(mddev->serial_info_pool);
 			mddev->serial_info_pool = NULL;
 		}
-		if (!is_suspend)
-			mddev_resume(mddev);
 	}
 }
 
@@ -2535,7 +2522,7 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
 	pr_debug("md: bind<%s>\n", b);
 
 	if (mddev->raid_disks)
-		mddev_create_serial_pool(mddev, rdev, true);
+		mddev_create_serial_pool(mddev, rdev);
 
 	if ((err = kobject_add(&rdev->kobj, &mddev->kobj, "dev-%s", b)))
 		goto fail;
@@ -2588,7 +2575,7 @@ static void md_kick_rdev_from_array(struct md_rdev *rdev)
 	bd_unlink_disk_holder(rdev->bdev, rdev->mddev->gendisk);
 	list_del_rcu(&rdev->same_set);
 	pr_debug("md: unbind<%pg>\n", rdev->bdev);
-	mddev_destroy_serial_pool(rdev->mddev, rdev, false);
+	mddev_destroy_serial_pool(rdev->mddev, rdev);
 	rdev->mddev = NULL;
 	sysfs_remove_link(&rdev->kobj, "block");
 	sysfs_put(rdev->sysfs_state);
@@ -3055,11 +3042,11 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
 		}
 	} else if (cmd_match(buf, "writemostly")) {
 		set_bit(WriteMostly, &rdev->flags);
-		mddev_create_serial_pool(rdev->mddev, rdev, true);
+		mddev_create_serial_pool(rdev->mddev, rdev);
 		need_update_sb = true;
 		err = 0;
 	} else if (cmd_match(buf, "-writemostly")) {
-		mddev_destroy_serial_pool(rdev->mddev, rdev, true);
+		mddev_destroy_serial_pool(rdev->mddev, rdev);
 		clear_bit(WriteMostly, &rdev->flags);
 		need_update_sb = true;
 		err = 0;
@@ -5563,9 +5550,9 @@ serialize_policy_store(struct mddev *mddev, const char *buf, size_t len)
 	}
 
 	if (value)
-		mddev_create_serial_pool(mddev, NULL, true);
+		mddev_create_serial_pool(mddev, NULL);
 	else
-		mddev_destroy_serial_pool(mddev, NULL, true);
+		mddev_destroy_serial_pool(mddev, NULL);
 	mddev->serialize_policy = value;
 unlock:
 	mddev_unlock_and_resume(mddev);
@@ -6331,7 +6318,7 @@ static void __md_stop_writes(struct mddev *mddev)
 	}
 	/* disable policy to guarantee rdevs free resources for serialization */
 	mddev->serialize_policy = 0;
-	mddev_destroy_serial_pool(mddev, NULL, true);
+	mddev_destroy_serial_pool(mddev, NULL);
 }
 
 void md_stop_writes(struct mddev *mddev)
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 07496179084a..73334034e880 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -817,10 +817,9 @@ extern void __mddev_resume(struct mddev *mddev);
 
 extern void md_reload_sb(struct mddev *mddev, int raid_disk);
 extern void md_update_sb(struct mddev *mddev, int force);
-extern void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
-				     bool is_suspend);
-extern void mddev_destroy_serial_pool(struct mddev *mddev, struct md_rdev *rdev,
-				      bool is_suspend);
+extern void mddev_create_serial_pool(struct mddev *mddev, struct md_rdev *rdev);
+extern void mddev_destroy_serial_pool(struct mddev *mddev,
+				      struct md_rdev *rdev);
 struct md_rdev *md_find_rdev_nr_rcu(struct mddev *mddev, int nr);
 struct md_rdev *md_find_rdev_rcu(struct mddev *mddev, dev_t dev);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 26/28] md/md-linear: cleanup linear_add()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Now that the caller already suspends the array, there is no need to
suspend the array in linear_add().

Note that mddev_suspend/resume() are not used anymore after this patch.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md-linear.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/md/md-linear.c b/drivers/md/md-linear.c
index 71ac99646827..66412397cef0 100644
--- a/drivers/md/md-linear.c
+++ b/drivers/md/md-linear.c
@@ -183,7 +183,6 @@ static int linear_add(struct mddev *mddev, struct md_rdev *rdev)
 	 * in linear_congested(), therefore kfree_rcu() is used to free
 	 * oldconf until no one uses it anymore.
 	 */
-	mddev_suspend(mddev);
 	oldconf = rcu_dereference_protected(mddev->private,
 			lockdep_is_held(&mddev->reconfig_mutex));
 	mddev->raid_disks++;
@@ -192,7 +191,6 @@ static int linear_add(struct mddev *mddev, struct md_rdev *rdev)
 	rcu_assign_pointer(mddev->private, newconf);
 	md_set_array_sectors(mddev, linear_size(mddev, 0, 0));
 	set_capacity_and_notify(mddev->gendisk, mddev->array_sectors);
-	mddev_resume(mddev);
 	kfree_rcu(oldconf, rcu);
 	return 0;
 }
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 27/28] md: remove old apis to suspend the array
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Now that mddev_suspend() and mddev_resume() are not used anywhere, remove
them, and remove 'MD_ALLOW_SB_UPDATE' and 'MD_UPDATING_SB' as well.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
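Note for readers: the functions removed below nest through the
'suspended' counter, so only the outermost suspend drains io and only the
outermost resume restarts it. A minimal userspace sketch of that counting
follows. All names are hypothetical, the drain and restart steps are
reduced to counters, and whether the remaining helpers nest in exactly
the same way is an assumption, since their bodies are not part of this
diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; the counters only record how often work was done. */
struct model_dev {
	int suspended;	/* nesting depth */
	int drains;	/* times in-flight io was actually drained */
	int restarts;	/* times io was actually re-enabled */
};

void model_suspend(struct model_dev *d)
{
	if (d->suspended++)
		return;		/* already suspended by an outer caller */
	d->drains++;
}

void model_resume(struct model_dev *d)
{
	assert(d->suspended > 0);
	if (--d->suspended)
		return;		/* an outer caller still wants io stopped */
	d->restarts++;
}

bool model_is_suspended(const struct model_dev *d)
{
	return d->suspended > 0;
}
```

With two nested suspends the model drains once, stays suspended after the
first resume, and restarts io only on the second resume.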
 drivers/md/md.c | 82 ++-----------------------------------------------
 drivers/md/md.h |  8 -----
 2 files changed, 3 insertions(+), 87 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 53133b37c9b9..bb67734eded6 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -418,74 +418,10 @@ static void md_submit_bio(struct bio *bio)
 	md_handle_request(mddev, bio);
 }
 
-/* mddev_suspend makes sure no new requests are submitted
- * to the device, and that any requests that have been submitted
- * are completely handled.
- * Once mddev_detach() is called and completes, the module will be
- * completely unused.
+/*
+ * Make sure no new requests are submitted to the device, and any requests that
+ * have been submitted are completely handled.
  */
-void mddev_suspend(struct mddev *mddev)
-{
-	struct md_thread *thread = rcu_dereference_protected(mddev->thread,
-			lockdep_is_held(&mddev->reconfig_mutex));
-
-	WARN_ON_ONCE(thread && current == thread->tsk);
-
-	/* can't concurrent with __mddev_suspend() and __mddev_resume() */
-	mutex_lock(&mddev->suspend_mutex);
-	if (mddev->suspended++) {
-		mutex_unlock(&mddev->suspend_mutex);
-		return;
-	}
-
-	wake_up(&mddev->sb_wait);
-	set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
-	percpu_ref_kill(&mddev->active_io);
-
-	/*
-	 * TODO: cleanup 'pers->prepare_suspend after all callers are replaced
-	 * by __mddev_suspend().
-	 */
-	if (mddev->pers && mddev->pers->prepare_suspend)
-		mddev->pers->prepare_suspend(mddev);
-
-	wait_event(mddev->sb_wait, percpu_ref_is_zero(&mddev->active_io));
-	clear_bit_unlock(MD_ALLOW_SB_UPDATE, &mddev->flags);
-	wait_event(mddev->sb_wait, !test_bit(MD_UPDATING_SB, &mddev->flags));
-
-	del_timer_sync(&mddev->safemode_timer);
-	/* restrict memory reclaim I/O during raid array is suspend */
-	mddev->noio_flag = memalloc_noio_save();
-
-	mutex_unlock(&mddev->suspend_mutex);
-}
-EXPORT_SYMBOL_GPL(mddev_suspend);
-
-void mddev_resume(struct mddev *mddev)
-{
-	lockdep_assert_held(&mddev->reconfig_mutex);
-
-	/* can't concurrent with __mddev_suspend() and __mddev_resume() */
-	mutex_lock(&mddev->suspend_mutex);
-	if (--mddev->suspended) {
-		mutex_unlock(&mddev->suspend_mutex);
-		return;
-	}
-
-	/* entred the memalloc scope from mddev_suspend() */
-	memalloc_noio_restore(mddev->noio_flag);
-
-	percpu_ref_resurrect(&mddev->active_io);
-	wake_up(&mddev->sb_wait);
-
-	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-	md_wakeup_thread(mddev->thread);
-	md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
-
-	mutex_unlock(&mddev->suspend_mutex);
-}
-EXPORT_SYMBOL_GPL(mddev_resume);
-
 void __mddev_suspend(struct mddev *mddev)
 {
 
@@ -9536,18 +9472,6 @@ static void md_start_sync(struct work_struct *ws)
  */
 void md_check_recovery(struct mddev *mddev)
 {
-	if (test_bit(MD_ALLOW_SB_UPDATE, &mddev->flags) && mddev->sb_flags) {
-		/* Write superblock - thread that called mddev_suspend()
-		 * holds reconfig_mutex for us.
-		 */
-		set_bit(MD_UPDATING_SB, &mddev->flags);
-		smp_mb__after_atomic();
-		if (test_bit(MD_ALLOW_SB_UPDATE, &mddev->flags))
-			md_update_sb(mddev, 0);
-		clear_bit_unlock(MD_UPDATING_SB, &mddev->flags);
-		wake_up(&mddev->sb_wait);
-	}
-
 	if (is_md_suspended(mddev))
 		return;
 
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 73334034e880..f932d0cb9db0 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -248,10 +248,6 @@ struct md_cluster_info;
  *			    become failed.
  * @MD_HAS_PPL:  The raid array has PPL feature set.
  * @MD_HAS_MULTIPLE_PPLS: The raid array has multiple PPLs feature set.
- * @MD_ALLOW_SB_UPDATE: md_check_recovery is allowed to update the metadata
- *			 without taking reconfig_mutex.
- * @MD_UPDATING_SB: md_check_recovery is updating the metadata without
- *		     explicitly holding reconfig_mutex.
  * @MD_NOT_READY: do_md_run() is active, so 'array_state', ust not report that
  *		   array is ready yet.
  * @MD_BROKEN: This is used to stop writes and mark array as failed.
@@ -268,8 +264,6 @@ enum mddev_flags {
 	MD_FAILFAST_SUPPORTED,
 	MD_HAS_PPL,
 	MD_HAS_MULTIPLE_PPLS,
-	MD_ALLOW_SB_UPDATE,
-	MD_UPDATING_SB,
 	MD_NOT_READY,
 	MD_BROKEN,
 	MD_DELETED,
@@ -810,8 +804,6 @@ extern int md_rdev_init(struct md_rdev *rdev);
 extern void md_rdev_clear(struct md_rdev *rdev);
 
 extern void md_handle_request(struct mddev *mddev, struct bio *bio);
-extern void mddev_suspend(struct mddev *mddev);
-extern void mddev_resume(struct mddev *mddev);
 extern void __mddev_suspend(struct mddev *mddev);
 extern void __mddev_resume(struct mddev *mddev);
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [dm-devel] [PATCH -next v2 28/28] md: rename __mddev_suspend/resume() back to mddev_suspend/resume()
  2023-08-28  1:59 ` Yu Kuai
@ 2023-08-28  2:00   ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-08-28  2:00 UTC (permalink / raw)
  To: agk, snitzer, dm-devel, song, xni
  Cc: yi.zhang, yangerkun, linux-kernel, linux-raid, yukuai1, yukuai3

From: Yu Kuai <yukuai3@huawei.com>

Now that the old apis are removed, __mddev_suspend/resume() can be
renamed to their original names.

This is done by:

sed -i "s/__mddev_suspend/mddev_suspend/g" *.[ch]
sed -i "s/__mddev_resume/mddev_resume/g" *.[ch]

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/dm-raid.c     |  4 ++--
 drivers/md/md.c          | 30 +++++++++++++++---------------
 drivers/md/md.h          | 16 ++++++++--------
 drivers/md/raid5-cache.c |  4 ++--
 4 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 2ff33b5d9a1b..5114daead945 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3799,7 +3799,7 @@ static void raid_postsuspend(struct dm_target *ti)
 		if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery))
 			md_stop_writes(&rs->md);
 
-		__mddev_suspend(&rs->md);
+		mddev_suspend(&rs->md);
 	}
 }
 
@@ -4011,7 +4011,7 @@ static int raid_preresume(struct dm_target *ti)
 	}
 
 	/* Check for any resize/reshape on @rs and adjust/initiate */
-	/* Be prepared for __mddev_resume() in raid_resume() */
+	/* Be prepared for mddev_resume() in raid_resume() */
 	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	if (mddev->recovery_cp && mddev->recovery_cp < MaxSector) {
 		set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index bb67734eded6..3afaf960aa4a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -422,7 +422,7 @@ static void md_submit_bio(struct bio *bio)
  * Make sure no new requests are submitted to the device, and any requests that
  * have been submitted are completely handled.
  */
-void __mddev_suspend(struct mddev *mddev)
+void mddev_suspend(struct mddev *mddev)
 {
 
 	/*
@@ -456,9 +456,9 @@ void __mddev_suspend(struct mddev *mddev)
 
 	mutex_unlock(&mddev->suspend_mutex);
 }
-EXPORT_SYMBOL_GPL(__mddev_suspend);
+EXPORT_SYMBOL_GPL(mddev_suspend);
 
-void __mddev_resume(struct mddev *mddev)
+void mddev_resume(struct mddev *mddev)
 {
 	lockdep_assert_not_held(&mddev->reconfig_mutex);
 
@@ -469,7 +469,7 @@ void __mddev_resume(struct mddev *mddev)
 		return;
 	}
 
-	/* entred the memalloc scope from __mddev_suspend() */
+	/* entred the memalloc scope from mddev_suspend() */
 	memalloc_noio_restore(mddev->noio_flag);
 
 	percpu_ref_resurrect(&mddev->active_io);
@@ -481,7 +481,7 @@ void __mddev_resume(struct mddev *mddev)
 
 	mutex_unlock(&mddev->suspend_mutex);
 }
-EXPORT_SYMBOL_GPL(__mddev_resume);
+EXPORT_SYMBOL_GPL(mddev_resume);
 
 /*
  * Generic flush handling for md
@@ -3609,7 +3609,7 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
 		if (cmd_match(page, "remove") || cmd_match(page, "re-add") ||
 		    cmd_match(page, "writemostly") ||
 		    cmd_match(page, "-writemostly")) {
-			__mddev_suspend(mddev);
+			mddev_suspend(mddev);
 			suspended = true;
 		}
 	}
@@ -3623,7 +3623,7 @@ rdev_attr_store(struct kobject *kobj, struct attribute *attr,
 		mddev_unlock(mddev);
 
 		if (suspended)
-			__mddev_resume(mddev);
+			mddev_resume(mddev);
 	}
 
 	if (kn)
@@ -5197,9 +5197,9 @@ suspend_lo_store(struct mddev *mddev, const char *buf, size_t len)
 	if (new != (sector_t)new)
 		return -EINVAL;
 
-	__mddev_suspend(mddev);
+	mddev_suspend(mddev);
 	WRITE_ONCE(mddev->suspend_lo, new);
-	__mddev_resume(mddev);
+	mddev_resume(mddev);
 
 	return len;
 }
@@ -5225,9 +5225,9 @@ suspend_hi_store(struct mddev *mddev, const char *buf, size_t len)
 	if (new != (sector_t)new)
 		return -EINVAL;
 
-	__mddev_suspend(mddev);
+	mddev_suspend(mddev);
 	WRITE_ONCE(mddev->suspend_hi, new);
-	__mddev_resume(mddev);
+	mddev_resume(mddev);
 
 	return len;
 }
@@ -7789,7 +7789,7 @@ static int md_ioctl(struct block_device *bdev, blk_mode_t mode,
 
 	mddev_unlock(mddev);
 	if (md_ioctl_need_suspend(cmd))
-		__mddev_resume(mddev);
+		mddev_resume(mddev);
 
 out:
 	if(did_set_md_closing)
@@ -9383,7 +9383,7 @@ static void md_start_sync(struct work_struct *ws)
 	bool suspended = false;
 
 	if (md_spares_need_change(mddev)) {
-		__mddev_suspend(mddev);
+		mddev_suspend(mddev);
 		suspended = true;
 	}
 
@@ -9425,7 +9425,7 @@ static void md_start_sync(struct work_struct *ws)
 
 	mddev_unlock(mddev);
 	if (suspended)
-		__mddev_resume(mddev);
+		mddev_resume(mddev);
 
 	md_wakeup_thread(mddev->sync_thread);
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
@@ -9440,7 +9440,7 @@ static void md_start_sync(struct work_struct *ws)
 	clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
 	mddev_unlock(mddev);
 	if (suspended)
-		__mddev_resume(mddev);
+		mddev_resume(mddev);
 
 	wake_up(&resync_wait);
 	if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
diff --git a/drivers/md/md.h b/drivers/md/md.h
index f932d0cb9db0..fc64dea4f84d 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -804,8 +804,8 @@ extern int md_rdev_init(struct md_rdev *rdev);
 extern void md_rdev_clear(struct md_rdev *rdev);
 
 extern void md_handle_request(struct mddev *mddev, struct bio *bio);
-extern void __mddev_suspend(struct mddev *mddev);
-extern void __mddev_resume(struct mddev *mddev);
+extern void mddev_suspend(struct mddev *mddev);
+extern void mddev_resume(struct mddev *mddev);
 
 extern void md_reload_sb(struct mddev *mddev, int raid_disk);
 extern void md_update_sb(struct mddev *mddev, int force);
@@ -853,17 +853,17 @@ static inline int mddev_suspend_and_lock(struct mddev *mddev)
 {
 	int ret;
 
-	__mddev_suspend(mddev);
+	mddev_suspend(mddev);
 	ret = mddev_lock(mddev);
 	if (ret)
-		__mddev_resume(mddev);
+		mddev_resume(mddev);
 
 	return ret;
 }
 
 static inline void mddev_suspend_and_lock_nointr(struct mddev *mddev)
 {
-	__mddev_suspend(mddev);
+	mddev_suspend(mddev);
 	mutex_lock(&mddev->reconfig_mutex);
 }
 
@@ -871,10 +871,10 @@ static inline int mddev_suspend_and_trylock(struct mddev *mddev)
 {
 	int ret;
 
-	__mddev_suspend(mddev);
+	mddev_suspend(mddev);
 	ret = mutex_trylock(&mddev->reconfig_mutex);
 	if (ret)
-		__mddev_resume(mddev);
+		mddev_resume(mddev);
 
 	return ret;
 }
@@ -882,7 +882,7 @@ static inline int mddev_suspend_and_trylock(struct mddev *mddev)
 static inline void mddev_unlock_and_resume(struct mddev *mddev)
 {
 	mddev_unlock(mddev);
-	__mddev_resume(mddev);
+	mddev_resume(mddev);
 }
 
 struct mdu_array_info_s;
diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 38d38f2e33bc..4dc69826a5d3 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -698,9 +698,9 @@ static void r5c_disable_writeback_async(struct work_struct *work)
 		   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags));
 
 	if (READ_ONCE(conf->log)) {
-		__mddev_suspend(mddev);
+		mddev_suspend(mddev);
 		log->r5c_journal_mode = R5C_JOURNAL_MODE_WRITE_THROUGH;
-		__mddev_resume(mddev);
+		mddev_resume(mddev);
 	}
 }
 
-- 
2.39.2

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH -next v2 01/28] md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
  2023-08-28  1:59   ` Yu Kuai
@ 2023-09-14  2:53     ` Xiao Ni
  -1 siblings, 0 replies; 74+ messages in thread
From: Xiao Ni @ 2023-09-14  2:53 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, dm-devel, song, linux-kernel, linux-raid, yukuai3,
	yi.zhang, yangerkun

On Mon, Aug 28, 2023 at 10:04 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Because reading 'suspend_lo' and 'suspend_hi' from md_handle_request()
> is not protected, use READ_ONCE/WRITE_ONCE to prevent reading abnormal
> value.

Hi Kuai

If we don't use READ_ONCE/WRITE_ONCE, what's the risk here? Could you
explain in detail or give an example?

Regards
Xiao
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/md.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 46badd13a687..9d8dff9d923c 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -359,11 +359,11 @@ static bool is_suspended(struct mddev *mddev, struct bio *bio)
>                 return true;
>         if (bio_data_dir(bio) != WRITE)
>                 return false;
> -       if (mddev->suspend_lo >= mddev->suspend_hi)
> +       if (READ_ONCE(mddev->suspend_lo) >= READ_ONCE(mddev->suspend_hi))
>                 return false;
> -       if (bio->bi_iter.bi_sector >= mddev->suspend_hi)
> +       if (bio->bi_iter.bi_sector >= READ_ONCE(mddev->suspend_hi))
>                 return false;
> -       if (bio_end_sector(bio) < mddev->suspend_lo)
> +       if (bio_end_sector(bio) < READ_ONCE(mddev->suspend_lo))
>                 return false;
>         return true;
>  }
> @@ -5171,7 +5171,8 @@ __ATTR(sync_max, S_IRUGO|S_IWUSR, max_sync_show, max_sync_store);
>  static ssize_t
>  suspend_lo_show(struct mddev *mddev, char *page)
>  {
> -       return sprintf(page, "%llu\n", (unsigned long long)mddev->suspend_lo);
> +       return sprintf(page, "%llu\n",
> +                      (unsigned long long)READ_ONCE(mddev->suspend_lo));
>  }
>
>  static ssize_t
> @@ -5191,7 +5192,7 @@ suspend_lo_store(struct mddev *mddev, const char *buf, size_t len)
>                 return err;
>
>         mddev_suspend(mddev);
> -       mddev->suspend_lo = new;
> +       WRITE_ONCE(mddev->suspend_lo, new);
>         mddev_resume(mddev);
>
>         mddev_unlock(mddev);
> @@ -5203,7 +5204,8 @@ __ATTR(suspend_lo, S_IRUGO|S_IWUSR, suspend_lo_show, suspend_lo_store);
>  static ssize_t
>  suspend_hi_show(struct mddev *mddev, char *page)
>  {
> -       return sprintf(page, "%llu\n", (unsigned long long)mddev->suspend_hi);
> +       return sprintf(page, "%llu\n",
> +                      (unsigned long long)READ_ONCE(mddev->suspend_hi));
>  }
>
>  static ssize_t
> @@ -5223,7 +5225,7 @@ suspend_hi_store(struct mddev *mddev, const char *buf, size_t len)
>                 return err;
>
>         mddev_suspend(mddev);
> -       mddev->suspend_hi = new;
> +       WRITE_ONCE(mddev->suspend_hi, new);
>         mddev_resume(mddev);
>
>         mddev_unlock(mddev);
> --
> 2.39.2
>


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH -next v2 02/28] md: use 'mddev->suspended' for is_md_suspended()
  2023-08-28  1:59   ` Yu Kuai
@ 2023-09-20  8:46     ` Xiao Ni
  -1 siblings, 0 replies; 74+ messages in thread
From: Xiao Ni @ 2023-09-20  8:46 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, dm-devel, song, linux-kernel, linux-raid, yukuai3,
	yi.zhang, yangerkun

On Mon, Aug 28, 2023 at 10:04 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> 'pers->prepare_suspend' is introduced to prevent a deadlock for raid456,
> this change prepares to clean this up in later patches while refactoring
> mddev_suspend(). Specifically allow reshape to make progress while
> waiting for 'active_io' to be 0.

Hi Kuai

I don't follow the commit message. How is this change related to
pers->prepare_suspend? And why would it affect reshape? If it really does
affect both, could you explain in more detail?

>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/md.c | 2 +-
>  drivers/md/md.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 9d8dff9d923c..7fa311a14317 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -355,7 +355,7 @@ static DEFINE_SPINLOCK(all_mddevs_lock);
>   */
>  static bool is_suspended(struct mddev *mddev, struct bio *bio)
>  {
> -       if (is_md_suspended(mddev))
> +       if (is_md_suspended(mddev) || percpu_ref_is_dying(&mddev->active_io))

Shouldn't checking mddev->suspended be enough to tell whether the array is
suspended? mddev->suspended must already be true whenever active_io is
dying.

Best Regards
Xiao
>                 return true;
>         if (bio_data_dir(bio) != WRITE)
>                 return false;
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index b628c292506e..fb3b123f16dd 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -584,7 +584,7 @@ static inline bool md_is_rdwr(struct mddev *mddev)
>
>  static inline bool is_md_suspended(struct mddev *mddev)
>  {
> -       return percpu_ref_is_dying(&mddev->active_io);
> +       return READ_ONCE(mddev->suspended);
>  }
>
>  static inline int __must_check mddev_lock(struct mddev *mddev)
> --
> 2.39.2
>


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH -next v2 01/28] md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
  2023-09-14  2:53     ` [dm-devel] " Xiao Ni
@ 2023-09-25  1:18       ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-09-25  1:18 UTC (permalink / raw)
  To: Xiao Ni, Yu Kuai
  Cc: agk, snitzer, dm-devel, song, linux-kernel, linux-raid, yi.zhang,
	yangerkun, yukuai (C)

Hi,

On 2023/09/14 10:53, Xiao Ni wrote:
> On Mon, Aug 28, 2023 at 10:04 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Because reading 'suspend_lo' and 'suspend_hi' from md_handle_request()
>> is not protected, use READ_ONCE/WRITE_ONCE to prevent reading abnormal
>> value.
> 
> Hi Kuai
> 
> If we don't use READ_ONCE/WRITE_ONCE, what's the risk here? Could you
> explain in detail or give an example?

Sorry for the late reply.

That depends on the architecture: a load or store may not be atomic.
For example:

// assume a is 10 and t1 writes 01
t1: write the first half
a = 11
		t2: read
		a = 11 -> t2 reads an abnormal value
t1: write the other half
a = 01

For a naturally aligned, word-sized variable, READ_ONCE/WRITE_ONCE keep
the compiler from splitting the access, so either the old value or the
new value is read.
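
To make the interleaving above concrete, here is a minimal userspace
sketch in C, not kernel code. It replays the sequence deterministically in
one thread instead of racing two threads. The helper names
(torn_store_low_half(), torn_store_high_half(), observe_torn_store(),
observe_single_store()) and the 16-bit values are made up for
illustration, and the READ_ONCE/WRITE_ONCE macros are simplified
volatile-cast stand-ins, not the kernel's definitions.

```c
#include <assert.h>	/* only for the assert()-based usage shown below */
#include <stdint.h>

/*
 * Simplified userspace stand-ins (an assumption of this sketch): the
 * volatile cast asks the compiler for exactly one access of the given type.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical helper: model the first half of a store done in two pieces. */
static void torn_store_low_half(uint16_t *a, uint16_t new_val)
{
	*a = (uint16_t)((*a & 0xff00u) | (new_val & 0x00ffu));
}

/* Hypothetical helper: model the second half of the same store. */
static void torn_store_high_half(uint16_t *a, uint16_t new_val)
{
	*a = (uint16_t)((*a & 0x00ffu) | (new_val & 0xff00u));
}

/* Replay the interleaving: return what t2 observes between the two halves. */
static uint16_t observe_torn_store(uint16_t old_val, uint16_t new_val)
{
	uint16_t a = old_val;
	uint16_t seen;

	torn_store_low_half(&a, new_val);	/* t1: write half first */
	seen = a;				/* t2: read in the gap */
	torn_store_high_half(&a, new_val);	/* t1: write other half */
	return seen;
}

/* Same sequence with one full-width store: t2 sees old or new, never a mix. */
static uint16_t observe_single_store(uint16_t old_val, uint16_t new_val,
				     int read_after)
{
	uint16_t a = old_val;
	uint16_t seen = READ_ONCE(a);	/* t2 reads before the store */

	WRITE_ONCE(a, new_val);		/* t1: one full-width store */
	if (read_after)
		seen = READ_ONCE(a);	/* t2 reads after the store */
	return seen;
}
```

With the old value 0x0100 and the new value 0x0001, calling
observe_torn_store(0x0100, 0x0001) returns 0x0101, which is neither of
them, while observe_single_store() returns 0x0100 or 0x0001 depending on
whether the read happens before or after the store.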

Thanks,
Kuai

> 
> Regards
> Xiao
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/md/md.c | 16 +++++++++-------
>>   1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 46badd13a687..9d8dff9d923c 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -359,11 +359,11 @@ static bool is_suspended(struct mddev *mddev, struct bio *bio)
>>                  return true;
>>          if (bio_data_dir(bio) != WRITE)
>>                  return false;
>> -       if (mddev->suspend_lo >= mddev->suspend_hi)
>> +       if (READ_ONCE(mddev->suspend_lo) >= READ_ONCE(mddev->suspend_hi))
>>                  return false;
>> -       if (bio->bi_iter.bi_sector >= mddev->suspend_hi)
>> +       if (bio->bi_iter.bi_sector >= READ_ONCE(mddev->suspend_hi))
>>                  return false;
>> -       if (bio_end_sector(bio) < mddev->suspend_lo)
>> +       if (bio_end_sector(bio) < READ_ONCE(mddev->suspend_lo))
>>                  return false;
>>          return true;
>>   }
>> @@ -5171,7 +5171,8 @@ __ATTR(sync_max, S_IRUGO|S_IWUSR, max_sync_show, max_sync_store);
>>   static ssize_t
>>   suspend_lo_show(struct mddev *mddev, char *page)
>>   {
>> -       return sprintf(page, "%llu\n", (unsigned long long)mddev->suspend_lo);
>> +       return sprintf(page, "%llu\n",
>> +                      (unsigned long long)READ_ONCE(mddev->suspend_lo));
>>   }
>>
>>   static ssize_t
>> @@ -5191,7 +5192,7 @@ suspend_lo_store(struct mddev *mddev, const char *buf, size_t len)
>>                  return err;
>>
>>          mddev_suspend(mddev);
>> -       mddev->suspend_lo = new;
>> +       WRITE_ONCE(mddev->suspend_lo, new);
>>          mddev_resume(mddev);
>>
>>          mddev_unlock(mddev);
>> @@ -5203,7 +5204,8 @@ __ATTR(suspend_lo, S_IRUGO|S_IWUSR, suspend_lo_show, suspend_lo_store);
>>   static ssize_t
>>   suspend_hi_show(struct mddev *mddev, char *page)
>>   {
>> -       return sprintf(page, "%llu\n", (unsigned long long)mddev->suspend_hi);
>> +       return sprintf(page, "%llu\n",
>> +                      (unsigned long long)READ_ONCE(mddev->suspend_hi));
>>   }
>>
>>   static ssize_t
>> @@ -5223,7 +5225,7 @@ suspend_hi_store(struct mddev *mddev, const char *buf, size_t len)
>>                  return err;
>>
>>          mddev_suspend(mddev);
>> -       mddev->suspend_hi = new;
>> +       WRITE_ONCE(mddev->suspend_hi, new);
>>          mddev_resume(mddev);
>>
>>          mddev_unlock(mddev);
>> --
>> 2.39.2
>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH -next v2 02/28] md: use 'mddev->suspended' for is_md_suspended()
  2023-09-20  8:46     ` [dm-devel] " Xiao Ni
@ 2023-09-25  1:34       ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-09-25  1:34 UTC (permalink / raw)
  To: Xiao Ni, Yu Kuai
  Cc: agk, snitzer, dm-devel, song, linux-kernel, linux-raid, yi.zhang,
	yangerkun, yukuai (C)

Hi,

On 2023/09/20 16:46, Xiao Ni wrote:
> On Mon, Aug 28, 2023 at 10:04 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> 'pers->prepare_suspend' is introduced to prevent a deadlock for raid456,
>> this change prepares to clean this up in later patches while refactoring
>> mddev_suspend(). Specifically allow reshape to make progress while
>> waiting for 'active_io' to be 0.
> 
> Hi Kuai
> 
> From my side, I can't understand the comments. Is the change related
> to pers->prepare_suspend? And why can this change affect reshape? If
> it indeed affects both of these things, can you explain more?

First of all, 'prepare_suspend' is used to fix a deadlock in raid456:

1) suspend is waiting for normal io to be done.

mddev_suspend
  mddev->suspended++ -> new sync_thread can't start
  percpu_ref_kill(active_io)
  wait_event(percpu_ref_is_zero(active_io))

2) normal io is waiting for reshape to make progress.
3) reshape is waiting for suspended array to be resumed.

md_check_recovery
  if (is_md_suspended(mddev))
   return

prepare_suspend then fails the io that is waiting for reshape to make
progress:

mddev_suspend
  mddev->suspended++
  percpu_ref_kill(active_io)
   -> new io will be stuck in md_handle_request
  pers->prepare_suspend() -> raid5_prepare_suspend
   -> wake_up(wait_for_overlap)
		// woke up
		raid5_make_request
		 make_stripe_request
		  !reshape_inprogress(mddev) && reshape_disabled(mddev)
		   // return io error for the io that is waiting for
		   // reshape to make progress

  wait_event(percpu_ref_is_zero(active_io))

With this patch and the new api to suspend array:

mddev_suspend
  percpu_ref_kill(active_io)
  wait_event(percpu_ref_is_zero(active_io))
  -> while waiting for normal io to be done, new sync_thread can still
     start, and reshape can still make progress.
  mddev->suspended++
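
The difference between the two orderings can be sketched with a small
single-threaded model. This is a hedged illustration only: 'toy_mddev'
and all 'toy_*' helpers are hypothetical, the model ignores
'pers->prepare_suspend', and it assumes that every in-flight io needs
reshape to run before it can complete.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; the field names mirror the thread, but this is not kernel code. */
struct toy_mddev {
	int suspended;		/* models mddev->suspended */
	bool active_io_dying;	/* models a killed 'active_io' percpu_ref */
	int inflight_io;	/* dispatched io, each waiting for reshape */
};

/* Models md_check_recovery() returning early while the array is suspended. */
static bool toy_reshape_can_run(const struct toy_mddev *m)
{
	return m->suspended == 0;
}

/* One round: if reshape may run, it lets one waiting io complete. */
static void toy_step(struct toy_mddev *m)
{
	if (m->inflight_io > 0 && toy_reshape_can_run(m))
		m->inflight_io--;
}

/* Bounded stand-in for wait_event(); returns true once all io is done. */
static bool toy_wait_for_io(struct toy_mddev *m, int max_rounds)
{
	for (int i = 0; i < max_rounds && m->inflight_io > 0; i++)
		toy_step(m);
	return m->inflight_io == 0;
}

/* Old order: mark the array suspended first, then wait for io. */
static bool toy_suspend_old(struct toy_mddev *m, int max_rounds)
{
	m->suspended++;
	m->active_io_dying = true;
	return toy_wait_for_io(m, max_rounds);
}

/* New order: kill active_io, wait for io, and only then mark suspended. */
static bool toy_suspend_new(struct toy_mddev *m, int max_rounds)
{
	m->active_io_dying = true;
	if (!toy_wait_for_io(m, max_rounds))
		return false;
	m->suspended++;
	return true;
}
```

In this model toy_suspend_old() never drains the io, because reshape is
blocked as soon as 'suspended' is non-zero, while toy_suspend_new()
drains it first and then marks the array suspended.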

> 
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/md/md.c | 2 +-
>>   drivers/md/md.h | 2 +-
>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 9d8dff9d923c..7fa311a14317 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -355,7 +355,7 @@ static DEFINE_SPINLOCK(all_mddevs_lock);
>>    */
>>   static bool is_suspended(struct mddev *mddev, struct bio *bio)
>>   {
>> -       if (is_md_suspended(mddev))
>> +       if (is_md_suspended(mddev) || percpu_ref_is_dying(&mddev->active_io))
> 
> If we use mddev->suspended to judge if the raid is suspended, it
> should be enough? Because mddev->suspended must be true when active_io
> is dying.

In the new api, active_io is killed before 'suspended' is increased, so
active_io can be dying while 'suspended' is still 0. The difference is
that the point at which the array counts as suspended is delayed from
the start of mddev_suspend() to when all dispatched io is done.

I think this is OK because it doesn't change the behaviour once
mddev_suspend() returns.
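
The three phases can be written out as a minimal sketch. The names
'toy_state', 'toy_is_md_suspended' and 'toy_io_must_wait' are
hypothetical stand-ins for the real helpers, chosen only to show why the
extra 'dying' check is needed in the io path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; not the real struct mddev. */
struct toy_state {
	int suspended;		/* models mddev->suspended */
	bool active_io_dying;	/* models percpu_ref_is_dying(&mddev->active_io) */
};

/* Models the new is_md_suspended(): only looks at the counter. */
static bool toy_is_md_suspended(const struct toy_state *s)
{
	return s->suspended != 0;
}

/*
 * Models the first check in is_suspended(): new io must wait both while
 * the array is suspended and while dispatched io is still draining.
 */
static bool toy_io_must_wait(const struct toy_state *s)
{
	return toy_is_md_suspended(s) || s->active_io_dying;
}
```

While io is draining, toy_is_md_suspended() is still false (so a
sync_thread may start), yet toy_io_must_wait() is already true (so new
io is held back).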

Thanks,
Kuai
> 
> Best Regards
> Xiao
>>                  return true;
>>          if (bio_data_dir(bio) != WRITE)
>>                  return false;
>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>> index b628c292506e..fb3b123f16dd 100644
>> --- a/drivers/md/md.h
>> +++ b/drivers/md/md.h
>> @@ -584,7 +584,7 @@ static inline bool md_is_rdwr(struct mddev *mddev)
>>
>>   static inline bool is_md_suspended(struct mddev *mddev)
>>   {
>> -       return percpu_ref_is_dying(&mddev->active_io);
>> +       return READ_ONCE(mddev->suspended);
>>   }
>>
>>   static inline int __must_check mddev_lock(struct mddev *mddev)
>> --
>> 2.39.2
>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH -next v2 03/28] md: add new helpers to suspend/resume array
  2023-08-28  1:59   ` Yu Kuai
@ 2023-09-25  7:21     ` Xiao Ni
  -1 siblings, 0 replies; 74+ messages in thread
From: Xiao Ni @ 2023-09-25  7:21 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, dm-devel, song, linux-kernel, linux-raid, yukuai3,
	yi.zhang, yangerkun

On Mon, Aug 28, 2023 at 10:04 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Advantages for new apis:
>  - reconfig_mutex is not required;
>  - the weird logic that suspend array hold 'reconfig_mutex' for
>    mddev_check_recovery() to update superblock is not needed;
>  - the special handling, 'pers->prepare_suspend', for raid456 is not
>    needed;
>  - It's safe to be called at any time once mddev is allocated, and it's
>    designed to be used from slow path where array configuration is changed;
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/md.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++--
>  drivers/md/md.h |  3 ++
>  2 files changed, 86 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 7fa311a14317..6236e2e395c1 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -443,12 +443,22 @@ void mddev_suspend(struct mddev *mddev)
>                         lockdep_is_held(&mddev->reconfig_mutex));
>
>         WARN_ON_ONCE(thread && current == thread->tsk);
> -       if (mddev->suspended++)
> +
> +       /* can't concurrent with __mddev_suspend() and __mddev_resume() */
> +       mutex_lock(&mddev->suspend_mutex);
> +       if (mddev->suspended++) {
> +               mutex_unlock(&mddev->suspend_mutex);
>                 return;
> +       }
> +
>         wake_up(&mddev->sb_wait);
>         set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
>         percpu_ref_kill(&mddev->active_io);
>
> +       /*
> +        * TODO: cleanup 'pers->prepare_suspend after all callers are replaced
> +        * by __mddev_suspend().
> +        */
>         if (mddev->pers && mddev->pers->prepare_suspend)
>                 mddev->pers->prepare_suspend(mddev);
>
> @@ -459,14 +469,21 @@ void mddev_suspend(struct mddev *mddev)
>         del_timer_sync(&mddev->safemode_timer);
>         /* restrict memory reclaim I/O during raid array is suspend */
>         mddev->noio_flag = memalloc_noio_save();
> +
> +       mutex_unlock(&mddev->suspend_mutex);
>  }
>  EXPORT_SYMBOL_GPL(mddev_suspend);
>
>  void mddev_resume(struct mddev *mddev)
>  {
>         lockdep_assert_held(&mddev->reconfig_mutex);
> -       if (--mddev->suspended)
> +
> +       /* can't concurrent with __mddev_suspend() and __mddev_resume() */
> +       mutex_lock(&mddev->suspend_mutex);
> +       if (--mddev->suspended) {
> +               mutex_unlock(&mddev->suspend_mutex);
>                 return;
> +       }
>
>         /* entred the memalloc scope from mddev_suspend() */
>         memalloc_noio_restore(mddev->noio_flag);
> @@ -477,9 +494,72 @@ void mddev_resume(struct mddev *mddev)
>         set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
>         md_wakeup_thread(mddev->thread);
>         md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
> +
> +       mutex_unlock(&mddev->suspend_mutex);
>  }
>  EXPORT_SYMBOL_GPL(mddev_resume);
>
> +void __mddev_suspend(struct mddev *mddev)
> +{
> +
> +       /*
> +        * hold reconfig_mutex to wait for normal io will deadlock, because
> +        * other context can't update super_block, and normal io can rely on
> +        * updating super_block.
> +        */
> +       lockdep_assert_not_held(&mddev->reconfig_mutex);
> +
> +       mutex_lock(&mddev->suspend_mutex);
> +
> +       if (mddev->suspended) {
> +               WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
> +               mutex_unlock(&mddev->suspend_mutex);
> +               return;
> +       }
> +
> +       percpu_ref_kill(&mddev->active_io);
> +       wait_event(mddev->sb_wait, percpu_ref_is_zero(&mddev->active_io));
> +
> +       /*
> +        * For raid456, io might be waiting for reshape to make progress,
> +        * allow new reshape to start while waiting for io to be done to
> +        * prevent deadlock.
> +        */
> +       WRITE_ONCE(mddev->suspended, mddev->suspended + 1);

This changes the order of setting 'suspended' and waiting on active_io.
'suspended' is used to stop I/O. Now it waits for active_io first and
only then increments 'suspended'. If new I/O isn't stopped, it looks like
active_io can never reach 0. So won't it get stuck waiting for active_io
to be 0?

Best Regards
Xiao

> +
> +       del_timer_sync(&mddev->safemode_timer);
> +       /* restrict memory reclaim I/O during raid array is suspend */
> +       mddev->noio_flag = memalloc_noio_save();
> +
> +       mutex_unlock(&mddev->suspend_mutex);
> +}
> +EXPORT_SYMBOL_GPL(__mddev_suspend);
> +
> +void __mddev_resume(struct mddev *mddev)
> +{
> +       lockdep_assert_not_held(&mddev->reconfig_mutex);
> +
> +       mutex_lock(&mddev->suspend_mutex);
> +       WRITE_ONCE(mddev->suspended, mddev->suspended - 1);
> +       if (mddev->suspended) {
> +               mutex_unlock(&mddev->suspend_mutex);
> +               return;
> +       }
> +
> +       /* entred the memalloc scope from __mddev_suspend() */
> +       memalloc_noio_restore(mddev->noio_flag);
> +
> +       percpu_ref_resurrect(&mddev->active_io);
> +       wake_up(&mddev->sb_wait);
> +
> +       set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> +       md_wakeup_thread(mddev->thread);
> +       md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
> +
> +       mutex_unlock(&mddev->suspend_mutex);
> +}
> +EXPORT_SYMBOL_GPL(__mddev_resume);
> +
>  /*
>   * Generic flush handling for md
>   */
> @@ -667,6 +747,7 @@ int mddev_init(struct mddev *mddev)
>         mutex_init(&mddev->open_mutex);
>         mutex_init(&mddev->reconfig_mutex);
>         mutex_init(&mddev->sync_mutex);
> +       mutex_init(&mddev->suspend_mutex);
>         mutex_init(&mddev->bitmap_info.mutex);
>         INIT_LIST_HEAD(&mddev->disks);
>         INIT_LIST_HEAD(&mddev->all_mddevs);
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index fb3b123f16dd..1103e6b08ad9 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -316,6 +316,7 @@ struct mddev {
>         unsigned long                   sb_flags;
>
>         int                             suspended;
> +       struct mutex                    suspend_mutex;
>         struct percpu_ref               active_io;
>         int                             ro;
>         int                             sysfs_active; /* set when sysfs deletes
> @@ -811,6 +812,8 @@ extern void md_rdev_clear(struct md_rdev *rdev);
>  extern void md_handle_request(struct mddev *mddev, struct bio *bio);
>  extern void mddev_suspend(struct mddev *mddev);
>  extern void mddev_resume(struct mddev *mddev);
> +extern void __mddev_suspend(struct mddev *mddev);
> +extern void __mddev_resume(struct mddev *mddev);
>
>  extern void md_reload_sb(struct mddev *mddev, int raid_disk);
>  extern void md_update_sb(struct mddev *mddev, int force);
> --
> 2.39.2
>


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH -next v2 03/28] md: add new helpers to suspend/resume array
  2023-09-25  7:21     ` [dm-devel] " Xiao Ni
@ 2023-09-25  7:23       ` Xiao Ni
  -1 siblings, 0 replies; 74+ messages in thread
From: Xiao Ni @ 2023-09-25  7:23 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, dm-devel, song, linux-kernel, linux-raid, yukuai3,
	yi.zhang, yangerkun

On Mon, Sep 25, 2023 at 3:21 PM Xiao Ni <xni@redhat.com> wrote:
>
> On Mon, Aug 28, 2023 at 10:04 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >
> > From: Yu Kuai <yukuai3@huawei.com>
> >
> > Advantages for new apis:
> >  - reconfig_mutex is not required;
> >  - the weird logic that suspend array hold 'reconfig_mutex' for
> >    mddev_check_recovery() to update superblock is not needed;
> >  - the special handling, 'pers->prepare_suspend', for raid456 is not
> >    needed;
> >  - It's safe to be called at any time once mddev is allocated, and it's
> >    designed to be used from slow path where array configuration is changed;
> >
> > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > ---
> >  drivers/md/md.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++--
> >  drivers/md/md.h |  3 ++
> >  2 files changed, 86 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index 7fa311a14317..6236e2e395c1 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -443,12 +443,22 @@ void mddev_suspend(struct mddev *mddev)
> >                         lockdep_is_held(&mddev->reconfig_mutex));
> >
> >         WARN_ON_ONCE(thread && current == thread->tsk);
> > -       if (mddev->suspended++)
> > +
> > +       /* can't concurrent with __mddev_suspend() and __mddev_resume() */
> > +       mutex_lock(&mddev->suspend_mutex);
> > +       if (mddev->suspended++) {
> > +               mutex_unlock(&mddev->suspend_mutex);
> >                 return;
> > +       }
> > +
> >         wake_up(&mddev->sb_wait);
> >         set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
> >         percpu_ref_kill(&mddev->active_io);
> >
> > +       /*
> > +        * TODO: cleanup 'pers->prepare_suspend after all callers are replaced
> > +        * by __mddev_suspend().
> > +        */
> >         if (mddev->pers && mddev->pers->prepare_suspend)
> >                 mddev->pers->prepare_suspend(mddev);
> >
> > @@ -459,14 +469,21 @@ void mddev_suspend(struct mddev *mddev)
> >         del_timer_sync(&mddev->safemode_timer);
> >         /* restrict memory reclaim I/O during raid array is suspend */
> >         mddev->noio_flag = memalloc_noio_save();
> > +
> > +       mutex_unlock(&mddev->suspend_mutex);
> >  }
> >  EXPORT_SYMBOL_GPL(mddev_suspend);
> >
> >  void mddev_resume(struct mddev *mddev)
> >  {
> >         lockdep_assert_held(&mddev->reconfig_mutex);
> > -       if (--mddev->suspended)
> > +
> > +       /* can't concurrent with __mddev_suspend() and __mddev_resume() */
> > +       mutex_lock(&mddev->suspend_mutex);
> > +       if (--mddev->suspended) {
> > +               mutex_unlock(&mddev->suspend_mutex);
> >                 return;
> > +       }
> >
> >         /* entred the memalloc scope from mddev_suspend() */
> >         memalloc_noio_restore(mddev->noio_flag);
> > @@ -477,9 +494,72 @@ void mddev_resume(struct mddev *mddev)
> >         set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> >         md_wakeup_thread(mddev->thread);
> >         md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
> > +
> > +       mutex_unlock(&mddev->suspend_mutex);
> >  }
> >  EXPORT_SYMBOL_GPL(mddev_resume);
> >
> > +void __mddev_suspend(struct mddev *mddev)
> > +{
> > +
> > +       /*
> > +        * hold reconfig_mutex to wait for normal io will deadlock, because
> > +        * other context can't update super_block, and normal io can rely on
> > +        * updating super_block.
> > +        */
> > +       lockdep_assert_not_held(&mddev->reconfig_mutex);
> > +
> > +       mutex_lock(&mddev->suspend_mutex);
> > +
> > +       if (mddev->suspended) {
> > +               WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
> > +               mutex_unlock(&mddev->suspend_mutex);
> > +               return;
> > +       }
> > +
> > +       percpu_ref_kill(&mddev->active_io);
> > +       wait_event(mddev->sb_wait, percpu_ref_is_zero(&mddev->active_io));
> > +
> > +       /*
> > +        * For raid456, io might be waiting for reshape to make progress,
> > +        * allow new reshape to start while waiting for io to be done to
> > +        * prevent deadlock.
> > +        */
> > +       WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
>
> This changes the order of setting 'suspended' and waiting on active_io.
> 'suspended' is used to stop I/O. Now it waits for active_io first and
> only then increments 'suspended'. If new I/O isn't stopped, it looks like
> active_io can never reach 0. So won't it get stuck waiting for active_io
> to be 0?

Ah, I see: you added the state of active_io to the check for whether a
raid is suspended.
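
Why the wait can finish is easier to see with a toy reference counter.
This is a simplified sketch under stated assumptions, not the real
percpu_ref implementation: 'toy_ref' and its helpers are hypothetical,
and the only property modelled is that a "tryget live" style call fails
once the reference has been killed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counter; the real percpu_ref is considerably more involved. */
struct toy_ref {
	long count;	/* users currently holding a reference */
	bool dying;	/* set once by toy_ref_kill() */
};

/* New users are refused once the ref is dying. */
static bool toy_ref_tryget_live(struct toy_ref *r)
{
	if (r->dying)
		return false;
	r->count++;
	return true;
}

/* An existing user drops its reference. */
static void toy_ref_put(struct toy_ref *r)
{
	assert(r->count > 0);
	r->count--;
}

/* Stop handing out new references; existing ones stay valid. */
static void toy_ref_kill(struct toy_ref *r)
{
	r->dying = true;
}

/* True once the ref was killed and every user has dropped out. */
static bool toy_ref_is_zero(const struct toy_ref *r)
{
	return r->dying && r->count == 0;
}
```

After toy_ref_kill() the count can only go down, so waiting for it to
reach zero terminates as long as the io already in flight can complete.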

Regards
Xiao
>
> Best Regards
> Xiao
>
> > +
> > +       del_timer_sync(&mddev->safemode_timer);
> > +       /* restrict memory reclaim I/O during raid array is suspend */
> > +       mddev->noio_flag = memalloc_noio_save();
> > +
> > +       mutex_unlock(&mddev->suspend_mutex);
> > +}
> > +EXPORT_SYMBOL_GPL(__mddev_suspend);
> > +
> > +void __mddev_resume(struct mddev *mddev)
> > +{
> > +       lockdep_assert_not_held(&mddev->reconfig_mutex);
> > +
> > +       mutex_lock(&mddev->suspend_mutex);
> > +       WRITE_ONCE(mddev->suspended, mddev->suspended - 1);
> > +       if (mddev->suspended) {
> > +               mutex_unlock(&mddev->suspend_mutex);
> > +               return;
> > +       }
> > +
> > +       /* entred the memalloc scope from __mddev_suspend() */
> > +       memalloc_noio_restore(mddev->noio_flag);
> > +
> > +       percpu_ref_resurrect(&mddev->active_io);
> > +       wake_up(&mddev->sb_wait);
> > +
> > +       set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> > +       md_wakeup_thread(mddev->thread);
> > +       md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
> > +
> > +       mutex_unlock(&mddev->suspend_mutex);
> > +}
> > +EXPORT_SYMBOL_GPL(__mddev_resume);
> > +
> >  /*
> >   * Generic flush handling for md
> >   */
> > @@ -667,6 +747,7 @@ int mddev_init(struct mddev *mddev)
> >         mutex_init(&mddev->open_mutex);
> >         mutex_init(&mddev->reconfig_mutex);
> >         mutex_init(&mddev->sync_mutex);
> > +       mutex_init(&mddev->suspend_mutex);
> >         mutex_init(&mddev->bitmap_info.mutex);
> >         INIT_LIST_HEAD(&mddev->disks);
> >         INIT_LIST_HEAD(&mddev->all_mddevs);
> > diff --git a/drivers/md/md.h b/drivers/md/md.h
> > index fb3b123f16dd..1103e6b08ad9 100644
> > --- a/drivers/md/md.h
> > +++ b/drivers/md/md.h
> > @@ -316,6 +316,7 @@ struct mddev {
> >         unsigned long                   sb_flags;
> >
> >         int                             suspended;
> > +       struct mutex                    suspend_mutex;
> >         struct percpu_ref               active_io;
> >         int                             ro;
> >         int                             sysfs_active; /* set when sysfs deletes
> > @@ -811,6 +812,8 @@ extern void md_rdev_clear(struct md_rdev *rdev);
> >  extern void md_handle_request(struct mddev *mddev, struct bio *bio);
> >  extern void mddev_suspend(struct mddev *mddev);
> >  extern void mddev_resume(struct mddev *mddev);
> > +extern void __mddev_suspend(struct mddev *mddev);
> > +extern void __mddev_resume(struct mddev *mddev);
> >
> >  extern void md_reload_sb(struct mddev *mddev, int raid_disk);
> >  extern void md_update_sb(struct mddev *mddev, int force);
> > --
> > 2.39.2
> >


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [dm-devel] [PATCH -next v2 03/28] md: add new helpers to suspend/resume array
@ 2023-09-25  7:23       ` Xiao Ni
  0 siblings, 0 replies; 74+ messages in thread
From: Xiao Ni @ 2023-09-25  7:23 UTC (permalink / raw)
  To: Yu Kuai
  Cc: yi.zhang, yangerkun, snitzer, linux-kernel, linux-raid, song,
	dm-devel, yukuai3, agk

On Mon, Sep 25, 2023 at 3:21 PM Xiao Ni <xni@redhat.com> wrote:
>
> On Mon, Aug 28, 2023 at 10:04 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >
> > From: Yu Kuai <yukuai3@huawei.com>
> >
> > Advantages for new apis:
> >  - reconfig_mutex is not required;
> >  - the odd logic where suspending the array while holding
> >    'reconfig_mutex' must still let mddev_check_recovery() update the
> >    superblock is no longer needed;
> >  - the special handling, 'pers->prepare_suspend', for raid456 is not
> >    needed;
> >  - It's safe to be called at any time once mddev is allocated, and it's
> >    designed to be used from slow path where array configuration is changed;
> >
> > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > ---
> >  drivers/md/md.c | 85 +++++++++++++++++++++++++++++++++++++++++++++++--
> >  drivers/md/md.h |  3 ++
> >  2 files changed, 86 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index 7fa311a14317..6236e2e395c1 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -443,12 +443,22 @@ void mddev_suspend(struct mddev *mddev)
> >                         lockdep_is_held(&mddev->reconfig_mutex));
> >
> >         WARN_ON_ONCE(thread && current == thread->tsk);
> > -       if (mddev->suspended++)
> > +
> > +       /* can't concurrent with __mddev_suspend() and __mddev_resume() */
> > +       mutex_lock(&mddev->suspend_mutex);
> > +       if (mddev->suspended++) {
> > +               mutex_unlock(&mddev->suspend_mutex);
> >                 return;
> > +       }
> > +
> >         wake_up(&mddev->sb_wait);
> >         set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
> >         percpu_ref_kill(&mddev->active_io);
> >
> > +       /*
> > +        * TODO: cleanup 'pers->prepare_suspend after all callers are replaced
> > +        * by __mddev_suspend().
> > +        */
> >         if (mddev->pers && mddev->pers->prepare_suspend)
> >                 mddev->pers->prepare_suspend(mddev);
> >
> > @@ -459,14 +469,21 @@ void mddev_suspend(struct mddev *mddev)
> >         del_timer_sync(&mddev->safemode_timer);
> >         /* restrict memory reclaim I/O during raid array is suspend */
> >         mddev->noio_flag = memalloc_noio_save();
> > +
> > +       mutex_unlock(&mddev->suspend_mutex);
> >  }
> >  EXPORT_SYMBOL_GPL(mddev_suspend);
> >
> >  void mddev_resume(struct mddev *mddev)
> >  {
> >         lockdep_assert_held(&mddev->reconfig_mutex);
> > -       if (--mddev->suspended)
> > +
> > +       /* can't concurrent with __mddev_suspend() and __mddev_resume() */
> > +       mutex_lock(&mddev->suspend_mutex);
> > +       if (--mddev->suspended) {
> > +               mutex_unlock(&mddev->suspend_mutex);
> >                 return;
> > +       }
> >
> >         /* entred the memalloc scope from mddev_suspend() */
> >         memalloc_noio_restore(mddev->noio_flag);
> > @@ -477,9 +494,72 @@ void mddev_resume(struct mddev *mddev)
> >         set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> >         md_wakeup_thread(mddev->thread);
> >         md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
> > +
> > +       mutex_unlock(&mddev->suspend_mutex);
> >  }
> >  EXPORT_SYMBOL_GPL(mddev_resume);
> >
> > +void __mddev_suspend(struct mddev *mddev)
> > +{
> > +
> > +       /*
> > +        * hold reconfig_mutex to wait for normal io will deadlock, because
> > +        * other context can't update super_block, and normal io can rely on
> > +        * updating super_block.
> > +        */
> > +       lockdep_assert_not_held(&mddev->reconfig_mutex);
> > +
> > +       mutex_lock(&mddev->suspend_mutex);
> > +
> > +       if (mddev->suspended) {
> > +               WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
> > +               mutex_unlock(&mddev->suspend_mutex);
> > +               return;
> > +       }
> > +
> > +       percpu_ref_kill(&mddev->active_io);
> > +       wait_event(mddev->sb_wait, percpu_ref_is_zero(&mddev->active_io));
> > +
> > +       /*
> > +        * For raid456, io might be waiting for reshape to make progress,
> > +        * allow new reshape to start while waiting for io to be done to
> > +        * prevent deadlock.
> > +        */
> > +       WRITE_ONCE(mddev->suspended, mddev->suspended + 1);
>
> This changes the order of setting 'suspended' and checking active_io.
> 'suspended' is used to stop I/O. Now active_io is checked first and
> 'suspended' is incremented afterwards. If I/O doesn't stop, it looks like
> active_io can never reach 0, so won't this get stuck waiting for
> active_io to become 0?

Ah, I see: you use the state of active_io to judge whether an array is suspended.

Regards
Xiao
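
The ordering question above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct array_model` and every `model_*` function are hypothetical stand-ins, not kernel code. `io_dying` models a killed `active_io` percpu_ref, and the split into `model_suspend_begin()` and `model_suspend_finish()` stands in for the `wait_event()` that sits between `percpu_ref_kill()` and the increment of `suspended` in `__mddev_suspend()`. The point it tries to show is that new I/O is refused as soon as the reference is killed, so in-flight I/O can drain even though `suspended` is incremented only afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct mddev; not the kernel type. */
struct array_model {
	int suspended;   /* nesting count, like mddev->suspended */
	bool io_dying;   /* models a killed active_io percpu_ref */
	int io_inflight; /* models references held by in-flight I/O */
};

/* Models a "tryget live" on active_io: fails once the ref was killed. */
static bool model_io_enter(struct array_model *a)
{
	if (a->io_dying)
		return false;
	a->io_inflight++;
	return true;
}

/* Models I/O completion dropping its reference. */
static void model_io_exit(struct array_model *a)
{
	assert(a->io_inflight > 0);
	a->io_inflight--;
}

/*
 * First half of suspend: stop new I/O by killing the reference.
 * Returns false for a nested suspend, which only bumps the counter.
 */
static bool model_suspend_begin(struct array_model *a)
{
	if (a->suspended) {
		a->suspended++;
		return false;
	}
	a->io_dying = true;
	return true;
}

/* Second half: runs only after in-flight I/O drained, then bumps the counter. */
static void model_suspend_finish(struct array_model *a)
{
	assert(a->io_dying && a->io_inflight == 0);
	a->suspended++;
}

/* Resume: the last resumer lets new I/O in again. */
static void model_resume(struct array_model *a)
{
	assert(a->suspended > 0);
	if (--a->suspended)
		return;
	a->io_dying = false;
}
```

In this model a caller would run `model_suspend_begin()`, let outstanding I/O finish via `model_io_exit()`, and only then call `model_suspend_finish()`; any `model_io_enter()` attempted in between returns false.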
>
> Best Regards
> Xiao
>
> > +
> > +       del_timer_sync(&mddev->safemode_timer);
> > +       /* restrict memory reclaim I/O during raid array is suspend */
> > +       mddev->noio_flag = memalloc_noio_save();
> > +
> > +       mutex_unlock(&mddev->suspend_mutex);
> > +}
> > +EXPORT_SYMBOL_GPL(__mddev_suspend);
> > +
> > +void __mddev_resume(struct mddev *mddev)
> > +{
> > +       lockdep_assert_not_held(&mddev->reconfig_mutex);
> > +
> > +       mutex_lock(&mddev->suspend_mutex);
> > +       WRITE_ONCE(mddev->suspended, mddev->suspended - 1);
> > +       if (mddev->suspended) {
> > +               mutex_unlock(&mddev->suspend_mutex);
> > +               return;
> > +       }
> > +
> > +       /* entred the memalloc scope from __mddev_suspend() */
> > +       memalloc_noio_restore(mddev->noio_flag);
> > +
> > +       percpu_ref_resurrect(&mddev->active_io);
> > +       wake_up(&mddev->sb_wait);
> > +
> > +       set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> > +       md_wakeup_thread(mddev->thread);
> > +       md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
> > +
> > +       mutex_unlock(&mddev->suspend_mutex);
> > +}
> > +EXPORT_SYMBOL_GPL(__mddev_resume);
> > +
> >  /*
> >   * Generic flush handling for md
> >   */
> > @@ -667,6 +747,7 @@ int mddev_init(struct mddev *mddev)
> >         mutex_init(&mddev->open_mutex);
> >         mutex_init(&mddev->reconfig_mutex);
> >         mutex_init(&mddev->sync_mutex);
> > +       mutex_init(&mddev->suspend_mutex);
> >         mutex_init(&mddev->bitmap_info.mutex);
> >         INIT_LIST_HEAD(&mddev->disks);
> >         INIT_LIST_HEAD(&mddev->all_mddevs);
> > diff --git a/drivers/md/md.h b/drivers/md/md.h
> > index fb3b123f16dd..1103e6b08ad9 100644
> > --- a/drivers/md/md.h
> > +++ b/drivers/md/md.h
> > @@ -316,6 +316,7 @@ struct mddev {
> >         unsigned long                   sb_flags;
> >
> >         int                             suspended;
> > +       struct mutex                    suspend_mutex;
> >         struct percpu_ref               active_io;
> >         int                             ro;
> >         int                             sysfs_active; /* set when sysfs deletes
> > @@ -811,6 +812,8 @@ extern void md_rdev_clear(struct md_rdev *rdev);
> >  extern void md_handle_request(struct mddev *mddev, struct bio *bio);
> >  extern void mddev_suspend(struct mddev *mddev);
> >  extern void mddev_resume(struct mddev *mddev);
> > +extern void __mddev_suspend(struct mddev *mddev);
> > +extern void __mddev_resume(struct mddev *mddev);
> >
> >  extern void md_reload_sb(struct mddev *mddev, int raid_disk);
> >  extern void md_update_sb(struct mddev *mddev, int force);
> > --
> > 2.39.2
> >

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH -next v2 00/28] md: synchronize io with array reconfiguration
  2023-08-28  1:59 ` Yu Kuai
@ 2023-09-25 15:45   ` Song Liu
  -1 siblings, 0 replies; 74+ messages in thread
From: Song Liu @ 2023-09-25 15:45 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, dm-devel, xni, linux-kernel, linux-raid, yukuai3,
	yi.zhang, yangerkun

Hi Kuai,

Thanks for the patchset!

I got the following panic with mdadm test 23rdev-lifetime.
Could you please look into it?

I pushed the test code to this branch:

https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/log/?h=md-test-28

Thanks,
Song


[  173.143010] ==================================================================
[  173.144256] BUG: KASAN: null-ptr-deref in __mutex_lock+0xc0/0x920
[  173.145232] Read of size 8 at addr 00000000000000a8 by task test/1215
[  173.146138]
[  173.146375] CPU: 26 PID: 1215 Comm: test Not tainted 6.6.0-rc2+ #8
[  173.147254] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[  173.148840] Call Trace:
[  173.149202]  <TASK>
[  173.149531]  dump_stack_lvl+0xb5/0x100
[  173.150093]  ? __pfx_dump_stack_lvl+0x10/0x10
[  173.150724]  ? _printk+0xac/0xf0
[  173.151251]  ? lock_acquired+0xff/0x680
[  173.151852]  print_report+0xe6/0x510
[  173.152372]  ? __might_resched+0x1a1/0x3d0
[  173.152997]  ? __mutex_lock+0xc0/0x920
[  173.153566]  kasan_report+0x119/0x150
[  173.154114]  ? lock_acquire+0x18a/0x390
[  173.154667]  ? __mutex_lock+0xc0/0x920
[  173.155225]  ? mddev_suspend+0xbc/0x260
[  173.155799]  __mutex_lock+0xc0/0x920
[  173.156332]  ? lock_acquire+0x18a/0x390
[  173.156928]  ? kernfs_find_and_get_ns+0x4c/0xb0
[  173.157578]  ? __pfx___mutex_lock+0x10/0x10
[  173.158177]  ? down_read+0x6b2/0x800
[  173.158696]  ? lock_is_held_type+0xdb/0x150
[  173.159300]  mddev_suspend+0xbc/0x260
[  173.159832]  ? __pfx_lock_release+0x10/0x10
[  173.160427]  ? lock_is_held_type+0xdb/0x150
[  173.161074]  ? __pfx_mddev_suspend+0x10/0x10
[  173.161698]  rdev_attr_store+0x5ba/0x600
[  173.162282]  ? __pfx_sysfs_kf_write+0x10/0x10
[  173.162915]  kernfs_fop_write_iter+0x1d1/0x280
[  173.163595]  vfs_write+0x45d/0x5d0
[  173.164113]  ? __pfx_vfs_write+0x10/0x10
[  173.164709]  ? __pfx_lock_release+0x10/0x10
[  173.165352]  ksys_write+0xed/0x1a0
[  173.165912]  ? __pfx_ksys_write+0x10/0x10
[  173.166501]  ? __audit_syscall_entry+0x1cf/0x200
[  173.167191]  ? syscall_enter_from_user_mode+0x181/0x220
[  173.168034]  do_syscall_64+0x43/0x90
[  173.168588]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  173.169355] RIP: 0033:0x7f4e65ced648
[  173.169830] md: could not open device unknown-block(7,0).
[  173.169914] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[  173.173324] RSP: 002b:00007ffe9a2ac128 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  173.174398] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f4e65ced648
[  173.175405] RDX: 0000000000000007 RSI: 0000561ae26e29d0 RDI: 0000000000000001
[  173.176416] RBP: 0000561ae26e29d0 R08: 000000000000000a R09: 00007f4e65d80620
[  173.177417] R10: 000000000000000a R11: 0000000000000246 R12: 00007f4e65fc06e0
[  173.178418] R13: 0000000000000007 R14: 00007f4e65fbb880 R15: 0000000000000007
[  173.179441]  </TASK>
[  173.179775] ==================================================================
[  173.180838] Disabling lock debugging due to kernel taint
[  173.181662] BUG: kernel NULL pointer dereference, address: 00000000000000a8
[  173.182654] #PF: supervisor read access in kernel mode
[  173.183408] #PF: error_code(0x0000) - not-present page
[  173.184152] PGD 0 P4D 0
[  173.184531] Oops: 0000 [#1] PREEMPT SMP KASAN PTI
[  173.185224] CPU: 26 PID: 1215 Comm: test Tainted: G    B  6.6.0-rc2+ #8
[  173.186320] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[  173.187912] RIP: 0010:__mutex_lock+0xc0/0x920
[  173.188557] Code: 00 e8 24 f3 77 fe 2e 2e 2e 31 c0 48 c7 c7 80 c7 c5 89 e8 03 01 bf fe 83 3d ec e0 27 07 00 75 15 49 8d 7c 24 68 e8 30 02 bf fe <4d> 39 64 24 68 0f 85 00 08 00 00 bf 01 00 00 00 e8 5b e7 76 fe 4d
[  173.191203] RSP: 0018:ffff8881b18c7a20 EFLAGS: 00010286
[  173.191958] RAX: ffff8881b0ae4001 RBX: 0000000000000000 RCX: ffffffff810e0df1
[  173.192968] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8900ea40
[  173.193976] RBP: ffff8881b18c7b50 R08: ffffffff8900ea47 R09: 1ffffffff1201d48
[  173.194986] R10: dffffc0000000000 R11: fffffbfff1201d49 R12: 0000000000000040
[  173.196263] R13: ffffffff823e61cc R14: 0000000000000000 R15: 0000000000000000
[  173.197274] FS:  00007f4e66b6e740(0000) GS:ffff888dfd200000(0000) knlGS:0000000000000000
[  173.198466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  173.199316] CR2: 00000000000000a8 CR3: 00000001b191e005 CR4: 0000000000370ee0
[  173.200327] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  173.201382] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  173.202430] Call Trace:
[  173.202810]  <TASK>
[  173.203173]  ? __die_body+0x63/0xb0
[  173.203678]  ? page_fault_oops+0x2f3/0x440
[  173.204338]  ? __pfx_page_fault_oops+0x10/0x10
[  173.204981]  ? vprintk_emit+0x455/0x520
[  173.205593]  ? __pfx_vprintk_emit+0x10/0x10
[  173.206276]  ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[  173.207068]  ? do_user_addr_fault+0x796/0x840
[  173.207694]  ? _printk+0xac/0xf0
[  173.208188]  ? __pfx_do_user_addr_fault+0x10/0x10
[  173.208879]  ? rcu_is_watching+0x30/0x60
[  173.209475]  ? exc_page_fault+0x7d/0x290
[  173.210043]  ? asm_exc_page_fault+0x22/0x30
[  173.210639]  ? mddev_suspend+0xbc/0x260
[  173.211294]  ? add_taint+0x41/0x90
[  173.211798]  ? __mutex_lock+0xc0/0x920
[  173.212352]  ? lock_acquire+0x18a/0x390
[  173.212914]  ? kernfs_find_and_get_ns+0x4c/0xb0
[  173.213623]  ? __pfx___mutex_lock+0x10/0x10
[  173.214243]  ? down_read+0x6b2/0x800
[  173.214773]  ? lock_is_held_type+0xdb/0x150
[  173.215374]  mddev_suspend+0xbc/0x260
[  173.215941]  ? __pfx_lock_release+0x10/0x10
[  173.216541]  ? lock_is_held_type+0xdb/0x150
[  173.217148]  ? __pfx_mddev_suspend+0x10/0x10
[  173.217776]  rdev_attr_store+0x5ba/0x600
[  173.218343]  ? __pfx_sysfs_kf_write+0x10/0x10
[  173.218977]  kernfs_fop_write_iter+0x1d1/0x280
[  173.219618]  vfs_write+0x45d/0x5d0
[  173.220126]  ? __pfx_vfs_write+0x10/0x10
[  173.220689]  ? __pfx_lock_release+0x10/0x10
[  173.221342]  ksys_write+0xed/0x1a0
[  173.221850]  ? __pfx_ksys_write+0x10/0x10
[  173.222421]  ? __audit_syscall_entry+0x1cf/0x200
[  173.223090]  ? syscall_enter_from_user_mode+0x181/0x220
[  173.223845]  do_syscall_64+0x43/0x90
[  173.224362]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  173.225083] RIP: 0033:0x7f4e65ced648
[  173.225599] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[  173.228199] RSP: 002b:00007ffe9a2ac128 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  173.229267] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f4e65ced648
[  173.230273] RDX: 0000000000000007 RSI: 0000561ae26e29d0 RDI: 0000000000000001
[  173.231274] RBP: 0000561ae26e29d0 R08: 000000000000000a R09: 00007f4e65d80620
[  173.232323] R10: 000000000000000a R11: 0000000000000246 R12: 00007f4e65fc06e0
[  173.233323] R13: 0000000000000007 R14: 00007f4e65fbb880 R15: 0000000000000007
[  173.234333]  </TASK>
[  173.234657] Modules linked in:
[  173.235118] CR2: 00000000000000a8
[  173.235601] ---[ end trace 0000000000000000 ]---
[  173.236270] RIP: 0010:__mutex_lock+0xc0/0x920
[  173.236906] Code: 00 e8 24 f3 77 fe 2e 2e 2e 31 c0 48 c7 c7 80 c7 c5 89 e8 03 01 bf fe 83 3d ec e0 27 07 00 75 15 49 8d 7c 24 68 e8 30 02 bf fe <4d> 39 64 24 68 0f 85 00 08 00 00 bf 01 00 00 00 e8 5b e7 76 fe 4d
[  173.239538] RSP: 0018:ffff8881b18c7a20 EFLAGS: 00010286
[  173.240286] RAX: ffff8881b0ae4001 RBX: 0000000000000000 RCX: ffffffff810e0df1
[  173.241293] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8900ea40
[  173.242342] RBP: ffff8881b18c7b50 R08: ffffffff8900ea47 R09: 1ffffffff1201d48
[  173.243343] R10: dffffc0000000000 R11: fffffbfff1201d49 R12: 0000000000000040
[  173.244346] R13: ffffffff823e61cc R14: 0000000000000000 R15: 0000000000000000
[  173.245384] FS:  00007f4e66b6e740(0000) GS:ffff888dfd200000(0000) knlGS:0000000000000000
[  173.246548] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  173.247362] CR2: 00000000000000a8 CR3: 00000001b191e005 CR4: 0000000000370ee0
[  173.248371] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  173.249390] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  173.250395] Kernel panic - not syncing: Fatal exception
[  173.251612] Kernel Offset: disabled
[  173.252133] ---[ end Kernel panic - not syncing: Fatal exception ]---


On Sun, Aug 27, 2023 at 7:04 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Changes in v2:
>  - rebase with latest md-next
>  - remove some follow up cleanup patches, these patches will be sent
>  later after this patchset.
>
> After the previous four patchsets of preparatory work, this patchset
> implements a new version of mddev_suspend(). With the new APIs:
>  - reconfig_mutex is not required;
>  - the odd logic where suspending the array while holding
>    'reconfig_mutex' must still let mddev_check_recovery() update the
>    superblock is no longer needed;
>  - the special handling, 'pers->prepare_suspend', for raid456 is not
>    needed;
>  - they are safe to call at any time once mddev is allocated, and they are
>    designed to be used from the slow path where array configuration is changed;
>
> And use the new api to replace:
>
> mddev_lock
> mddev_suspend or not
> // array reconfiguration
> mddev_resume or not
> mddev_unlock
>
> With:
>
> mddev_suspend
> mddev_lock
> // array reconfiguration
> mddev_unlock
> mddev_resume
>
> However, the above change is not possible for raid5 and raid-cluster in
> some corner cases. In those cases mddev_suspend/resume() is replaced with
> the quiesce() callback, which suspends the array as well.
>
> This patchset is tested in my VM with the mdadm test suite on loop devices,
> except for the 10ddf tests (they already fail without this patchset).
>
> Many cleanups will follow this patchset.
>
> Yu Kuai (28):
>   md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
>   md: use 'mddev->suspended' for is_md_suspended()
>   md: add new helpers to suspend/resume array
>   md: add new helpers to suspend/resume and lock/unlock array
>   md: use new apis to suspend array for suspend_lo/hi_store()
>   md: use new apis to suspend array for level_store()
>   md: use new apis to suspend array for serialize_policy_store()
>   md/dm-raid: use new apis to suspend array
>   md/md-bitmap: use new apis to suspend array for location_store()
>   md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
>   md/raid5-cache: use new apis to suspend array for
>     r5c_disable_writeback_async()
>   md/raid5-cache: use new apis to suspend array for
>     r5c_journal_mode_store()
>   md/raid5: use new apis to suspend array for raid5_store_stripe_size()
>   md/raid5: use new apis to suspend array for raid5_store_skip_copy()
>   md/raid5: use new apis to suspend array for
>     raid5_store_group_thread_cnt()
>   md/raid5: use new apis to suspend array for
>     raid5_change_consistency_policy()
>   md/raid5: replace suspend with quiesce() callback
>   md: quiesce before md_kick_rdev_from_array() for md-cluster
>   md: use new apis to suspend array for ioctls involed array
>     reconfiguration
>   md: use new apis to suspend array for adding/removing rdev from
>     state_store()
>   md: use new apis to suspend array for bind_rdev_to_array()
>   md: use new apis to suspend array related to serial pool in
>     state_store()
>   md: use new apis to suspend array in backlog_store()
>   md: suspend array in md_start_sync() if array need reconfiguration
>   md: cleanup mddev_create/destroy_serial_pool()
>   md/md-linear: cleanup linear_add()
>   md: remove old apis to suspend the array
>   md: rename __mddev_suspend/resume() back to mddev_suspend/resume()
>
>  drivers/md/dm-raid.c       |   8 +-
>  drivers/md/md-autodetect.c |   4 +-
>  drivers/md/md-bitmap.c     |  18 ++-
>  drivers/md/md-linear.c     |   2 -
>  drivers/md/md.c            | 250 ++++++++++++++++++++++---------------
>  drivers/md/md.h            |  52 ++++++--
>  drivers/md/raid5-cache.c   |  61 +++++----
>  drivers/md/raid5.c         |  56 ++++-----
>  8 files changed, 253 insertions(+), 198 deletions(-)
>
> --
> 2.39.2
>

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [dm-devel] [PATCH -next v2 00/28] md: synchronize io with array reconfiguration
@ 2023-09-25 15:45   ` Song Liu
  0 siblings, 0 replies; 74+ messages in thread
From: Song Liu @ 2023-09-25 15:45 UTC (permalink / raw)
  To: Yu Kuai
  Cc: yi.zhang, yangerkun, snitzer, xni, linux-kernel, linux-raid,
	dm-devel, yukuai3, agk

Hi Kuai,

Thanks for the patchset!

I have got the following panic with mdadm test 23rdev-lifetime.
Could you please look into it?

I pushed the test code to this branch:

https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/log/?h=md-test-28

Thanks,
Song


[  173.143010] ==================================================================
[  173.144256] BUG: KASAN: null-ptr-deref in __mutex_lock+0xc0/0x920
[  173.145232] Read of size 8 at addr 00000000000000a8 by task test/1215
[  173.146138]
[  173.146375] CPU: 26 PID: 1215 Comm: test Not tainted 6.6.0-rc2+ #8
[  173.147254] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[  173.148840] Call Trace:
[  173.149202]  <TASK>
[  173.149531]  dump_stack_lvl+0xb5/0x100
[  173.150093]  ? __pfx_dump_stack_lvl+0x10/0x10
[  173.150724]  ? _printk+0xac/0xf0
[  173.151251]  ? lock_acquired+0xff/0x680
[  173.151852]  print_report+0xe6/0x510
[  173.152372]  ? __might_resched+0x1a1/0x3d0
[  173.152997]  ? __mutex_lock+0xc0/0x920
[  173.153566]  kasan_report+0x119/0x150
[  173.154114]  ? lock_acquire+0x18a/0x390
[  173.154667]  ? __mutex_lock+0xc0/0x920
[  173.155225]  ? mddev_suspend+0xbc/0x260
[  173.155799]  __mutex_lock+0xc0/0x920
[  173.156332]  ? lock_acquire+0x18a/0x390
[  173.156928]  ? kernfs_find_and_get_ns+0x4c/0xb0
[  173.157578]  ? __pfx___mutex_lock+0x10/0x10
[  173.158177]  ? down_read+0x6b2/0x800
[  173.158696]  ? lock_is_held_type+0xdb/0x150
[  173.159300]  mddev_suspend+0xbc/0x260
[  173.159832]  ? __pfx_lock_release+0x10/0x10
[  173.160427]  ? lock_is_held_type+0xdb/0x150
[  173.161074]  ? __pfx_mddev_suspend+0x10/0x10
[  173.161698]  rdev_attr_store+0x5ba/0x600
[  173.162282]  ? __pfx_sysfs_kf_write+0x10/0x10
[  173.162915]  kernfs_fop_write_iter+0x1d1/0x280
[  173.163595]  vfs_write+0x45d/0x5d0
[  173.164113]  ? __pfx_vfs_write+0x10/0x10
[  173.164709]  ? __pfx_lock_release+0x10/0x10
[  173.165352]  ksys_write+0xed/0x1a0
[  173.165912]  ? __pfx_ksys_write+0x10/0x10
[  173.166501]  ? __audit_syscall_entry+0x1cf/0x200
[  173.167191]  ? syscall_enter_from_user_mode+0x181/0x220
[  173.168034]  do_syscall_64+0x43/0x90
[  173.168588]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  173.169355] RIP: 0033:0x7f4e65ced648
[  173.169830] md: could not open device unknown-block(7,0).
[  173.169914] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00
00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00
00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89
d4 55
[  173.173324] RSP: 002b:00007ffe9a2ac128 EFLAGS: 00000246 ORIG_RAX:
0000000000000001
[  173.174398] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f4e65ced648
[  173.175405] RDX: 0000000000000007 RSI: 0000561ae26e29d0 RDI: 0000000000000001
[  173.176416] RBP: 0000561ae26e29d0 R08: 000000000000000a R09: 00007f4e65d80620
[  173.177417] R10: 000000000000000a R11: 0000000000000246 R12: 00007f4e65fc06e0
[  173.178418] R13: 0000000000000007 R14: 00007f4e65fbb880 R15: 0000000000000007
[  173.179441]  </TASK>
[  173.179775] ==================================================================
[  173.180838] Disabling lock debugging due to kernel taint
[  173.181662] BUG: kernel NULL pointer dereference, address: 00000000000000a8
[  173.182654] #PF: supervisor read access in kernel mode
[  173.183408] #PF: error_code(0x0000) - not-present page
[  173.184152] PGD 0 P4D 0
[  173.184531] Oops: 0000 [#1] PREEMPT SMP KASAN PTI
[  173.185224] CPU: 26 PID: 1215 Comm: test Tainted: G    B
  6.6.0-rc2+ #8
[  173.186320] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[  173.187912] RIP: 0010:__mutex_lock+0xc0/0x920
[  173.188557] Code: 00 e8 24 f3 77 fe 2e 2e 2e 31 c0 48 c7 c7 80 c7
c5 89 e8 03 01 bf fe 83 3d ec e0 27 07 00 75 15 49 8d 7c 24 68 e8 30
02 bf fe <4d> 39 64 24 68 0f 85 00 08 00 00 bf 01 00 00 00 e8 5b e7 76
fe 4d
[  173.191203] RSP: 0018:ffff8881b18c7a20 EFLAGS: 00010286
[  173.191958] RAX: ffff8881b0ae4001 RBX: 0000000000000000 RCX: ffffffff810e0df1
[  173.192968] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8900ea40
[  173.193976] RBP: ffff8881b18c7b50 R08: ffffffff8900ea47 R09: 1ffffffff1201d48
[  173.194986] R10: dffffc0000000000 R11: fffffbfff1201d49 R12: 0000000000000040
[  173.196263] R13: ffffffff823e61cc R14: 0000000000000000 R15: 0000000000000000
[  173.197274] FS:  00007f4e66b6e740(0000) GS:ffff888dfd200000(0000)
knlGS:0000000000000000
[  173.198466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  173.199316] CR2: 00000000000000a8 CR3: 00000001b191e005 CR4: 0000000000370ee0
[  173.200327] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  173.201382] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  173.202430] Call Trace:
[  173.202810]  <TASK>
[  173.203173]  ? __die_body+0x63/0xb0
[  173.203678]  ? page_fault_oops+0x2f3/0x440
[  173.204338]  ? __pfx_page_fault_oops+0x10/0x10
[  173.204981]  ? vprintk_emit+0x455/0x520
[  173.205593]  ? __pfx_vprintk_emit+0x10/0x10
[  173.206276]  ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
[  173.207068]  ? do_user_addr_fault+0x796/0x840
[  173.207694]  ? _printk+0xac/0xf0
[  173.208188]  ? __pfx_do_user_addr_fault+0x10/0x10
[  173.208879]  ? rcu_is_watching+0x30/0x60
[  173.209475]  ? exc_page_fault+0x7d/0x290
[  173.210043]  ? asm_exc_page_fault+0x22/0x30
[  173.210639]  ? mddev_suspend+0xbc/0x260
[  173.211294]  ? add_taint+0x41/0x90
[  173.211798]  ? __mutex_lock+0xc0/0x920
[  173.212352]  ? lock_acquire+0x18a/0x390
[  173.212914]  ? kernfs_find_and_get_ns+0x4c/0xb0
[  173.213623]  ? __pfx___mutex_lock+0x10/0x10
[  173.214243]  ? down_read+0x6b2/0x800
[  173.214773]  ? lock_is_held_type+0xdb/0x150
[  173.215374]  mddev_suspend+0xbc/0x260
[  173.215941]  ? __pfx_lock_release+0x10/0x10
[  173.216541]  ? lock_is_held_type+0xdb/0x150
[  173.217148]  ? __pfx_mddev_suspend+0x10/0x10
[  173.217776]  rdev_attr_store+0x5ba/0x600
[  173.218343]  ? __pfx_sysfs_kf_write+0x10/0x10
[  173.218977]  kernfs_fop_write_iter+0x1d1/0x280
[  173.219618]  vfs_write+0x45d/0x5d0
[  173.220126]  ? __pfx_vfs_write+0x10/0x10
[  173.220689]  ? __pfx_lock_release+0x10/0x10
[  173.221342]  ksys_write+0xed/0x1a0
[  173.221850]  ? __pfx_ksys_write+0x10/0x10
[  173.222421]  ? __audit_syscall_entry+0x1cf/0x200
[  173.223090]  ? syscall_enter_from_user_mode+0x181/0x220
[  173.223845]  do_syscall_64+0x43/0x90
[  173.224362]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  173.225083] RIP: 0033:0x7f4e65ced648
[  173.225599] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00
00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00
00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89
d4 55
[  173.228199] RSP: 002b:00007ffe9a2ac128 EFLAGS: 00000246 ORIG_RAX:
0000000000000001
[  173.229267] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f4e65ced648
[  173.230273] RDX: 0000000000000007 RSI: 0000561ae26e29d0 RDI: 0000000000000001
[  173.231274] RBP: 0000561ae26e29d0 R08: 000000000000000a R09: 00007f4e65d80620
[  173.232323] R10: 000000000000000a R11: 0000000000000246 R12: 00007f4e65fc06e0
[  173.233323] R13: 0000000000000007 R14: 00007f4e65fbb880 R15: 0000000000000007
[  173.234333]  </TASK>
[  173.234657] Modules linked in:
[  173.235118] CR2: 00000000000000a8
[  173.235601] ---[ end trace 0000000000000000 ]---
[  173.236270] RIP: 0010:__mutex_lock+0xc0/0x920
[  173.236906] Code: 00 e8 24 f3 77 fe 2e 2e 2e 31 c0 48 c7 c7 80 c7
c5 89 e8 03 01 bf fe 83 3d ec e0 27 07 00 75 15 49 8d 7c 24 68 e8 30
02 bf fe <4d> 39 64 24 68 0f 85 00 08 00 00 bf 01 00 00 00 e8 5b e7 76
fe 4d
[  173.239538] RSP: 0018:ffff8881b18c7a20 EFLAGS: 00010286
[  173.240286] RAX: ffff8881b0ae4001 RBX: 0000000000000000 RCX: ffffffff810e0df1
[  173.241293] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8900ea40
[  173.242342] RBP: ffff8881b18c7b50 R08: ffffffff8900ea47 R09: 1ffffffff1201d48
[  173.243343] R10: dffffc0000000000 R11: fffffbfff1201d49 R12: 0000000000000040
[  173.244346] R13: ffffffff823e61cc R14: 0000000000000000 R15: 0000000000000000
[  173.245384] FS:  00007f4e66b6e740(0000) GS:ffff888dfd200000(0000)
knlGS:0000000000000000
[  173.246548] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  173.247362] CR2: 00000000000000a8 CR3: 00000001b191e005 CR4: 0000000000370ee0
[  173.248371] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  173.249390] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  173.250395] Kernel panic - not syncing: Fatal exception
[  173.251612] Kernel Offset: disabled
[  173.252133] ---[ end Kernel panic - not syncing: Fatal exception ]---


On Sun, Aug 27, 2023 at 7:04 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Changes in v2:
>  - rebase with latest md-next
>  - remove some follow up cleanup patches, these patches will be sent
>  later after this patchset.
>
> After previous four patchset of preparatory work, this patchset impelement
> a new version of mddev_suspend(), the new apis:
>  - reconfig_mutex is not required;
>  - the weird logical that suspend array hold 'reconfig_mutex' for
>    mddev_check_recovery() to update superblock is not needed;
>  - the special handling, 'pers->prepare_suspend', for raid456 is not
>    needed;
>  - It's safe to be called at any time once mddev is allocated, and it's
>    designed to be used from slow path where array configuration is changed;
>
> And use the new api to replace:
>
> mddev_lock
> mddev_suspend or not
> // array reconfiguration
> mddev_resume or not
> mddev_unlock
>
> With:
>
> mddev_suspend
> mddev_lock
> // array reconfiguration
> mddev_unlock
> mddev_resume
>
> However, the above change is not possible for raid5 and raid-cluster in
> some corner cases, and mddev_suspend/resume() is replaced with quiesce()
> callback, which will suspend the array as well.
>
> This patchset is tested in my VM with mdadm testsuite with loop device
> except for 10ddf tests(they always fail before this patchset).
>
> A lot of cleanups will be started after this patchset.
>
> Yu Kuai (28):
>   md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
>   md: use 'mddev->suspended' for is_md_suspended()
>   md: add new helpers to suspend/resume array
>   md: add new helpers to suspend/resume and lock/unlock array
>   md: use new apis to suspend array for suspend_lo/hi_store()
>   md: use new apis to suspend array for level_store()
>   md: use new apis to suspend array for serialize_policy_store()
>   md/dm-raid: use new apis to suspend array
>   md/md-bitmap: use new apis to suspend array for location_store()
>   md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
>   md/raid5-cache: use new apis to suspend array for
>     r5c_disable_writeback_async()
>   md/raid5-cache: use new apis to suspend array for
>     r5c_journal_mode_store()
>   md/raid5: use new apis to suspend array for raid5_store_stripe_size()
>   md/raid5: use new apis to suspend array for raid5_store_skip_copy()
>   md/raid5: use new apis to suspend array for
>     raid5_store_group_thread_cnt()
>   md/raid5: use new apis to suspend array for
>     raid5_change_consistency_policy()
>   md/raid5: replace suspend with quiesce() callback
>   md: quiesce before md_kick_rdev_from_array() for md-cluster
>   md: use new apis to suspend array for ioctls involed array
>     reconfiguration
>   md: use new apis to suspend array for adding/removing rdev from
>     state_store()
>   md: use new apis to suspend array for bind_rdev_to_array()
>   md: use new apis to suspend array related to serial pool in
>     state_store()
>   md: use new apis to suspend array in backlog_store()
>   md: suspend array in md_start_sync() if array need reconfiguration
>   md: cleanup mddev_create/destroy_serial_pool()
>   md/md-linear: cleanup linear_add()
>   md: remove old apis to suspend the array
>   md: rename __mddev_suspend/resume() back to mddev_suspend/resume()
>
>  drivers/md/dm-raid.c       |   8 +-
>  drivers/md/md-autodetect.c |   4 +-
>  drivers/md/md-bitmap.c     |  18 ++-
>  drivers/md/md-linear.c     |   2 -
>  drivers/md/md.c            | 250 ++++++++++++++++++++++---------------
>  drivers/md/md.h            |  52 ++++++--
>  drivers/md/raid5-cache.c   |  61 +++++----
>  drivers/md/raid5.c         |  56 ++++-----
>  8 files changed, 253 insertions(+), 198 deletions(-)
>
> --
> 2.39.2
>
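The reordering described in the cover letter (suspend first, then take the
lock, then release in the reverse order) can be sketched as below. This is a
user-space toy model, not md code: struct fake_array and every fake_*()
function are hypothetical names invented for illustration, and the "lock" is
a plain flag standing in for reconfig_mutex.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an array; not the real struct mddev. */
struct fake_array {
	int suspended;	/* suspend nesting count */
	bool locked;	/* stands in for reconfig_mutex */
	int level;	/* some piece of array configuration */
};

static void fake_suspend(struct fake_array *a)
{
	a->suspended++;
}

static void fake_resume(struct fake_array *a)
{
	assert(a->suspended > 0);
	a->suspended--;
}

static void fake_lock(struct fake_array *a)
{
	assert(!a->locked);
	a->locked = true;
}

static void fake_unlock(struct fake_array *a)
{
	assert(a->locked);
	a->locked = false;
}

/* New ordering: suspend, lock, reconfigure, unlock, resume. */
static void fake_set_level(struct fake_array *a, int level)
{
	fake_suspend(a);
	fake_lock(a);

	/* Reconfiguration runs with I/O suspended and the lock held. */
	assert(a->suspended > 0 && a->locked);
	a->level = level;

	fake_unlock(a);
	fake_resume(a);
}
```

In this model the suspend step never needs the lock, which mirrors the stated
goal that reconfig_mutex is not required to suspend the array.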

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


* Re: [PATCH -next v2 00/28] md: synchronize io with array reconfiguration
  2023-09-25 15:45   ` [dm-devel] " Song Liu
@ 2023-09-26  0:55     ` Yu Kuai
  -1 siblings, 0 replies; 74+ messages in thread
From: Yu Kuai @ 2023-09-26  0:55 UTC (permalink / raw)
  To: Song Liu, Yu Kuai
  Cc: agk, snitzer, dm-devel, xni, linux-kernel, linux-raid, yi.zhang,
	yangerkun, yukuai (C)

Hi,

On 2023/09/25 23:45, Song Liu wrote:
> Hi Kuai,
> 
> Thanks for the patchset!
> 
> I have got the following panic with mdadm test 23rdev-lifetime.
> Could you please look into it?
> 
> I pushed the test code to this branch:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/log/?h=md-test-28

Thanks for the test. I know where the problem is now: mddev is
dereferenced before the NULL check.

I'll fix this in the next version.
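
To illustrate the class of bug described above (a pointer used before its
NULL check), here is a minimal user-space sketch in C. The types and
functions (fake_mddev, fake_rdev, fake_attr_store() and so on) are
hypothetical stand-ins rather than the md driver's code, and this is not the
actual fix; it only shows the check-before-use ordering.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real struct mddev / struct md_rdev. */
struct fake_mddev {
	int suspended;	/* suspend nesting count */
};

struct fake_rdev {
	struct fake_mddev *mddev;	/* NULL once the rdev has left its array */
};

/* Dereferences mddev, so callers must not pass NULL. */
static void fake_suspend(struct fake_mddev *mddev)
{
	mddev->suspended++;
}

static void fake_resume(struct fake_mddev *mddev)
{
	assert(mddev->suspended > 0);
	mddev->suspended--;
}

/*
 * Check the pointer first, and only then call anything that
 * dereferences it. Calling fake_suspend() before the check would
 * fault on a detached rdev.
 */
static int fake_attr_store(struct fake_rdev *rdev)
{
	struct fake_mddev *mddev = rdev->mddev;

	if (!mddev)
		return -ENODEV;

	fake_suspend(mddev);
	/* ... the attribute update would go here ... */
	fake_resume(mddev);
	return 0;
}
```

With this ordering, a store on an rdev whose mddev pointer is NULL returns an
error instead of reaching the suspend path.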

Thanks,
Kuai

> 
> Thanks,
> Song
> 
> 
> [  173.143010] ==================================================================
> [  173.144256] BUG: KASAN: null-ptr-deref in __mutex_lock+0xc0/0x920
> [  173.145232] Read of size 8 at addr 00000000000000a8 by task test/1215
> [  173.146138]
> [  173.146375] CPU: 26 PID: 1215 Comm: test Not tainted 6.6.0-rc2+ #8
> [  173.147254] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
> [  173.148840] Call Trace:
> [  173.149202]  <TASK>
> [  173.149531]  dump_stack_lvl+0xb5/0x100
> [  173.150093]  ? __pfx_dump_stack_lvl+0x10/0x10
> [  173.150724]  ? _printk+0xac/0xf0
> [  173.151251]  ? lock_acquired+0xff/0x680
> [  173.151852]  print_report+0xe6/0x510
> [  173.152372]  ? __might_resched+0x1a1/0x3d0
> [  173.152997]  ? __mutex_lock+0xc0/0x920
> [  173.153566]  kasan_report+0x119/0x150
> [  173.154114]  ? lock_acquire+0x18a/0x390
> [  173.154667]  ? __mutex_lock+0xc0/0x920
> [  173.155225]  ? mddev_suspend+0xbc/0x260
> [  173.155799]  __mutex_lock+0xc0/0x920
> [  173.156332]  ? lock_acquire+0x18a/0x390
> [  173.156928]  ? kernfs_find_and_get_ns+0x4c/0xb0
> [  173.157578]  ? __pfx___mutex_lock+0x10/0x10
> [  173.158177]  ? down_read+0x6b2/0x800
> [  173.158696]  ? lock_is_held_type+0xdb/0x150
> [  173.159300]  mddev_suspend+0xbc/0x260
> [  173.159832]  ? __pfx_lock_release+0x10/0x10
> [  173.160427]  ? lock_is_held_type+0xdb/0x150
> [  173.161074]  ? __pfx_mddev_suspend+0x10/0x10
> [  173.161698]  rdev_attr_store+0x5ba/0x600
> [  173.162282]  ? __pfx_sysfs_kf_write+0x10/0x10
> [  173.162915]  kernfs_fop_write_iter+0x1d1/0x280
> [  173.163595]  vfs_write+0x45d/0x5d0
> [  173.164113]  ? __pfx_vfs_write+0x10/0x10
> [  173.164709]  ? __pfx_lock_release+0x10/0x10
> [  173.165352]  ksys_write+0xed/0x1a0
> [  173.165912]  ? __pfx_ksys_write+0x10/0x10
> [  173.166501]  ? __audit_syscall_entry+0x1cf/0x200
> [  173.167191]  ? syscall_enter_from_user_mode+0x181/0x220
> [  173.168034]  do_syscall_64+0x43/0x90
> [  173.168588]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [  173.169355] RIP: 0033:0x7f4e65ced648
> [  173.169830] md: could not open device unknown-block(7,0).
> [  173.169914] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00
> 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00
> 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89
> d4 55
> [  173.173324] RSP: 002b:00007ffe9a2ac128 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000001
> [  173.174398] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f4e65ced648
> [  173.175405] RDX: 0000000000000007 RSI: 0000561ae26e29d0 RDI: 0000000000000001
> [  173.176416] RBP: 0000561ae26e29d0 R08: 000000000000000a R09: 00007f4e65d80620
> [  173.177417] R10: 000000000000000a R11: 0000000000000246 R12: 00007f4e65fc06e0
> [  173.178418] R13: 0000000000000007 R14: 00007f4e65fbb880 R15: 0000000000000007
> [  173.179441]  </TASK>
> [  173.179775] ==================================================================
> [  173.180838] Disabling lock debugging due to kernel taint
> [  173.181662] BUG: kernel NULL pointer dereference, address: 00000000000000a8
> [  173.182654] #PF: supervisor read access in kernel mode
> [  173.183408] #PF: error_code(0x0000) - not-present page
> [  173.184152] PGD 0 P4D 0
> [  173.184531] Oops: 0000 [#1] PREEMPT SMP KASAN PTI
> [  173.185224] CPU: 26 PID: 1215 Comm: test Tainted: G    B
>    6.6.0-rc2+ #8
> [  173.186320] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
> [  173.187912] RIP: 0010:__mutex_lock+0xc0/0x920
> [  173.188557] Code: 00 e8 24 f3 77 fe 2e 2e 2e 31 c0 48 c7 c7 80 c7
> c5 89 e8 03 01 bf fe 83 3d ec e0 27 07 00 75 15 49 8d 7c 24 68 e8 30
> 02 bf fe <4d> 39 64 24 68 0f 85 00 08 00 00 bf 01 00 00 00 e8 5b e7 76
> fe 4d
> [  173.191203] RSP: 0018:ffff8881b18c7a20 EFLAGS: 00010286
> [  173.191958] RAX: ffff8881b0ae4001 RBX: 0000000000000000 RCX: ffffffff810e0df1
> [  173.192968] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8900ea40
> [  173.193976] RBP: ffff8881b18c7b50 R08: ffffffff8900ea47 R09: 1ffffffff1201d48
> [  173.194986] R10: dffffc0000000000 R11: fffffbfff1201d49 R12: 0000000000000040
> [  173.196263] R13: ffffffff823e61cc R14: 0000000000000000 R15: 0000000000000000
> [  173.197274] FS:  00007f4e66b6e740(0000) GS:ffff888dfd200000(0000)
> knlGS:0000000000000000
> [  173.198466] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  173.199316] CR2: 00000000000000a8 CR3: 00000001b191e005 CR4: 0000000000370ee0
> [  173.200327] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  173.201382] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  173.202430] Call Trace:
> [  173.202810]  <TASK>
> [  173.203173]  ? __die_body+0x63/0xb0
> [  173.203678]  ? page_fault_oops+0x2f3/0x440
> [  173.204338]  ? __pfx_page_fault_oops+0x10/0x10
> [  173.204981]  ? vprintk_emit+0x455/0x520
> [  173.205593]  ? __pfx_vprintk_emit+0x10/0x10
> [  173.206276]  ? __pfx_lockdep_hardirqs_on_prepare+0x10/0x10
> [  173.207068]  ? do_user_addr_fault+0x796/0x840
> [  173.207694]  ? _printk+0xac/0xf0
> [  173.208188]  ? __pfx_do_user_addr_fault+0x10/0x10
> [  173.208879]  ? rcu_is_watching+0x30/0x60
> [  173.209475]  ? exc_page_fault+0x7d/0x290
> [  173.210043]  ? asm_exc_page_fault+0x22/0x30
> [  173.210639]  ? mddev_suspend+0xbc/0x260
> [  173.211294]  ? add_taint+0x41/0x90
> [  173.211798]  ? __mutex_lock+0xc0/0x920
> [  173.212352]  ? lock_acquire+0x18a/0x390
> [  173.212914]  ? kernfs_find_and_get_ns+0x4c/0xb0
> [  173.213623]  ? __pfx___mutex_lock+0x10/0x10
> [  173.214243]  ? down_read+0x6b2/0x800
> [  173.214773]  ? lock_is_held_type+0xdb/0x150
> [  173.215374]  mddev_suspend+0xbc/0x260
> [  173.215941]  ? __pfx_lock_release+0x10/0x10
> [  173.216541]  ? lock_is_held_type+0xdb/0x150
> [  173.217148]  ? __pfx_mddev_suspend+0x10/0x10
> [  173.217776]  rdev_attr_store+0x5ba/0x600
> [  173.218343]  ? __pfx_sysfs_kf_write+0x10/0x10
> [  173.218977]  kernfs_fop_write_iter+0x1d1/0x280
> [  173.219618]  vfs_write+0x45d/0x5d0
> [  173.220126]  ? __pfx_vfs_write+0x10/0x10
> [  173.220689]  ? __pfx_lock_release+0x10/0x10
> [  173.221342]  ksys_write+0xed/0x1a0
> [  173.221850]  ? __pfx_ksys_write+0x10/0x10
> [  173.222421]  ? __audit_syscall_entry+0x1cf/0x200
> [  173.223090]  ? syscall_enter_from_user_mode+0x181/0x220
> [  173.223845]  do_syscall_64+0x43/0x90
> [  173.224362]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [  173.225083] RIP: 0033:0x7f4e65ced648
> [  173.225599] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00
> 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00
> 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89
> d4 55
> [  173.228199] RSP: 002b:00007ffe9a2ac128 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000001
> [  173.229267] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f4e65ced648
> [  173.230273] RDX: 0000000000000007 RSI: 0000561ae26e29d0 RDI: 0000000000000001
> [  173.231274] RBP: 0000561ae26e29d0 R08: 000000000000000a R09: 00007f4e65d80620
> [  173.232323] R10: 000000000000000a R11: 0000000000000246 R12: 00007f4e65fc06e0
> [  173.233323] R13: 0000000000000007 R14: 00007f4e65fbb880 R15: 0000000000000007
> [  173.234333]  </TASK>
> [  173.234657] Modules linked in:
> [  173.235118] CR2: 00000000000000a8
> [  173.235601] ---[ end trace 0000000000000000 ]---
> [  173.236270] RIP: 0010:__mutex_lock+0xc0/0x920
> [  173.236906] Code: 00 e8 24 f3 77 fe 2e 2e 2e 31 c0 48 c7 c7 80 c7
> c5 89 e8 03 01 bf fe 83 3d ec e0 27 07 00 75 15 49 8d 7c 24 68 e8 30
> 02 bf fe <4d> 39 64 24 68 0f 85 00 08 00 00 bf 01 00 00 00 e8 5b e7 76
> fe 4d
> [  173.239538] RSP: 0018:ffff8881b18c7a20 EFLAGS: 00010286
> [  173.240286] RAX: ffff8881b0ae4001 RBX: 0000000000000000 RCX: ffffffff810e0df1
> [  173.241293] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff8900ea40
> [  173.242342] RBP: ffff8881b18c7b50 R08: ffffffff8900ea47 R09: 1ffffffff1201d48
> [  173.243343] R10: dffffc0000000000 R11: fffffbfff1201d49 R12: 0000000000000040
> [  173.244346] R13: ffffffff823e61cc R14: 0000000000000000 R15: 0000000000000000
> [  173.245384] FS:  00007f4e66b6e740(0000) GS:ffff888dfd200000(0000)
> knlGS:0000000000000000
> [  173.246548] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  173.247362] CR2: 00000000000000a8 CR3: 00000001b191e005 CR4: 0000000000370ee0
> [  173.248371] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  173.249390] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  173.250395] Kernel panic - not syncing: Fatal exception
> [  173.251612] Kernel Offset: disabled
> [  173.252133] ---[ end Kernel panic - not syncing: Fatal exception ]---
> 
> 
> On Sun, Aug 27, 2023 at 7:04 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Changes in v2:
>>   - rebase onto the latest md-next
>>   - remove some follow-up cleanup patches; they will be sent later,
>>     after this patchset.
>>
>> After the previous four patchsets of preparatory work, this patchset
>> implements a new version of mddev_suspend(). With the new APIs:
>>   - reconfig_mutex is not required;
>>   - the odd logic in which suspending the array holds 'reconfig_mutex'
>>     for mddev_check_recovery() to update the superblock is not needed;
>>   - the special handling, 'pers->prepare_suspend', for raid456 is not
>>     needed;
>>   - it is safe to call them at any time once mddev is allocated, and
>>     they are designed for the slow path where array configuration is
>>     changed.
>>
>> The new API is used to replace:
>>
>> mddev_lock
>> mddev_suspend or not
>> // array reconfiguration
>> mddev_resume or not
>> mddev_unlock
>>
>> With:
>>
>> mddev_suspend
>> mddev_lock
>> // array reconfiguration
>> mddev_unlock
>> mddev_resume
>>
>> However, the above change is not possible for raid5 and raid-cluster in
>> some corner cases; there, mddev_suspend/resume() is replaced with the
>> quiesce() callback, which suspends the array as well.
>>
>> This patchset was tested in my VM with the mdadm test suite on loop
>> devices, except for the 10ddf tests (they already failed before this
>> patchset).
>>
>> More cleanups will follow after this patchset.
>>
>> Yu Kuai (28):
>>    md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi'
>>    md: use 'mddev->suspended' for is_md_suspended()
>>    md: add new helpers to suspend/resume array
>>    md: add new helpers to suspend/resume and lock/unlock array
>>    md: use new apis to suspend array for suspend_lo/hi_store()
>>    md: use new apis to suspend array for level_store()
>>    md: use new apis to suspend array for serialize_policy_store()
>>    md/dm-raid: use new apis to suspend array
>>    md/md-bitmap: use new apis to suspend array for location_store()
>>    md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log'
>>    md/raid5-cache: use new apis to suspend array for
>>      r5c_disable_writeback_async()
>>    md/raid5-cache: use new apis to suspend array for
>>      r5c_journal_mode_store()
>>    md/raid5: use new apis to suspend array for raid5_store_stripe_size()
>>    md/raid5: use new apis to suspend array for raid5_store_skip_copy()
>>    md/raid5: use new apis to suspend array for
>>      raid5_store_group_thread_cnt()
>>    md/raid5: use new apis to suspend array for
>>      raid5_change_consistency_policy()
>>    md/raid5: replace suspend with quiesce() callback
>>    md: quiesce before md_kick_rdev_from_array() for md-cluster
>>    md: use new apis to suspend array for ioctls involed array
>>      reconfiguration
>>    md: use new apis to suspend array for adding/removing rdev from
>>      state_store()
>>    md: use new apis to suspend array for bind_rdev_to_array()
>>    md: use new apis to suspend array related to serial pool in
>>      state_store()
>>    md: use new apis to suspend array in backlog_store()
>>    md: suspend array in md_start_sync() if array need reconfiguration
>>    md: cleanup mddev_create/destroy_serial_pool()
>>    md/md-linear: cleanup linear_add()
>>    md: remove old apis to suspend the array
>>    md: rename __mddev_suspend/resume() back to mddev_suspend/resume()
>>
>>   drivers/md/dm-raid.c       |   8 +-
>>   drivers/md/md-autodetect.c |   4 +-
>>   drivers/md/md-bitmap.c     |  18 ++-
>>   drivers/md/md-linear.c     |   2 -
>>   drivers/md/md.c            | 250 ++++++++++++++++++++++---------------
>>   drivers/md/md.h            |  52 ++++++--
>>   drivers/md/raid5-cache.c   |  61 +++++----
>>   drivers/md/raid5.c         |  56 ++++-----
>>   8 files changed, 253 insertions(+), 198 deletions(-)
>>
>> --
>> 2.39.2
>>
> .
> 




end of thread, other threads:[~2023-09-26  0:57 UTC | newest]

Thread overview: 74+ messages
2023-08-28  1:59 [dm-devel] [PATCH -next v2 00/28] md: synchronize io with array reconfiguration Yu Kuai
2023-08-28  1:59 ` Yu Kuai
2023-08-28  1:59 ` [dm-devel] [PATCH -next v2 01/28] md: use READ_ONCE/WRITE_ONCE for 'suspend_lo' and 'suspend_hi' Yu Kuai
2023-08-28  1:59   ` Yu Kuai
2023-09-14  2:53   ` Xiao Ni
2023-09-14  2:53     ` [dm-devel] " Xiao Ni
2023-09-25  1:18     ` Yu Kuai
2023-09-25  1:18       ` [dm-devel] " Yu Kuai
2023-08-28  1:59 ` [dm-devel] [PATCH -next v2 02/28] md: use 'mddev->suspended' for is_md_suspended() Yu Kuai
2023-08-28  1:59   ` Yu Kuai
2023-09-20  8:46   ` Xiao Ni
2023-09-20  8:46     ` [dm-devel] " Xiao Ni
2023-09-25  1:34     ` Yu Kuai
2023-09-25  1:34       ` [dm-devel] " Yu Kuai
2023-08-28  1:59 ` [dm-devel] [PATCH -next v2 03/28] md: add new helpers to suspend/resume array Yu Kuai
2023-08-28  1:59   ` Yu Kuai
2023-09-25  7:21   ` Xiao Ni
2023-09-25  7:21     ` [dm-devel] " Xiao Ni
2023-09-25  7:23     ` Xiao Ni
2023-09-25  7:23       ` [dm-devel] " Xiao Ni
2023-08-28  1:59 ` [dm-devel] [PATCH -next v2 04/28] md: add new helpers to suspend/resume and lock/unlock array Yu Kuai
2023-08-28  1:59   ` Yu Kuai
2023-08-28  1:59 ` [dm-devel] [PATCH -next v2 05/28] md: use new apis to suspend array for suspend_lo/hi_store() Yu Kuai
2023-08-28  1:59   ` Yu Kuai
2023-08-28  1:59 ` [dm-devel] [PATCH -next v2 06/28] md: use new apis to suspend array for level_store() Yu Kuai
2023-08-28  1:59   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 07/28] md: use new apis to suspend array for serialize_policy_store() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 08/28] md/dm-raid: use new apis to suspend array Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 09/28] md/md-bitmap: use new apis to suspend array for location_store() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [PATCH -next v2 10/28] md/raid5-cache: use READ_ONCE/WRITE_ONCE for 'conf->log' Yu Kuai
2023-08-28  2:00   ` [dm-devel] " Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 11/28] md/raid5-cache: use new apis to suspend array for r5c_disable_writeback_async() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 12/28] md/raid5-cache: use new apis to suspend array for r5c_journal_mode_store() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 13/28] md/raid5: use new apis to suspend array for raid5_store_stripe_size() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 14/28] md/raid5: use new apis to suspend array for raid5_store_skip_copy() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 15/28] md/raid5: use new apis to suspend array for raid5_store_group_thread_cnt() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [PATCH -next v2 16/28] md/raid5: use new apis to suspend array for raid5_change_consistency_policy() Yu Kuai
2023-08-28  2:00   ` [dm-devel] " Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 17/28] md/raid5: replace suspend with quiesce() callback Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 18/28] md: quiesce before md_kick_rdev_from_array() for md-cluster Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 19/28] md: use new apis to suspend array for ioctls involed array reconfiguration Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 20/28] md: use new apis to suspend array for adding/removing rdev from state_store() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 21/28] md: use new apis to suspend array for bind_rdev_to_array() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 22/28] md: use new apis to suspend array related to serial pool in state_store() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 23/28] md: use new apis to suspend array in backlog_store() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 24/28] md: suspend array in md_start_sync() if array need reconfiguration Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 25/28] md: cleanup mddev_create/destroy_serial_pool() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 26/28] md/md-linear: cleanup linear_add() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 27/28] md: remove old apis to suspend the array Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-08-28  2:00 ` [dm-devel] [PATCH -next v2 28/28] md: rename __mddev_suspend/resume() back to mddev_suspend/resume() Yu Kuai
2023-08-28  2:00   ` Yu Kuai
2023-09-25 15:45 ` [PATCH -next v2 00/28] md: synchronize io with array reconfiguration Song Liu
2023-09-25 15:45   ` [dm-devel] " Song Liu
2023-09-26  0:55   ` Yu Kuai
2023-09-26  0:55     ` [dm-devel] " Yu Kuai
