[PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action

dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed

* [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action
@ 2024-05-09  1:18 Yu Kuai
  2024-05-09  1:18 ` [PATCH md-6.10 1/9] md: rearrange recovery_flage Yu Kuai
                   ` (8 more replies)
  0 siblings, 9 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:18 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Motivation of this patchset is that during code review, I found some
places is not good coded, and I decide to make code more readable.

Yu Kuai (9):
  md: rearrange recovery_flage
  md: add a new enum type sync_action
  md: add new helpers for sync_action
  md: factor out helper to start reshape from action_store()
  md: replace sysfs api sync_action with new helpers
  md: use new helers in md_do_sync()
  md: replace last_sync_action with new enum type
  md: factor out helpers for different sync_action in md_do_sync()
  md: pass in max_sectors for pers->sync_request()

 drivers/md/dm-raid.c |   2 +-
 drivers/md/md.c      | 367 ++++++++++++++++++++++++++++---------------
 drivers/md/md.h      | 122 +++++++++++---
 drivers/md/raid1.c   |   5 +-
 drivers/md/raid10.c  |   8 +-
 drivers/md/raid5.c   |   3 +-
 6 files changed, 344 insertions(+), 163 deletions(-)

-- 
2.39.2


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH md-6.10 1/9] md: rearrange recovery_flage
  2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
@ 2024-05-09  1:18 ` Yu Kuai
  2024-05-13 15:12   ` Mariusz Tkaczyk
  2024-05-14  5:51   ` Xiao Ni
  2024-05-09  1:18 ` [PATCH md-6.10 2/9] md: add a new enum type sync_action Yu Kuai
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:18 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Currently there are lots of flags and the names are confusing, since
there are two main types of flags, sync thread runnng status and sync
thread action, rearrange and update comment to improve code readability,
there are no functional changes.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.h | 52 ++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 38 insertions(+), 14 deletions(-)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 029dd0491a36..2a1cb7b889e5 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -551,22 +551,46 @@ struct mddev {
 };
 
 enum recovery_flags {
+	/* flags for sync thread running status */
+
+	/*
+	 * set when one of sync action is set and new sync thread need to be
+	 * registered, or just add/remove spares from conf.
+	 */
+	MD_RECOVERY_NEEDED,
+	/* sync thread is running, or about to be started */
+	MD_RECOVERY_RUNNING,
+	/* sync thread needs to be aborted for some reason */
+	MD_RECOVERY_INTR,
+	/* sync thread is done and is waiting to be unregistered */
+	MD_RECOVERY_DONE,
+	/* running sync thread must abort immediately, and not restart */
+	MD_RECOVERY_FROZEN,
+	/* waiting for pers->start() to finish */
+	MD_RECOVERY_WAIT,
+	/* interrupted because io-error */
+	MD_RECOVERY_ERROR,
+
+	/* flags determines sync action */
+
+	/* if just this flag is set, action is resync. */
+	MD_RECOVERY_SYNC,
+	/*
+	 * paired with MD_RECOVERY_SYNC, if MD_RECOVERY_CHECK is not set,
+	 * action is repair, means user requested resync.
+	 */
+	MD_RECOVERY_REQUESTED,
 	/*
-	 * If neither SYNC or RESHAPE are set, then it is a recovery.
+	 * paired with MD_RECOVERY_SYNC and MD_RECOVERY_REQUESTED, action is
+	 * check.
 	 */
-	MD_RECOVERY_RUNNING,	/* a thread is running, or about to be started */
-	MD_RECOVERY_SYNC,	/* actually doing a resync, not a recovery */
-	MD_RECOVERY_RECOVER,	/* doing recovery, or need to try it. */
-	MD_RECOVERY_INTR,	/* resync needs to be aborted for some reason */
-	MD_RECOVERY_DONE,	/* thread is done and is waiting to be reaped */
-	MD_RECOVERY_NEEDED,	/* we might need to start a resync/recover */
-	MD_RECOVERY_REQUESTED,	/* user-space has requested a sync (used with SYNC) */
-	MD_RECOVERY_CHECK,	/* user-space request for check-only, no repair */
-	MD_RECOVERY_RESHAPE,	/* A reshape is happening */
-	MD_RECOVERY_FROZEN,	/* User request to abort, and not restart, any action */
-	MD_RECOVERY_ERROR,	/* sync-action interrupted because io-error */
-	MD_RECOVERY_WAIT,	/* waiting for pers->start() to finish */
-	MD_RESYNCING_REMOTE,	/* remote node is running resync thread */
+	MD_RECOVERY_CHECK,
+	/* recovery, or need to try it */
+	MD_RECOVERY_RECOVER,
+	/* reshape */
+	MD_RECOVERY_RESHAPE,
+	/* remote node is running resync thread */
+	MD_RESYNCING_REMOTE,
 };
 
 enum md_ro_state {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH md-6.10 2/9] md: add a new enum type sync_action
  2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
  2024-05-09  1:18 ` [PATCH md-6.10 1/9] md: rearrange recovery_flage Yu Kuai
@ 2024-05-09  1:18 ` Yu Kuai
  2024-05-14  6:13   ` Xiao Ni
  2024-05-09  1:18 ` [PATCH md-6.10 3/9] md: add new helpers for sync_action Yu Kuai
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:18 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

In order to make code related to sync_thread cleaner in following
patches, also add detail comment about each sync action.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.h | 57 ++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 56 insertions(+), 1 deletion(-)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 2a1cb7b889e5..2edad966f90a 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -34,6 +34,61 @@
  */
 #define	MD_FAILFAST	(REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT)
 
+/* Status of sync thread. */
+enum sync_action {
+	/*
+	 * Represent by MD_RECOVERY_SYNC, start when:
+	 * 1) after assemble, sync data from first rdev to other copies, this
+	 * must be done first before other sync actions and will only execute
+	 * once;
+	 * 2) resize the array(notice that this is not reshape), sync data for
+	 * the new range;
+	 */
+	ACTION_RESYNC,
+	/*
+	 * Represent by MD_RECOVERY_RECOVER, start when:
+	 * 1) for new replacement, sync data based on the replace rdev or
+	 * available copies from other rdev;
+	 * 2) for new member disk while the array is degraded, sync data from
+	 * other rdev;
+	 * 3) reassemble after power failure or re-add a hot removed rdev, sync
+	 * data from first rdev to other copies based on bitmap;
+	 */
+	ACTION_RECOVER,
+	/*
+	 * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED |
+	 * MD_RECOVERY_CHECK, start when user echo "check" to sysfs api
+	 * sync_action, used to check if data copies from differenct rdev are
+	 * the same. The number of mismatch sectors will be exported to user
+	 * by sysfs api mismatch_cnt;
+	 */
+	ACTION_CHECK,
+	/*
+	 * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED, start when
+	 * user echo "repair" to sysfs api sync_action, usually paired with
+	 * ACTION_CHECK, used to force syncing data once user found that there
+	 * are inconsistent data,
+	 */
+	ACTION_REPAIR,
+	/*
+	 * Represent by MD_RECOVERY_RESHAPE, start when new member disk is added
+	 * to the conf, notice that this is different from spares or
+	 * replacement;
+	 */
+	ACTION_RESHAPE,
+	/*
+	 * Represent by MD_RECOVERY_FROZEN, can be set by sysfs api sync_action
+	 * or internal usage like setting the array read-only, will forbid above
+	 * actions.
+	 */
+	ACTION_FROZEN,
+	/*
+	 * All above actions don't match.
+	 */
+	ACTION_IDLE,
+	NR_SYNC_ACTIONS,
+};
+
 /*
  * The struct embedded in rdev is used to serialize IO.
  */
@@ -571,7 +626,7 @@ enum recovery_flags {
 	/* interrupted because io-error */
 	MD_RECOVERY_ERROR,
 
-	/* flags determines sync action */
+	/* flags determines sync action, see details in enum sync_action */
 
 	/* if just this flag is set, action is resync. */
 	MD_RECOVERY_SYNC,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH md-6.10 3/9] md: add new helpers for sync_action
  2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
  2024-05-09  1:18 ` [PATCH md-6.10 1/9] md: rearrange recovery_flage Yu Kuai
  2024-05-09  1:18 ` [PATCH md-6.10 2/9] md: add a new enum type sync_action Yu Kuai
@ 2024-05-09  1:18 ` Yu Kuai
  2024-05-14  6:52   ` Xiao Ni
  2024-05-20 11:51   ` Su Yue
  2024-05-09  1:18 ` [PATCH md-6.10 4/9] md: factor out helper to start reshape from action_store() Yu Kuai
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:18 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

The new helpers will get current sync_action of the array, will be used
in later patches to make code cleaner.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/md/md.h |  3 +++
 2 files changed, 67 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 00bbafcd27bb..48ec35342d1b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -69,6 +69,16 @@
 #include "md-bitmap.h"
 #include "md-cluster.h"
 
+static char *action_name[NR_SYNC_ACTIONS] = {
+	[ACTION_RESYNC]		= "resync",
+	[ACTION_RECOVER]	= "recover",
+	[ACTION_CHECK]		= "check",
+	[ACTION_REPAIR]		= "repair",
+	[ACTION_RESHAPE]	= "reshape",
+	[ACTION_FROZEN]		= "frozen",
+	[ACTION_IDLE]		= "idle",
+};
+
 /* pers_list is a list of registered personalities protected by pers_lock. */
 static LIST_HEAD(pers_list);
 static DEFINE_SPINLOCK(pers_lock);
@@ -4867,6 +4877,60 @@ metadata_store(struct mddev *mddev, const char *buf, size_t len)
 static struct md_sysfs_entry md_metadata =
 __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR, metadata_show, metadata_store);
 
+enum sync_action md_sync_action(struct mddev *mddev)
+{
+	unsigned long recovery = mddev->recovery;
+
+	/*
+	 * frozen has the highest priority, means running sync_thread will be
+	 * stopped immediately, and no new sync_thread can start.
+	 */
+	if (test_bit(MD_RECOVERY_FROZEN, &recovery))
+		return ACTION_FROZEN;
+
+	/*
+	 * idle means no sync_thread is running, and no new sync_thread is
+	 * requested.
+	 */
+	if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
+	    (!md_is_rdwr(mddev) || !test_bit(MD_RECOVERY_NEEDED, &recovery)))
+		return ACTION_IDLE;
+
+	if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
+	    mddev->reshape_position != MaxSector)
+		return ACTION_RESHAPE;
+
+	if (test_bit(MD_RECOVERY_RECOVER, &recovery))
+		return ACTION_RECOVER;
+
+	if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
+		if (test_bit(MD_RECOVERY_CHECK, &recovery))
+			return ACTION_CHECK;
+		if (test_bit(MD_RECOVERY_REQUESTED, &recovery))
+			return ACTION_REPAIR;
+		return ACTION_RESYNC;
+	}
+
+	return ACTION_IDLE;
+}
+
+enum sync_action md_sync_action_by_name(char *page)
+{
+	enum sync_action action;
+
+	for (action = 0; action < NR_SYNC_ACTIONS; ++action) {
+		if (cmd_match(page, action_name[action]))
+			return action;
+	}
+
+	return NR_SYNC_ACTIONS;
+}
+
+char *md_sync_action_name(enum sync_action action)
+{
+	return action_name[action];
+}
+
 static ssize_t
 action_show(struct mddev *mddev, char *page)
 {
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 2edad966f90a..72ca7a796df5 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -864,6 +864,9 @@ extern void md_unregister_thread(struct mddev *mddev, struct md_thread __rcu **t
 extern void md_wakeup_thread(struct md_thread __rcu *thread);
 extern void md_check_recovery(struct mddev *mddev);
 extern void md_reap_sync_thread(struct mddev *mddev);
+extern enum sync_action md_sync_action(struct mddev *mddev);
+extern enum sync_action md_sync_action_by_name(char *page);
+extern char *md_sync_action_name(enum sync_action action);
 extern bool md_write_start(struct mddev *mddev, struct bio *bi);
 extern void md_write_inc(struct mddev *mddev, struct bio *bi);
 extern void md_write_end(struct mddev *mddev);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH md-6.10 4/9] md: factor out helper to start reshape from action_store()
  2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
                   ` (2 preceding siblings ...)
  2024-05-09  1:18 ` [PATCH md-6.10 3/9] md: add new helpers for sync_action Yu Kuai
@ 2024-05-09  1:18 ` Yu Kuai
  2024-05-09  1:18 ` [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers Yu Kuai
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:18 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

There are no functional changes, just to make code cleaner and prepare
for following refactor.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 67 +++++++++++++++++++++++++++++++------------------
 drivers/md/md.h |  2 +-
 2 files changed, 43 insertions(+), 26 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 48ec35342d1b..7600da89d909 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4914,7 +4914,7 @@ enum sync_action md_sync_action(struct mddev *mddev)
 	return ACTION_IDLE;
 }
 
-enum sync_action md_sync_action_by_name(char *page)
+enum sync_action md_sync_action_by_name(const char *page)
 {
 	enum sync_action action;
 
@@ -5055,6 +5055,45 @@ static void frozen_sync_thread(struct mddev *mddev)
 	mutex_unlock(&mddev->sync_mutex);
 }
 
+static int mddev_start_reshape(struct mddev *mddev)
+{
+	int ret;
+
+	if (mddev->pers->start_reshape == NULL)
+		return -EINVAL;
+
+	ret = mddev_lock(mddev);
+	if (ret)
+		return ret;
+
+	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
+		mddev_unlock(mddev);
+		return -EBUSY;
+	}
+
+	if (mddev->reshape_position == MaxSector ||
+	    mddev->pers->check_reshape == NULL ||
+	    mddev->pers->check_reshape(mddev)) {
+		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+		ret = mddev->pers->start_reshape(mddev);
+		if (ret) {
+			mddev_unlock(mddev);
+			return ret;
+		}
+	} else {
+		/*
+		 * If reshape is still in progress, and md_check_recovery() can
+		 * continue to reshape, don't restart reshape because data can
+		 * be corrupted for raid456.
+		 */
+		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	}
+
+	mddev_unlock(mddev);
+	sysfs_notify_dirent_safe(mddev->sysfs_degraded);
+	return 0;
+}
+
 static ssize_t
 action_store(struct mddev *mddev, const char *page, size_t len)
 {
@@ -5074,32 +5113,10 @@ action_store(struct mddev *mddev, const char *page, size_t len)
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
 	} else if (cmd_match(page, "reshape")) {
-		int err;
-		if (mddev->pers->start_reshape == NULL)
-			return -EINVAL;
-		err = mddev_lock(mddev);
-		if (!err) {
-			if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
-				err =  -EBUSY;
-			} else if (mddev->reshape_position == MaxSector ||
-				   mddev->pers->check_reshape == NULL ||
-				   mddev->pers->check_reshape(mddev)) {
-				clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-				err = mddev->pers->start_reshape(mddev);
-			} else {
-				/*
-				 * If reshape is still in progress, and
-				 * md_check_recovery() can continue to reshape,
-				 * don't restart reshape because data can be
-				 * corrupted for raid456.
-				 */
-				clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-			}
-			mddev_unlock(mddev);
-		}
+		int err = mddev_start_reshape(mddev);
+
 		if (err)
 			return err;
-		sysfs_notify_dirent_safe(mddev->sysfs_degraded);
 	} else {
 		if (cmd_match(page, "check"))
 			set_bit(MD_RECOVERY_CHECK, &mddev->recovery);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 72ca7a796df5..cf54870e89db 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -865,7 +865,7 @@ extern void md_wakeup_thread(struct md_thread __rcu *thread);
 extern void md_check_recovery(struct mddev *mddev);
 extern void md_reap_sync_thread(struct mddev *mddev);
 extern enum sync_action md_sync_action(struct mddev *mddev);
-extern enum sync_action md_sync_action_by_name(char *page);
+extern enum sync_action md_sync_action_by_name(const char *page);
 extern char *md_sync_action_name(enum sync_action action);
 extern bool md_write_start(struct mddev *mddev, struct bio *bi);
 extern void md_write_inc(struct mddev *mddev, struct bio *bi);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
  2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
                   ` (3 preceding siblings ...)
  2024-05-09  1:18 ` [PATCH md-6.10 4/9] md: factor out helper to start reshape from action_store() Yu Kuai
@ 2024-05-09  1:18 ` Yu Kuai
  2024-05-20 15:01   ` kernel test robot
  2024-05-09  1:18 ` [PATCH md-6.10 6/9] md: use new helers in md_do_sync() Yu Kuai
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:18 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

To get rid of extrem long if else if usage, and make code cleaner.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 94 +++++++++++++++++++++++++++----------------------
 1 file changed, 52 insertions(+), 42 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7600da89d909..da6c94f03efb 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4934,27 +4934,9 @@ char *md_sync_action_name(enum sync_action action)
 static ssize_t
 action_show(struct mddev *mddev, char *page)
 {
-	char *type = "idle";
-	unsigned long recovery = mddev->recovery;
-	if (test_bit(MD_RECOVERY_FROZEN, &recovery))
-		type = "frozen";
-	else if (test_bit(MD_RECOVERY_RUNNING, &recovery) ||
-	    (md_is_rdwr(mddev) && test_bit(MD_RECOVERY_NEEDED, &recovery))) {
-		if (test_bit(MD_RECOVERY_RESHAPE, &recovery))
-			type = "reshape";
-		else if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
-			if (!test_bit(MD_RECOVERY_REQUESTED, &recovery))
-				type = "resync";
-			else if (test_bit(MD_RECOVERY_CHECK, &recovery))
-				type = "check";
-			else
-				type = "repair";
-		} else if (test_bit(MD_RECOVERY_RECOVER, &recovery))
-			type = "recover";
-		else if (mddev->reshape_position != MaxSector)
-			type = "reshape";
-	}
-	return sprintf(page, "%s\n", type);
+	enum sync_action action = md_sync_action(mddev);
+
+	return sprintf(page, "%s\n", md_sync_action_name(action));
 }
 
 /**
@@ -5097,35 +5079,63 @@ static int mddev_start_reshape(struct mddev *mddev)
 static ssize_t
 action_store(struct mddev *mddev, const char *page, size_t len)
 {
+	int ret;
+	enum sync_action action;
+
 	if (!mddev->pers || !mddev->pers->sync_request)
 		return -EINVAL;
 
+	action = md_sync_action_by_name(page);
 
-	if (cmd_match(page, "idle"))
-		idle_sync_thread(mddev);
-	else if (cmd_match(page, "frozen"))
-		frozen_sync_thread(mddev);
-	else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
-		return -EBUSY;
-	else if (cmd_match(page, "resync"))
-		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	else if (cmd_match(page, "recover")) {
-		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-		set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
-	} else if (cmd_match(page, "reshape")) {
-		int err = mddev_start_reshape(mddev);
-
-		if (err)
-			return err;
+	/* TODO: mdadm rely on "idle" to start sync_thread. */
+	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
+		switch (action) {
+		case ACTION_FROZEN:
+			frozen_sync_thread(mddev);
+			return len;
+		case ACTION_IDLE:
+			idle_sync_thread(mddev);
+			break;
+		case ACTION_RESHAPE:
+		case ACTION_RECOVER:
+		case ACTION_CHECK:
+		case ACTION_REPAIR:
+		case ACTION_RESYNC:
+			return -EBUSY;
+		default:
+			return -EINVAL;
+		}
 	} else {
-		if (cmd_match(page, "check"))
+		switch (action) {
+		case ACTION_FROZEN:
+			set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+			return len;
+		case ACTION_RESHAPE:
+			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+			ret = mddev_start_reshape(mddev);
+			if (ret)
+				return ret;
+			break;
+		case ACTION_RECOVER:
+			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+			set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
+			break;
+		case ACTION_CHECK:
 			set_bit(MD_RECOVERY_CHECK, &mddev->recovery);
-		else if (!cmd_match(page, "repair"))
+			fallthrough;
+		case ACTION_REPAIR:
+			set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
+			set_bit(MD_RECOVERY_SYNC, &mddev->recovery);
+			fallthrough;
+		case ACTION_RESYNC:
+		case ACTION_IDLE:
+			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+			break;
+		default:
 			return -EINVAL;
-		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-		set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
-		set_bit(MD_RECOVERY_SYNC, &mddev->recovery);
+		}
 	}
+
 	if (mddev->ro == MD_AUTO_READ) {
 		/* A write to sync_action is enough to justify
 		 * canceling read-auto mode
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH md-6.10 6/9] md: use new helers in md_do_sync()
  2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
                   ` (4 preceding siblings ...)
  2024-05-09  1:18 ` [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers Yu Kuai
@ 2024-05-09  1:18 ` Yu Kuai
  2024-05-09  1:18 ` [PATCH md-6.10 7/9] md: replace last_sync_action with new enum type Yu Kuai
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:18 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

There are no functinal changes, just to make code cleaner.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 21 +++++----------------
 1 file changed, 5 insertions(+), 16 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index da6c94f03efb..0b609d79453a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8943,7 +8943,8 @@ void md_do_sync(struct md_thread *thread)
 	sector_t last_check;
 	int skipped = 0;
 	struct md_rdev *rdev;
-	char *desc, *action = NULL;
+	enum sync_action action;
+	char *desc;
 	struct blk_plug plug;
 	int ret;
 
@@ -8974,21 +8975,9 @@ void md_do_sync(struct md_thread *thread)
 			goto skip;
 	}
 
-	if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
-		if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
-			desc = "data-check";
-			action = "check";
-		} else if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
-			desc = "requested-resync";
-			action = "repair";
-		} else
-			desc = "resync";
-	} else if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
-		desc = "reshape";
-	else
-		desc = "recovery";
-
-	mddev->last_sync_action = action ?: desc;
+	action = md_sync_action(mddev);
+	desc = md_sync_action_name(action);
+	mddev->last_sync_action = desc;
 
 	/*
 	 * Before starting a resync we must have set curr_resync to
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH md-6.10 7/9] md: replace last_sync_action with new enum type
  2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
                   ` (5 preceding siblings ...)
  2024-05-09  1:18 ` [PATCH md-6.10 6/9] md: use new helers in md_do_sync() Yu Kuai
@ 2024-05-09  1:18 ` Yu Kuai
  2024-05-09  1:18 ` [PATCH md-6.10 8/9] md: factor out helpers for different sync_action in md_do_sync() Yu Kuai
  2024-05-09  1:19 ` [PATCH md-6.10 9/9] md: pass in max_sectors for pers->sync_request() Yu Kuai
  8 siblings, 0 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:18 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

The only difference is that "none" is removed and initial
last_sync_action will be idle.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/dm-raid.c | 2 +-
 drivers/md/md.c      | 7 ++++---
 drivers/md/md.h      | 9 ++++-----
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index abe88d1e6735..052c00c1eb15 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3542,7 +3542,7 @@ static void raid_status(struct dm_target *ti, status_type_t type,
 		recovery = rs->md.recovery;
 		state = decipher_sync_action(mddev, recovery);
 		progress = rs_get_progress(rs, recovery, state, resync_max_sectors);
-		resync_mismatches = (mddev->last_sync_action && !strcasecmp(mddev->last_sync_action, "check")) ?
+		resync_mismatches = mddev->last_sync_action == ACTION_CHECK ?
 				    atomic64_read(&mddev->resync_mismatches) : 0;
 
 		/* HM FIXME: do we want another state char for raid0? It shows 'D'/'A'/'-' now */
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 0b609d79453a..2fc81175b46b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -768,7 +768,7 @@ int mddev_init(struct mddev *mddev)
 	init_waitqueue_head(&mddev->recovery_wait);
 	mddev->reshape_position = MaxSector;
 	mddev->reshape_backwards = 0;
-	mddev->last_sync_action = "none";
+	mddev->last_sync_action = ACTION_IDLE;
 	mddev->resync_min = 0;
 	mddev->resync_max = MaxSector;
 	mddev->level = LEVEL_NONE;
@@ -5156,7 +5156,8 @@ __ATTR_PREALLOC(sync_action, S_IRUGO|S_IWUSR, action_show, action_store);
 static ssize_t
 last_sync_action_show(struct mddev *mddev, char *page)
 {
-	return sprintf(page, "%s\n", mddev->last_sync_action);
+	return sprintf(page, "%s\n",
+		       md_sync_action_name(mddev->last_sync_action));
 }
 
 static struct md_sysfs_entry md_last_scan_mode = __ATTR_RO(last_sync_action);
@@ -8977,7 +8978,7 @@ void md_do_sync(struct md_thread *thread)
 
 	action = md_sync_action(mddev);
 	desc = md_sync_action_name(action);
-	mddev->last_sync_action = desc;
+	mddev->last_sync_action = action;
 
 	/*
 	 * Before starting a resync we must have set curr_resync to
diff --git a/drivers/md/md.h b/drivers/md/md.h
index cf54870e89db..883b147a2c52 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -426,13 +426,12 @@ struct mddev {
 	struct md_thread __rcu		*thread;	/* management thread */
 	struct md_thread __rcu		*sync_thread;	/* doing resync or reconstruct */
 
-	/* 'last_sync_action' is initialized to "none".  It is set when a
-	 * sync operation (i.e "data-check", "requested-resync", "resync",
-	 * "recovery", or "reshape") is started.  It holds this value even
+	/*
+	 * Set when a sync operation is started. It holds this value even
 	 * when the sync thread is "frozen" (interrupted) or "idle" (stopped
-	 * or finished).  It is overwritten when a new sync operation is begun.
+	 * or finished). It is overwritten when a new sync operation is begun.
 	 */
-	char				*last_sync_action;
+	enum sync_action		last_sync_action;
 	sector_t			curr_resync;	/* last block scheduled */
 	/* As resync requests can complete out of order, we cannot easily track
 	 * how much resync has been completed.  So we occasionally pause until
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH md-6.10 8/9] md: factor out helpers for different sync_action in md_do_sync()
  2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
                   ` (6 preceding siblings ...)
  2024-05-09  1:18 ` [PATCH md-6.10 7/9] md: replace last_sync_action with new enum type Yu Kuai
@ 2024-05-09  1:18 ` Yu Kuai
  2024-05-14  7:27   ` Xiao Ni
  2024-05-09  1:19 ` [PATCH md-6.10 9/9] md: pass in max_sectors for pers->sync_request() Yu Kuai
  8 siblings, 1 reply; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:18 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

Make code cleaner by replace if else if with switch, and it's more
obvious now what is doning for each sync_action. There are no
functional changes.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 123 ++++++++++++++++++++++++++++--------------------
 1 file changed, 73 insertions(+), 50 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2fc81175b46b..42db128b82d9 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8928,6 +8928,77 @@ void md_allow_write(struct mddev *mddev)
 }
 EXPORT_SYMBOL_GPL(md_allow_write);
 
+static sector_t md_sync_max_sectors(struct mddev *mddev,
+				    enum sync_action action)
+{
+	switch (action) {
+	case ACTION_RESYNC:
+	case ACTION_CHECK:
+	case ACTION_REPAIR:
+		atomic64_set(&mddev->resync_mismatches, 0);
+		fallthrough;
+	case ACTION_RESHAPE:
+		return mddev->resync_max_sectors;
+	case ACTION_RECOVER:
+		return mddev->dev_sectors;
+	default:
+		return 0;
+	}
+}
+
+static sector_t md_sync_position(struct mddev *mddev, enum sync_action action)
+{
+	sector_t start = 0;
+	struct md_rdev *rdev;
+
+	switch (action) {
+	case ACTION_CHECK:
+	case ACTION_REPAIR:
+		return mddev->resync_min;
+	case ACTION_RESYNC:
+		if (!mddev->bitmap)
+			return mddev->recovery_cp;
+		return 0;
+	case ACTION_RESHAPE:
+		/*
+		 * If the original node aborts reshaping then we continue the
+		 * reshaping, so set again to avoid restart reshape from the
+		 * first beginning
+		 */
+		if (mddev_is_clustered(mddev) &&
+		    mddev->reshape_position != MaxSector)
+			return mddev->reshape_position;
+		return 0;
+	case ACTION_RECOVER:
+		start = MaxSector;
+		rcu_read_lock();
+		rdev_for_each_rcu(rdev, mddev)
+			if (rdev->raid_disk >= 0 &&
+			    !test_bit(Journal, &rdev->flags) &&
+			    !test_bit(Faulty, &rdev->flags) &&
+			    !test_bit(In_sync, &rdev->flags) &&
+			    rdev->recovery_offset < start)
+				start = rdev->recovery_offset;
+		rcu_read_unlock();
+
+		/* If there is a bitmap, we need to make sure all
+		 * writes that started before we added a spare
+		 * complete before we start doing a recovery.
+		 * Otherwise the write might complete and (via
+		 * bitmap_endwrite) set a bit in the bitmap after the
+		 * recovery has checked that bit and skipped that
+		 * region.
+		 */
+		if (mddev->bitmap) {
+			mddev->pers->quiesce(mddev, 1);
+			mddev->pers->quiesce(mddev, 0);
+		}
+		return start;
+	default:
+		return MaxSector;
+	}
+}
+
 #define SYNC_MARKS	10
 #define	SYNC_MARK_STEP	(3*HZ)
 #define UPDATE_FREQUENCY (5*60*HZ)
@@ -9046,56 +9117,8 @@ void md_do_sync(struct md_thread *thread)
 		spin_unlock(&all_mddevs_lock);
 	} while (mddev->curr_resync < MD_RESYNC_DELAYED);
 
-	j = 0;
-	if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
-		/* resync follows the size requested by the personality,
-		 * which defaults to physical size, but can be virtual size
-		 */
-		max_sectors = mddev->resync_max_sectors;
-		atomic64_set(&mddev->resync_mismatches, 0);
-		/* we don't use the checkpoint if there's a bitmap */
-		if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
-			j = mddev->resync_min;
-		else if (!mddev->bitmap)
-			j = mddev->recovery_cp;
-
-	} else if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)) {
-		max_sectors = mddev->resync_max_sectors;
-		/*
-		 * If the original node aborts reshaping then we continue the
-		 * reshaping, so set j again to avoid restart reshape from the
-		 * first beginning
-		 */
-		if (mddev_is_clustered(mddev) &&
-		    mddev->reshape_position != MaxSector)
-			j = mddev->reshape_position;
-	} else {
-		/* recovery follows the physical size of devices */
-		max_sectors = mddev->dev_sectors;
-		j = MaxSector;
-		rcu_read_lock();
-		rdev_for_each_rcu(rdev, mddev)
-			if (rdev->raid_disk >= 0 &&
-			    !test_bit(Journal, &rdev->flags) &&
-			    !test_bit(Faulty, &rdev->flags) &&
-			    !test_bit(In_sync, &rdev->flags) &&
-			    rdev->recovery_offset < j)
-				j = rdev->recovery_offset;
-		rcu_read_unlock();
-
-		/* If there is a bitmap, we need to make sure all
-		 * writes that started before we added a spare
-		 * complete before we start doing a recovery.
-		 * Otherwise the write might complete and (via
-		 * bitmap_endwrite) set a bit in the bitmap after the
-		 * recovery has checked that bit and skipped that
-		 * region.
-		 */
-		if (mddev->bitmap) {
-			mddev->pers->quiesce(mddev, 1);
-			mddev->pers->quiesce(mddev, 0);
-		}
-	}
+	max_sectors = md_sync_max_sectors(mddev, action);
+	j = md_sync_position(mddev, action);
 
 	pr_info("md: %s of RAID array %s\n", desc, mdname(mddev));
 	pr_debug("md: minimum _guaranteed_  speed: %d KB/sec/disk.\n", speed_min(mddev));
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH md-6.10 9/9] md: pass in max_sectors for pers->sync_request()
  2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
                   ` (7 preceding siblings ...)
  2024-05-09  1:18 ` [PATCH md-6.10 8/9] md: factor out helpers for different sync_action in md_do_sync() Yu Kuai
@ 2024-05-09  1:19 ` Yu Kuai
  8 siblings, 0 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-09  1:19 UTC (permalink / raw)
  To: agk, snitzer, mpatocka, song, xni
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yukuai1, yi.zhang,
	yangerkun

From: Yu Kuai <yukuai3@huawei.com>

For different sync_action, sync_thread will use different max_sectors,
see details in md_sync_max_sectors(), currently both md_do_sync() and
pers->sync_request() in eatch iteration have to get the same
max_sectors. Hence pass in max_sectors for pers->sync_request() to
prevent redundant code.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c     | 5 +++--
 drivers/md/md.h     | 3 ++-
 drivers/md/raid1.c  | 5 ++---
 drivers/md/raid10.c | 8 ++------
 drivers/md/raid5.c  | 3 +--
 5 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 42db128b82d9..95b0f9642137 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9200,7 +9200,8 @@ void md_do_sync(struct md_thread *thread)
 		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
 			break;
 
-		sectors = mddev->pers->sync_request(mddev, j, &skipped);
+		sectors = mddev->pers->sync_request(mddev, j, max_sectors,
+						    &skipped);
 		if (sectors == 0) {
 			set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 			break;
@@ -9290,7 +9291,7 @@ void md_do_sync(struct md_thread *thread)
 		mddev->curr_resync_completed = mddev->curr_resync;
 		sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	}
-	mddev->pers->sync_request(mddev, max_sectors, &skipped);
+	mddev->pers->sync_request(mddev, max_sectors, max_sectors, &skipped);
 
 	if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
 	    mddev->curr_resync > MD_RESYNC_ACTIVE) {
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 883b147a2c52..d03e11fff79a 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -731,7 +731,8 @@ struct md_personality
 	int (*hot_add_disk) (struct mddev *mddev, struct md_rdev *rdev);
 	int (*hot_remove_disk) (struct mddev *mddev, struct md_rdev *rdev);
 	int (*spare_active) (struct mddev *mddev);
-	sector_t (*sync_request)(struct mddev *mddev, sector_t sector_nr, int *skipped);
+	sector_t (*sync_request)(struct mddev *mddev, sector_t sector_nr,
+				 sector_t max_sector, int *skipped);
 	int (*resize) (struct mddev *mddev, sector_t sectors);
 	sector_t (*size) (struct mddev *mddev, sector_t sectors, int raid_disks);
 	int (*check_reshape) (struct mddev *mddev);
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index be8ac24f50b6..1f6dffada912 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2757,12 +2757,12 @@ static struct r1bio *raid1_alloc_init_r1buf(struct r1conf *conf)
  */
 
 static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
-				   int *skipped)
+				   sector_t max_sector, int *skipped)
 {
 	struct r1conf *conf = mddev->private;
 	struct r1bio *r1_bio;
 	struct bio *bio;
-	sector_t max_sector, nr_sectors;
+	sector_t nr_sectors;
 	int disk = -1;
 	int i;
 	int wonly = -1;
@@ -2778,7 +2778,6 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 		if (init_resync(conf))
 			return 0;
 
-	max_sector = mddev->dev_sectors;
 	if (sector_nr >= max_sector) {
 		/* If we aborted, we need to abort the
 		 * sync on the 'current' bitmap chunk (there will
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index a4556d2e46bf..af54ac30a667 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3140,12 +3140,12 @@ static void raid10_set_cluster_sync_high(struct r10conf *conf)
  */
 
 static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
-			     int *skipped)
+				    sector_t max_sector, int *skipped)
 {
 	struct r10conf *conf = mddev->private;
 	struct r10bio *r10_bio;
 	struct bio *biolist = NULL, *bio;
-	sector_t max_sector, nr_sectors;
+	sector_t nr_sectors;
 	int i;
 	int max_sync;
 	sector_t sync_blocks;
@@ -3175,10 +3175,6 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 			return 0;
 
  skipped:
-	max_sector = mddev->dev_sectors;
-	if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ||
-	    test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
-		max_sector = mddev->resync_max_sectors;
 	if (sector_nr >= max_sector) {
 		conf->cluster_sync_low = 0;
 		conf->cluster_sync_high = 0;
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 2bd1ce9b3922..69a083ca41a3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6458,11 +6458,10 @@ static sector_t reshape_request(struct mddev *mddev, sector_t sector_nr, int *sk
 }
 
 static inline sector_t raid5_sync_request(struct mddev *mddev, sector_t sector_nr,
-					  int *skipped)
+					  sector_t max_sector, int *skipped)
 {
 	struct r5conf *conf = mddev->private;
 	struct stripe_head *sh;
-	sector_t max_sector = mddev->dev_sectors;
 	sector_t sync_blocks;
 	int still_degraded = 0;
 	int i;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 1/9] md: rearrange recovery_flage
  2024-05-09  1:18 ` [PATCH md-6.10 1/9] md: rearrange recovery_flage Yu Kuai
@ 2024-05-13 15:12   ` Mariusz Tkaczyk
  2024-05-14  1:36     ` Yu Kuai
  2024-05-14  5:51   ` Xiao Ni
  1 sibling, 1 reply; 31+ messages in thread
From: Mariusz Tkaczyk @ 2024-05-13 15:12 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, mpatocka, song, xni, dm-devel, linux-kernel,
	linux-raid, yukuai3, yi.zhang, yangerkun

On Thu,  9 May 2024 09:18:52 +0800
Yu Kuai <yukuai1@huaweicloud.com> wrote:

There is typo in subject.

> From: Yu Kuai <yukuai3@huawei.com>
> 
> Currently there are lots of flags and the names are confusing, since
> there are two main types of flags, sync thread runnng status and sync
> thread action, rearrange and update comment to improve code readability,
> there are no functional changes.
> 
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/md.h | 52 ++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 38 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 029dd0491a36..2a1cb7b889e5 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -551,22 +551,46 @@ struct mddev {
>  };
>  
>  enum recovery_flags {
> +	/* flags for sync thread running status */
> +
> +	/*
> +	 * set when one of sync action is set and new sync thread need to be
> +	 * registered, or just add/remove spares from conf.
> +	 */
> +	MD_RECOVERY_NEEDED,
> +	/* sync thread is running, or about to be started */
> +	MD_RECOVERY_RUNNING,
> +	/* sync thread needs to be aborted for some reason */
> +	MD_RECOVERY_INTR,
> +	/* sync thread is done and is waiting to be unregistered */
> +	MD_RECOVERY_DONE,
> +	/* running sync thread must abort immediately, and not restart */
> +	MD_RECOVERY_FROZEN,
> +	/* waiting for pers->start() to finish */
> +	MD_RECOVERY_WAIT,
> +	/* interrupted because io-error */
> +	MD_RECOVERY_ERROR,
> +
> +	/* flags determines sync action */
> +
> +	/* if just this flag is set, action is resync. */
> +	MD_RECOVERY_SYNC,
> +	/*
> +	 * paired with MD_RECOVERY_SYNC, if MD_RECOVERY_CHECK is not set,
> +	 * action is repair, means user requested resync.
> +	 */
> +	MD_RECOVERY_REQUESTED,
>  	/*
> -	 * If neither SYNC or RESHAPE are set, then it is a recovery.
> +	 * paired with MD_RECOVERY_SYNC and MD_RECOVERY_REQUESTED, action is
> +	 * check.
>  	 */
> -	MD_RECOVERY_RUNNING,	/* a thread is running, or about to be
> started */
> -	MD_RECOVERY_SYNC,	/* actually doing a resync, not a recovery
> */
> -	MD_RECOVERY_RECOVER,	/* doing recovery, or need to try it. */
> -	MD_RECOVERY_INTR,	/* resync needs to be aborted for some
> reason */
> -	MD_RECOVERY_DONE,	/* thread is done and is waiting to be
> reaped */
> -	MD_RECOVERY_NEEDED,	/* we might need to start a
> resync/recover */
> -	MD_RECOVERY_REQUESTED,	/* user-space has requested a sync
> (used with SYNC) */
> -	MD_RECOVERY_CHECK,	/* user-space request for check-only, no
> repair */
> -	MD_RECOVERY_RESHAPE,	/* A reshape is happening */
> -	MD_RECOVERY_FROZEN,	/* User request to abort, and not
> restart, any action */
> -	MD_RECOVERY_ERROR,	/* sync-action interrupted because
> io-error */
> -	MD_RECOVERY_WAIT,	/* waiting for pers->start() to finish */
> -	MD_RESYNCING_REMOTE,	/* remote node is running resync thread
> */
> +	MD_RECOVERY_CHECK,
> +	/* recovery, or need to try it */
> +	MD_RECOVERY_RECOVER,
> +	/* reshape */
> +	MD_RECOVERY_RESHAPE,
> +	/* remote node is running resync thread */
> +	MD_RESYNCING_REMOTE,
>  };
>  
>  enum md_ro_state {

I don't know if it is better readable but I know that Kernel coding style comes
with different approach. I used it for enum mddev_flags in md.h please take a
look.

Also, I get used to comment above, not below enum values but I don't have strong
justification here.

Thanks,
Mariusz


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 1/9] md: rearrange recovery_flage
  2024-05-13 15:12   ` Mariusz Tkaczyk
@ 2024-05-14  1:36     ` Yu Kuai
  0 siblings, 0 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-14  1:36 UTC (permalink / raw)
  To: Mariusz Tkaczyk, Yu Kuai
  Cc: agk, snitzer, mpatocka, song, xni, dm-devel, linux-kernel,
	linux-raid, yi.zhang, yangerkun, yukuai (C)

Hi,

在 2024/05/13 23:12, Mariusz Tkaczyk 写道:
> On Thu,  9 May 2024 09:18:52 +0800
> Yu Kuai <yukuai1@huaweicloud.com> wrote:
> 
> There is typo in subject.

Yes, :)
> 
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Currently there are lots of flags and the names are confusing, since
>> there are two main types of flags, sync thread runnng status and sync
>> thread action, rearrange and update comment to improve code readability,
>> there are no functional changes.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/md/md.h | 52 ++++++++++++++++++++++++++++++++++++-------------
>>   1 file changed, 38 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>> index 029dd0491a36..2a1cb7b889e5 100644
>> --- a/drivers/md/md.h
>> +++ b/drivers/md/md.h
>> @@ -551,22 +551,46 @@ struct mddev {
>>   };
>>   
>>   enum recovery_flags {
>> +	/* flags for sync thread running status */
>> +
>> +	/*
>> +	 * set when one of sync action is set and new sync thread need to be
>> +	 * registered, or just add/remove spares from conf.
>> +	 */
>> +	MD_RECOVERY_NEEDED,
>> +	/* sync thread is running, or about to be started */
>> +	MD_RECOVERY_RUNNING,
>> +	/* sync thread needs to be aborted for some reason */
>> +	MD_RECOVERY_INTR,
>> +	/* sync thread is done and is waiting to be unregistered */
>> +	MD_RECOVERY_DONE,
>> +	/* running sync thread must abort immediately, and not restart */
>> +	MD_RECOVERY_FROZEN,
>> +	/* waiting for pers->start() to finish */
>> +	MD_RECOVERY_WAIT,
>> +	/* interrupted because io-error */
>> +	MD_RECOVERY_ERROR,
>> +
>> +	/* flags determines sync action */
>> +
>> +	/* if just this flag is set, action is resync. */
>> +	MD_RECOVERY_SYNC,
>> +	/*
>> +	 * paired with MD_RECOVERY_SYNC, if MD_RECOVERY_CHECK is not set,
>> +	 * action is repair, means user requested resync.
>> +	 */
>> +	MD_RECOVERY_REQUESTED,
>>   	/*
>> -	 * If neither SYNC or RESHAPE are set, then it is a recovery.
>> +	 * paired with MD_RECOVERY_SYNC and MD_RECOVERY_REQUESTED, action is
>> +	 * check.
>>   	 */
>> -	MD_RECOVERY_RUNNING,	/* a thread is running, or about to be
>> started */
>> -	MD_RECOVERY_SYNC,	/* actually doing a resync, not a recovery
>> */
>> -	MD_RECOVERY_RECOVER,	/* doing recovery, or need to try it. */
>> -	MD_RECOVERY_INTR,	/* resync needs to be aborted for some
>> reason */
>> -	MD_RECOVERY_DONE,	/* thread is done and is waiting to be
>> reaped */
>> -	MD_RECOVERY_NEEDED,	/* we might need to start a
>> resync/recover */
>> -	MD_RECOVERY_REQUESTED,	/* user-space has requested a sync
>> (used with SYNC) */
>> -	MD_RECOVERY_CHECK,	/* user-space request for check-only, no
>> repair */
>> -	MD_RECOVERY_RESHAPE,	/* A reshape is happening */
>> -	MD_RECOVERY_FROZEN,	/* User request to abort, and not
>> restart, any action */
>> -	MD_RECOVERY_ERROR,	/* sync-action interrupted because
>> io-error */
>> -	MD_RECOVERY_WAIT,	/* waiting for pers->start() to finish */
>> -	MD_RESYNCING_REMOTE,	/* remote node is running resync thread
>> */
>> +	MD_RECOVERY_CHECK,
>> +	/* recovery, or need to try it */
>> +	MD_RECOVERY_RECOVER,
>> +	/* reshape */
>> +	MD_RECOVERY_RESHAPE,
>> +	/* remote node is running resync thread */
>> +	MD_RESYNCING_REMOTE,
>>   };
>>   
>>   enum md_ro_state {
> 
> I don't know if it is better readable but I know that Kernel coding style comes
> with different approach. I used it for enum mddev_flags in md.h please take a
> look.

1) There are two kinds of flags here, for sync thread running status and
for sync action, I think it's better to distinguish them.
2) The names are confusing, for sync thread status I prefer
SYNC_THREAD_xxx and for sync action I prefer SYNC_ACTION_xxx, I plan to
remove the flags to sync action and replace them with enum type in patch
2.

> 
> Also, I get used to comment above, not below enum values but I don't have strong
> justification here.

I do comment above here.

Thanks,
Kuai

> 
> Thanks,
> Mariusz
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 1/9] md: rearrange recovery_flage
  2024-05-09  1:18 ` [PATCH md-6.10 1/9] md: rearrange recovery_flage Yu Kuai
  2024-05-13 15:12   ` Mariusz Tkaczyk
@ 2024-05-14  5:51   ` Xiao Ni
  2024-05-14  6:10     ` Yu Kuai
  1 sibling, 1 reply; 31+ messages in thread
From: Xiao Ni @ 2024-05-14  5:51 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, mpatocka, song, dm-devel, linux-kernel, linux-raid,
	yukuai3, yi.zhang, yangerkun

On Mon, May 13, 2024 at 9:57 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Currently there are lots of flags and the names are confusing, since
> there are two main types of flags, sync thread runnng status and sync
> thread action, rearrange and update comment to improve code readability,
> there are no functional changes.
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/md.h | 52 ++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 38 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 029dd0491a36..2a1cb7b889e5 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -551,22 +551,46 @@ struct mddev {
>  };
>
>  enum recovery_flags {
> +       /* flags for sync thread running status */
> +
> +       /*
> +        * set when one of sync action is set and new sync thread need to be
> +        * registered, or just add/remove spares from conf.
> +        */
> +       MD_RECOVERY_NEEDED,
> +       /* sync thread is running, or about to be started */
> +       MD_RECOVERY_RUNNING,
> +       /* sync thread needs to be aborted for some reason */
> +       MD_RECOVERY_INTR,
> +       /* sync thread is done and is waiting to be unregistered */
> +       MD_RECOVERY_DONE,
> +       /* running sync thread must abort immediately, and not restart */
> +       MD_RECOVERY_FROZEN,
> +       /* waiting for pers->start() to finish */
> +       MD_RECOVERY_WAIT,
> +       /* interrupted because io-error */
> +       MD_RECOVERY_ERROR,
> +
> +       /* flags determines sync action */
> +
> +       /* if just this flag is set, action is resync. */
> +       MD_RECOVERY_SYNC,
> +       /*
> +        * paired with MD_RECOVERY_SYNC, if MD_RECOVERY_CHECK is not set,
> +        * action is repair, means user requested resync.
> +        */
> +       MD_RECOVERY_REQUESTED,
>         /*
> -        * If neither SYNC or RESHAPE are set, then it is a recovery.
> +        * paired with MD_RECOVERY_SYNC and MD_RECOVERY_REQUESTED, action is
> +        * check.

Hi Kuai

I did a test as follows:

echo check > /sys/block/md0/md/sync_action
I added some logs in md_do_sync to check these bits. It only prints
MD_RECOVERY_SYNC and MD_RECOVERY_CHECK without MD_RECOVERY_SYNC. So
the comment is not right?

Best Regards
Xiao

>          */
> -       MD_RECOVERY_RUNNING,    /* a thread is running, or about to be started */
> -       MD_RECOVERY_SYNC,       /* actually doing a resync, not a recovery */
> -       MD_RECOVERY_RECOVER,    /* doing recovery, or need to try it. */
> -       MD_RECOVERY_INTR,       /* resync needs to be aborted for some reason */
> -       MD_RECOVERY_DONE,       /* thread is done and is waiting to be reaped */
> -       MD_RECOVERY_NEEDED,     /* we might need to start a resync/recover */
> -       MD_RECOVERY_REQUESTED,  /* user-space has requested a sync (used with SYNC) */
> -       MD_RECOVERY_CHECK,      /* user-space request for check-only, no repair */
> -       MD_RECOVERY_RESHAPE,    /* A reshape is happening */
> -       MD_RECOVERY_FROZEN,     /* User request to abort, and not restart, any action */
> -       MD_RECOVERY_ERROR,      /* sync-action interrupted because io-error */
> -       MD_RECOVERY_WAIT,       /* waiting for pers->start() to finish */
> -       MD_RESYNCING_REMOTE,    /* remote node is running resync thread */
> +       MD_RECOVERY_CHECK,
> +       /* recovery, or need to try it */
> +       MD_RECOVERY_RECOVER,
> +       /* reshape */
> +       MD_RECOVERY_RESHAPE,
> +       /* remote node is running resync thread */
> +       MD_RESYNCING_REMOTE,
>  };
>
>  enum md_ro_state {
> --
> 2.39.2
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 1/9] md: rearrange recovery_flage
  2024-05-14  5:51   ` Xiao Ni
@ 2024-05-14  6:10     ` Yu Kuai
  2024-05-14  6:39       ` Xiao Ni
  0 siblings, 1 reply; 31+ messages in thread
From: Yu Kuai @ 2024-05-14  6:10 UTC (permalink / raw)
  To: Xiao Ni, Yu Kuai
  Cc: agk, snitzer, mpatocka, song, dm-devel, linux-kernel, linux-raid,
	yi.zhang, yangerkun, yukuai (C)

Hi,

在 2024/05/14 13:51, Xiao Ni 写道:
> On Mon, May 13, 2024 at 9:57 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Currently there are lots of flags and the names are confusing, since
>> there are two main types of flags, sync thread runnng status and sync
>> thread action, rearrange and update comment to improve code readability,
>> there are no functional changes.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/md/md.h | 52 ++++++++++++++++++++++++++++++++++++-------------
>>   1 file changed, 38 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>> index 029dd0491a36..2a1cb7b889e5 100644
>> --- a/drivers/md/md.h
>> +++ b/drivers/md/md.h
>> @@ -551,22 +551,46 @@ struct mddev {
>>   };
>>
>>   enum recovery_flags {
>> +       /* flags for sync thread running status */
>> +
>> +       /*
>> +        * set when one of sync action is set and new sync thread need to be
>> +        * registered, or just add/remove spares from conf.
>> +        */
>> +       MD_RECOVERY_NEEDED,
>> +       /* sync thread is running, or about to be started */
>> +       MD_RECOVERY_RUNNING,
>> +       /* sync thread needs to be aborted for some reason */
>> +       MD_RECOVERY_INTR,
>> +       /* sync thread is done and is waiting to be unregistered */
>> +       MD_RECOVERY_DONE,
>> +       /* running sync thread must abort immediately, and not restart */
>> +       MD_RECOVERY_FROZEN,
>> +       /* waiting for pers->start() to finish */
>> +       MD_RECOVERY_WAIT,
>> +       /* interrupted because io-error */
>> +       MD_RECOVERY_ERROR,
>> +
>> +       /* flags determines sync action */
>> +
>> +       /* if just this flag is set, action is resync. */
>> +       MD_RECOVERY_SYNC,
>> +       /*
>> +        * paired with MD_RECOVERY_SYNC, if MD_RECOVERY_CHECK is not set,
>> +        * action is repair, means user requested resync.
>> +        */
>> +       MD_RECOVERY_REQUESTED,
>>          /*
>> -        * If neither SYNC or RESHAPE are set, then it is a recovery.
>> +        * paired with MD_RECOVERY_SYNC and MD_RECOVERY_REQUESTED, action is
>> +        * check.
> 
> Hi Kuai
> 
> I did a test as follows:
> 
> echo check > /sys/block/md0/md/sync_action
> I added some logs in md_do_sync to check these bits. It only prints
> MD_RECOVERY_SYNC and MD_RECOVERY_CHECK without MD_RECOVERY_SYNC. So
> the comment is not right?

There is a typo, I'm not sure what you mean yet. Can you show how you
add your logs? I think comment is right.

 From action_store:

                 if (cmd_match(page, "check"))
                         set_bit(MD_RECOVERY_CHECK, &mddev->recovery);
                 else if (!cmd_match(page, "repair"))
                         return -EINVAL;
                 clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
                 set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
                 set_bit(MD_RECOVERY_SYNC, &mddev->recovery);

 From md_do_sync:

         if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
                 if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
                         desc = "data-check";
                         action = "check";
                 } else if (test_bit(MD_RECOVERY_REQUESTED, 
&mddev->recovery)) {
                         desc = "requested-resync";
                         action = "repair";
                 } else
                         desc = "resync";

Thanks,
Kuai

> 
> Best Regards
> Xiao
> 
>>           */
>> -       MD_RECOVERY_RUNNING,    /* a thread is running, or about to be started */
>> -       MD_RECOVERY_SYNC,       /* actually doing a resync, not a recovery */
>> -       MD_RECOVERY_RECOVER,    /* doing recovery, or need to try it. */
>> -       MD_RECOVERY_INTR,       /* resync needs to be aborted for some reason */
>> -       MD_RECOVERY_DONE,       /* thread is done and is waiting to be reaped */
>> -       MD_RECOVERY_NEEDED,     /* we might need to start a resync/recover */
>> -       MD_RECOVERY_REQUESTED,  /* user-space has requested a sync (used with SYNC) */
>> -       MD_RECOVERY_CHECK,      /* user-space request for check-only, no repair */
>> -       MD_RECOVERY_RESHAPE,    /* A reshape is happening */
>> -       MD_RECOVERY_FROZEN,     /* User request to abort, and not restart, any action */
>> -       MD_RECOVERY_ERROR,      /* sync-action interrupted because io-error */
>> -       MD_RECOVERY_WAIT,       /* waiting for pers->start() to finish */
>> -       MD_RESYNCING_REMOTE,    /* remote node is running resync thread */
>> +       MD_RECOVERY_CHECK,
>> +       /* recovery, or need to try it */
>> +       MD_RECOVERY_RECOVER,
>> +       /* reshape */
>> +       MD_RECOVERY_RESHAPE,
>> +       /* remote node is running resync thread */
>> +       MD_RESYNCING_REMOTE,
>>   };
>>
>>   enum md_ro_state {
>> --
>> 2.39.2
>>
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 2/9] md: add a new enum type sync_action
  2024-05-09  1:18 ` [PATCH md-6.10 2/9] md: add a new enum type sync_action Yu Kuai
@ 2024-05-14  6:13   ` Xiao Ni
  0 siblings, 0 replies; 31+ messages in thread
From: Xiao Ni @ 2024-05-14  6:13 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, mpatocka, song, dm-devel, linux-kernel, linux-raid,
	yukuai3, yi.zhang, yangerkun

On Mon, May 13, 2024 at 6:19 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> In order to make code related to sync_thread cleaner in following
> patches, also add detail comment about each sync action.
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/md.h | 57 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 56 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 2a1cb7b889e5..2edad966f90a 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -34,6 +34,61 @@
>   */
>  #define        MD_FAILFAST     (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT)
>
> +/* Status of sync thread. */
> +enum sync_action {
> +       /*
> +        * Represent by MD_RECOVERY_SYNC, start when:
> +        * 1) after assemble, sync data from first rdev to other copies, this
> +        * must be done first before other sync actions and will only execute
> +        * once;
> +        * 2) resize the array(notice that this is not reshape), sync data for
> +        * the new range;
> +        */
> +       ACTION_RESYNC,
> +       /*
> +        * Represent by MD_RECOVERY_RECOVER, start when:
> +        * 1) for new replacement, sync data based on the replace rdev or
> +        * available copies from other rdev;
> +        * 2) for new member disk while the array is degraded, sync data from
> +        * other rdev;
> +        * 3) reassemble after power failure or re-add a hot removed rdev, sync
> +        * data from first rdev to other copies based on bitmap;
> +        */
> +       ACTION_RECOVER,
> +       /*
> +        * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED |
> +        * MD_RECOVERY_CHECK, start when user echo "check" to sysfs api

Same question with patch 01 here

Regards
Xiao

> +        * sync_action, used to check if data copies from differenct rdev are
> +        * the same. The number of mismatch sectors will be exported to user
> +        * by sysfs api mismatch_cnt;
> +        */
> +       ACTION_CHECK,
> +       /*
> +        * Represent by MD_RECOVERY_SYNC | MD_RECOVERY_REQUESTED, start when
> +        * user echo "repair" to sysfs api sync_action, usually paired with
> +        * ACTION_CHECK, used to force syncing data once user found that there
> +        * are inconsistent data,
> +        */
> +       ACTION_REPAIR,
> +       /*
> +        * Represent by MD_RECOVERY_RESHAPE, start when new member disk is added
> +        * to the conf, notice that this is different from spares or
> +        * replacement;
> +        */
> +       ACTION_RESHAPE,
> +       /*
> +        * Represent by MD_RECOVERY_FROZEN, can be set by sysfs api sync_action
> +        * or internal usage like setting the array read-only, will forbid above
> +        * actions.
> +        */
> +       ACTION_FROZEN,
> +       /*
> +        * All above actions don't match.
> +        */
> +       ACTION_IDLE,
> +       NR_SYNC_ACTIONS,
> +};
> +
>  /*
>   * The struct embedded in rdev is used to serialize IO.
>   */
> @@ -571,7 +626,7 @@ enum recovery_flags {
>         /* interrupted because io-error */
>         MD_RECOVERY_ERROR,
>
> -       /* flags determines sync action */
> +       /* flags determines sync action, see details in enum sync_action */
>
>         /* if just this flag is set, action is resync. */
>         MD_RECOVERY_SYNC,
> --
> 2.39.2
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 1/9] md: rearrange recovery_flage
  2024-05-14  6:10     ` Yu Kuai
@ 2024-05-14  6:39       ` Xiao Ni
  0 siblings, 0 replies; 31+ messages in thread
From: Xiao Ni @ 2024-05-14  6:39 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, mpatocka, song, dm-devel, linux-kernel, linux-raid,
	yi.zhang, yangerkun, yukuai (C)

On Tue, May 14, 2024 at 2:16 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> 在 2024/05/14 13:51, Xiao Ni 写道:
> > On Mon, May 13, 2024 at 9:57 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >>
> >> From: Yu Kuai <yukuai3@huawei.com>
> >>
> >> Currently there are lots of flags and the names are confusing, since
> >> there are two main types of flags, sync thread runnng status and sync
> >> thread action, rearrange and update comment to improve code readability,
> >> there are no functional changes.
> >>
> >> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> >> ---
> >>   drivers/md/md.h | 52 ++++++++++++++++++++++++++++++++++++-------------
> >>   1 file changed, 38 insertions(+), 14 deletions(-)
> >>
> >> diff --git a/drivers/md/md.h b/drivers/md/md.h
> >> index 029dd0491a36..2a1cb7b889e5 100644
> >> --- a/drivers/md/md.h
> >> +++ b/drivers/md/md.h
> >> @@ -551,22 +551,46 @@ struct mddev {
> >>   };
> >>
> >>   enum recovery_flags {
> >> +       /* flags for sync thread running status */
> >> +
> >> +       /*
> >> +        * set when one of sync action is set and new sync thread need to be
> >> +        * registered, or just add/remove spares from conf.
> >> +        */
> >> +       MD_RECOVERY_NEEDED,
> >> +       /* sync thread is running, or about to be started */
> >> +       MD_RECOVERY_RUNNING,
> >> +       /* sync thread needs to be aborted for some reason */
> >> +       MD_RECOVERY_INTR,
> >> +       /* sync thread is done and is waiting to be unregistered */
> >> +       MD_RECOVERY_DONE,
> >> +       /* running sync thread must abort immediately, and not restart */
> >> +       MD_RECOVERY_FROZEN,
> >> +       /* waiting for pers->start() to finish */
> >> +       MD_RECOVERY_WAIT,
> >> +       /* interrupted because io-error */
> >> +       MD_RECOVERY_ERROR,
> >> +
> >> +       /* flags determines sync action */
> >> +
> >> +       /* if just this flag is set, action is resync. */
> >> +       MD_RECOVERY_SYNC,
> >> +       /*
> >> +        * paired with MD_RECOVERY_SYNC, if MD_RECOVERY_CHECK is not set,
> >> +        * action is repair, means user requested resync.
> >> +        */
> >> +       MD_RECOVERY_REQUESTED,
> >>          /*
> >> -        * If neither SYNC or RESHAPE are set, then it is a recovery.
> >> +        * paired with MD_RECOVERY_SYNC and MD_RECOVERY_REQUESTED, action is
> >> +        * check.
> >
> > Hi Kuai
> >
> > I did a test as follows:
> >
> > echo check > /sys/block/md0/md/sync_action
> > I added some logs in md_do_sync to check these bits. It only prints
> > MD_RECOVERY_SYNC and MD_RECOVERY_CHECK without MD_RECOVERY_SYNC. So
> > the comment is not right?
>
> There is a typo, I'm not sure what you mean yet. Can you show how you
> add your logs? I think comment is right.

Sorry. I added logs in the wrong places. You're right. Data check has
the three bits. So, the comments are right.


>
>  From action_store:
>
>                  if (cmd_match(page, "check"))
>                          set_bit(MD_RECOVERY_CHECK, &mddev->recovery);
>                  else if (!cmd_match(page, "repair"))
>                          return -EINVAL;
>                  clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>                  set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
>                  set_bit(MD_RECOVERY_SYNC, &mddev->recovery);

Yes.

Best Regards
Xiao

>
>  From md_do_sync:
>
>          if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
>                  if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)) {
>                          desc = "data-check";
>                          action = "check";
>                  } else if (test_bit(MD_RECOVERY_REQUESTED,
> &mddev->recovery)) {
>                          desc = "requested-resync";
>                          action = "repair";
>                  } else
>                          desc = "resync";
>
> Thanks,
> Kuai
>
> >
> > Best Regards
> > Xiao
> >
> >>           */
> >> -       MD_RECOVERY_RUNNING,    /* a thread is running, or about to be started */
> >> -       MD_RECOVERY_SYNC,       /* actually doing a resync, not a recovery */
> >> -       MD_RECOVERY_RECOVER,    /* doing recovery, or need to try it. */
> >> -       MD_RECOVERY_INTR,       /* resync needs to be aborted for some reason */
> >> -       MD_RECOVERY_DONE,       /* thread is done and is waiting to be reaped */
> >> -       MD_RECOVERY_NEEDED,     /* we might need to start a resync/recover */
> >> -       MD_RECOVERY_REQUESTED,  /* user-space has requested a sync (used with SYNC) */
> >> -       MD_RECOVERY_CHECK,      /* user-space request for check-only, no repair */
> >> -       MD_RECOVERY_RESHAPE,    /* A reshape is happening */
> >> -       MD_RECOVERY_FROZEN,     /* User request to abort, and not restart, any action */
> >> -       MD_RECOVERY_ERROR,      /* sync-action interrupted because io-error */
> >> -       MD_RECOVERY_WAIT,       /* waiting for pers->start() to finish */
> >> -       MD_RESYNCING_REMOTE,    /* remote node is running resync thread */
> >> +       MD_RECOVERY_CHECK,
> >> +       /* recovery, or need to try it */
> >> +       MD_RECOVERY_RECOVER,
> >> +       /* reshape */
> >> +       MD_RECOVERY_RESHAPE,
> >> +       /* remote node is running resync thread */
> >> +       MD_RESYNCING_REMOTE,
> >>   };
> >>
> >>   enum md_ro_state {
> >> --
> >> 2.39.2
> >>
> >
> >
> > .
> >
>
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action
  2024-05-09  1:18 ` [PATCH md-6.10 3/9] md: add new helpers for sync_action Yu Kuai
@ 2024-05-14  6:52   ` Xiao Ni
  2024-05-14  7:39     ` Yu Kuai
  2024-05-20 11:51   ` Su Yue
  1 sibling, 1 reply; 31+ messages in thread
From: Xiao Ni @ 2024-05-14  6:52 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, mpatocka, song, dm-devel, linux-kernel, linux-raid,
	yukuai3, yi.zhang, yangerkun

On Mon, May 13, 2024 at 5:31 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> The new helpers will get current sync_action of the array, will be used
> in later patches to make code cleaner.
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/md.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/md/md.h |  3 +++
>  2 files changed, 67 insertions(+)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 00bbafcd27bb..48ec35342d1b 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -69,6 +69,16 @@
>  #include "md-bitmap.h"
>  #include "md-cluster.h"
>
> +static char *action_name[NR_SYNC_ACTIONS] = {
> +       [ACTION_RESYNC]         = "resync",
> +       [ACTION_RECOVER]        = "recover",
> +       [ACTION_CHECK]          = "check",
> +       [ACTION_REPAIR]         = "repair",
> +       [ACTION_RESHAPE]        = "reshape",
> +       [ACTION_FROZEN]         = "frozen",
> +       [ACTION_IDLE]           = "idle",
> +};
> +
>  /* pers_list is a list of registered personalities protected by pers_lock. */
>  static LIST_HEAD(pers_list);
>  static DEFINE_SPINLOCK(pers_lock);
> @@ -4867,6 +4877,60 @@ metadata_store(struct mddev *mddev, const char *buf, size_t len)
>  static struct md_sysfs_entry md_metadata =
>  __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR, metadata_show, metadata_store);
>
> +enum sync_action md_sync_action(struct mddev *mddev)
> +{
> +       unsigned long recovery = mddev->recovery;
> +
> +       /*
> +        * frozen has the highest priority, means running sync_thread will be
> +        * stopped immediately, and no new sync_thread can start.
> +        */
> +       if (test_bit(MD_RECOVERY_FROZEN, &recovery))
> +               return ACTION_FROZEN;
> +
> +       /*
> +        * idle means no sync_thread is running, and no new sync_thread is
> +        * requested.
> +        */
> +       if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
> +           (!md_is_rdwr(mddev) || !test_bit(MD_RECOVERY_NEEDED, &recovery)))
> +               return ACTION_IDLE;

Hi Kuai

Can I think the above judgement cover these two situations:
1. The array is readonly / readauto and it doesn't have
MD_RECOVERY_RUNNING. Now maybe it has MD_RECOVERY_NEEDED, it means one
array may want to do some sync action, but the array state is not
readwrite and it can't start.
2. The array doesn't have MD_RECOVERY_RUNNING and MD_RECOVERY_NEEDED

> +
> +       if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
> +           mddev->reshape_position != MaxSector)
> +               return ACTION_RESHAPE;
> +
> +       if (test_bit(MD_RECOVERY_RECOVER, &recovery))
> +               return ACTION_RECOVER;
> +
> +       if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
> +               if (test_bit(MD_RECOVERY_CHECK, &recovery))
> +                       return ACTION_CHECK;
> +               if (test_bit(MD_RECOVERY_REQUESTED, &recovery))
> +                       return ACTION_REPAIR;
> +               return ACTION_RESYNC;
> +       }
> +
> +       return ACTION_IDLE;

Does it need this? I guess it's the reason in case there are other
situations, right?

Regards
Xiao

> +}
> +
> +enum sync_action md_sync_action_by_name(char *page)
> +{
> +       enum sync_action action;
> +
> +       for (action = 0; action < NR_SYNC_ACTIONS; ++action) {
> +               if (cmd_match(page, action_name[action]))
> +                       return action;
> +       }
> +
> +       return NR_SYNC_ACTIONS;
> +}
> +
> +char *md_sync_action_name(enum sync_action action)
> +{
> +       return action_name[action];
> +}
> +
>  static ssize_t
>  action_show(struct mddev *mddev, char *page)
>  {
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 2edad966f90a..72ca7a796df5 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -864,6 +864,9 @@ extern void md_unregister_thread(struct mddev *mddev, struct md_thread __rcu **t
>  extern void md_wakeup_thread(struct md_thread __rcu *thread);
>  extern void md_check_recovery(struct mddev *mddev);
>  extern void md_reap_sync_thread(struct mddev *mddev);
> +extern enum sync_action md_sync_action(struct mddev *mddev);
> +extern enum sync_action md_sync_action_by_name(char *page);
> +extern char *md_sync_action_name(enum sync_action action);
>  extern bool md_write_start(struct mddev *mddev, struct bio *bi);
>  extern void md_write_inc(struct mddev *mddev, struct bio *bi);
>  extern void md_write_end(struct mddev *mddev);
> --
> 2.39.2
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 8/9] md: factor out helpers for different sync_action in md_do_sync()
  2024-05-09  1:18 ` [PATCH md-6.10 8/9] md: factor out helpers for different sync_action in md_do_sync() Yu Kuai
@ 2024-05-14  7:27   ` Xiao Ni
  0 siblings, 0 replies; 31+ messages in thread
From: Xiao Ni @ 2024-05-14  7:27 UTC (permalink / raw)
  To: Yu Kuai, agk, snitzer, mpatocka, song
  Cc: dm-devel, linux-kernel, linux-raid, yukuai3, yi.zhang, yangerkun


在 2024/5/9 上午9:18, Yu Kuai 写道:
> From: Yu Kuai <yukuai3@huawei.com>
>
> Make code cleaner by replace if else if with switch, and it's more
> obvious now what is doning for each sync_action. There are no

Hi Kuai

type error s/doning/doing/g

Regards

Xiao

> functional changes.
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>   drivers/md/md.c | 123 ++++++++++++++++++++++++++++--------------------
>   1 file changed, 73 insertions(+), 50 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 2fc81175b46b..42db128b82d9 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -8928,6 +8928,77 @@ void md_allow_write(struct mddev *mddev)
>   }
>   EXPORT_SYMBOL_GPL(md_allow_write);
>   
> +static sector_t md_sync_max_sectors(struct mddev *mddev,
> +				    enum sync_action action)
> +{
> +	switch (action) {
> +	case ACTION_RESYNC:
> +	case ACTION_CHECK:
> +	case ACTION_REPAIR:
> +		atomic64_set(&mddev->resync_mismatches, 0);
> +		fallthrough;
> +	case ACTION_RESHAPE:
> +		return mddev->resync_max_sectors;
> +	case ACTION_RECOVER:
> +		return mddev->dev_sectors;
> +	default:
> +		return 0;
> +	}
> +}
> +
> +static sector_t md_sync_position(struct mddev *mddev, enum sync_action action)
> +{
> +	sector_t start = 0;
> +	struct md_rdev *rdev;
> +
> +	switch (action) {
> +	case ACTION_CHECK:
> +	case ACTION_REPAIR:
> +		return mddev->resync_min;
> +	case ACTION_RESYNC:
> +		if (!mddev->bitmap)
> +			return mddev->recovery_cp;
> +		return 0;
> +	case ACTION_RESHAPE:
> +		/*
> +		 * If the original node aborts reshaping then we continue the
> +		 * reshaping, so set again to avoid restart reshape from the
> +		 * first beginning
> +		 */
> +		if (mddev_is_clustered(mddev) &&
> +		    mddev->reshape_position != MaxSector)
> +			return mddev->reshape_position;
> +		return 0;
> +	case ACTION_RECOVER:
> +		start = MaxSector;
> +		rcu_read_lock();
> +		rdev_for_each_rcu(rdev, mddev)
> +			if (rdev->raid_disk >= 0 &&
> +			    !test_bit(Journal, &rdev->flags) &&
> +			    !test_bit(Faulty, &rdev->flags) &&
> +			    !test_bit(In_sync, &rdev->flags) &&
> +			    rdev->recovery_offset < start)
> +				start = rdev->recovery_offset;
> +		rcu_read_unlock();
> +
> +		/* If there is a bitmap, we need to make sure all
> +		 * writes that started before we added a spare
> +		 * complete before we start doing a recovery.
> +		 * Otherwise the write might complete and (via
> +		 * bitmap_endwrite) set a bit in the bitmap after the
> +		 * recovery has checked that bit and skipped that
> +		 * region.
> +		 */
> +		if (mddev->bitmap) {
> +			mddev->pers->quiesce(mddev, 1);
> +			mddev->pers->quiesce(mddev, 0);
> +		}
> +		return start;
> +	default:
> +		return MaxSector;
> +	}
> +}
> +
>   #define SYNC_MARKS	10
>   #define	SYNC_MARK_STEP	(3*HZ)
>   #define UPDATE_FREQUENCY (5*60*HZ)
> @@ -9046,56 +9117,8 @@ void md_do_sync(struct md_thread *thread)
>   		spin_unlock(&all_mddevs_lock);
>   	} while (mddev->curr_resync < MD_RESYNC_DELAYED);
>   
> -	j = 0;
> -	if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
> -		/* resync follows the size requested by the personality,
> -		 * which defaults to physical size, but can be virtual size
> -		 */
> -		max_sectors = mddev->resync_max_sectors;
> -		atomic64_set(&mddev->resync_mismatches, 0);
> -		/* we don't use the checkpoint if there's a bitmap */
> -		if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
> -			j = mddev->resync_min;
> -		else if (!mddev->bitmap)
> -			j = mddev->recovery_cp;
> -
> -	} else if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery)) {
> -		max_sectors = mddev->resync_max_sectors;
> -		/*
> -		 * If the original node aborts reshaping then we continue the
> -		 * reshaping, so set j again to avoid restart reshape from the
> -		 * first beginning
> -		 */
> -		if (mddev_is_clustered(mddev) &&
> -		    mddev->reshape_position != MaxSector)
> -			j = mddev->reshape_position;
> -	} else {
> -		/* recovery follows the physical size of devices */
> -		max_sectors = mddev->dev_sectors;
> -		j = MaxSector;
> -		rcu_read_lock();
> -		rdev_for_each_rcu(rdev, mddev)
> -			if (rdev->raid_disk >= 0 &&
> -			    !test_bit(Journal, &rdev->flags) &&
> -			    !test_bit(Faulty, &rdev->flags) &&
> -			    !test_bit(In_sync, &rdev->flags) &&
> -			    rdev->recovery_offset < j)
> -				j = rdev->recovery_offset;
> -		rcu_read_unlock();
> -
> -		/* If there is a bitmap, we need to make sure all
> -		 * writes that started before we added a spare
> -		 * complete before we start doing a recovery.
> -		 * Otherwise the write might complete and (via
> -		 * bitmap_endwrite) set a bit in the bitmap after the
> -		 * recovery has checked that bit and skipped that
> -		 * region.
> -		 */
> -		if (mddev->bitmap) {
> -			mddev->pers->quiesce(mddev, 1);
> -			mddev->pers->quiesce(mddev, 0);
> -		}
> -	}
> +	max_sectors = md_sync_max_sectors(mddev, action);
> +	j = md_sync_position(mddev, action);
>   
>   	pr_info("md: %s of RAID array %s\n", desc, mdname(mddev));
>   	pr_debug("md: minimum _guaranteed_  speed: %d KB/sec/disk.\n", speed_min(mddev));


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action
  2024-05-14  6:52   ` Xiao Ni
@ 2024-05-14  7:39     ` Yu Kuai
  2024-05-14  8:40       ` Xiao Ni
  0 siblings, 1 reply; 31+ messages in thread
From: Yu Kuai @ 2024-05-14  7:39 UTC (permalink / raw)
  To: Xiao Ni, Yu Kuai
  Cc: agk, snitzer, mpatocka, song, dm-devel, linux-kernel, linux-raid,
	yi.zhang, yangerkun, yukuai (C)

Hi,

在 2024/05/14 14:52, Xiao Ni 写道:
> On Mon, May 13, 2024 at 5:31 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> The new helpers will get current sync_action of the array, will be used
>> in later patches to make code cleaner.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/md/md.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++
>>   drivers/md/md.h |  3 +++
>>   2 files changed, 67 insertions(+)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 00bbafcd27bb..48ec35342d1b 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -69,6 +69,16 @@
>>   #include "md-bitmap.h"
>>   #include "md-cluster.h"
>>
>> +static char *action_name[NR_SYNC_ACTIONS] = {
>> +       [ACTION_RESYNC]         = "resync",
>> +       [ACTION_RECOVER]        = "recover",
>> +       [ACTION_CHECK]          = "check",
>> +       [ACTION_REPAIR]         = "repair",
>> +       [ACTION_RESHAPE]        = "reshape",
>> +       [ACTION_FROZEN]         = "frozen",
>> +       [ACTION_IDLE]           = "idle",
>> +};
>> +
>>   /* pers_list is a list of registered personalities protected by pers_lock. */
>>   static LIST_HEAD(pers_list);
>>   static DEFINE_SPINLOCK(pers_lock);
>> @@ -4867,6 +4877,60 @@ metadata_store(struct mddev *mddev, const char *buf, size_t len)
>>   static struct md_sysfs_entry md_metadata =
>>   __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR, metadata_show, metadata_store);
>>
>> +enum sync_action md_sync_action(struct mddev *mddev)
>> +{
>> +       unsigned long recovery = mddev->recovery;
>> +
>> +       /*
>> +        * frozen has the highest priority, means running sync_thread will be
>> +        * stopped immediately, and no new sync_thread can start.
>> +        */
>> +       if (test_bit(MD_RECOVERY_FROZEN, &recovery))
>> +               return ACTION_FROZEN;
>> +
>> +       /*
>> +        * idle means no sync_thread is running, and no new sync_thread is
>> +        * requested.
>> +        */
>> +       if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
>> +           (!md_is_rdwr(mddev) || !test_bit(MD_RECOVERY_NEEDED, &recovery)))
>> +               return ACTION_IDLE;
> 
> Hi Kuai
> 
> Can I think the above judgement cover these two situations:
> 1. The array is readonly / readauto and it doesn't have
> MD_RECOVERY_RUNNING. Now maybe it has MD_RECOVERY_NEEDED, it means one
> array may want to do some sync action, but the array state is not
> readwrite and it can't start.
> 2. The array doesn't have MD_RECOVERY_RUNNING and MD_RECOVERY_NEEDED
> 
>> +
>> +       if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
>> +           mddev->reshape_position != MaxSector)
>> +               return ACTION_RESHAPE;
>> +
>> +       if (test_bit(MD_RECOVERY_RECOVER, &recovery))
>> +               return ACTION_RECOVER;
>> +
>> +       if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
>> +               if (test_bit(MD_RECOVERY_CHECK, &recovery))
>> +                       return ACTION_CHECK;
>> +               if (test_bit(MD_RECOVERY_REQUESTED, &recovery))
>> +                       return ACTION_REPAIR;
>> +               return ACTION_RESYNC;
>> +       }
>> +
>> +       return ACTION_IDLE;
> 
> Does it need this? I guess it's the reason in case there are other
> situations, right?

Yes, we need this, because they are many places to set
MD_RECOVERY_NEEDED, while there are no sync action actually, this case
is 'idle'.

Thanks,
Kuai

> 
> Regards
> Xiao
> 
>> +}
>> +
>> +enum sync_action md_sync_action_by_name(char *page)
>> +{
>> +       enum sync_action action;
>> +
>> +       for (action = 0; action < NR_SYNC_ACTIONS; ++action) {
>> +               if (cmd_match(page, action_name[action]))
>> +                       return action;
>> +       }
>> +
>> +       return NR_SYNC_ACTIONS;
>> +}
>> +
>> +char *md_sync_action_name(enum sync_action action)
>> +{
>> +       return action_name[action];
>> +}
>> +
>>   static ssize_t
>>   action_show(struct mddev *mddev, char *page)
>>   {
>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>> index 2edad966f90a..72ca7a796df5 100644
>> --- a/drivers/md/md.h
>> +++ b/drivers/md/md.h
>> @@ -864,6 +864,9 @@ extern void md_unregister_thread(struct mddev *mddev, struct md_thread __rcu **t
>>   extern void md_wakeup_thread(struct md_thread __rcu *thread);
>>   extern void md_check_recovery(struct mddev *mddev);
>>   extern void md_reap_sync_thread(struct mddev *mddev);
>> +extern enum sync_action md_sync_action(struct mddev *mddev);
>> +extern enum sync_action md_sync_action_by_name(char *page);
>> +extern char *md_sync_action_name(enum sync_action action);
>>   extern bool md_write_start(struct mddev *mddev, struct bio *bi);
>>   extern void md_write_inc(struct mddev *mddev, struct bio *bi);
>>   extern void md_write_end(struct mddev *mddev);
>> --
>> 2.39.2
>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action
  2024-05-14  7:39     ` Yu Kuai
@ 2024-05-14  8:40       ` Xiao Ni
  2024-05-14  8:52         ` Yu Kuai
  0 siblings, 1 reply; 31+ messages in thread
From: Xiao Ni @ 2024-05-14  8:40 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, mpatocka, song, dm-devel, linux-kernel, linux-raid,
	yi.zhang, yangerkun, yukuai (C)

On Tue, May 14, 2024 at 3:39 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> 在 2024/05/14 14:52, Xiao Ni 写道:
> > On Mon, May 13, 2024 at 5:31 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >>
> >> From: Yu Kuai <yukuai3@huawei.com>
> >>
> >> The new helpers will get current sync_action of the array, will be used
> >> in later patches to make code cleaner.
> >>
> >> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> >> ---
> >>   drivers/md/md.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++
> >>   drivers/md/md.h |  3 +++
> >>   2 files changed, 67 insertions(+)
> >>
> >> diff --git a/drivers/md/md.c b/drivers/md/md.c
> >> index 00bbafcd27bb..48ec35342d1b 100644
> >> --- a/drivers/md/md.c
> >> +++ b/drivers/md/md.c
> >> @@ -69,6 +69,16 @@
> >>   #include "md-bitmap.h"
> >>   #include "md-cluster.h"
> >>
> >> +static char *action_name[NR_SYNC_ACTIONS] = {
> >> +       [ACTION_RESYNC]         = "resync",
> >> +       [ACTION_RECOVER]        = "recover",
> >> +       [ACTION_CHECK]          = "check",
> >> +       [ACTION_REPAIR]         = "repair",
> >> +       [ACTION_RESHAPE]        = "reshape",
> >> +       [ACTION_FROZEN]         = "frozen",
> >> +       [ACTION_IDLE]           = "idle",
> >> +};
> >> +
> >>   /* pers_list is a list of registered personalities protected by pers_lock. */
> >>   static LIST_HEAD(pers_list);
> >>   static DEFINE_SPINLOCK(pers_lock);
> >> @@ -4867,6 +4877,60 @@ metadata_store(struct mddev *mddev, const char *buf, size_t len)
> >>   static struct md_sysfs_entry md_metadata =
> >>   __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR, metadata_show, metadata_store);
> >>
> >> +enum sync_action md_sync_action(struct mddev *mddev)
> >> +{
> >> +       unsigned long recovery = mddev->recovery;
> >> +
> >> +       /*
> >> +        * frozen has the highest priority, means running sync_thread will be
> >> +        * stopped immediately, and no new sync_thread can start.
> >> +        */
> >> +       if (test_bit(MD_RECOVERY_FROZEN, &recovery))
> >> +               return ACTION_FROZEN;
> >> +
> >> +       /*
> >> +        * idle means no sync_thread is running, and no new sync_thread is
> >> +        * requested.
> >> +        */
> >> +       if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
> >> +           (!md_is_rdwr(mddev) || !test_bit(MD_RECOVERY_NEEDED, &recovery)))
> >> +               return ACTION_IDLE;
> >
> > Hi Kuai
> >
> > Can I think the above judgement cover these two situations:
> > 1. The array is readonly / readauto and it doesn't have
> > MD_RECOVERY_RUNNING. Now maybe it has MD_RECOVERY_NEEDED, it means one
> > array may want to do some sync action, but the array state is not
> > readwrite and it can't start.
> > 2. The array doesn't have MD_RECOVERY_RUNNING and MD_RECOVERY_NEEDED
> >
> >> +
> >> +       if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
> >> +           mddev->reshape_position != MaxSector)
> >> +               return ACTION_RESHAPE;
> >> +
> >> +       if (test_bit(MD_RECOVERY_RECOVER, &recovery))
> >> +               return ACTION_RECOVER;
> >> +
> >> +       if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
> >> +               if (test_bit(MD_RECOVERY_CHECK, &recovery))
> >> +                       return ACTION_CHECK;
> >> +               if (test_bit(MD_RECOVERY_REQUESTED, &recovery))
> >> +                       return ACTION_REPAIR;
> >> +               return ACTION_RESYNC;
> >> +       }
> >> +
> >> +       return ACTION_IDLE;
> >
> > Does it need this? I guess it's the reason in case there are other
> > situations, right?
>
> Yes, we need this, because they are many places to set
> MD_RECOVERY_NEEDED, while there are no sync action actually, this case
> is 'idle'.

To be frank, the logic in action_show is easier to understand than the
logic above. I have taken more than half an hour to think if the logic
here is right or not. In action_show, it only needs to think when it's
not idle and it's easy.

Now this patch logic needs to think in the opposite direction: when
it's idle. And it returns ACTION_IDLE at two places which means it
still needs to think about when it has sync action. So it's better to
keep the original logic in action_show now. It's just my 2 cents
point.

Best Regards
Xiao

>
> Thanks,
> Kuai
>
> >
> > Regards
> > Xiao
> >
> >> +}
> >> +
> >> +enum sync_action md_sync_action_by_name(char *page)
> >> +{
> >> +       enum sync_action action;
> >> +
> >> +       for (action = 0; action < NR_SYNC_ACTIONS; ++action) {
> >> +               if (cmd_match(page, action_name[action]))
> >> +                       return action;
> >> +       }
> >> +
> >> +       return NR_SYNC_ACTIONS;
> >> +}
> >> +
> >> +char *md_sync_action_name(enum sync_action action)
> >> +{
> >> +       return action_name[action];
> >> +}
> >> +
> >>   static ssize_t
> >>   action_show(struct mddev *mddev, char *page)
> >>   {
> >> diff --git a/drivers/md/md.h b/drivers/md/md.h
> >> index 2edad966f90a..72ca7a796df5 100644
> >> --- a/drivers/md/md.h
> >> +++ b/drivers/md/md.h
> >> @@ -864,6 +864,9 @@ extern void md_unregister_thread(struct mddev *mddev, struct md_thread __rcu **t
> >>   extern void md_wakeup_thread(struct md_thread __rcu *thread);
> >>   extern void md_check_recovery(struct mddev *mddev);
> >>   extern void md_reap_sync_thread(struct mddev *mddev);
> >> +extern enum sync_action md_sync_action(struct mddev *mddev);
> >> +extern enum sync_action md_sync_action_by_name(char *page);
> >> +extern char *md_sync_action_name(enum sync_action action);
> >>   extern bool md_write_start(struct mddev *mddev, struct bio *bi);
> >>   extern void md_write_inc(struct mddev *mddev, struct bio *bi);
> >>   extern void md_write_end(struct mddev *mddev);
> >> --
> >> 2.39.2
> >>
> >
> > .
> >
>
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action
  2024-05-14  8:40       ` Xiao Ni
@ 2024-05-14  8:52         ` Yu Kuai
  0 siblings, 0 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-14  8:52 UTC (permalink / raw)
  To: Xiao Ni, Yu Kuai
  Cc: agk, snitzer, mpatocka, song, dm-devel, linux-kernel, linux-raid,
	yi.zhang, yangerkun, yukuai (C)



在 2024/05/14 16:40, Xiao Ni 写道:
> On Tue, May 14, 2024 at 3:39 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> Hi,
>>
>> 在 2024/05/14 14:52, Xiao Ni 写道:
>>> On Mon, May 13, 2024 at 5:31 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>>>
>>>> From: Yu Kuai <yukuai3@huawei.com>
>>>>
>>>> The new helpers will get current sync_action of the array, will be used
>>>> in later patches to make code cleaner.
>>>>
>>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>>> ---
>>>>    drivers/md/md.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    drivers/md/md.h |  3 +++
>>>>    2 files changed, 67 insertions(+)
>>>>
>>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>>> index 00bbafcd27bb..48ec35342d1b 100644
>>>> --- a/drivers/md/md.c
>>>> +++ b/drivers/md/md.c
>>>> @@ -69,6 +69,16 @@
>>>>    #include "md-bitmap.h"
>>>>    #include "md-cluster.h"
>>>>
>>>> +static char *action_name[NR_SYNC_ACTIONS] = {
>>>> +       [ACTION_RESYNC]         = "resync",
>>>> +       [ACTION_RECOVER]        = "recover",
>>>> +       [ACTION_CHECK]          = "check",
>>>> +       [ACTION_REPAIR]         = "repair",
>>>> +       [ACTION_RESHAPE]        = "reshape",
>>>> +       [ACTION_FROZEN]         = "frozen",
>>>> +       [ACTION_IDLE]           = "idle",
>>>> +};
>>>> +
>>>>    /* pers_list is a list of registered personalities protected by pers_lock. */
>>>>    static LIST_HEAD(pers_list);
>>>>    static DEFINE_SPINLOCK(pers_lock);
>>>> @@ -4867,6 +4877,60 @@ metadata_store(struct mddev *mddev, const char *buf, size_t len)
>>>>    static struct md_sysfs_entry md_metadata =
>>>>    __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR, metadata_show, metadata_store);
>>>>
>>>> +enum sync_action md_sync_action(struct mddev *mddev)
>>>> +{
>>>> +       unsigned long recovery = mddev->recovery;
>>>> +
>>>> +       /*
>>>> +        * frozen has the highest priority, means running sync_thread will be
>>>> +        * stopped immediately, and no new sync_thread can start.
>>>> +        */
>>>> +       if (test_bit(MD_RECOVERY_FROZEN, &recovery))
>>>> +               return ACTION_FROZEN;
>>>> +
>>>> +       /*
>>>> +        * idle means no sync_thread is running, and no new sync_thread is
>>>> +        * requested.
>>>> +        */
>>>> +       if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
>>>> +           (!md_is_rdwr(mddev) || !test_bit(MD_RECOVERY_NEEDED, &recovery)))
>>>> +               return ACTION_IDLE;
>>>
>>> Hi Kuai
>>>
>>> Can I think the above judgement cover these two situations:
>>> 1. The array is readonly / readauto and it doesn't have
>>> MD_RECOVERY_RUNNING. Now maybe it has MD_RECOVERY_NEEDED, it means one
>>> array may want to do some sync action, but the array state is not
>>> readwrite and it can't start.
>>> 2. The array doesn't have MD_RECOVERY_RUNNING and MD_RECOVERY_NEEDED
>>>
>>>> +
>>>> +       if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
>>>> +           mddev->reshape_position != MaxSector)
>>>> +               return ACTION_RESHAPE;
>>>> +
>>>> +       if (test_bit(MD_RECOVERY_RECOVER, &recovery))
>>>> +               return ACTION_RECOVER;
>>>> +
>>>> +       if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
>>>> +               if (test_bit(MD_RECOVERY_CHECK, &recovery))
>>>> +                       return ACTION_CHECK;
>>>> +               if (test_bit(MD_RECOVERY_REQUESTED, &recovery))
>>>> +                       return ACTION_REPAIR;
>>>> +               return ACTION_RESYNC;
>>>> +       }
>>>> +
>>>> +       return ACTION_IDLE;
>>>
>>> Does it need this? I guess it's the reason in case there are other
>>> situations, right?
>>
>> Yes, we need this, because they are many places to set
>> MD_RECOVERY_NEEDED, while there are no sync action actually, this case
>> is 'idle'.
> 
> To be frank, the logic in action_show is easier to understand than the
> logic above. I have taken more than half an hour to think if the logic
> here is right or not. In action_show, it only needs to think when it's
> not idle and it's easy.
> 
> Now this patch logic needs to think in the opposite direction: when
> it's idle. And it returns ACTION_IDLE at two places which means it
> still needs to think about when it has sync action. So it's better to
> keep the original logic in action_show now. It's just my 2 cents
> point.

Hi,

but the logical is exactlly the same as action_show(), and there are
no functional changes. I just remove the local variable and return
early, because I think code is cleaner this way...

action_show:

char *type = "idle"

if (test_bit() || xxx) {

  if (xxx)
   type ="reshape"
  else if(xxx)
   type ="resync/check/repair"
  else if(xxx)
   type = "recover"
  else if (xxx)
   type = "reshape"
  -> else is idle
}
-> else is idle

The above two place are corresponding to the new code to return
ACTION_IDLE.

Thanks,
Kuai

> 
> Best Regards
> Xiao
> 
>>
>> Thanks,
>> Kuai
>>
>>>
>>> Regards
>>> Xiao
>>>
>>>> +}
>>>> +
>>>> +enum sync_action md_sync_action_by_name(char *page)
>>>> +{
>>>> +       enum sync_action action;
>>>> +
>>>> +       for (action = 0; action < NR_SYNC_ACTIONS; ++action) {
>>>> +               if (cmd_match(page, action_name[action]))
>>>> +                       return action;
>>>> +       }
>>>> +
>>>> +       return NR_SYNC_ACTIONS;
>>>> +}
>>>> +
>>>> +char *md_sync_action_name(enum sync_action action)
>>>> +{
>>>> +       return action_name[action];
>>>> +}
>>>> +
>>>>    static ssize_t
>>>>    action_show(struct mddev *mddev, char *page)
>>>>    {
>>>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>>>> index 2edad966f90a..72ca7a796df5 100644
>>>> --- a/drivers/md/md.h
>>>> +++ b/drivers/md/md.h
>>>> @@ -864,6 +864,9 @@ extern void md_unregister_thread(struct mddev *mddev, struct md_thread __rcu **t
>>>>    extern void md_wakeup_thread(struct md_thread __rcu *thread);
>>>>    extern void md_check_recovery(struct mddev *mddev);
>>>>    extern void md_reap_sync_thread(struct mddev *mddev);
>>>> +extern enum sync_action md_sync_action(struct mddev *mddev);
>>>> +extern enum sync_action md_sync_action_by_name(char *page);
>>>> +extern char *md_sync_action_name(enum sync_action action);
>>>>    extern bool md_write_start(struct mddev *mddev, struct bio *bi);
>>>>    extern void md_write_inc(struct mddev *mddev, struct bio *bi);
>>>>    extern void md_write_end(struct mddev *mddev);
>>>> --
>>>> 2.39.2
>>>>
>>>
>>> .
>>>
>>
>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action
  2024-05-09  1:18 ` [PATCH md-6.10 3/9] md: add new helpers for sync_action Yu Kuai
  2024-05-14  6:52   ` Xiao Ni
@ 2024-05-20 11:51   ` Su Yue
  2024-05-21  2:30     ` Yu Kuai
  2024-05-21  3:25     ` Xiao Ni
  1 sibling, 2 replies; 31+ messages in thread
From: Su Yue @ 2024-05-20 11:51 UTC (permalink / raw)
  To: Yu Kuai
  Cc: agk, snitzer, mpatocka, song, xni, dm-devel, linux-kernel,
	linux-raid, yukuai3, yi.zhang, yangerkun


On Thu 09 May 2024 at 09:18, Yu Kuai <yukuai1@huaweicloud.com> 
wrote:

> From: Yu Kuai <yukuai3@huawei.com>
>
> The new helpers will get current sync_action of the array, will 
> be used
> in later patches to make code cleaner.
>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/md.c | 64 
>  +++++++++++++++++++++++++++++++++++++++++++++++++
>  drivers/md/md.h |  3 +++
>  2 files changed, 67 insertions(+)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 00bbafcd27bb..48ec35342d1b 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -69,6 +69,16 @@
>  #include "md-bitmap.h"
>  #include "md-cluster.h"
>
> +static char *action_name[NR_SYNC_ACTIONS] = {
>

Th array will not be modified, so:

static const char * const action_names[NR_SYNC_ACTIONS]

> +	[ACTION_RESYNC]		= "resync",
> +	[ACTION_RECOVER]	= "recover",
> +	[ACTION_CHECK]		= "check",
> +	[ACTION_REPAIR]		= "repair",
> +	[ACTION_RESHAPE]	= "reshape",
> +	[ACTION_FROZEN]		= "frozen",
> +	[ACTION_IDLE]		= "idle",
> +};
> +
>  /* pers_list is a list of registered personalities protected by 
>  pers_lock. */
>  static LIST_HEAD(pers_list);
>  static DEFINE_SPINLOCK(pers_lock);
> @@ -4867,6 +4877,60 @@ metadata_store(struct mddev *mddev, const 
> char *buf, size_t len)
>  static struct md_sysfs_entry md_metadata =
>  __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR, 
>  metadata_show, metadata_store);
>
> +enum sync_action md_sync_action(struct mddev *mddev)
> +{
> +	unsigned long recovery = mddev->recovery;
> +
> +	/*
> +	 * frozen has the highest priority, means running sync_thread 
> will be
> +	 * stopped immediately, and no new sync_thread can start.
> +	 */
> +	if (test_bit(MD_RECOVERY_FROZEN, &recovery))
> +		return ACTION_FROZEN;
> +
> +	/*
> +	 * idle means no sync_thread is running, and no new 
> sync_thread is
> +	 * requested.
> +	 */
> +	if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
> +	    (!md_is_rdwr(mddev) || !test_bit(MD_RECOVERY_NEEDED, 
> &recovery)))
> +		return ACTION_IDLE;
My brain was lost sometimes looking into nested conditions of md 
code...
I agree with Xiao Ni's suggestion that more comments about the 
array
state should be added.

> +	if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
> +	    mddev->reshape_position != MaxSector)
> +		return ACTION_RESHAPE;
> +
> +	if (test_bit(MD_RECOVERY_RECOVER, &recovery))
> +		return ACTION_RECOVER;
> +
>
In action_show, MD_RECOVERY_SYNC is tested first then 
MD_RECOVERY_RECOVER.
After looking through the logic of MD_RECOVERY_RECOVER 
clear/set_bit, the
change is fine to me. However, better to follow old pattern unless 
there
have resons.


> +	if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
> +		if (test_bit(MD_RECOVERY_CHECK, &recovery))
> +			return ACTION_CHECK;
> +		if (test_bit(MD_RECOVERY_REQUESTED, &recovery))
> +			return ACTION_REPAIR;
> +		return ACTION_RESYNC;
> +	}
> +
> +	return ACTION_IDLE;
> +}
> +
> +enum sync_action md_sync_action_by_name(char *page)
> +{
> +	enum sync_action action;
> +
> +	for (action = 0; action < NR_SYNC_ACTIONS; ++action) {
> +		if (cmd_match(page, action_name[action]))
> +			return action;
> +	}
> +
> +	return NR_SYNC_ACTIONS;
> +}
> +
> +char *md_sync_action_name(enum sync_action action)
>

And 'const char *'

--
Su

> +{
> +	return action_name[action];
> +}
> +
>  static ssize_t
>  action_show(struct mddev *mddev, char *page)
>  {
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 2edad966f90a..72ca7a796df5 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -864,6 +864,9 @@ extern void md_unregister_thread(struct 
> mddev *mddev, struct md_thread __rcu **t
>  extern void md_wakeup_thread(struct md_thread __rcu *thread);
>  extern void md_check_recovery(struct mddev *mddev);
>  extern void md_reap_sync_thread(struct mddev *mddev);
> +extern enum sync_action md_sync_action(struct mddev *mddev);
> +extern enum sync_action md_sync_action_by_name(char *page);
> +extern char *md_sync_action_name(enum sync_action action);
>  extern bool md_write_start(struct mddev *mddev, struct bio 
>  *bi);
>  extern void md_write_inc(struct mddev *mddev, struct bio *bi);
>  extern void md_write_end(struct mddev *mddev);

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
  2024-05-09  1:18 ` [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers Yu Kuai
@ 2024-05-20 15:01   ` kernel test robot
  2024-05-21  2:20     ` Yu Kuai
  0 siblings, 1 reply; 31+ messages in thread
From: kernel test robot @ 2024-05-20 15:01 UTC (permalink / raw)
  To: Yu Kuai
  Cc: oe-lkp, lkp, linux-raid, agk, snitzer, mpatocka, song, xni,
	dm-devel, linux-kernel, yukuai3, yukuai1, yi.zhang, yangerkun,
	oliver.sang



Hello,

kernel test robot noticed "mdadm-selftests.07reshape5intr.fail" on:

commit: 18effaab5f57ef44763e537c782f905e06f6c4f5 ("[PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers")
url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/md-rearrange-recovery_flage/20240509-093248
base: https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git for-next
patch link: https://lore.kernel.org/all/20240509011900.2694291-6-yukuai1@huaweicloud.com/
patch subject: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers

in testcase: mdadm-selftests
version: mdadm-selftests-x86_64-5f41845-1_20240412
with following parameters:

	disk: 1HDD
	test_prefix: 07reshape5intr



compiler: gcc-13
test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz (Haswell) with 16G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202405202204.4e3dc662-oliver.sang@intel.com

2024-05-14 21:36:26 mkdir -p /var/tmp
2024-05-14 21:36:26 mke2fs -t ext3 -b 4096 -J size=4 -q /dev/sda1
2024-05-14 21:36:57 mount -t ext3 /dev/sda1 /var/tmp
sed -e 's/{DEFAULT_METADATA}/1.2/g' \
-e 's,{MAP_PATH},/run/mdadm/map,g'  mdadm.8.in > mdadm.8
/usr/bin/install -D -m 644 mdadm.8 /usr/share/man/man8/mdadm.8
/usr/bin/install -D -m 644 mdmon.8 /usr/share/man/man8/mdmon.8
/usr/bin/install -D -m 644 md.4 /usr/share/man/man4/md.4
/usr/bin/install -D -m 644 mdadm.conf.5 /usr/share/man/man5/mdadm.conf.5
/usr/bin/install -D -m 644 udev-md-raid-creating.rules /lib/udev/rules.d/01-md-raid-creating.rules
/usr/bin/install -D -m 644 udev-md-raid-arrays.rules /lib/udev/rules.d/63-md-raid-arrays.rules
/usr/bin/install -D -m 644 udev-md-raid-assembly.rules /lib/udev/rules.d/64-md-raid-assembly.rules
/usr/bin/install -D -m 644 udev-md-clustered-confirm-device.rules /lib/udev/rules.d/69-md-clustered-confirm-device.rules
/usr/bin/install -D  -m 755 mdadm /sbin/mdadm
/usr/bin/install -D  -m 755 mdmon /sbin/mdmon
Testing on linux-6.9.0-rc2-00012-g18effaab5f57 kernel
/lkp/benchmarks/mdadm-selftests/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for details



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240520/202405202204.4e3dc662-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
  2024-05-20 15:01   ` kernel test robot
@ 2024-05-21  2:20     ` Yu Kuai
  2024-05-21  3:01       ` Oliver Sang
  0 siblings, 1 reply; 31+ messages in thread
From: Yu Kuai @ 2024-05-21  2:20 UTC (permalink / raw)
  To: kernel test robot, Yu Kuai
  Cc: oe-lkp, lkp, linux-raid, agk, snitzer, mpatocka, song, xni,
	dm-devel, linux-kernel, yi.zhang, yangerkun, yukuai (C)

Hi,

在 2024/05/20 23:01, kernel test robot 写道:
> 
> 
> Hello,
> 
> kernel test robot noticed "mdadm-selftests.07reshape5intr.fail" on:
> 
> commit: 18effaab5f57ef44763e537c782f905e06f6c4f5 ("[PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers")
> url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/md-rearrange-recovery_flage/20240509-093248
> base: https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git for-next
> patch link: https://lore.kernel.org/all/20240509011900.2694291-6-yukuai1@huaweicloud.com/
> patch subject: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
> 
> in testcase: mdadm-selftests
> version: mdadm-selftests-x86_64-5f41845-1_20240412
> with following parameters:
> 
> 	disk: 1HDD
> 	test_prefix: 07reshape5intr
> 
> 
> 
> compiler: gcc-13
> test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz (Haswell) with 16G memory
> 
> (please refer to attached dmesg/kmsg for entire log/backtrace)
> 
> 
> 
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202405202204.4e3dc662-oliver.sang@intel.com
> 
> 2024-05-14 21:36:26 mkdir -p /var/tmp
> 2024-05-14 21:36:26 mke2fs -t ext3 -b 4096 -J size=4 -q /dev/sda1
> 2024-05-14 21:36:57 mount -t ext3 /dev/sda1 /var/tmp
> sed -e 's/{DEFAULT_METADATA}/1.2/g' \
> -e 's,{MAP_PATH},/run/mdadm/map,g'  mdadm.8.in > mdadm.8
> /usr/bin/install -D -m 644 mdadm.8 /usr/share/man/man8/mdadm.8
> /usr/bin/install -D -m 644 mdmon.8 /usr/share/man/man8/mdmon.8
> /usr/bin/install -D -m 644 md.4 /usr/share/man/man4/md.4
> /usr/bin/install -D -m 644 mdadm.conf.5 /usr/share/man/man5/mdadm.conf.5
> /usr/bin/install -D -m 644 udev-md-raid-creating.rules /lib/udev/rules.d/01-md-raid-creating.rules
> /usr/bin/install -D -m 644 udev-md-raid-arrays.rules /lib/udev/rules.d/63-md-raid-arrays.rules
> /usr/bin/install -D -m 644 udev-md-raid-assembly.rules /lib/udev/rules.d/64-md-raid-assembly.rules
> /usr/bin/install -D -m 644 udev-md-clustered-confirm-device.rules /lib/udev/rules.d/69-md-clustered-confirm-device.rules
> /usr/bin/install -D  -m 755 mdadm /sbin/mdadm
> /usr/bin/install -D  -m 755 mdmon /sbin/mdmon
> Testing on linux-6.9.0-rc2-00012-g18effaab5f57 kernel
> /lkp/benchmarks/mdadm-selftests/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for detail
[root@fedora mdadm]# ./test --dev=loop --tests=07reshape5intr
test: skipping tests for multipath, which is removed in upstream 6.8+ 
kernels
test: skipping tests for linear, which is removed in upstream 6.8+ kernels
Testing on linux-6.9.0-rc2-00023-gf092583596a2 kernel
/root/mdadm/tests/07reshape5intr... FAILED - see 
/var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for details
   (KNOWN BROKEN TEST: always fails)

So, since this test is marked BROKEN.

Please share the whole log, and is it possible to share the two logs?

Thanks,
Kuai

> 
> 
> 
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240520/202405202204.4e3dc662-oliver.sang@intel.com
> 
> 
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action
  2024-05-20 11:51   ` Su Yue
@ 2024-05-21  2:30     ` Yu Kuai
  2024-05-21  3:25     ` Xiao Ni
  1 sibling, 0 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-21  2:30 UTC (permalink / raw)
  To: Su Yue, Yu Kuai
  Cc: agk, snitzer, mpatocka, song, xni, dm-devel, linux-kernel,
	linux-raid, yi.zhang, yangerkun, yukuai (C)

Hi,

在 2024/05/20 19:51, Su Yue 写道:
> 
> On Thu 09 May 2024 at 09:18, Yu Kuai <yukuai1@huaweicloud.com> wrote:
> 
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> The new helpers will get current sync_action of the array, will be used
>> in later patches to make code cleaner.
>>
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>  drivers/md/md.c | 64  +++++++++++++++++++++++++++++++++++++++++++++++++
>>  drivers/md/md.h |  3 +++
>>  2 files changed, 67 insertions(+)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 00bbafcd27bb..48ec35342d1b 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -69,6 +69,16 @@
>>  #include "md-bitmap.h"
>>  #include "md-cluster.h"
>>
>> +static char *action_name[NR_SYNC_ACTIONS] = {
>>
> 
> Th array will not be modified, so:
> 
> static const char * const action_names[NR_SYNC_ACTIONS]

Yes, this make sense.
> 
>> +    [ACTION_RESYNC]        = "resync",
>> +    [ACTION_RECOVER]    = "recover",
>> +    [ACTION_CHECK]        = "check",
>> +    [ACTION_REPAIR]        = "repair",
>> +    [ACTION_RESHAPE]    = "reshape",
>> +    [ACTION_FROZEN]        = "frozen",
>> +    [ACTION_IDLE]        = "idle",
>> +};
>> +
>>  /* pers_list is a list of registered personalities protected by 
>>  pers_lock. */
>>  static LIST_HEAD(pers_list);
>>  static DEFINE_SPINLOCK(pers_lock);
>> @@ -4867,6 +4877,60 @@ metadata_store(struct mddev *mddev, const char 
>> *buf, size_t len)
>>  static struct md_sysfs_entry md_metadata =
>>  __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR,  metadata_show, 
>> metadata_store);
>>
>> +enum sync_action md_sync_action(struct mddev *mddev)
>> +{
>> +    unsigned long recovery = mddev->recovery;
>> +
>> +    /*
>> +     * frozen has the highest priority, means running sync_thread 
>> will be
>> +     * stopped immediately, and no new sync_thread can start.
>> +     */
>> +    if (test_bit(MD_RECOVERY_FROZEN, &recovery))
>> +        return ACTION_FROZEN;
>> +
>> +    /*
>> +     * idle means no sync_thread is running, and no new sync_thread is
>> +     * requested.
>> +     */
>> +    if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
>> +        (!md_is_rdwr(mddev) || !test_bit(MD_RECOVERY_NEEDED, 
>> &recovery)))
>> +        return ACTION_IDLE;
> My brain was lost sometimes looking into nested conditions of md code...
> I agree with Xiao Ni's suggestion that more comments about the array
> state should be added.

Okay, perhaps you're refering md_is_rdwr()? which is supposed to be
related to "no new sync_thread is requestd", perhaps following is
better:

/* only read-write array can start sync_thread */
if (!(md_is_rdwr(mddev))
	return ACTION_IDLE;

/* sync_thread is not running, and no new sync_thread is requested */
if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
     !test_bit(MD_RECOVERY_NEEDED, &recovery)
	return ACTION_IDLE;

> 
>> +    if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
>> +        mddev->reshape_position != MaxSector)
>> +        return ACTION_RESHAPE;
>> +
>> +    if (test_bit(MD_RECOVERY_RECOVER, &recovery))
>> +        return ACTION_RECOVER;
>> +
>>
> In action_show, MD_RECOVERY_SYNC is tested first then MD_RECOVERY_RECOVER.
> After looking through the logic of MD_RECOVERY_RECOVER clear/set_bit, the
> change is fine to me. However, better to follow old pattern unless there
> have resons.

It's just because MD_RECOVERY_SYNC is more complicated, and I move it to
the last, just programming habits. :)
> 
> 
>> +    if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
>> +        if (test_bit(MD_RECOVERY_CHECK, &recovery))
>> +            return ACTION_CHECK;
>> +        if (test_bit(MD_RECOVERY_REQUESTED, &recovery))
>> +            return ACTION_REPAIR;
>> +        return ACTION_RESYNC;
>> +    }
>> +
>> +    return ACTION_IDLE;
>> +}
>> +
>> +enum sync_action md_sync_action_by_name(char *page)
>> +{
>> +    enum sync_action action;
>> +
>> +    for (action = 0; action < NR_SYNC_ACTIONS; ++action) {
>> +        if (cmd_match(page, action_name[action]))
>> +            return action;
>> +    }
>> +
>> +    return NR_SYNC_ACTIONS;
>> +}
>> +
>> +char *md_sync_action_name(enum sync_action action)
>>
> 
> And 'const char *'

Yes

Thanks,
Kuai

> 
> -- 
> Su
> 
>> +{
>> +    return action_name[action];
>> +}
>> +
>>  static ssize_t
>>  action_show(struct mddev *mddev, char *page)
>>  {
>> diff --git a/drivers/md/md.h b/drivers/md/md.h
>> index 2edad966f90a..72ca7a796df5 100644
>> --- a/drivers/md/md.h
>> +++ b/drivers/md/md.h
>> @@ -864,6 +864,9 @@ extern void md_unregister_thread(struct mddev 
>> *mddev, struct md_thread __rcu **t
>>  extern void md_wakeup_thread(struct md_thread __rcu *thread);
>>  extern void md_check_recovery(struct mddev *mddev);
>>  extern void md_reap_sync_thread(struct mddev *mddev);
>> +extern enum sync_action md_sync_action(struct mddev *mddev);
>> +extern enum sync_action md_sync_action_by_name(char *page);
>> +extern char *md_sync_action_name(enum sync_action action);
>>  extern bool md_write_start(struct mddev *mddev, struct bio  *bi);
>>  extern void md_write_inc(struct mddev *mddev, struct bio *bi);
>>  extern void md_write_end(struct mddev *mddev);
> .
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
  2024-05-21  2:20     ` Yu Kuai
@ 2024-05-21  3:01       ` Oliver Sang
  2024-05-21  3:11         ` Yu Kuai
  2024-05-21  3:21         ` Xiao Ni
  0 siblings, 2 replies; 31+ messages in thread
From: Oliver Sang @ 2024-05-21  3:01 UTC (permalink / raw)
  To: Yu Kuai
  Cc: oe-lkp, lkp, linux-raid, agk, snitzer, mpatocka, song, xni,
	dm-devel, linux-kernel, yi.zhang, yangerkun, yukuai (C)

[-- Attachment #1: Type: text/plain, Size: 3599 bytes --]

hi, Yu Kuai,

On Tue, May 21, 2024 at 10:20:54AM +0800, Yu Kuai wrote:
> Hi,
> 
> 在 2024/05/20 23:01, kernel test robot 写道:
> > 
> > 
> > Hello,
> > 
> > kernel test robot noticed "mdadm-selftests.07reshape5intr.fail" on:
> > 
> > commit: 18effaab5f57ef44763e537c782f905e06f6c4f5 ("[PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers")
> > url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/md-rearrange-recovery_flage/20240509-093248
> > base: https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git for-next
> > patch link: https://lore.kernel.org/all/20240509011900.2694291-6-yukuai1@huaweicloud.com/
> > patch subject: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
> > 
> > in testcase: mdadm-selftests
> > version: mdadm-selftests-x86_64-5f41845-1_20240412
> > with following parameters:
> > 
> > 	disk: 1HDD
> > 	test_prefix: 07reshape5intr
> > 
> > 
> > 
> > compiler: gcc-13
> > test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz (Haswell) with 16G memory
> > 
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > 
> > 
> > 
> > 
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202405202204.4e3dc662-oliver.sang@intel.com
> > 
> > 2024-05-14 21:36:26 mkdir -p /var/tmp
> > 2024-05-14 21:36:26 mke2fs -t ext3 -b 4096 -J size=4 -q /dev/sda1
> > 2024-05-14 21:36:57 mount -t ext3 /dev/sda1 /var/tmp
> > sed -e 's/{DEFAULT_METADATA}/1.2/g' \
> > -e 's,{MAP_PATH},/run/mdadm/map,g'  mdadm.8.in > mdadm.8
> > /usr/bin/install -D -m 644 mdadm.8 /usr/share/man/man8/mdadm.8
> > /usr/bin/install -D -m 644 mdmon.8 /usr/share/man/man8/mdmon.8
> > /usr/bin/install -D -m 644 md.4 /usr/share/man/man4/md.4
> > /usr/bin/install -D -m 644 mdadm.conf.5 /usr/share/man/man5/mdadm.conf.5
> > /usr/bin/install -D -m 644 udev-md-raid-creating.rules /lib/udev/rules.d/01-md-raid-creating.rules
> > /usr/bin/install -D -m 644 udev-md-raid-arrays.rules /lib/udev/rules.d/63-md-raid-arrays.rules
> > /usr/bin/install -D -m 644 udev-md-raid-assembly.rules /lib/udev/rules.d/64-md-raid-assembly.rules
> > /usr/bin/install -D -m 644 udev-md-clustered-confirm-device.rules /lib/udev/rules.d/69-md-clustered-confirm-device.rules
> > /usr/bin/install -D  -m 755 mdadm /sbin/mdadm
> > /usr/bin/install -D  -m 755 mdmon /sbin/mdmon
> > Testing on linux-6.9.0-rc2-00012-g18effaab5f57 kernel
> > /lkp/benchmarks/mdadm-selftests/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for detail
> [root@fedora mdadm]# ./test --dev=loop --tests=07reshape5intr
> test: skipping tests for multipath, which is removed in upstream 6.8+
> kernels
> test: skipping tests for linear, which is removed in upstream 6.8+ kernels
> Testing on linux-6.9.0-rc2-00023-gf092583596a2 kernel
> /root/mdadm/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log
> and /var/tmp/fail07reshape5intr.log for details
>   (KNOWN BROKEN TEST: always fails)
> 
> So, since this test is marked BROKEN.
> 
> Please share the whole log, and is it possible to share the two logs?


we only captured one log as attached log-18effaab5f.
also attached parent log FYI.


> 
> Thanks,
> Kuai
> 
> > 
> > 
> > 
> > The kernel config and materials to reproduce are available at:
> > https://download.01.org/0day-ci/archive/20240520/202405202204.4e3dc662-oliver.sang@intel.com
> > 
> > 
> > 
> 

[-- Attachment #2: log-18effaab5f --]
[-- Type: text/plain, Size: 3654 bytes --]

+ . /lkp/benchmarks/mdadm-selftests/tests/07reshape5intr
++ set -x
++ devs=/dev/loop1
++ st=UU
++ for disks in 2 3 4 5
++ eval 'devs="/dev/loop1' '$dev2"'
+++ devs='/dev/loop1 /dev/loop2'
++ st=UUU
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop1 bs=1024
dd: error writing '/dev/loop1': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.909885 s, 22.5 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop2 bs=1024
dd: error writing '/dev/loop2': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.633683 s, 32.3 MB/s
++ true
++ case $disks in
++ chunk=1024
++ mdadm -CR /dev/md0 -amd -l5 -c 1024 -n2 --assume-clean /dev/loop1 /dev/loop2
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ for args in $*
++ [[ -CR =~ /dev/ ]]
++ for args in $*
++ [[ /dev/md0 =~ /dev/ ]]
++ [[ /dev/md0 =~ md ]]
++ for args in $*
++ [[ -amd =~ /dev/ ]]
++ for args in $*
++ [[ -l5 =~ /dev/ ]]
++ for args in $*
++ [[ -c =~ /dev/ ]]
++ for args in $*
++ [[ 1024 =~ /dev/ ]]
++ for args in $*
++ [[ -n2 =~ /dev/ ]]
++ for args in $*
++ [[ --assume-clean =~ /dev/ ]]
++ for args in $*
++ [[ /dev/loop1 =~ /dev/ ]]
++ [[ /dev/loop1 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop1
mdadm: Unrecognised md component device - /dev/loop1
++ for args in $*
++ [[ /dev/loop2 =~ /dev/ ]]
++ [[ /dev/loop2 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop2
mdadm: Unrecognised md component device - /dev/loop2
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -CR /dev/md0 -amd -l5 -c 1024 -n2 --assume-clean /dev/loop1 /dev/loop2 --auto=yes
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ mdadm /dev/md0 --add /dev/loop6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet /dev/md0 --add /dev/loop6
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ echo 20
++ echo 20
++ mdadm --grow /dev/md0 -n 3
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --grow /dev/md0 -n 3
++ rv=1
++ case $* in
++ cat /var/tmp/stderr
mdadm: Failed to initiate reshape!
++ return 1
++ check reshape
++ case $1 in
++ cnt=5
++ grep -sq reshape /proc/mdstat
++ '[' 5 -gt 0 ']'
++ grep -v idle /sys/block/md0/md/sync_action
++ die 'no reshape happening'
++ echo -e '\n\tERROR: no reshape happening \n'

	ERROR: no reshape happening 

++ save_log fail
++ status=fail
++ logfile=fail07reshape5intr.log
++ cat /var/tmp/stderr
++ cp /var/tmp/log /var/tmp/07reshape5intr.log
++ echo '## lkp-hsw-d05: saving dmesg.'
++ dmesg -c
++ echo '## lkp-hsw-d05: saving proc mdstat.'
++ cat /proc/mdstat
++ array=($(mdadm -Ds | cut -d' ' -f2))
+++ mdadm -Ds
+++ rm -f /var/tmp/stderr
+++ cut '-d ' -f2
+++ case $* in
+++ case $* in
+++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -Ds
+++ rv=0
+++ case $* in
+++ cat /var/tmp/stderr
+++ return 0
++ '[' fail == fail ']'
++ echo 'FAILED - see /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for details'
FAILED - see /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for details
++ '[' loop == lvm ']'
++ '[' loop == loop -o loop == disk ']'
++ '[' '!' -z /dev/md0 -a 1 -ge 1 ']'
++ echo '## lkp-hsw-d05: mdadm -D /dev/md0'
++ /lkp/benchmarks/mdadm-selftests/mdadm -D /dev/md0
++ cat /proc/mdstat
++ grep -q 'linear\|external'
++ md_disks=($($mdadm -D -Y ${array[@]} | grep "/dev/" | cut -d'=' -f2))
+++ /lkp/benchmarks/mdadm-selftests/mdadm -D -Y /dev/md0
+++ grep /dev/
+++ cut -d= -f2
++ cat /proc/mdstat
++ grep -q bitmap
++ '[' 1 -eq 0 ']'
++ exit 2

[-- Attachment #3: log-parent --]
[-- Type: text/plain, Size: 20198 bytes --]

+ . /lkp/benchmarks/mdadm-selftests/tests/07reshape5intr
++ set -x
++ devs=/dev/loop1
++ st=UU
++ for disks in 2 3 4 5
++ eval 'devs="/dev/loop1' '$dev2"'
+++ devs='/dev/loop1 /dev/loop2'
++ st=UUU
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop1 bs=1024
dd: error writing '/dev/loop1': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.860819 s, 23.8 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop2 bs=1024
dd: error writing '/dev/loop2': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.648983 s, 31.6 MB/s
++ true
++ case $disks in
++ chunk=1024
++ mdadm -CR /dev/md0 -amd -l5 -c 1024 -n2 --assume-clean /dev/loop1 /dev/loop2
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ for args in $*
++ [[ -CR =~ /dev/ ]]
++ for args in $*
++ [[ /dev/md0 =~ /dev/ ]]
++ [[ /dev/md0 =~ md ]]
++ for args in $*
++ [[ -amd =~ /dev/ ]]
++ for args in $*
++ [[ -l5 =~ /dev/ ]]
++ for args in $*
++ [[ -c =~ /dev/ ]]
++ for args in $*
++ [[ 1024 =~ /dev/ ]]
++ for args in $*
++ [[ -n2 =~ /dev/ ]]
++ for args in $*
++ [[ --assume-clean =~ /dev/ ]]
++ for args in $*
++ [[ /dev/loop1 =~ /dev/ ]]
++ [[ /dev/loop1 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop1
mdadm: Unrecognised md component device - /dev/loop1
++ for args in $*
++ [[ /dev/loop2 =~ /dev/ ]]
++ [[ /dev/loop2 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop2
mdadm: Unrecognised md component device - /dev/loop2
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -CR /dev/md0 -amd -l5 -c 1024 -n2 --assume-clean /dev/loop1 /dev/loop2 --auto=yes
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ mdadm /dev/md0 --add /dev/loop6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet /dev/md0 --add /dev/loop6
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ echo 20
++ echo 20
++ mdadm --grow /dev/md0 -n 3
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --grow /dev/md0 -n 3
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ check reshape
++ case $1 in
++ cnt=5
++ grep -sq reshape /proc/mdstat
++ '[' 5 -gt 0 ']'
++ grep -v idle /sys/block/md0/md/sync_action
++ sleep 0.5
++ cnt=4
++ grep -sq reshape /proc/mdstat
++ check state UUU
++ case $1 in
++ grep -sq 'blocks.*\[UUU\]$' /proc/mdstat
++ sleep 0.5
++ mdadm --stop /dev/md0
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --stop /dev/md0
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ mdadm --assemble /dev/md0 /dev/loop1 /dev/loop2 /dev/loop6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --assemble /dev/md0 /dev/loop1 /dev/loop2 /dev/loop6
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ check reshape
++ case $1 in
++ cnt=5
++ grep -sq reshape /proc/mdstat
++ echo 1000
++ echo 2000
++ check wait
++ case $1 in
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 2000000
++ sleep 0.1
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ grep -v idle /sys/block/md0/md/sync_action
++ echo 2000
++ echo check
++ check wait
++ case $1 in
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 2000000
++ sleep 0.1
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ grep -v idle /sys/block/md0/md/sync_action
++ echo 2000
+++ cat /sys/block/md0/md/mismatch_cnt
++ mm=0
++ '[' 0 -gt 0 ']'
++ mdadm -S /dev/md0
++ rm -f /var/tmp/stderr
++ case $* in
++ udevadm settle
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 20000
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -S /dev/md0
++ rv=0
++ case $* in
++ udevadm settle
++ echo 2000
++ cat /var/tmp/stderr
++ return 0
++ for disks in 2 3 4 5
++ eval 'devs="/dev/loop1' /dev/loop2 '$dev3"'
+++ devs='/dev/loop1 /dev/loop2 /dev/loop3'
++ st=UUUU
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop1 bs=1024
dd: error writing '/dev/loop1': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.430464 s, 47.6 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop2 bs=1024
dd: error writing '/dev/loop2': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.44228 s, 46.3 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop3 bs=1024
dd: error writing '/dev/loop3': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.442369 s, 46.3 MB/s
++ true
++ case $disks in
++ chunk=1024
++ mdadm -CR /dev/md0 -amd -l5 -c 1024 -n3 --assume-clean /dev/loop1 /dev/loop2 /dev/loop3
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ for args in $*
++ [[ -CR =~ /dev/ ]]
++ for args in $*
++ [[ /dev/md0 =~ /dev/ ]]
++ [[ /dev/md0 =~ md ]]
++ for args in $*
++ [[ -amd =~ /dev/ ]]
++ for args in $*
++ [[ -l5 =~ /dev/ ]]
++ for args in $*
++ [[ -c =~ /dev/ ]]
++ for args in $*
++ [[ 1024 =~ /dev/ ]]
++ for args in $*
++ [[ -n3 =~ /dev/ ]]
++ for args in $*
++ [[ --assume-clean =~ /dev/ ]]
++ for args in $*
++ [[ /dev/loop1 =~ /dev/ ]]
++ [[ /dev/loop1 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop1
mdadm: Unrecognised md component device - /dev/loop1
++ for args in $*
++ [[ /dev/loop2 =~ /dev/ ]]
++ [[ /dev/loop2 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop2
mdadm: Unrecognised md component device - /dev/loop2
++ for args in $*
++ [[ /dev/loop3 =~ /dev/ ]]
++ [[ /dev/loop3 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop3
mdadm: Unrecognised md component device - /dev/loop3
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -CR /dev/md0 -amd -l5 -c 1024 -n3 --assume-clean /dev/loop1 /dev/loop2 /dev/loop3 --auto=yes
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ mdadm /dev/md0 --add /dev/loop6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet /dev/md0 --add /dev/loop6
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ echo 20
++ echo 20
++ mdadm --grow /dev/md0 -n 4
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --grow /dev/md0 -n 4
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
mdadm: Need to backup 6144K of critical section..
++ return 0
++ check reshape
++ case $1 in
++ cnt=5
++ grep -sq reshape /proc/mdstat
++ '[' 5 -gt 0 ']'
++ grep -v idle /sys/block/md0/md/sync_action
++ sleep 0.5
++ cnt=4
++ grep -sq reshape /proc/mdstat
++ check state UUUU
++ case $1 in
++ grep -sq 'blocks.*\[UUUU\]$' /proc/mdstat
++ sleep 0.5
++ mdadm --stop /dev/md0
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --stop /dev/md0
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ mdadm --assemble /dev/md0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --assemble /dev/md0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop6
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ check reshape
++ case $1 in
++ cnt=5
++ grep -sq reshape /proc/mdstat
++ echo 1000
++ echo 2000
++ check wait
++ case $1 in
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 2000000
++ sleep 0.1
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ grep -v idle /sys/block/md0/md/sync_action
++ echo 2000
++ echo check
++ check wait
++ case $1 in
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 2000000
++ sleep 0.1
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ grep -v idle /sys/block/md0/md/sync_action
++ echo 2000
+++ cat /sys/block/md0/md/mismatch_cnt
++ mm=0
++ '[' 0 -gt 0 ']'
++ mdadm -S /dev/md0
++ rm -f /var/tmp/stderr
++ case $* in
++ udevadm settle
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 20000
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -S /dev/md0
++ rv=0
++ case $* in
++ udevadm settle
++ echo 2000
++ cat /var/tmp/stderr
++ return 0
++ for disks in 2 3 4 5
++ eval 'devs="/dev/loop1' /dev/loop2 /dev/loop3 '$dev4"'
+++ devs='/dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4'
++ st=UUUUU
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop1 bs=1024
dd: error writing '/dev/loop1': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.406947 s, 50.3 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop2 bs=1024
dd: error writing '/dev/loop2': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.415534 s, 49.3 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop3 bs=1024
dd: error writing '/dev/loop3': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.474736 s, 43.1 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop4 bs=1024
dd: error writing '/dev/loop4': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.467228 s, 43.8 MB/s
++ true
++ case $disks in
++ chunk=512
++ mdadm -CR /dev/md0 -amd -l5 -c 512 -n4 --assume-clean /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ for args in $*
++ [[ -CR =~ /dev/ ]]
++ for args in $*
++ [[ /dev/md0 =~ /dev/ ]]
++ [[ /dev/md0 =~ md ]]
++ for args in $*
++ [[ -amd =~ /dev/ ]]
++ for args in $*
++ [[ -l5 =~ /dev/ ]]
++ for args in $*
++ [[ -c =~ /dev/ ]]
++ for args in $*
++ [[ 512 =~ /dev/ ]]
++ for args in $*
++ [[ -n4 =~ /dev/ ]]
++ for args in $*
++ [[ --assume-clean =~ /dev/ ]]
++ for args in $*
++ [[ /dev/loop1 =~ /dev/ ]]
++ [[ /dev/loop1 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop1
mdadm: Unrecognised md component device - /dev/loop1
++ for args in $*
++ [[ /dev/loop2 =~ /dev/ ]]
++ [[ /dev/loop2 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop2
mdadm: Unrecognised md component device - /dev/loop2
++ for args in $*
++ [[ /dev/loop3 =~ /dev/ ]]
++ [[ /dev/loop3 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop3
mdadm: Unrecognised md component device - /dev/loop3
++ for args in $*
++ [[ /dev/loop4 =~ /dev/ ]]
++ [[ /dev/loop4 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop4
mdadm: Unrecognised md component device - /dev/loop4
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -CR /dev/md0 -amd -l5 -c 512 -n4 --assume-clean /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 --auto=yes
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ mdadm /dev/md0 --add /dev/loop6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet /dev/md0 --add /dev/loop6
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ echo 20
++ echo 20
++ mdadm --grow /dev/md0 -n 5
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --grow /dev/md0 -n 5
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
mdadm: Need to backup 6144K of critical section..
++ return 0
++ check reshape
++ case $1 in
++ cnt=5
++ grep -sq reshape /proc/mdstat
++ '[' 5 -gt 0 ']'
++ grep -v idle /sys/block/md0/md/sync_action
++ sleep 0.5
++ cnt=4
++ grep -sq reshape /proc/mdstat
++ check state UUUUU
++ case $1 in
++ grep -sq 'blocks.*\[UUUUU\]$' /proc/mdstat
++ sleep 0.5
++ mdadm --stop /dev/md0
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --stop /dev/md0
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ mdadm --assemble /dev/md0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --assemble /dev/md0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop6
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ check reshape
++ case $1 in
++ cnt=5
++ grep -sq reshape /proc/mdstat
++ echo 1000
++ echo 2000
++ check wait
++ case $1 in
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 2000000
++ sleep 0.1
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ grep -v idle /sys/block/md0/md/sync_action
++ echo 2000
++ echo check
++ check wait
++ case $1 in
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 2000000
++ sleep 0.1
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ grep -v idle /sys/block/md0/md/sync_action
++ echo 2000
+++ cat /sys/block/md0/md/mismatch_cnt
++ mm=0
++ '[' 0 -gt 0 ']'
++ mdadm -S /dev/md0
++ rm -f /var/tmp/stderr
++ case $* in
++ udevadm settle
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 20000
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -S /dev/md0
++ rv=0
++ case $* in
++ udevadm settle
++ echo 2000
++ cat /var/tmp/stderr
++ return 0
++ for disks in 2 3 4 5
++ eval 'devs="/dev/loop1' /dev/loop2 /dev/loop3 /dev/loop4 '$dev5"'
+++ devs='/dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5'
++ st=UUUUUU
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop1 bs=1024
dd: error writing '/dev/loop1': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.579398 s, 35.3 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop2 bs=1024
dd: error writing '/dev/loop2': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.459501 s, 44.6 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop3 bs=1024
dd: error writing '/dev/loop3': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.460076 s, 44.5 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop4 bs=1024
dd: error writing '/dev/loop4': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.468438 s, 43.7 MB/s
++ true
++ for d in $devs
++ dd if=/dev/urandom of=/dev/loop5 bs=1024
dd: error writing '/dev/loop5': No space left on device
20001+0 records in
20000+0 records out
20480000 bytes (20 MB, 20 MiB) copied, 0.443522 s, 46.2 MB/s
++ true
++ case $disks in
++ chunk=256
++ mdadm -CR /dev/md0 -amd -l5 -c 256 -n5 --assume-clean /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ for args in $*
++ [[ -CR =~ /dev/ ]]
++ for args in $*
++ [[ /dev/md0 =~ /dev/ ]]
++ [[ /dev/md0 =~ md ]]
++ for args in $*
++ [[ -amd =~ /dev/ ]]
++ for args in $*
++ [[ -l5 =~ /dev/ ]]
++ for args in $*
++ [[ -c =~ /dev/ ]]
++ for args in $*
++ [[ 256 =~ /dev/ ]]
++ for args in $*
++ [[ -n5 =~ /dev/ ]]
++ for args in $*
++ [[ --assume-clean =~ /dev/ ]]
++ for args in $*
++ [[ /dev/loop1 =~ /dev/ ]]
++ [[ /dev/loop1 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop1
mdadm: Unrecognised md component device - /dev/loop1
++ for args in $*
++ [[ /dev/loop2 =~ /dev/ ]]
++ [[ /dev/loop2 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop2
mdadm: Unrecognised md component device - /dev/loop2
++ for args in $*
++ [[ /dev/loop3 =~ /dev/ ]]
++ [[ /dev/loop3 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop3
mdadm: Unrecognised md component device - /dev/loop3
++ for args in $*
++ [[ /dev/loop4 =~ /dev/ ]]
++ [[ /dev/loop4 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop4
mdadm: Unrecognised md component device - /dev/loop4
++ for args in $*
++ [[ /dev/loop5 =~ /dev/ ]]
++ [[ /dev/loop5 =~ md ]]
++ /lkp/benchmarks/mdadm-selftests/mdadm --zero /dev/loop5
mdadm: Unrecognised md component device - /dev/loop5
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -CR /dev/md0 -amd -l5 -c 256 -n5 --assume-clean /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5 --auto=yes
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ mdadm /dev/md0 --add /dev/loop6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet /dev/md0 --add /dev/loop6
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ echo 20
++ echo 20
++ mdadm --grow /dev/md0 -n 6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --grow /dev/md0 -n 6
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
mdadm: Need to backup 5120K of critical section..
++ return 0
++ check reshape
++ case $1 in
++ cnt=5
++ grep -sq reshape /proc/mdstat
++ '[' 5 -gt 0 ']'
++ grep -v idle /sys/block/md0/md/sync_action
++ sleep 0.5
++ cnt=4
++ grep -sq reshape /proc/mdstat
++ check state UUUUUU
++ case $1 in
++ grep -sq 'blocks.*\[UUUUUU\]$' /proc/mdstat
++ sleep 0.5
++ mdadm --stop /dev/md0
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --stop /dev/md0
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ mdadm --assemble /dev/md0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5 /dev/loop6
++ rm -f /var/tmp/stderr
++ case $* in
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet --assemble /dev/md0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4 /dev/loop5 /dev/loop6
mdadm: restoring critical section
++ rv=0
++ case $* in
++ cat /var/tmp/stderr
++ return 0
++ check reshape
++ case $1 in
++ cnt=5
++ grep -sq reshape /proc/mdstat
++ echo 1000
++ echo 2000
++ check wait
++ case $1 in
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 2000000
++ sleep 0.1
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ grep -v idle /sys/block/md0/md/sync_action
++ echo 2000
++ echo check
++ check wait
++ case $1 in
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 2000000
++ sleep 0.1
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ sleep 0.5
++ grep -Eq '(resync|recovery|reshape|check|repair) *=' /proc/mdstat
++ grep -v idle /sys/block/md0/md/sync_action
++ echo 2000
+++ cat /sys/block/md0/md/mismatch_cnt
++ mm=0
++ '[' 0 -gt 0 ']'
++ mdadm -S /dev/md0
++ rm -f /var/tmp/stderr
++ case $* in
++ udevadm settle
+++ cat /proc/sys/dev/raid/speed_limit_max
++ p=2000
++ echo 20000
++ case $* in
++ /lkp/benchmarks/mdadm-selftests/mdadm --quiet -S /dev/md0
++ rv=0
++ case $* in
++ udevadm settle
++ echo 2000
++ cat /var/tmp/stderr
++ return 0

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
  2024-05-21  3:01       ` Oliver Sang
@ 2024-05-21  3:11         ` Yu Kuai
  2024-05-21  3:21         ` Xiao Ni
  1 sibling, 0 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-21  3:11 UTC (permalink / raw)
  To: Oliver Sang, Yu Kuai
  Cc: oe-lkp, lkp, linux-raid, agk, snitzer, mpatocka, song, xni,
	dm-devel, linux-kernel, yi.zhang, yangerkun, yukuai (C)

Hi,

在 2024/05/21 11:01, Oliver Sang 写道:
> hi, Yu Kuai,
> 
> On Tue, May 21, 2024 at 10:20:54AM +0800, Yu Kuai wrote:
>> Hi,
>>
>> 在 2024/05/20 23:01, kernel test robot 写道:
>>>
>>>
>>> Hello,
>>>
>>> kernel test robot noticed "mdadm-selftests.07reshape5intr.fail" on:
>>>
>>> commit: 18effaab5f57ef44763e537c782f905e06f6c4f5 ("[PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers")
>>> url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/md-rearrange-recovery_flage/20240509-093248
>>> base: https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git for-next
>>> patch link: https://lore.kernel.org/all/20240509011900.2694291-6-yukuai1@huaweicloud.com/
>>> patch subject: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
>>>
>>> in testcase: mdadm-selftests
>>> version: mdadm-selftests-x86_64-5f41845-1_20240412
>>> with following parameters:
>>>
>>> 	disk: 1HDD
>>> 	test_prefix: 07reshape5intr
>>>
>>>
>>>
>>> compiler: gcc-13
>>> test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz (Haswell) with 16G memory
>>>
>>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>>
>>>
>>>
>>>
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>> | Reported-by: kernel test robot <oliver.sang@intel.com>
>>> | Closes: https://lore.kernel.org/oe-lkp/202405202204.4e3dc662-oliver.sang@intel.com
>>>
>>> 2024-05-14 21:36:26 mkdir -p /var/tmp
>>> 2024-05-14 21:36:26 mke2fs -t ext3 -b 4096 -J size=4 -q /dev/sda1
>>> 2024-05-14 21:36:57 mount -t ext3 /dev/sda1 /var/tmp
>>> sed -e 's/{DEFAULT_METADATA}/1.2/g' \
>>> -e 's,{MAP_PATH},/run/mdadm/map,g'  mdadm.8.in > mdadm.8
>>> /usr/bin/install -D -m 644 mdadm.8 /usr/share/man/man8/mdadm.8
>>> /usr/bin/install -D -m 644 mdmon.8 /usr/share/man/man8/mdmon.8
>>> /usr/bin/install -D -m 644 md.4 /usr/share/man/man4/md.4
>>> /usr/bin/install -D -m 644 mdadm.conf.5 /usr/share/man/man5/mdadm.conf.5
>>> /usr/bin/install -D -m 644 udev-md-raid-creating.rules /lib/udev/rules.d/01-md-raid-creating.rules
>>> /usr/bin/install -D -m 644 udev-md-raid-arrays.rules /lib/udev/rules.d/63-md-raid-arrays.rules
>>> /usr/bin/install -D -m 644 udev-md-raid-assembly.rules /lib/udev/rules.d/64-md-raid-assembly.rules
>>> /usr/bin/install -D -m 644 udev-md-clustered-confirm-device.rules /lib/udev/rules.d/69-md-clustered-confirm-device.rules
>>> /usr/bin/install -D  -m 755 mdadm /sbin/mdadm
>>> /usr/bin/install -D  -m 755 mdmon /sbin/mdmon
>>> Testing on linux-6.9.0-rc2-00012-g18effaab5f57 kernel
>>> /lkp/benchmarks/mdadm-selftests/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for detail
>> [root@fedora mdadm]# ./test --dev=loop --tests=07reshape5intr
>> test: skipping tests for multipath, which is removed in upstream 6.8+
>> kernels
>> test: skipping tests for linear, which is removed in upstream 6.8+ kernels
>> Testing on linux-6.9.0-rc2-00023-gf092583596a2 kernel
>> /root/mdadm/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log
>> and /var/tmp/fail07reshape5intr.log for details
>>    (KNOWN BROKEN TEST: always fails)
>>
>> So, since this test is marked BROKEN.
>>
>> Please share the whole log, and is it possible to share the two logs?
> 
> 
> we only captured one log as attached log-18effaab5f.
> also attached parent log FYI.
I mean please ignore the BROKEN test, and next time attach the two logs
if possible:

/var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log

Thanks for the test, we really need a per patch CI.
Kuai

> 
> 
>>
>> Thanks,
>> Kuai
>>
>>>
>>>
>>>
>>> The kernel config and materials to reproduce are available at:
>>> https://download.01.org/0day-ci/archive/20240520/202405202204.4e3dc662-oliver.sang@intel.com
>>>
>>>
>>>
>>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
  2024-05-21  3:01       ` Oliver Sang
  2024-05-21  3:11         ` Yu Kuai
@ 2024-05-21  3:21         ` Xiao Ni
  2024-05-22  2:46           ` Yu Kuai
  1 sibling, 1 reply; 31+ messages in thread
From: Xiao Ni @ 2024-05-21  3:21 UTC (permalink / raw)
  To: Oliver Sang
  Cc: Yu Kuai, oe-lkp, lkp, linux-raid, agk, snitzer, mpatocka, song,
	dm-devel, linux-kernel, yi.zhang, yangerkun, yukuai (C)

Hi Kuai

I've tested 07reshape5intr with the latest upstream kernel 15 times
without failure. So it's better to have a try with 07reshape5intr with
your patch set.

Regards
Xiao




On Tue, May 21, 2024 at 11:02 AM Oliver Sang <oliver.sang@intel.com> wrote:
>
> hi, Yu Kuai,
>
> On Tue, May 21, 2024 at 10:20:54AM +0800, Yu Kuai wrote:
> > Hi,
> >
> > 在 2024/05/20 23:01, kernel test robot 写道:
> > >
> > >
> > > Hello,
> > >
> > > kernel test robot noticed "mdadm-selftests.07reshape5intr.fail" on:
> > >
> > > commit: 18effaab5f57ef44763e537c782f905e06f6c4f5 ("[PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers")
> > > url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/md-rearrange-recovery_flage/20240509-093248
> > > base: https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git for-next
> > > patch link: https://lore.kernel.org/all/20240509011900.2694291-6-yukuai1@huaweicloud.com/
> > > patch subject: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
> > >
> > > in testcase: mdadm-selftests
> > > version: mdadm-selftests-x86_64-5f41845-1_20240412
> > > with following parameters:
> > >
> > >     disk: 1HDD
> > >     test_prefix: 07reshape5intr
> > >
> > >
> > >
> > > compiler: gcc-13
> > > test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz (Haswell) with 16G memory
> > >
> > > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > >
> > >
> > >
> > >
> > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > the same patch/commit), kindly add following tags
> > > | Reported-by: kernel test robot <oliver.sang@intel.com>
> > > | Closes: https://lore.kernel.org/oe-lkp/202405202204.4e3dc662-oliver.sang@intel.com
> > >
> > > 2024-05-14 21:36:26 mkdir -p /var/tmp
> > > 2024-05-14 21:36:26 mke2fs -t ext3 -b 4096 -J size=4 -q /dev/sda1
> > > 2024-05-14 21:36:57 mount -t ext3 /dev/sda1 /var/tmp
> > > sed -e 's/{DEFAULT_METADATA}/1.2/g' \
> > > -e 's,{MAP_PATH},/run/mdadm/map,g'  mdadm.8.in > mdadm.8
> > > /usr/bin/install -D -m 644 mdadm.8 /usr/share/man/man8/mdadm.8
> > > /usr/bin/install -D -m 644 mdmon.8 /usr/share/man/man8/mdmon.8
> > > /usr/bin/install -D -m 644 md.4 /usr/share/man/man4/md.4
> > > /usr/bin/install -D -m 644 mdadm.conf.5 /usr/share/man/man5/mdadm.conf.5
> > > /usr/bin/install -D -m 644 udev-md-raid-creating.rules /lib/udev/rules.d/01-md-raid-creating.rules
> > > /usr/bin/install -D -m 644 udev-md-raid-arrays.rules /lib/udev/rules.d/63-md-raid-arrays.rules
> > > /usr/bin/install -D -m 644 udev-md-raid-assembly.rules /lib/udev/rules.d/64-md-raid-assembly.rules
> > > /usr/bin/install -D -m 644 udev-md-clustered-confirm-device.rules /lib/udev/rules.d/69-md-clustered-confirm-device.rules
> > > /usr/bin/install -D  -m 755 mdadm /sbin/mdadm
> > > /usr/bin/install -D  -m 755 mdmon /sbin/mdmon
> > > Testing on linux-6.9.0-rc2-00012-g18effaab5f57 kernel
> > > /lkp/benchmarks/mdadm-selftests/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for detail
> > [root@fedora mdadm]# ./test --dev=loop --tests=07reshape5intr
> > test: skipping tests for multipath, which is removed in upstream 6.8+
> > kernels
> > test: skipping tests for linear, which is removed in upstream 6.8+ kernels
> > Testing on linux-6.9.0-rc2-00023-gf092583596a2 kernel
> > /root/mdadm/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log
> > and /var/tmp/fail07reshape5intr.log for details
> >   (KNOWN BROKEN TEST: always fails)
> >
> > So, since this test is marked BROKEN.
> >
> > Please share the whole log, and is it possible to share the two logs?
>
>
> we only captured one log as attached log-18effaab5f.
> also attached parent log FYI.
>
>
> >
> > Thanks,
> > Kuai
> >
> > >
> > >
> > >
> > > The kernel config and materials to reproduce are available at:
> > > https://download.01.org/0day-ci/archive/20240520/202405202204.4e3dc662-oliver.sang@intel.com
> > >
> > >
> > >
> >


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action
  2024-05-20 11:51   ` Su Yue
  2024-05-21  2:30     ` Yu Kuai
@ 2024-05-21  3:25     ` Xiao Ni
  2024-05-21  3:50       ` Su Yue
  1 sibling, 1 reply; 31+ messages in thread
From: Xiao Ni @ 2024-05-21  3:25 UTC (permalink / raw)
  To: Su Yue
  Cc: Yu Kuai, agk, snitzer, mpatocka, song, dm-devel, linux-kernel,
	linux-raid, yukuai3, yi.zhang, yangerkun

On Mon, May 20, 2024 at 8:38 PM Su Yue <l@damenly.org> wrote:
>
>
> On Thu 09 May 2024 at 09:18, Yu Kuai <yukuai1@huaweicloud.com>
> wrote:
>
> > From: Yu Kuai <yukuai3@huawei.com>
> >
> > The new helpers will get current sync_action of the array, will
> > be used
> > in later patches to make code cleaner.
> >
> > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> > ---
> >  drivers/md/md.c | 64
> >  +++++++++++++++++++++++++++++++++++++++++++++++++
> >  drivers/md/md.h |  3 +++
> >  2 files changed, 67 insertions(+)
> >
> > diff --git a/drivers/md/md.c b/drivers/md/md.c
> > index 00bbafcd27bb..48ec35342d1b 100644
> > --- a/drivers/md/md.c
> > +++ b/drivers/md/md.c
> > @@ -69,6 +69,16 @@
> >  #include "md-bitmap.h"
> >  #include "md-cluster.h"
> >
> > +static char *action_name[NR_SYNC_ACTIONS] = {
> >
>
> Th array will not be modified, so:
>
> static const char * const action_names[NR_SYNC_ACTIONS]
>
> > +     [ACTION_RESYNC]         = "resync",
> > +     [ACTION_RECOVER]        = "recover",
> > +     [ACTION_CHECK]          = "check",
> > +     [ACTION_REPAIR]         = "repair",
> > +     [ACTION_RESHAPE]        = "reshape",
> > +     [ACTION_FROZEN]         = "frozen",
> > +     [ACTION_IDLE]           = "idle",
> > +};
> > +
> >  /* pers_list is a list of registered personalities protected by
> >  pers_lock. */
> >  static LIST_HEAD(pers_list);
> >  static DEFINE_SPINLOCK(pers_lock);
> > @@ -4867,6 +4877,60 @@ metadata_store(struct mddev *mddev, const
> > char *buf, size_t len)
> >  static struct md_sysfs_entry md_metadata =
> >  __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR,
> >  metadata_show, metadata_store);
> >
> > +enum sync_action md_sync_action(struct mddev *mddev)
> > +{
> > +     unsigned long recovery = mddev->recovery;
> > +
> > +     /*
> > +      * frozen has the highest priority, means running sync_thread
> > will be
> > +      * stopped immediately, and no new sync_thread can start.
> > +      */
> > +     if (test_bit(MD_RECOVERY_FROZEN, &recovery))
> > +             return ACTION_FROZEN;
> > +
> > +     /*
> > +      * idle means no sync_thread is running, and no new
> > sync_thread is
> > +      * requested.
> > +      */
> > +     if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
> > +         (!md_is_rdwr(mddev) || !test_bit(MD_RECOVERY_NEEDED,
> > &recovery)))
> > +             return ACTION_IDLE;
> My brain was lost sometimes looking into nested conditions of md
> code...

agree+

> I agree with Xiao Ni's suggestion that more comments about the
> array
> state should be added.

In fact, I suggest to keep the logic which is in action_show now. The
logic in action_show is easier to understand for me.

Best Regards
Xiao
>
> > +     if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
> > +         mddev->reshape_position != MaxSector)
> > +             return ACTION_RESHAPE;
> > +
> > +     if (test_bit(MD_RECOVERY_RECOVER, &recovery))
> > +             return ACTION_RECOVER;
> > +
> >
> In action_show, MD_RECOVERY_SYNC is tested first then
> MD_RECOVERY_RECOVER.
> After looking through the logic of MD_RECOVERY_RECOVER
> clear/set_bit, the
> change is fine to me. However, better to follow old pattern unless
> there
> have resons.
>
>
> > +     if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
> > +             if (test_bit(MD_RECOVERY_CHECK, &recovery))
> > +                     return ACTION_CHECK;
> > +             if (test_bit(MD_RECOVERY_REQUESTED, &recovery))
> > +                     return ACTION_REPAIR;
> > +             return ACTION_RESYNC;
> > +     }
> > +
> > +     return ACTION_IDLE;
> > +}
> > +
> > +enum sync_action md_sync_action_by_name(char *page)
> > +{
> > +     enum sync_action action;
> > +
> > +     for (action = 0; action < NR_SYNC_ACTIONS; ++action) {
> > +             if (cmd_match(page, action_name[action]))
> > +                     return action;
> > +     }
> > +
> > +     return NR_SYNC_ACTIONS;
> > +}
> > +
> > +char *md_sync_action_name(enum sync_action action)
> >
>
> And 'const char *'
>
> --
> Su
>
> > +{
> > +     return action_name[action];
> > +}
> > +
> >  static ssize_t
> >  action_show(struct mddev *mddev, char *page)
> >  {
> > diff --git a/drivers/md/md.h b/drivers/md/md.h
> > index 2edad966f90a..72ca7a796df5 100644
> > --- a/drivers/md/md.h
> > +++ b/drivers/md/md.h
> > @@ -864,6 +864,9 @@ extern void md_unregister_thread(struct
> > mddev *mddev, struct md_thread __rcu **t
> >  extern void md_wakeup_thread(struct md_thread __rcu *thread);
> >  extern void md_check_recovery(struct mddev *mddev);
> >  extern void md_reap_sync_thread(struct mddev *mddev);
> > +extern enum sync_action md_sync_action(struct mddev *mddev);
> > +extern enum sync_action md_sync_action_by_name(char *page);
> > +extern char *md_sync_action_name(enum sync_action action);
> >  extern bool md_write_start(struct mddev *mddev, struct bio
> >  *bi);
> >  extern void md_write_inc(struct mddev *mddev, struct bio *bi);
> >  extern void md_write_end(struct mddev *mddev);
>


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 3/9] md: add new helpers for sync_action
  2024-05-21  3:25     ` Xiao Ni
@ 2024-05-21  3:50       ` Su Yue
  0 siblings, 0 replies; 31+ messages in thread
From: Su Yue @ 2024-05-21  3:50 UTC (permalink / raw)
  To: Xiao Ni
  Cc: Su Yue, Yu Kuai, agk, snitzer, mpatocka, song, dm-devel,
	linux-kernel, linux-raid, yukuai3, yi.zhang, yangerkun


On Tue 21 May 2024 at 11:25, Xiao Ni <xni@redhat.com> wrote:

> On Mon, May 20, 2024 at 8:38 PM Su Yue <l@damenly.org> wrote:
>>
>>
>> On Thu 09 May 2024 at 09:18, Yu Kuai <yukuai1@huaweicloud.com>
>> wrote:
>>
>> > From: Yu Kuai <yukuai3@huawei.com>
>> >
>> > The new helpers will get current sync_action of the array, 
>> > will
>> > be used
>> > in later patches to make code cleaner.
>> >
>> > Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> > ---
>> >  drivers/md/md.c | 64
>> >  +++++++++++++++++++++++++++++++++++++++++++++++++
>> >  drivers/md/md.h |  3 +++
>> >  2 files changed, 67 insertions(+)
>> >
>> > diff --git a/drivers/md/md.c b/drivers/md/md.c
>> > index 00bbafcd27bb..48ec35342d1b 100644
>> > --- a/drivers/md/md.c
>> > +++ b/drivers/md/md.c
>> > @@ -69,6 +69,16 @@
>> >  #include "md-bitmap.h"
>> >  #include "md-cluster.h"
>> >
>> > +static char *action_name[NR_SYNC_ACTIONS] = {
>> >
>>
>> Th array will not be modified, so:
>>
>> static const char * const action_names[NR_SYNC_ACTIONS]
>>
>> > +     [ACTION_RESYNC]         = "resync",
>> > +     [ACTION_RECOVER]        = "recover",
>> > +     [ACTION_CHECK]          = "check",
>> > +     [ACTION_REPAIR]         = "repair",
>> > +     [ACTION_RESHAPE]        = "reshape",
>> > +     [ACTION_FROZEN]         = "frozen",
>> > +     [ACTION_IDLE]           = "idle",
>> > +};
>> > +
>> >  /* pers_list is a list of registered personalities protected 
>> >  by
>> >  pers_lock. */
>> >  static LIST_HEAD(pers_list);
>> >  static DEFINE_SPINLOCK(pers_lock);
>> > @@ -4867,6 +4877,60 @@ metadata_store(struct mddev *mddev, 
>> > const
>> > char *buf, size_t len)
>> >  static struct md_sysfs_entry md_metadata =
>> >  __ATTR_PREALLOC(metadata_version, S_IRUGO|S_IWUSR,
>> >  metadata_show, metadata_store);
>> >
>> > +enum sync_action md_sync_action(struct mddev *mddev)
>> > +{
>> > +     unsigned long recovery = mddev->recovery;
>> > +
>> > +     /*
>> > +      * frozen has the highest priority, means running 
>> > sync_thread
>> > will be
>> > +      * stopped immediately, and no new sync_thread can 
>> > start.
>> > +      */
>> > +     if (test_bit(MD_RECOVERY_FROZEN, &recovery))
>> > +             return ACTION_FROZEN;
>> > +
>> > +     /*
>> > +      * idle means no sync_thread is running, and no new
>> > sync_thread is
>> > +      * requested.
>> > +      */
>> > +     if (!test_bit(MD_RECOVERY_RUNNING, &recovery) &&
>> > +         (!md_is_rdwr(mddev) || 
>> > !test_bit(MD_RECOVERY_NEEDED,
>> > &recovery)))
>> > +             return ACTION_IDLE;
>> My brain was lost sometimes looking into nested conditions of 
>> md
>> code...
>
> agree+
>
>> I agree with Xiao Ni's suggestion that more comments about the
>> array
>> state should be added.
>
> In fact, I suggest to keep the logic which is in action_show 
> now. The
> logic in action_show is easier to understand for me.
>

I'm in the middle, either of new/old logic looks good to me ;)
Thanks for clarifying.

--
Su
> Best Regards
> Xiao
>>
>> > +     if (test_bit(MD_RECOVERY_RESHAPE, &recovery) ||
>> > +         mddev->reshape_position != MaxSector)
>> > +             return ACTION_RESHAPE;
>> > +
>> > +     if (test_bit(MD_RECOVERY_RECOVER, &recovery))
>> > +             return ACTION_RECOVER;
>> > +
>> >
>> In action_show, MD_RECOVERY_SYNC is tested first then
>> MD_RECOVERY_RECOVER.
>> After looking through the logic of MD_RECOVERY_RECOVER
>> clear/set_bit, the
>> change is fine to me. However, better to follow old pattern 
>> unless
>> there
>> have resons.
>>
>>
>> > +     if (test_bit(MD_RECOVERY_SYNC, &recovery)) {
>> > +             if (test_bit(MD_RECOVERY_CHECK, &recovery))
>> > +                     return ACTION_CHECK;
>> > +             if (test_bit(MD_RECOVERY_REQUESTED, &recovery))
>> > +                     return ACTION_REPAIR;
>> > +             return ACTION_RESYNC;
>> > +     }
>> > +
>> > +     return ACTION_IDLE;
>> > +}
>> > +
>> > +enum sync_action md_sync_action_by_name(char *page)
>> > +{
>> > +     enum sync_action action;
>> > +
>> > +     for (action = 0; action < NR_SYNC_ACTIONS; ++action) {
>> > +             if (cmd_match(page, action_name[action]))
>> > +                     return action;
>> > +     }
>> > +
>> > +     return NR_SYNC_ACTIONS;
>> > +}
>> > +
>> > +char *md_sync_action_name(enum sync_action action)
>> >
>>
>> And 'const char *'
>>
>> --
>> Su
>>
>> > +{
>> > +     return action_name[action];
>> > +}
>> > +
>> >  static ssize_t
>> >  action_show(struct mddev *mddev, char *page)
>> >  {
>> > diff --git a/drivers/md/md.h b/drivers/md/md.h
>> > index 2edad966f90a..72ca7a796df5 100644
>> > --- a/drivers/md/md.h
>> > +++ b/drivers/md/md.h
>> > @@ -864,6 +864,9 @@ extern void md_unregister_thread(struct
>> > mddev *mddev, struct md_thread __rcu **t
>> >  extern void md_wakeup_thread(struct md_thread __rcu 
>> >  *thread);
>> >  extern void md_check_recovery(struct mddev *mddev);
>> >  extern void md_reap_sync_thread(struct mddev *mddev);
>> > +extern enum sync_action md_sync_action(struct mddev *mddev);
>> > +extern enum sync_action md_sync_action_by_name(char *page);
>> > +extern char *md_sync_action_name(enum sync_action action);
>> >  extern bool md_write_start(struct mddev *mddev, struct bio
>> >  *bi);
>> >  extern void md_write_inc(struct mddev *mddev, struct bio 
>> >  *bi);
>> >  extern void md_write_end(struct mddev *mddev);
>>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
  2024-05-21  3:21         ` Xiao Ni
@ 2024-05-22  2:46           ` Yu Kuai
  0 siblings, 0 replies; 31+ messages in thread
From: Yu Kuai @ 2024-05-22  2:46 UTC (permalink / raw)
  To: Xiao Ni, Oliver Sang
  Cc: Yu Kuai, oe-lkp, lkp, linux-raid, agk, snitzer, mpatocka, song,
	dm-devel, linux-kernel, yi.zhang, yangerkun, yukuai (C),
	Mariusz Tkaczyk

Hi,

在 2024/05/21 11:21, Xiao Ni 写道:
> Hi Kuai
> 
> I've tested 07reshape5intr with the latest upstream kernel 15 times
> without failure. So it's better to have a try with 07reshape5intr with
> your patch set.

I just discussed with Xiao on slack, for conclusion here:

The test 07reshape5intr will add a new disk to array, then start
reshape:

mdadm /dev/md0 --add /dev/xxx
mdadm --grow /dev/md0 -n 3

However, the grow will fail:
mdadm: Failed to initiate reshape!

Root cause is that in kernel, action_store() will return -EBUSY
if MD_RECOVERY_RUNNING is set:

// mdadm add
add_bound_rdev
  set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);

// daemon thread
md_check_recovery
  set_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
  // do nothing
		// mdadm grow
		action_store
		 if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
		  return -EBUSY
  clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery)

This is a long term problem, and we need new synchronization in kernel
to make sure the grow won't fail.

Thanks,
Kuai

> 
> Regards
> Xiao
> 
> 
> 
> 
> On Tue, May 21, 2024 at 11:02 AM Oliver Sang <oliver.sang@intel.com> wrote:
>>
>> hi, Yu Kuai,
>>
>> On Tue, May 21, 2024 at 10:20:54AM +0800, Yu Kuai wrote:
>>> Hi,
>>>
>>> 在 2024/05/20 23:01, kernel test robot 写道:
>>>>
>>>>
>>>> Hello,
>>>>
>>>> kernel test robot noticed "mdadm-selftests.07reshape5intr.fail" on:
>>>>
>>>> commit: 18effaab5f57ef44763e537c782f905e06f6c4f5 ("[PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers")
>>>> url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/md-rearrange-recovery_flage/20240509-093248
>>>> base: https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git for-next
>>>> patch link: https://lore.kernel.org/all/20240509011900.2694291-6-yukuai1@huaweicloud.com/
>>>> patch subject: [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers
>>>>
>>>> in testcase: mdadm-selftests
>>>> version: mdadm-selftests-x86_64-5f41845-1_20240412
>>>> with following parameters:
>>>>
>>>>      disk: 1HDD
>>>>      test_prefix: 07reshape5intr
>>>>
>>>>
>>>>
>>>> compiler: gcc-13
>>>> test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz (Haswell) with 16G memory
>>>>
>>>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>>>
>>>>
>>>>
>>>>
>>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>>> the same patch/commit), kindly add following tags
>>>> | Reported-by: kernel test robot <oliver.sang@intel.com>
>>>> | Closes: https://lore.kernel.org/oe-lkp/202405202204.4e3dc662-oliver.sang@intel.com
>>>>
>>>> 2024-05-14 21:36:26 mkdir -p /var/tmp
>>>> 2024-05-14 21:36:26 mke2fs -t ext3 -b 4096 -J size=4 -q /dev/sda1
>>>> 2024-05-14 21:36:57 mount -t ext3 /dev/sda1 /var/tmp
>>>> sed -e 's/{DEFAULT_METADATA}/1.2/g' \
>>>> -e 's,{MAP_PATH},/run/mdadm/map,g'  mdadm.8.in > mdadm.8
>>>> /usr/bin/install -D -m 644 mdadm.8 /usr/share/man/man8/mdadm.8
>>>> /usr/bin/install -D -m 644 mdmon.8 /usr/share/man/man8/mdmon.8
>>>> /usr/bin/install -D -m 644 md.4 /usr/share/man/man4/md.4
>>>> /usr/bin/install -D -m 644 mdadm.conf.5 /usr/share/man/man5/mdadm.conf.5
>>>> /usr/bin/install -D -m 644 udev-md-raid-creating.rules /lib/udev/rules.d/01-md-raid-creating.rules
>>>> /usr/bin/install -D -m 644 udev-md-raid-arrays.rules /lib/udev/rules.d/63-md-raid-arrays.rules
>>>> /usr/bin/install -D -m 644 udev-md-raid-assembly.rules /lib/udev/rules.d/64-md-raid-assembly.rules
>>>> /usr/bin/install -D -m 644 udev-md-clustered-confirm-device.rules /lib/udev/rules.d/69-md-clustered-confirm-device.rules
>>>> /usr/bin/install -D  -m 755 mdadm /sbin/mdadm
>>>> /usr/bin/install -D  -m 755 mdmon /sbin/mdmon
>>>> Testing on linux-6.9.0-rc2-00012-g18effaab5f57 kernel
>>>> /lkp/benchmarks/mdadm-selftests/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log and /var/tmp/fail07reshape5intr.log for detail
>>> [root@fedora mdadm]# ./test --dev=loop --tests=07reshape5intr
>>> test: skipping tests for multipath, which is removed in upstream 6.8+
>>> kernels
>>> test: skipping tests for linear, which is removed in upstream 6.8+ kernels
>>> Testing on linux-6.9.0-rc2-00023-gf092583596a2 kernel
>>> /root/mdadm/tests/07reshape5intr... FAILED - see /var/tmp/07reshape5intr.log
>>> and /var/tmp/fail07reshape5intr.log for details
>>>    (KNOWN BROKEN TEST: always fails)
>>>
>>> So, since this test is marked BROKEN.
>>>
>>> Please share the whole log, and is it possible to share the two logs?
>>
>>
>> we only captured one log as attached log-18effaab5f.
>> also attached parent log FYI.
>>
>>
>>>
>>> Thanks,
>>> Kuai
>>>
>>>>
>>>>
>>>>
>>>> The kernel config and materials to reproduce are available at:
>>>> https://download.01.org/0day-ci/archive/20240520/202405202204.4e3dc662-oliver.sang@intel.com
>>>>
>>>>
>>>>
>>>
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2024-05-22  2:46 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-05-09  1:18 [PATCH RFC md-6.10 0/9] md: refactor and cleanup for sync action Yu Kuai
2024-05-09  1:18 ` [PATCH md-6.10 1/9] md: rearrange recovery_flage Yu Kuai
2024-05-13 15:12   ` Mariusz Tkaczyk
2024-05-14  1:36     ` Yu Kuai
2024-05-14  5:51   ` Xiao Ni
2024-05-14  6:10     ` Yu Kuai
2024-05-14  6:39       ` Xiao Ni
2024-05-09  1:18 ` [PATCH md-6.10 2/9] md: add a new enum type sync_action Yu Kuai
2024-05-14  6:13   ` Xiao Ni
2024-05-09  1:18 ` [PATCH md-6.10 3/9] md: add new helpers for sync_action Yu Kuai
2024-05-14  6:52   ` Xiao Ni
2024-05-14  7:39     ` Yu Kuai
2024-05-14  8:40       ` Xiao Ni
2024-05-14  8:52         ` Yu Kuai
2024-05-20 11:51   ` Su Yue
2024-05-21  2:30     ` Yu Kuai
2024-05-21  3:25     ` Xiao Ni
2024-05-21  3:50       ` Su Yue
2024-05-09  1:18 ` [PATCH md-6.10 4/9] md: factor out helper to start reshape from action_store() Yu Kuai
2024-05-09  1:18 ` [PATCH md-6.10 5/9] md: replace sysfs api sync_action with new helpers Yu Kuai
2024-05-20 15:01   ` kernel test robot
2024-05-21  2:20     ` Yu Kuai
2024-05-21  3:01       ` Oliver Sang
2024-05-21  3:11         ` Yu Kuai
2024-05-21  3:21         ` Xiao Ni
2024-05-22  2:46           ` Yu Kuai
2024-05-09  1:18 ` [PATCH md-6.10 6/9] md: use new helers in md_do_sync() Yu Kuai
2024-05-09  1:18 ` [PATCH md-6.10 7/9] md: replace last_sync_action with new enum type Yu Kuai
2024-05-09  1:18 ` [PATCH md-6.10 8/9] md: factor out helpers for different sync_action in md_do_sync() Yu Kuai
2024-05-14  7:27   ` Xiao Ni
2024-05-09  1:19 ` [PATCH md-6.10 9/9] md: pass in max_sectors for pers->sync_request() Yu Kuai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).