All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/17] Bug fixes for mdadm tests
@ 2022-05-26 16:35 Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 01/17] md/raid5-log: Drop extern decorators for function prototypes Logan Gunthorpe
                   ` (16 more replies)
  0 siblings, 17 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

Hi,

This is the updated series with the feedback received in v1[1].

This series includes fixes to fix all the kernel panics in the mdadm
tests and some, related, sparse issues. The first 12 patches
clean refactor the raid5-cache code so that the RCU usage of conf->log
can be cleaned up which is done in patch 13 -- fixing some actual kernel
NULL pointer dereference crashes in the mdadm test.

Patch 14 fixes some of the remaining sparse warnings that are just
missing __rcu annotations.

Patches 15 provides a cleanup for patches 16 and 17 which fix a couple
additional hangs seen in an mdadm test.

This series will be followed by another series for mdadm which fixes
the segfaults and annotates some failing tests to make mdadm tests
runnable fairly reliably, but I'll wait for a stable hash for this
series to note the kernel version tested against. Following that,
v3 of my lock contention series will be sent with more confidence
of its correctness.

This series is based on the current md/md-next branch as of today
(42b805af10). A git branch is available here:

  https://github.com/sbates130272/linux-p2pmem md-bug_v2

Thanks,

Logan

[1] https://lore.kernel.org/all/20220519191311.17119-1-logang@deltatee.com

--

Changes since v1:
  * Add a patch to move the struct r5l_log to raid5-log.h in order
    to fix a compiler error with rcu_access_pointer() in versions
    prior to gcc-10
  * Rework r5c_is_writeback() changes to make less churn (per Christoph)
  * Change some 1s to trues in rcu_dereference_protected calls (per
    Christoph)
  * Fix an odd hunk mistake in the RCU protection patch (per Christoph)
  * Fix an inverted conditional (noticed by Donald)
  * Add a patch to add an enum for the overloaded values used by
    mddev->curr_resync to make the status_resync() fixes clearer
    (per Christoph)

--

Logan Gunthorpe (17):
  md/raid5-log: Drop extern decorators for function prototypes
  md/raid5-cache: Add r5c_conf_is_writeback() helper
  md/raid5-cache: Refactor r5l_start() to take a struct r5conf
  md/raid5-cache: Refactor r5l_flush_stripe_to_raid() to take a struct
    r5conf
  md/raid5-cache: Refactor r5l_wake_reclaim() to take a struct r5conf
  md/raid5-cache: Refactor remaining functions to take a r5conf
  md/raid5-ppl: Drop unused argument from ppl_handle_flush_request()
  md/raid5-cache: Pass the log through to r5c_finish_cache_stripe()
  md/raid5-cache: Don't pass conf to r5c_calculate_new_cp()
  md/raid5-cache: Take struct r5l_log in
    r5c_log_required_to_flush_cache()
  md/raid5: Ensure array is suspended for calls to log_exit()
  md/raid5-cache: Move struct r5l_log definition to raid5-log.h
  md/raid5-cache: Add RCU protection to conf->log accesses
  md/raid5-cache: Annotate pslot with __rcu notation
  md: Use enum for overloaded magic numbers used by mddev->curr_resync
  md: Ensure resync is reported after it starts
  md: Notify sysfs sync_completed in md_reap_sync_thread()

 drivers/md/md.c          |  55 +++----
 drivers/md/md.h          |  15 ++
 drivers/md/raid5-cache.c | 304 ++++++++++++++++++---------------------
 drivers/md/raid5-log.h   | 178 ++++++++++++++++-------
 drivers/md/raid5-ppl.c   |   2 +-
 drivers/md/raid5.c       |  50 +++----
 drivers/md/raid5.h       |   2 +-
 7 files changed, 336 insertions(+), 270 deletions(-)


base-commit: 42b805af102471f53e3c7867b8c2b502ea4eef7e
--
2.30.2

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH v2 01/17] md/raid5-log: Drop extern decorators for function prototypes
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 02/17] md/raid5-cache: Add r5c_conf_is_writeback() helper Logan Gunthorpe
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

extern is not necessary and recommended against when defining prototype
functions in headers. checkpatch.pl complains about these. So remove
them.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-log.h | 75 ++++++++++++++++++++----------------------
 1 file changed, 36 insertions(+), 39 deletions(-)

diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index 43c714a8798c..270ced4f770f 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -2,49 +2,46 @@
 #ifndef _RAID5_LOG_H
 #define _RAID5_LOG_H
 
-extern int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev);
-extern void r5l_exit_log(struct r5conf *conf);
-extern int r5l_write_stripe(struct r5l_log *log, struct stripe_head *head_sh);
-extern void r5l_write_stripe_run(struct r5l_log *log);
-extern void r5l_flush_stripe_to_raid(struct r5l_log *log);
-extern void r5l_stripe_write_finished(struct stripe_head *sh);
-extern int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
-extern void r5l_quiesce(struct r5l_log *log, int quiesce);
-extern bool r5l_log_disk_error(struct r5conf *conf);
-extern bool r5c_is_writeback(struct r5l_log *log);
-extern int
-r5c_try_caching_write(struct r5conf *conf, struct stripe_head *sh,
-		      struct stripe_head_state *s, int disks);
-extern void
-r5c_finish_stripe_write_out(struct r5conf *conf, struct stripe_head *sh,
-			    struct stripe_head_state *s);
-extern void r5c_release_extra_page(struct stripe_head *sh);
-extern void r5c_use_extra_page(struct stripe_head *sh);
-extern void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
-extern void r5c_handle_cached_data_endio(struct r5conf *conf,
-	struct stripe_head *sh, int disks);
-extern int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh);
-extern void r5c_make_stripe_write_out(struct stripe_head *sh);
-extern void r5c_flush_cache(struct r5conf *conf, int num);
-extern void r5c_check_stripe_cache_usage(struct r5conf *conf);
-extern void r5c_check_cached_full_stripe(struct r5conf *conf);
+int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev);
+void r5l_exit_log(struct r5conf *conf);
+int r5l_write_stripe(struct r5l_log *log, struct stripe_head *head_sh);
+void r5l_write_stripe_run(struct r5l_log *log);
+void r5l_flush_stripe_to_raid(struct r5l_log *log);
+void r5l_stripe_write_finished(struct stripe_head *sh);
+int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
+void r5l_quiesce(struct r5l_log *log, int quiesce);
+bool r5l_log_disk_error(struct r5conf *conf);
+bool r5c_is_writeback(struct r5l_log *log);
+int r5c_try_caching_write(struct r5conf *conf, struct stripe_head *sh,
+			  struct stripe_head_state *s, int disks);
+void r5c_finish_stripe_write_out(struct r5conf *conf, struct stripe_head *sh,
+				 struct stripe_head_state *s);
+void r5c_release_extra_page(struct stripe_head *sh);
+void r5c_use_extra_page(struct stripe_head *sh);
+void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
+void r5c_handle_cached_data_endio(struct r5conf *conf,
+				  struct stripe_head *sh, int disks);
+int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh);
+void r5c_make_stripe_write_out(struct stripe_head *sh);
+void r5c_flush_cache(struct r5conf *conf, int num);
+void r5c_check_stripe_cache_usage(struct r5conf *conf);
+void r5c_check_cached_full_stripe(struct r5conf *conf);
 extern struct md_sysfs_entry r5c_journal_mode;
-extern void r5c_update_on_rdev_error(struct mddev *mddev,
-				     struct md_rdev *rdev);
-extern bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect);
-extern int r5l_start(struct r5l_log *log);
+void r5c_update_on_rdev_error(struct mddev *mddev, struct md_rdev *rdev);
+bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect);
+int r5l_start(struct r5l_log *log);
 
-extern struct dma_async_tx_descriptor *
+struct dma_async_tx_descriptor *
 ops_run_partial_parity(struct stripe_head *sh, struct raid5_percpu *percpu,
 		       struct dma_async_tx_descriptor *tx);
-extern int ppl_init_log(struct r5conf *conf);
-extern void ppl_exit_log(struct r5conf *conf);
-extern int ppl_write_stripe(struct r5conf *conf, struct stripe_head *sh);
-extern void ppl_write_stripe_run(struct r5conf *conf);
-extern void ppl_stripe_write_finished(struct stripe_head *sh);
-extern int ppl_modify_log(struct r5conf *conf, struct md_rdev *rdev, bool add);
-extern void ppl_quiesce(struct r5conf *conf, int quiesce);
-extern int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio);
+int ppl_init_log(struct r5conf *conf);
+void ppl_exit_log(struct r5conf *conf);
+int ppl_write_stripe(struct r5conf *conf, struct stripe_head *sh);
+void ppl_write_stripe_run(struct r5conf *conf);
+void ppl_stripe_write_finished(struct stripe_head *sh);
+int ppl_modify_log(struct r5conf *conf, struct md_rdev *rdev, bool add);
+void ppl_quiesce(struct r5conf *conf, int quiesce);
+int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio);
 extern struct md_sysfs_entry ppl_write_hint;
 
 static inline bool raid5_has_log(struct r5conf *conf)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 02/17] md/raid5-cache: Add r5c_conf_is_writeback() helper
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 01/17] md/raid5-log: Drop extern decorators for function prototypes Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-30  5:56   ` Christoph Hellwig
  2022-05-26 16:35 ` [PATCH v2 03/17] md/raid5-cache: Refactor r5l_start() to take a struct r5conf Logan Gunthorpe
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

Most calls to r5c_is_writeback() have to dereference conf to get the
log. The log pointer should be protected by RCU, but isn't. Push
the dereference of conf->log into a new r5c_conf_is_writeback() helper
to reduce the number of sites that dereference conf->log so this can be
improved.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/raid5-cache.c |  7 ++++++-
 drivers/md/raid5-log.h   |  2 +-
 drivers/md/raid5.c       | 16 ++++++++--------
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 83c184eddbda..96f51ce9b6c1 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -247,12 +247,17 @@ enum r5l_io_unit_state {
 	IO_UNIT_STRIPE_END = 3,	/* stripes data finished writing to raid */
 };
 
-bool r5c_is_writeback(struct r5l_log *log)
+static bool r5c_is_writeback(struct r5l_log *log)
 {
 	return (log != NULL &&
 		log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK);
 }
 
+bool r5c_conf_is_writeback(struct r5conf *conf)
+{
+	return r5c_is_writeback(conf->log);
+}
+
 static sector_t r5l_ring_add(struct r5l_log *log, sector_t start, sector_t inc)
 {
 	start += inc;
diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index 270ced4f770f..720e164e4d4d 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -11,7 +11,7 @@ void r5l_stripe_write_finished(struct stripe_head *sh);
 int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
 void r5l_quiesce(struct r5l_log *log, int quiesce);
 bool r5l_log_disk_error(struct r5conf *conf);
-bool r5c_is_writeback(struct r5l_log *log);
+bool r5c_conf_is_writeback(struct r5conf *conf);
 int r5c_try_caching_write(struct r5conf *conf, struct stripe_head *sh,
 			  struct stripe_head_state *s, int disks);
 void r5c_finish_stripe_write_out(struct r5conf *conf, struct stripe_head *sh,
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..458d88faf2e9 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -224,7 +224,7 @@ static void do_release_stripe(struct r5conf *conf, struct stripe_head *sh,
 	BUG_ON(!list_empty(&sh->lru));
 	BUG_ON(atomic_read(&conf->active_stripes)==0);
 
-	if (r5c_is_writeback(conf->log))
+	if (r5c_conf_is_writeback(conf))
 		for (i = sh->disks; i--; )
 			if (test_bit(R5_InJournal, &sh->dev[i].flags))
 				injournal++;
@@ -236,7 +236,7 @@ static void do_release_stripe(struct r5conf *conf, struct stripe_head *sh,
 	 *   2. when resync is requested fot the stripe.
 	 */
 	if (test_bit(STRIPE_SYNC_REQUESTED, &sh->state) ||
-	    (conf->quiesce && r5c_is_writeback(conf->log) &&
+	    (conf->quiesce && r5c_conf_is_writeback(conf) &&
 	     !test_bit(STRIPE_HANDLE, &sh->state) && injournal != 0)) {
 		if (test_bit(STRIPE_R5C_CACHING, &sh->state))
 			r5c_make_stripe_write_out(sh);
@@ -274,7 +274,7 @@ static void do_release_stripe(struct r5conf *conf, struct stripe_head *sh,
 				md_wakeup_thread(conf->mddev->thread);
 		atomic_dec(&conf->active_stripes);
 		if (!test_bit(STRIPE_EXPANDING, &sh->state)) {
-			if (!r5c_is_writeback(conf->log))
+			if (!r5c_conf_is_writeback(conf))
 				list_add_tail(&sh->lru, temp_inactive_list);
 			else {
 				WARN_ON(test_bit(R5_InJournal, &sh->dev[sh->pd_idx].flags));
@@ -1786,7 +1786,7 @@ static void ops_complete_prexor(void *stripe_head_ref)
 	pr_debug("%s: stripe %llu\n", __func__,
 		(unsigned long long)sh->sector);
 
-	if (r5c_is_writeback(sh->raid_conf->log))
+	if (r5c_conf_is_writeback(sh->raid_conf))
 		/*
 		 * raid5-cache write back uses orig_page during prexor.
 		 * After prexor, it is time to free orig_page
@@ -1905,9 +1905,9 @@ ops_run_biodrain(struct stripe_head *sh, struct dma_async_tx_descriptor *tx)
 					tx = async_copy_data(1, wbi, &dev->page,
 							     dev->offset,
 							     dev->sector, tx, sh,
-							     r5c_is_writeback(conf->log));
+							     r5c_conf_is_writeback(conf));
 					if (dev->page != dev->orig_page &&
-					    !r5c_is_writeback(conf->log)) {
+					    !r5c_conf_is_writeback(conf)) {
 						set_bit(R5_SkipCopy, &dev->flags);
 						clear_bit(R5_UPTODATE, &dev->flags);
 						clear_bit(R5_OVERWRITE, &dev->flags);
@@ -5085,7 +5085,7 @@ static void handle_stripe(struct stripe_head *sh)
 	 */
 
 	if (!sh->reconstruct_state && !sh->check_state && !sh->log_io) {
-		if (!r5c_is_writeback(conf->log)) {
+		if (!r5c_conf_is_writeback(conf)) {
 			if (s.to_write)
 				handle_stripe_dirtying(conf, sh, &s, disks);
 		} else { /* write back cache */
@@ -5533,7 +5533,7 @@ static struct stripe_head *__get_priority_stripe(struct r5conf *conf, int group)
 	struct stripe_head *sh, *tmp;
 	struct list_head *handle_list = NULL;
 	struct r5worker_group *wg;
-	bool second_try = !r5c_is_writeback(conf->log) &&
+	bool second_try = !r5c_conf_is_writeback(conf) &&
 		!r5l_log_disk_error(conf);
 	bool try_loprio = test_bit(R5C_LOG_TIGHT, &conf->cache_state) ||
 		r5l_log_disk_error(conf);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 03/17] md/raid5-cache: Refactor r5l_start() to take a struct r5conf
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 01/17] md/raid5-log: Drop extern decorators for function prototypes Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 02/17] md/raid5-cache: Add r5c_conf_is_writeback() helper Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 04/17] md/raid5-cache: Refactor r5l_flush_stripe_to_raid() " Logan Gunthorpe
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

All calls to r5l_start() have to dereference conf to get the
log. The log pointer should be protected by RCU, but isn't. Push
the dereference of conf->log into r5l_start() to reduce
the number of sites that dereference conf->log so this can be improved.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-cache.c | 9 +++------
 drivers/md/raid5-log.h   | 2 +-
 drivers/md/raid5.c       | 4 ++--
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 96f51ce9b6c1..7c5dc45b1ea8 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -3034,20 +3034,17 @@ static int r5l_load_log(struct r5l_log *log)
 	return ret;
 }
 
-int r5l_start(struct r5l_log *log)
+int r5l_start(struct r5conf *conf)
 {
+	struct r5l_log *log = conf->log;
 	int ret;
 
 	if (!log)
 		return 0;
 
 	ret = r5l_load_log(log);
-	if (ret) {
-		struct mddev *mddev = log->rdev->mddev;
-		struct r5conf *conf = mddev->private;
-
+	if (ret)
 		r5l_exit_log(conf);
-	}
 	return ret;
 }
 
diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index 720e164e4d4d..db4c1262b380 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -29,7 +29,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf);
 extern struct md_sysfs_entry r5c_journal_mode;
 void r5c_update_on_rdev_error(struct mddev *mddev, struct md_rdev *rdev);
 bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect);
-int r5l_start(struct r5l_log *log);
+int r5l_start(struct r5conf *conf);
 
 struct dma_async_tx_descriptor *
 ops_run_partial_parity(struct stripe_head *sh, struct raid5_percpu *percpu,
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 458d88faf2e9..83911affcd88 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -8040,7 +8040,7 @@ static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 		if (ret)
 			return ret;
 
-		ret = r5l_start(conf->log);
+		ret = r5l_start(conf);
 		if (ret)
 			return ret;
 
@@ -8743,7 +8743,7 @@ static int raid5_start(struct mddev *mddev)
 {
 	struct r5conf *conf = mddev->private;
 
-	return r5l_start(conf->log);
+	return r5l_start(conf);
 }
 
 static struct md_personality raid6_personality =
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 04/17] md/raid5-cache: Refactor r5l_flush_stripe_to_raid() to take a struct r5conf
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (2 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 03/17] md/raid5-cache: Refactor r5l_start() to take a struct r5conf Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 05/17] md/raid5-cache: Refactor r5l_wake_reclaim() " Logan Gunthorpe
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

All calls to r5l_flush_stripe_to_raid() have to dereference conf to
get the log. The log pointer should be protected by RCU, but isn't. Push
the dereference of conf->log into r5l_flush_stripe_to_raid() to reduce
the number of sites that dereference conf->log so this can be improved.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-cache.c | 3 ++-
 drivers/md/raid5-log.h   | 4 ++--
 drivers/md/raid5.c       | 6 +++---
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 7c5dc45b1ea8..edff4e8d07dc 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1289,8 +1289,9 @@ static void r5l_log_flush_endio(struct bio *bio)
  * only write stripes of an io_unit to raid disks till the io_unit is the first
  * one whose data/parity is in log.
  */
-void r5l_flush_stripe_to_raid(struct r5l_log *log)
+void r5l_flush_stripe_to_raid(struct r5conf *conf)
 {
+	struct r5l_log *log = conf->log;
 	bool do_flush;
 
 	if (!log || !log->need_cache_flush)
diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index db4c1262b380..5ace25d11ea4 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -6,7 +6,7 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev);
 void r5l_exit_log(struct r5conf *conf);
 int r5l_write_stripe(struct r5l_log *log, struct stripe_head *head_sh);
 void r5l_write_stripe_run(struct r5l_log *log);
-void r5l_flush_stripe_to_raid(struct r5l_log *log);
+void r5l_flush_stripe_to_raid(struct r5conf *conf);
 void r5l_stripe_write_finished(struct stripe_head *sh);
 int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
 void r5l_quiesce(struct r5l_log *log, int quiesce);
@@ -96,7 +96,7 @@ static inline void log_write_stripe_run(struct r5conf *conf)
 static inline void log_flush_stripe_to_raid(struct r5conf *conf)
 {
 	if (conf->log)
-		r5l_flush_stripe_to_raid(conf->log);
+		r5l_flush_stripe_to_raid(conf);
 	else if (raid5_has_ppl(conf))
 		ppl_write_stripe_run(conf);
 }
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 83911affcd88..f1b55495de53 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6427,7 +6427,7 @@ static int handle_active_stripes(struct r5conf *conf, int group,
 	release_inactive_stripe_list(conf, temp_inactive_list,
 				     NR_STRIPE_HASH_LOCKS);
 
-	r5l_flush_stripe_to_raid(conf->log);
+	r5l_flush_stripe_to_raid(conf);
 	if (release_inactive) {
 		spin_lock_irq(&conf->device_lock);
 		return 0;
@@ -6483,7 +6483,7 @@ static void raid5_do_work(struct work_struct *work)
 
 	flush_deferred_bios(conf);
 
-	r5l_flush_stripe_to_raid(conf->log);
+	r5l_flush_stripe_to_raid(conf);
 
 	async_tx_issue_pending_all();
 	blk_finish_plug(&plug);
@@ -6570,7 +6570,7 @@ static void raid5d(struct md_thread *thread)
 
 	flush_deferred_bios(conf);
 
-	r5l_flush_stripe_to_raid(conf->log);
+	r5l_flush_stripe_to_raid(conf);
 
 	async_tx_issue_pending_all();
 	blk_finish_plug(&plug);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 05/17] md/raid5-cache: Refactor r5l_wake_reclaim() to take a struct r5conf
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (3 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 04/17] md/raid5-cache: Refactor r5l_flush_stripe_to_raid() " Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 06/17] md/raid5-cache: Refactor remaining functions to take a r5conf Logan Gunthorpe
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

All calls to r5l_wake_reclaim() have to dereference conf to get the
log. The log pointer should be protected by RCU, but isn't. Push
the dereference of conf->log into r5l_wake_reclaim() to reduce
the number of sites that dereference conf->log so this can be improved.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-cache.c | 41 +++++++++++++++++++++-------------------
 drivers/md/raid5-log.h   |  2 +-
 drivers/md/raid5.c       |  2 +-
 3 files changed, 24 insertions(+), 21 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index edff4e8d07dc..3427f95cb358 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -326,7 +326,20 @@ void r5c_handle_cached_data_endio(struct r5conf *conf,
 	}
 }
 
-void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
+static void __r5l_wake_reclaim(struct r5l_log *log, sector_t space)
+{
+	unsigned long target;
+	unsigned long new = (unsigned long)space; /* overflow in theory */
+
+	if (!log)
+		return;
+	do {
+		target = log->reclaim_target;
+		if (new < target)
+			return;
+	} while (cmpxchg(&log->reclaim_target, target, new) != target);
+	md_wakeup_thread(log->reclaim_thread);
+}
 
 /* Check whether we should flush some stripes to free up stripe cache */
 void r5c_check_stripe_cache_usage(struct r5conf *conf)
@@ -349,7 +362,7 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
 	 */
 	if (total_cached > conf->min_nr_stripes * 1 / 2 ||
 	    atomic_read(&conf->empty_inactive_list_nr) > 0)
-		r5l_wake_reclaim(conf->log, 0);
+		__r5l_wake_reclaim(conf->log, 0);
 }
 
 /*
@@ -368,7 +381,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
 	if (atomic_read(&conf->r5c_cached_full_stripes) >=
 	    min(R5C_FULL_STRIPE_FLUSH_BATCH(conf),
 		conf->chunk_sectors >> RAID5_STRIPE_SHIFT(conf)))
-		r5l_wake_reclaim(conf->log, 0);
+		__r5l_wake_reclaim(conf->log, 0);
 }
 
 /*
@@ -444,7 +457,7 @@ static inline void r5c_update_log_state(struct r5l_log *log)
 		clear_bit(R5C_LOG_TIGHT, &conf->cache_state);
 
 	if (wake_reclaim)
-		r5l_wake_reclaim(log, 0);
+		__r5l_wake_reclaim(log, 0);
 }
 
 /*
@@ -1087,7 +1100,7 @@ int r5l_write_stripe(struct r5l_log *log, struct stripe_head *sh)
 
 	mutex_unlock(&log->io_mutex);
 	if (wake_reclaim)
-		r5l_wake_reclaim(log, reserve);
+		__r5l_wake_reclaim(log, reserve);
 	return 0;
 }
 
@@ -1239,7 +1252,7 @@ static void __r5l_stripe_write_finished(struct r5l_io_unit *io)
 
 	if (r5l_reclaimable_space(log) > log->max_free_space ||
 	    test_bit(R5C_LOG_TIGHT, &conf->cache_state))
-		r5l_wake_reclaim(log, 0);
+		__r5l_wake_reclaim(log, 0);
 
 	spin_unlock_irqrestore(&log->io_list_lock, flags);
 	wake_up(&log->iounit_wait);
@@ -1564,19 +1577,9 @@ static void r5l_reclaim_thread(struct md_thread *thread)
 	r5l_do_reclaim(log);
 }
 
-void r5l_wake_reclaim(struct r5l_log *log, sector_t space)
+void r5l_wake_reclaim(struct r5conf *conf, sector_t space)
 {
-	unsigned long target;
-	unsigned long new = (unsigned long)space; /* overflow in theory */
-
-	if (!log)
-		return;
-	do {
-		target = log->reclaim_target;
-		if (new < target)
-			return;
-	} while (cmpxchg(&log->reclaim_target, target, new) != target);
-	md_wakeup_thread(log->reclaim_thread);
+	__r5l_wake_reclaim(conf->log, space);
 }
 
 void r5l_quiesce(struct r5l_log *log, int quiesce)
@@ -1588,7 +1591,7 @@ void r5l_quiesce(struct r5l_log *log, int quiesce)
 		mddev = log->rdev->mddev;
 		wake_up(&mddev->sb_wait);
 		kthread_park(log->reclaim_thread->tsk);
-		r5l_wake_reclaim(log, MaxSector);
+		__r5l_wake_reclaim(log, MaxSector);
 		r5l_do_reclaim(log);
 	} else
 		kthread_unpark(log->reclaim_thread->tsk);
diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index 5ace25d11ea4..7594d2d80520 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -18,7 +18,7 @@ void r5c_finish_stripe_write_out(struct r5conf *conf, struct stripe_head *sh,
 				 struct stripe_head_state *s);
 void r5c_release_extra_page(struct stripe_head *sh);
 void r5c_use_extra_page(struct stripe_head *sh);
-void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
+void r5l_wake_reclaim(struct r5conf *conf, sector_t space);
 void r5c_handle_cached_data_endio(struct r5conf *conf,
 				  struct stripe_head *sh, int disks);
 int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index f1b55495de53..0d486f8aaf87 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -742,7 +742,7 @@ raid5_get_active_stripe(struct r5conf *conf, sector_t sector,
 			if (!sh) {
 				set_bit(R5_INACTIVE_BLOCKED,
 					&conf->cache_state);
-				r5l_wake_reclaim(conf->log, 0);
+				r5l_wake_reclaim(conf, 0);
 				wait_event_lock_irq(
 					conf->wait_for_stripe,
 					!list_empty(conf->inactive_list + hash) &&
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 06/17] md/raid5-cache: Refactor remaining functions to take a r5conf
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (4 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 05/17] md/raid5-cache: Refactor r5l_wake_reclaim() " Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 07/17] md/raid5-ppl: Drop unused argument from ppl_handle_flush_request() Logan Gunthorpe
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

Many of the interface functions dereference conf->log in inline
helpers in raid5-log.h. Refactor all off these to dereference
conf->log inside the function instead. This will help to fix the
rcu locking when accessing conf->log.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-cache.c | 18 +++++++++++-------
 drivers/md/raid5-log.h   | 20 ++++++++++----------
 2 files changed, 21 insertions(+), 17 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 3427f95cb358..24110b687055 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1006,9 +1006,9 @@ static inline void r5l_add_no_space_stripe(struct r5l_log *log,
  * running in raid5d, where reclaim could wait for raid5d too (when it flushes
  * data from log to raid disks), so we shouldn't wait for reclaim here
  */
-int r5l_write_stripe(struct r5l_log *log, struct stripe_head *sh)
+int r5l_write_stripe(struct r5conf *conf, struct stripe_head *sh)
 {
-	struct r5conf *conf = sh->raid_conf;
+	struct r5l_log *log = conf->log;
 	int write_disks = 0;
 	int data_pages, parity_pages;
 	int reserve;
@@ -1104,8 +1104,9 @@ int r5l_write_stripe(struct r5l_log *log, struct stripe_head *sh)
 	return 0;
 }
 
-void r5l_write_stripe_run(struct r5l_log *log)
+void r5l_write_stripe_run(struct r5conf *conf)
 {
+	struct r5l_log *log = conf->log;
 	if (!log)
 		return;
 	mutex_lock(&log->io_mutex);
@@ -1113,8 +1114,10 @@ void r5l_write_stripe_run(struct r5l_log *log)
 	mutex_unlock(&log->io_mutex);
 }
 
-int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio)
+int r5l_handle_flush_request(struct r5conf *conf, struct bio *bio)
 {
+	struct r5l_log *log = conf->log;
+
 	if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH) {
 		/*
 		 * in write through (journal only)
@@ -1582,8 +1585,9 @@ void r5l_wake_reclaim(struct r5conf *conf, sector_t space)
 	__r5l_wake_reclaim(conf->log, space);
 }
 
-void r5l_quiesce(struct r5l_log *log, int quiesce)
+void r5l_quiesce(struct r5conf *conf, int quiesce)
 {
+	struct r5l_log *log = conf->log;
 	struct mddev *mddev;
 
 	if (quiesce) {
@@ -2892,9 +2896,9 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 		set_bit(STRIPE_HANDLE, &sh->state);
 }
 
-int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh)
+int r5c_cache_data(struct r5conf *conf, struct stripe_head *sh)
 {
-	struct r5conf *conf = sh->raid_conf;
+	struct r5l_log *log = conf->log;
 	int pages = 0;
 	int reserve;
 	int i;
diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index 7594d2d80520..e39db84c5de0 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -4,12 +4,12 @@
 
 int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev);
 void r5l_exit_log(struct r5conf *conf);
-int r5l_write_stripe(struct r5l_log *log, struct stripe_head *head_sh);
-void r5l_write_stripe_run(struct r5l_log *log);
+int r5l_write_stripe(struct r5conf *conf, struct stripe_head *head_sh);
+void r5l_write_stripe_run(struct r5conf *conf);
 void r5l_flush_stripe_to_raid(struct r5conf *conf);
 void r5l_stripe_write_finished(struct stripe_head *sh);
-int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
-void r5l_quiesce(struct r5l_log *log, int quiesce);
+int r5l_handle_flush_request(struct r5conf *conf, struct bio *bio);
+void r5l_quiesce(struct r5conf *conf, int quiesce);
 bool r5l_log_disk_error(struct r5conf *conf);
 bool r5c_conf_is_writeback(struct r5conf *conf);
 int r5c_try_caching_write(struct r5conf *conf, struct stripe_head *sh,
@@ -21,7 +21,7 @@ void r5c_use_extra_page(struct stripe_head *sh);
 void r5l_wake_reclaim(struct r5conf *conf, sector_t space);
 void r5c_handle_cached_data_endio(struct r5conf *conf,
 				  struct stripe_head *sh, int disks);
-int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh);
+int r5c_cache_data(struct r5conf *conf, struct stripe_head *sh);
 void r5c_make_stripe_write_out(struct stripe_head *sh);
 void r5c_flush_cache(struct r5conf *conf, int num);
 void r5c_check_stripe_cache_usage(struct r5conf *conf);
@@ -63,10 +63,10 @@ static inline int log_stripe(struct stripe_head *sh, struct stripe_head_state *s
 			/* writing out phase */
 			if (s->waiting_extra_page)
 				return 0;
-			return r5l_write_stripe(conf->log, sh);
+			return r5l_write_stripe(conf, sh);
 		} else if (test_bit(STRIPE_LOG_TRAPPED, &sh->state)) {
 			/* caching phase */
-			return r5c_cache_data(conf->log, sh);
+			return r5c_cache_data(conf, sh);
 		}
 	} else if (raid5_has_ppl(conf)) {
 		return ppl_write_stripe(conf, sh);
@@ -88,7 +88,7 @@ static inline void log_stripe_write_finished(struct stripe_head *sh)
 static inline void log_write_stripe_run(struct r5conf *conf)
 {
 	if (conf->log)
-		r5l_write_stripe_run(conf->log);
+		r5l_write_stripe_run(conf);
 	else if (raid5_has_ppl(conf))
 		ppl_write_stripe_run(conf);
 }
@@ -106,7 +106,7 @@ static inline int log_handle_flush_request(struct r5conf *conf, struct bio *bio)
 	int ret = -ENODEV;
 
 	if (conf->log)
-		ret = r5l_handle_flush_request(conf->log, bio);
+		ret = r5l_handle_flush_request(conf, bio);
 	else if (raid5_has_ppl(conf))
 		ret = ppl_handle_flush_request(conf->log, bio);
 
@@ -116,7 +116,7 @@ static inline int log_handle_flush_request(struct r5conf *conf, struct bio *bio)
 static inline void log_quiesce(struct r5conf *conf, int quiesce)
 {
 	if (conf->log)
-		r5l_quiesce(conf->log, quiesce);
+		r5l_quiesce(conf, quiesce);
 	else if (raid5_has_ppl(conf))
 		ppl_quiesce(conf, quiesce);
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 07/17] md/raid5-ppl: Drop unused argument from ppl_handle_flush_request()
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (5 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 06/17] md/raid5-cache: Refactor remaining functions to take a r5conf Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 08/17] md/raid5-cache: Pass the log through to r5c_finish_cache_stripe() Logan Gunthorpe
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

ppl_handle_flush_request() takes an struct r5log argument but doesn't
use it. It has no buisiness taking this argument as it is only used
by raid5-cache and has no way to derference it anyway. Remove
the argument.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-log.h | 4 ++--
 drivers/md/raid5-ppl.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index e39db84c5de0..1c184fe20939 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -41,7 +41,7 @@ void ppl_write_stripe_run(struct r5conf *conf);
 void ppl_stripe_write_finished(struct stripe_head *sh);
 int ppl_modify_log(struct r5conf *conf, struct md_rdev *rdev, bool add);
 void ppl_quiesce(struct r5conf *conf, int quiesce);
-int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio);
+int ppl_handle_flush_request(struct bio *bio);
 extern struct md_sysfs_entry ppl_write_hint;
 
 static inline bool raid5_has_log(struct r5conf *conf)
@@ -108,7 +108,7 @@ static inline int log_handle_flush_request(struct r5conf *conf, struct bio *bio)
 	if (conf->log)
 		ret = r5l_handle_flush_request(conf, bio);
 	else if (raid5_has_ppl(conf))
-		ret = ppl_handle_flush_request(conf->log, bio);
+		ret = ppl_handle_flush_request(bio);
 
 	return ret;
 }
diff --git a/drivers/md/raid5-ppl.c b/drivers/md/raid5-ppl.c
index 973e2e06f19c..4f5bdb4cad2b 100644
--- a/drivers/md/raid5-ppl.c
+++ b/drivers/md/raid5-ppl.c
@@ -679,7 +679,7 @@ void ppl_quiesce(struct r5conf *conf, int quiesce)
 	}
 }
 
-int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio)
+int ppl_handle_flush_request(struct bio *bio)
 {
 	if (bio->bi_iter.bi_size == 0) {
 		bio_endio(bio);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 08/17] md/raid5-cache: Pass the log through to r5c_finish_cache_stripe()
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (6 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 07/17] md/raid5-ppl: Drop unused argument from ppl_handle_flush_request() Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 09/17] md/raid5-cache: Don't pass conf to r5c_calculate_new_cp() Logan Gunthorpe
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

r5c_finish_cache_stripe() dereferences conf->log, which will need an
rcu_read_lock(). But that is not necessary here as the log is already
available in the call sites through other means.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-cache.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 24110b687055..8284ce3e5cf6 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -507,10 +507,9 @@ static void r5c_handle_parity_cached(struct stripe_head *sh)
  * Setting proper flags after writing (or flushing) data and/or parity to the
  * log device. This is called from r5l_log_endio() or r5l_log_flush_endio().
  */
-static void r5c_finish_cache_stripe(struct stripe_head *sh)
+static void r5c_finish_cache_stripe(struct r5l_log *log,
+				    struct stripe_head *sh)
 {
-	struct r5l_log *log = sh->raid_conf->log;
-
 	if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH) {
 		BUG_ON(test_bit(STRIPE_R5C_CACHING, &sh->state));
 		/*
@@ -528,14 +527,14 @@ static void r5c_finish_cache_stripe(struct stripe_head *sh)
 	}
 }
 
-static void r5l_io_run_stripes(struct r5l_io_unit *io)
+static void r5l_io_run_stripes(struct r5l_log *log, struct r5l_io_unit *io)
 {
 	struct stripe_head *sh, *next;
 
 	list_for_each_entry_safe(sh, next, &io->stripe_list, log_list) {
 		list_del_init(&sh->log_list);
 
-		r5c_finish_cache_stripe(sh);
+		r5c_finish_cache_stripe(log, sh);
 
 		set_bit(STRIPE_HANDLE, &sh->state);
 		raid5_release_stripe(sh);
@@ -554,7 +553,7 @@ static void r5l_log_run_stripes(struct r5l_log *log)
 			break;
 
 		list_move_tail(&io->log_sibling, &log->finished_ios);
-		r5l_io_run_stripes(io);
+		r5l_io_run_stripes(log, io);
 	}
 }
 
@@ -1284,7 +1283,7 @@ static void r5l_log_flush_endio(struct bio *bio)
 
 	spin_lock_irqsave(&log->io_list_lock, flags);
 	list_for_each_entry(io, &log->flushing_ios, log_sibling)
-		r5l_io_run_stripes(io);
+		r5l_io_run_stripes(log, io);
 	list_splice_tail_init(&log->flushing_ios, &log->finished_ios);
 	spin_unlock_irqrestore(&log->io_list_lock, flags);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 09/17] md/raid5-cache: Don't pass conf to r5c_calculate_new_cp()
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (7 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 08/17] md/raid5-cache: Pass the log through to r5c_finish_cache_stripe() Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 10/17] md/raid5-cache: Take struct r5l_log in r5c_log_required_to_flush_cache() Logan Gunthorpe
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

r5c_calculate_new_cp() only uses conf to dereference the log which
both callers already have a pointer to and no longer need to
obtain a conf through a complicated dereference chain for this use.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-cache.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 8284ce3e5cf6..9a8f2a988b03 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1168,10 +1168,9 @@ static void r5l_run_no_space_stripes(struct r5l_log *log)
  * for write through mode, returns log->next_checkpoint
  * for write back, returns log_start of first sh in stripe_in_journal_list
  */
-static sector_t r5c_calculate_new_cp(struct r5conf *conf)
+static sector_t r5c_calculate_new_cp(struct r5l_log *log)
 {
 	struct stripe_head *sh;
-	struct r5l_log *log = conf->log;
 	sector_t new_cp;
 	unsigned long flags;
 
@@ -1179,12 +1178,12 @@ static sector_t r5c_calculate_new_cp(struct r5conf *conf)
 		return log->next_checkpoint;
 
 	spin_lock_irqsave(&log->stripe_in_journal_lock, flags);
-	if (list_empty(&conf->log->stripe_in_journal_list)) {
+	if (list_empty(&log->stripe_in_journal_list)) {
 		/* all stripes flushed */
 		spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
 		return log->next_checkpoint;
 	}
-	sh = list_first_entry(&conf->log->stripe_in_journal_list,
+	sh = list_first_entry(&log->stripe_in_journal_list,
 			      struct stripe_head, r5c);
 	new_cp = sh->log_start;
 	spin_unlock_irqrestore(&log->stripe_in_journal_lock, flags);
@@ -1193,10 +1192,8 @@ static sector_t r5c_calculate_new_cp(struct r5conf *conf)
 
 static sector_t r5l_reclaimable_space(struct r5l_log *log)
 {
-	struct r5conf *conf = log->rdev->mddev->private;
-
 	return r5l_ring_distance(log, log->last_checkpoint,
-				 r5c_calculate_new_cp(conf));
+				 r5c_calculate_new_cp(log));
 }
 
 static void r5l_run_no_mem_stripe(struct r5l_log *log)
@@ -1517,7 +1514,6 @@ static void r5c_do_reclaim(struct r5conf *conf)
 
 static void r5l_do_reclaim(struct r5l_log *log)
 {
-	struct r5conf *conf = log->rdev->mddev->private;
 	sector_t reclaim_target = xchg(&log->reclaim_target, 0);
 	sector_t reclaimable;
 	sector_t next_checkpoint;
@@ -1546,7 +1542,7 @@ static void r5l_do_reclaim(struct r5l_log *log)
 				    log->io_list_lock);
 	}
 
-	next_checkpoint = r5c_calculate_new_cp(conf);
+	next_checkpoint = r5c_calculate_new_cp(log);
 	spin_unlock_irq(&log->io_list_lock);
 
 	if (reclaimable == 0 || !write_super)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 10/17] md/raid5-cache: Take struct r5l_log in r5c_log_required_to_flush_cache()
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (8 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 09/17] md/raid5-cache: Don't pass conf to r5c_calculate_new_cp() Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 11/17] md/raid5: Ensure array is suspended for calls to log_exit() Logan Gunthorpe
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

The only call site of r5c_log_required_to_flush_cache() already has
a log pointer and has already checked that it is writeback. So the
dereference and writeback check is redundant. Just pass the log pointer.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-cache.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 9a8f2a988b03..a3c4d43d6deb 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -412,13 +412,9 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
  *     (stripe_in_journal_count) * (max_degraded + 1) +
  *     (group_cnt + 1) * (raid_disks - max_degraded)
  */
-static sector_t r5c_log_required_to_flush_cache(struct r5conf *conf)
+static sector_t r5c_log_required_to_flush_cache(struct r5conf *conf,
+						struct r5l_log *log)
 {
-	struct r5l_log *log = conf->log;
-
-	if (!r5c_is_writeback(log))
-		return 0;
-
 	return BLOCK_SECTORS *
 		((conf->max_degraded + 1) * atomic_read(&log->stripe_in_journal_count) +
 		 (conf->raid_disks - conf->max_degraded) * (conf->group_cnt + 1));
@@ -443,7 +439,7 @@ static inline void r5c_update_log_state(struct r5l_log *log)
 
 	free_space = r5l_ring_distance(log, log->log_start,
 				       log->last_checkpoint);
-	reclaim_space = r5c_log_required_to_flush_cache(conf);
+	reclaim_space = r5c_log_required_to_flush_cache(conf, log);
 	if (free_space < 2 * reclaim_space)
 		set_bit(R5C_LOG_CRITICAL, &conf->cache_state);
 	else {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 11/17] md/raid5: Ensure array is suspended for calls to log_exit()
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (9 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 10/17] md/raid5-cache: Take struct r5l_log in r5c_log_required_to_flush_cache() Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-26 16:35 ` [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h Logan Gunthorpe
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

The raid5-cache code relies on there being no IO in flight when
log_exit() is called. There are two places where this is not
guaranteed so add mddev_suspend() and mddev_resume() calls to these
sites.

The site in raid5_remove_disk() has a comment saying that it is
called in raid5d and thus cannot wait for pending writes; however that
does not appear to be correct anymore (if it ever was) as
raid5_remove_disk() is called from hot_remove_disk() which only
appears to be called in the md_ioctl(). Thus, the comment is removed,
as well as the racy check and replaced with calls to suspend/resume.

The site in raid5_change_consistency_policy() is in the error path,
and another similar call site already has suspend/resume calls just
below it; so it should be equally safe to make that change here.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 0d486f8aaf87..4b25a0262cb5 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7938,18 +7938,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
 
 	print_raid5_conf(conf);
 	if (test_bit(Journal, &rdev->flags) && conf->log) {
-		/*
-		 * we can't wait pending write here, as this is called in
-		 * raid5d, wait will deadlock.
-		 * neilb: there is no locking about new writes here,
-		 * so this cannot be safe.
-		 */
-		if (atomic_read(&conf->active_stripes) ||
-		    atomic_read(&conf->r5c_cached_full_stripes) ||
-		    atomic_read(&conf->r5c_cached_partial_stripes)) {
-			return -EBUSY;
-		}
+		mddev_suspend(mddev);
 		log_exit(conf);
+		mddev_resume(mddev);
 		return 0;
 	}
 	if (rdev == rcu_access_pointer(p->rdev))
@@ -8697,8 +8688,11 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 			err = log_init(conf, NULL, true);
 			if (!err) {
 				err = resize_stripes(conf, conf->pool_size);
-				if (err)
+				if (err) {
+					mddev_suspend(mddev);
 					log_exit(conf);
+					mddev_resume(mddev);
+				}
 			}
 		} else
 			err = -EINVAL;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (10 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 11/17] md/raid5: Ensure array is suspended for calls to log_exit() Logan Gunthorpe
@ 2022-05-26 16:35 ` Logan Gunthorpe
  2022-05-30  5:59   ` Christoph Hellwig
  2022-05-26 16:36 ` [PATCH v2 13/17] md/raid5-cache: Add RCU protection to conf->log accesses Logan Gunthorpe
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:35 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

Move struct r5l_log definition to raid5-log.h. While this reduces
encapsulation, it is necessary for the definition of r5l_log to be
public so that rcu_access_pointer() can be used on conf-log in the
next patch.

rcu_access_pointer(p) doesn't technically dereference the log pointer
however, it does use typeof(*p) and some older GCC versions (anything
earlier than gcc-10) will wrongly try to dereference the structure:

    include/linux/rcupdate.h:384:9: error: dereferencing pointer to
                 incomplete type ‘struct r5l_log’

      typeof(*p) *local = (typeof(*p) *__force)READ_ONCE(p); \
           ^

    include/linux/rcupdate.h:495:31: note: in expansion of
                  macro ‘__rcu_access_pointer’

       #define rcu_access_pointer(p) __rcu_access_pointer((p),
       __UNIQUE_ID(rcu), __rcu)

To prevent this, simply provide the definition where
rcu_access_pointer() may be used.

Reported-by: Donald Buczek <buczek@molgen.mpg.de>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/raid5-cache.c | 75 ----------------------------------------
 drivers/md/raid5-log.h   | 75 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+), 75 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index a3c4d43d6deb..6349cfaae2c8 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -79,81 +79,6 @@ static char *r5c_journal_mode_str[] = {"write-through",
  *	- return IO for pending writes
  */
 
-struct r5l_log {
-	struct md_rdev *rdev;
-
-	u32 uuid_checksum;
-
-	sector_t device_size;		/* log device size, round to
-					 * BLOCK_SECTORS */
-	sector_t max_free_space;	/* reclaim run if free space is at
-					 * this size */
-
-	sector_t last_checkpoint;	/* log tail. where recovery scan
-					 * starts from */
-	u64 last_cp_seq;		/* log tail sequence */
-
-	sector_t log_start;		/* log head. where new data appends */
-	u64 seq;			/* log head sequence */
-
-	sector_t next_checkpoint;
-
-	struct mutex io_mutex;
-	struct r5l_io_unit *current_io;	/* current io_unit accepting new data */
-
-	spinlock_t io_list_lock;
-	struct list_head running_ios;	/* io_units which are still running,
-					 * and have not yet been completely
-					 * written to the log */
-	struct list_head io_end_ios;	/* io_units which have been completely
-					 * written to the log but not yet written
-					 * to the RAID */
-	struct list_head flushing_ios;	/* io_units which are waiting for log
-					 * cache flush */
-	struct list_head finished_ios;	/* io_units which settle down in log disk */
-	struct bio flush_bio;
-
-	struct list_head no_mem_stripes;   /* pending stripes, -ENOMEM */
-
-	struct kmem_cache *io_kc;
-	mempool_t io_pool;
-	struct bio_set bs;
-	mempool_t meta_pool;
-
-	struct md_thread *reclaim_thread;
-	unsigned long reclaim_target;	/* number of space that need to be
-					 * reclaimed.  if it's 0, reclaim spaces
-					 * used by io_units which are in
-					 * IO_UNIT_STRIPE_END state (eg, reclaim
-					 * dones't wait for specific io_unit
-					 * switching to IO_UNIT_STRIPE_END
-					 * state) */
-	wait_queue_head_t iounit_wait;
-
-	struct list_head no_space_stripes; /* pending stripes, log has no space */
-	spinlock_t no_space_stripes_lock;
-
-	bool need_cache_flush;
-
-	/* for r5c_cache */
-	enum r5c_journal_mode r5c_journal_mode;
-
-	/* all stripes in r5cache, in the order of seq at sh->log_start */
-	struct list_head stripe_in_journal_list;
-
-	spinlock_t stripe_in_journal_lock;
-	atomic_t stripe_in_journal_count;
-
-	/* to submit async io_units, to fulfill ordering of flush */
-	struct work_struct deferred_io_work;
-	/* to disable write back during in degraded mode */
-	struct work_struct disable_writeback_work;
-
-	/* to for chunk_aligned_read in writeback mode, details below */
-	spinlock_t tree_lock;
-	struct radix_tree_root big_stripe_tree;
-};
-
 /*
  * Enable chunk_aligned_read() with write back cache.
  *
diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index 1c184fe20939..5948c41f1f2e 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -2,6 +2,81 @@
 #ifndef _RAID5_LOG_H
 #define _RAID5_LOG_H
 
+struct r5l_log {
+	struct md_rdev *rdev;
+
+	u32 uuid_checksum;
+
+	sector_t device_size;		/* log device size, round to
+					 * BLOCK_SECTORS */
+	sector_t max_free_space;	/* reclaim run if free space is at
+					 * this size */
+
+	sector_t last_checkpoint;	/* log tail. where recovery scan
+					 * starts from */
+	u64 last_cp_seq;		/* log tail sequence */
+
+	sector_t log_start;		/* log head. where new data appends */
+	u64 seq;			/* log head sequence */
+
+	sector_t next_checkpoint;
+
+	struct mutex io_mutex;
+	struct r5l_io_unit *current_io;	/* current io_unit accepting new data */
+
+	spinlock_t io_list_lock;
+	struct list_head running_ios;	/* io_units which are still running,
+					 * and have not yet been completely
+					 * written to the log */
+	struct list_head io_end_ios;	/* io_units which have been completely
+					 * written to the log but not yet written
+					 * to the RAID */
+	struct list_head flushing_ios;	/* io_units which are waiting for log
+					 * cache flush */
+	struct list_head finished_ios;	/* io_units which settle down in log disk */
+	struct bio flush_bio;
+
+	struct list_head no_mem_stripes;   /* pending stripes, -ENOMEM */
+
+	struct kmem_cache *io_kc;
+	mempool_t io_pool;
+	struct bio_set bs;
+	mempool_t meta_pool;
+
+	struct md_thread *reclaim_thread;
+	unsigned long reclaim_target;	/* number of space that need to be
+					 * reclaimed.  if it's 0, reclaim spaces
+					 * used by io_units which are in
+					 * IO_UNIT_STRIPE_END state (eg, reclaim
+					 * dones't wait for specific io_unit
+					 * switching to IO_UNIT_STRIPE_END
+					 * state) */
+	wait_queue_head_t iounit_wait;
+
+	struct list_head no_space_stripes; /* pending stripes, log has no space */
+	spinlock_t no_space_stripes_lock;
+
+	bool need_cache_flush;
+
+	/* for r5c_cache */
+	enum r5c_journal_mode r5c_journal_mode;
+
+	/* all stripes in r5cache, in the order of seq at sh->log_start */
+	struct list_head stripe_in_journal_list;
+
+	spinlock_t stripe_in_journal_lock;
+	atomic_t stripe_in_journal_count;
+
+	/* to submit async io_units, to fulfill ordering of flush */
+	struct work_struct deferred_io_work;
+	/* to disable write back during in degraded mode */
+	struct work_struct disable_writeback_work;
+
+	/* to for chunk_aligned_read in writeback mode, details below */
+	spinlock_t tree_lock;
+	struct radix_tree_root big_stripe_tree;
+};
+
 int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev);
 void r5l_exit_log(struct r5conf *conf);
 int r5l_write_stripe(struct r5conf *conf, struct stripe_head *head_sh);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 13/17] md/raid5-cache: Add RCU protection to conf->log accesses
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (11 preceding siblings ...)
  2022-05-26 16:35 ` [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h Logan Gunthorpe
@ 2022-05-26 16:36 ` Logan Gunthorpe
  2022-05-30  6:01   ` Christoph Hellwig
  2022-05-26 16:36 ` [PATCH v2 14/17] md/raid5-cache: Annotate pslot with __rcu notation Logan Gunthorpe
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:36 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

The mdadm test 21raid5cache randomly fails with NULL pointer accesses
of conf->log when run repeatedly. conf->log was sort of protected with
RCU, but most dereferences were not done with the correct functions.

Add rcu_read_locks(), rcu_dereference_protected() and rcu_access_pointers()
calls to the appropriate places and mark the pointer with __rcu.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/raid5-cache.c | 132 +++++++++++++++++++++++++++------------
 drivers/md/raid5-log.h   |  14 ++---
 drivers/md/raid5.c       |   4 +-
 drivers/md/raid5.h       |   2 +-
 4 files changed, 102 insertions(+), 50 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 6349cfaae2c8..7c56ee977c3a 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -180,7 +180,12 @@ static bool r5c_is_writeback(struct r5l_log *log)
 
 bool r5c_conf_is_writeback(struct r5conf *conf)
 {
-	return r5c_is_writeback(conf->log);
+	bool ret;
+
+	rcu_read_lock();
+	ret = r5c_is_writeback(rcu_dereference(conf->log));
+	rcu_read_unlock();
+	return ret;
 }
 
 static sector_t r5l_ring_add(struct r5l_log *log, sector_t start, sector_t inc)
@@ -266,12 +271,23 @@ static void __r5l_wake_reclaim(struct r5l_log *log, sector_t space)
 	md_wakeup_thread(log->reclaim_thread);
 }
 
+static struct r5l_log *get_log_for_io(struct r5conf *conf)
+{
+	/*
+	 * rcu_dereference_protected is safe because the array will be
+	 * quiesced before log_exit() so it can't be called while
+	 * an IO is in progress.
+	 */
+	return rcu_dereference_protected(conf->log, true);
+}
+
 /* Check whether we should flush some stripes to free up stripe cache */
 void r5c_check_stripe_cache_usage(struct r5conf *conf)
 {
+	struct r5l_log *log = get_log_for_io(conf);
 	int total_cached;
 
-	if (!r5c_is_writeback(conf->log))
+	if (!r5c_is_writeback(log))
 		return;
 
 	total_cached = atomic_read(&conf->r5c_cached_partial_stripes) +
@@ -287,7 +303,7 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
 	 */
 	if (total_cached > conf->min_nr_stripes * 1 / 2 ||
 	    atomic_read(&conf->empty_inactive_list_nr) > 0)
-		__r5l_wake_reclaim(conf->log, 0);
+		__r5l_wake_reclaim(log, 0);
 }
 
 /*
@@ -296,7 +312,9 @@ void r5c_check_stripe_cache_usage(struct r5conf *conf)
  */
 void r5c_check_cached_full_stripe(struct r5conf *conf)
 {
-	if (!r5c_is_writeback(conf->log))
+	struct r5l_log *log = get_log_for_io(conf);
+
+	if (!r5c_is_writeback(log))
 		return;
 
 	/*
@@ -306,7 +324,7 @@ void r5c_check_cached_full_stripe(struct r5conf *conf)
 	if (atomic_read(&conf->r5c_cached_full_stripes) >=
 	    min(R5C_FULL_STRIPE_FLUSH_BATCH(conf),
 		conf->chunk_sectors >> RAID5_STRIPE_SHIFT(conf)))
-		__r5l_wake_reclaim(conf->log, 0);
+		__r5l_wake_reclaim(log, 0);
 }
 
 /*
@@ -388,7 +406,7 @@ static inline void r5c_update_log_state(struct r5l_log *log)
 void r5c_make_stripe_write_out(struct stripe_head *sh)
 {
 	struct r5conf *conf = sh->raid_conf;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
 
 	BUG_ON(!r5c_is_writeback(log));
 
@@ -630,7 +648,7 @@ static void r5c_disable_writeback_async(struct work_struct *work)
 
 	/* wait superblock change before suspend */
 	wait_event(mddev->sb_wait,
-		   conf->log == NULL ||
+		   !rcu_access_pointer(conf->log) ||
 		   (!test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) &&
 		    (locked = mddev_trylock(mddev))));
 	if (locked) {
@@ -928,7 +946,7 @@ static inline void r5l_add_no_space_stripe(struct r5l_log *log,
  */
 int r5l_write_stripe(struct r5conf *conf, struct stripe_head *sh)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
 	int write_disks = 0;
 	int data_pages, parity_pages;
 	int reserve;
@@ -1026,7 +1044,8 @@ int r5l_write_stripe(struct r5conf *conf, struct stripe_head *sh)
 
 void r5l_write_stripe_run(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
+
 	if (!log)
 		return;
 	mutex_lock(&log->io_mutex);
@@ -1036,7 +1055,7 @@ void r5l_write_stripe_run(struct r5conf *conf)
 
 int r5l_handle_flush_request(struct r5conf *conf, struct bio *bio)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
 
 	if (log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_THROUGH) {
 		/*
@@ -1224,7 +1243,7 @@ static void r5l_log_flush_endio(struct bio *bio)
  */
 void r5l_flush_stripe_to_raid(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
 	bool do_flush;
 
 	if (!log || !log->need_cache_flush)
@@ -1339,7 +1358,7 @@ void r5c_flush_cache(struct r5conf *conf, int num)
 	struct stripe_head *sh, *next;
 
 	lockdep_assert_held(&conf->device_lock);
-	if (!conf->log)
+	if (!rcu_access_pointer(conf->log))
 		return;
 
 	count = 0;
@@ -1358,9 +1377,8 @@ void r5c_flush_cache(struct r5conf *conf, int num)
 	}
 }
 
-static void r5c_do_reclaim(struct r5conf *conf)
+static void r5c_do_reclaim(struct r5conf *conf, struct r5l_log *log)
 {
-	struct r5l_log *log = conf->log;
 	struct stripe_head *sh;
 	int count = 0;
 	unsigned long flags;
@@ -1488,22 +1506,28 @@ static void r5l_reclaim_thread(struct md_thread *thread)
 {
 	struct mddev *mddev = thread->mddev;
 	struct r5conf *conf = mddev->private;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log;
 
+	/*
+	 * This is safe, because the reclaim thread will be stopped before
+	 * the log is freed in log_exit()
+	 */
+	log = rcu_dereference_protected(conf->log, true);
 	if (!log)
 		return;
-	r5c_do_reclaim(conf);
+
+	r5c_do_reclaim(conf, log);
 	r5l_do_reclaim(log);
 }
 
 void r5l_wake_reclaim(struct r5conf *conf, sector_t space)
 {
-	__r5l_wake_reclaim(conf->log, space);
+	__r5l_wake_reclaim(get_log_for_io(conf), space);
 }
 
 void r5l_quiesce(struct r5conf *conf, int quiesce)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
 	struct mddev *mddev;
 
 	if (quiesce) {
@@ -2461,16 +2485,20 @@ static void r5l_write_super(struct r5l_log *log, sector_t cp)
 static ssize_t r5c_journal_mode_show(struct mddev *mddev, char *page)
 {
 	struct r5conf *conf;
-	int ret;
+	struct r5l_log *log;
+	int ret = 0;
 
 	spin_lock(&mddev->lock);
 	conf = mddev->private;
-	if (!conf || !conf->log) {
-		spin_unlock(&mddev->lock);
-		return 0;
-	}
+	if (!conf)
+		goto out_spin_unlock;
 
-	switch (conf->log->r5c_journal_mode) {
+	rcu_read_lock();
+	log = rcu_dereference(conf->log);
+	if (!log)
+		goto out;
+
+	switch (log->r5c_journal_mode) {
 	case R5C_JOURNAL_MODE_WRITE_THROUGH:
 		ret = snprintf(
 			page, PAGE_SIZE, "[%s] %s\n",
@@ -2486,6 +2514,10 @@ static ssize_t r5c_journal_mode_show(struct mddev *mddev, char *page)
 	default:
 		ret = 0;
 	}
+
+out:
+	rcu_read_unlock();
+out_spin_unlock:
 	spin_unlock(&mddev->lock);
 	return ret;
 }
@@ -2499,13 +2531,15 @@ static ssize_t r5c_journal_mode_show(struct mddev *mddev, char *page)
 int r5c_journal_mode_set(struct mddev *mddev, int mode)
 {
 	struct r5conf *conf;
+	struct r5l_log *log;
+	int ret = 0;
 
 	if (mode < R5C_JOURNAL_MODE_WRITE_THROUGH ||
 	    mode > R5C_JOURNAL_MODE_WRITE_BACK)
 		return -EINVAL;
 
 	conf = mddev->private;
-	if (!conf || !conf->log)
+	if (!conf || !rcu_access_pointer(conf->log))
 		return -ENODEV;
 
 	if (raid5_calc_degraded(conf) > 0 &&
@@ -2513,12 +2547,19 @@ int r5c_journal_mode_set(struct mddev *mddev, int mode)
 		return -EINVAL;
 
 	mddev_suspend(mddev);
-	conf->log->r5c_journal_mode = mode;
+	rcu_read_lock();
+	log = rcu_dereference(conf->log);
+	if (log) {
+		log->r5c_journal_mode = mode;
+		pr_debug("md/raid:%s: setting r5c cache mode to %d: %s\n",
+			 mdname(mddev), mode, r5c_journal_mode_str[mode]);
+	} else {
+		ret = -ENODEV;
+	}
+	rcu_read_unlock();
 	mddev_resume(mddev);
 
-	pr_debug("md/raid:%s: setting r5c cache mode to %d: %s\n",
-		 mdname(mddev), mode, r5c_journal_mode_str[mode]);
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(r5c_journal_mode_set);
 
@@ -2564,7 +2605,7 @@ int r5c_try_caching_write(struct r5conf *conf,
 			  struct stripe_head_state *s,
 			  int disks)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
 	int i;
 	struct r5dev *dev;
 	int to_cache = 0;
@@ -2731,7 +2772,7 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 				 struct stripe_head *sh,
 				 struct stripe_head_state *s)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
 	int i;
 	int do_wakeup = 0;
 	sector_t tree_index;
@@ -2814,7 +2855,7 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 
 int r5c_cache_data(struct r5conf *conf, struct stripe_head *sh)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
 	int pages = 0;
 	int reserve;
 	int i;
@@ -2870,7 +2911,7 @@ int r5c_cache_data(struct r5conf *conf, struct stripe_head *sh)
 /* check whether this big stripe is in write back cache. */
 bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log = get_log_for_io(conf);
 	sector_t tree_index;
 	void *slot;
 
@@ -2880,6 +2921,7 @@ bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect)
 	WARN_ON_ONCE(!rcu_read_lock_held());
 	tree_index = r5c_tree_index(conf, sect);
 	slot = radix_tree_lookup(&log->big_stripe_tree, tree_index);
+
 	return slot != NULL;
 }
 
@@ -2960,30 +3002,38 @@ static int r5l_load_log(struct r5l_log *log)
 
 int r5l_start(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log;
 	int ret;
 
+	log = rcu_dereference_protected(conf->log,
+			lockdep_is_held(&conf->mddev->reconfig_mutex));
 	if (!log)
 		return 0;
 
 	ret = r5l_load_log(log);
 	if (ret)
 		r5l_exit_log(conf);
+
 	return ret;
 }
 
 void r5c_update_on_rdev_error(struct mddev *mddev, struct md_rdev *rdev)
 {
 	struct r5conf *conf = mddev->private;
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log;
 
+	rcu_read_lock();
+	log = rcu_dereference(conf->log);
 	if (!log)
-		return;
+		goto out;
 
 	if ((raid5_calc_degraded(conf) > 0 ||
 	     test_bit(Journal, &rdev->flags)) &&
-	    conf->log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
+	    log->r5c_journal_mode == R5C_JOURNAL_MODE_WRITE_BACK)
 		schedule_work(&log->disable_writeback_work);
+
+out:
+	rcu_read_unlock();
 }
 
 int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
@@ -3091,15 +3141,17 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
 
 void r5l_exit_log(struct r5conf *conf)
 {
-	struct r5l_log *log = conf->log;
+	struct r5l_log *log;
 
-	conf->log = NULL;
-	synchronize_rcu();
+	lockdep_assert_held(&conf->mddev->reconfig_mutex);
+	log = rcu_replace_pointer(conf->log, NULL, 1);
 
 	/* Ensure disable_writeback_work wakes up and exits */
 	wake_up(&conf->mddev->sb_wait);
 	flush_work(&log->disable_writeback_work);
 	md_unregister_thread(&log->reclaim_thread);
+	synchronize_rcu();
+
 	mempool_exit(&log->meta_pool);
 	bioset_exit(&log->bs);
 	mempool_exit(&log->io_pool);
diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index 5948c41f1f2e..5b910e2d4279 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -133,7 +133,7 @@ static inline int log_stripe(struct stripe_head *sh, struct stripe_head_state *s
 {
 	struct r5conf *conf = sh->raid_conf;
 
-	if (conf->log) {
+	if (rcu_access_pointer(conf->log)) {
 		if (!test_bit(STRIPE_R5C_CACHING, &sh->state)) {
 			/* writing out phase */
 			if (s->waiting_extra_page)
@@ -154,7 +154,7 @@ static inline void log_stripe_write_finished(struct stripe_head *sh)
 {
 	struct r5conf *conf = sh->raid_conf;
 
-	if (conf->log)
+	if (rcu_access_pointer(conf->log))
 		r5l_stripe_write_finished(sh);
 	else if (raid5_has_ppl(conf))
 		ppl_stripe_write_finished(sh);
@@ -162,7 +162,7 @@ static inline void log_stripe_write_finished(struct stripe_head *sh)
 
 static inline void log_write_stripe_run(struct r5conf *conf)
 {
-	if (conf->log)
+	if (rcu_access_pointer(conf->log))
 		r5l_write_stripe_run(conf);
 	else if (raid5_has_ppl(conf))
 		ppl_write_stripe_run(conf);
@@ -170,7 +170,7 @@ static inline void log_write_stripe_run(struct r5conf *conf)
 
 static inline void log_flush_stripe_to_raid(struct r5conf *conf)
 {
-	if (conf->log)
+	if (rcu_access_pointer(conf->log))
 		r5l_flush_stripe_to_raid(conf);
 	else if (raid5_has_ppl(conf))
 		ppl_write_stripe_run(conf);
@@ -180,7 +180,7 @@ static inline int log_handle_flush_request(struct r5conf *conf, struct bio *bio)
 {
 	int ret = -ENODEV;
 
-	if (conf->log)
+	if (rcu_access_pointer(conf->log))
 		ret = r5l_handle_flush_request(conf, bio);
 	else if (raid5_has_ppl(conf))
 		ret = ppl_handle_flush_request(bio);
@@ -190,7 +190,7 @@ static inline int log_handle_flush_request(struct r5conf *conf, struct bio *bio)
 
 static inline void log_quiesce(struct r5conf *conf, int quiesce)
 {
-	if (conf->log)
+	if (rcu_access_pointer(conf->log))
 		r5l_quiesce(conf, quiesce);
 	else if (raid5_has_ppl(conf))
 		ppl_quiesce(conf, quiesce);
@@ -198,7 +198,7 @@ static inline void log_quiesce(struct r5conf *conf, int quiesce)
 
 static inline void log_exit(struct r5conf *conf)
 {
-	if (conf->log)
+	if (rcu_access_pointer(conf->log))
 		r5l_exit_log(conf);
 	else if (raid5_has_ppl(conf))
 		ppl_exit_log(conf);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 4b25a0262cb5..93b4b69650bb 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7937,7 +7937,7 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
 	struct md_rdev *tmp;
 
 	print_raid5_conf(conf);
-	if (test_bit(Journal, &rdev->flags) && conf->log) {
+	if (test_bit(Journal, &rdev->flags) && rcu_access_pointer(conf->log)) {
 		mddev_suspend(mddev);
 		log_exit(conf);
 		mddev_resume(mddev);
@@ -8019,7 +8019,7 @@ static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 	int last = conf->raid_disks - 1;
 
 	if (test_bit(Journal, &rdev->flags)) {
-		if (conf->log)
+		if (rcu_access_pointer(conf->log))
 			return -EBUSY;
 
 		rdev->raid_disk = 0;
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 638d29863503..04fe5b6f679c 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -684,7 +684,7 @@ struct r5conf {
 	struct r5worker_group	*worker_groups;
 	int			group_cnt;
 	int			worker_cnt_per_group;
-	struct r5l_log		*log;
+	struct r5l_log		__rcu *log;
 	void			*log_private;
 
 	spinlock_t		pending_bios_lock;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 14/17] md/raid5-cache: Annotate pslot with __rcu notation
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (12 preceding siblings ...)
  2022-05-26 16:36 ` [PATCH v2 13/17] md/raid5-cache: Add RCU protection to conf->log accesses Logan Gunthorpe
@ 2022-05-26 16:36 ` Logan Gunthorpe
  2022-05-26 16:36 ` [PATCH v2 15/17] md: Use enum for overloaded magic numbers used by mddev->curr_resync Logan Gunthorpe
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:36 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

radix_tree_lookup_slot() and radix_tree_replace_slot() API expect the
slot returned and looked up to be marked with __rcu. Otherwise
sparse warnings are generated:

  drivers/md/raid5-cache.c:2939:23: warning: incorrect type in
			assignment (different address spaces)
  drivers/md/raid5-cache.c:2939:23:    expected void **pslot
  drivers/md/raid5-cache.c:2939:23:    got void [noderef] __rcu **

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-cache.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 7c56ee977c3a..5ddfe29bc26a 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -2609,7 +2609,7 @@ int r5c_try_caching_write(struct r5conf *conf,
 	int i;
 	struct r5dev *dev;
 	int to_cache = 0;
-	void **pslot;
+	void __rcu **pslot;
 	sector_t tree_index;
 	int ret;
 	uintptr_t refcount;
@@ -2776,7 +2776,7 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 	int i;
 	int do_wakeup = 0;
 	sector_t tree_index;
-	void **pslot;
+	void __rcu **pslot;
 	uintptr_t refcount;
 
 	if (!log || !test_bit(R5_InJournal, &sh->dev[sh->pd_idx].flags))
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 15/17] md: Use enum for overloaded magic numbers used by mddev->curr_resync
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (13 preceding siblings ...)
  2022-05-26 16:36 ` [PATCH v2 14/17] md/raid5-cache: Annotate pslot with __rcu notation Logan Gunthorpe
@ 2022-05-26 16:36 ` Logan Gunthorpe
  2022-05-30  6:01   ` Christoph Hellwig
  2022-05-26 16:36 ` [PATCH v2 16/17] md: Ensure resync is reported after it starts Logan Gunthorpe
  2022-05-26 16:36 ` [PATCH v2 17/17] md: Notify sysfs sync_completed in md_reap_sync_thread() Logan Gunthorpe
  16 siblings, 1 reply; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:36 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

Comments in the code document special values used for
mddev->curr_resync. Make this clearer by using an enum to label these
values.

The only functional change is a couple places use the wrong comparison
operator that implied 3 is another special value. They are all
fixed to imply that 3 or greater is an active resync.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/md.c | 40 ++++++++++++++++++----------------------
 drivers/md/md.h | 15 +++++++++++++++
 2 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 8273ac5eef06..0893029865eb 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5001,7 +5001,7 @@ static ssize_t
 sync_speed_show(struct mddev *mddev, char *page)
 {
 	unsigned long resync, dt, db;
-	if (mddev->curr_resync == 0)
+	if (mddev->curr_resync == MD_RESYNC_NONE)
 		return sprintf(page, "none\n");
 	resync = mddev->curr_mark_cnt - atomic_read(&mddev->recovery_active);
 	dt = (jiffies - mddev->resync_mark) / HZ;
@@ -5020,8 +5020,8 @@ sync_completed_show(struct mddev *mddev, char *page)
 	if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
 		return sprintf(page, "none\n");
 
-	if (mddev->curr_resync == 1 ||
-	    mddev->curr_resync == 2)
+	if (mddev->curr_resync == MD_RESYNC_YIELDED ||
+	    mddev->curr_resync == MD_RESYNC_DELAYED)
 		return sprintf(page, "delayed\n");
 
 	if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ||
@@ -8018,7 +8018,7 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
 		max_sectors = mddev->dev_sectors;
 
 	resync = mddev->curr_resync;
-	if (resync <= 3) {
+	if (resync < MD_RESYNC_ACTIVE) {
 		if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
 			/* Still cleaning up */
 			resync = max_sectors;
@@ -8027,7 +8027,7 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
 	else
 		resync -= atomic_read(&mddev->recovery_active);
 
-	if (resync == 0) {
+	if (resync == MD_RESYNC_NONE) {
 		if (test_bit(MD_RESYNCING_REMOTE, &mddev->recovery)) {
 			struct md_rdev *rdev;
 
@@ -8051,7 +8051,7 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
 		}
 		return 0;
 	}
-	if (resync < 3) {
+	if (resync < MD_RESYNC_ACTIVE) {
 		seq_printf(seq, "\tresync=DELAYED");
 		return 1;
 	}
@@ -8729,13 +8729,7 @@ void md_do_sync(struct md_thread *thread)
 
 	mddev->last_sync_action = action ?: desc;
 
-	/* we overload curr_resync somewhat here.
-	 * 0 == not engaged in resync at all
-	 * 2 == checking that there is no conflict with another sync
-	 * 1 == like 2, but have yielded to allow conflicting resync to
-	 *		commence
-	 * other == active in resync - this many blocks
-	 *
+	/*
 	 * Before starting a resync we must have set curr_resync to
 	 * 2, and then checked that every "conflicting" array has curr_resync
 	 * less than ours.  When we find one that is the same or higher
@@ -8747,7 +8741,7 @@ void md_do_sync(struct md_thread *thread)
 
 	do {
 		int mddev2_minor = -1;
-		mddev->curr_resync = 2;
+		mddev->curr_resync = MD_RESYNC_DELAYED;
 
 	try_again:
 		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
@@ -8759,12 +8753,14 @@ void md_do_sync(struct md_thread *thread)
 			&&  mddev2->curr_resync
 			&&  match_mddev_units(mddev, mddev2)) {
 				DEFINE_WAIT(wq);
-				if (mddev < mddev2 && mddev->curr_resync == 2) {
+				if (mddev < mddev2 &&
+				    mddev->curr_resync == MD_RESYNC_DELAYED) {
 					/* arbitrarily yield */
-					mddev->curr_resync = 1;
+					mddev->curr_resync = MD_RESYNC_YIELDED;
 					wake_up(&resync_wait);
 				}
-				if (mddev > mddev2 && mddev->curr_resync == 1)
+				if (mddev > mddev2 &&
+				    mddev->curr_resync == MD_RESYNC_YIELDED)
 					/* no need to wait here, we can wait the next
 					 * time 'round when curr_resync == 2
 					 */
@@ -8792,7 +8788,7 @@ void md_do_sync(struct md_thread *thread)
 				finish_wait(&resync_wait, &wq);
 			}
 		}
-	} while (mddev->curr_resync < 2);
+	} while (mddev->curr_resync < MD_RESYNC_DELAYED);
 
 	j = 0;
 	if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
@@ -8876,7 +8872,7 @@ void md_do_sync(struct md_thread *thread)
 			 desc, mdname(mddev));
 		mddev->curr_resync = j;
 	} else
-		mddev->curr_resync = 3; /* no longer delayed */
+		mddev->curr_resync = MD_RESYNC_ACTIVE; /* no longer delayed */
 	mddev->curr_resync_completed = j;
 	sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	md_new_event();
@@ -9011,14 +9007,14 @@ void md_do_sync(struct md_thread *thread)
 
 	if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
 	    !test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
-	    mddev->curr_resync > 3) {
+	    mddev->curr_resync >= MD_RESYNC_ACTIVE) {
 		mddev->curr_resync_completed = mddev->curr_resync;
 		sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	}
 	mddev->pers->sync_request(mddev, max_sectors, &skipped);
 
 	if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
-	    mddev->curr_resync > 3) {
+	    mddev->curr_resync >= MD_RESYNC_ACTIVE) {
 		if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
 			if (test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
 				if (mddev->curr_resync >= mddev->recovery_cp) {
@@ -9082,7 +9078,7 @@ void md_do_sync(struct md_thread *thread)
 	} else if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
 		mddev->resync_min = mddev->curr_resync_completed;
 	set_bit(MD_RECOVERY_DONE, &mddev->recovery);
-	mddev->curr_resync = 0;
+	mddev->curr_resync = MD_RESYNC_NONE;
 	spin_unlock(&mddev->lock);
 
 	wake_up(&resync_wait);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 5f62c46ac2d3..2d06003a4c3f 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -288,6 +288,21 @@ struct serial_info {
 	sector_t _subtree_last; /* highest sector in subtree of rb node */
 };
 
+/*
+ * mddev->curr_resync stores the current sector of the resync but
+ * also has some overloaded values.
+ */
+enum {
+	/* No resync in progress */
+	MD_RESYNC_NONE = 0,
+	/* Yielded to allow another conflicting resync to commence */
+	MD_RESYNC_YIELDED = 1,
+	/* Delayed to check that there is no conflict with another sync */
+	MD_RESYNC_DELAYED = 2,
+	/* Any value greater than or equal to this is in an active resync */
+	MD_RESYNC_ACTIVE = 3,
+};
+
 struct mddev {
 	void				*private;
 	struct md_personality		*pers;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 16/17] md: Ensure resync is reported after it starts
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (14 preceding siblings ...)
  2022-05-26 16:36 ` [PATCH v2 15/17] md: Use enum for overloaded magic numbers used by mddev->curr_resync Logan Gunthorpe
@ 2022-05-26 16:36 ` Logan Gunthorpe
  2022-05-30  6:02   ` Christoph Hellwig
  2022-05-26 16:36 ` [PATCH v2 17/17] md: Notify sysfs sync_completed in md_reap_sync_thread() Logan Gunthorpe
  16 siblings, 1 reply; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:36 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

The 07layouts test in mdadm fails on some systems. The failure
presents itself as the backup file not being removed before the next
layout is grown into:

  mdadm: /dev/md0: cannot create backup file /tmp/md-test-backup:
      File exists

This is because the background mdadm process, which is responsible for
cleaning up this backup file gets into an infinite loop waiting for
the reshape to start. mdadm checks the mdstat file if a reshape is
going and, if it is not, it waits for an event on the file or times
out in 5 seconds. On faster machines, the reshape may complete before
the 5 seconds times out, and thus the background mdadm process loops
waiting for a reshape to start that has already occurred.

mdadm reads the mdstat file to start, but mdstat does not report that the
reshape has begun, even though it has indeed begun. So the mdstat_wait()
call (in mdadm) which polls on the mdstat file won't ever return until
timing out.

The reason mdstat reports the reshape has started is due to an issue
in status_resync(). recovery_active is subtracted from curr_resync which
will result in a value of zero for the first chunk of reshaped data, and
the resulting read will report no reshape in progress.

To fix this, if "resync - recovery_active" is an overloaded value, force
the value to be MD_RESYNC_ACTIVE so the code reports a resync in progress.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/md.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 0893029865eb..2be429874d18 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8022,10 +8022,20 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
 		if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
 			/* Still cleaning up */
 			resync = max_sectors;
-	} else if (resync > max_sectors)
+	} else if (resync > max_sectors) {
 		resync = max_sectors;
-	else
+	} else {
 		resync -= atomic_read(&mddev->recovery_active);
+		if (resync < MD_RESYNC_ACTIVE) {
+			/*
+			 * Resync has started, but the subtraction has
+			 * yielded one of the special values. Force it
+			 * to active to ensure the status reports an
+			 * active resync.
+			 */
+			resync = MD_RESYNC_ACTIVE;
+		}
+	}
 
 	if (resync == MD_RESYNC_NONE) {
 		if (test_bit(MD_RESYNCING_REMOTE, &mddev->recovery)) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH v2 17/17] md: Notify sysfs sync_completed in md_reap_sync_thread()
  2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (15 preceding siblings ...)
  2022-05-26 16:36 ` [PATCH v2 16/17] md: Ensure resync is reported after it starts Logan Gunthorpe
@ 2022-05-26 16:36 ` Logan Gunthorpe
  16 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-26 16:36 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

The mdadm test 07layouts randomly produces a kernel hung task deadlock.
The deadlock is caused by the suspend_lo/suspend_hi files being set by
the mdadm background process during reshape and not being cleared
because the process hangs. (Leaving aside the issue of the fragility of
freezing kernel tasks by buggy userspace processes...)

When the background mdadm process hangs it, is waiting (without a
timeout) on a change to the sync_completed file signalling that the
reshape has completed. The process is woken up a couple times when
the reshape finishes but it is woken up before MD_RECOVERY_RUNNING
is cleared so sync_completed_show() reports 0 instead of "none".

To fix this, notify the sysfs file in md_reap_sync_thread() after
MD_RECOVERY_RUNNING has been cleared. This wakes up mdadm and causes
it to continue and write to suspend_lo/suspend_hi to allow IO to
continue.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/md.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2be429874d18..2c07c9508222 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9476,6 +9476,7 @@ void md_reap_sync_thread(struct mddev *mddev, bool reconfig_mutex_held)
 	wake_up(&resync_wait);
 	/* flag recovery needed just to double check */
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
 	md_new_event();
 	if (mddev->event_work.func)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 02/17] md/raid5-cache: Add r5c_conf_is_writeback() helper
  2022-05-26 16:35 ` [PATCH v2 02/17] md/raid5-cache: Add r5c_conf_is_writeback() helper Logan Gunthorpe
@ 2022-05-30  5:56   ` Christoph Hellwig
  0 siblings, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2022-05-30  5:56 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-kernel, linux-raid, Song Liu, Christoph Hellwig,
	Donald Buczek, Guoqing Jiang, Xiao Ni, Stephen Bates,
	Martin Oliveira, David Sloan

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h
  2022-05-26 16:35 ` [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h Logan Gunthorpe
@ 2022-05-30  5:59   ` Christoph Hellwig
  2022-05-30 15:48     ` Logan Gunthorpe
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Hellwig @ 2022-05-30  5:59 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-kernel, linux-raid, Song Liu, Christoph Hellwig,
	Donald Buczek, Guoqing Jiang, Xiao Ni, Stephen Bates,
	Martin Oliveira, David Sloan

On Thu, May 26, 2022 at 10:35:59AM -0600, Logan Gunthorpe wrote:
> Move struct r5l_log definition to raid5-log.h. While this reduces
> encapsulation, it is necessary for the definition of r5l_log to be
> public so that rcu_access_pointer() can be used on conf-log in the
> next patch.
> 
> rcu_access_pointer(p) doesn't technically dereference the log pointer
> however, it does use typeof(*p) and some older GCC versions (anything
> earlier than gcc-10) will wrongly try to dereference the structure:
> 
>     include/linux/rcupdate.h:384:9: error: dereferencing pointer to
>                  incomplete type ‘struct r5l_log’
> 
>       typeof(*p) *local = (typeof(*p) *__force)READ_ONCE(p); \
>            ^
> 
>     include/linux/rcupdate.h:495:31: note: in expansion of
>                   macro ‘__rcu_access_pointer’
> 
>        #define rcu_access_pointer(p) __rcu_access_pointer((p),
>        __UNIQUE_ID(rcu), __rcu)
> 
> To prevent this, simply provide the definition where
> rcu_access_pointer() may be used.

What about just moving any code that does the rcu_access_pointer on
conf->log to raid5-cache.c and doing an out of line call for it
instead?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 13/17] md/raid5-cache: Add RCU protection to conf->log accesses
  2022-05-26 16:36 ` [PATCH v2 13/17] md/raid5-cache: Add RCU protection to conf->log accesses Logan Gunthorpe
@ 2022-05-30  6:01   ` Christoph Hellwig
  2022-05-30 15:57     ` Logan Gunthorpe
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Hellwig @ 2022-05-30  6:01 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-kernel, linux-raid, Song Liu, Christoph Hellwig,
	Donald Buczek, Guoqing Jiang, Xiao Ni, Stephen Bates,
	Martin Oliveira, David Sloan

On Thu, May 26, 2022 at 10:36:00AM -0600, Logan Gunthorpe wrote:
> The mdadm test 21raid5cache randomly fails with NULL pointer accesses
> of conf->log when run repeatedly. conf->log was sort of protected with
> RCU, but most dereferences were not done with the correct functions.
> 
> Add rcu_read_locks(), rcu_dereference_protected() and rcu_access_pointers()
> calls to the appropriate places and mark the pointer with __rcu.

Looking at the code a bit more, is this really enough?  Calls to
r5c_is_writeback / r5c_confi_is_writeback are sprinkled all over the
code, and my gut feeling is the value is not expected to change over
way longer critical sections than this.  So maybe the answer here is to
fix up the release to be properly locked as it only affects the non-I/O
slow path anyway.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 15/17] md: Use enum for overloaded magic numbers used by mddev->curr_resync
  2022-05-26 16:36 ` [PATCH v2 15/17] md: Use enum for overloaded magic numbers used by mddev->curr_resync Logan Gunthorpe
@ 2022-05-30  6:01   ` Christoph Hellwig
  2022-05-30 15:43     ` Logan Gunthorpe
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Hellwig @ 2022-05-30  6:01 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-kernel, linux-raid, Song Liu, Christoph Hellwig,
	Donald Buczek, Guoqing Jiang, Xiao Ni, Stephen Bates,
	Martin Oliveira, David Sloan

On Thu, May 26, 2022 at 10:36:02AM -0600, Logan Gunthorpe wrote:
> Comments in the code document special values used for
> mddev->curr_resync. Make this clearer by using an enum to label these
> values.
> 
> The only functional change is a couple places use the wrong comparison
> operator that implied 3 is another special value. They are all
> fixed to imply that 3 or greater is an active resync.

This already looks good, but shouldn't the curr_resync also be changed
to the new enum type?

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 16/17] md: Ensure resync is reported after it starts
  2022-05-26 16:36 ` [PATCH v2 16/17] md: Ensure resync is reported after it starts Logan Gunthorpe
@ 2022-05-30  6:02   ` Christoph Hellwig
  0 siblings, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2022-05-30  6:02 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-kernel, linux-raid, Song Liu, Christoph Hellwig,
	Donald Buczek, Guoqing Jiang, Xiao Ni, Stephen Bates,
	Martin Oliveira, David Sloan

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 15/17] md: Use enum for overloaded magic numbers used by mddev->curr_resync
  2022-05-30  6:01   ` Christoph Hellwig
@ 2022-05-30 15:43     ` Logan Gunthorpe
  0 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-30 15:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-raid, Song Liu, Donald Buczek, Guoqing Jiang,
	Xiao Ni, Stephen Bates, Martin Oliveira, David Sloan



On 2022-05-30 00:01, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 10:36:02AM -0600, Logan Gunthorpe wrote:
>> Comments in the code document special values used for
>> mddev->curr_resync. Make this clearer by using an enum to label these
>> values.
>>
>> The only functional change is a couple places use the wrong comparison
>> operator that implied 3 is another special value. They are all
>> fixed to imply that 3 or greater is an active resync.
> 
> This already looks good, but shouldn't the curr_resync also be changed
> to the new enum type?

I thought not, seeing curr_resync actually represents a sector_t except
for these handful of special values.
 Thanks,

Logan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h
  2022-05-30  5:59   ` Christoph Hellwig
@ 2022-05-30 15:48     ` Logan Gunthorpe
  2022-06-01 22:36       ` Song Liu
  0 siblings, 1 reply; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-30 15:48 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-raid, Song Liu, Donald Buczek, Guoqing Jiang,
	Xiao Ni, Stephen Bates, Martin Oliveira, David Sloan



On 2022-05-29 23:59, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 10:35:59AM -0600, Logan Gunthorpe wrote:
>> Move struct r5l_log definition to raid5-log.h. While this reduces
>> encapsulation, it is necessary for the definition of r5l_log to be
>> public so that rcu_access_pointer() can be used on conf-log in the
>> next patch.
>>
>> rcu_access_pointer(p) doesn't technically dereference the log pointer
>> however, it does use typeof(*p) and some older GCC versions (anything
>> earlier than gcc-10) will wrongly try to dereference the structure:
>>
>>     include/linux/rcupdate.h:384:9: error: dereferencing pointer to
>>                  incomplete type ‘struct r5l_log’
>>
>>       typeof(*p) *local = (typeof(*p) *__force)READ_ONCE(p); \
>>            ^
>>
>>     include/linux/rcupdate.h:495:31: note: in expansion of
>>                   macro ‘__rcu_access_pointer’
>>
>>        #define rcu_access_pointer(p) __rcu_access_pointer((p),
>>        __UNIQUE_ID(rcu), __rcu)
>>
>> To prevent this, simply provide the definition where
>> rcu_access_pointer() may be used.
> 
> What about just moving any code that does the rcu_access_pointer on
> conf->log to raid5-cache.c and doing an out of line call for it
> instead?

I guess we could do that. All the inline functions in raid5-log.h are
there to choose between the r5l or the ppl implementaiton. So it that
would mean the r5l implementation would probably be inlined and ppl
would be doing a second out of line call. Not sure if that matters, but
it seems a little odd.

Logan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 13/17] md/raid5-cache: Add RCU protection to conf->log accesses
  2022-05-30  6:01   ` Christoph Hellwig
@ 2022-05-30 15:57     ` Logan Gunthorpe
  0 siblings, 0 replies; 29+ messages in thread
From: Logan Gunthorpe @ 2022-05-30 15:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-raid, Song Liu, Donald Buczek, Guoqing Jiang,
	Xiao Ni, Stephen Bates, Martin Oliveira, David Sloan



On 2022-05-30 00:01, Christoph Hellwig wrote:
> On Thu, May 26, 2022 at 10:36:00AM -0600, Logan Gunthorpe wrote:
>> The mdadm test 21raid5cache randomly fails with NULL pointer accesses
>> of conf->log when run repeatedly. conf->log was sort of protected with
>> RCU, but most dereferences were not done with the correct functions.
>>
>> Add rcu_read_locks(), rcu_dereference_protected() and rcu_access_pointers()
>> calls to the appropriate places and mark the pointer with __rcu.
> 
> Looking at the code a bit more, is this really enough?  Calls to
> r5c_is_writeback / r5c_confi_is_writeback are sprinkled all over the
> code, and my gut feeling is the value is not expected to change over
> way longer critical sections than this.  So maybe the answer here is to
> fix up the release to be properly locked as it only affects the non-I/O
> slow path anyway.

Yeah, I think your gut feeling is correct. It looks like all the
is_writeback calls are in the IO path as well. I'll review this again
and see if we can just replace the RCU stuff and the paths that were
hitting NULL pointer deference with the taking of a lock.

Logan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h
  2022-05-30 15:48     ` Logan Gunthorpe
@ 2022-06-01 22:36       ` Song Liu
  2022-06-01 22:42         ` Logan Gunthorpe
  0 siblings, 1 reply; 29+ messages in thread
From: Song Liu @ 2022-06-01 22:36 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Christoph Hellwig, open list, linux-raid, Donald Buczek,
	Guoqing Jiang, Xiao Ni, Stephen Bates, Martin Oliveira,
	David Sloan

On Mon, May 30, 2022 at 8:48 AM Logan Gunthorpe <logang@deltatee.com> wrote:
>
>
>
> On 2022-05-29 23:59, Christoph Hellwig wrote:
> > On Thu, May 26, 2022 at 10:35:59AM -0600, Logan Gunthorpe wrote:
> >> Move struct r5l_log definition to raid5-log.h. While this reduces
> >> encapsulation, it is necessary for the definition of r5l_log to be
> >> public so that rcu_access_pointer() can be used on conf-log in the
> >> next patch.
> >>
> >> rcu_access_pointer(p) doesn't technically dereference the log pointer
> >> however, it does use typeof(*p) and some older GCC versions (anything
> >> earlier than gcc-10) will wrongly try to dereference the structure:
> >>
> >>     include/linux/rcupdate.h:384:9: error: dereferencing pointer to
> >>                  incomplete type ‘struct r5l_log’
> >>
> >>       typeof(*p) *local = (typeof(*p) *__force)READ_ONCE(p); \
> >>            ^
> >>
> >>     include/linux/rcupdate.h:495:31: note: in expansion of
> >>                   macro ‘__rcu_access_pointer’
> >>
> >>        #define rcu_access_pointer(p) __rcu_access_pointer((p),
> >>        __UNIQUE_ID(rcu), __rcu)
> >>
> >> To prevent this, simply provide the definition where
> >> rcu_access_pointer() may be used.
> >
> > What about just moving any code that does the rcu_access_pointer on
> > conf->log to raid5-cache.c and doing an out of line call for it
> > instead?
>
> I guess we could do that. All the inline functions in raid5-log.h are
> there to choose between the r5l or the ppl implementaiton. So it that
> would mean the r5l implementation would probably be inlined and ppl
> would be doing a second out of line call. Not sure if that matters, but
> it seems a little odd.

I like the current version better. raid5-log.h is not used in many files anyway.

Thanks,
Song

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h
  2022-06-01 22:36       ` Song Liu
@ 2022-06-01 22:42         ` Logan Gunthorpe
  2022-06-01 22:50           ` Song Liu
  0 siblings, 1 reply; 29+ messages in thread
From: Logan Gunthorpe @ 2022-06-01 22:42 UTC (permalink / raw)
  To: Song Liu
  Cc: Christoph Hellwig, open list, linux-raid, Donald Buczek,
	Guoqing Jiang, Xiao Ni, Stephen Bates, Martin Oliveira,
	David Sloan



On 2022-06-01 16:36, Song Liu wrote:
>> I guess we could do that. All the inline functions in raid5-log.h are
>> there to choose between the r5l or the ppl implementaiton. So it that
>> would mean the r5l implementation would probably be inlined and ppl
>> would be doing a second out of line call. Not sure if that matters, but
>> it seems a little odd.
> 
> I like the current version better. raid5-log.h is not used in many files anyway.

It's a moot point now. v3 will follow Christoph's other feedback which
essentially removes the RCU access altogether and adds an appropriate
lock in a couple places.

Logan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h
  2022-06-01 22:42         ` Logan Gunthorpe
@ 2022-06-01 22:50           ` Song Liu
  0 siblings, 0 replies; 29+ messages in thread
From: Song Liu @ 2022-06-01 22:50 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Christoph Hellwig, open list, linux-raid, Donald Buczek,
	Guoqing Jiang, Xiao Ni, Stephen Bates, Martin Oliveira,
	David Sloan

On Wed, Jun 1, 2022 at 3:42 PM Logan Gunthorpe <logang@deltatee.com> wrote:
>
>
>
> On 2022-06-01 16:36, Song Liu wrote:
> >> I guess we could do that. All the inline functions in raid5-log.h are
> >> there to choose between the r5l or the ppl implementaiton. So it that
> >> would mean the r5l implementation would probably be inlined and ppl
> >> would be doing a second out of line call. Not sure if that matters, but
> >> it seems a little odd.
> >
> > I like the current version better. raid5-log.h is not used in many files anyway.
>
> It's a moot point now. v3 will follow Christoph's other feedback which
> essentially removes the RCU access altogether and adds an appropriate
> lock in a couple places.

Ah, that's right. Thanks for pointing it out.

Song

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2022-06-01 22:50 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-26 16:35 [PATCH v2 00/17] Bug fixes for mdadm tests Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 01/17] md/raid5-log: Drop extern decorators for function prototypes Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 02/17] md/raid5-cache: Add r5c_conf_is_writeback() helper Logan Gunthorpe
2022-05-30  5:56   ` Christoph Hellwig
2022-05-26 16:35 ` [PATCH v2 03/17] md/raid5-cache: Refactor r5l_start() to take a struct r5conf Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 04/17] md/raid5-cache: Refactor r5l_flush_stripe_to_raid() " Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 05/17] md/raid5-cache: Refactor r5l_wake_reclaim() " Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 06/17] md/raid5-cache: Refactor remaining functions to take a r5conf Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 07/17] md/raid5-ppl: Drop unused argument from ppl_handle_flush_request() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 08/17] md/raid5-cache: Pass the log through to r5c_finish_cache_stripe() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 09/17] md/raid5-cache: Don't pass conf to r5c_calculate_new_cp() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 10/17] md/raid5-cache: Take struct r5l_log in r5c_log_required_to_flush_cache() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 11/17] md/raid5: Ensure array is suspended for calls to log_exit() Logan Gunthorpe
2022-05-26 16:35 ` [PATCH v2 12/17] md/raid5-cache: Move struct r5l_log definition to raid5-log.h Logan Gunthorpe
2022-05-30  5:59   ` Christoph Hellwig
2022-05-30 15:48     ` Logan Gunthorpe
2022-06-01 22:36       ` Song Liu
2022-06-01 22:42         ` Logan Gunthorpe
2022-06-01 22:50           ` Song Liu
2022-05-26 16:36 ` [PATCH v2 13/17] md/raid5-cache: Add RCU protection to conf->log accesses Logan Gunthorpe
2022-05-30  6:01   ` Christoph Hellwig
2022-05-30 15:57     ` Logan Gunthorpe
2022-05-26 16:36 ` [PATCH v2 14/17] md/raid5-cache: Annotate pslot with __rcu notation Logan Gunthorpe
2022-05-26 16:36 ` [PATCH v2 15/17] md: Use enum for overloaded magic numbers used by mddev->curr_resync Logan Gunthorpe
2022-05-30  6:01   ` Christoph Hellwig
2022-05-30 15:43     ` Logan Gunthorpe
2022-05-26 16:36 ` [PATCH v2 16/17] md: Ensure resync is reported after it starts Logan Gunthorpe
2022-05-30  6:02   ` Christoph Hellwig
2022-05-26 16:36 ` [PATCH v2 17/17] md: Notify sysfs sync_completed in md_reap_sync_thread() Logan Gunthorpe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.