All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 00/11] Bug fixes for mdadm tests
@ 2022-06-02 18:18 Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 01/11] md/raid5-log: Drop extern decorators for function prototypes Logan Gunthorpe
                   ` (10 more replies)
  0 siblings, 11 replies; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

Hi,

This is the updated series with the feedback received in v2[1].

This series includes fixes to fix all the kernel panics in the mdadm
tests and some, related, sparse issues, plus some cleanup.

The first two patches are cleanup from the original series that aren't
related now but I thought are worth including.

Patches 3 through 6 fix bugs with conf->log and remove the single,
unecessary, RCU access. This cleans up some sparse errors.

Patch 7 cleans up some sparse warnings with pslot usage.

Patch 8 is a cleanup which adds an enum so that patch 9 can
fix an mdadm hang. Patch 10 also fixes an mdadm hang.

I've also included Patch 11 in this series which fixes a recent
mistake in raid5-ppl that was reported by 0day which I don't think
has been fixed yet.

This series will be followed by another series for mdadm which fixes
the segfaults and annotates some failing tests to make mdadm tests
runnable fairly reliably, but I'll wait for a stable hash for this
series to note the kernel version tested against. Following that,
v3 of my lock contention series will be sent with more confidence
of its correctness.

This series is based on the current md/md-next branch as of yesterday
(42b805af10). A git branch is available here:

  https://github.com/sbates130272/linux-p2pmem md-bug_v3

Thanks,

Logan

--

Changes since v2:
  * Rework the RCU changes to remove the RCU usage instead of using
    it every. This makes more sense seeing most accesses do not need
    RCU due to them being on the IO path, or with mddev_lock() held
    and the remaining ones are on the slow path and may use
    mddev_lock(). (Per Christoph)
  * Collect a couple more Reviewed-by tags from Christoph

Changes since v1:
  * Add a patch to move the struct r5l_log to raid5-log.h in order
    to fix a compiler error with rcu_access_pointer() in versions
    prior to gcc-10
  * Rework r5c_is_writeback() changes to make less churn (per Christoph)
  * Change some 1s to trues in rcu_dereference_protected calls (per
    Christoph)
  * Fix an odd hunk mistake in the RCU protection patch (per Christoph)
  * Fix an inverted conditional (noticed by Donald)
  * Add a patch to add an enum for the overloaded values used by
    mddev->curr_resync to make the status_resync() fixes clearer
    (per Christoph)

--

[1] https://lore.kernel.org/all/20220526163604.32736-1-logang@deltatee.com

Logan Gunthorpe (11):
  md/raid5-log: Drop extern decorators for function prototypes
  md/raid5-ppl: Drop unused argument from ppl_handle_flush_request()
  md/raid5: Ensure array is suspended for calls to log_exit()
  md/raid5-cache: Take mddev_lock in r5c_journal_mode_show()
  md/raid5-cache: Drop RCU usage of conf->log
  md/raid5-cache: Clear conf->log after finishing work
  md/raid5-cache: Annotate pslot with __rcu notation
  md: Use enum for overloaded magic numbers used by mddev->curr_resync
  md: Ensure resync is reported after it starts
  md: Notify sysfs sync_completed in md_reap_sync_thread()
  md/raid5-ppl: Fix argument order in bio_alloc_bioset()

 drivers/md/md.c          | 55 +++++++++++++++-------------
 drivers/md/md.h          | 15 ++++++++
 drivers/md/raid5-cache.c | 34 +++++++++---------
 drivers/md/raid5-log.h   | 77 +++++++++++++++++++---------------------
 drivers/md/raid5-ppl.c   |  6 ++--
 drivers/md/raid5.c       | 18 ++++------
 6 files changed, 109 insertions(+), 96 deletions(-)


base-commit: 42b805af102471f53e3c7867b8c2b502ea4eef7e
--
2.30.2

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 01/11] md/raid5-log: Drop extern decorators for function prototypes
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 02/11] md/raid5-ppl: Drop unused argument from ppl_handle_flush_request() Logan Gunthorpe
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

extern is not necessary and recommended against when defining prototype
functions in headers. checkpatch.pl complains about these. So remove
them.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-log.h | 75 ++++++++++++++++++++----------------------
 1 file changed, 36 insertions(+), 39 deletions(-)

diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index 43c714a8798c..270ced4f770f 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -2,49 +2,46 @@
 #ifndef _RAID5_LOG_H
 #define _RAID5_LOG_H
 
-extern int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev);
-extern void r5l_exit_log(struct r5conf *conf);
-extern int r5l_write_stripe(struct r5l_log *log, struct stripe_head *head_sh);
-extern void r5l_write_stripe_run(struct r5l_log *log);
-extern void r5l_flush_stripe_to_raid(struct r5l_log *log);
-extern void r5l_stripe_write_finished(struct stripe_head *sh);
-extern int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
-extern void r5l_quiesce(struct r5l_log *log, int quiesce);
-extern bool r5l_log_disk_error(struct r5conf *conf);
-extern bool r5c_is_writeback(struct r5l_log *log);
-extern int
-r5c_try_caching_write(struct r5conf *conf, struct stripe_head *sh,
-		      struct stripe_head_state *s, int disks);
-extern void
-r5c_finish_stripe_write_out(struct r5conf *conf, struct stripe_head *sh,
-			    struct stripe_head_state *s);
-extern void r5c_release_extra_page(struct stripe_head *sh);
-extern void r5c_use_extra_page(struct stripe_head *sh);
-extern void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
-extern void r5c_handle_cached_data_endio(struct r5conf *conf,
-	struct stripe_head *sh, int disks);
-extern int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh);
-extern void r5c_make_stripe_write_out(struct stripe_head *sh);
-extern void r5c_flush_cache(struct r5conf *conf, int num);
-extern void r5c_check_stripe_cache_usage(struct r5conf *conf);
-extern void r5c_check_cached_full_stripe(struct r5conf *conf);
+int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev);
+void r5l_exit_log(struct r5conf *conf);
+int r5l_write_stripe(struct r5l_log *log, struct stripe_head *head_sh);
+void r5l_write_stripe_run(struct r5l_log *log);
+void r5l_flush_stripe_to_raid(struct r5l_log *log);
+void r5l_stripe_write_finished(struct stripe_head *sh);
+int r5l_handle_flush_request(struct r5l_log *log, struct bio *bio);
+void r5l_quiesce(struct r5l_log *log, int quiesce);
+bool r5l_log_disk_error(struct r5conf *conf);
+bool r5c_is_writeback(struct r5l_log *log);
+int r5c_try_caching_write(struct r5conf *conf, struct stripe_head *sh,
+			  struct stripe_head_state *s, int disks);
+void r5c_finish_stripe_write_out(struct r5conf *conf, struct stripe_head *sh,
+				 struct stripe_head_state *s);
+void r5c_release_extra_page(struct stripe_head *sh);
+void r5c_use_extra_page(struct stripe_head *sh);
+void r5l_wake_reclaim(struct r5l_log *log, sector_t space);
+void r5c_handle_cached_data_endio(struct r5conf *conf,
+				  struct stripe_head *sh, int disks);
+int r5c_cache_data(struct r5l_log *log, struct stripe_head *sh);
+void r5c_make_stripe_write_out(struct stripe_head *sh);
+void r5c_flush_cache(struct r5conf *conf, int num);
+void r5c_check_stripe_cache_usage(struct r5conf *conf);
+void r5c_check_cached_full_stripe(struct r5conf *conf);
 extern struct md_sysfs_entry r5c_journal_mode;
-extern void r5c_update_on_rdev_error(struct mddev *mddev,
-				     struct md_rdev *rdev);
-extern bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect);
-extern int r5l_start(struct r5l_log *log);
+void r5c_update_on_rdev_error(struct mddev *mddev, struct md_rdev *rdev);
+bool r5c_big_stripe_cached(struct r5conf *conf, sector_t sect);
+int r5l_start(struct r5l_log *log);
 
-extern struct dma_async_tx_descriptor *
+struct dma_async_tx_descriptor *
 ops_run_partial_parity(struct stripe_head *sh, struct raid5_percpu *percpu,
 		       struct dma_async_tx_descriptor *tx);
-extern int ppl_init_log(struct r5conf *conf);
-extern void ppl_exit_log(struct r5conf *conf);
-extern int ppl_write_stripe(struct r5conf *conf, struct stripe_head *sh);
-extern void ppl_write_stripe_run(struct r5conf *conf);
-extern void ppl_stripe_write_finished(struct stripe_head *sh);
-extern int ppl_modify_log(struct r5conf *conf, struct md_rdev *rdev, bool add);
-extern void ppl_quiesce(struct r5conf *conf, int quiesce);
-extern int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio);
+int ppl_init_log(struct r5conf *conf);
+void ppl_exit_log(struct r5conf *conf);
+int ppl_write_stripe(struct r5conf *conf, struct stripe_head *sh);
+void ppl_write_stripe_run(struct r5conf *conf);
+void ppl_stripe_write_finished(struct stripe_head *sh);
+int ppl_modify_log(struct r5conf *conf, struct md_rdev *rdev, bool add);
+void ppl_quiesce(struct r5conf *conf, int quiesce);
+int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio);
 extern struct md_sysfs_entry ppl_write_hint;
 
 static inline bool raid5_has_log(struct r5conf *conf)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 02/11] md/raid5-ppl: Drop unused argument from ppl_handle_flush_request()
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 01/11] md/raid5-log: Drop extern decorators for function prototypes Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 03/11] md/raid5: Ensure array is suspended for calls to log_exit() Logan Gunthorpe
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

ppl_handle_flush_request() takes an struct r5log argument but doesn't
use it. It has no buisiness taking this argument as it is only used
by raid5-cache and has no way to derference it anyway. Remove
the argument.

No functional changes intended.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-log.h | 4 ++--
 drivers/md/raid5-ppl.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index 270ced4f770f..c8332502669e 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -41,7 +41,7 @@ void ppl_write_stripe_run(struct r5conf *conf);
 void ppl_stripe_write_finished(struct stripe_head *sh);
 int ppl_modify_log(struct r5conf *conf, struct md_rdev *rdev, bool add);
 void ppl_quiesce(struct r5conf *conf, int quiesce);
-int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio);
+int ppl_handle_flush_request(struct bio *bio);
 extern struct md_sysfs_entry ppl_write_hint;
 
 static inline bool raid5_has_log(struct r5conf *conf)
@@ -108,7 +108,7 @@ static inline int log_handle_flush_request(struct r5conf *conf, struct bio *bio)
 	if (conf->log)
 		ret = r5l_handle_flush_request(conf->log, bio);
 	else if (raid5_has_ppl(conf))
-		ret = ppl_handle_flush_request(conf->log, bio);
+		ret = ppl_handle_flush_request(bio);
 
 	return ret;
 }
diff --git a/drivers/md/raid5-ppl.c b/drivers/md/raid5-ppl.c
index 973e2e06f19c..4f5bdb4cad2b 100644
--- a/drivers/md/raid5-ppl.c
+++ b/drivers/md/raid5-ppl.c
@@ -679,7 +679,7 @@ void ppl_quiesce(struct r5conf *conf, int quiesce)
 	}
 }
 
-int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio)
+int ppl_handle_flush_request(struct bio *bio)
 {
 	if (bio->bi_iter.bi_size == 0) {
 		bio_endio(bio);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 03/11] md/raid5: Ensure array is suspended for calls to log_exit()
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 01/11] md/raid5-log: Drop extern decorators for function prototypes Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 02/11] md/raid5-ppl: Drop unused argument from ppl_handle_flush_request() Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 04/11] md/raid5-cache: Take mddev_lock in r5c_journal_mode_show() Logan Gunthorpe
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

The raid5-cache code relies on there being no IO in flight when
log_exit() is called. There are two places where this is not
guaranteed so add mddev_suspend() and mddev_resume() calls to these
sites.

The site in raid5_remove_disk() has a comment saying that it is
called in raid5d and thus cannot wait for pending writes; however that
does not appear to be correct anymore (if it ever was) as
raid5_remove_disk() is called from hot_remove_disk() which only
appears to be called in the md_ioctl(). Thus, the comment is removed,
as well as the racy check and replaced with calls to suspend/resume.

The site in raid5_change_consistency_policy() is in the error path,
and another similar call site already has suspend/resume calls just
below it; so it should be equally safe to make that change here.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 5d09256d7f81..3ad37dd4c5cd 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7938,18 +7938,9 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
 
 	print_raid5_conf(conf);
 	if (test_bit(Journal, &rdev->flags) && conf->log) {
-		/*
-		 * we can't wait pending write here, as this is called in
-		 * raid5d, wait will deadlock.
-		 * neilb: there is no locking about new writes here,
-		 * so this cannot be safe.
-		 */
-		if (atomic_read(&conf->active_stripes) ||
-		    atomic_read(&conf->r5c_cached_full_stripes) ||
-		    atomic_read(&conf->r5c_cached_partial_stripes)) {
-			return -EBUSY;
-		}
+		mddev_suspend(mddev);
 		log_exit(conf);
+		mddev_resume(mddev);
 		return 0;
 	}
 	if (rdev == rcu_access_pointer(p->rdev))
@@ -8697,8 +8688,11 @@ static int raid5_change_consistency_policy(struct mddev *mddev, const char *buf)
 			err = log_init(conf, NULL, true);
 			if (!err) {
 				err = resize_stripes(conf, conf->pool_size);
-				if (err)
+				if (err) {
+					mddev_suspend(mddev);
 					log_exit(conf);
+					mddev_resume(mddev);
+				}
 			}
 		} else
 			err = -EINVAL;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 04/11] md/raid5-cache: Take mddev_lock in r5c_journal_mode_show()
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (2 preceding siblings ...)
  2022-06-02 18:18 ` [PATCH v3 03/11] md/raid5: Ensure array is suspended for calls to log_exit() Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-03  6:39   ` Christoph Hellwig
  2022-06-02 18:18 ` [PATCH v3 05/11] md/raid5-cache: Drop RCU usage of conf->log Logan Gunthorpe
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

The mddev->lock spinlock doesn't protect against the removal of
conf->log in r5l_exit_log() so conf->log may be freed before it
is used.

To fix this, take the mddev_lock() insteaad of the mddev->lock spinlock.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/raid5-cache.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 83c184eddbda..69b95005abca 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -2534,10 +2534,13 @@ static ssize_t r5c_journal_mode_show(struct mddev *mddev, char *page)
 	struct r5conf *conf;
 	int ret;
 
-	spin_lock(&mddev->lock);
+	ret = mddev_lock(mddev);
+	if (ret)
+		return ret;
+
 	conf = mddev->private;
 	if (!conf || !conf->log) {
-		spin_unlock(&mddev->lock);
+		mddev_unlock(mddev);
 		return 0;
 	}
 
@@ -2557,7 +2560,7 @@ static ssize_t r5c_journal_mode_show(struct mddev *mddev, char *page)
 	default:
 		ret = 0;
 	}
-	spin_unlock(&mddev->lock);
+	mddev_unlock(mddev);
 	return ret;
 }
 
@@ -3167,6 +3170,8 @@ void r5l_exit_log(struct r5conf *conf)
 {
 	struct r5l_log *log = conf->log;
 
+	lockdep_assert_held(&conf->mddev->reconfig_mutex);
+
 	conf->log = NULL;
 	synchronize_rcu();
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 05/11] md/raid5-cache: Drop RCU usage of conf->log
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (3 preceding siblings ...)
  2022-06-02 18:18 ` [PATCH v3 04/11] md/raid5-cache: Take mddev_lock in r5c_journal_mode_show() Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-03  6:43   ` Christoph Hellwig
  2022-06-02 18:18 ` [PATCH v3 06/11] md/raid5-cache: Clear conf->log after finishing work Logan Gunthorpe
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

The only place that uses RCU to access conf->log is in
r5l_log_disk_error(). This function is mostly used in the IO path
and once with mddev_lock() held in raid5_change_consistency_policy().

It is known that the IO will be suspended before the log is freed and
r5l_log_exit() is called with the mddev_lock() held.

This should mean that conf->log can not be freed while the function is
being called, so the RCU protection is not necessary. Drop the
rcu_read_lock() as well as the synchronize_rcu() and
rcu_assign_pointer() usage.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/raid5-cache.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 69b95005abca..48020d36b334 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -1590,18 +1590,13 @@ void r5l_quiesce(struct r5l_log *log, int quiesce)
 
 bool r5l_log_disk_error(struct r5conf *conf)
 {
-	struct r5l_log *log;
-	bool ret;
-	/* don't allow write if journal disk is missing */
-	rcu_read_lock();
-	log = rcu_dereference(conf->log);
+	struct r5l_log *log = conf->log;
 
+	/* don't allow write if journal disk is missing */
 	if (!log)
-		ret = test_bit(MD_HAS_JOURNAL, &conf->mddev->flags);
+		return test_bit(MD_HAS_JOURNAL, &conf->mddev->flags);
 	else
-		ret = test_bit(Faulty, &log->rdev->flags);
-	rcu_read_unlock();
-	return ret;
+		return test_bit(Faulty, &log->rdev->flags);
 }
 
 #define R5L_RECOVERY_PAGE_POOL_SIZE 256
@@ -3148,7 +3143,7 @@ int r5l_init_log(struct r5conf *conf, struct md_rdev *rdev)
 	spin_lock_init(&log->stripe_in_journal_lock);
 	atomic_set(&log->stripe_in_journal_count, 0);
 
-	rcu_assign_pointer(conf->log, log);
+	conf->log = log;
 
 	set_bit(MD_HAS_JOURNAL, &conf->mddev->flags);
 	return 0;
@@ -3173,7 +3168,6 @@ void r5l_exit_log(struct r5conf *conf)
 	lockdep_assert_held(&conf->mddev->reconfig_mutex);
 
 	conf->log = NULL;
-	synchronize_rcu();
 
 	/* Ensure disable_writeback_work wakes up and exits */
 	wake_up(&conf->mddev->sb_wait);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 06/11] md/raid5-cache: Clear conf->log after finishing work
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (4 preceding siblings ...)
  2022-06-02 18:18 ` [PATCH v3 05/11] md/raid5-cache: Drop RCU usage of conf->log Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-03  6:43   ` Christoph Hellwig
  2022-06-02 18:18 ` [PATCH v3 07/11] md/raid5-cache: Annotate pslot with __rcu notation Logan Gunthorpe
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

A NULL pointer dereferlence on conf->log is seen randomly with
the mdadm test 21raid5cache. Kasan reporst:

BUG: KASAN: null-ptr-deref in r5l_reclaimable_space+0xf5/0x140
Read of size 8 at addr 0000000000000860 by task md0_reclaim/3086

Call Trace:
  dump_stack_lvl+0x5a/0x74
  kasan_report.cold+0x5f/0x1a9
  __asan_load8+0x69/0x90
  r5l_reclaimable_space+0xf5/0x140
  r5l_do_reclaim+0xf4/0x5e0
  r5l_reclaim_thread+0x69/0x3b0
  md_thread+0x1a2/0x2c0
  kthread+0x177/0x1b0
  ret_from_fork+0x22/0x30

This is caused by conf->log being cleared in r5l_exit_log() before
stopping the reclaim thread.

To fix this, clear conf->log after the reclaim_thread is unregistered
and after flushing disable_writeback_work.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/raid5-cache.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 48020d36b334..d6461d4814ab 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -3167,12 +3167,13 @@ void r5l_exit_log(struct r5conf *conf)
 
 	lockdep_assert_held(&conf->mddev->reconfig_mutex);
 
-	conf->log = NULL;
-
 	/* Ensure disable_writeback_work wakes up and exits */
 	wake_up(&conf->mddev->sb_wait);
 	flush_work(&log->disable_writeback_work);
 	md_unregister_thread(&log->reclaim_thread);
+
+	conf->log = NULL;
+
 	mempool_exit(&log->meta_pool);
 	bioset_exit(&log->bs);
 	mempool_exit(&log->io_pool);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 07/11] md/raid5-cache: Annotate pslot with __rcu notation
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (5 preceding siblings ...)
  2022-06-02 18:18 ` [PATCH v3 06/11] md/raid5-cache: Clear conf->log after finishing work Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 08/11] md: Use enum for overloaded magic numbers used by mddev->curr_resync Logan Gunthorpe
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

radix_tree_lookup_slot() and radix_tree_replace_slot() API expect the
slot returned and looked up to be marked with __rcu. Otherwise
sparse warnings are generated:

  drivers/md/raid5-cache.c:2939:23: warning: incorrect type in
			assignment (different address spaces)
  drivers/md/raid5-cache.c:2939:23:    expected void **pslot
  drivers/md/raid5-cache.c:2939:23:    got void [noderef] __rcu **

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/raid5-cache.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index d6461d4814ab..15cab59adfcf 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -2637,7 +2637,7 @@ int r5c_try_caching_write(struct r5conf *conf,
 	int i;
 	struct r5dev *dev;
 	int to_cache = 0;
-	void **pslot;
+	void __rcu **pslot;
 	sector_t tree_index;
 	int ret;
 	uintptr_t refcount;
@@ -2804,7 +2804,7 @@ void r5c_finish_stripe_write_out(struct r5conf *conf,
 	int i;
 	int do_wakeup = 0;
 	sector_t tree_index;
-	void **pslot;
+	void __rcu **pslot;
 	uintptr_t refcount;
 
 	if (!log || !test_bit(R5_InJournal, &sh->dev[sh->pd_idx].flags))
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 08/11] md: Use enum for overloaded magic numbers used by mddev->curr_resync
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (6 preceding siblings ...)
  2022-06-02 18:18 ` [PATCH v3 07/11] md/raid5-cache: Annotate pslot with __rcu notation Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 09/11] md: Ensure resync is reported after it starts Logan Gunthorpe
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

Comments in the code document special values used for
mddev->curr_resync. Make this clearer by using an enum to label these
values.

The only functional change is a couple places use the wrong comparison
operator that implied 3 is another special value. They are all
fixed to imply that 3 or greater is an active resync.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/md.c | 40 ++++++++++++++++++----------------------
 drivers/md/md.h | 15 +++++++++++++++
 2 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 8273ac5eef06..0893029865eb 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -5001,7 +5001,7 @@ static ssize_t
 sync_speed_show(struct mddev *mddev, char *page)
 {
 	unsigned long resync, dt, db;
-	if (mddev->curr_resync == 0)
+	if (mddev->curr_resync == MD_RESYNC_NONE)
 		return sprintf(page, "none\n");
 	resync = mddev->curr_mark_cnt - atomic_read(&mddev->recovery_active);
 	dt = (jiffies - mddev->resync_mark) / HZ;
@@ -5020,8 +5020,8 @@ sync_completed_show(struct mddev *mddev, char *page)
 	if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
 		return sprintf(page, "none\n");
 
-	if (mddev->curr_resync == 1 ||
-	    mddev->curr_resync == 2)
+	if (mddev->curr_resync == MD_RESYNC_YIELDED ||
+	    mddev->curr_resync == MD_RESYNC_DELAYED)
 		return sprintf(page, "delayed\n");
 
 	if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ||
@@ -8018,7 +8018,7 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
 		max_sectors = mddev->dev_sectors;
 
 	resync = mddev->curr_resync;
-	if (resync <= 3) {
+	if (resync < MD_RESYNC_ACTIVE) {
 		if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
 			/* Still cleaning up */
 			resync = max_sectors;
@@ -8027,7 +8027,7 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
 	else
 		resync -= atomic_read(&mddev->recovery_active);
 
-	if (resync == 0) {
+	if (resync == MD_RESYNC_NONE) {
 		if (test_bit(MD_RESYNCING_REMOTE, &mddev->recovery)) {
 			struct md_rdev *rdev;
 
@@ -8051,7 +8051,7 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
 		}
 		return 0;
 	}
-	if (resync < 3) {
+	if (resync < MD_RESYNC_ACTIVE) {
 		seq_printf(seq, "\tresync=DELAYED");
 		return 1;
 	}
@@ -8729,13 +8729,7 @@ void md_do_sync(struct md_thread *thread)
 
 	mddev->last_sync_action = action ?: desc;
 
-	/* we overload curr_resync somewhat here.
-	 * 0 == not engaged in resync at all
-	 * 2 == checking that there is no conflict with another sync
-	 * 1 == like 2, but have yielded to allow conflicting resync to
-	 *		commence
-	 * other == active in resync - this many blocks
-	 *
+	/*
 	 * Before starting a resync we must have set curr_resync to
 	 * 2, and then checked that every "conflicting" array has curr_resync
 	 * less than ours.  When we find one that is the same or higher
@@ -8747,7 +8741,7 @@ void md_do_sync(struct md_thread *thread)
 
 	do {
 		int mddev2_minor = -1;
-		mddev->curr_resync = 2;
+		mddev->curr_resync = MD_RESYNC_DELAYED;
 
 	try_again:
 		if (test_bit(MD_RECOVERY_INTR, &mddev->recovery))
@@ -8759,12 +8753,14 @@ void md_do_sync(struct md_thread *thread)
 			&&  mddev2->curr_resync
 			&&  match_mddev_units(mddev, mddev2)) {
 				DEFINE_WAIT(wq);
-				if (mddev < mddev2 && mddev->curr_resync == 2) {
+				if (mddev < mddev2 &&
+				    mddev->curr_resync == MD_RESYNC_DELAYED) {
 					/* arbitrarily yield */
-					mddev->curr_resync = 1;
+					mddev->curr_resync = MD_RESYNC_YIELDED;
 					wake_up(&resync_wait);
 				}
-				if (mddev > mddev2 && mddev->curr_resync == 1)
+				if (mddev > mddev2 &&
+				    mddev->curr_resync == MD_RESYNC_YIELDED)
 					/* no need to wait here, we can wait the next
 					 * time 'round when curr_resync == 2
 					 */
@@ -8792,7 +8788,7 @@ void md_do_sync(struct md_thread *thread)
 				finish_wait(&resync_wait, &wq);
 			}
 		}
-	} while (mddev->curr_resync < 2);
+	} while (mddev->curr_resync < MD_RESYNC_DELAYED);
 
 	j = 0;
 	if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
@@ -8876,7 +8872,7 @@ void md_do_sync(struct md_thread *thread)
 			 desc, mdname(mddev));
 		mddev->curr_resync = j;
 	} else
-		mddev->curr_resync = 3; /* no longer delayed */
+		mddev->curr_resync = MD_RESYNC_ACTIVE; /* no longer delayed */
 	mddev->curr_resync_completed = j;
 	sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	md_new_event();
@@ -9011,14 +9007,14 @@ void md_do_sync(struct md_thread *thread)
 
 	if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
 	    !test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
-	    mddev->curr_resync > 3) {
+	    mddev->curr_resync >= MD_RESYNC_ACTIVE) {
 		mddev->curr_resync_completed = mddev->curr_resync;
 		sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	}
 	mddev->pers->sync_request(mddev, max_sectors, &skipped);
 
 	if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
-	    mddev->curr_resync > 3) {
+	    mddev->curr_resync >= MD_RESYNC_ACTIVE) {
 		if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
 			if (test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
 				if (mddev->curr_resync >= mddev->recovery_cp) {
@@ -9082,7 +9078,7 @@ void md_do_sync(struct md_thread *thread)
 	} else if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery))
 		mddev->resync_min = mddev->curr_resync_completed;
 	set_bit(MD_RECOVERY_DONE, &mddev->recovery);
-	mddev->curr_resync = 0;
+	mddev->curr_resync = MD_RESYNC_NONE;
 	spin_unlock(&mddev->lock);
 
 	wake_up(&resync_wait);
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 5f62c46ac2d3..2d06003a4c3f 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -288,6 +288,21 @@ struct serial_info {
 	sector_t _subtree_last; /* highest sector in subtree of rb node */
 };
 
+/*
+ * mddev->curr_resync stores the current sector of the resync but
+ * also has some overloaded values.
+ */
+enum {
+	/* No resync in progress */
+	MD_RESYNC_NONE = 0,
+	/* Yielded to allow another conflicting resync to commence */
+	MD_RESYNC_YIELDED = 1,
+	/* Delayed to check that there is no conflict with another sync */
+	MD_RESYNC_DELAYED = 2,
+	/* Any value greater than or equal to this is in an active resync */
+	MD_RESYNC_ACTIVE = 3,
+};
+
 struct mddev {
 	void				*private;
 	struct md_personality		*pers;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 09/11] md: Ensure resync is reported after it starts
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (7 preceding siblings ...)
  2022-06-02 18:18 ` [PATCH v3 08/11] md: Use enum for overloaded magic numbers used by mddev->curr_resync Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 10/11] md: Notify sysfs sync_completed in md_reap_sync_thread() Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 11/11] md/raid5-ppl: Fix argument order in bio_alloc_bioset() Logan Gunthorpe
  10 siblings, 0 replies; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

The 07layouts test in mdadm fails on some systems. The failure
presents itself as the backup file not being removed before the next
layout is grown into:

  mdadm: /dev/md0: cannot create backup file /tmp/md-test-backup:
      File exists

This is because the background mdadm process, which is responsible for
cleaning up this backup file gets into an infinite loop waiting for
the reshape to start. mdadm checks the mdstat file if a reshape is
going and, if it is not, it waits for an event on the file or times
out in 5 seconds. On faster machines, the reshape may complete before
the 5 seconds times out, and thus the background mdadm process loops
waiting for a reshape to start that has already occurred.

mdadm reads the mdstat file to start, but mdstat does not report that the
reshape has begun, even though it has indeed begun. So the mdstat_wait()
call (in mdadm) which polls on the mdstat file won't ever return until
timing out.

The reason mdstat reports the reshape has started is due to an issue
in status_resync(). recovery_active is subtracted from curr_resync which
will result in a value of zero for the first chunk of reshaped data, and
the resulting read will report no reshape in progress.

To fix this, if "resync - recovery_active" is an overloaded value, force
the value to be MD_RESYNC_ACTIVE so the code reports a resync in progress.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/md.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 0893029865eb..2be429874d18 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -8022,10 +8022,20 @@ static int status_resync(struct seq_file *seq, struct mddev *mddev)
 		if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
 			/* Still cleaning up */
 			resync = max_sectors;
-	} else if (resync > max_sectors)
+	} else if (resync > max_sectors) {
 		resync = max_sectors;
-	else
+	} else {
 		resync -= atomic_read(&mddev->recovery_active);
+		if (resync < MD_RESYNC_ACTIVE) {
+			/*
+			 * Resync has started, but the subtraction has
+			 * yielded one of the special values. Force it
+			 * to active to ensure the status reports an
+			 * active resync.
+			 */
+			resync = MD_RESYNC_ACTIVE;
+		}
+	}
 
 	if (resync == MD_RESYNC_NONE) {
 		if (test_bit(MD_RESYNCING_REMOTE, &mddev->recovery)) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 10/11] md: Notify sysfs sync_completed in md_reap_sync_thread()
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (8 preceding siblings ...)
  2022-06-02 18:18 ` [PATCH v3 09/11] md: Ensure resync is reported after it starts Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-02 18:18 ` [PATCH v3 11/11] md/raid5-ppl: Fix argument order in bio_alloc_bioset() Logan Gunthorpe
  10 siblings, 0 replies; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe,
	Christoph Hellwig

The mdadm test 07layouts randomly produces a kernel hung task deadlock.
The deadlock is caused by the suspend_lo/suspend_hi files being set by
the mdadm background process during reshape and not being cleared
because the process hangs. (Leaving aside the issue of the fragility of
freezing kernel tasks by buggy userspace processes...)

When the background mdadm process hangs it, is waiting (without a
timeout) on a change to the sync_completed file signalling that the
reshape has completed. The process is woken up a couple times when
the reshape finishes but it is woken up before MD_RECOVERY_RUNNING
is cleared so sync_completed_show() reports 0 instead of "none".

To fix this, notify the sysfs file in md_reap_sync_thread() after
MD_RECOVERY_RUNNING has been cleared. This wakes up mdadm and causes
it to continue and write to suspend_lo/suspend_hi to allow IO to
continue.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/md.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2be429874d18..2c07c9508222 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9476,6 +9476,7 @@ void md_reap_sync_thread(struct mddev *mddev, bool reconfig_mutex_held)
 	wake_up(&resync_wait);
 	/* flag recovery needed just to double check */
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+	sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	sysfs_notify_dirent_safe(mddev->sysfs_action);
 	md_new_event();
 	if (mddev->event_work.func)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 11/11] md/raid5-ppl: Fix argument order in bio_alloc_bioset()
  2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
                   ` (9 preceding siblings ...)
  2022-06-02 18:18 ` [PATCH v3 10/11] md: Notify sysfs sync_completed in md_reap_sync_thread() Logan Gunthorpe
@ 2022-06-02 18:18 ` Logan Gunthorpe
  2022-06-03  6:45   ` Christoph Hellwig
  10 siblings, 1 reply; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-02 18:18 UTC (permalink / raw)
  To: linux-kernel, linux-raid, Song Liu
  Cc: Christoph Hellwig, Donald Buczek, Guoqing Jiang, Xiao Ni,
	Stephen Bates, Martin Oliveira, David Sloan, Logan Gunthorpe

bio_alloc_bioset() takes a block device, number of vectors, the
OP flags, the GFP mask and the bio set. However when the prototype
was changed, the callisite in ppl_do_flush() had the OP flags and
the GFP flags reversed. This introduced some sparse error:

  drivers/md/raid5-ppl.c:632:57: warning: incorrect type in argument 3
				    (different base types)
  drivers/md/raid5-ppl.c:632:57:    expected unsigned int opf
  drivers/md/raid5-ppl.c:632:57:    got restricted gfp_t [usertype]
  drivers/md/raid5-ppl.c:633:61: warning: incorrect type in argument 4
  				    (different base types)
  drivers/md/raid5-ppl.c:633:61:    expected restricted gfp_t [usertype]
				    gfp_mask
  drivers/md/raid5-ppl.c:633:61:    got unsigned long long

The sparse error introduction may not have been reported correctly by
0day due to other work that was cleaning up other sparse errors in this
area.

Fixes: 609be1066731 ("block: pass a block_device and opf to bio_alloc_bioset")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/md/raid5-ppl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/raid5-ppl.c b/drivers/md/raid5-ppl.c
index 4f5bdb4cad2b..db49edec362a 100644
--- a/drivers/md/raid5-ppl.c
+++ b/drivers/md/raid5-ppl.c
@@ -629,9 +629,9 @@ static void ppl_do_flush(struct ppl_io_unit *io)
 		if (bdev) {
 			struct bio *bio;
 
-			bio = bio_alloc_bioset(bdev, 0, GFP_NOIO,
+			bio = bio_alloc_bioset(bdev, 0,
 					       REQ_OP_WRITE | REQ_PREFLUSH,
-					       &ppl_conf->flush_bs);
+					       GFP_NOIO, &ppl_conf->flush_bs);
 			bio->bi_private = io;
 			bio->bi_end_io = ppl_flush_endio;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 04/11] md/raid5-cache: Take mddev_lock in r5c_journal_mode_show()
  2022-06-02 18:18 ` [PATCH v3 04/11] md/raid5-cache: Take mddev_lock in r5c_journal_mode_show() Logan Gunthorpe
@ 2022-06-03  6:39   ` Christoph Hellwig
  2022-06-03 21:47     ` Logan Gunthorpe
  0 siblings, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2022-06-03  6:39 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-kernel, linux-raid, Song Liu, Christoph Hellwig,
	Donald Buczek, Guoqing Jiang, Xiao Ni, Stephen Bates,
	Martin Oliveira, David Sloan

On Thu, Jun 02, 2022 at 12:18:10PM -0600, Logan Gunthorpe wrote:
>  	conf = mddev->private;
>  	if (!conf || !conf->log) {
> -		spin_unlock(&mddev->lock);
> +		mddev_unlock(mddev);
>  		return 0;
>  	}
>  
> @@ -2557,7 +2560,7 @@ static ssize_t r5c_journal_mode_show(struct mddev *mddev, char *page)
>  	default:
>  		ret = 0;
>  	}
> -	spin_unlock(&mddev->lock);
> +	mddev_unlock(mddev);
>  	return ret;

Using a goto out_unlock would be nice here to keep the critical
sections simple.  But even as-is this looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

> +	lockdep_assert_held(&conf->mddev->reconfig_mutex);
> +

.. but this looks unrelated and misplaced in this patch.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 05/11] md/raid5-cache: Drop RCU usage of conf->log
  2022-06-02 18:18 ` [PATCH v3 05/11] md/raid5-cache: Drop RCU usage of conf->log Logan Gunthorpe
@ 2022-06-03  6:43   ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2022-06-03  6:43 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-kernel, linux-raid, Song Liu, Christoph Hellwig,
	Donald Buczek, Guoqing Jiang, Xiao Ni, Stephen Bates,
	Martin Oliveira, David Sloan

On Thu, Jun 02, 2022 at 12:18:11PM -0600, Logan Gunthorpe wrote:
> The only place that uses RCU to access conf->log is in
> r5l_log_disk_error(). This function is mostly used in the IO path
> and once with mddev_lock() held in raid5_change_consistency_policy().
> 
> It is known that the IO will be suspended before the log is freed and
> r5l_log_exit() is called with the mddev_lock() held.
> 
> This should mean that conf->log can not be freed while the function is
> being called, so the RCU protection is not necessary. Drop the
> rcu_read_lock() as well as the synchronize_rcu() and
> rcu_assign_pointer() usage.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 06/11] md/raid5-cache: Clear conf->log after finishing work
  2022-06-02 18:18 ` [PATCH v3 06/11] md/raid5-cache: Clear conf->log after finishing work Logan Gunthorpe
@ 2022-06-03  6:43   ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2022-06-03  6:43 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-kernel, linux-raid, Song Liu, Christoph Hellwig,
	Donald Buczek, Guoqing Jiang, Xiao Ni, Stephen Bates,
	Martin Oliveira, David Sloan

On Thu, Jun 02, 2022 at 12:18:12PM -0600, Logan Gunthorpe wrote:
> A NULL pointer dereferlence on conf->log is seen randomly with
> the mdadm test 21raid5cache. Kasan reporst:

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 11/11] md/raid5-ppl: Fix argument order in bio_alloc_bioset()
  2022-06-02 18:18 ` [PATCH v3 11/11] md/raid5-ppl: Fix argument order in bio_alloc_bioset() Logan Gunthorpe
@ 2022-06-03  6:45   ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2022-06-03  6:45 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: linux-kernel, linux-raid, Song Liu, Christoph Hellwig,
	Donald Buczek, Guoqing Jiang, Xiao Ni, Stephen Bates,
	Martin Oliveira, David Sloan

On Thu, Jun 02, 2022 at 12:18:17PM -0600, Logan Gunthorpe wrote:
> bio_alloc_bioset() takes a block device, number of vectors, the
> OP flags, the GFP mask and the bio set. However when the prototype
> was changed, the callisite in ppl_do_flush() had the OP flags and
> the GFP flags reversed. This introduced some sparse error:
> 
>   drivers/md/raid5-ppl.c:632:57: warning: incorrect type in argument 3
> 				    (different base types)
>   drivers/md/raid5-ppl.c:632:57:    expected unsigned int opf
>   drivers/md/raid5-ppl.c:632:57:    got restricted gfp_t [usertype]
>   drivers/md/raid5-ppl.c:633:61: warning: incorrect type in argument 4
>   				    (different base types)
>   drivers/md/raid5-ppl.c:633:61:    expected restricted gfp_t [usertype]
> 				    gfp_mask
>   drivers/md/raid5-ppl.c:633:61:    got unsigned long long
> 
> The sparse error introduction may not have been reported correctly by
> 0day due to other work that was cleaning up other sparse errors in this
> area.
> 
> Fixes: 609be1066731 ("block: pass a block_device and opf to bio_alloc_bioset")
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>

Ooops.  Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

(and should probably be first in the series)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 04/11] md/raid5-cache: Take mddev_lock in r5c_journal_mode_show()
  2022-06-03  6:39   ` Christoph Hellwig
@ 2022-06-03 21:47     ` Logan Gunthorpe
  0 siblings, 0 replies; 17+ messages in thread
From: Logan Gunthorpe @ 2022-06-03 21:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-kernel, linux-raid, Song Liu, Donald Buczek, Guoqing Jiang,
	Xiao Ni, Stephen Bates, Martin Oliveira, David Sloan



On 2022-06-03 00:39, Christoph Hellwig wrote:
> On Thu, Jun 02, 2022 at 12:18:10PM -0600, Logan Gunthorpe wrote:
>>  	conf = mddev->private;
>>  	if (!conf || !conf->log) {
>> -		spin_unlock(&mddev->lock);
>> +		mddev_unlock(mddev);
>>  		return 0;
>>  	}
>>  
>> @@ -2557,7 +2560,7 @@ static ssize_t r5c_journal_mode_show(struct mddev *mddev, char *page)
>>  	default:
>>  		ret = 0;
>>  	}
>> -	spin_unlock(&mddev->lock);
>> +	mddev_unlock(mddev);
>>  	return ret;
> 
> Using a goto out_unlock would be nice here to keep the critical
> sections simple.  But even as-is this looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
>> +	lockdep_assert_held(&conf->mddev->reconfig_mutex);
>> +
> 
> .. but this looks unrelated and misplaced in this patch.
> 

Ok, yup, I'll fix these issues up and send an updated series next week.

Logan

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2022-06-03 21:47 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-02 18:18 [PATCH v3 00/11] Bug fixes for mdadm tests Logan Gunthorpe
2022-06-02 18:18 ` [PATCH v3 01/11] md/raid5-log: Drop extern decorators for function prototypes Logan Gunthorpe
2022-06-02 18:18 ` [PATCH v3 02/11] md/raid5-ppl: Drop unused argument from ppl_handle_flush_request() Logan Gunthorpe
2022-06-02 18:18 ` [PATCH v3 03/11] md/raid5: Ensure array is suspended for calls to log_exit() Logan Gunthorpe
2022-06-02 18:18 ` [PATCH v3 04/11] md/raid5-cache: Take mddev_lock in r5c_journal_mode_show() Logan Gunthorpe
2022-06-03  6:39   ` Christoph Hellwig
2022-06-03 21:47     ` Logan Gunthorpe
2022-06-02 18:18 ` [PATCH v3 05/11] md/raid5-cache: Drop RCU usage of conf->log Logan Gunthorpe
2022-06-03  6:43   ` Christoph Hellwig
2022-06-02 18:18 ` [PATCH v3 06/11] md/raid5-cache: Clear conf->log after finishing work Logan Gunthorpe
2022-06-03  6:43   ` Christoph Hellwig
2022-06-02 18:18 ` [PATCH v3 07/11] md/raid5-cache: Annotate pslot with __rcu notation Logan Gunthorpe
2022-06-02 18:18 ` [PATCH v3 08/11] md: Use enum for overloaded magic numbers used by mddev->curr_resync Logan Gunthorpe
2022-06-02 18:18 ` [PATCH v3 09/11] md: Ensure resync is reported after it starts Logan Gunthorpe
2022-06-02 18:18 ` [PATCH v3 10/11] md: Notify sysfs sync_completed in md_reap_sync_thread() Logan Gunthorpe
2022-06-02 18:18 ` [PATCH v3 11/11] md/raid5-ppl: Fix argument order in bio_alloc_bioset() Logan Gunthorpe
2022-06-03  6:45   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.