* [PATCH 1/6] md/raid1: only allocate write behind bio for WriteMostly device
2021-10-04 15:34 [PATCH 0/6] Misc changes for md Guoqing Jiang
@ 2021-10-04 15:34 ` Guoqing Jiang
2021-10-05 1:05 ` Guoqing Jiang
2021-10-05 5:55 ` Jens Stutte
2021-10-04 15:34 ` [PATCH 2/6] md/bitmap: don't set max_write_behind if there is no write mostly device Guoqing Jiang
` (5 subsequent siblings)
6 siblings, 2 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-10-04 15:34 UTC (permalink / raw)
To: song; +Cc: linux-raid
Commit 6607cd319b6b91bff94e90f798a61c031650b514 ("raid1: ensure write
behind bio has less than BIO_MAX_VECS sectors") tried to guarantee the
size of behind bio is not bigger than BIO_MAX_VECS sectors.
Unfortunately the same calltrace still could happen since an array could
enable write-behind without write mostly device.
To match the manpage of mdadm (which says "write-behind is only attempted
on drives marked as write-mostly"), we need to check WriteMostly flag to
avoid such unexpected behavior.
[1]. https://bugzilla.kernel.org/show_bug.cgi?id=213181#c25
Cc: stable@vger.kernel.org # v5.12+
Cc: Jens Stutte <jens@chianterastutte.eu>
Reported-by: Jens Stutte <jens@chianterastutte.eu>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
drivers/md/raid1.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 19598bd38939..6ba12f0f0f03 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1496,7 +1496,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
if (!r1_bio->bios[i])
continue;
- if (first_clone) {
+ if (first_clone && test_bit(WriteMostly, &rdev->flags)) {
/* do behind I/O ?
* Not if there are too many, or cannot
* allocate memory, or a reader on WriteMostly
--
2.31.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/6] md/raid1: only allocate write behind bio for WriteMostly device
2021-10-04 15:34 ` [PATCH 1/6] md/raid1: only allocate write behind bio for WriteMostly device Guoqing Jiang
@ 2021-10-05 1:05 ` Guoqing Jiang
2021-10-05 5:55 ` Jens Stutte
1 sibling, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-10-05 1:05 UTC (permalink / raw)
To: song; +Cc: linux-raid, jens
Now cc Jens actually.
On 10/4/21 11:34 PM, Guoqing Jiang wrote:
> Commit 6607cd319b6b91bff94e90f798a61c031650b514 ("raid1: ensure write
> behind bio has less than BIO_MAX_VECS sectors") tried to guarantee the
> size of behind bio is not bigger than BIO_MAX_VECS sectors.
>
> Unfortunately the same calltrace still could happen since an array could
> enable write-behind without write mostly device.
>
> To match the manpage of mdadm (which says "write-behind is only attempted
> on drives marked as write-mostly"), we need to check WriteMostly flag to
> avoid such unexpected behavior.
>
> [1]. https://bugzilla.kernel.org/show_bug.cgi?id=213181#c25
>
> Cc: stable@vger.kernel.org # v5.12+
> Cc: Jens Stutte <jens@chianterastutte.eu>
> Reported-by: Jens Stutte <jens@chianterastutte.eu>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
> drivers/md/raid1.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 19598bd38939..6ba12f0f0f03 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1496,7 +1496,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> if (!r1_bio->bios[i])
> continue;
>
> - if (first_clone) {
> + if (first_clone && test_bit(WriteMostly, &rdev->flags)) {
> /* do behind I/O ?
> * Not if there are too many, or cannot
> * allocate memory, or a reader on WriteMostly
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/6] md/raid1: only allocate write behind bio for WriteMostly device
2021-10-04 15:34 ` [PATCH 1/6] md/raid1: only allocate write behind bio for WriteMostly device Guoqing Jiang
2021-10-05 1:05 ` Guoqing Jiang
@ 2021-10-05 5:55 ` Jens Stutte
1 sibling, 0 replies; 14+ messages in thread
From: Jens Stutte @ 2021-10-05 5:55 UTC (permalink / raw)
To: Guoqing Jiang, song; +Cc: linux-raid
FWIW, I can confirm that an equivalent change (except for variable
renaming, see [1]) is running well here now for a while now.
Thank you to work on this!
Am 04.10.21 um 17:34 schrieb Guoqing Jiang:
> Commit 6607cd319b6b91bff94e90f798a61c031650b514 ("raid1: ensure write
> behind bio has less than BIO_MAX_VECS sectors") tried to guarantee the
> size of behind bio is not bigger than BIO_MAX_VECS sectors.
>
> Unfortunately the same calltrace still could happen since an array could
> enable write-behind without write mostly device.
>
> To match the manpage of mdadm (which says "write-behind is only attempted
> on drives marked as write-mostly"), we need to check WriteMostly flag to
> avoid such unexpected behavior.
>
> [1]. https://bugzilla.kernel.org/show_bug.cgi?id=213181#c25
>
> Cc: stable@vger.kernel.org # v5.12+
> Cc: Jens Stutte <jens@chianterastutte.eu>
> Reported-by: Jens Stutte <jens@chianterastutte.eu>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
> drivers/md/raid1.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 19598bd38939..6ba12f0f0f03 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1496,7 +1496,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
> if (!r1_bio->bios[i])
> continue;
>
> - if (first_clone) {
> + if (first_clone && test_bit(WriteMostly, &rdev->flags)) {
> /* do behind I/O ?
> * Not if there are too many, or cannot
> * allocate memory, or a reader on WriteMostly
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/6] md/bitmap: don't set max_write_behind if there is no write mostly device
2021-10-04 15:34 [PATCH 0/6] Misc changes for md Guoqing Jiang
2021-10-04 15:34 ` [PATCH 1/6] md/raid1: only allocate write behind bio for WriteMostly device Guoqing Jiang
@ 2021-10-04 15:34 ` Guoqing Jiang
2021-10-07 6:25 ` Song Liu
2021-10-04 15:34 ` [PATCH 3/6] md/raid1: use rdev in raid1_write_request directly Guoqing Jiang
` (4 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Guoqing Jiang @ 2021-10-04 15:34 UTC (permalink / raw)
To: song; +Cc: linux-raid
We shouldn't set it since write behind IO should only happen to write
mostly device.
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
drivers/md/md-bitmap.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
index e29c6298ef5c..0346281b1555 100644
--- a/drivers/md/md-bitmap.c
+++ b/drivers/md/md-bitmap.c
@@ -2469,11 +2469,28 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
{
unsigned long backlog;
unsigned long old_mwb = mddev->bitmap_info.max_write_behind;
+ struct md_rdev *rdev;
+ bool has_write_mostly = false;
int rv = kstrtoul(buf, 10, &backlog);
if (rv)
return rv;
if (backlog > COUNTER_MAX)
return -EINVAL;
+
+ /*
+ * Without write mostly device, it doesn't make sense to set
+ * backlog for max_write_behind.
+ */
+ rdev_for_each(rdev, mddev)
+ if (test_bit(WriteMostly, &rdev->flags)) {
+ has_write_mostly = true;
+ break;
+ }
+ if (!has_write_mostly) {
+ pr_warn_ratelimited("md: No write mostly device available\n");
+ return -EINVAL;
+ }
+
mddev->bitmap_info.max_write_behind = backlog;
if (!backlog && mddev->serial_info_pool) {
/* serial_info_pool is not needed if backlog is zero */
--
2.31.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/6] md/bitmap: don't set max_write_behind if there is no write mostly device
2021-10-04 15:34 ` [PATCH 2/6] md/bitmap: don't set max_write_behind if there is no write mostly device Guoqing Jiang
@ 2021-10-07 6:25 ` Song Liu
2021-10-07 10:20 ` Guoqing Jiang
0 siblings, 1 reply; 14+ messages in thread
From: Song Liu @ 2021-10-07 6:25 UTC (permalink / raw)
To: Guoqing Jiang; +Cc: linux-raid
On Mon, Oct 4, 2021 at 8:40 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> We shouldn't set it since write behind IO should only happen to write
> mostly device.
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
> drivers/md/md-bitmap.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
> index e29c6298ef5c..0346281b1555 100644
> --- a/drivers/md/md-bitmap.c
> +++ b/drivers/md/md-bitmap.c
> @@ -2469,11 +2469,28 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
> {
> unsigned long backlog;
> unsigned long old_mwb = mddev->bitmap_info.max_write_behind;
> + struct md_rdev *rdev;
> + bool has_write_mostly = false;
> int rv = kstrtoul(buf, 10, &backlog);
> if (rv)
> return rv;
> if (backlog > COUNTER_MAX)
> return -EINVAL;
> +
> + /*
> + * Without write mostly device, it doesn't make sense to set
> + * backlog for max_write_behind.
> + */
> + rdev_for_each(rdev, mddev)
> + if (test_bit(WriteMostly, &rdev->flags)) {
> + has_write_mostly = true;
> + break;
> + }
> + if (!has_write_mostly) {
> + pr_warn_ratelimited("md: No write mostly device available\n");
Most of these _store functions do not print warnings for invalid changes. So
I am not sure whether we want to add this one. If we do want it, we should
make it clear, as
"md: No write mostly device available. Cannot set backlog\n".
We may also add the device name there.
Thanks,
Song
> + return -EINVAL;
> + }
> +
> mddev->bitmap_info.max_write_behind = backlog;
> if (!backlog && mddev->serial_info_pool) {
> /* serial_info_pool is not needed if backlog is zero */
> --
> 2.31.1
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/6] md/bitmap: don't set max_write_behind if there is no write mostly device
2021-10-07 6:25 ` Song Liu
@ 2021-10-07 10:20 ` Guoqing Jiang
0 siblings, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-10-07 10:20 UTC (permalink / raw)
To: Song Liu; +Cc: linux-raid
On 10/7/21 2:25 PM, Song Liu wrote:
> On Mon, Oct 4, 2021 at 8:40 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>> We shouldn't set it since write behind IO should only happen to write
>> mostly device.
>>
>> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
>> ---
>> drivers/md/md-bitmap.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c
>> index e29c6298ef5c..0346281b1555 100644
>> --- a/drivers/md/md-bitmap.c
>> +++ b/drivers/md/md-bitmap.c
>> @@ -2469,11 +2469,28 @@ backlog_store(struct mddev *mddev, const char *buf, size_t len)
>> {
>> unsigned long backlog;
>> unsigned long old_mwb = mddev->bitmap_info.max_write_behind;
>> + struct md_rdev *rdev;
>> + bool has_write_mostly = false;
>> int rv = kstrtoul(buf, 10, &backlog);
>> if (rv)
>> return rv;
>> if (backlog > COUNTER_MAX)
>> return -EINVAL;
>> +
>> + /*
>> + * Without write mostly device, it doesn't make sense to set
>> + * backlog for max_write_behind.
>> + */
>> + rdev_for_each(rdev, mddev)
>> + if (test_bit(WriteMostly, &rdev->flags)) {
>> + has_write_mostly = true;
>> + break;
>> + }
>> + if (!has_write_mostly) {
>> + pr_warn_ratelimited("md: No write mostly device available\n");
> Most of these _store functions do not print warnings for invalid changes. So
> I am not sure whether we want to add this one.
I think it is better to makes users know the reason of invalidation.
> If we do want it, we should make it clear, as
> "md: No write mostly device available. Cannot set backlog\n".
> We may also add the device name there.
Sure, I will make it clearer.
Thanks,
Guoqing
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/6] md/raid1: use rdev in raid1_write_request directly
2021-10-04 15:34 [PATCH 0/6] Misc changes for md Guoqing Jiang
2021-10-04 15:34 ` [PATCH 1/6] md/raid1: only allocate write behind bio for WriteMostly device Guoqing Jiang
2021-10-04 15:34 ` [PATCH 2/6] md/bitmap: don't set max_write_behind if there is no write mostly device Guoqing Jiang
@ 2021-10-04 15:34 ` Guoqing Jiang
2021-10-04 15:34 ` [PATCH 4/6] md/raid10: add 'read_err' to raid10_read_request Guoqing Jiang
` (3 subsequent siblings)
6 siblings, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-10-04 15:34 UTC (permalink / raw)
To: song; +Cc: linux-raid
We already get rdev from conf->mirrors[i].rdev at the beginning of the
loop, so just use it.
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
drivers/md/raid1.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 6ba12f0f0f03..7dc8026cf6ee 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1529,13 +1529,12 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
r1_bio->bios[i] = mbio;
- mbio->bi_iter.bi_sector = (r1_bio->sector +
- conf->mirrors[i].rdev->data_offset);
- bio_set_dev(mbio, conf->mirrors[i].rdev->bdev);
+ mbio->bi_iter.bi_sector = (r1_bio->sector + rdev->data_offset);
+ bio_set_dev(mbio, rdev->bdev);
mbio->bi_end_io = raid1_end_write_request;
mbio->bi_opf = bio_op(bio) | (bio->bi_opf & (REQ_SYNC | REQ_FUA));
- if (test_bit(FailFast, &conf->mirrors[i].rdev->flags) &&
- !test_bit(WriteMostly, &conf->mirrors[i].rdev->flags) &&
+ if (test_bit(FailFast, &rdev->flags) &&
+ !test_bit(WriteMostly, &rdev->flags) &&
conf->raid_disks - mddev->degraded > 1)
mbio->bi_opf |= MD_FAILFAST;
mbio->bi_private = r1_bio;
@@ -1546,7 +1545,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
trace_block_bio_remap(mbio, disk_devt(mddev->gendisk),
r1_bio->sector);
/* flush_pending_writes() needs access to the rdev so...*/
- mbio->bi_bdev = (void *)conf->mirrors[i].rdev;
+ mbio->bi_bdev = (void *)rdev;
cb = blk_check_plugged(raid1_unplug, mddev, sizeof(*plug));
if (cb)
--
2.31.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/6] md/raid10: add 'read_err' to raid10_read_request
2021-10-04 15:34 [PATCH 0/6] Misc changes for md Guoqing Jiang
` (2 preceding siblings ...)
2021-10-04 15:34 ` [PATCH 3/6] md/raid1: use rdev in raid1_write_request directly Guoqing Jiang
@ 2021-10-04 15:34 ` Guoqing Jiang
2021-10-07 6:20 ` Song Liu
2021-10-04 15:34 ` [PATCH 5/6] md/raid5: call roundup_pow_of_two in raid5_run Guoqing Jiang
` (2 subsequent siblings)
6 siblings, 1 reply; 14+ messages in thread
From: Guoqing Jiang @ 2021-10-04 15:34 UTC (permalink / raw)
To: song; +Cc: linux-raid
Add the paramenter since the err retry logic is only meaningful when
the caller is handle_read_error.
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
drivers/md/raid10.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index aa2636582841..29eb538db953 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1116,7 +1116,7 @@ static void regular_request_wait(struct mddev *mddev, struct r10conf *conf,
}
static void raid10_read_request(struct mddev *mddev, struct bio *bio,
- struct r10bio *r10_bio)
+ struct r10bio *r10_bio, bool read_err)
{
struct r10conf *conf = mddev->private;
struct bio *read_bio;
@@ -1129,7 +1129,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
struct md_rdev *err_rdev = NULL;
gfp_t gfp = GFP_NOIO;
- if (slot >= 0 && r10_bio->devs[slot].rdev) {
+ if (read_err && slot >= 0 && r10_bio->devs[slot].rdev) {
/*
* This is an error retry, but we cannot
* safely dereference the rdev in the r10_bio,
@@ -1519,7 +1519,7 @@ static void __make_request(struct mddev *mddev, struct bio *bio, int sectors)
conf->geo.raid_disks);
if (bio_data_dir(bio) == READ)
- raid10_read_request(mddev, bio, r10_bio);
+ raid10_read_request(mddev, bio, r10_bio, false);
else
raid10_write_request(mddev, bio, r10_bio);
}
@@ -2918,7 +2918,7 @@ static void handle_read_error(struct mddev *mddev, struct r10bio *r10_bio)
rdev_dec_pending(rdev, mddev);
allow_barrier(conf);
r10_bio->state = 0;
- raid10_read_request(mddev, r10_bio->master_bio, r10_bio);
+ raid10_read_request(mddev, r10_bio->master_bio, r10_bio, true);
}
static void handle_write_completed(struct r10conf *conf, struct r10bio *r10_bio)
--
2.31.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 4/6] md/raid10: add 'read_err' to raid10_read_request
2021-10-04 15:34 ` [PATCH 4/6] md/raid10: add 'read_err' to raid10_read_request Guoqing Jiang
@ 2021-10-07 6:20 ` Song Liu
2021-10-07 10:16 ` Guoqing Jiang
0 siblings, 1 reply; 14+ messages in thread
From: Song Liu @ 2021-10-07 6:20 UTC (permalink / raw)
To: Guoqing Jiang; +Cc: linux-raid
On Mon, Oct 4, 2021 at 8:40 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> Add the paramenter since the err retry logic is only meaningful when
> the caller is handle_read_error.
>
> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
> ---
> drivers/md/raid10.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index aa2636582841..29eb538db953 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -1116,7 +1116,7 @@ static void regular_request_wait(struct mddev *mddev, struct r10conf *conf,
> }
>
> static void raid10_read_request(struct mddev *mddev, struct bio *bio,
> - struct r10bio *r10_bio)
> + struct r10bio *r10_bio, bool read_err)
> {
> struct r10conf *conf = mddev->private;
> struct bio *read_bio;
> @@ -1129,7 +1129,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
> struct md_rdev *err_rdev = NULL;
> gfp_t gfp = GFP_NOIO;
>
> - if (slot >= 0 && r10_bio->devs[slot].rdev) {
> + if (read_err && slot >= 0 && r10_bio->devs[slot].rdev) {
How about we just move this section to a separate function?
Thanks,
Song
> /*
> * This is an error retry, but we cannot
> * safely dereference the rdev in the r10_bio,
> @@ -1519,7 +1519,7 @@ static void __make_request(struct mddev *mddev, struct bio *bio, int sectors)
> conf->geo.raid_disks);
>
> if (bio_data_dir(bio) == READ)
> - raid10_read_request(mddev, bio, r10_bio);
> + raid10_read_request(mddev, bio, r10_bio, false);
> else
> raid10_write_request(mddev, bio, r10_bio);
> }
> @@ -2918,7 +2918,7 @@ static void handle_read_error(struct mddev *mddev, struct r10bio *r10_bio)
> rdev_dec_pending(rdev, mddev);
> allow_barrier(conf);
> r10_bio->state = 0;
> - raid10_read_request(mddev, r10_bio->master_bio, r10_bio);
> + raid10_read_request(mddev, r10_bio->master_bio, r10_bio, true);
> }
>
> static void handle_write_completed(struct r10conf *conf, struct r10bio *r10_bio)
> --
> 2.31.1
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 4/6] md/raid10: add 'read_err' to raid10_read_request
2021-10-07 6:20 ` Song Liu
@ 2021-10-07 10:16 ` Guoqing Jiang
0 siblings, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-10-07 10:16 UTC (permalink / raw)
To: Song Liu; +Cc: linux-raid
On 10/7/21 2:20 PM, Song Liu wrote:
> On Mon, Oct 4, 2021 at 8:40 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>> Add the paramenter since the err retry logic is only meaningful when
>> the caller is handle_read_error.
>>
>> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
>> ---
>> drivers/md/raid10.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index aa2636582841..29eb538db953 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -1116,7 +1116,7 @@ static void regular_request_wait(struct mddev *mddev, struct r10conf *conf,
>> }
>>
>> static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>> - struct r10bio *r10_bio)
>> + struct r10bio *r10_bio, bool read_err)
>> {
>> struct r10conf *conf = mddev->private;
>> struct bio *read_bio;
>> @@ -1129,7 +1129,7 @@ static void raid10_read_request(struct mddev *mddev, struct bio *bio,
>> struct md_rdev *err_rdev = NULL;
>> gfp_t gfp = GFP_NOIO;
>>
>> - if (slot >= 0 && r10_bio->devs[slot].rdev) {
>> + if (read_err && slot >= 0 && r10_bio->devs[slot].rdev) {
> How about we just move this section to a separate function?
Will add an additional patch for it.
Thanks,
Guoqing
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 5/6] md/raid5: call roundup_pow_of_two in raid5_run
2021-10-04 15:34 [PATCH 0/6] Misc changes for md Guoqing Jiang
` (3 preceding siblings ...)
2021-10-04 15:34 ` [PATCH 4/6] md/raid10: add 'read_err' to raid10_read_request Guoqing Jiang
@ 2021-10-04 15:34 ` Guoqing Jiang
2021-10-04 15:34 ` [PATCH 6/6] md: remove unused argument from md_new_event Guoqing Jiang
2021-10-07 6:32 ` [PATCH 0/6] Misc changes for md Song Liu
6 siblings, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-10-04 15:34 UTC (permalink / raw)
To: song; +Cc: linux-raid
Let's call roundup_pow_of_two here instead of open code.
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
drivers/md/raid5.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 02ed53b20654..4ea9e7b647b0 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7732,10 +7732,7 @@ static int raid5_run(struct mddev *mddev)
* discard data disk but write parity disk
*/
stripe = stripe * PAGE_SIZE;
- /* Round up to power of 2, as discard handling
- * currently assumes that */
- while ((stripe-1) & stripe)
- stripe = (stripe | (stripe-1)) + 1;
+ stripe = roundup_pow_of_two(stripe);
mddev->queue->limits.discard_alignment = stripe;
mddev->queue->limits.discard_granularity = stripe;
--
2.31.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 6/6] md: remove unused argument from md_new_event
2021-10-04 15:34 [PATCH 0/6] Misc changes for md Guoqing Jiang
` (4 preceding siblings ...)
2021-10-04 15:34 ` [PATCH 5/6] md/raid5: call roundup_pow_of_two in raid5_run Guoqing Jiang
@ 2021-10-04 15:34 ` Guoqing Jiang
2021-10-07 6:32 ` [PATCH 0/6] Misc changes for md Song Liu
6 siblings, 0 replies; 14+ messages in thread
From: Guoqing Jiang @ 2021-10-04 15:34 UTC (permalink / raw)
To: song; +Cc: linux-raid
Actually, mddev is not used by md_new_event.
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
---
drivers/md/md.c | 30 +++++++++++++++---------------
drivers/md/md.h | 2 +-
drivers/md/raid10.c | 2 +-
drivers/md/raid5.c | 2 +-
4 files changed, 18 insertions(+), 18 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index c322841d4edc..90c624bec695 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -352,7 +352,7 @@ static bool create_on_open = true;
*/
static DECLARE_WAIT_QUEUE_HEAD(md_event_waiters);
static atomic_t md_event_count;
-void md_new_event(struct mddev *mddev)
+void md_new_event(void)
{
atomic_inc(&md_event_count);
wake_up(&md_event_waiters);
@@ -2886,7 +2886,7 @@ static int add_bound_rdev(struct md_rdev *rdev)
if (mddev->degraded)
set_bit(MD_RECOVERY_RECOVER, &mddev->recovery);
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
- md_new_event(mddev);
+ md_new_event();
md_wakeup_thread(mddev->thread);
return 0;
}
@@ -3002,7 +3002,7 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags);
md_wakeup_thread(mddev->thread);
}
- md_new_event(mddev);
+ md_new_event();
}
}
} else if (cmd_match(buf, "writemostly")) {
@@ -4099,7 +4099,7 @@ level_store(struct mddev *mddev, const char *buf, size_t len)
if (!mddev->thread)
md_update_sb(mddev, 1);
sysfs_notify_dirent_safe(mddev->sysfs_level);
- md_new_event(mddev);
+ md_new_event();
rv = len;
out_unlock:
mddev_unlock(mddev);
@@ -4620,7 +4620,7 @@ new_dev_store(struct mddev *mddev, const char *buf, size_t len)
export_rdev(rdev);
mddev_unlock(mddev);
if (!err)
- md_new_event(mddev);
+ md_new_event();
return err ? err : len;
}
@@ -6041,7 +6041,7 @@ int md_run(struct mddev *mddev)
if (mddev->sb_flags)
md_update_sb(mddev, 0);
- md_new_event(mddev);
+ md_new_event();
return 0;
bitmap_abort:
@@ -6431,7 +6431,7 @@ static int do_md_stop(struct mddev *mddev, int mode,
if (mddev->hold_active == UNTIL_STOP)
mddev->hold_active = 0;
}
- md_new_event(mddev);
+ md_new_event();
sysfs_notify_dirent_safe(mddev->sysfs_state);
return 0;
}
@@ -6935,7 +6935,7 @@ static int hot_remove_disk(struct mddev *mddev, dev_t dev)
md_wakeup_thread(mddev->thread);
else
md_update_sb(mddev, 1);
- md_new_event(mddev);
+ md_new_event();
return 0;
busy:
@@ -7008,7 +7008,7 @@ static int hot_add_disk(struct mddev *mddev, dev_t dev)
*/
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
- md_new_event(mddev);
+ md_new_event();
return 0;
abort_export:
@@ -7982,7 +7982,7 @@ void md_error(struct mddev *mddev, struct md_rdev *rdev)
md_wakeup_thread(mddev->thread);
if (mddev->event_work.func)
queue_work(md_misc_wq, &mddev->event_work);
- md_new_event(mddev);
+ md_new_event();
}
EXPORT_SYMBOL(md_error);
@@ -8866,7 +8866,7 @@ void md_do_sync(struct md_thread *thread)
mddev->curr_resync = 3; /* no longer delayed */
mddev->curr_resync_completed = j;
sysfs_notify_dirent_safe(mddev->sysfs_completed);
- md_new_event(mddev);
+ md_new_event();
update_time = jiffies;
blk_start_plug(&plug);
@@ -8937,7 +8937,7 @@ void md_do_sync(struct md_thread *thread)
/* this is the earliest that rebuild will be
* visible in /proc/mdstat
*/
- md_new_event(mddev);
+ md_new_event();
if (last_check + window > io_sectors || j == max_sectors)
continue;
@@ -9161,7 +9161,7 @@ static int remove_and_add_spares(struct mddev *mddev,
sysfs_link_rdev(mddev, rdev);
if (!test_bit(Journal, &rdev->flags))
spares++;
- md_new_event(mddev);
+ md_new_event();
set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags);
}
}
@@ -9195,7 +9195,7 @@ static void md_start_sync(struct work_struct *ws)
} else
md_wakeup_thread(mddev->sync_thread);
sysfs_notify_dirent_safe(mddev->sysfs_action);
- md_new_event(mddev);
+ md_new_event();
}
/*
@@ -9454,7 +9454,7 @@ void md_reap_sync_thread(struct mddev *mddev)
/* flag recovery needed just to double check */
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
sysfs_notify_dirent_safe(mddev->sysfs_action);
- md_new_event(mddev);
+ md_new_event();
if (mddev->event_work.func)
queue_work(md_misc_wq, &mddev->event_work);
}
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 4c96c36bd01a..53ea7a6961de 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -731,7 +731,7 @@ extern int sync_page_io(struct md_rdev *rdev, sector_t sector, int size,
struct page *page, int op, int op_flags,
bool metadata_op);
extern void md_do_sync(struct md_thread *thread);
-extern void md_new_event(struct mddev *mddev);
+extern void md_new_event(void);
extern void md_allow_write(struct mddev *mddev);
extern void md_wait_for_blocked_rdev(struct md_rdev *rdev, struct mddev *mddev);
extern void md_set_array_sectors(struct mddev *mddev, sector_t array_sectors);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 29eb538db953..0aee0675f18e 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4647,7 +4647,7 @@ static int raid10_start_reshape(struct mddev *mddev)
}
conf->reshape_checkpoint = jiffies;
md_wakeup_thread(mddev->sync_thread);
- md_new_event(mddev);
+ md_new_event();
return 0;
abort:
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 4ea9e7b647b0..9c1a5877cf9f 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -8279,7 +8279,7 @@ static int raid5_start_reshape(struct mddev *mddev)
}
conf->reshape_checkpoint = jiffies;
md_wakeup_thread(mddev->sync_thread);
- md_new_event(mddev);
+ md_new_event();
return 0;
}
--
2.31.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 0/6] Misc changes for md
2021-10-04 15:34 [PATCH 0/6] Misc changes for md Guoqing Jiang
` (5 preceding siblings ...)
2021-10-04 15:34 ` [PATCH 6/6] md: remove unused argument from md_new_event Guoqing Jiang
@ 2021-10-07 6:32 ` Song Liu
6 siblings, 0 replies; 14+ messages in thread
From: Song Liu @ 2021-10-07 6:32 UTC (permalink / raw)
To: Guoqing Jiang; +Cc: linux-raid
On Mon, Oct 4, 2021 at 8:40 AM Guoqing Jiang <guoqing.jiang@linux.dev> wrote:
>
> Hello,
>
> The first patch fixes the same calltrace as commit 6607cd319b6b ("raid1:
> ensure write behind bio has less than BIO_MAX_VECS sectors") tried
> before, but unfortunately the calltrace still could happen if array
> without write mostly device is configured with write-behind enabled.
> So the first patch is suitable for fix branch which others are materials
> for next branch.
>
> Pls review.
>
> Thanks,
> Guoqing
>
> Guoqing Jiang (6):
> md/raid1: only allocate write behind bio for WriteMostly device
> md/bitmap: don't set max_write_behind if there is no write mostly
> device
> md/raid1: use rdev in raid1_write_request directly
> md/raid10: add 'read_err' to raid10_read_request
> md/raid5: call roundup_pow_of_two in raid5_run
> md: remove unused argument from md_new_event
Thanks for these fixes and cleanups! I applied 1, 3, 5, 6 to md-next.
Song
^ permalink raw reply [flat|nested] 14+ messages in thread