* [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
@ 2019-07-29 19:33 ` Guilherme G. Piccoli
  0 siblings, 0 replies; 26+ messages in thread
From: Guilherme G. Piccoli @ 2019-07-29 19:33 UTC (permalink / raw)
  To: linux-raid
  Cc: Song Liu, gpiccoli, NeilBrown, linux-block, dm-devel, jay.vosburgh

Currently md/raid0 is not provided with any mechanism to validate if
an array member got removed or failed. The driver keeps sending BIOs
regardless of the state of array members. This leads to the following
situation: if a raid0 array member is removed and the array is mounted,
some user writing to this array won't realize that errors are happening
unless they check kernel log or perform one fsync per written file.

In other words, no -EIO is returned and writes (except direct ones) appear
normal, meaning the user might think the written data is correctly stored
in the array when in fact garbage was written, given that raid0 does
striping (and so requires all its members to be working in order not to
corrupt data).

This patch proposes a small change in this behavior: we check if the block
device's gendisk is UP when submitting the BIO to the array member, and if
it isn't, we flag the md device as broken and fail subsequent I/Os. In case
the array is restored (i.e., the missing array member is back), the flag is
cleared and we can submit BIOs normally.

With this patch, the filesystem reacts much faster to the event of a
missing array member: after some I/O errors, ext4 for instance aborts the
journal and prevents corruption. Without this change, we're able to keep
writing to the disk, and after a machine reboot, e2fsck shows some severe
fs errors that demand fixing. This patch was tested with both the ext4 and
xfs filesystems.

Cc: NeilBrown <neilb@suse.com>
Cc: Song Liu <songliubraving@fb.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
---

After an attempt to change the way md/raid0 fails in case one of its
members gets removed (see [0]), we got not-so-great feedback from the
community, and the change was also complex and had corner cases.
One of the points of consensus in that attempt was that raid0 could
benefit from a way of failing faster in case a member gets removed.
This patch aims for that, in a simpler way than previously proposed.
Any feedback is welcome, thanks in advance!


Guilherme


[0] lore.kernel.org/linux-block/20190418220448.7219-1-gpiccoli@canonical.com


 drivers/md/md.c    | 5 +++++
 drivers/md/md.h    | 3 +++
 drivers/md/raid0.c | 8 ++++++++
 3 files changed, 16 insertions(+)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 24638ccedce4..fba49918d591 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -376,6 +376,11 @@ static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio)
 	struct mddev *mddev = q->queuedata;
 	unsigned int sectors;
 
+	if (unlikely(test_bit(MD_BROKEN, &mddev->flags))) {
+		bio_io_error(bio);
+		return BLK_QC_T_NONE;
+	}
+
 	blk_queue_split(q, &bio);
 
 	if (mddev == NULL || mddev->pers == NULL) {
diff --git a/drivers/md/md.h b/drivers/md/md.h
index 10f98200e2f8..41552e615c4c 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -248,6 +248,9 @@ enum mddev_flags {
 	MD_UPDATING_SB,		/* md_check_recovery is updating the metadata
 				 * without explicitly holding reconfig_mutex.
 				 */
+	MD_BROKEN,              /* This is used in RAID-0 only, to stop I/O
+				 * in case an array member is gone/failed.
+				 */
 };
 
 enum mddev_sb_flags {
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index bf5cf184a260..58a9cc5193bf 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -586,6 +586,14 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
 
 	zone = find_zone(mddev->private, &sector);
 	tmp_dev = map_sector(mddev, zone, sector, &sector);
+
+	if (unlikely(!(tmp_dev->bdev->bd_disk->flags & GENHD_FL_UP))) {
+		set_bit(MD_BROKEN, &mddev->flags);
+		bio_io_error(bio);
+		return true;
+	}
+
+	clear_bit(MD_BROKEN, &mddev->flags);
 	bio_set_dev(bio, tmp_dev->bdev);
 	bio->bi_iter.bi_sector = sector + zone->dev_start +
 		tmp_dev->data_offset;
-- 
2.22.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-29 19:33 ` Guilherme G. Piccoli
@ 2019-07-29 20:18   ` Roman Mamedov
  -1 siblings, 0 replies; 26+ messages in thread
From: Roman Mamedov @ 2019-07-29 20:18 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, jay.vosburgh

On Mon, 29 Jul 2019 16:33:59 -0300
"Guilherme G. Piccoli" <gpiccoli@canonical.com> wrote:

> Currently md/raid0 is not provided with any mechanism to validate if
> an array member got removed or failed. The driver keeps sending BIOs
> regardless of the state of array members. This leads to the following
> situation: if a raid0 array member is removed and the array is mounted,
> some user writing to this array won't realize that errors are happening
> unless they check kernel log or perform one fsync per written file.
> 
> In other words, no -EIO is returned and writes (except direct ones) appear
> normal, meaning the user might think the written data is correctly stored
> in the array when in fact garbage was written, given that raid0 does
> striping (and so requires all its members to be working in order not to
> corrupt data).

If that's correct, then this seems to be a critical weak point in cases when
we have a RAID0 as a member device in RAID1/5/6/10 arrays.

-- 
With respect,
Roman

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-29 20:18   ` Roman Mamedov
@ 2019-07-29 20:27     ` Guilherme G. Piccoli
  -1 siblings, 0 replies; 26+ messages in thread
From: Guilherme G. Piccoli @ 2019-07-29 20:27 UTC (permalink / raw)
  To: Roman Mamedov
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, jay.vosburgh



On 29/07/2019 17:18, Roman Mamedov wrote:
> On Mon, 29 Jul 2019 16:33:59 -0300
> "Guilherme G. Piccoli" <gpiccoli@canonical.com> wrote:
> 
>> Currently md/raid0 is not provided with any mechanism to validate if
>> an array member got removed or failed. The driver keeps sending BIOs
>> regardless of the state of array members. This leads to the following
>> situation: if a raid0 array member is removed and the array is mounted,
>> some user writing to this array won't realize that errors are happening
>> unless they check kernel log or perform one fsync per written file.
>>
>> In other words, no -EIO is returned and writes (except direct ones) appear
>> normal, meaning the user might think the written data is correctly stored
>> in the array when in fact garbage was written, given that raid0 does
>> striping (and so requires all its members to be working in order not to
>> corrupt data).
> 
> If that's correct, then this seems to be a critical weak point in cases when
> we have a RAID0 as a member device in RAID1/5/6/10 arrays.
> 

Hi Roman, I don't think this is a usual setup. I understand that there is
RAID10 (also known as RAID 0+1), in which we can have, say, 4 devices that
pair into 2 sets of two disks using striping, and then these sets are
paired using mirroring. This is handled by the raid10 driver, however, so
it won't suffer from this issue.

I don't think it's common, or even makes sense, to back a raid1 with 2
pure raid0 devices.
Thanks for your comment!
Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-29 20:27     ` Guilherme G. Piccoli
@ 2019-07-29 20:36       ` Roman Mamedov
  -1 siblings, 0 replies; 26+ messages in thread
From: Roman Mamedov @ 2019-07-29 20:36 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, jay.vosburgh

On Mon, 29 Jul 2019 17:27:15 -0300
"Guilherme G. Piccoli" <gpiccoli@canonical.com> wrote:

> Hi Roman, I don't think this is a usual setup. I understand that there is
> RAID10 (also known as RAID 0+1), in which we can have, say, 4 devices that
> pair into 2 sets of two disks using striping, and then these sets are
> paired using mirroring. This is handled by the raid10 driver, however, so
> it won't suffer from this issue.
>
> I don't think it's common, or even makes sense, to back a raid1 with 2
> pure raid0 devices.

It might not be a usual setup, but it is a nice possibility that you get
with MD, for when you don't have drives of the needed size at the moment
but do have smaller drives. E.g.:

- had a 2x1TB RAID1;
- one disk fails;
- no 1TB disks at hand;
- but lots of 500GB disks;
- let's make a 2x500GB RAID0 and have that stand in for the missing 1TB
member for the time being;

Or here's a detailed rationale for a more permanent scenario:
https://louwrentius.com/building-a-raid-6-array-of-mixed-drives.html

-- 
With respect,
Roman

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-29 20:36       ` Roman Mamedov
@ 2019-07-29 20:49         ` Guilherme G. Piccoli
  -1 siblings, 0 replies; 26+ messages in thread
From: Guilherme G. Piccoli @ 2019-07-29 20:49 UTC (permalink / raw)
  To: Roman Mamedov
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, jay.vosburgh

On 29/07/2019 17:36, Roman Mamedov wrote:
> On Mon, 29 Jul 2019 17:27:15 -0300
> "Guilherme G. Piccoli" <gpiccoli@canonical.com> wrote:
> 
>> Hi Roman, I don't think this is a usual setup. I understand that there is
>> RAID10 (also known as RAID 0+1), in which we can have, say, 4 devices that
>> pair into 2 sets of two disks using striping, and then these sets are
>> paired using mirroring. This is handled by the raid10 driver, however, so
>> it won't suffer from this issue.
>>
>> I don't think it's common, or even makes sense, to back a raid1 with 2
>> pure raid0 devices.
> 
> It might not be a usual setup, but it is a nice possibility that you get
> with MD, for when you don't have drives of the needed size at the moment
> but do have smaller drives. E.g.:
> 
> - had a 2x1TB RAID1;
> - one disk fails;
> - no 1TB disks at hand;
> - but lots of 500GB disks;
> - let's make a 2x500GB RAID0 and have that stand in for the missing 1TB
> member for the time being;
> 
> Or here's a detailed rationale for a more permanent scenario:
> https://louwrentius.com/building-a-raid-6-array-of-mixed-drives.html
> 

Oh, that's nice to know, thanks for the clarification, Roman.
I wasn't aware this was more or less common.

Anyway, I agree with you: in this case, it's a weak point of raid0 to be
so slow to react to the failure of one member. I hope this patch helps
to alleviate the issue.
Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-29 20:27     ` Guilherme G. Piccoli
@ 2019-07-29 21:14       ` Reindl Harald
  -1 siblings, 0 replies; 26+ messages in thread
From: Reindl Harald @ 2019-07-29 21:14 UTC (permalink / raw)
  To: Guilherme G. Piccoli, Roman Mamedov
  Cc: linux-block, Song Liu, NeilBrown, linux-raid, dm-devel, jay.vosburgh



On 29.07.19 at 22:27, Guilherme G. Piccoli wrote:
>> If that's correct, then this seems to be a critical weak point in cases when
>> we have a RAID0 as a member device in RAID1/5/6/10 arrays.
> 
> Hi Roman, I don't think this is a usual setup. I understand that there is
> RAID10 (also known as RAID 0+1), in which we can have, say, 4 devices that
> pair into 2 sets of two disks using striping, and then these sets are
> paired using mirroring. This is handled by the raid10 driver, however, so
> it won't suffer from this issue.
>
> I don't think it's common, or even makes sense, to back a raid1 with 2
> pure raid0 devices.
If I had been aware that RAID10 doesn't support "--write-mostly" to make
a hybrid HDD/SSD RAID (https://www.tansi.org/hybrid/), I would likely
have done exactly that, and bought only 2 instead of 4 x 2 TB SSD disks
here. Frankly, I have another 5 machines where this limitation of RAID10
on Linux sucks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-29 19:33 ` Guilherme G. Piccoli
@ 2019-07-30  0:08   ` NeilBrown
  -1 siblings, 0 replies; 26+ messages in thread
From: NeilBrown @ 2019-07-30  0:08 UTC (permalink / raw)
  To: Guilherme G. Piccoli, linux-raid
  Cc: linux-block, jay.vosburgh, Song Liu, dm-devel, Neil F Brown


[-- Attachment #1.1: Type: text/plain, Size: 4323 bytes --]

On Mon, Jul 29 2019,  Guilherme G. Piccoli  wrote:

> Currently md/raid0 is not provided with any mechanism to validate if
> an array member got removed or failed. The driver keeps sending BIOs
> regardless of the state of array members. This leads to the following
> situation: if a raid0 array member is removed and the array is mounted,
> some user writing to this array won't realize that errors are happening
> unless they check kernel log or perform one fsync per written file.
>
> In other words, no -EIO is returned and writes (except direct ones) appear
> normal, meaning the user might think the written data is correctly stored
> in the array when in fact garbage was written, given that raid0 does
> striping (and so requires all its members to be working in order not to
> corrupt data).
>
> This patch proposes a small change in this behavior: we check if the block
> device's gendisk is UP when submitting the BIO to the array member, and if
> it isn't, we flag the md device as broken and fail subsequent I/Os. In case
> the array is restored (i.e., the missing array member is back), the flag is
> cleared and we can submit BIOs normally.
>
> With this patch, the filesystem reacts much faster to the event of a
> missing array member: after some I/O errors, ext4 for instance aborts the
> journal and prevents corruption. Without this change, we're able to keep
> writing to the disk, and after a machine reboot, e2fsck shows some severe
> fs errors that demand fixing. This patch was tested with both the ext4 and
> xfs filesystems.
>
> Cc: NeilBrown <neilb@suse.com>
> Cc: Song Liu <songliubraving@fb.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
> ---
>
> After an attempt to change the way md/raid0 fails in case one of its
> members gets removed (see [0]), we got not-so-great feedback from the
> community, and the change was also complex and had corner cases.
> One of the points of consensus in that attempt was that raid0 could
> benefit from a way of failing faster in case a member gets removed.
> This patch aims for that, in a simpler way than previously proposed.
> Any feedback is welcome, thanks in advance!
>
>
> Guilherme
>
>
> [0] lore.kernel.org/linux-block/20190418220448.7219-1-gpiccoli@canonical.com
>
>
>  drivers/md/md.c    | 5 +++++
>  drivers/md/md.h    | 3 +++
>  drivers/md/raid0.c | 8 ++++++++
>  3 files changed, 16 insertions(+)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 24638ccedce4..fba49918d591 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -376,6 +376,11 @@ static blk_qc_t md_make_request(struct request_queue *q, struct bio *bio)
>  	struct mddev *mddev = q->queuedata;
>  	unsigned int sectors;
>  
> +	if (unlikely(test_bit(MD_BROKEN, &mddev->flags))) {
> +		bio_io_error(bio);
> +		return BLK_QC_T_NONE;
> +	}

I think this should only fail WRITE requests, not READ requests.

Otherwise the patch is probably reasonable.

NeilBrown


> +
>  	blk_queue_split(q, &bio);
>  
>  	if (mddev == NULL || mddev->pers == NULL) {
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 10f98200e2f8..41552e615c4c 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -248,6 +248,9 @@ enum mddev_flags {
>  	MD_UPDATING_SB,		/* md_check_recovery is updating the metadata
>  				 * without explicitly holding reconfig_mutex.
>  				 */
> +	MD_BROKEN,              /* This is used in RAID-0 only, to stop I/O
> +				 * in case an array member is gone/failed.
> +				 */
>  };
>  
>  enum mddev_sb_flags {
> diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
> index bf5cf184a260..58a9cc5193bf 100644
> --- a/drivers/md/raid0.c
> +++ b/drivers/md/raid0.c
> @@ -586,6 +586,14 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
>  
>  	zone = find_zone(mddev->private, &sector);
>  	tmp_dev = map_sector(mddev, zone, sector, &sector);
> +
> +	if (unlikely(!(tmp_dev->bdev->bd_disk->flags & GENHD_FL_UP))) {
> +		set_bit(MD_BROKEN, &mddev->flags);
> +		bio_io_error(bio);
> +		return true;
> +	}
> +
> +	clear_bit(MD_BROKEN, &mddev->flags);
>  	bio_set_dev(bio, tmp_dev->bdev);
>  	bio->bi_iter.bi_sector = sector + zone->dev_start +
>  		tmp_dev->data_offset;
> -- 
> 2.22.0

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

[-- Attachment #2: Type: text/plain, Size: 0 bytes --]



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-30  0:08   ` NeilBrown
@ 2019-07-30 12:30     ` Guilherme G. Piccoli
  -1 siblings, 0 replies; 26+ messages in thread
From: Guilherme G. Piccoli @ 2019-07-30 12:30 UTC (permalink / raw)
  To: NeilBrown, linux-raid
  Cc: linux-block, jay.vosburgh, Song Liu, dm-devel, Neil F Brown

On 29/07/2019 21:08, NeilBrown wrote:
>[...]
>> +	if (unlikely(test_bit(MD_BROKEN, &mddev->flags))) {
>> +		bio_io_error(bio);
>> +		return BLK_QC_T_NONE;
>> +	}
> 
> I think this should only fail WRITE requests, not READ requests.
> 
> Otherwise the patch is probably reasonable.
> 
> NeilBrown

Thanks for the feedback Neil! I thought about it; it seemed to me better
to deny/fail the reads instead of returning "wrong" reads, since a file
read in a raid0 will be incomplete if one member is missing.
But it's fine for me to change that in the next iteration of this patch.
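
Just to make the idea concrete, here's a rough sketch of how the check in
md_make_request() could be restricted to writes (illustrative only, not
necessarily what V2 will look like; it reuses the names from the patch
above and assumes bio_data_dir() is the right helper to tell reads from
writes):

	/* Sketch: on a broken raid0, fail only writes up front; reads
	 * still reach the members and only fail with -EIO if they hit
	 * the missing device.
	 */
	if (unlikely(test_bit(MD_BROKEN, &mddev->flags)) &&
	    bio_data_dir(bio) == WRITE) {
		bio_io_error(bio);
		return BLK_QC_T_NONE;
	}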

Cheers,


Guilherme

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-30 12:30     ` Guilherme G. Piccoli
@ 2019-07-31 19:54       ` Song Liu
  -1 siblings, 0 replies; 26+ messages in thread
From: Song Liu @ 2019-07-31 19:54 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: Neil F Brown, Song Liu, NeilBrown, linux-raid, dm-devel,
	linux-block, Jay Vosburgh

On Tue, Jul 30, 2019 at 5:31 AM Guilherme G. Piccoli
<gpiccoli@canonical.com> wrote:
>
> On 29/07/2019 21:08, NeilBrown wrote:
> >[...]
> >> +    if (unlikely(test_bit(MD_BROKEN, &mddev->flags))) {
> >> +            bio_io_error(bio);
> >> +            return BLK_QC_T_NONE;
> >> +    }
> >
> > I think this should only fail WRITE requests, not READ requests.
> >
> > Otherwise the patch is probably reasonable.
> >
> > NeilBrown
>
> Thanks for the feedback Neil! I thought about it; it seemed to me better
> to deny/fail the reads instead of returning "wrong" reads, since a file
> read in a raid0 will be incomplete if one member is missing.
> But it's fine for me to change that in the next iteration of this patch.

For reads at block/page level, we will either get EIO or valid data, right?

If that's not the case, we should fail all writes.

Thanks,
Song

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-31 19:54       ` Song Liu
@ 2019-07-31 19:56         ` Song Liu
  -1 siblings, 0 replies; 26+ messages in thread
From: Song Liu @ 2019-07-31 19:56 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: Neil F Brown, Song Liu, NeilBrown, linux-raid, dm-devel,
	linux-block, Jay Vosburgh

On Wed, Jul 31, 2019 at 12:54 PM Song Liu <liu.song.a23@gmail.com> wrote:
>
> On Tue, Jul 30, 2019 at 5:31 AM Guilherme G. Piccoli
> <gpiccoli@canonical.com> wrote:
> >
> > On 29/07/2019 21:08, NeilBrown wrote:
> > >[...]
> > >> +    if (unlikely(test_bit(MD_BROKEN, &mddev->flags))) {
> > >> +            bio_io_error(bio);
> > >> +            return BLK_QC_T_NONE;
> > >> +    }
> > >
> > > I think this should only fail WRITE requests, not READ requests.
> > >
> > > Otherwise the patch is probably reasonable.
> > >
> > > NeilBrown
> >
> > Thanks for the feedback Neil! I thought about it; it seemed to me better
> > to deny/fail the reads instead of returning "wrong" reads, since a file
> > read in a raid0 will be incomplete if one member is missing.
> > But it's fine for me to change that in the next iteration of this patch.
>
> For reads at block/page level, we will either get EIO or valid data, right?
>
> If that's not the case, we should fail all writes.

Oops, I meant all _reads_.

>
> Thanks,
> Song

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-07-31 19:56         ` Song Liu
@ 2019-08-01 20:28           ` Guilherme G. Piccoli
  -1 siblings, 0 replies; 26+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-01 20:28 UTC (permalink / raw)
  To: Song Liu
  Cc: Neil F Brown, Song Liu, NeilBrown, linux-raid, dm-devel,
	linux-block, Jay Vosburgh



On 31/07/2019 16:56, Song Liu wrote:
> On Wed, Jul 31, 2019 at 12:54 PM Song Liu <liu.song.a23@gmail.com> wrote:
>>
>> On Tue, Jul 30, 2019 at 5:31 AM Guilherme G. Piccoli
>> <gpiccoli@canonical.com> wrote:
>>>
>>> On 29/07/2019 21:08, NeilBrown wrote:
>>>> [...]
>>>>> +    if (unlikely(test_bit(MD_BROKEN, &mddev->flags))) {
>>>>> +            bio_io_error(bio);
>>>>> +            return BLK_QC_T_NONE;
>>>>> +    }
>>>>
>>>> I think this should only fail WRITE requests, not READ requests.
>>>>
>>>> Otherwise the patch is probably reasonable.
>>>>
>>>> NeilBrown
>>>
>>> Thanks for the feedback Neil! I thought about it; it seemed to me better
>>> to deny/fail the reads instead of returning "wrong" reads, since a file
>>> read in a raid0 will be incomplete if one member is missing.
>>> But it's fine for me to change that in the next iteration of this patch.
>>
>> For reads at block/page level, we will either get EIO or valid data, right?
>>
>> If that's not the case, we should fail all writes.
> 
> Oops, I meant all _reads_.

Hi Song, thanks for the feedback! After changing the patch and testing a
bit, it behaves exactly as you said, we got either valid data read from
the healthy devices or -EIO for the data tentatively read from the
failed/missing array members.

So, I'll resubmit with that change. Also, I've noticed that clearing the
BROKEN flag seems unnecessary: if the user stops the array in order to
fix the missing member, it'll require a re-assembly anyway, and the array
is going to work again.

Do you / Neil consider this fix relevant to md/linear too? If so, I can
also include that in the V2.
Thanks,


Guilherme

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-08-01 20:28           ` Guilherme G. Piccoli
@ 2019-08-01 22:43             ` Song Liu
  -1 siblings, 0 replies; 26+ messages in thread
From: Song Liu @ 2019-08-01 22:43 UTC (permalink / raw)
  To: Guilherme G. Piccoli
  Cc: Neil F Brown, Song Liu, NeilBrown, linux-raid, dm-devel,
	linux-block, Jay Vosburgh



> On Aug 1, 2019, at 1:28 PM, Guilherme G. Piccoli <gpiccoli@canonical.com> wrote:
> 
> 
> 
> On 31/07/2019 16:56, Song Liu wrote:
>> On Wed, Jul 31, 2019 at 12:54 PM Song Liu <liu.song.a23@gmail.com> wrote:
>>> 
>>> On Tue, Jul 30, 2019 at 5:31 AM Guilherme G. Piccoli
>>> <gpiccoli@canonical.com> wrote:
>>>> 
>>>> On 29/07/2019 21:08, NeilBrown wrote:
>>>>> [...]
>>>>>> +    if (unlikely(test_bit(MD_BROKEN, &mddev->flags))) {
>>>>>> +            bio_io_error(bio);
>>>>>> +            return BLK_QC_T_NONE;
>>>>>> +    }
>>>>> 
>>>>> I think this should only fail WRITE requests, not READ requests.
>>>>> 
>>>>> Otherwise the patch is probably reasonable.
>>>>> 
>>>>> NeilBrown
>>>> 
>>>> Thanks for the feedback Neil! I thought about it; it seemed to me better
>>>> to deny/fail the reads instead of returning "wrong" reads, since a file
>>>> read in a raid0 will be incomplete if one member is missing.
>>>> But it's fine for me to change that in the next iteration of this patch.
>>> 
>>> For reads at block/page level, we will either get EIO or valid data, right?
>>> 
>>> If that's not the case, we should fail all writes.
>> 
>> Oops, I meant all _reads_.
> 
> Hi Song, thanks for the feedback! After changing the patch and testing a
> bit, it behaves exactly as you said, we got either valid data read from
> the healthy devices or -EIO for the data tentatively read from the
> failed/missing array members.

Thanks for testing this out. 

> 
> So, I'll resubmit with that change. Also, I've noticed that clearing the
> BROKEN flag seems unnecessary: if the user stops the array in order to
> fix the missing member, it'll require a re-assembly anyway, and the array
> is going to work again.
>
> Do you / Neil consider this fix relevant to md/linear too? If so, I can
> also include that in the V2.

Yes, please also include fix for md/linear. 
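
For reference, a rough sketch of what the analogous check might look like
in linear_make_request() in drivers/md/linear.c, purely illustrative; it
assumes the same MD_BROKEN flag plus the which_dev()/struct dev_info
mapping the driver uses to pick the member for a given sector:

	tmp_dev = which_dev(mddev, bio_sector);

	/* Hypothetical: the same GENHD_FL_UP test as in raid0, applied to
	 * the mapped member before re-targeting the bio to it.
	 */
	if (unlikely(!(tmp_dev->rdev->bdev->bd_disk->flags & GENHD_FL_UP))) {
		set_bit(MD_BROKEN, &mddev->flags);
		bio_io_error(bio);
		return true;
	}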

Song

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] md/raid0: Fail BIOs if their underlying block device is gone
  2019-08-01 22:43             ` Song Liu
@ 2019-08-16 13:45               ` Guilherme G. Piccoli
  -1 siblings, 0 replies; 26+ messages in thread
From: Guilherme G. Piccoli @ 2019-08-16 13:45 UTC (permalink / raw)
  To: Song Liu
  Cc: Neil F Brown, Song Liu, NeilBrown, linux-raid, dm-devel,
	linux-block, Jay Vosburgh

On 01/08/2019 19:43, Song Liu wrote:
> 
> [...]
>> Hi Song, thanks for the feedback! After changing the patch and testing a
>> bit, it behaves exactly as you said, we got either valid data read from
>> the healthy devices or -EIO for the data tentatively read from the
>> failed/missing array members.
> 
> Thanks for testing this out. 
> 
>>
>> So, I'll resubmit with that change. Also, I've noticed that clearing the
>> BROKEN flag seems unnecessary: if the user stops the array in order to
>> fix the missing member, it'll require a re-assembly anyway, and the array
>> is going to work again.
>>
>> Do you / Neil consider this fix relevant to md/linear too? If so, I can
>> also include that in the V2.
> 
> Yes, please also include fix for md/linear. 
> 
> Song
> 

V2 just sent:
lore.kernel.org/linux-block/20190816133441.29350-1-gpiccoli@canonical.com

Thanks,


Guilherme

^ permalink raw reply	[flat|nested] 26+ messages in thread

