All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH][HSD116699] raid1: prefer disk without bad blocks
@ 2017-05-10 10:53 Tomasz Majchrzak
  2017-05-10 13:16 ` [PATCH v2] " Tomasz Majchrzak
  0 siblings, 1 reply; 5+ messages in thread
From: Tomasz Majchrzak @ 2017-05-10 10:53 UTC (permalink / raw)
  To: linux-raid; +Cc: shli, Tomasz Majchrzak

If an array consists of two drives and the first drive has the bad
block, the read request to the region overlapping the bad block chooses
the same disk (with bad block) as device to read from over and over and
the request gets stuck. If the first disk only partially overlaps with
bad block, it becomes a candidate ("best disk") for shorter range of
sectors. The second disk is capable of reading the entire requested
range and it is updated accordingly, however it is not recorded as a
best device for the request. In the end the request is sent to the first
disk to read entire range of sectors. It fails and is re-tried in a
moment but with the same outcome.

Actually it is quite likely scenario but it had little exposure in my
test until commit 715d40b93b10 ("md/raid1: add failfast handling for
reads.") removed preference for idle disk. Such scenario had been passing
as second disk was always chosen when idle.

Update a candidate ("best disk") to read from if disk can read entire
range. Do it only if other disk has already been chosen as a candidate
for a smaller range. Otherwise, don't update it - let the head position /
disk type logic to select a disk.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 drivers/md/raid1.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index a1f3fbe..e7ab3bb 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -628,8 +628,11 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect
 					break;
 			}
 			continue;
-		} else
+		} else {
+			if ((sectors > best_good_sectors) && (best_disk >= 0))
+				best_disk = disk;
 			best_good_sectors = sectors;
+		}
 
 		if (best_disk >= 0)
 			/* At least two disks to choose from so failfast is OK */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2] raid1: prefer disk without bad blocks
  2017-05-10 10:53 [PATCH][HSD116699] raid1: prefer disk without bad blocks Tomasz Majchrzak
@ 2017-05-10 13:16 ` Tomasz Majchrzak
  2017-05-10 19:57   ` Shaohua Li
  0 siblings, 1 reply; 5+ messages in thread
From: Tomasz Majchrzak @ 2017-05-10 13:16 UTC (permalink / raw)
  To: linux-raid; +Cc: shli, Tomasz Majchrzak

If an array consists of two drives and the first drive has the bad
block, the read request to the region overlapping the bad block chooses
the same disk (with bad block) as device to read from over and over and
the request gets stuck. If the first disk only partially overlaps with
bad block, it becomes a candidate ("best disk") for shorter range of
sectors. The second disk is capable of reading the entire requested
range and it is updated accordingly, however it is not recorded as a
best device for the request. In the end the request is sent to the first
disk to read entire range of sectors. It fails and is re-tried in a
moment but with the same outcome.

Actually it is quite likely scenario but it had little exposure in my
test until commit 715d40b93b10 ("md/raid1: add failfast handling for
reads.") removed preference for idle disk. Such scenario had been passing
as second disk was always chosen when idle.

Update a candidate ("best disk") to read from if disk can read entire
range. Do it only if other disk has already been chosen as a candidate
for a smaller range. Otherwise, don't update it - let the head position /
disk type logic to select a disk.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 drivers/md/raid1.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

v2:
  updated the title not to include internal defect number :)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index a1f3fbe..e7ab3bb 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -628,8 +628,11 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect
 					break;
 			}
 			continue;
-		} else
+		} else {
+			if ((sectors > best_good_sectors) && (best_disk >= 0))
+				best_disk = disk;
 			best_good_sectors = sectors;
+		}
 
 		if (best_disk >= 0)
 			/* At least two disks to choose from so failfast is OK */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] raid1: prefer disk without bad blocks
  2017-05-10 13:16 ` [PATCH v2] " Tomasz Majchrzak
@ 2017-05-10 19:57   ` Shaohua Li
  2017-05-12 12:23     ` Tomasz Majchrzak
  0 siblings, 1 reply; 5+ messages in thread
From: Shaohua Li @ 2017-05-10 19:57 UTC (permalink / raw)
  To: Tomasz Majchrzak; +Cc: linux-raid

On Wed, May 10, 2017 at 03:16:20PM +0200, Tomasz Majchrzak wrote:
> If an array consists of two drives and the first drive has the bad
> block, the read request to the region overlapping the bad block chooses
> the same disk (with bad block) as device to read from over and over and
> the request gets stuck. If the first disk only partially overlaps with
> bad block, it becomes a candidate ("best disk") for shorter range of
> sectors. The second disk is capable of reading the entire requested
> range and it is updated accordingly, however it is not recorded as a
> best device for the request. In the end the request is sent to the first
> disk to read entire range of sectors. It fails and is re-tried in a
> moment but with the same outcome.
> 
> Actually it is quite likely scenario but it had little exposure in my
> test until commit 715d40b93b10 ("md/raid1: add failfast handling for
> reads.") removed preference for idle disk. Such scenario had been passing
> as second disk was always chosen when idle.
> 
> Update a candidate ("best disk") to read from if disk can read entire
> range. Do it only if other disk has already been chosen as a candidate
> for a smaller range. Otherwise, don't update it - let the head position /
> disk type logic to select a disk.
> 
> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
> ---
>  drivers/md/raid1.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> v2:
>   updated the title not to include internal defect number :)
> 
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index a1f3fbe..e7ab3bb 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -628,8 +628,11 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect
>  					break;
>  			}
>  			continue;
> -		} else
> +		} else {
> +			if ((sectors > best_good_sectors) && (best_disk >= 0))
> +				best_disk = disk;

best_disk = -1 seems better here to me. If this is the only disk, we will
choose in below 'if (best_disk == -1)' check. If there are more disks which
cover the whole range, we will choose the best disk with minimum seek/pending.

Thanks,
Shaohua
>  			best_good_sectors = sectors;
> +		}
>  
>  		if (best_disk >= 0)
>  			/* At least two disks to choose from so failfast is OK */
> -- 
> 1.8.3.1
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] raid1: prefer disk without bad blocks
  2017-05-10 19:57   ` Shaohua Li
@ 2017-05-12 12:23     ` Tomasz Majchrzak
  2017-05-12 12:26       ` [PATCH v3] " Tomasz Majchrzak
  0 siblings, 1 reply; 5+ messages in thread
From: Tomasz Majchrzak @ 2017-05-12 12:23 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-raid

On Wed, May 10, 2017 at 12:57:01PM -0700, Shaohua Li wrote:
> > ---
> >  drivers/md/raid1.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > v2:
> >   updated the title not to include internal defect number :)
> > 
> > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> > index a1f3fbe..e7ab3bb 100644
> > --- a/drivers/md/raid1.c
> > +++ b/drivers/md/raid1.c
> > @@ -628,8 +628,11 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect
> >  					break;
> >  			}
> >  			continue;
> > -		} else
> > +		} else {
> > +			if ((sectors > best_good_sectors) && (best_disk >= 0))
> > +				best_disk = disk;
> 
> best_disk = -1 seems better here to me. If this is the only disk, we will
> choose in below 'if (best_disk == -1)' check. If there are more disks which
> cover the whole range, we will choose the best disk with minimum seek/pending.
>
Agreed, it's a neater way to fix it. I'll send the patch.

Tomek

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v3] raid1: prefer disk without bad blocks
  2017-05-12 12:23     ` Tomasz Majchrzak
@ 2017-05-12 12:26       ` Tomasz Majchrzak
  0 siblings, 0 replies; 5+ messages in thread
From: Tomasz Majchrzak @ 2017-05-12 12:26 UTC (permalink / raw)
  To: linux-raid; +Cc: shli, Tomasz Majchrzak

If an array consists of two drives and the first drive has the bad
block, the read request to the region overlapping the bad block chooses
the same disk (with bad block) as device to read from over and over and
the request gets stuck. If the first disk only partially overlaps with
bad block, it becomes a candidate ("best disk") for shorter range of
sectors. The second disk is capable of reading the entire requested
range and it is updated accordingly, however it is not recorded as a
best device for the request. In the end the request is sent to the first
disk to read entire range of sectors. It fails and is re-tried in a
moment but with the same outcome.

Actually it is quite likely scenario but it had little exposure in my
test until commit 715d40b93b10 ("md/raid1: add failfast handling for
reads.") removed preference for idle disk. Such scenario had been
passing as second disk was always chosen when idle.

Reset a candidate ("best disk") to read from if disk can read entire
range. Do it only if other disk has already been chosen as a candidate
for a smaller range. The head position / disk type logic will select
the best disk to read from - it is fine as disk with bad block won't be
considered for it.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 drivers/md/raid1.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 7c1f733..c5b32aa 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -666,8 +666,11 @@ static int read_balance(struct r1conf *conf, struct r1bio *r1_bio, int *max_sect
 					break;
 			}
 			continue;
-		} else
+		} else {
+			if ((sectors > best_good_sectors) && (best_disk >= 0))
+				best_disk = -1;
 			best_good_sectors = sectors;
+		}
 
 		if (best_disk >= 0)
 			/* At least two disks to choose from so failfast is OK */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2017-05-12 12:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-10 10:53 [PATCH][HSD116699] raid1: prefer disk without bad blocks Tomasz Majchrzak
2017-05-10 13:16 ` [PATCH v2] " Tomasz Majchrzak
2017-05-10 19:57   ` Shaohua Li
2017-05-12 12:23     ` Tomasz Majchrzak
2017-05-12 12:26       ` [PATCH v3] " Tomasz Majchrzak

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.