linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] badblocks: fix overlapping check for clearing
@ 2016-09-06  8:58 Tomasz Majchrzak
  2016-10-07  4:50 ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Tomasz Majchrzak @ 2016-09-06  8:58 UTC (permalink / raw)
  To: linux-block; +Cc: neilb

Current bad block clear implementation assumes the range to clear
overlaps with at least one bad block already stored. If given range to
clear precedes first bad block in a list, the first entry is incorrectly
updated.

Check not only if stored block end is past clear block end but also if
stored block start is before clear block end.

Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
---
 block/badblocks.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/badblocks.c b/block/badblocks.c
index 7be53cb..b2ffcc7 100644
--- a/block/badblocks.c
+++ b/block/badblocks.c
@@ -354,7 +354,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
 		 * current range.  Earlier ranges could also overlap,
 		 * but only this one can overlap the end of the range.
 		 */
-		if (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) {
+		if ((BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) &&
+		    (BB_OFFSET(p[lo]) <= target)) {
 			/* Partial overlap, leave the tail of this range */
 			int ack = BB_ACK(p[lo]);
 			sector_t a = BB_OFFSET(p[lo]);
@@ -377,7 +378,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
 			lo--;
 		}
 		while (lo >= 0 &&
-		       BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) {
+		       (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) &&
+		       (BB_OFFSET(p[lo]) <= target)) {
 			/* This range does overlap */
 			if (BB_OFFSET(p[lo]) < s) {
 				/* Keep the early parts of this range. */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] badblocks: fix overlapping check for clearing
  2016-09-06  8:58 [PATCH] badblocks: fix overlapping check for clearing Tomasz Majchrzak
@ 2016-10-07  4:50 ` NeilBrown
  2016-10-10 22:32   ` Dan Williams
  0 siblings, 1 reply; 5+ messages in thread
From: NeilBrown @ 2016-10-07  4:50 UTC (permalink / raw)
  To: Tomasz Majchrzak, linux-block; +Cc: Dan Williams

[-- Attachment #1: Type: text/plain, Size: 2599 bytes --]

On Tue, Sep 06 2016, Tomasz Majchrzak wrote:

> Current bad block clear implementation assumes the range to clear
> overlaps with at least one bad block already stored. If given range to
> clear precedes first bad block in a list, the first entry is incorrectly
> updated.

In the original md context, it would only ever be called on a block that
was already in the list.
But you are right that it is best not to assume this, and to code more
safely.



>
> Check not only if stored block end is past clear block end but also if
> stored block start is before clear block end.
>
> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>

Dan Williams seems to have taken responsibility for this code through
his nvdimm tree, so I've added him to 'cc' in the hope that he looks at
this (I wonder if he is even on linux-block ....)


> ---
>  block/badblocks.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/block/badblocks.c b/block/badblocks.c
> index 7be53cb..b2ffcc7 100644
> --- a/block/badblocks.c
> +++ b/block/badblocks.c
> @@ -354,7 +354,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
>  		 * current range.  Earlier ranges could also overlap,
>  		 * but only this one can overlap the end of the range.
>  		 */
> -		if (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) {
> +		if ((BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) &&
> +		    (BB_OFFSET(p[lo]) <= target)) {

hmmm..
'target' is the sector just beyond the set of sectors to remove from the
list.
BB_OFFSET(p[lo]) is the first sector in a range that was found in the
list.
If these are equal, then are aren't clearing anything in this range.
So I would have '<', not '<='.

I don't think this makes the code wrong as we end up assigning to p[lo]
the value that is already there.  But it might be confusing.


>  			/* Partial overlap, leave the tail of this range */
>  			int ack = BB_ACK(p[lo]);
>  			sector_t a = BB_OFFSET(p[lo]);
> @@ -377,7 +378,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
>  			lo--;
>  		}
>  		while (lo >= 0 &&
> -		       BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) {
> +		       (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) &&
> +		       (BB_OFFSET(p[lo]) <= target)) {

Ditto.

But the code is, I think, correct. Just not how I would have written it.
So

 Acked-by: NeilBrown <neilb@suse.com>

Thanks,
NeilBrown


>  			/* This range does overlap */
>  			if (BB_OFFSET(p[lo]) < s) {
>  				/* Keep the early parts of this range. */
> -- 
> 1.8.3.1

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] badblocks: fix overlapping check for clearing
  2016-10-07  4:50 ` NeilBrown
@ 2016-10-10 22:32   ` Dan Williams
  2016-10-12 10:26     ` Tomasz Majchrzak
  0 siblings, 1 reply; 5+ messages in thread
From: Dan Williams @ 2016-10-10 22:32 UTC (permalink / raw)
  To: NeilBrown; +Cc: Tomasz Majchrzak, linux-block

On Thu, Oct 6, 2016 at 9:50 PM, NeilBrown <neilb@suse.com> wrote:
> On Tue, Sep 06 2016, Tomasz Majchrzak wrote:
>
>> Current bad block clear implementation assumes the range to clear
>> overlaps with at least one bad block already stored. If given range to
>> clear precedes first bad block in a list, the first entry is incorrectly
>> updated.
>
> In the original md context, it would only ever be called on a block that
> was already in the list.
> But you are right that it is best not to assume this, and to code more
> safely.
>
>
>
>>
>> Check not only if stored block end is past clear block end but also if
>> stored block start is before clear block end.
>>
>> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
>
> Dan Williams seems to have taken responsibility for this code through
> his nvdimm tree, so I've added him to 'cc' in the hope that he looks at
> this (I wonder if he is even on linux-block ....)

I am, but I missed this, thanks for the cc.

>
>
>> ---
>>  block/badblocks.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/block/badblocks.c b/block/badblocks.c
>> index 7be53cb..b2ffcc7 100644
>> --- a/block/badblocks.c
>> +++ b/block/badblocks.c
>> @@ -354,7 +354,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
>>                * current range.  Earlier ranges could also overlap,
>>                * but only this one can overlap the end of the range.
>>                */
>> -             if (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) {
>> +             if ((BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) &&
>> +                 (BB_OFFSET(p[lo]) <= target)) {
>
> hmmm..
> 'target' is the sector just beyond the set of sectors to remove from the
> list.
> BB_OFFSET(p[lo]) is the first sector in a range that was found in the
> list.
> If these are equal, then are aren't clearing anything in this range.
> So I would have '<', not '<='.
>
> I don't think this makes the code wrong as we end up assigning to p[lo]
> the value that is already there.  But it might be confusing.
>
>
>>                       /* Partial overlap, leave the tail of this range */
>>                       int ack = BB_ACK(p[lo]);
>>                       sector_t a = BB_OFFSET(p[lo]);
>> @@ -377,7 +378,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
>>                       lo--;
>>               }
>>               while (lo >= 0 &&
>> -                    BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) {
>> +                    (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) &&
>> +                    (BB_OFFSET(p[lo]) <= target)) {
>
> Ditto.
>
> But the code is, I think, correct. Just not how I would have written it.
> So
>
>  Acked-by: NeilBrown <neilb@suse.com>

I agree with the comments to change "<=" to "<".  Tomasz, care to
re-send with those changes?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] badblocks: fix overlapping check for clearing
  2016-10-10 22:32   ` Dan Williams
@ 2016-10-12 10:26     ` Tomasz Majchrzak
  2016-10-19  2:36       ` NeilBrown
  0 siblings, 1 reply; 5+ messages in thread
From: Tomasz Majchrzak @ 2016-10-12 10:26 UTC (permalink / raw)
  To: Dan Williams, linux-block; +Cc: NeilBrown, linux-raid

On Mon, Oct 10, 2016 at 03:32:58PM -0700, Dan Williams wrote:
> > On Tue, Sep 06 2016, Tomasz Majchrzak wrote:
> >> ---
> >>  block/badblocks.c | 6 ++++--
> >>  1 file changed, 4 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/block/badblocks.c b/block/badblocks.c
> >> index 7be53cb..b2ffcc7 100644
> >> --- a/block/badblocks.c
> >> +++ b/block/badblocks.c
> >> @@ -354,7 +354,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
> >>                * current range.  Earlier ranges could also overlap,
> >>                * but only this one can overlap the end of the range.
> >>                */
> >> -             if (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) {
> >> +             if ((BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) &&
> >> +                 (BB_OFFSET(p[lo]) <= target)) {
> >
> > hmmm..
> > 'target' is the sector just beyond the set of sectors to remove from the
> > list.
> > BB_OFFSET(p[lo]) is the first sector in a range that was found in the
> > list.
> > If these are equal, then are aren't clearing anything in this range.
> > So I would have '<', not '<='.
> >
> > I don't think this makes the code wrong as we end up assigning to p[lo]
> > the value that is already there.  But it might be confusing.
> >
> >
> >>                       /* Partial overlap, leave the tail of this range */
> >>                       int ack = BB_ACK(p[lo]);
> >>                       sector_t a = BB_OFFSET(p[lo]);
> >> @@ -377,7 +378,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
> >>                       lo--;
> >>               }
> >>               while (lo >= 0 &&
> >> -                    BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) {
> >> +                    (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) &&
> >> +                    (BB_OFFSET(p[lo]) <= target)) {
> >
> > Ditto.
> >
> > But the code is, I think, correct. Just not how I would have written it.
> > So
> >
> >  Acked-by: NeilBrown <neilb@suse.com>
> 
> I agree with the comments to change "<=" to "<".  Tomasz, care to
> re-send with those changes?

I have just resent the patch with your suggestions included.

> > In the original md context, it would only ever be called on a block that
> > was already in the list.

Actually MD RAID10 calls it this way. See handle_write_completed, it iterates
over all copies and clears the bad block if error has not been returned. I have
a test case which fails for that reason - existing bad block is modified by
clear block. It is very unlikely to happen in real life as it depends on
specific layout of bad blocks and their discovery order, however it's a gap that
needs to be closed.

I had put some effort to see if clearing of non-existing bad block in RAID10 can
lead to some incorrect behaviour but I haven't found any. It seems that my patch
is sufficient to fix the problem.

Tomek

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] badblocks: fix overlapping check for clearing
  2016-10-12 10:26     ` Tomasz Majchrzak
@ 2016-10-19  2:36       ` NeilBrown
  0 siblings, 0 replies; 5+ messages in thread
From: NeilBrown @ 2016-10-19  2:36 UTC (permalink / raw)
  To: Tomasz Majchrzak, Dan Williams, linux-block; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 3178 bytes --]

On Wed, Oct 12 2016, Tomasz Majchrzak wrote:

> On Mon, Oct 10, 2016 at 03:32:58PM -0700, Dan Williams wrote:
>> > On Tue, Sep 06 2016, Tomasz Majchrzak wrote:
>> >> ---
>> >>  block/badblocks.c | 6 ++++--
>> >>  1 file changed, 4 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/block/badblocks.c b/block/badblocks.c
>> >> index 7be53cb..b2ffcc7 100644
>> >> --- a/block/badblocks.c
>> >> +++ b/block/badblocks.c
>> >> @@ -354,7 +354,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
>> >>                * current range.  Earlier ranges could also overlap,
>> >>                * but only this one can overlap the end of the range.
>> >>                */
>> >> -             if (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) {
>> >> +             if ((BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > target) &&
>> >> +                 (BB_OFFSET(p[lo]) <= target)) {
>> >
>> > hmmm..
>> > 'target' is the sector just beyond the set of sectors to remove from the
>> > list.
>> > BB_OFFSET(p[lo]) is the first sector in a range that was found in the
>> > list.
>> > If these are equal, then are aren't clearing anything in this range.
>> > So I would have '<', not '<='.
>> >
>> > I don't think this makes the code wrong as we end up assigning to p[lo]
>> > the value that is already there.  But it might be confusing.
>> >
>> >
>> >>                       /* Partial overlap, leave the tail of this range */
>> >>                       int ack = BB_ACK(p[lo]);
>> >>                       sector_t a = BB_OFFSET(p[lo]);
>> >> @@ -377,7 +378,8 @@ int badblocks_clear(struct badblocks *bb, sector_t s, int sectors)
>> >>                       lo--;
>> >>               }
>> >>               while (lo >= 0 &&
>> >> -                    BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) {
>> >> +                    (BB_OFFSET(p[lo]) + BB_LEN(p[lo]) > s) &&
>> >> +                    (BB_OFFSET(p[lo]) <= target)) {
>> >
>> > Ditto.
>> >
>> > But the code is, I think, correct. Just not how I would have written it.
>> > So
>> >
>> >  Acked-by: NeilBrown <neilb@suse.com>
>> 
>> I agree with the comments to change "<=" to "<".  Tomasz, care to
>> re-send with those changes?
>
> I have just resent the patch with your suggestions included.
>
>> > In the original md context, it would only ever be called on a block that
>> > was already in the list.
>
> Actually MD RAID10 calls it this way. See handle_write_completed, it iterates
> over all copies and clears the bad block if error has not been returned. I have
> a test case which fails for that reason - existing bad block is modified by
> clear block. It is very unlikely to happen in real life as it depends on
> specific layout of bad blocks and their discovery order, however it's a gap that
> needs to be closed.

Ahh, I didn't realize that.  I see that you are correct though.

>
> I had put some effort to see if clearing of non-existing bad block in RAID10 can
> lead to some incorrect behaviour but I haven't found any. It seems that my patch
> is sufficient to fix the problem.

Yes.  Thanks for a lot for sorting this out :-)

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2016-10-19  2:36 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-06  8:58 [PATCH] badblocks: fix overlapping check for clearing Tomasz Majchrzak
2016-10-07  4:50 ` NeilBrown
2016-10-10 22:32   ` Dan Williams
2016-10-12 10:26     ` Tomasz Majchrzak
2016-10-19  2:36       ` NeilBrown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).