All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
@ 2021-08-16 23:55 Qu Wenruo
  2021-08-17  7:55 ` Nikolay Borisov
  2021-09-07 14:13 ` David Sterba
  0 siblings, 2 replies; 9+ messages in thread
From: Qu Wenruo @ 2021-08-16 23:55 UTC (permalink / raw)
  To: linux-btrfs

There is a BUG_ON() in btrfs_csum_one_bio() to catch code logic error.

It has indeed caught several bugs during subpage development.

But the BUG_ON() itself will bring down the whole system which is
sometimes overkilled.

Replace it with a WARN() and exit gracefully, so that it won't crash the
whole system while we can still catch the code logic error.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Changelog:
v2:
- Re-send as an independent patch
- Add WARN() to catch the code logic error
---
 fs/btrfs/file-item.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
index 2673c6ba7a4e..7f58d80a480f 100644
--- a/fs/btrfs/file-item.c
+++ b/fs/btrfs/file-item.c
@@ -665,7 +665,18 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio,
 
 		if (!ordered) {
 			ordered = btrfs_lookup_ordered_extent(inode, offset);
-			BUG_ON(!ordered); /* Logic error */
+			/*
+			 * The bio range is not covered by any ordered extent,
+			 * must be a code logic error.
+			 */
+			if (unlikely(!ordered)) {
+				WARN(1, KERN_WARNING
+		"no ordered extent for root %llu ino %llu offset %llu\n",
+				     inode->root->root_key.objectid,
+				     btrfs_ino(inode), offset);
+				kvfree(sums);
+				return BLK_STS_IOERR;
+			}
 		}
 
 		nr_sectors = BTRFS_BYTES_TO_BLKS(fs_info,
-- 
2.32.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
  2021-08-16 23:55 [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling Qu Wenruo
@ 2021-08-17  7:55 ` Nikolay Borisov
  2021-08-17  8:03   ` Nikolay Borisov
  2021-08-17  8:10   ` Qu Wenruo
  2021-09-07 14:13 ` David Sterba
  1 sibling, 2 replies; 9+ messages in thread
From: Nikolay Borisov @ 2021-08-17  7:55 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 17.08.21 г. 2:55, Qu Wenruo wrote:
> There is a BUG_ON() in btrfs_csum_one_bio() to catch code logic error.
> 
> It has indeed caught several bugs during subpage development.
> 
> But the BUG_ON() itself will bring down the whole system which is
> sometimes overkilled.
> 
> Replace it with a WARN() and exit gracefully, so that it won't crash the
> whole system while we can still catch the code logic error.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Re-send as an independent patch
> - Add WARN() to catch the code logic error
> ---
>  fs/btrfs/file-item.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
> index 2673c6ba7a4e..7f58d80a480f 100644
> --- a/fs/btrfs/file-item.c
> +++ b/fs/btrfs/file-item.c
> @@ -665,7 +665,18 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio,
>  
>  		if (!ordered) {
>  			ordered = btrfs_lookup_ordered_extent(inode, offset);
> -			BUG_ON(!ordered); /* Logic error */
> +			/*
> +			 * The bio range is not covered by any ordered extent,
> +			 * must be a code logic error.
> +			 */
> +			if (unlikely(!ordered)) {
> +				WARN(1, KERN_WARNING
> +		"no ordered extent for root %llu ino %llu offset %llu\n",
> +				     inode->root->root_key.objectid,
> +				     btrfs_ino(inode), offset);
> +				kvfree(sums);
> +				return BLK_STS_IOERR;
> +			}

nit: How about :

if (WARN_ON(!ordered)  {
btrfs_err(foo)
}

That way you get the unlikely(!ordered) 'for free' and the code is
somewhat cleaner IMO.

While at it I also have to say the structure of the inner loop is rather
iffy, because it's really if/else with an implicit 'else'. How about
converting it to https://paste.ubuntu.com/p/kyWsRrkzWq/

>  		}
>  
>  		nr_sectors = BTRFS_BYTES_TO_BLKS(fs_info,
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
  2021-08-17  7:55 ` Nikolay Borisov
@ 2021-08-17  8:03   ` Nikolay Borisov
  2021-08-17  8:10   ` Qu Wenruo
  1 sibling, 0 replies; 9+ messages in thread
From: Nikolay Borisov @ 2021-08-17  8:03 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs



On 17.08.21 г. 10:55, Nikolay Borisov wrote:
> 
> 
> On 17.08.21 г. 2:55, Qu Wenruo wrote:
>> There is a BUG_ON() in btrfs_csum_one_bio() to catch code logic error.
>>
>> It has indeed caught several bugs during subpage development.
>>
>> But the BUG_ON() itself will bring down the whole system which is
>> sometimes overkilled.
>>
>> Replace it with a WARN() and exit gracefully, so that it won't crash the
>> whole system while we can still catch the code logic error.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> Changelog:
>> v2:
>> - Re-send as an independent patch
>> - Add WARN() to catch the code logic error
>> ---
>>  fs/btrfs/file-item.c | 13 ++++++++++++-
>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
>> index 2673c6ba7a4e..7f58d80a480f 100644
>> --- a/fs/btrfs/file-item.c
>> +++ b/fs/btrfs/file-item.c
>> @@ -665,7 +665,18 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio,
>>  
>>  		if (!ordered) {
>>  			ordered = btrfs_lookup_ordered_extent(inode, offset);
>> -			BUG_ON(!ordered); /* Logic error */
>> +			/*
>> +			 * The bio range is not covered by any ordered extent,
>> +			 * must be a code logic error.
>> +			 */
>> +			if (unlikely(!ordered)) {
>> +				WARN(1, KERN_WARNING
>> +		"no ordered extent for root %llu ino %llu offset %llu\n",
>> +				     inode->root->root_key.objectid,
>> +				     btrfs_ino(inode), offset);
>> +				kvfree(sums);
>> +				return BLK_STS_IOERR;
>> +			}
> 
> nit: How about :
> 
> if (WARN_ON(!ordered)  {
> btrfs_err(foo)
> }
> 
> That way you get the unlikely(!ordered) 'for free' and the code is
> somewhat cleaner IMO.
> 
> While at it I also have to say the structure of the inner loop is rather
> iffy, because it's really if/else with an implicit 'else'. How about
> converting it to https://paste.ubuntu.com/p/kyWsRrkzWq/

This actually won't work since we need the in_range code to be executed
after we've done the adjustments to variables in the 'else' part...

> 
>>  		}
>>  
>>  		nr_sectors = BTRFS_BYTES_TO_BLKS(fs_info,
>>
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
  2021-08-17  7:55 ` Nikolay Borisov
  2021-08-17  8:03   ` Nikolay Borisov
@ 2021-08-17  8:10   ` Qu Wenruo
  2021-08-18 23:00     ` David Sterba
  1 sibling, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2021-08-17  8:10 UTC (permalink / raw)
  To: Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2021/8/17 下午3:55, Nikolay Borisov wrote:
>
>
> On 17.08.21 г. 2:55, Qu Wenruo wrote:
>> There is a BUG_ON() in btrfs_csum_one_bio() to catch code logic error.
>>
>> It has indeed caught several bugs during subpage development.
>>
>> But the BUG_ON() itself will bring down the whole system which is
>> sometimes overkilled.
>>
>> Replace it with a WARN() and exit gracefully, so that it won't crash the
>> whole system while we can still catch the code logic error.
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> Changelog:
>> v2:
>> - Re-send as an independent patch
>> - Add WARN() to catch the code logic error
>> ---
>>   fs/btrfs/file-item.c | 13 ++++++++++++-
>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
>> index 2673c6ba7a4e..7f58d80a480f 100644
>> --- a/fs/btrfs/file-item.c
>> +++ b/fs/btrfs/file-item.c
>> @@ -665,7 +665,18 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio,
>>
>>   		if (!ordered) {
>>   			ordered = btrfs_lookup_ordered_extent(inode, offset);
>> -			BUG_ON(!ordered); /* Logic error */
>> +			/*
>> +			 * The bio range is not covered by any ordered extent,
>> +			 * must be a code logic error.
>> +			 */
>> +			if (unlikely(!ordered)) {
>> +				WARN(1, KERN_WARNING
>> +		"no ordered extent for root %llu ino %llu offset %llu\n",
>> +				     inode->root->root_key.objectid,
>> +				     btrfs_ino(inode), offset);
>> +				kvfree(sums);
>> +				return BLK_STS_IOERR;
>> +			}
>
> nit: How about :
>
> if (WARN_ON(!ordered)  {

I still remember that if (WARN_ON()) usage is not recommended by David.

Is that still the case?

> btrfs_err(foo)
> }
>
> That way you get the unlikely(!ordered) 'for free' and the code is
> somewhat cleaner IMO.
>
> While at it I also have to say the structure of the inner loop is rather
> iffy, because it's really if/else with an implicit 'else'. How about
> converting it to https://paste.ubuntu.com/p/kyWsRrkzWq/

I have no obvious preference between the existing one and the new one.

But since you mentioned it, I would prefer a third way, exactly the code
of the existing if () branch into a function, like
switch_ordered_extent(), and add a comment before the if () line.

To me, that would look easier to read.

Thanks,
Qu
>
>>   		}
>>
>>   		nr_sectors = BTRFS_BYTES_TO_BLKS(fs_info,
>>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
  2021-08-17  8:10   ` Qu Wenruo
@ 2021-08-18 23:00     ` David Sterba
  2021-08-19  1:08       ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: David Sterba @ 2021-08-18 23:00 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: Nikolay Borisov, Qu Wenruo, linux-btrfs

On Tue, Aug 17, 2021 at 04:10:43PM +0800, Qu Wenruo wrote:
> On 2021/8/17 下午3:55, Nikolay Borisov wrote:
> > On 17.08.21 г. 2:55, Qu Wenruo wrote:
> >> @@ -665,7 +665,18 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio,
> >>
> >>   		if (!ordered) {
> >>   			ordered = btrfs_lookup_ordered_extent(inode, offset);
> >> -			BUG_ON(!ordered); /* Logic error */
> >> +			/*
> >> +			 * The bio range is not covered by any ordered extent,
> >> +			 * must be a code logic error.
> >> +			 */
> >> +			if (unlikely(!ordered)) {
> >> +				WARN(1, KERN_WARNING
> >> +		"no ordered extent for root %llu ino %llu offset %llu\n",
> >> +				     inode->root->root_key.objectid,
> >> +				     btrfs_ino(inode), offset);
> >> +				kvfree(sums);
> >> +				return BLK_STS_IOERR;
> >> +			}
> >
> > nit: How about :
> >
> > if (WARN_ON(!ordered)  {
> 
> I still remember that if (WARN_ON()) usage is not recommended by David.
> 
> Is that still the case?

Quick grep shows there are many if (WARN_ON(...)) so as long as it's a
simple "if (WARN_ON(condition))" and the code is readable I won't
object.

The problematic one is "if (!WARN_ON(condition))", because it warns when
condition is true, but the if does not continue and that breaks the
reading flow. The acceptable pattern read like "if condition and warn
eventually and continue".

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
  2021-08-18 23:00     ` David Sterba
@ 2021-08-19  1:08       ` Qu Wenruo
  2021-08-19 12:23         ` David Sterba
  0 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2021-08-19  1:08 UTC (permalink / raw)
  To: dsterba, Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2021/8/19 上午7:00, David Sterba wrote:
> On Tue, Aug 17, 2021 at 04:10:43PM +0800, Qu Wenruo wrote:
>> On 2021/8/17 下午3:55, Nikolay Borisov wrote:
>>> On 17.08.21 г. 2:55, Qu Wenruo wrote:
>>>> @@ -665,7 +665,18 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio,
>>>>
>>>>    		if (!ordered) {
>>>>    			ordered = btrfs_lookup_ordered_extent(inode, offset);
>>>> -			BUG_ON(!ordered); /* Logic error */
>>>> +			/*
>>>> +			 * The bio range is not covered by any ordered extent,
>>>> +			 * must be a code logic error.
>>>> +			 */
>>>> +			if (unlikely(!ordered)) {
>>>> +				WARN(1, KERN_WARNING
>>>> +		"no ordered extent for root %llu ino %llu offset %llu\n",
>>>> +				     inode->root->root_key.objectid,
>>>> +				     btrfs_ino(inode), offset);
>>>> +				kvfree(sums);
>>>> +				return BLK_STS_IOERR;
>>>> +			}
>>>
>>> nit: How about :
>>>
>>> if (WARN_ON(!ordered)  {
>>
>> I still remember that if (WARN_ON()) usage is not recommended by David.
>>
>> Is that still the case?
>
> Quick grep shows there are many if (WARN_ON(...)) so as long as it's a
> simple "if (WARN_ON(condition))" and the code is readable I won't
> object.
>
> The problematic one is "if (!WARN_ON(condition))", because it warns when
> condition is true, but the if does not continue and that breaks the
> reading flow. The acceptable pattern read like "if condition and warn
> eventually and continue".
>

I want to play safe by never bothering the WARN_ON() in if () condition
anymore.

The fact that some if (WARN_ON()) is acceptable and some is not is
already worrying.

Can we just stick to no-WARN_ON-in-if policy?

Thanks,
Qu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
  2021-08-19  1:08       ` Qu Wenruo
@ 2021-08-19 12:23         ` David Sterba
  2021-08-19 12:34           ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: David Sterba @ 2021-08-19 12:23 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: dsterba, Nikolay Borisov, Qu Wenruo, linux-btrfs

On Thu, Aug 19, 2021 at 09:08:00AM +0800, Qu Wenruo wrote:
> I want to play safe by never bothering the WARN_ON() in if () condition
> anymore.
> 
> The fact that some if (WARN_ON()) is acceptable and some is not is
> already worrying.
> 
> Can we just stick to no-WARN_ON-in-if policy?

So what would be the reocommended way to do that? Also with the obvious
question "is the warning needed at all?"

	if (condition) {
		WARN_ON(1);
		...
	}

This will hide the condition in the warning report, the value is usually
in RAX and sometimes it helps to analyze what happend. We'd have to
either duplicate it like

	if (condition) {
		WARN_ON(condition);
		...
	}

or use a temporary variable.

Another question is what to do with current if+WARN calls. Lots of them
are there without a comment. Ideally if there's a need for a warning
even on a production build, there should be a message also printed.

Lots of them seem to be an assert in disguise (like in
extract_ordered_extent) or are called before returning EUCLEAN. We've
talked about this a few times before, we'd need more fine grained
warnings/assertions or have them better documented. There are 300+ of
them so that's a lot for a single pass audit, but at least some of them
share a common pattern and can be unified.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
  2021-08-19 12:23         ` David Sterba
@ 2021-08-19 12:34           ` Qu Wenruo
  0 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2021-08-19 12:34 UTC (permalink / raw)
  To: dsterba, Nikolay Borisov, Qu Wenruo, linux-btrfs



On 2021/8/19 下午8:23, David Sterba wrote:
> On Thu, Aug 19, 2021 at 09:08:00AM +0800, Qu Wenruo wrote:
>> I want to play safe by never bothering the WARN_ON() in if () condition
>> anymore.
>>
>> The fact that some if (WARN_ON()) is acceptable and some is not is
>> already worrying.
>>
>> Can we just stick to no-WARN_ON-in-if policy?
>
> So what would be the reocommended way to do that? Also with the obvious
> question "is the warning needed at all?"

IMHO we should use WARN() other than WARN_ON().
The WARN_ON(), even with proper condition, is not providing enough info.

We want the warning mostly for fstests to catch the error, so that we
know it's something wrong.
>
> 	if (condition) {
> 		WARN_ON(1);
> 		...
> 	}
>
> This will hide the condition in the warning report, the value is usually
> in RAX and sometimes it helps to analyze what happend. We'd have to
> either duplicate it like

The condition itself is not really that helpful, compared to proper
warning message.

In fact, the line number is more useful than the condition itself.

>
> 	if (condition) {
> 		WARN_ON(condition);
> 		...
> 	}
>
> or use a temporary variable.
>
> Another question is what to do with current if+WARN calls. Lots of them
> are there without a comment. Ideally if there's a need for a warning
> even on a production build, there should be a message also printed.

We can add comments on them when we're going to refactor the code.

But historic problems shouldn't be a blockage for us to update our
current code guideline.

>
> Lots of them seem to be an assert in disguise (like in
> extract_ordered_extent) or are called before returning EUCLEAN. We've
> talked about this a few times before, we'd need more fine grained
> warnings/assertions or have them better documented. There are 300+ of
> them so that's a lot for a single pass audit, but at least some of them
> share a common pattern and can be unified.
>

This already shows that we need a proper guideline for how to do proper
warning.

Thanks,
Qu

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling
  2021-08-16 23:55 [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling Qu Wenruo
  2021-08-17  7:55 ` Nikolay Borisov
@ 2021-09-07 14:13 ` David Sterba
  1 sibling, 0 replies; 9+ messages in thread
From: David Sterba @ 2021-09-07 14:13 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Tue, Aug 17, 2021 at 07:55:40AM +0800, Qu Wenruo wrote:
> There is a BUG_ON() in btrfs_csum_one_bio() to catch code logic error.
> 
> It has indeed caught several bugs during subpage development.
> 
> But the BUG_ON() itself will bring down the whole system which is
> sometimes overkilled.
> 
> Replace it with a WARN() and exit gracefully, so that it won't crash the
> whole system while we can still catch the code logic error.
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> Changelog:
> v2:
> - Re-send as an independent patch
> - Add WARN() to catch the code logic error

Added to misc-next, without changes. The way with separate condition and
WARN is ok for now.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-09-07 14:13 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-16 23:55 [PATCH v2] btrfs: replace BUG_ON() in btrfs_csum_one_bio() with proper error handling Qu Wenruo
2021-08-17  7:55 ` Nikolay Borisov
2021-08-17  8:03   ` Nikolay Borisov
2021-08-17  8:10   ` Qu Wenruo
2021-08-18 23:00     ` David Sterba
2021-08-19  1:08       ` Qu Wenruo
2021-08-19 12:23         ` David Sterba
2021-08-19 12:34           ` Qu Wenruo
2021-09-07 14:13 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.