linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] btrfs: remove BUG_ON used as assertions
@ 2019-12-15 17:12 Aditya Pakki
  2019-12-15 17:42 ` Nikolay Borisov
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Aditya Pakki @ 2019-12-15 17:12 UTC (permalink / raw)
  To: pakki001
  Cc: kjlu, Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

alloc_extent_state_atomic() allocates extents via GFP_ATOMIC flag
and cannot fail. There are multiple invocations of BUG_ON on the
return value to check for failure. The patch replaces certain
invocations of BUG_ON by returning the error upstream.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
---
 fs/btrfs/extent_io.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index eb8bd0258360..e72e5a333e71 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -989,7 +989,10 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 	node = tree_search_for_insert(tree, start, &p, &parent);
 	if (!node) {
 		prealloc = alloc_extent_state_atomic(prealloc);
-		BUG_ON(!prealloc);
+		if (!prealloc) {
+			err = -ENOMEM;
+			goto out;
+		}
 		err = insert_state(tree, prealloc, start, end,
 				   &p, &parent, &bits, changeset);
 		if (err)
@@ -1054,7 +1057,10 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 		}
 
 		prealloc = alloc_extent_state_atomic(prealloc);
-		BUG_ON(!prealloc);
+		if (!prealloc) {
+			err = -ENOMEM;
+			goto out;
+		}
 		err = split_state(tree, state, prealloc, start);
 		if (err)
 			extent_io_tree_panic(tree, err);
@@ -1091,7 +1097,10 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 			this_end = last_start - 1;
 
 		prealloc = alloc_extent_state_atomic(prealloc);
-		BUG_ON(!prealloc);
+		if (!prealloc) {
+			err = -ENOMEM;
+			goto out;
+		}
 
 		/*
 		 * Avoid to free 'prealloc' if it can be merged with
@@ -1121,7 +1130,10 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 		}
 
 		prealloc = alloc_extent_state_atomic(prealloc);
-		BUG_ON(!prealloc);
+		if (!prealloc) {
+			err = -ENOMEM;
+			goto out;
+		}
 		err = split_state(tree, state, prealloc, end + 1);
 		if (err)
 			extent_io_tree_panic(tree, err);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: remove BUG_ON used as assertions
  2019-12-15 17:12 [PATCH] btrfs: remove BUG_ON used as assertions Aditya Pakki
@ 2019-12-15 17:42 ` Nikolay Borisov
  2019-12-18 16:31 ` David Sterba
  2019-12-18 16:38 ` Josef Bacik
  2 siblings, 0 replies; 6+ messages in thread
From: Nikolay Borisov @ 2019-12-15 17:42 UTC (permalink / raw)
  To: Aditya Pakki
  Cc: kjlu, Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel



On 15.12.19 г. 19:12 ч., Aditya Pakki wrote:
> alloc_extent_state_atomic() allocates extents via GFP_ATOMIC flag
> and cannot fail. There are multiple invocations of BUG_ON on the
> return value to check for failure. The patch replaces certain
> invocations of BUG_ON by returning the error upstream.
> 
> Signed-off-by: Aditya Pakki <pakki001@umn.edu>

Have you actually audited all callers of __set_extent_bit whether they
correctly handle failures?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: remove BUG_ON used as assertions
  2019-12-15 17:12 [PATCH] btrfs: remove BUG_ON used as assertions Aditya Pakki
  2019-12-15 17:42 ` Nikolay Borisov
@ 2019-12-18 16:31 ` David Sterba
  2019-12-18 16:38 ` Josef Bacik
  2 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2019-12-18 16:31 UTC (permalink / raw)
  To: Aditya Pakki
  Cc: kjlu, Chris Mason, Josef Bacik, David Sterba, linux-btrfs, linux-kernel

On Sun, Dec 15, 2019 at 11:12:36AM -0600, Aditya Pakki wrote:
> alloc_extent_state_atomic() allocates extents via GFP_ATOMIC flag
> and cannot fail.

Here you say it cannot fail (without a proof)

> There are multiple invocations of BUG_ON on the
> return value to check for failure. The patch replaces certain
> invocations of BUG_ON by returning the error upstream.
> 
> Signed-off-by: Aditya Pakki <pakki001@umn.edu>
> ---
>  fs/btrfs/extent_io.c | 20 ++++++++++++++++----
>  1 file changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
> index eb8bd0258360..e72e5a333e71 100644
> --- a/fs/btrfs/extent_io.c
> +++ b/fs/btrfs/extent_io.c
> @@ -989,7 +989,10 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
>  	node = tree_search_for_insert(tree, start, &p, &parent);
>  	if (!node) {
>  		prealloc = alloc_extent_state_atomic(prealloc);
> -		BUG_ON(!prealloc);
> +		if (!prealloc) {
> +			err = -ENOMEM;
> +			goto out;

And add error handling when it fails, which is in contradiction but
anyway, replacing BUG_ON with error checks requires auditing the whole
call chain up. We've had attempts to do this change already, none of them
was correct so unfortunatelly the BUG_ON needs to stay to avoid errors
being silently ignored.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: remove BUG_ON used as assertions
  2019-12-15 17:12 [PATCH] btrfs: remove BUG_ON used as assertions Aditya Pakki
  2019-12-15 17:42 ` Nikolay Borisov
  2019-12-18 16:31 ` David Sterba
@ 2019-12-18 16:38 ` Josef Bacik
  2019-12-18 16:47   ` David Sterba
  2 siblings, 1 reply; 6+ messages in thread
From: Josef Bacik @ 2019-12-18 16:38 UTC (permalink / raw)
  To: Aditya Pakki; +Cc: kjlu, Chris Mason, David Sterba, linux-btrfs, linux-kernel

On 12/15/19 12:12 PM, Aditya Pakki wrote:
> alloc_extent_state_atomic() allocates extents via GFP_ATOMIC flag
> and cannot fail. There are multiple invocations of BUG_ON on the
> return value to check for failure. The patch replaces certain
> invocations of BUG_ON by returning the error upstream.
> 
> Signed-off-by: Aditya Pakki <pakki001@umn.edu>

I already tried this a few months ago and gave up.  There are a few things if 
you want to tackle something like this

1) use bpf's error injection thing to make sure you handle every path that can 
error out.  This is the script I wrote to do just that

https://github.com/josefbacik/debug-scripts/blob/master/error-injection-stress.py

2) We actually can't fail here.  We would need to go back and make _all_ callers 
of lock_extent_bits() handle the allocation error.  This is theoretically 
possible, but a giant pain in the ass.  In general we can make allocations here 
and we need to be able to make them.

3) We should probably mark this path with __GFP_NOFAIL because again, this is 
locking and we need locking to succeed.

Thanks,

Josef

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: remove BUG_ON used as assertions
  2019-12-18 16:38 ` Josef Bacik
@ 2019-12-18 16:47   ` David Sterba
  2019-12-18 19:54     ` Josef Bacik
  0 siblings, 1 reply; 6+ messages in thread
From: David Sterba @ 2019-12-18 16:47 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Aditya Pakki, kjlu, Chris Mason, David Sterba, linux-btrfs, linux-kernel

On Wed, Dec 18, 2019 at 11:38:18AM -0500, Josef Bacik wrote:
> On 12/15/19 12:12 PM, Aditya Pakki wrote:
> > alloc_extent_state_atomic() allocates extents via GFP_ATOMIC flag
> > and cannot fail. There are multiple invocations of BUG_ON on the
> > return value to check for failure. The patch replaces certain
> > invocations of BUG_ON by returning the error upstream.
> > 
> > Signed-off-by: Aditya Pakki <pakki001@umn.edu>
> 
> I already tried this a few months ago and gave up.  There are a few things if 
> you want to tackle something like this
> 
> 1) use bpf's error injection thing to make sure you handle every path that can 
> error out.  This is the script I wrote to do just that
> 
> https://github.com/josefbacik/debug-scripts/blob/master/error-injection-stress.py
> 
> 2) We actually can't fail here.  We would need to go back and make _all_ callers 
> of lock_extent_bits() handle the allocation error.  This is theoretically 
> possible, but a giant pain in the ass.  In general we can make allocations here 
> and we need to be able to make them.
> 
> 3) We should probably mark this path with __GFP_NOFAIL because again, this is 
> locking and we need locking to succeed.

NOFAIL can introduce loops that could lead to deadlocks, if not used
carefully. __set_extent_bit is not just locking, so if one thread wants
to set bits, allocate, wait, allocator goes to write some memory

eg.

set_extent_bit on some range
  alloc state (NOFAIL)
    allocator wants to flush dome dirty data
                   ------------------------------>
		                               set_extent_bit
					         alloc state (NOFAIL)
						 (wait)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] btrfs: remove BUG_ON used as assertions
  2019-12-18 16:47   ` David Sterba
@ 2019-12-18 19:54     ` Josef Bacik
  0 siblings, 0 replies; 6+ messages in thread
From: Josef Bacik @ 2019-12-18 19:54 UTC (permalink / raw)
  To: dsterba, Aditya Pakki, kjlu, Chris Mason, David Sterba,
	linux-btrfs, linux-kernel

On 12/18/19 11:47 AM, David Sterba wrote:
> On Wed, Dec 18, 2019 at 11:38:18AM -0500, Josef Bacik wrote:
>> On 12/15/19 12:12 PM, Aditya Pakki wrote:
>>> alloc_extent_state_atomic() allocates extents via GFP_ATOMIC flag
>>> and cannot fail. There are multiple invocations of BUG_ON on the
>>> return value to check for failure. The patch replaces certain
>>> invocations of BUG_ON by returning the error upstream.
>>>
>>> Signed-off-by: Aditya Pakki <pakki001@umn.edu>
>>
>> I already tried this a few months ago and gave up.  There are a few things if
>> you want to tackle something like this
>>
>> 1) use bpf's error injection thing to make sure you handle every path that can
>> error out.  This is the script I wrote to do just that
>>
>> https://github.com/josefbacik/debug-scripts/blob/master/error-injection-stress.py
>>
>> 2) We actually can't fail here.  We would need to go back and make _all_ callers
>> of lock_extent_bits() handle the allocation error.  This is theoretically
>> possible, but a giant pain in the ass.  In general we can make allocations here
>> and we need to be able to make them.
>>
>> 3) We should probably mark this path with __GFP_NOFAIL because again, this is
>> locking and we need locking to succeed.
> 
> NOFAIL can introduce loops that could lead to deadlocks, if not used
> carefully. __set_extent_bit is not just locking, so if one thread wants
> to set bits, allocate, wait, allocator goes to write some memory
> 
> eg.
> 
> set_extent_bit on some range
>    alloc state (NOFAIL)
>      allocator wants to flush dome dirty data
>                     ------------------------------>
> 		                               set_extent_bit
> 					         alloc state (NOFAIL)
> 						 (wait)
> 

Yes obviously I just want it for EXTENT_LOCKED.  But we could even just use a 
mempool to be really safe, since most places are going to use GFP_KERNEL or 
something else related, we only really need the safety in a few critical areas. 
Thanks,

Josef


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-12-18 19:54 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-15 17:12 [PATCH] btrfs: remove BUG_ON used as assertions Aditya Pakki
2019-12-15 17:42 ` Nikolay Borisov
2019-12-18 16:31 ` David Sterba
2019-12-18 16:38 ` Josef Bacik
2019-12-18 16:47   ` David Sterba
2019-12-18 19:54     ` Josef Bacik

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).