All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] btrfs: check page->mapping when loading free space cache
@ 2019-09-24 20:50 Josef Bacik
  2019-09-24 20:50 ` [PATCH 2/2] btrfs: use btrfs_block_group_cache_done in update_block_group Josef Bacik
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Josef Bacik @ 2019-09-24 20:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

While testing 5.2 we ran into the following panic

[52238.017028] BUG: kernel NULL pointer dereference, address: 0000000000000001
[52238.105608] RIP: 0010:drop_buffers+0x3d/0x150
[52238.304051] Call Trace:
[52238.308958]  try_to_free_buffers+0x15b/0x1b0
[52238.317503]  shrink_page_list+0x1164/0x1780
[52238.325877]  shrink_inactive_list+0x18f/0x3b0
[52238.334596]  shrink_node_memcg+0x23e/0x7d0
[52238.342790]  ? do_shrink_slab+0x4f/0x290
[52238.350648]  shrink_node+0xce/0x4a0
[52238.357628]  balance_pgdat+0x2c7/0x510
[52238.365135]  kswapd+0x216/0x3e0
[52238.371425]  ? wait_woken+0x80/0x80
[52238.378412]  ? balance_pgdat+0x510/0x510
[52238.386265]  kthread+0x111/0x130
[52238.392727]  ? kthread_create_on_node+0x60/0x60
[52238.401782]  ret_from_fork+0x1f/0x30

The page we were trying to drop had a page->private, but had no
page->mapping and so called drop_buffers, assuming that we had a
buffer_head on the page, and then panic'ed trying to deref 1, which is
our page->private for data pages.

This is happening because we're truncating the free space cache while
we're trying to load the free space cache.  This isn't supposed to
happen, and I'll fix that in a followup patch.  However we still
shouldn't allow those sort of mistakes to result in messing with pages
that do not belong to us.  So add the page->mapping check to verify that
we still own this page after dropping and re-acquiring the page lock.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/free-space-cache.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index d54dcd0ab230..d86ada9c3c54 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -385,6 +385,12 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, struct inode *inode
 		if (uptodate && !PageUptodate(page)) {
 			btrfs_readpage(NULL, page);
 			lock_page(page);
+			if (page->mapping != inode->i_mapping) {
+				btrfs_err(BTRFS_I(inode)->root->fs_info,
+					  "free space cache page truncated");
+				io_ctl_drop_pages(io_ctl);
+				return -EIO;
+			}
 			if (!PageUptodate(page)) {
 				btrfs_err(BTRFS_I(inode)->root->fs_info,
 					   "error reading free space cache");
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] btrfs: use btrfs_block_group_cache_done in update_block_group
  2019-09-24 20:50 [PATCH 1/2] btrfs: check page->mapping when loading free space cache Josef Bacik
@ 2019-09-24 20:50 ` Josef Bacik
  2019-10-01 14:08   ` Nikolay Borisov
  2019-09-27 10:28 ` [PATCH 1/2] btrfs: check page->mapping when loading free space cache Filipe Manana
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 6+ messages in thread
From: Josef Bacik @ 2019-09-24 20:50 UTC (permalink / raw)
  To: linux-btrfs, kernel-team

When free'ing extents in a block group we check to see if the block
group is not cached, and then cache it if we need to.  However we'll
just carry on as long as we're loading the cache.  This is problematic
because we are dirtying the block group here.  If we are fast enough we
could do a transaction commit and clear the free space cache while we're
still loading the space cache in another thread.  This truncates the
free space inode, which will keep it from loading the space cache.

Fix this by using the btrfs_block_group_cache_done helper so that we try
to load the space cache unconditionally here, which will result in the
caller waiting for the fast caching to complete and keep us from
truncating the free space inode.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/block-group.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index bf7e3f23bba7..d7b3a516f27a 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2661,7 +2661,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
 		 * is because we need the unpinning stage to actually add the
 		 * space back to the block group, otherwise we will leak space.
 		 */
-		if (!alloc && cache->cached == BTRFS_CACHE_NO)
+		if (!alloc && !btrfs_block_group_cache_done(cache))
 			btrfs_cache_block_group(cache, 1);
 
 		byte_in_group = bytenr - cache->key.objectid;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] btrfs: check page->mapping when loading free space cache
  2019-09-24 20:50 [PATCH 1/2] btrfs: check page->mapping when loading free space cache Josef Bacik
  2019-09-24 20:50 ` [PATCH 2/2] btrfs: use btrfs_block_group_cache_done in update_block_group Josef Bacik
@ 2019-09-27 10:28 ` Filipe Manana
  2019-09-27 12:23 ` Nikolay Borisov
  2019-10-15 17:34 ` David Sterba
  3 siblings, 0 replies; 6+ messages in thread
From: Filipe Manana @ 2019-09-27 10:28 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Thu, Sep 26, 2019 at 11:53 AM Josef Bacik <josef@toxicpanda.com> wrote:
>
> While testing 5.2 we ran into the following panic
>
> [52238.017028] BUG: kernel NULL pointer dereference, address: 0000000000000001
> [52238.105608] RIP: 0010:drop_buffers+0x3d/0x150
> [52238.304051] Call Trace:
> [52238.308958]  try_to_free_buffers+0x15b/0x1b0
> [52238.317503]  shrink_page_list+0x1164/0x1780
> [52238.325877]  shrink_inactive_list+0x18f/0x3b0
> [52238.334596]  shrink_node_memcg+0x23e/0x7d0
> [52238.342790]  ? do_shrink_slab+0x4f/0x290
> [52238.350648]  shrink_node+0xce/0x4a0
> [52238.357628]  balance_pgdat+0x2c7/0x510
> [52238.365135]  kswapd+0x216/0x3e0
> [52238.371425]  ? wait_woken+0x80/0x80
> [52238.378412]  ? balance_pgdat+0x510/0x510
> [52238.386265]  kthread+0x111/0x130
> [52238.392727]  ? kthread_create_on_node+0x60/0x60
> [52238.401782]  ret_from_fork+0x1f/0x30
>
> The page we were trying to drop had a page->private, but had no
> page->mapping and so called drop_buffers, assuming that we had a
> buffer_head on the page, and then panic'ed trying to deref 1, which is
> our page->private for data pages.
>
> This is happening because we're truncating the free space cache while
> we're trying to load the free space cache.  This isn't supposed to
> happen, and I'll fix that in a followup patch.  However we still
> shouldn't allow those sort of mistakes to result in messing with pages
> that do not belong to us.  So add the page->mapping check to verify that
> we still own this page after dropping and re-acquiring the page lock.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Filipe Manana <fdmanana@suse.com>

Looks good, thanks.


> ---
>  fs/btrfs/free-space-cache.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index d54dcd0ab230..d86ada9c3c54 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -385,6 +385,12 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, struct inode *inode
>                 if (uptodate && !PageUptodate(page)) {
>                         btrfs_readpage(NULL, page);
>                         lock_page(page);
> +                       if (page->mapping != inode->i_mapping) {
> +                               btrfs_err(BTRFS_I(inode)->root->fs_info,
> +                                         "free space cache page truncated");
> +                               io_ctl_drop_pages(io_ctl);
> +                               return -EIO;
> +                       }
>                         if (!PageUptodate(page)) {
>                                 btrfs_err(BTRFS_I(inode)->root->fs_info,
>                                            "error reading free space cache");
> --
> 2.21.0
>


-- 
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] btrfs: check page->mapping when loading free space cache
  2019-09-24 20:50 [PATCH 1/2] btrfs: check page->mapping when loading free space cache Josef Bacik
  2019-09-24 20:50 ` [PATCH 2/2] btrfs: use btrfs_block_group_cache_done in update_block_group Josef Bacik
  2019-09-27 10:28 ` [PATCH 1/2] btrfs: check page->mapping when loading free space cache Filipe Manana
@ 2019-09-27 12:23 ` Nikolay Borisov
  2019-10-15 17:34 ` David Sterba
  3 siblings, 0 replies; 6+ messages in thread
From: Nikolay Borisov @ 2019-09-27 12:23 UTC (permalink / raw)
  To: Josef Bacik, kernel-team, linux-btrfs



On 24.09.19 г. 23:50 ч., Josef Bacik wrote:
> While testing 5.2 we ran into the following panic
> 
> [52238.017028] BUG: kernel NULL pointer dereference, address: 0000000000000001
> [52238.105608] RIP: 0010:drop_buffers+0x3d/0x150
> [52238.304051] Call Trace:
> [52238.308958]  try_to_free_buffers+0x15b/0x1b0
> [52238.317503]  shrink_page_list+0x1164/0x1780
> [52238.325877]  shrink_inactive_list+0x18f/0x3b0
> [52238.334596]  shrink_node_memcg+0x23e/0x7d0
> [52238.342790]  ? do_shrink_slab+0x4f/0x290
> [52238.350648]  shrink_node+0xce/0x4a0
> [52238.357628]  balance_pgdat+0x2c7/0x510
> [52238.365135]  kswapd+0x216/0x3e0
> [52238.371425]  ? wait_woken+0x80/0x80
> [52238.378412]  ? balance_pgdat+0x510/0x510
> [52238.386265]  kthread+0x111/0x130
> [52238.392727]  ? kthread_create_on_node+0x60/0x60
> [52238.401782]  ret_from_fork+0x1f/0x30
> 
> The page we were trying to drop had a page->private, but had no
> page->mapping and so called drop_buffers, assuming that we had a
> buffer_head on the page, and then panic'ed trying to deref 1, which is
> our page->private for data pages.
> 
> This is happening because we're truncating the free space cache while
> we're trying to load the free space cache.  This isn't supposed to
> happen, and I'll fix that in a followup patch.  However we still
> shouldn't allow those sort of mistakes to result in messing with pages
> that do not belong to us.  So add the page->mapping check to verify that
> we still own this page after dropping and re-acquiring the page lock.

Where is this page being unlocked? Is it:
btrfs_readpage
 extent_read_full_page
  __extent_read_full_page
   __do_readpage
    if (!nr) unlock_page  <-- nr can be 0 only if submit_extent_page
returns an error, correct? That's somewhat subtle and buried inside a
long call chaing and really non obvious. So please mention when is the
page lock being dropped.

Code-wise LGTM:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/free-space-cache.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
> index d54dcd0ab230..d86ada9c3c54 100644
> --- a/fs/btrfs/free-space-cache.c
> +++ b/fs/btrfs/free-space-cache.c
> @@ -385,6 +385,12 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, struct inode *inode
>  		if (uptodate && !PageUptodate(page)) {
>  			btrfs_readpage(NULL, page);
>  			lock_page(page);
> +			if (page->mapping != inode->i_mapping) {
> +				btrfs_err(BTRFS_I(inode)->root->fs_info,
> +					  "free space cache page truncated");
> +				io_ctl_drop_pages(io_ctl);
> +				return -EIO;
> +			}
>  			if (!PageUptodate(page)) {
>  				btrfs_err(BTRFS_I(inode)->root->fs_info,
>  					   "error reading free space cache");
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/2] btrfs: use btrfs_block_group_cache_done in update_block_group
  2019-09-24 20:50 ` [PATCH 2/2] btrfs: use btrfs_block_group_cache_done in update_block_group Josef Bacik
@ 2019-10-01 14:08   ` Nikolay Borisov
  0 siblings, 0 replies; 6+ messages in thread
From: Nikolay Borisov @ 2019-10-01 14:08 UTC (permalink / raw)
  To: Josef Bacik, kernel-team, linux-btrfs



On 24.09.19 г. 23:50 ч., Josef Bacik wrote:
> When free'ing extents in a block group we check to see if the block
> group is not cached, and then cache it if we need to.  However we'll
> just carry on as long as we're loading the cache.  This is problematic
> because we are dirtying the block group here.  If we are fast enough we
> could do a transaction commit and clear the free space cache while we're

This would imply a race condition between loading the space cache and
running delayed refs - because there where
__btrfs_free_extent->btrfs_update_blockgroup(alloc=0) is called from, no ?

> still loading the space cache in another thread.  This truncates the
> free space inode, which will keep it from loading the space cache.
> 
> Fix this by using the btrfs_block_group_cache_done helper so that we try
> to load the space cache unconditionally here, which will result in the
> caller waiting for the fast caching to complete and keep us from
> truncating the free space inode.

So the problem was that cache->cached == BTRFS_CACHE_NO misses other
interim states e.g. CACHE_STARTED/CACHE_FAST. Which leads to the
aforementioned race, correct?


> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Based on our offlist discussion:

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/block-group.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
> index bf7e3f23bba7..d7b3a516f27a 100644
> --- a/fs/btrfs/block-group.c
> +++ b/fs/btrfs/block-group.c
> @@ -2661,7 +2661,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans,
>  		 * is because we need the unpinning stage to actually add the
>  		 * space back to the block group, otherwise we will leak space.
>  		 */
> -		if (!alloc && cache->cached == BTRFS_CACHE_NO)
> +		if (!alloc && !btrfs_block_group_cache_done(cache))
>  			btrfs_cache_block_group(cache, 1);
>  
>  		byte_in_group = bytenr - cache->key.objectid;
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] btrfs: check page->mapping when loading free space cache
  2019-09-24 20:50 [PATCH 1/2] btrfs: check page->mapping when loading free space cache Josef Bacik
                   ` (2 preceding siblings ...)
  2019-09-27 12:23 ` Nikolay Borisov
@ 2019-10-15 17:34 ` David Sterba
  3 siblings, 0 replies; 6+ messages in thread
From: David Sterba @ 2019-10-15 17:34 UTC (permalink / raw)
  To: Josef Bacik; +Cc: linux-btrfs, kernel-team

On Tue, Sep 24, 2019 at 04:50:43PM -0400, Josef Bacik wrote:
> While testing 5.2 we ran into the following panic
> 
> [52238.017028] BUG: kernel NULL pointer dereference, address: 0000000000000001
> [52238.105608] RIP: 0010:drop_buffers+0x3d/0x150
> [52238.304051] Call Trace:
> [52238.308958]  try_to_free_buffers+0x15b/0x1b0
> [52238.317503]  shrink_page_list+0x1164/0x1780
> [52238.325877]  shrink_inactive_list+0x18f/0x3b0
> [52238.334596]  shrink_node_memcg+0x23e/0x7d0
> [52238.342790]  ? do_shrink_slab+0x4f/0x290
> [52238.350648]  shrink_node+0xce/0x4a0
> [52238.357628]  balance_pgdat+0x2c7/0x510
> [52238.365135]  kswapd+0x216/0x3e0
> [52238.371425]  ? wait_woken+0x80/0x80
> [52238.378412]  ? balance_pgdat+0x510/0x510
> [52238.386265]  kthread+0x111/0x130
> [52238.392727]  ? kthread_create_on_node+0x60/0x60
> [52238.401782]  ret_from_fork+0x1f/0x30
> 
> The page we were trying to drop had a page->private, but had no
> page->mapping and so called drop_buffers, assuming that we had a
> buffer_head on the page, and then panic'ed trying to deref 1, which is
> our page->private for data pages.
> 
> This is happening because we're truncating the free space cache while
> we're trying to load the free space cache.  This isn't supposed to
> happen, and I'll fix that in a followup patch.  However we still
> shouldn't allow those sort of mistakes to result in messing with pages
> that do not belong to us.  So add the page->mapping check to verify that
> we still own this page after dropping and re-acquiring the page lock.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Patches 1 and 2 moved to misc-next. Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-10-15 17:34 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-24 20:50 [PATCH 1/2] btrfs: check page->mapping when loading free space cache Josef Bacik
2019-09-24 20:50 ` [PATCH 2/2] btrfs: use btrfs_block_group_cache_done in update_block_group Josef Bacik
2019-10-01 14:08   ` Nikolay Borisov
2019-09-27 10:28 ` [PATCH 1/2] btrfs: check page->mapping when loading free space cache Filipe Manana
2019-09-27 12:23 ` Nikolay Borisov
2019-10-15 17:34 ` David Sterba

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.