* Re: Patch "btrfs: qgroup: fix data leak caused by race between writeback and truncate" has been added to the 4.14-stable tree
[not found] <15957672628285@kroah.com>
@ 2020-07-26 12:51 ` Qu Wenruo
2020-07-26 12:54 ` Qu Wenruo
0 siblings, 1 reply; 3+ messages in thread
From: Qu Wenruo @ 2020-07-26 12:51 UTC
To: gregkh; +Cc: stable-commits, stable
On 2020/7/26 8:41 PM, gregkh@linuxfoundation.org wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> btrfs: qgroup: fix data leak caused by race between writeback and truncate
>
> to the 4.14-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
>
> The filename of the patch is:
> btrfs-qgroup-fix-data-leak-caused-by-race-between-writeback-and-truncate.patch
> and it can be found in the queue-4.14 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable@vger.kernel.org> know about it.
Please don't merge this patch for any of the stable branches.
This patch needs one unmerged patch ("btrfs: change timing for qgroup
reserved space for ordered extents to fix reserved space leak", already
in maintainer's tree) as prerequisite.
The behavior without that patch could be problematic.
I should have noticed this earlier.
Thanks,
Qu
>
>
> From fa91e4aa1716004ea8096d5185ec0451e206aea0 Mon Sep 17 00:00:00 2001
> From: Qu Wenruo <wqu@suse.com>
> Date: Fri, 17 Jul 2020 15:12:05 +0800
> Subject: btrfs: qgroup: fix data leak caused by race between writeback and truncate
>
> From: Qu Wenruo <wqu@suse.com>
>
> commit fa91e4aa1716004ea8096d5185ec0451e206aea0 upstream.
>
> [BUG]
> When running tests like generic/013 on test device with btrfs quota
> enabled, it can normally lead to data leak, detected at unmount time:
>
> BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
> ------------[ cut here ]------------
> WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
> RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
> Call Trace:
> btrfs_put_super+0x15/0x17 [btrfs]
> generic_shutdown_super+0x72/0x110
> kill_anon_super+0x18/0x30
> btrfs_kill_super+0x17/0x30 [btrfs]
> deactivate_locked_super+0x3b/0xa0
> deactivate_super+0x40/0x50
> cleanup_mnt+0x135/0x190
> __cleanup_mnt+0x12/0x20
> task_work_run+0x64/0xb0
> __prepare_exit_to_usermode+0x1bc/0x1c0
> __syscall_return_slowpath+0x47/0x230
> do_syscall_64+0x64/0xb0
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ---[ end trace caf08beafeca2392 ]---
> BTRFS error (device dm-3): qgroup reserved space leaked
>
> [CAUSE]
> In the offending case, the offending operations are:
> 2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
> 2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0
>
> The following sequence of events could happen after the writev():
> CPU1 (writeback) | CPU2 (truncate)
> -----------------------------------------------------------------
> btrfs_writepages() |
> |- extent_write_cache_pages() |
> |- Got page for 1003520 |
> | 1003520 is Dirty, no writeback |
> | So (!clear_page_dirty_for_io()) |
> | gets called for it |
> |- Now page 1003520 is Clean. |
> | | btrfs_setattr()
> | | |- btrfs_setsize()
> | | |- truncate_setsize()
> | | New i_size is 18388
> |- __extent_writepage() |
> | |- page_offset() > i_size |
> |- btrfs_invalidatepage() |
> |- Page is clean, so no qgroup |
> callback executed
>
> This means, the qgroup reserved data space is not properly released in
> btrfs_invalidatepage() as the page is Clean.
>
> [FIX]
> Instead of checking the dirty bit of a page, call
> btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().
>
> As qgroup rsv are completely bound to the QGROUP_RESERVED bit of
> io_tree, not bound to page status, thus we won't cause double freeing
> anyway.
>
> Fixes: 0b34c261e235 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero")
> CC: stable@vger.kernel.org # 4.14+
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> Signed-off-by: David Sterba <dsterba@suse.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> ---
> fs/btrfs/inode.c | 23 ++++++++++-------------
> 1 file changed, 10 insertions(+), 13 deletions(-)
>
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9197,20 +9197,17 @@ again:
> /*
> * Qgroup reserved space handler
> * Page here will be either
> - * 1) Already written to disk
> - * In this case, its reserved space is released from data rsv map
> - * and will be freed by delayed_ref handler finally.
> - * So even we call qgroup_free_data(), it won't decrease reserved
> - * space.
> - * 2) Not written to disk
> - * This means the reserved space should be freed here. However,
> - * if a truncate invalidates the page (by clearing PageDirty)
> - * and the page is accounted for while allocating extent
> - * in btrfs_check_data_free_space() we let delayed_ref to
> - * free the entire extent.
> + * 1) Already written to disk or ordered extent already submitted
> + * Then its QGROUP_RESERVED bit in io_tree is already cleaned.
> + * Qgroup will be handled by its qgroup_record then.
> + * btrfs_qgroup_free_data() call will do nothing here.
> + *
> + * 2) Not written to disk yet
> + * Then btrfs_qgroup_free_data() call will clear the QGROUP_RESERVED
> + * bit of its io_tree, and free the qgroup reserved data space.
> + * Since the IO will never happen for this page.
> */
> - if (PageDirty(page))
> - btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
> + btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
> if (!inode_evicting) {
> clear_extent_bit(tree, page_start, page_end,
> EXTENT_LOCKED | EXTENT_DIRTY |
>
>
> Patches currently in stable-queue which might be from wqu@suse.com are
>
> queue-4.14/btrfs-qgroup-fix-data-leak-caused-by-race-between-writeback-and-truncate.patch
>
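For readers following the [CAUSE] and [FIX] sections quoted above, here is a
minimal userspace sketch of the race. It is a toy model, not the btrfs code:
every name in it (toy_page, toy_qgroup, toy_reserve, toy_free_data,
toy_invalidate_old, toy_invalidate_new, toy_replay_race) is hypothetical, and
it assumes a single 4096-byte page whose reservation is tracked by one flag
standing in for the QGROUP_RESERVED bit of io_tree. It does not model ordered
extents or the prerequisite patch mentioned in the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical stand-in for one page plus its io_tree state. */
struct toy_page {
	bool dirty;            /* models PageDirty() */
	bool qgroup_reserved;  /* models the QGROUP_RESERVED bit */
};

/* Hypothetical stand-in for the qgroup's reserved data space counter. */
struct toy_qgroup {
	uint64_t rsv_bytes;
};

/* Reserve space for a buffered write: account once per reserved bit. */
static void toy_reserve(struct toy_qgroup *qg, struct toy_page *p)
{
	if (!p->qgroup_reserved) {
		p->qgroup_reserved = true;
		qg->rsv_bytes += TOY_PAGE_SIZE;
	}
}

/* Free is keyed on the reserved bit only, so a second call frees nothing. */
static uint64_t toy_free_data(struct toy_qgroup *qg, struct toy_page *p)
{
	if (!p->qgroup_reserved)
		return 0;
	p->qgroup_reserved = false;
	qg->rsv_bytes -= TOY_PAGE_SIZE;
	return TOY_PAGE_SIZE;
}

/* Pre-patch shape: the free is skipped when the page is already clean. */
static void toy_invalidate_old(struct toy_qgroup *qg, struct toy_page *p)
{
	if (p->dirty)
		toy_free_data(qg, p);
}

/* Post-patch shape: the free is called unconditionally. */
static void toy_invalidate_new(struct toy_qgroup *qg, struct toy_page *p)
{
	toy_free_data(qg, p);
}

/* Replay the interleaving from the table; returns the bytes left reserved. */
uint64_t toy_replay_race(bool use_fix)
{
	struct toy_qgroup qg = { 0 };
	struct toy_page page = { 0 };

	/* writev(): space is reserved and the page becomes dirty. */
	toy_reserve(&qg, &page);
	page.dirty = true;

	/* CPU1 (writeback): the dirty bit is cleared before the page is written. */
	page.dirty = false;

	/* CPU2 (truncate): i_size shrinks, so the page now lies beyond i_size. */

	/* CPU1 again: the page is past i_size and gets invalidated, never written. */
	if (use_fix)
		toy_invalidate_new(&qg, &page);
	else
		toy_invalidate_old(&qg, &page);

	return qg.rsv_bytes;
}

/* Calling the free twice releases the space once, then returns 0. */
bool toy_double_free_is_noop(void)
{
	struct toy_qgroup qg = { 0 };
	struct toy_page page = { 0 };

	toy_reserve(&qg, &page);
	return toy_free_data(&qg, &page) == TOY_PAGE_SIZE &&
	       toy_free_data(&qg, &page) == 0 &&
	       qg.rsv_bytes == 0;
}
```

In this model toy_replay_race(false) leaves 4096 bytes reserved, matching the
"rsv 4096" in the warning, while toy_replay_race(true) leaves 0. Because the
free is keyed on the reserved flag rather than on the dirty bit, calling it a
second time releases nothing, which is the property the [FIX] section relies on.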
* Re: Patch "btrfs: qgroup: fix data leak caused by race between writeback and truncate" has been added to the 4.14-stable tree
2020-07-26 12:51 ` Patch "btrfs: qgroup: fix data leak caused by race between writeback and truncate" has been added to the 4.14-stable tree Qu Wenruo
@ 2020-07-26 12:54 ` Qu Wenruo
2020-07-26 13:10 ` Greg KH
0 siblings, 1 reply; 3+ messages in thread
From: Qu Wenruo @ 2020-07-26 12:54 UTC
To: gregkh; +Cc: stable-commits, stable, linux-btrfs
On 2020/7/26 8:51 PM, Qu Wenruo wrote:
>
>
> On 2020/7/26 8:41 PM, gregkh@linuxfoundation.org wrote:
>>
>> This is a note to let you know that I've just added the patch titled
>>
>> btrfs: qgroup: fix data leak caused by race between writeback and truncate
>>
>> to the 4.14-stable tree which can be found at:
>> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
>>
>> The filename of the patch is:
>> btrfs-qgroup-fix-data-leak-caused-by-race-between-writeback-and-truncate.patch
>> and it can be found in the queue-4.14 subdirectory.
>>
>> If you, or anyone else, feels it should not be added to the stable tree,
>> please let <stable@vger.kernel.org> know about it.
>
> Please don't merge this patch for any of the stable branches.
>
> This patch needs one unmerged patch ("btrfs: change timing for qgroup
> reserved space for ordered extents to fix reserved space leak", already
> in maintainer's tree) as prerequisite.
Also adding the btrfs mailing list to the discussion.
>
> The behavior without that patch could be problematic.
>
> I should have noticed this earlier.
>
> Thanks,
> Qu
>
>>
>>
>> From fa91e4aa1716004ea8096d5185ec0451e206aea0 Mon Sep 17 00:00:00 2001
>> From: Qu Wenruo <wqu@suse.com>
>> Date: Fri, 17 Jul 2020 15:12:05 +0800
>> Subject: btrfs: qgroup: fix data leak caused by race between writeback and truncate
>>
>> From: Qu Wenruo <wqu@suse.com>
>>
>> commit fa91e4aa1716004ea8096d5185ec0451e206aea0 upstream.
>>
>> [BUG]
>> When running tests like generic/013 on test device with btrfs quota
>> enabled, it can normally lead to data leak, detected at unmount time:
>>
>> BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
>> ------------[ cut here ]------------
>> WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
>> RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
>> Call Trace:
>> btrfs_put_super+0x15/0x17 [btrfs]
>> generic_shutdown_super+0x72/0x110
>> kill_anon_super+0x18/0x30
>> btrfs_kill_super+0x17/0x30 [btrfs]
>> deactivate_locked_super+0x3b/0xa0
>> deactivate_super+0x40/0x50
>> cleanup_mnt+0x135/0x190
>> __cleanup_mnt+0x12/0x20
>> task_work_run+0x64/0xb0
>> __prepare_exit_to_usermode+0x1bc/0x1c0
>> __syscall_return_slowpath+0x47/0x230
>> do_syscall_64+0x64/0xb0
>> entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> ---[ end trace caf08beafeca2392 ]---
>> BTRFS error (device dm-3): qgroup reserved space leaked
>>
>> [CAUSE]
>> In the offending case, the offending operations are:
>> 2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
>> 2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0
>>
>> The following sequence of events could happen after the writev():
>> CPU1 (writeback) | CPU2 (truncate)
>> -----------------------------------------------------------------
>> btrfs_writepages() |
>> |- extent_write_cache_pages() |
>> |- Got page for 1003520 |
>> | 1003520 is Dirty, no writeback |
>> | So (!clear_page_dirty_for_io()) |
>> | gets called for it |
>> |- Now page 1003520 is Clean. |
>> | | btrfs_setattr()
>> | | |- btrfs_setsize()
>> | | |- truncate_setsize()
>> | | New i_size is 18388
>> |- __extent_writepage() |
>> | |- page_offset() > i_size |
>> |- btrfs_invalidatepage() |
>> |- Page is clean, so no qgroup |
>> callback executed
>>
>> This means, the qgroup reserved data space is not properly released in
>> btrfs_invalidatepage() as the page is Clean.
>>
>> [FIX]
>> Instead of checking the dirty bit of a page, call
>> btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().
>>
>> As qgroup rsv are completely bound to the QGROUP_RESERVED bit of
>> io_tree, not bound to page status, thus we won't cause double freeing
>> anyway.
>>
>> Fixes: 0b34c261e235 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero")
>> CC: stable@vger.kernel.org # 4.14+
>> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> Signed-off-by: David Sterba <dsterba@suse.com>
>> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>
>> ---
>> fs/btrfs/inode.c | 23 ++++++++++-------------
>> 1 file changed, 10 insertions(+), 13 deletions(-)
>>
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -9197,20 +9197,17 @@ again:
>> /*
>> * Qgroup reserved space handler
>> * Page here will be either
>> - * 1) Already written to disk
>> - * In this case, its reserved space is released from data rsv map
>> - * and will be freed by delayed_ref handler finally.
>> - * So even we call qgroup_free_data(), it won't decrease reserved
>> - * space.
>> - * 2) Not written to disk
>> - * This means the reserved space should be freed here. However,
>> - * if a truncate invalidates the page (by clearing PageDirty)
>> - * and the page is accounted for while allocating extent
>> - * in btrfs_check_data_free_space() we let delayed_ref to
>> - * free the entire extent.
>> + * 1) Already written to disk or ordered extent already submitted
>> + * Then its QGROUP_RESERVED bit in io_tree is already cleaned.
>> + * Qgroup will be handled by its qgroup_record then.
>> + * btrfs_qgroup_free_data() call will do nothing here.
>> + *
>> + * 2) Not written to disk yet
>> + * Then btrfs_qgroup_free_data() call will clear the QGROUP_RESERVED
>> + * bit of its io_tree, and free the qgroup reserved data space.
>> + * Since the IO will never happen for this page.
>> */
>> - if (PageDirty(page))
>> - btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
>> + btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
>> if (!inode_evicting) {
>> clear_extent_bit(tree, page_start, page_end,
>> EXTENT_LOCKED | EXTENT_DIRTY |
>>
>>
>> Patches currently in stable-queue which might be from wqu@suse.com are
>>
>> queue-4.14/btrfs-qgroup-fix-data-leak-caused-by-race-between-writeback-and-truncate.patch
>>
* Re: Patch "btrfs: qgroup: fix data leak caused by race between writeback and truncate" has been added to the 4.14-stable tree
2020-07-26 12:54 ` Qu Wenruo
@ 2020-07-26 13:10 ` Greg KH
0 siblings, 0 replies; 3+ messages in thread
From: Greg KH @ 2020-07-26 13:10 UTC
To: Qu Wenruo; +Cc: stable-commits, stable, linux-btrfs
On Sun, Jul 26, 2020 at 08:54:29PM +0800, Qu Wenruo wrote:
>
>
> On 2020/7/26 8:51 PM, Qu Wenruo wrote:
> >
> >
> > On 2020/7/26 8:41 PM, gregkh@linuxfoundation.org wrote:
> >>
> >> This is a note to let you know that I've just added the patch titled
> >>
> >> btrfs: qgroup: fix data leak caused by race between writeback and truncate
> >>
> >> to the 4.14-stable tree which can be found at:
> >> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
> >>
> >> The filename of the patch is:
> >> btrfs-qgroup-fix-data-leak-caused-by-race-between-writeback-and-truncate.patch
> >> and it can be found in the queue-4.14 subdirectory.
> >>
> >> If you, or anyone else, feels it should not be added to the stable tree,
> >> please let <stable@vger.kernel.org> know about it.
> >
> > Please don't merge this patch for any of the stable branches.
> >
> > This patch needs one unmerged patch ("btrfs: change timing for qgroup
> > reserved space for ordered extents to fix reserved space leak", already
> > in maintainer's tree) as prerequisite.
>
> Also adding the btrfs mailing list to the discussion.
>
> >
> > The behavior without that patch could be problematic.
> >
> > I should have noticed this earlier.
No problem, now dropped from all stable queues.
greg k-h