* [PATCHv3] btrfs: Fix error handling in btrfs_cleanup_ordered_extents
@ 2018-11-21 15:10 Nikolay Borisov
2018-12-04 16:28 ` Nikolay Borisov
2018-12-05 16:13 ` David Sterba
0 siblings, 2 replies; 3+ messages in thread
From: Nikolay Borisov @ 2018-11-21 15:10 UTC (permalink / raw)
To: linux-btrfs; +Cc: Nikolay Borisov
Running btrfs/124 in a loop hung up on me sporadically with the
following call trace:
btrfs D 0 5760 5324 0x00000000
Call Trace:
? __schedule+0x243/0x800
schedule+0x33/0x90
btrfs_start_ordered_extent+0x10c/0x1b0 [btrfs]
? wait_woken+0xa0/0xa0
btrfs_wait_ordered_range+0xbb/0x100 [btrfs]
btrfs_relocate_block_group+0x1ff/0x230 [btrfs]
btrfs_relocate_chunk+0x49/0x100 [btrfs]
btrfs_balance+0xbeb/0x1740 [btrfs]
btrfs_ioctl_balance+0x2ee/0x380 [btrfs]
btrfs_ioctl+0x1691/0x3110 [btrfs]
? lockdep_hardirqs_on+0xed/0x180
? __handle_mm_fault+0x8e7/0xfb0
? _raw_spin_unlock+0x24/0x30
? __handle_mm_fault+0x8e7/0xfb0
? do_vfs_ioctl+0xa5/0x6e0
? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
do_vfs_ioctl+0xa5/0x6e0
? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
ksys_ioctl+0x3a/0x70
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x60/0x1b0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
This happens because during page writeback it's valid for
writepage_delalloc to instantiate a delalloc range which doesn't
belong to the page currently being written back.
The reason this case is valid is due to find_lock_delalloc_range
returning any available range after the passed delalloc_start and
ignorting whether the page under writeback is within that range.
In turn ordered extents (OE) are always created for the returned range
from find_lock_delalloc_range. If, however, a failure occurs while OE
are being created then the clean up code in btrfs_cleanup_ordered_extents
will be called.
Unfortunately the code in btrfs_cleanup_ordered_extents doesn't consider
the case of such 'foreign' range being processed and instead it always
assumes that the range OE are created for belongs to the page. This
leads to the first page of such foregin range to not be cleaned up since
it's deliberately missed skipped by the current cleaning up code.
Fix this by correctly checking whether the current page belongs to the
range being instantiated and if so adjsut the range parameters
passed for cleaning up. If it doesn't, then just clean the whole OE
range directly.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
---
V3:
* Re-worded the commit for easier comprehension
* Added RB tag from Josef
V2:
* Fix compilation failure due to missing parentheses
* Fixed the "Fixes" tag.
fs/btrfs/inode.c | 29 ++++++++++++++++++++---------
1 file changed, 20 insertions(+), 9 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a0fc564626df..9f65f131f244 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -110,17 +110,17 @@ static void __endio_write_update_ordered(struct inode *inode,
* extent_clear_unlock_delalloc() to clear both the bits EXTENT_DO_ACCOUNTING
* and EXTENT_DELALLOC simultaneously, because that causes the reserved metadata
* to be released, which we want to happen only when finishing the ordered
- * extent (btrfs_finish_ordered_io()). Also note that the caller of
- * btrfs_run_delalloc_range already does proper cleanup for the first page of
- * the range, that is, it invokes the callback writepage_end_io_hook() for the
- * range of the first page.
+ * extent (btrfs_finish_ordered_io()).
*/
static inline void btrfs_cleanup_ordered_extents(struct inode *inode,
- const u64 offset,
- const u64 bytes)
+ struct page *locked_page,
+ u64 offset, u64 bytes)
{
unsigned long index = offset >> PAGE_SHIFT;
unsigned long end_index = (offset + bytes - 1) >> PAGE_SHIFT;
+ u64 page_start = page_offset(locked_page);
+ u64 page_end = page_start + PAGE_SIZE - 1;
+
struct page *page;
while (index <= end_index) {
@@ -131,8 +131,18 @@ static inline void btrfs_cleanup_ordered_extents(struct inode *inode,
ClearPagePrivate2(page);
put_page(page);
}
- return __endio_write_update_ordered(inode, offset + PAGE_SIZE,
- bytes - PAGE_SIZE, false);
+
+ /*
+ * In case this page belongs to the delalloc range being instantiated
+ * then skip it, since the first page of a range is going to be
+ * properly cleaned up by the caller of run_delalloc_range
+ */
+ if (page_start >= offset && page_end <= (offset + bytes - 1)) {
+ offset += PAGE_SIZE;
+ bytes -= PAGE_SIZE;
+ }
+
+ return __endio_write_update_ordered(inode, offset, bytes, false);
}
static int btrfs_dirty_inode(struct inode *inode);
@@ -1605,7 +1615,8 @@ int btrfs_run_delalloc_range(void *private_data, struct page *locked_page,
write_flags);
}
if (ret)
- btrfs_cleanup_ordered_extents(inode, start, end - start + 1);
+ btrfs_cleanup_ordered_extents(inode, locked_page, start,
+ end - start + 1);
return ret;
}
--
2.17.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCHv3] btrfs: Fix error handling in btrfs_cleanup_ordered_extents
2018-11-21 15:10 [PATCHv3] btrfs: Fix error handling in btrfs_cleanup_ordered_extents Nikolay Borisov
@ 2018-12-04 16:28 ` Nikolay Borisov
2018-12-05 16:13 ` David Sterba
1 sibling, 0 replies; 3+ messages in thread
From: Nikolay Borisov @ 2018-12-04 16:28 UTC (permalink / raw)
To: linux-btrfs
On 21.11.18 г. 17:10 ч., Nikolay Borisov wrote:
> Running btrfs/124 in a loop hung up on me sporadically with the
> following call trace:
> btrfs D 0 5760 5324 0x00000000
> Call Trace:
> ? __schedule+0x243/0x800
> schedule+0x33/0x90
> btrfs_start_ordered_extent+0x10c/0x1b0 [btrfs]
> ? wait_woken+0xa0/0xa0
> btrfs_wait_ordered_range+0xbb/0x100 [btrfs]
> btrfs_relocate_block_group+0x1ff/0x230 [btrfs]
> btrfs_relocate_chunk+0x49/0x100 [btrfs]
> btrfs_balance+0xbeb/0x1740 [btrfs]
> btrfs_ioctl_balance+0x2ee/0x380 [btrfs]
> btrfs_ioctl+0x1691/0x3110 [btrfs]
> ? lockdep_hardirqs_on+0xed/0x180
> ? __handle_mm_fault+0x8e7/0xfb0
> ? _raw_spin_unlock+0x24/0x30
> ? __handle_mm_fault+0x8e7/0xfb0
> ? do_vfs_ioctl+0xa5/0x6e0
> ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
> do_vfs_ioctl+0xa5/0x6e0
> ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> ksys_ioctl+0x3a/0x70
> __x64_sys_ioctl+0x16/0x20
> do_syscall_64+0x60/0x1b0
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> This happens because during page writeback it's valid for
> writepage_delalloc to instantiate a delalloc range which doesn't
> belong to the page currently being written back.
>
> The reason this case is valid is due to find_lock_delalloc_range
> returning any available range after the passed delalloc_start and
> ignorting whether the page under writeback is within that range.
> In turn ordered extents (OE) are always created for the returned range
> from find_lock_delalloc_range. If, however, a failure occurs while OE
> are being created then the clean up code in btrfs_cleanup_ordered_extents
> will be called.
>
> Unfortunately the code in btrfs_cleanup_ordered_extents doesn't consider
> the case of such 'foreign' range being processed and instead it always
> assumes that the range OE are created for belongs to the page. This
> leads to the first page of such foregin range to not be cleaned up since
> it's deliberately missed skipped by the current cleaning up code.
>
> Fix this by correctly checking whether the current page belongs to the
> range being instantiated and if so adjsut the range parameters
> passed for cleaning up. If it doesn't, then just clean the whole OE
> range directly.
>
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
> ---
> V3:
> * Re-worded the commit for easier comprehension
> * Added RB tag from Josef
>
> V2:
> * Fix compilation failure due to missing parentheses
> * Fixed the "Fixes" tag.
> fs/btrfs/inode.c | 29 ++++++++++++++++++++---------
> 1 file changed, 20 insertions(+), 9 deletions(-)
>
Ping,
Also this patch needs:
Fixes: 524272607e88 ("btrfs: Handle delalloc error correctly to avoid
ordered extent hang") and it needs to be applied to the stable releases 4.14
<snip>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCHv3] btrfs: Fix error handling in btrfs_cleanup_ordered_extents
2018-11-21 15:10 [PATCHv3] btrfs: Fix error handling in btrfs_cleanup_ordered_extents Nikolay Borisov
2018-12-04 16:28 ` Nikolay Borisov
@ 2018-12-05 16:13 ` David Sterba
1 sibling, 0 replies; 3+ messages in thread
From: David Sterba @ 2018-12-05 16:13 UTC (permalink / raw)
To: Nikolay Borisov; +Cc: linux-btrfs
On Wed, Nov 21, 2018 at 05:10:52PM +0200, Nikolay Borisov wrote:
> Running btrfs/124 in a loop hung up on me sporadically with the
> following call trace:
> btrfs D 0 5760 5324 0x00000000
> Call Trace:
> ? __schedule+0x243/0x800
> schedule+0x33/0x90
> btrfs_start_ordered_extent+0x10c/0x1b0 [btrfs]
> ? wait_woken+0xa0/0xa0
> btrfs_wait_ordered_range+0xbb/0x100 [btrfs]
> btrfs_relocate_block_group+0x1ff/0x230 [btrfs]
> btrfs_relocate_chunk+0x49/0x100 [btrfs]
> btrfs_balance+0xbeb/0x1740 [btrfs]
> btrfs_ioctl_balance+0x2ee/0x380 [btrfs]
> btrfs_ioctl+0x1691/0x3110 [btrfs]
> ? lockdep_hardirqs_on+0xed/0x180
> ? __handle_mm_fault+0x8e7/0xfb0
> ? _raw_spin_unlock+0x24/0x30
> ? __handle_mm_fault+0x8e7/0xfb0
> ? do_vfs_ioctl+0xa5/0x6e0
> ? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
> do_vfs_ioctl+0xa5/0x6e0
> ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
> ksys_ioctl+0x3a/0x70
> __x64_sys_ioctl+0x16/0x20
> do_syscall_64+0x60/0x1b0
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> This happens because during page writeback it's valid for
> writepage_delalloc to instantiate a delalloc range which doesn't
> belong to the page currently being written back.
>
> The reason this case is valid is due to find_lock_delalloc_range
> returning any available range after the passed delalloc_start and
> ignorting whether the page under writeback is within that range.
> In turn ordered extents (OE) are always created for the returned range
> from find_lock_delalloc_range. If, however, a failure occurs while OE
> are being created then the clean up code in btrfs_cleanup_ordered_extents
> will be called.
>
> Unfortunately the code in btrfs_cleanup_ordered_extents doesn't consider
> the case of such 'foreign' range being processed and instead it always
> assumes that the range OE are created for belongs to the page. This
> leads to the first page of such foregin range to not be cleaned up since
> it's deliberately missed skipped by the current cleaning up code.
>
> Fix this by correctly checking whether the current page belongs to the
> range being instantiated and if so adjsut the range parameters
> passed for cleaning up. If it doesn't, then just clean the whole OE
> range directly.
>
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Added to misc-next, thanks.
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2018-12-05 16:14 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-21 15:10 [PATCHv3] btrfs: Fix error handling in btrfs_cleanup_ordered_extents Nikolay Borisov
2018-12-04 16:28 ` Nikolay Borisov
2018-12-05 16:13 ` David Sterba
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).