* [PATCH] btrfs: Fix error handling in btrfs_cleanup_ordered_extents
@ 2018-10-26 11:13 Nikolay Borisov
2018-10-26 11:29 ` Holger Hoffstätte
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Nikolay Borisov @ 2018-10-26 11:13 UTC (permalink / raw)
To: dsterba; +Cc: linux-btrfs, wqu, Nikolay Borisov
Running btrfs/124 in a loop hung up on me sporadically with the
following call trace:
btrfs D 0 5760 5324 0x00000000
Call Trace:
? __schedule+0x243/0x800
schedule+0x33/0x90
btrfs_start_ordered_extent+0x10c/0x1b0 [btrfs]
? wait_woken+0xa0/0xa0
btrfs_wait_ordered_range+0xbb/0x100 [btrfs]
btrfs_relocate_block_group+0x1ff/0x230 [btrfs]
btrfs_relocate_chunk+0x49/0x100 [btrfs]
btrfs_balance+0xbeb/0x1740 [btrfs]
btrfs_ioctl_balance+0x2ee/0x380 [btrfs]
btrfs_ioctl+0x1691/0x3110 [btrfs]
? lockdep_hardirqs_on+0xed/0x180
? __handle_mm_fault+0x8e7/0xfb0
? _raw_spin_unlock+0x24/0x30
? __handle_mm_fault+0x8e7/0xfb0
? do_vfs_ioctl+0xa5/0x6e0
? btrfs_ioctl_get_supported_features+0x30/0x30 [btrfs]
do_vfs_ioctl+0xa5/0x6e0
? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
ksys_ioctl+0x3a/0x70
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x60/0x1b0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Turns out during page writeback it's possible that the code in
writepage_delalloc can instantiate a delalloc range which doesn't
belong to the page currently being written back. This happens since
find_lock_delalloc_range returns up to BTRFS_MAX_EXTENT_SIZE delalloc range
when asked and doens't really consider the range of the passed page. When such
a foregin range is found the code proceeds to run_delalloc_range and calls the
appropriate function to fill the delalloc and create ordered extents. If,
however, a failure occurs while this operation is in effect then the clean up
code in btrfs_cleanup_ordered_extents will be called. This function has the
wrong assumption that caller of run_delalloc_range always properly cleans the
first page of the range hence when it calls __endio_write_update_ordered it
explicitly ommits the first page of the delalloc range. This is wrong because
this function could be cleaning a delalloc range that doesn't belong to the
current page. This in turn means that the page cleanup code in
__extent_writepage will not really free the initial page for the range, leaving
a hanging ordered extent with bytes_left set to 4k. This bug has been present
ever since the original introduction of the cleanup code in 524272607e88.
Fix this by correctly checking whether the current page belongs to the
range being instantiated and if so correctly adjust the range parameters
passed for cleaning up. If it doesn't, then just clean the whole OE
range directly.
524272607e88 ("btrfs: Handle delalloc error correctly to avoid ordered extent hang")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
With this patch I can no longer get btrfs/124 to hang nor do I see any
regressions in xfstest.
Furthermore, this needs to go in stable since v4.12
fs/btrfs/inode.c | 29 ++++++++++++++++++++---------
1 file changed, 20 insertions(+), 9 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index de4e3a9563f7..f6a96da2f2ea 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -109,17 +109,17 @@ static void __endio_write_update_ordered(struct inode *inode,
* extent_clear_unlock_delalloc() to clear both the bits EXTENT_DO_ACCOUNTING
* and EXTENT_DELALLOC simultaneously, because that causes the reserved metadata
* to be released, which we want to happen only when finishing the ordered
- * extent (btrfs_finish_ordered_io()). Also note that the caller of the
- * fill_delalloc() callback already does proper cleanup for the first page of
- * the range, that is, it invokes the callback writepage_end_io_hook() for the
- * range of the first page.
+ * extent (btrfs_finish_ordered_io()).
*/
static inline void btrfs_cleanup_ordered_extents(struct inode *inode,
- const u64 offset,
- const u64 bytes)
+ struct page *locked_page,
+ u64 offset, u64 bytes)
{
unsigned long index = offset >> PAGE_SHIFT;
unsigned long end_index = (offset + bytes - 1) >> PAGE_SHIFT;
+ u64 page_start = page_offset(locked_page);
+ u64 page_end = page_start + PAGE_SIZE - 1;
+
struct page *page;
while (index <= end_index) {
@@ -130,8 +130,18 @@ static inline void btrfs_cleanup_ordered_extents(struct inode *inode,
ClearPagePrivate2(page);
put_page(page);
}
- return __endio_write_update_ordered(inode, offset + PAGE_SIZE,
- bytes - PAGE_SIZE, false);
+
+ /*
+ * In case this page belongs to the delalloc range being instantiated
+ * then skip it, since the first page of a range is going to be
+ * properly cleaned up by the caller of run_delalloc_range
+ */
+ if (page_start >= offset && page_end <= (offset + bytes - 1) {
+ offset += PAGE_SIZE;
+ bytes -= PAGE_SIZE;
+ }
+
+ return __endio_write_update_ordered(inode, offset, bytes, false);
}
static int btrfs_dirty_inode(struct inode *inode);
@@ -1605,7 +1615,8 @@ static int run_delalloc_range(void *private_data, struct page *locked_page,
write_flags);
}
if (ret)
- btrfs_cleanup_ordered_extents(inode, start, end - start + 1);
+ btrfs_cleanup_ordered_extents(inode, locked_page, start,
+ end - start + 1);
return ret;
}
--
2.7.4
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] btrfs: Fix error handling in btrfs_cleanup_ordered_extents
2018-10-26 11:13 [PATCH] btrfs: Fix error handling in btrfs_cleanup_ordered_extents Nikolay Borisov
@ 2018-10-26 11:29 ` Holger Hoffstätte
2018-10-26 12:33 ` kbuild test robot
2018-10-26 13:13 ` kbuild test robot
2 siblings, 0 replies; 4+ messages in thread
From: Holger Hoffstätte @ 2018-10-26 11:29 UTC (permalink / raw)
To: Nikolay Borisov, dsterba; +Cc: linux-btrfs, wqu
On 10/26/18 13:13, Nikolay Borisov wrote:
> + if (page_start >= offset && page_end <= (offset + bytes - 1) {
fs/btrfs/inode.c: In function 'btrfs_cleanup_ordered_extents':
fs/btrfs/inode.c:140:62: error: expected ')' before '{' token
if (page_start >= offset && page_end <= (offset + bytes - 1) {
~ ^~
)
You're welcome :)
Holger
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] btrfs: Fix error handling in btrfs_cleanup_ordered_extents
2018-10-26 11:13 [PATCH] btrfs: Fix error handling in btrfs_cleanup_ordered_extents Nikolay Borisov
2018-10-26 11:29 ` Holger Hoffstätte
@ 2018-10-26 12:33 ` kbuild test robot
2018-10-26 13:13 ` kbuild test robot
2 siblings, 0 replies; 4+ messages in thread
From: kbuild test robot @ 2018-10-26 12:33 UTC (permalink / raw)
To: Nikolay Borisov; +Cc: kbuild-all, dsterba, linux-btrfs, wqu, Nikolay Borisov
[-- Attachment #1: Type: text/plain, Size: 3786 bytes --]
Hi Nikolay,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on v4.19]
[also build test ERROR on next-20181019]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Nikolay-Borisov/btrfs-Fix-error-handling-in-btrfs_cleanup_ordered_extents/20181026-194005
config: i386-randconfig-x007-201842 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386
All errors (new ones prefixed by >>):
fs//btrfs/inode.c: In function 'btrfs_cleanup_ordered_extents':
>> fs//btrfs/inode.c:140:63: error: expected ')' before '{' token
if (page_start >= offset && page_end <= (offset + bytes - 1) {
^
>> fs//btrfs/inode.c:146:1: error: expected expression before '}' token
}
^
vim +140 fs//btrfs/inode.c
86
87 static int btrfs_setsize(struct inode *inode, struct iattr *attr);
88 static int btrfs_truncate(struct inode *inode, bool skip_writeback);
89 static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent);
90 static noinline int cow_file_range(struct inode *inode,
91 struct page *locked_page,
92 u64 start, u64 end, u64 delalloc_end,
93 int *page_started, unsigned long *nr_written,
94 int unlock, struct btrfs_dedupe_hash *hash);
95 static struct extent_map *create_io_em(struct inode *inode, u64 start, u64 len,
96 u64 orig_start, u64 block_start,
97 u64 block_len, u64 orig_block_len,
98 u64 ram_bytes, int compress_type,
99 int type);
100
101 static void __endio_write_update_ordered(struct inode *inode,
102 const u64 offset, const u64 bytes,
103 const bool uptodate);
104
105 /*
106 * Cleanup all submitted ordered extents in specified range to handle errors
107 * from the fill_dellaloc() callback.
108 *
109 * NOTE: caller must ensure that when an error happens, it can not call
110 * extent_clear_unlock_delalloc() to clear both the bits EXTENT_DO_ACCOUNTING
111 * and EXTENT_DELALLOC simultaneously, because that causes the reserved metadata
112 * to be released, which we want to happen only when finishing the ordered
113 * extent (btrfs_finish_ordered_io()).
114 */
115 static inline void btrfs_cleanup_ordered_extents(struct inode *inode,
116 struct page *locked_page,
117 u64 offset, u64 bytes)
118 {
119 unsigned long index = offset >> PAGE_SHIFT;
120 unsigned long end_index = (offset + bytes - 1) >> PAGE_SHIFT;
121 u64 page_start = page_offset(locked_page);
122 u64 page_end = page_start + PAGE_SIZE - 1;
123
124 struct page *page;
125
126 while (index <= end_index) {
127 page = find_get_page(inode->i_mapping, index);
128 index++;
129 if (!page)
130 continue;
131 ClearPagePrivate2(page);
132 put_page(page);
133 }
134
135 /*
136 * In case this page belongs to the delalloc range being instantiated
137 * then skip it, since the first page of a range is going to be
138 * properly cleaned up by the caller of run_delalloc_range
139 */
> 140 if (page_start >= offset && page_end <= (offset + bytes - 1) {
141 offset += PAGE_SIZE;
142 bytes -= PAGE_SIZE;
143 }
144
145 return __endio_write_update_ordered(inode, offset, bytes, false);
> 146 }
147
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28220 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] btrfs: Fix error handling in btrfs_cleanup_ordered_extents
2018-10-26 11:13 [PATCH] btrfs: Fix error handling in btrfs_cleanup_ordered_extents Nikolay Borisov
2018-10-26 11:29 ` Holger Hoffstätte
2018-10-26 12:33 ` kbuild test robot
@ 2018-10-26 13:13 ` kbuild test robot
2 siblings, 0 replies; 4+ messages in thread
From: kbuild test robot @ 2018-10-26 13:13 UTC (permalink / raw)
To: Nikolay Borisov; +Cc: kbuild-all, dsterba, linux-btrfs, wqu, Nikolay Borisov
[-- Attachment #1: Type: text/plain, Size: 3753 bytes --]
Hi Nikolay,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on v4.19]
[also build test ERROR on next-20181019]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Nikolay-Borisov/btrfs-Fix-error-handling-in-btrfs_cleanup_ordered_extents/20181026-194005
config: x86_64-randconfig-x014-201842 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All error/warnings (new ones prefixed by >>):
fs/btrfs/inode.c: In function 'btrfs_cleanup_ordered_extents':
>> fs/btrfs/inode.c:10618:0: error: unterminated argument list invoking macro "if"
};
>> fs/btrfs/inode.c:140:2: error: expected '(' at end of input
if (page_start >= offset && page_end <= (offset + bytes - 1) {
^~
>> fs/btrfs/inode.c:140:2: warning: this 'if' clause does not guard... [-Wmisleading-indentation]
fs/btrfs/inode.c:10618:0: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the 'if'
};
>> fs/btrfs/inode.c:140:2: error: expected declaration or statement at end of input
if (page_start >= offset && page_end <= (offset + bytes - 1) {
^~
fs/btrfs/inode.c:122:6: warning: unused variable 'page_end' [-Wunused-variable]
u64 page_end = page_start + PAGE_SIZE - 1;
^~~~~~~~
fs/btrfs/inode.c: At top level:
fs/btrfs/inode.c:87:12: warning: 'btrfs_setsize' declared 'static' but never defined [-Wunused-function]
static int btrfs_setsize(struct inode *inode, struct iattr *attr);
^~~~~~~~~~~~~
fs/btrfs/inode.c:88:12: warning: 'btrfs_truncate' declared 'static' but never defined [-Wunused-function]
static int btrfs_truncate(struct inode *inode, bool skip_writeback);
^~~~~~~~~~~~~~
fs/btrfs/inode.c:89:12: warning: 'btrfs_finish_ordered_io' declared 'static' but never defined [-Wunused-function]
static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent);
^~~~~~~~~~~~~~~~~~~~~~~
fs/btrfs/inode.c:90:21: warning: 'cow_file_range' declared 'static' but never defined [-Wunused-function]
static noinline int cow_file_range(struct inode *inode,
^~~~~~~~~~~~~~
fs/btrfs/inode.c:95:27: warning: 'create_io_em' declared 'static' but never defined [-Wunused-function]
static struct extent_map *create_io_em(struct inode *inode, u64 start, u64 len,
^~~~~~~~~~~~
fs/btrfs/inode.c:101:13: warning: '__endio_write_update_ordered' declared 'static' but never defined [-Wunused-function]
static void __endio_write_update_ordered(struct inode *inode,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~
fs/btrfs/inode.c:71:27: warning: 'btrfs_inode_cachep' defined but not used [-Wunused-variable]
static struct kmem_cache *btrfs_inode_cachep;
^~~~~~~~~~~~~~~~~~
vim +/if +10618 fs/btrfs/inode.c
76dda93c Yan, Zheng 2009-09-21 10615
82d339d9 Alexey Dobriyan 2009-10-09 10616 const struct dentry_operations btrfs_dentry_operations = {
76dda93c Yan, Zheng 2009-09-21 10617 .d_delete = btrfs_dentry_delete,
76dda93c Yan, Zheng 2009-09-21 @10618 };
:::::: The code at line 10618 was first introduced by commit
:::::: 76dda93c6ae2c1dc3e6cde34569d6aca26b0c918 Btrfs: add snapshot/subvolume destroy ioctl
:::::: TO: Yan, Zheng <zheng.yan@oracle.com>
:::::: CC: Chris Mason <chris.mason@oracle.com>
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34678 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2018-10-26 13:15 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-26 11:13 [PATCH] btrfs: Fix error handling in btrfs_cleanup_ordered_extents Nikolay Borisov
2018-10-26 11:29 ` Holger Hoffstätte
2018-10-26 12:33 ` kbuild test robot
2018-10-26 13:13 ` kbuild test robot
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.