* FAILED: patch "[PATCH] btrfs: flush write bio if we loop in extent_write_cache_pages" failed to apply to 4.9-stable tree
@ 2020-02-09 12:03 gregkh
2020-02-09 19:14 ` Sasha Levin
0 siblings, 1 reply; 2+ messages in thread
From: gregkh @ 2020-02-09 12:03 UTC (permalink / raw)
To: josef, dsterba, fdmanana; +Cc: stable
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 42ffb0bf584ae5b6b38f72259af1e0ee417ac77f Mon Sep 17 00:00:00 2001
From: Josef Bacik <josef@toxicpanda.com>
Date: Thu, 23 Jan 2020 15:33:02 -0500
Subject: [PATCH] btrfs: flush write bio if we loop in extent_write_cache_pages
There exists a deadlock with range_cyclic that has existed forever. If
we loop around with a bio already built we could deadlock with a writer
who has the page locked that we're attempting to write but is waiting on
a page in our bio to be written out. The task traces are as follows
PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5"
#0 [ffffc900297bb658] __schedule at ffffffff81a4c33f
#1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3
#2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42
#3 [ffffc900297bb708] __lock_page at ffffffff811f145b
#4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502
#5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684
#6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff
#7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0
#8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2
#9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd
PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND:
"aio-dio-invalid"
#0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f
#1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3
#2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42
#3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6
#4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7
#5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359
#6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933
#7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8
#8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d
#9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032
I used drgn to find the respective pages we were stuck on
page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901
page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874
As you can see the kworker is waiting for bit 0 (PG_locked) on index
7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index
8148. aio-dio-invalid has 7680, and the kworker epd looks like the
following
crash> struct extent_page_data ffffc900297bbbb0
struct extent_page_data {
bio = 0xffff889f747ed830,
tree = 0xffff889eed6ba448,
extent_locked = 0,
sync_io = 0
}
Probably worth mentioning as well that it waits for writeback of the
page to complete while holding a lock on it (at prepare_pages()).
Using drgn I walked the bio pages looking for page
0xffffea00fbfc7500 which is the one we're waiting for writeback on
bio = Object(prog, 'struct bio', address=0xffff889f747ed830)
for i in range(0, bio.bi_vcnt.value_()):
bv = bio.bi_io_vec[i]
if bv.bv_page.value_() == 0xffffea00fbfc7500:
print("FOUND IT")
which validated what I suspected.
The fix for this is simple, flush the epd before we loop back around to
the beginning of the file during writeout.
Fixes: b293f02e1423 ("Btrfs: Add writepages support")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index e2d30287e2d5..8ff17bc30d5a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4166,7 +4166,16 @@ static int extent_write_cache_pages(struct address_space *mapping,
*/
scanned = 1;
index = 0;
- goto retry;
+
+ /*
+ * If we're looping we could run into a page that is locked by a
+ * writer and that writer could be waiting on writeback for a
+ * page in our current bio, and thus deadlock, so flush the
+ * write bio here.
+ */
+ ret = flush_write_bio(epd);
+ if (!ret)
+ goto retry;
}
if (wbc->range_cyclic || (wbc->nr_to_write > 0 && range_whole))
^ permalink raw reply related [flat|nested] 2+ messages in thread
* Re: FAILED: patch "[PATCH] btrfs: flush write bio if we loop in extent_write_cache_pages" failed to apply to 4.9-stable tree
2020-02-09 12:03 FAILED: patch "[PATCH] btrfs: flush write bio if we loop in extent_write_cache_pages" failed to apply to 4.9-stable tree gregkh
@ 2020-02-09 19:14 ` Sasha Levin
0 siblings, 0 replies; 2+ messages in thread
From: Sasha Levin @ 2020-02-09 19:14 UTC (permalink / raw)
To: gregkh; +Cc: josef, dsterba, fdmanana, stable
On Sun, Feb 09, 2020 at 01:03:49PM +0100, gregkh@linuxfoundation.org wrote:
>
>The patch below does not apply to the 4.9-stable tree.
>If someone wants it applied there, or to any other stable or longterm
>tree, then please email the backport, including the original git commit
>id to <stable@vger.kernel.org>.
>
>thanks,
>
>greg k-h
>
>------------------ original commit in Linus's tree ------------------
>
>From 42ffb0bf584ae5b6b38f72259af1e0ee417ac77f Mon Sep 17 00:00:00 2001
>From: Josef Bacik <josef@toxicpanda.com>
>Date: Thu, 23 Jan 2020 15:33:02 -0500
>Subject: [PATCH] btrfs: flush write bio if we loop in extent_write_cache_pages
>
>There exists a deadlock with range_cyclic that has existed forever. If
>we loop around with a bio already built we could deadlock with a writer
>who has the page locked that we're attempting to write but is waiting on
>a page in our bio to be written out. The task traces are as follows
>
> PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5"
> #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f
> #1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3
> #2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42
> #3 [ffffc900297bb708] __lock_page at ffffffff811f145b
> #4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502
> #5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684
> #6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff
> #7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0
> #8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2
> #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd
>
> PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND:
> "aio-dio-invalid"
> #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f
> #1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3
> #2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42
> #3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6
> #4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7
> #5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359
> #6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933
> #7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8
> #8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d
> #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032
>
>I used drgn to find the respective pages we were stuck on
>
>page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901
>page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874
>
>As you can see the kworker is waiting for bit 0 (PG_locked) on index
>7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index
>8148. aio-dio-invalid has 7680, and the kworker epd looks like the
>following
>
> crash> struct extent_page_data ffffc900297bbbb0
> struct extent_page_data {
> bio = 0xffff889f747ed830,
> tree = 0xffff889eed6ba448,
> extent_locked = 0,
> sync_io = 0
> }
>
>Probably worth mentioning as well that it waits for writeback of the
>page to complete while holding a lock on it (at prepare_pages()).
>
>Using drgn I walked the bio pages looking for page
>0xffffea00fbfc7500 which is the one we're waiting for writeback on
>
> bio = Object(prog, 'struct bio', address=0xffff889f747ed830)
> for i in range(0, bio.bi_vcnt.value_()):
> bv = bio.bi_io_vec[i]
> if bv.bv_page.value_() == 0xffffea00fbfc7500:
> print("FOUND IT")
>
>which validated what I suspected.
>
>The fix for this is simple, flush the epd before we loop back around to
>the beginning of the file during writeout.
>
>Fixes: b293f02e1423 ("Btrfs: Add writepages support")
>CC: stable@vger.kernel.org # 4.4+
>Reviewed-by: Filipe Manana <fdmanana@suse.com>
>Signed-off-by: Josef Bacik <josef@toxicpanda.com>
>Signed-off-by: David Sterba <dsterba@suse.com>
There was a build error here because we didn't have aab6e9edf07f
("btrfs: unify extent_page_data type passed as void") on 4.9 and older.
Fixed and queued for 4.9-4.4.
--
Thanks,
Sasha
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2020-02-09 19:14 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-09 12:03 FAILED: patch "[PATCH] btrfs: flush write bio if we loop in extent_write_cache_pages" failed to apply to 4.9-stable tree gregkh
2020-02-09 19:14 ` Sasha Levin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).