All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Zdenek Sojka <zsojka@seznam.cz>,
	Stefan Priebe - Profihost AG <s.priebe@profihost.ag>,
	Drazen Kacar <drazen.kacar@oradian.com>,
	Filipe Manana <fdmanana@suse.com>,
	David Sterba <dsterba@suse.com>,
	Ben Hutchings <ben.hutchings@codethink.co.uk>
Subject: [PATCH 4.19 20/71] Btrfs: fix unwritten extent buffers and hangs on future writeback attempts
Date: Mon,  9 Nov 2020 13:55:14 +0100	[thread overview]
Message-ID: <20201109125020.855552422@linuxfoundation.org> (raw)
In-Reply-To: <20201109125019.906191744@linuxfoundation.org>

From: Filipe Manana <fdmanana@suse.com>

commit 18dfa7117a3f379862dcd3f67cadd678013bb9dd upstream.

The lock_extent_buffer_io() returns 1 to the caller to tell it everything
went fine and the callers needs to start writeback for the extent buffer
(submit a bio, etc), 0 to tell the caller everything went fine but it does
not need to start writeback for the extent buffer, and a negative value if
some error happened.

When it's about to return 1 it tries to lock all pages, and if a try lock
on a page fails, and we didn't flush any existing bio in our "epd", it
calls flush_write_bio(epd) and overwrites the return value of 1 to 0 or
an error. The page might have been locked elsewhere, not with the goal
of starting writeback of the extent buffer, and even by some code other
than btrfs, like page migration for example, so it does not mean the
writeback of the extent buffer was already started by some other task,
so returning a 0 tells the caller (btree_write_cache_pages()) to not
start writeback for the extent buffer. Note that epd might currently have
either no bio, so flush_write_bio() returns 0 (success) or it might have
a bio for another extent buffer with a lower index (logical address).

Since we return 0 with the EXTENT_BUFFER_WRITEBACK bit set on the
extent buffer and writeback is never started for the extent buffer,
future attempts to writeback the extent buffer will hang forever waiting
on that bit to be cleared, since it can only be cleared after writeback
completes. Such hang is reported with a trace like the following:

  [49887.347053] INFO: task btrfs-transacti:1752 blocked for more than 122 seconds.
  [49887.347059]       Not tainted 5.2.13-gentoo #2
  [49887.347060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [49887.347062] btrfs-transacti D    0  1752      2 0x80004000
  [49887.347064] Call Trace:
  [49887.347069]  ? __schedule+0x265/0x830
  [49887.347071]  ? bit_wait+0x50/0x50
  [49887.347072]  ? bit_wait+0x50/0x50
  [49887.347074]  schedule+0x24/0x90
  [49887.347075]  io_schedule+0x3c/0x60
  [49887.347077]  bit_wait_io+0x8/0x50
  [49887.347079]  __wait_on_bit+0x6c/0x80
  [49887.347081]  ? __lock_release.isra.29+0x155/0x2d0
  [49887.347083]  out_of_line_wait_on_bit+0x7b/0x80
  [49887.347084]  ? var_wake_function+0x20/0x20
  [49887.347087]  lock_extent_buffer_for_io+0x28c/0x390
  [49887.347089]  btree_write_cache_pages+0x18e/0x340
  [49887.347091]  do_writepages+0x29/0xb0
  [49887.347093]  ? kmem_cache_free+0x132/0x160
  [49887.347095]  ? convert_extent_bit+0x544/0x680
  [49887.347097]  filemap_fdatawrite_range+0x70/0x90
  [49887.347099]  btrfs_write_marked_extents+0x53/0x120
  [49887.347100]  btrfs_write_and_wait_transaction.isra.4+0x38/0xa0
  [49887.347102]  btrfs_commit_transaction+0x6bb/0x990
  [49887.347103]  ? start_transaction+0x33e/0x500
  [49887.347105]  transaction_kthread+0x139/0x15c

So fix this by not overwriting the return value (ret) with the result
from flush_write_bio(). We also need to clear the EXTENT_BUFFER_WRITEBACK
bit in case flush_write_bio() returns an error, otherwise it will hang
any future attempts to writeback the extent buffer, and undo all work
done before (set back EXTENT_BUFFER_DIRTY, etc).

This is a regression introduced in the 5.2 kernel.

Fixes: 2e3c25136adfb ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()")
Fixes: f4340622e0226 ("btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up")
Reported-by: Zdenek Sojka <zsojka@seznam.cz>
Link: https://lore.kernel.org/linux-btrfs/GpO.2yos.3WGDOLpx6t%7D.1TUDYM@seznam.cz/T/#u
Reported-by: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Link: https://lore.kernel.org/linux-btrfs/5c4688ac-10a7-fb07-70e8-c5d31a3fbb38@profihost.ag/T/#t
Reported-by: Drazen Kacar <drazen.kacar@oradian.com>
Link: https://lore.kernel.org/linux-btrfs/DB8PR03MB562876ECE2319B3E579590F799C80@DB8PR03MB5628.eurprd03.prod.outlook.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204377
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/extent_io.c |   35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3554,6 +3554,13 @@ void wait_on_extent_buffer_writeback(str
 		       TASK_UNINTERRUPTIBLE);
 }
 
+static void end_extent_buffer_writeback(struct extent_buffer *eb)
+{
+	clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
+	smp_mb__after_atomic();
+	wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK);
+}
+
 /*
  * Lock eb pages and flush the bio if we can't the locks
  *
@@ -3626,8 +3633,11 @@ lock_extent_buffer_for_io(struct extent_
 
 		if (!trylock_page(p)) {
 			if (!flush) {
-				ret = flush_write_bio(epd);
-				if (ret < 0) {
+				int err;
+
+				err = flush_write_bio(epd);
+				if (err < 0) {
+					ret = err;
 					failed_page_nr = i;
 					goto err_unlock;
 				}
@@ -3642,16 +3652,23 @@ err_unlock:
 	/* Unlock already locked pages */
 	for (i = 0; i < failed_page_nr; i++)
 		unlock_page(eb->pages[i]);
+	/*
+	 * Clear EXTENT_BUFFER_WRITEBACK and wake up anyone waiting on it.
+	 * Also set back EXTENT_BUFFER_DIRTY so future attempts to this eb can
+	 * be made and undo everything done before.
+	 */
+	btrfs_tree_lock(eb);
+	spin_lock(&eb->refs_lock);
+	set_bit(EXTENT_BUFFER_DIRTY, &eb->bflags);
+	end_extent_buffer_writeback(eb);
+	spin_unlock(&eb->refs_lock);
+	percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, eb->len,
+				 fs_info->dirty_metadata_batch);
+	btrfs_clear_header_flag(eb, BTRFS_HEADER_FLAG_WRITTEN);
+	btrfs_tree_unlock(eb);
 	return ret;
 }
 
-static void end_extent_buffer_writeback(struct extent_buffer *eb)
-{
-	clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
-	smp_mb__after_atomic();
-	wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK);
-}
-
 static void set_btree_ioerr(struct page *page)
 {
 	struct extent_buffer *eb = (struct extent_buffer *)page->private;



  parent reply	other threads:[~2020-11-09 13:08 UTC|newest]

Thread overview: 88+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-09 12:54 [PATCH 4.19 00/71] 4.19.156-rc1 review Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 4.19 01/71] drm/i915: Break up error capture compression loops with cond_resched() Greg Kroah-Hartman
2020-11-09 18:20   ` Pavel Machek
2020-11-09 12:54 ` [PATCH 4.19 02/71] tipc: fix use-after-free in tipc_bcast_get_mode Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 4.19 03/71] ptrace: fix task_join_group_stop() for the case when current is traced Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 4.19 04/71] cadence: force nonlinear buffers to be cloned Greg Kroah-Hartman
2020-11-09 12:54 ` [PATCH 4.19 05/71] chelsio/chtls: fix memory leaks caused by a race Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 06/71] chelsio/chtls: fix always leaking ctrl_skb Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 07/71] gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 08/71] gianfar: Account for Tx PTP timestamp in the skb headroom Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 09/71] net: usb: qmi_wwan: add Telit LE910Cx 0x1230 composition Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 10/71] sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian platforms Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 11/71] sfp: Fix error handing in sfp_probe() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 12/71] blktrace: fix debugfs use after free Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 13/71] btrfs: extent_io: Kill the forward declaration of flush_write_bio Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 14/71] btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 15/71] Revert "btrfs: flush write bio if we loop in extent_write_cache_pages" Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 16/71] btrfs: flush write bio if we loop in extent_write_cache_pages Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 17/71] btrfs: extent_io: Handle errors better in extent_write_full_page() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 18/71] btrfs: extent_io: Handle errors better in btree_write_cache_pages() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 19/71] btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io() Greg Kroah-Hartman
2020-11-11 12:44   ` Pavel Machek
2020-11-11 14:39     ` Ben Hutchings
2020-11-12 16:06       ` Sasha Levin
2020-11-09 12:55 ` Greg Kroah-Hartman [this message]
2020-11-09 12:55 ` [PATCH 4.19 21/71] btrfs: Dont submit any btree write bio if the fs has errors Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 22/71] btrfs: Move btrfs_check_chunk_valid() to tree-check.[ch] and export it Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 23/71] btrfs: tree-checker: Make chunk item checker messages more readable Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 24/71] btrfs: tree-checker: Make btrfs_check_chunk_valid() return EUCLEAN instead of EIO Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 25/71] btrfs: tree-checker: Check chunk item at tree block read time Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 26/71] btrfs: tree-checker: Verify dev item Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 27/71] btrfs: tree-checker: Fix wrong check on max devid Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 28/71] btrfs: tree-checker: Enhance chunk checker to validate chunk profile Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 29/71] btrfs: tree-checker: Verify inode item Greg Kroah-Hartman
2020-11-11 13:13   ` Pavel Machek
2020-11-11 13:30     ` Qu Wenruo
2020-11-11 13:38       ` Pavel Machek
2020-11-11 14:04         ` Qu Wenruo
2020-11-09 12:55 ` [PATCH 4.19 30/71] btrfs: tree-checker: fix the error message for transid error Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 31/71] Fonts: Replace discarded const qualifier Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 32/71] ALSA: usb-audio: Add implicit feedback quirk for Zoom UAC-2 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 33/71] ALSA: usb-audio: add usb vendor id as DSD-capable for Khadas devices Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 34/71] ALSA: usb-audio: Add implicit feedback quirk for Qu-16 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 35/71] ALSA: usb-audio: Add implicit feedback quirk for MODX Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 36/71] mm: mempolicy: fix potential pte_unmap_unlock pte error Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 37/71] lib/crc32test: remove extra local_irq_disable/enable Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 38/71] kthread_worker: prevent queuing delayed work from timer_fn when it is being canceled Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 39/71] mm: always have io_remap_pfn_range() set pgprot_decrypted() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 40/71] gfs2: Wake up when sd_glock_disposal becomes zero Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 41/71] ring-buffer: Fix recursion protection transitions between interrupt context Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 42/71] ftrace: Fix recursion check for NMI test Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 43/71] ftrace: Handle tracing when switching between context Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 44/71] tracing: Fix out of bounds write in get_trace_buf Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 45/71] futex: Handle transient "ownerless" rtmutex state correctly Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 46/71] ARM: dts: sun4i-a10: fix cpu_alert temperature Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 47/71] x86/kexec: Use up-to-dated screen_info copy to fill boot params Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 48/71] of: Fix reserved-memory overlap detection Greg Kroah-Hartman
2020-11-11 12:53   ` Pavel Machek
2020-11-11 14:34     ` Vincent Whitchurch
2020-11-09 12:55 ` [PATCH 4.19 49/71] blk-cgroup: Fix memleak on error path Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 50/71] blk-cgroup: Pre-allocate tree node on blkg_conf_prep Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 51/71] scsi: core: Dont start concurrent async scan on same host Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 52/71] vsock: use ns_capable_noaudit() on socket create Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 53/71] drm/vc4: drv: Add error handding for bind Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 54/71] ACPI: NFIT: Fix comparison to -ENXIO Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 55/71] vt: Disable KD_FONT_OP_COPY Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 56/71] fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 57/71] serial: 8250_mtk: Fix uart_get_baud_rate warning Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 58/71] serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 59/71] USB: serial: cyberjack: fix write-URB completion race Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 60/71] USB: serial: option: add Quectel EC200T module support Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 61/71] USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 62/71] USB: serial: option: add Telit FN980 composition 0x1055 Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 63/71] USB: Add NO_LPM quirk for Kingston flash drive Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 64/71] usb: mtu3: fix panic in mtu3_gadget_stop() Greg Kroah-Hartman
2020-11-09 12:55 ` [PATCH 4.19 65/71] ARC: stack unwinding: avoid indefinite looping Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 66/71] Revert "ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE" Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 67/71] PM: runtime: Resume the device earlier in __device_release_driver() Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 68/71] perf/core: Fix a memory leak in perf_event_parse_addr_filter() Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 69/71] tools: perf: Fix build error in v4.19.y Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 70/71] net: dsa: read mac address from DT for slave device Greg Kroah-Hartman
2020-11-09 12:56 ` [PATCH 4.19 71/71] arm64: dts: marvell: espressobin: Add ethernet switch aliases Greg Kroah-Hartman
2020-11-09 15:43 ` [PATCH 4.19 00/71] 4.19.156-rc1 review Jon Hunter
2020-11-09 19:22 ` Pavel Machek
2020-11-09 23:07 ` Guenter Roeck
2020-11-09 23:22 ` Shuah Khan
2020-11-10  7:44 ` Naresh Kamboju
2020-11-19  8:10 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201109125020.855552422@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=ben.hutchings@codethink.co.uk \
    --cc=drazen.kacar@oradian.com \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=s.priebe@profihost.ag \
    --cc=stable@vger.kernel.org \
    --cc=zsojka@seznam.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.