Stable Archive on lore.kernel.org
 help / color / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Zdenek Sojka <zsojka@seznam.cz>,
	Stefan Priebe - Profihost AG <s.priebe@profihost.ag>,
	Drazen Kacar <drazen.kacar@oradian.com>,
	Filipe Manana <fdmanana@suse.com>,
	David Sterba <dsterba@suse.com>
Subject: [PATCH 5.2 35/37] Btrfs: fix unwritten extent buffers and hangs on future writeback attempts
Date: Fri, 13 Sep 2019 14:07:40 +0100
Message-ID: <20190913130522.060024148@linuxfoundation.org> (raw)
In-Reply-To: <20190913130510.727515099@linuxfoundation.org>

From: Filipe Manana <fdmanana@suse.com>

commit 18dfa7117a3f379862dcd3f67cadd678013bb9dd upstream.

The lock_extent_buffer_io() returns 1 to the caller to tell it everything
went fine and the callers needs to start writeback for the extent buffer
(submit a bio, etc), 0 to tell the caller everything went fine but it does
not need to start writeback for the extent buffer, and a negative value if
some error happened.

When it's about to return 1 it tries to lock all pages, and if a try lock
on a page fails, and we didn't flush any existing bio in our "epd", it
calls flush_write_bio(epd) and overwrites the return value of 1 to 0 or
an error. The page might have been locked elsewhere, not with the goal
of starting writeback of the extent buffer, and even by some code other
than btrfs, like page migration for example, so it does not mean the
writeback of the extent buffer was already started by some other task,
so returning a 0 tells the caller (btree_write_cache_pages()) to not
start writeback for the extent buffer. Note that epd might currently have
either no bio, so flush_write_bio() returns 0 (success) or it might have
a bio for another extent buffer with a lower index (logical address).

Since we return 0 with the EXTENT_BUFFER_WRITEBACK bit set on the
extent buffer and writeback is never started for the extent buffer,
future attempts to writeback the extent buffer will hang forever waiting
on that bit to be cleared, since it can only be cleared after writeback
completes. Such hang is reported with a trace like the following:

  [49887.347053] INFO: task btrfs-transacti:1752 blocked for more than 122 seconds.
  [49887.347059]       Not tainted 5.2.13-gentoo #2
  [49887.347060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [49887.347062] btrfs-transacti D    0  1752      2 0x80004000
  [49887.347064] Call Trace:
  [49887.347069]  ? __schedule+0x265/0x830
  [49887.347071]  ? bit_wait+0x50/0x50
  [49887.347072]  ? bit_wait+0x50/0x50
  [49887.347074]  schedule+0x24/0x90
  [49887.347075]  io_schedule+0x3c/0x60
  [49887.347077]  bit_wait_io+0x8/0x50
  [49887.347079]  __wait_on_bit+0x6c/0x80
  [49887.347081]  ? __lock_release.isra.29+0x155/0x2d0
  [49887.347083]  out_of_line_wait_on_bit+0x7b/0x80
  [49887.347084]  ? var_wake_function+0x20/0x20
  [49887.347087]  lock_extent_buffer_for_io+0x28c/0x390
  [49887.347089]  btree_write_cache_pages+0x18e/0x340
  [49887.347091]  do_writepages+0x29/0xb0
  [49887.347093]  ? kmem_cache_free+0x132/0x160
  [49887.347095]  ? convert_extent_bit+0x544/0x680
  [49887.347097]  filemap_fdatawrite_range+0x70/0x90
  [49887.347099]  btrfs_write_marked_extents+0x53/0x120
  [49887.347100]  btrfs_write_and_wait_transaction.isra.4+0x38/0xa0
  [49887.347102]  btrfs_commit_transaction+0x6bb/0x990
  [49887.347103]  ? start_transaction+0x33e/0x500
  [49887.347105]  transaction_kthread+0x139/0x15c

So fix this by not overwriting the return value (ret) with the result
from flush_write_bio(). We also need to clear the EXTENT_BUFFER_WRITEBACK
bit in case flush_write_bio() returns an error, otherwise it will hang
any future attempts to writeback the extent buffer, and undo all work
done before (set back EXTENT_BUFFER_DIRTY, etc).

This is a regression introduced in the 5.2 kernel.

Fixes: 2e3c25136adfb ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()")
Fixes: f4340622e0226 ("btrfs: extent_io: Move the BUG_ON() in flush_write_bio() one level up")
Reported-by: Zdenek Sojka <zsojka@seznam.cz>
Link: https://lore.kernel.org/linux-btrfs/GpO.2yos.3WGDOLpx6t%7D.1TUDYM@seznam.cz/T/#u
Reported-by: Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Link: https://lore.kernel.org/linux-btrfs/5c4688ac-10a7-fb07-70e8-c5d31a3fbb38@profihost.ag/T/#t
Reported-by: Drazen Kacar <drazen.kacar@oradian.com>
Link: https://lore.kernel.org/linux-btrfs/DB8PR03MB562876ECE2319B3E579590F799C80@DB8PR03MB5628.eurprd03.prod.outlook.com/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204377
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/extent_io.c |   35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3591,6 +3591,13 @@ void wait_on_extent_buffer_writeback(str
 		       TASK_UNINTERRUPTIBLE);
 }
 
+static void end_extent_buffer_writeback(struct extent_buffer *eb)
+{
+	clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
+	smp_mb__after_atomic();
+	wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK);
+}
+
 /*
  * Lock eb pages and flush the bio if we can't the locks
  *
@@ -3662,8 +3669,11 @@ static noinline_for_stack int lock_exten
 
 		if (!trylock_page(p)) {
 			if (!flush) {
-				ret = flush_write_bio(epd);
-				if (ret < 0) {
+				int err;
+
+				err = flush_write_bio(epd);
+				if (err < 0) {
+					ret = err;
 					failed_page_nr = i;
 					goto err_unlock;
 				}
@@ -3678,16 +3688,23 @@ err_unlock:
 	/* Unlock already locked pages */
 	for (i = 0; i < failed_page_nr; i++)
 		unlock_page(eb->pages[i]);
+	/*
+	 * Clear EXTENT_BUFFER_WRITEBACK and wake up anyone waiting on it.
+	 * Also set back EXTENT_BUFFER_DIRTY so future attempts to this eb can
+	 * be made and undo everything done before.
+	 */
+	btrfs_tree_lock(eb);
+	spin_lock(&eb->refs_lock);
+	set_bit(EXTENT_BUFFER_DIRTY, &eb->bflags);
+	end_extent_buffer_writeback(eb);
+	spin_unlock(&eb->refs_lock);
+	percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, eb->len,
+				 fs_info->dirty_metadata_batch);
+	btrfs_clear_header_flag(eb, BTRFS_HEADER_FLAG_WRITTEN);
+	btrfs_tree_unlock(eb);
 	return ret;
 }
 
-static void end_extent_buffer_writeback(struct extent_buffer *eb)
-{
-	clear_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags);
-	smp_mb__after_atomic();
-	wake_up_bit(&eb->bflags, EXTENT_BUFFER_WRITEBACK);
-}
-
 static void set_btree_ioerr(struct page *page)
 {
 	struct extent_buffer *eb = (struct extent_buffer *)page->private;



  parent reply index

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-13 13:07 [PATCH 5.2 00/37] 5.2.15-stable review Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 01/37] gpio: pca953x: correct type of reg_direction Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 02/37] gpio: pca953x: use pca953x_read_regs instead of regmap_bulk_read Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 03/37] ALSA: hda - Fix potential endless loop at applying quirks Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 04/37] ALSA: hda/realtek - Fix overridden device-specific initialization Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 05/37] ALSA: hda/realtek - Add quirk for HP Pavilion 15 Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 06/37] ALSA: hda/realtek - Enable internal speaker & headset mic of ASUS UX431FL Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 07/37] ALSA: hda/realtek - Fix the problem of two front mics on a ThinkCentre Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 08/37] sched/fair: Dont assign runtime for throttled cfs_rq Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 09/37] drm/vmwgfx: Fix double free in vmw_recv_msg() Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 10/37] drm/nouveau/sec2/gp102: add missing MODULE_FIRMWAREs Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 11/37] vhost/test: fix build for vhost test Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 12/37] vhost/test: fix build for vhost test - again Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 13/37] powerpc/64e: Drop stale call to smp_processor_id() which hangs SMP startup Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 14/37] powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 15/37] powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 16/37] batman-adv: fix uninit-value in batadv_netlink_get_ifindex() Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 17/37] batman-adv: Only read OGM tvlv_len after buffer len check Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 18/37] bcache: only clear BTREE_NODE_dirty bit when it is set Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 19/37] bcache: add comments for mutex_lock(&b->write_lock) Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 20/37] bcache: fix race in btree_flush_write() Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 21/37] IB/rdmavt: Add new completion inline Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 22/37] IB/{rdmavt, qib, hfi1}: Convert to new completion API Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 23/37] IB/hfi1: Unreserve a flushed OPFN request Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 24/37] drm/i915: Disable SAMPLER_STATE prefetching on all Gen11 steppings Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 25/37] drm/i915: Make sure cdclk is high enough for DP audio on VLV/CHV Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 26/37] mmc: sdhci-sprd: Fix the incorrect soft reset operation when runtime resuming Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 27/37] usb: chipidea: imx: add imx7ulp support Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 28/37] usb: chipidea: imx: fix EPROBE_DEFER support during driver probe Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 29/37] virtio/s390: fix race on airq_areas[] Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 30/37] drm/i915: Support flags in whitlist WAs Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 31/37] drm/i915: Support whitelist workarounds on all engines Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 32/37] drm/i915: whitelist PS_(DEPTH|INVOCATION)_COUNT Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 33/37] drm/i915: Add whitelist workarounds for ICL Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 34/37] drm/i915/icl: whitelist PS_(DEPTH|INVOCATION)_COUNT Greg Kroah-Hartman
2019-09-13 13:07 ` Greg Kroah-Hartman [this message]
2019-09-13 13:07 ` [PATCH 5.2 36/37] vhost: block speculation of translated descriptors Greg Kroah-Hartman
2019-09-14  0:54   ` Stefan Lippers-Hollmann
2019-09-14  5:50     ` Greg Kroah-Hartman
2019-09-14  7:15       ` Stefan Lippers-Hollmann
2019-09-14  8:08         ` Greg Kroah-Hartman
2019-09-15  9:34           ` Thomas Backlund
2019-09-15 13:37             ` Greg Kroah-Hartman
2019-09-13 13:07 ` [PATCH 5.2 37/37] vhost: make sure log_num < in_num Greg Kroah-Hartman
2019-09-13 19:39 ` [PATCH 5.2 00/37] 5.2.15-stable review kernelci.org bot
2019-09-14  4:26 ` Naresh Kamboju
2019-09-14  7:43   ` Greg Kroah-Hartman
2019-09-14 14:08 ` Guenter Roeck
2019-09-15 13:34 ` Greg Kroah-Hartman
2019-09-16  9:25 ` Jon Hunter
2019-09-16 10:41   ` Greg Kroah-Hartman

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190913130522.060024148@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=drazen.kacar@oradian.com \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=s.priebe@profihost.ag \
    --cc=stable@vger.kernel.org \
    --cc=zsojka@seznam.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Stable Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/stable/0 stable/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 stable stable/ https://lore.kernel.org/stable \
		stable@vger.kernel.org
	public-inbox-index stable

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.stable


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git