stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Filipe Manana <fdmanana@suse.com>,
	David Sterba <dsterba@suse.com>
Subject: [PATCH 4.9 36/83] Btrfs: fix race updating log root item during fsync
Date: Sun,  9 Jun 2019 18:42:06 +0200	[thread overview]
Message-ID: <20190609164130.842954351@linuxfoundation.org> (raw)
In-Reply-To: <20190609164127.843327870@linuxfoundation.org>

From: Filipe Manana <fdmanana@suse.com>

commit 06989c799f04810f6876900d4760c0edda369cf7 upstream.

When syncing the log, the final phase of a fsync operation, we need to
either create a log root's item or update the existing item in the log
tree of log roots, and that depends on the current value of the log
root's log_transid - if it's 1 we need to create the log root item,
otherwise it must exist already and we update it. Since there is no
synchronization between updating the log_transid and checking it for
deciding whether the log root's item needs to be created or updated, we
end up with a tiny race window that results in attempts to update the
item to fail because the item was not yet created:

              CPU 1                                    CPU 2

  btrfs_sync_log()

    lock root->log_mutex

    set log root's log_transid to 1

    unlock root->log_mutex

                                               btrfs_sync_log()

                                                 lock root->log_mutex

                                                 sets log root's
                                                 log_transid to 2

                                                 unlock root->log_mutex

    update_log_root()

      sees log root's log_transid
      with a value of 2

        calls btrfs_update_root(),
        which fails with -EUCLEAN
        and causes transaction abort

Until recently the race lead to a BUG_ON at btrfs_update_root(), but after
the recent commit 7ac1e464c4d47 ("btrfs: Don't panic when we can't find a
root key") we just abort the current transaction.

A sample trace of the BUG_ON() on a SLE12 kernel:

  ------------[ cut here ]------------
  kernel BUG at ../fs/btrfs/root-tree.c:157!
  Oops: Exception in kernel mode, sig: 5 [#1]
  SMP NR_CPUS=2048 NUMA pSeries
  (...)
  Supported: Yes, External
  CPU: 78 PID: 76303 Comm: rtas_errd Tainted: G                 X 4.4.156-94.57-default #1
  task: c00000ffa906d010 ti: c00000ff42b08000 task.ti: c00000ff42b08000
  NIP: d000000036ae5cdc LR: d000000036ae5cd8 CTR: 0000000000000000
  REGS: c00000ff42b0b860 TRAP: 0700   Tainted: G                 X  (4.4.156-94.57-default)
  MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 22444484  XER: 20000000
  CFAR: d000000036aba66c SOFTE: 1
  GPR00: d000000036ae5cd8 c00000ff42b0bae0 d000000036bda220 0000000000000054
  GPR04: 0000000000000001 0000000000000000 c00007ffff8d37c8 0000000000000000
  GPR08: c000000000e19c00 0000000000000000 0000000000000000 3736343438312079
  GPR12: 3930373337303434 c000000007a3a800 00000000007fffff 0000000000000023
  GPR16: c00000ffa9d26028 c00000ffa9d261f8 0000000000000010 c00000ffa9d2ab28
  GPR20: c00000ff42b0bc48 0000000000000001 c00000ff9f0d9888 0000000000000001
  GPR24: c00000ffa9d26000 c00000ffa9d261e8 c00000ffa9d2a800 c00000ff9f0d9888
  GPR28: c00000ffa9d26028 c00000ffa9d2aa98 0000000000000001 c00000ffa98f5b20
  NIP [d000000036ae5cdc] btrfs_update_root+0x25c/0x4e0 [btrfs]
  LR [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs]
  Call Trace:
  [c00000ff42b0bae0] [d000000036ae5cd8] btrfs_update_root+0x258/0x4e0 [btrfs] (unreliable)
  [c00000ff42b0bba0] [d000000036b53610] btrfs_sync_log+0x2d0/0xc60 [btrfs]
  [c00000ff42b0bce0] [d000000036b1785c] btrfs_sync_file+0x44c/0x4e0 [btrfs]
  [c00000ff42b0bd80] [c00000000032e300] vfs_fsync_range+0x70/0x120
  [c00000ff42b0bdd0] [c00000000032e44c] do_fsync+0x5c/0xb0
  [c00000ff42b0be10] [c00000000032e8dc] SyS_fdatasync+0x2c/0x40
  [c00000ff42b0be30] [c000000000009488] system_call+0x3c/0x100
  Instruction dump:
  7f43d378 4bffebb9 60000000 88d90008 3d220000 e8b90000 3b390009 e87a01f0
  e8898e08 e8f90000 4bfd48e5 60000000 <0fe00000> e95b0060 39200004 394a0ea0
  ---[ end trace 8f2dc8f919cabab8 ]---

So fix this by doing the check of log_transid and updating or creating the
log root's item while holding the root's log_mutex.

Fixes: 7237f1833601d ("Btrfs: fix tree logs parallel sync")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/btrfs/tree-log.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -2827,6 +2827,12 @@ int btrfs_sync_log(struct btrfs_trans_ha
 	log->log_transid = root->log_transid;
 	root->log_start_pid = 0;
 	/*
+	 * Update or create log root item under the root's log_mutex to prevent
+	 * races with concurrent log syncs that can lead to failure to update
+	 * log root item because it was not created yet.
+	 */
+	ret = update_log_root(trans, log);
+	/*
 	 * IO has been started, blocks of the log tree have WRITTEN flag set
 	 * in their headers. new modifications of the log will be written to
 	 * new positions. so it's safe to allow log writers to go in.
@@ -2845,8 +2851,6 @@ int btrfs_sync_log(struct btrfs_trans_ha
 
 	mutex_unlock(&log_root_tree->log_mutex);
 
-	ret = update_log_root(trans, log);
-
 	mutex_lock(&log_root_tree->log_mutex);
 	if (atomic_dec_and_test(&log_root_tree->log_writers)) {
 		/*



  parent reply	other threads:[~2019-06-09 17:16 UTC|newest]

Thread overview: 93+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-09 16:41 [PATCH 4.9 00/83] 4.9.181-stable review Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 01/83] ipv6: Consider sk_bound_dev_if when binding a raw socket to an address Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 02/83] llc: fix skb leak in llc_build_and_send_ui_pkt() Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 03/83] net: fec: fix the clk mismatch in failed_reset path Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 04/83] net-gro: fix use-after-free read in napi_gro_frags() Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 05/83] net: stmmac: fix reset gpio free missing Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 06/83] usbnet: fix kernel crash after disconnect Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 07/83] tipc: Avoid copying bytes beyond the supplied data Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 08/83] bnxt_en: Fix aggregation buffer leak under OOM condition Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 09/83] ipv4/igmp: fix another memory leak in igmpv3_del_delrec() Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 10/83] ipv4/igmp: fix build error if !CONFIG_IP_MULTICAST Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 11/83] net: dsa: mv88e6xxx: fix handling of upper half of STATS_TYPE_PORT Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 12/83] net: mvneta: Fix err code path of probe Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 13/83] net: mvpp2: fix bad MVPP2_TXQ_SCHED_TOKEN_CNTR_REG queue value Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 14/83] crypto: vmx - ghash: do nosimd fallback manually Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 15/83] xen/pciback: Dont disable PCI_COMMAND on PCI device reset Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 16/83] Revert "tipc: fix modprobe tipc failed after switch order of device registration" Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 17/83] tipc: fix modprobe tipc failed after switch order of device registration Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 18/83] sparc64: Fix regression in non-hypervisor TLB flush xcall Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 19/83] include/linux/bitops.h: sanitize rotate primitives Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 20/83] xhci: update bounce buffer with correct sg num Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 21/83] xhci: Use %zu for printing size_t type Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 22/83] xhci: Convert xhci_handshake() to use readl_poll_timeout_atomic() Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 23/83] usb: xhci: avoid null pointer deref when bos field is NULL Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 24/83] usbip: usbip_host: fix BUG: sleeping function called from invalid context Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 25/83] usbip: usbip_host: fix stub_dev lock context imbalance regression Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 26/83] USB: Fix slab-out-of-bounds write in usb_get_bos_descriptor Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 27/83] USB: sisusbvga: fix oops in error path of sisusb_probe Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 28/83] USB: Add LPM quirk for Surface Dock GigE adapter Greg Kroah-Hartman
2019-06-09 16:41 ` [PATCH 4.9 29/83] USB: rio500: refuse more than one device at a time Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 30/83] USB: rio500: fix memory leak in close after disconnect Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 31/83] media: usb: siano: Fix general protection fault in smsusb Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 32/83] media: usb: siano: Fix false-positive "uninitialized variable" warning Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 33/83] media: smsusb: better handle optional alignment Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 34/83] scsi: zfcp: fix missing zfcp_port reference put on -EBUSY from port_remove Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 35/83] scsi: zfcp: fix to prevent port_remove with pure auto scan LUNs (only sdevs) Greg Kroah-Hartman
2019-06-09 16:42 ` Greg Kroah-Hartman [this message]
2019-06-09 16:42 ` [PATCH 4.9 37/83] powerpc/perf: Fix MMCRA corruption by bhrb_filter Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 38/83] ALSA: hda/realtek - Set default power save node to 0 Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 39/83] drm/nouveau/i2c: Disable i2c bus access after ->fini() Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 40/83] tty: serial: msm_serial: Fix XON/XOFF Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 41/83] tty: max310x: Fix external crystal register setup Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 42/83] memcg: make it work on sparse non-0-node systems Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 43/83] kernel/signal.c: trace_signal_deliver when signal_group_exit Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 44/83] docs: Fix conf.py for Sphinx 2.0 Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 45/83] staging: vc04_services: prevent integer overflow in create_pagelist() Greg Kroah-Hartman
2019-06-19 16:02   ` Martin Weinelt
2019-06-19 17:13     ` Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 46/83] CIFS: cifs_read_allocate_pages: dont iterate through whole page array on ENOMEM Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 47/83] gcc-plugins: Fix build failures under Darwin host Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 48/83] drm/vmwgfx: Dont send drm sysfs hotplug events on initial master set Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 49/83] brcmfmac: add length checks in scheduled scan result handler Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 50/83] brcmfmac: assure SSID length from firmware is limited Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 51/83] brcmfmac: add subtype check for event handling in data path Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 52/83] binder: Replace "%p" with "%pK" for stable Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 53/83] binder: replace "%p" with "%pK" Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 54/83] fs: prevent page refcount overflow in pipe_buf_get Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 55/83] mm, gup: remove broken VM_BUG_ON_PAGE compound check for hugepages Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 56/83] mm, gup: ensure real head page is ref-counted when using hugepages Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 57/83] mm: prevent get_user_pages() from overflowing page refcount Greg Kroah-Hartman
2019-07-31 15:14   ` Vlastimil Babka
2019-06-09 16:42 ` [PATCH 4.9 58/83] mm: make page ref count overflow check tighter and more explicit Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 59/83] Revert "x86/build: Move _etext to actual end of .text" Greg Kroah-Hartman
2019-06-10 11:57   ` Willy Tarreau
2019-06-09 16:42 ` [PATCH 4.9 60/83] efi/libstub: Unify command line param parsing Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 61/83] media: uvcvideo: Fix uvc_alloc_entity() allocation alignment Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 62/83] ethtool: fix potential userspace buffer overflow Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 63/83] neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 64/83] net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 65/83] net: rds: fix memory leak in rds_ib_flush_mr_pool Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 66/83] pktgen: do not sleep with the thread lock held Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 67/83] ipv6: fix EFAULT on sendto with icmpv6 and hdrincl Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 68/83] ipv6: use READ_ONCE() for inet->hdrincl as in ipv4 Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 69/83] Revert "fib_rules: fix error in backport of e9919a24d302 ("fib_rules: return 0...")" Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 70/83] Revert "fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied" Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 71/83] rcu: locking and unlocking need to always be at least barriers Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 72/83] parisc: Use implicit space register selection for loading the coherence index of I/O pdirs Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 73/83] fuse: fallocate: fix return with locked inode Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 74/83] x86/power: Fix nosmt vs hibernation triple fault during resume Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 75/83] MIPS: pistachio: Build uImage.gz by default Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 76/83] Revert "MIPS: perf: ath79: Fix perfcount IRQ assignment" Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 77/83] genwqe: Prevent an integer overflow in the ioctl Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 78/83] drm/gma500/cdv: Check vbt config bits when detecting lvds panels Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 79/83] drm/radeon: prefer lower reference dividers Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 80/83] drm/i915: Fix I915_EXEC_RING_MASK Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 81/83] TTY: serial_core, add ->install Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 82/83] fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock Greg Kroah-Hartman
2019-06-09 16:42 ` [PATCH 4.9 83/83] fuse: Add FOPEN_STREAM to use stream_open() Greg Kroah-Hartman
2019-06-09 22:10 ` [PATCH 4.9 00/83] 4.9.181-stable review kernelci.org bot
2019-06-10  6:38 ` Naresh Kamboju
2019-06-10  8:50 ` Jon Hunter
2019-06-10 14:42 ` Guenter Roeck
2019-06-10 21:49 ` shuah

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190609164130.842954351@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).