linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Filipe Manana <fdmanana@suse.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.9 35/71] btrfs: fix potential deadlock in the search ioctl
Date: Fri, 11 Sep 2020 14:46:19 +0200	[thread overview]
Message-ID: <20200911122506.680591979@linuxfoundation.org> (raw)
In-Reply-To: <20200911122504.928931589@linuxfoundation.org>

From: Josef Bacik <josef@toxicpanda.com>

[ Upstream commit a48b73eca4ceb9b8a4b97f290a065335dbcd8a04 ]

With the conversion of the tree locks to rwsem I got the following
lockdep splat:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.8.0-rc7-00165-g04ec4da5f45f-dirty #922 Not tainted
  ------------------------------------------------------
  compsize/11122 is trying to acquire lock:
  ffff889fabca8768 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x3e/0x90

  but task is already holding lock:
  ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (btrfs-fs-00){++++}-{3:3}:
	 down_write_nested+0x3b/0x70
	 __btrfs_tree_lock+0x24/0x120
	 btrfs_search_slot+0x756/0x990
	 btrfs_lookup_inode+0x3a/0xb4
	 __btrfs_update_delayed_inode+0x93/0x270
	 btrfs_async_run_delayed_root+0x168/0x230
	 btrfs_work_helper+0xd4/0x570
	 process_one_work+0x2ad/0x5f0
	 worker_thread+0x3a/0x3d0
	 kthread+0x133/0x150
	 ret_from_fork+0x1f/0x30

  -> #1 (&delayed_node->mutex){+.+.}-{3:3}:
	 __mutex_lock+0x9f/0x930
	 btrfs_delayed_update_inode+0x50/0x440
	 btrfs_update_inode+0x8a/0xf0
	 btrfs_dirty_inode+0x5b/0xd0
	 touch_atime+0xa1/0xd0
	 btrfs_file_mmap+0x3f/0x60
	 mmap_region+0x3a4/0x640
	 do_mmap+0x376/0x580
	 vm_mmap_pgoff+0xd5/0x120
	 ksys_mmap_pgoff+0x193/0x230
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&mm->mmap_lock#2){++++}-{3:3}:
	 __lock_acquire+0x1272/0x2310
	 lock_acquire+0x9e/0x360
	 __might_fault+0x68/0x90
	 _copy_to_user+0x1e/0x80
	 copy_to_sk.isra.32+0x121/0x300
	 search_ioctl+0x106/0x200
	 btrfs_ioctl_tree_search_v2+0x7b/0xf0
	 btrfs_ioctl+0x106f/0x30a0
	 ksys_ioctl+0x83/0xc0
	 __x64_sys_ioctl+0x16/0x20
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  other info that might help us debug this:

  Chain exists of:
    &mm->mmap_lock#2 --> &delayed_node->mutex --> btrfs-fs-00

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(btrfs-fs-00);
				 lock(&delayed_node->mutex);
				 lock(btrfs-fs-00);
    lock(&mm->mmap_lock#2);

   *** DEADLOCK ***

  1 lock held by compsize/11122:
   #0: ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180

  stack backtrace:
  CPU: 17 PID: 11122 Comm: compsize Kdump: loaded Not tainted 5.8.0-rc7-00165-g04ec4da5f45f-dirty #922
  Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018
  Call Trace:
   dump_stack+0x78/0xa0
   check_noncircular+0x165/0x180
   __lock_acquire+0x1272/0x2310
   lock_acquire+0x9e/0x360
   ? __might_fault+0x3e/0x90
   ? find_held_lock+0x72/0x90
   __might_fault+0x68/0x90
   ? __might_fault+0x3e/0x90
   _copy_to_user+0x1e/0x80
   copy_to_sk.isra.32+0x121/0x300
   ? btrfs_search_forward+0x2a6/0x360
   search_ioctl+0x106/0x200
   btrfs_ioctl_tree_search_v2+0x7b/0xf0
   btrfs_ioctl+0x106f/0x30a0
   ? __do_sys_newfstat+0x5a/0x70
   ? ksys_ioctl+0x83/0xc0
   ksys_ioctl+0x83/0xc0
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x50/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

The problem is we're doing a copy_to_user() while holding tree locks,
which can deadlock if we have to do a page fault for the copy_to_user().
This exists even without my locking changes, so it needs to be fixed.
Rework the search ioctl to do the pre-fault and then
copy_to_user_nofault for the copying.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/extent_io.c |  8 ++++----
 fs/btrfs/extent_io.h |  6 +++---
 fs/btrfs/ioctl.c     | 27 ++++++++++++++++++++-------
 3 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index fa22bb29eee6f..d6c827a9ebc56 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5488,9 +5488,9 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
 	}
 }
 
-int read_extent_buffer_to_user(const struct extent_buffer *eb,
-			       void __user *dstv,
-			       unsigned long start, unsigned long len)
+int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
+				       void __user *dstv,
+				       unsigned long start, unsigned long len)
 {
 	size_t cur;
 	size_t offset;
@@ -5511,7 +5511,7 @@ int read_extent_buffer_to_user(const struct extent_buffer *eb,
 
 		cur = min(len, (PAGE_SIZE - offset));
 		kaddr = page_address(page);
-		if (copy_to_user(dst, kaddr + offset, cur)) {
+		if (probe_user_write(dst, kaddr + offset, cur)) {
 			ret = -EFAULT;
 			break;
 		}
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 9ecdc9584df77..75c03aa1800fe 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -401,9 +401,9 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
 void read_extent_buffer(const struct extent_buffer *eb, void *dst,
 			unsigned long start,
 			unsigned long len);
-int read_extent_buffer_to_user(const struct extent_buffer *eb,
-			       void __user *dst, unsigned long start,
-			       unsigned long len);
+int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
+				       void __user *dst, unsigned long start,
+				       unsigned long len);
 void write_extent_buffer(struct extent_buffer *eb, const void *src,
 			 unsigned long start, unsigned long len);
 void copy_extent_buffer(struct extent_buffer *dst, struct extent_buffer *src,
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index eefe103c65daa..6db46daeed16b 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2041,9 +2041,14 @@ static noinline int copy_to_sk(struct btrfs_path *path,
 		sh.len = item_len;
 		sh.transid = found_transid;
 
-		/* copy search result header */
-		if (copy_to_user(ubuf + *sk_offset, &sh, sizeof(sh))) {
-			ret = -EFAULT;
+		/*
+		 * Copy search result header. If we fault then loop again so we
+		 * can fault in the pages and -EFAULT there if there's a
+		 * problem. Otherwise we'll fault and then copy the buffer in
+		 * properly this next time through
+		 */
+		if (probe_user_write(ubuf + *sk_offset, &sh, sizeof(sh))) {
+			ret = 0;
 			goto out;
 		}
 
@@ -2051,10 +2056,14 @@ static noinline int copy_to_sk(struct btrfs_path *path,
 
 		if (item_len) {
 			char __user *up = ubuf + *sk_offset;
-			/* copy the item */
-			if (read_extent_buffer_to_user(leaf, up,
-						       item_off, item_len)) {
-				ret = -EFAULT;
+			/*
+			 * Copy the item, same behavior as above, but reset the
+			 * * sk_offset so we copy the full thing again.
+			 */
+			if (read_extent_buffer_to_user_nofault(leaf, up,
+						item_off, item_len)) {
+				ret = 0;
+				*sk_offset -= sizeof(sh);
 				goto out;
 			}
 
@@ -2142,6 +2151,10 @@ static noinline int search_ioctl(struct inode *inode,
 	key.offset = sk->min_offset;
 
 	while (1) {
+		ret = fault_in_pages_writeable(ubuf, *buf_size - sk_offset);
+		if (ret)
+			break;
+
 		ret = btrfs_search_forward(root, &key, path, sk->min_transid);
 		if (ret != 0) {
 			if (ret > 0)
-- 
2.25.1




  parent reply	other threads:[~2020-09-11 13:21 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-11 12:45 [PATCH 4.9 00/71] 4.9.236-rc1 review Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 01/71] HID: core: Correctly handle ReportSize being zero Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 02/71] HID: core: Sanitize event code and type when mapping input Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 03/71] perf record/stat: Explicitly call out event modifiers in the documentation Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 04/71] hwmon: (applesmc) check status earlier Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 05/71] nvmet: Disable keep-alive timer when kato is cleared to 0h Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 06/71] ceph: dont allow setlease on cephfs Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 07/71] s390: dont trace preemption in percpu macros Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 08/71] xen/xenbus: Fix granting of vmallocd memory Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 09/71] dmaengine: of-dma: Fix of_dma_router_xlates of_dma_xlate handling Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 10/71] batman-adv: Avoid uninitialized chaddr when handling DHCP Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 11/71] batman-adv: bla: use netif_rx_ni when not in interrupt context Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 12/71] dmaengine: at_hdmac: check return value of of_find_device_by_node() in at_dma_xlate() Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 13/71] MIPS: mm: BMIPS5000 has inclusive physical caches Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 14/71] MIPS: BMIPS: Also call bmips_cpu_setup() for secondary cores Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 15/71] netfilter: nf_tables: add NFTA_SET_USERDATA if not null Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 16/71] netfilter: nf_tables: incorrect enum nft_list_attributes definition Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 17/71] netfilter: nf_tables: fix destination register zeroing Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 18/71] net: hns: Fix memleak in hns_nic_dev_probe Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 19/71] ravb: Fixed to be able to unload modules Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 20/71] net: arc_emac: Fix memleak in arc_mdio_probe Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 21/71] dmaengine: pl330: Fix burst length if burst size is smaller than bus width Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 22/71] bnxt_en: Check for zero dir entries in NVRAM Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 23/71] bnxt_en: Fix PCI AER error recovery flow Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 24/71] fix regression in "epoll: Keep a reference on files added to the check list" Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 25/71] tg3: Fix soft lockup when tg3_reset_task() fails Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 26/71] iommu/vt-d: Serialize IOMMU GCMD register modifications Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 27/71] thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430 Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 28/71] include/linux/log2.h: add missing () around n in roundup_pow_of_two() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 29/71] btrfs: drop path before adding new uuid tree entry Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 30/71] btrfs: Remove redundant extent_buffer_get in get_old_root Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 31/71] btrfs: Remove extraneous extent_buffer_get from tree_mod_log_rewind Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 32/71] btrfs: set the lockdep class for log tree extent buffers Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 33/71] uaccess: Add non-pagefault user-space read functions Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 34/71] uaccess: Add non-pagefault user-space write function Greg Kroah-Hartman
2020-09-11 12:46 ` Greg Kroah-Hartman [this message]
2020-09-11 12:46 ` [PATCH 4.9 36/71] net: usb: qmi_wwan: add Telit 0x1050 composition Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 37/71] drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201 Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 38/71] qmi_wwan: new Telewell and Sierra device IDs Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 39/71] usb: qmi_wwan: add D-Link DWM-222 A2 device ID Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 40/71] ALSA: ca0106: fix error code handling Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 41/71] ALSA: pcm: oss: Remove superfluous WARN_ON() for mulaw sanity check Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 42/71] ALSA: firewire-digi00x: exclude Avid Adrenaline from detection Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 43/71] block: allow for_each_bvec to support zero len bvec Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 44/71] block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 45/71] libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 46/71] dm cache metadata: Avoid returning cmd->bm wild pointer on error Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 47/71] dm thin " Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 48/71] mm: slub: fix conversion of freelist_corrupted() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 49/71] vfio/type1: Support faulting PFNMAP vmas Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 50/71] vfio-pci: Fault mmaps to enable vma tracking Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 51/71] vfio-pci: Invalidate mmaps and block MMIO access on disabled memory Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 52/71] KVM: arm64: Add kvm_extable for vaxorcism code Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 53/71] KVM: arm64: Defer guest entry when an asynchronous exception is pending Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 54/71] KVM: arm64: Survive synchronous exceptions caused by AT instructions Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 55/71] KVM: arm64: Set HCR_EL2.PTW to prevent AT taking synchronous exception Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 56/71] net: refactor bind_bucket fastreuse into helper Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 57/71] net: initialize fastreuse on inet_inherit_port Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 58/71] vfio/pci: Fix SR-IOV VF handling with MMIO blocking Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 59/71] checkpatch: fix the usage of capture group ( ... ) Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 60/71] mm/hugetlb: fix a race between hugetlb sysctl handlers Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 61/71] cfg80211: regulatory: reject invalid hints Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 62/71] net: usb: Fix uninit-was-stored issue in asix_read_phy_addr() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 63/71] ALSA; firewire-tascam: exclude Tascam FE-8 from detection Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 64/71] fs/affs: use octal for permissions Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 65/71] affs: fix basic permission bits to actually work Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 66/71] net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 67/71] bnxt: dont enable NAPI until rings are ready Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 68/71] netlabel: fix problems with mapping removal Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 69/71] net: usb: dm9601: Add USB ID of Keenetic Plus DSL Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 70/71] sctp: not disable bh in the whole sctp_get_port_local() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 71/71] net: disable netpoll on fresh napis Greg Kroah-Hartman
2020-09-11 22:31 ` [PATCH 4.9 00/71] 4.9.236-rc1 review Shuah Khan
2020-09-12  2:16 ` Guenter Roeck
2020-09-12  8:06 ` Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200911122506.680591979@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).