All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Filipe Manana <fdmanana@suse.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.14 44/65] btrfs: fix potential deadlock in the search ioctl
Date: Tue,  8 Sep 2020 17:26:29 +0200	[thread overview]
Message-ID: <20200908152219.316437880@linuxfoundation.org> (raw)
In-Reply-To: <20200908152217.022816723@linuxfoundation.org>

From: Josef Bacik <josef@toxicpanda.com>

[ Upstream commit a48b73eca4ceb9b8a4b97f290a065335dbcd8a04 ]

With the conversion of the tree locks to rwsem I got the following
lockdep splat:

  ======================================================
  WARNING: possible circular locking dependency detected
  5.8.0-rc7-00165-g04ec4da5f45f-dirty #922 Not tainted
  ------------------------------------------------------
  compsize/11122 is trying to acquire lock:
  ffff889fabca8768 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x3e/0x90

  but task is already holding lock:
  ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (btrfs-fs-00){++++}-{3:3}:
	 down_write_nested+0x3b/0x70
	 __btrfs_tree_lock+0x24/0x120
	 btrfs_search_slot+0x756/0x990
	 btrfs_lookup_inode+0x3a/0xb4
	 __btrfs_update_delayed_inode+0x93/0x270
	 btrfs_async_run_delayed_root+0x168/0x230
	 btrfs_work_helper+0xd4/0x570
	 process_one_work+0x2ad/0x5f0
	 worker_thread+0x3a/0x3d0
	 kthread+0x133/0x150
	 ret_from_fork+0x1f/0x30

  -> #1 (&delayed_node->mutex){+.+.}-{3:3}:
	 __mutex_lock+0x9f/0x930
	 btrfs_delayed_update_inode+0x50/0x440
	 btrfs_update_inode+0x8a/0xf0
	 btrfs_dirty_inode+0x5b/0xd0
	 touch_atime+0xa1/0xd0
	 btrfs_file_mmap+0x3f/0x60
	 mmap_region+0x3a4/0x640
	 do_mmap+0x376/0x580
	 vm_mmap_pgoff+0xd5/0x120
	 ksys_mmap_pgoff+0x193/0x230
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&mm->mmap_lock#2){++++}-{3:3}:
	 __lock_acquire+0x1272/0x2310
	 lock_acquire+0x9e/0x360
	 __might_fault+0x68/0x90
	 _copy_to_user+0x1e/0x80
	 copy_to_sk.isra.32+0x121/0x300
	 search_ioctl+0x106/0x200
	 btrfs_ioctl_tree_search_v2+0x7b/0xf0
	 btrfs_ioctl+0x106f/0x30a0
	 ksys_ioctl+0x83/0xc0
	 __x64_sys_ioctl+0x16/0x20
	 do_syscall_64+0x50/0x90
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  other info that might help us debug this:

  Chain exists of:
    &mm->mmap_lock#2 --> &delayed_node->mutex --> btrfs-fs-00

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(btrfs-fs-00);
				 lock(&delayed_node->mutex);
				 lock(btrfs-fs-00);
    lock(&mm->mmap_lock#2);

   *** DEADLOCK ***

  1 lock held by compsize/11122:
   #0: ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180

  stack backtrace:
  CPU: 17 PID: 11122 Comm: compsize Kdump: loaded Not tainted 5.8.0-rc7-00165-g04ec4da5f45f-dirty #922
  Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018
  Call Trace:
   dump_stack+0x78/0xa0
   check_noncircular+0x165/0x180
   __lock_acquire+0x1272/0x2310
   lock_acquire+0x9e/0x360
   ? __might_fault+0x3e/0x90
   ? find_held_lock+0x72/0x90
   __might_fault+0x68/0x90
   ? __might_fault+0x3e/0x90
   _copy_to_user+0x1e/0x80
   copy_to_sk.isra.32+0x121/0x300
   ? btrfs_search_forward+0x2a6/0x360
   search_ioctl+0x106/0x200
   btrfs_ioctl_tree_search_v2+0x7b/0xf0
   btrfs_ioctl+0x106f/0x30a0
   ? __do_sys_newfstat+0x5a/0x70
   ? ksys_ioctl+0x83/0xc0
   ksys_ioctl+0x83/0xc0
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x50/0x90
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

The problem is we're doing a copy_to_user() while holding tree locks,
which can deadlock if we have to do a page fault for the copy_to_user().
This exists even without my locking changes, so it needs to be fixed.
Rework the search ioctl to do the pre-fault and then
copy_to_user_nofault for the copying.

CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/btrfs/extent_io.c |  8 ++++----
 fs/btrfs/extent_io.h |  6 +++---
 fs/btrfs/ioctl.c     | 27 ++++++++++++++++++++-------
 3 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index ef1fd6a09d8e5..0ba338cffa937 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5448,9 +5448,9 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
 	}
 }
 
-int read_extent_buffer_to_user(const struct extent_buffer *eb,
-			       void __user *dstv,
-			       unsigned long start, unsigned long len)
+int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
+				       void __user *dstv,
+				       unsigned long start, unsigned long len)
 {
 	size_t cur;
 	size_t offset;
@@ -5471,7 +5471,7 @@ int read_extent_buffer_to_user(const struct extent_buffer *eb,
 
 		cur = min(len, (PAGE_SIZE - offset));
 		kaddr = page_address(page);
-		if (copy_to_user(dst, kaddr + offset, cur)) {
+		if (probe_user_write(dst, kaddr + offset, cur)) {
 			ret = -EFAULT;
 			break;
 		}
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index e5535bbe69537..90c5095ae97ee 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -455,9 +455,9 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
 void read_extent_buffer(const struct extent_buffer *eb, void *dst,
 			unsigned long start,
 			unsigned long len);
-int read_extent_buffer_to_user(const struct extent_buffer *eb,
-			       void __user *dst, unsigned long start,
-			       unsigned long len);
+int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
+				       void __user *dst, unsigned long start,
+				       unsigned long len);
 void write_extent_buffer_fsid(struct extent_buffer *eb, const void *src);
 void write_extent_buffer_chunk_tree_uuid(struct extent_buffer *eb,
 		const void *src);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index e82b4f3f490c1..7b3e41987d072 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2021,9 +2021,14 @@ static noinline int copy_to_sk(struct btrfs_path *path,
 		sh.len = item_len;
 		sh.transid = found_transid;
 
-		/* copy search result header */
-		if (copy_to_user(ubuf + *sk_offset, &sh, sizeof(sh))) {
-			ret = -EFAULT;
+		/*
+		 * Copy search result header. If we fault then loop again so we
+		 * can fault in the pages and -EFAULT there if there's a
+		 * problem. Otherwise we'll fault and then copy the buffer in
+		 * properly this next time through
+		 */
+		if (probe_user_write(ubuf + *sk_offset, &sh, sizeof(sh))) {
+			ret = 0;
 			goto out;
 		}
 
@@ -2031,10 +2036,14 @@ static noinline int copy_to_sk(struct btrfs_path *path,
 
 		if (item_len) {
 			char __user *up = ubuf + *sk_offset;
-			/* copy the item */
-			if (read_extent_buffer_to_user(leaf, up,
-						       item_off, item_len)) {
-				ret = -EFAULT;
+			/*
+			 * Copy the item, same behavior as above, but reset the
+			 * * sk_offset so we copy the full thing again.
+			 */
+			if (read_extent_buffer_to_user_nofault(leaf, up,
+						item_off, item_len)) {
+				ret = 0;
+				*sk_offset -= sizeof(sh);
 				goto out;
 			}
 
@@ -2122,6 +2131,10 @@ static noinline int search_ioctl(struct inode *inode,
 	key.offset = sk->min_offset;
 
 	while (1) {
+		ret = fault_in_pages_writeable(ubuf, *buf_size - sk_offset);
+		if (ret)
+			break;
+
 		ret = btrfs_search_forward(root, &key, path, sk->min_transid);
 		if (ret != 0) {
 			if (ret > 0)
-- 
2.25.1




  parent reply	other threads:[~2020-09-08 17:49 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-08 15:25 [PATCH 4.14 00/65] 4.14.197-rc1 review Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 01/65] HID: core: Correctly handle ReportSize being zero Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 02/65] HID: core: Sanitize event code and type when mapping input Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 03/65] perf record/stat: Explicitly call out event modifiers in the documentation Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 04/65] drm/msm: add shutdown support for display platform_driver Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 05/65] hwmon: (applesmc) check status earlier Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 06/65] nvmet: Disable keep-alive timer when kato is cleared to 0h Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 07/65] ceph: dont allow setlease on cephfs Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 08/65] cpuidle: Fixup IRQ state Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 09/65] s390: dont trace preemption in percpu macros Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 10/65] xen/xenbus: Fix granting of vmallocd memory Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 11/65] dmaengine: of-dma: Fix of_dma_router_xlates of_dma_xlate handling Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 12/65] batman-adv: Avoid uninitialized chaddr when handling DHCP Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 13/65] batman-adv: Fix own OGM check in aggregated OGMs Greg Kroah-Hartman
2020-09-08 15:25 ` [PATCH 4.14 14/65] batman-adv: bla: use netif_rx_ni when not in interrupt context Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 15/65] dmaengine: at_hdmac: check return value of of_find_device_by_node() in at_dma_xlate() Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 16/65] MIPS: mm: BMIPS5000 has inclusive physical caches Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 17/65] MIPS: BMIPS: Also call bmips_cpu_setup() for secondary cores Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 18/65] netfilter: nf_tables: add NFTA_SET_USERDATA if not null Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 19/65] netfilter: nf_tables: incorrect enum nft_list_attributes definition Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 20/65] netfilter: nf_tables: fix destination register zeroing Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 21/65] net: hns: Fix memleak in hns_nic_dev_probe Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 22/65] net: systemport: Fix memleak in bcm_sysport_probe Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 23/65] ravb: Fixed to be able to unload modules Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 24/65] net: arc_emac: Fix memleak in arc_mdio_probe Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 25/65] dmaengine: pl330: Fix burst length if burst size is smaller than bus width Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 26/65] gtp: add GTPA_LINK info to msg sent to userspace Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 27/65] bnxt_en: Check for zero dir entries in NVRAM Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 28/65] bnxt_en: Fix PCI AER error recovery flow Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 29/65] nvmet-fc: Fix a missed _irqsave version of spin_lock in nvmet_fc_fod_op_done() Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 30/65] perf tools: Correct SNOOPX field offset Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 31/65] net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init() Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 32/65] fix regression in "epoll: Keep a reference on files added to the check list" Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 33/65] drm/radeon: Prefer lower feedback dividers Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 34/65] tg3: Fix soft lockup when tg3_reset_task() fails Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 35/65] iommu/vt-d: Serialize IOMMU GCMD register modifications Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 36/65] thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430 Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 37/65] include/linux/log2.h: add missing () around n in roundup_pow_of_two() Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 38/65] btrfs: drop path before adding new uuid tree entry Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 39/65] btrfs: Remove redundant extent_buffer_get in get_old_root Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 40/65] btrfs: Remove extraneous extent_buffer_get from tree_mod_log_rewind Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 41/65] btrfs: set the lockdep class for log tree extent buffers Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 42/65] uaccess: Add non-pagefault user-space read functions Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 43/65] uaccess: Add non-pagefault user-space write function Greg Kroah-Hartman
2020-09-08 15:26 ` Greg Kroah-Hartman [this message]
2020-09-08 15:26 ` [PATCH 4.14 45/65] net: usb: qmi_wwan: add Telit 0x1050 composition Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 46/65] usb: qmi_wwan: add D-Link DWM-222 A2 device ID Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 47/65] ALSA: ca0106: fix error code handling Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 48/65] ALSA: pcm: oss: Remove superfluous WARN_ON() for mulaw sanity check Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 49/65] ALSA: hda/hdmi: always check pin power status in i915 pin fixup Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 50/65] ALSA: firewire-digi00x: exclude Avid Adrenaline from detection Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 51/65] affs: fix basic permission bits to actually work Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 52/65] block: allow for_each_bvec to support zero len bvec Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 53/65] block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 54/65] libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 55/65] dm cache metadata: Avoid returning cmd->bm wild pointer on error Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 56/65] dm thin " Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 57/65] mm: slub: fix conversion of freelist_corrupted() Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 58/65] KVM: arm64: Add kvm_extable for vaxorcism code Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 59/65] KVM: arm64: Defer guest entry when an asynchronous exception is pending Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 60/65] KVM: arm64: Survive synchronous exceptions caused by AT instructions Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 61/65] KVM: arm64: Set HCR_EL2.PTW to prevent AT taking synchronous exception Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 62/65] checkpatch: fix the usage of capture group ( ... ) Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 63/65] mm/hugetlb: fix a race between hugetlb sysctl handlers Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 64/65] cfg80211: regulatory: reject invalid hints Greg Kroah-Hartman
2020-09-08 15:26 ` [PATCH 4.14 65/65] net: usb: Fix uninit-was-stored issue in asix_read_phy_addr() Greg Kroah-Hartman
2020-09-09  1:47 ` [PATCH 4.14 00/65] 4.14.197-rc1 review Shuah Khan
2020-09-09  7:03 ` Jon Hunter
2020-09-09  8:19 ` Naresh Kamboju
2020-09-09 16:39 ` Guenter Roeck
2020-09-09 16:39 ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200908152219.316437880@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dsterba@suse.com \
    --cc=fdmanana@suse.com \
    --cc=josef@toxicpanda.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.