From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Filipe Manana <fdmanana@suse.com>,
Josef Bacik <josef@toxicpanda.com>,
David Sterba <dsterba@suse.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.9 35/71] btrfs: fix potential deadlock in the search ioctl
Date: Fri, 11 Sep 2020 14:46:19 +0200 [thread overview]
Message-ID: <20200911122506.680591979@linuxfoundation.org> (raw)
In-Reply-To: <20200911122504.928931589@linuxfoundation.org>
From: Josef Bacik <josef@toxicpanda.com>
[ Upstream commit a48b73eca4ceb9b8a4b97f290a065335dbcd8a04 ]
With the conversion of the tree locks to rwsem I got the following
lockdep splat:
======================================================
WARNING: possible circular locking dependency detected
5.8.0-rc7-00165-g04ec4da5f45f-dirty #922 Not tainted
------------------------------------------------------
compsize/11122 is trying to acquire lock:
ffff889fabca8768 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x3e/0x90
but task is already holding lock:
ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (btrfs-fs-00){++++}-{3:3}:
down_write_nested+0x3b/0x70
__btrfs_tree_lock+0x24/0x120
btrfs_search_slot+0x756/0x990
btrfs_lookup_inode+0x3a/0xb4
__btrfs_update_delayed_inode+0x93/0x270
btrfs_async_run_delayed_root+0x168/0x230
btrfs_work_helper+0xd4/0x570
process_one_work+0x2ad/0x5f0
worker_thread+0x3a/0x3d0
kthread+0x133/0x150
ret_from_fork+0x1f/0x30
-> #1 (&delayed_node->mutex){+.+.}-{3:3}:
__mutex_lock+0x9f/0x930
btrfs_delayed_update_inode+0x50/0x440
btrfs_update_inode+0x8a/0xf0
btrfs_dirty_inode+0x5b/0xd0
touch_atime+0xa1/0xd0
btrfs_file_mmap+0x3f/0x60
mmap_region+0x3a4/0x640
do_mmap+0x376/0x580
vm_mmap_pgoff+0xd5/0x120
ksys_mmap_pgoff+0x193/0x230
do_syscall_64+0x50/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9
-> #0 (&mm->mmap_lock#2){++++}-{3:3}:
__lock_acquire+0x1272/0x2310
lock_acquire+0x9e/0x360
__might_fault+0x68/0x90
_copy_to_user+0x1e/0x80
copy_to_sk.isra.32+0x121/0x300
search_ioctl+0x106/0x200
btrfs_ioctl_tree_search_v2+0x7b/0xf0
btrfs_ioctl+0x106f/0x30a0
ksys_ioctl+0x83/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x50/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9
other info that might help us debug this:
Chain exists of:
&mm->mmap_lock#2 --> &delayed_node->mutex --> btrfs-fs-00
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(btrfs-fs-00);
lock(&delayed_node->mutex);
lock(btrfs-fs-00);
lock(&mm->mmap_lock#2);
*** DEADLOCK ***
1 lock held by compsize/11122:
#0: ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180
stack backtrace:
CPU: 17 PID: 11122 Comm: compsize Kdump: loaded Not tainted 5.8.0-rc7-00165-g04ec4da5f45f-dirty #922
Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018
Call Trace:
dump_stack+0x78/0xa0
check_noncircular+0x165/0x180
__lock_acquire+0x1272/0x2310
lock_acquire+0x9e/0x360
? __might_fault+0x3e/0x90
? find_held_lock+0x72/0x90
__might_fault+0x68/0x90
? __might_fault+0x3e/0x90
_copy_to_user+0x1e/0x80
copy_to_sk.isra.32+0x121/0x300
? btrfs_search_forward+0x2a6/0x360
search_ioctl+0x106/0x200
btrfs_ioctl_tree_search_v2+0x7b/0xf0
btrfs_ioctl+0x106f/0x30a0
? __do_sys_newfstat+0x5a/0x70
? ksys_ioctl+0x83/0xc0
ksys_ioctl+0x83/0xc0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x50/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The problem is we're doing a copy_to_user() while holding tree locks,
which can deadlock if we have to do a page fault for the copy_to_user().
This exists even without my locking changes, so it needs to be fixed.
Rework the search ioctl to do the pre-fault and then
copy_to_user_nofault for the copying.
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
fs/btrfs/extent_io.c | 8 ++++----
fs/btrfs/extent_io.h | 6 +++---
fs/btrfs/ioctl.c | 27 ++++++++++++++++++++-------
3 files changed, 27 insertions(+), 14 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index fa22bb29eee6f..d6c827a9ebc56 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5488,9 +5488,9 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv,
}
}
-int read_extent_buffer_to_user(const struct extent_buffer *eb,
- void __user *dstv,
- unsigned long start, unsigned long len)
+int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
+ void __user *dstv,
+ unsigned long start, unsigned long len)
{
size_t cur;
size_t offset;
@@ -5511,7 +5511,7 @@ int read_extent_buffer_to_user(const struct extent_buffer *eb,
cur = min(len, (PAGE_SIZE - offset));
kaddr = page_address(page);
- if (copy_to_user(dst, kaddr + offset, cur)) {
+ if (probe_user_write(dst, kaddr + offset, cur)) {
ret = -EFAULT;
break;
}
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 9ecdc9584df77..75c03aa1800fe 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -401,9 +401,9 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv,
void read_extent_buffer(const struct extent_buffer *eb, void *dst,
unsigned long start,
unsigned long len);
-int read_extent_buffer_to_user(const struct extent_buffer *eb,
- void __user *dst, unsigned long start,
- unsigned long len);
+int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb,
+ void __user *dst, unsigned long start,
+ unsigned long len);
void write_extent_buffer(struct extent_buffer *eb, const void *src,
unsigned long start, unsigned long len);
void copy_extent_buffer(struct extent_buffer *dst, struct extent_buffer *src,
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index eefe103c65daa..6db46daeed16b 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2041,9 +2041,14 @@ static noinline int copy_to_sk(struct btrfs_path *path,
sh.len = item_len;
sh.transid = found_transid;
- /* copy search result header */
- if (copy_to_user(ubuf + *sk_offset, &sh, sizeof(sh))) {
- ret = -EFAULT;
+ /*
+ * Copy search result header. If we fault then loop again so we
+ * can fault in the pages and -EFAULT there if there's a
+ * problem. Otherwise we'll fault and then copy the buffer in
+ * properly this next time through
+ */
+ if (probe_user_write(ubuf + *sk_offset, &sh, sizeof(sh))) {
+ ret = 0;
goto out;
}
@@ -2051,10 +2056,14 @@ static noinline int copy_to_sk(struct btrfs_path *path,
if (item_len) {
char __user *up = ubuf + *sk_offset;
- /* copy the item */
- if (read_extent_buffer_to_user(leaf, up,
- item_off, item_len)) {
- ret = -EFAULT;
+ /*
+ * Copy the item, same behavior as above, but reset the
+ * * sk_offset so we copy the full thing again.
+ */
+ if (read_extent_buffer_to_user_nofault(leaf, up,
+ item_off, item_len)) {
+ ret = 0;
+ *sk_offset -= sizeof(sh);
goto out;
}
@@ -2142,6 +2151,10 @@ static noinline int search_ioctl(struct inode *inode,
key.offset = sk->min_offset;
while (1) {
+ ret = fault_in_pages_writeable(ubuf, *buf_size - sk_offset);
+ if (ret)
+ break;
+
ret = btrfs_search_forward(root, &key, path, sk->min_transid);
if (ret != 0) {
if (ret > 0)
--
2.25.1
next prev parent reply other threads:[~2020-09-11 13:21 UTC|newest]
Thread overview: 75+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-11 12:45 [PATCH 4.9 00/71] 4.9.236-rc1 review Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 01/71] HID: core: Correctly handle ReportSize being zero Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 02/71] HID: core: Sanitize event code and type when mapping input Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 03/71] perf record/stat: Explicitly call out event modifiers in the documentation Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 04/71] hwmon: (applesmc) check status earlier Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 05/71] nvmet: Disable keep-alive timer when kato is cleared to 0h Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 06/71] ceph: dont allow setlease on cephfs Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 07/71] s390: dont trace preemption in percpu macros Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 08/71] xen/xenbus: Fix granting of vmallocd memory Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 09/71] dmaengine: of-dma: Fix of_dma_router_xlates of_dma_xlate handling Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 10/71] batman-adv: Avoid uninitialized chaddr when handling DHCP Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 11/71] batman-adv: bla: use netif_rx_ni when not in interrupt context Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 12/71] dmaengine: at_hdmac: check return value of of_find_device_by_node() in at_dma_xlate() Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 13/71] MIPS: mm: BMIPS5000 has inclusive physical caches Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 14/71] MIPS: BMIPS: Also call bmips_cpu_setup() for secondary cores Greg Kroah-Hartman
2020-09-11 12:45 ` [PATCH 4.9 15/71] netfilter: nf_tables: add NFTA_SET_USERDATA if not null Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 16/71] netfilter: nf_tables: incorrect enum nft_list_attributes definition Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 17/71] netfilter: nf_tables: fix destination register zeroing Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 18/71] net: hns: Fix memleak in hns_nic_dev_probe Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 19/71] ravb: Fixed to be able to unload modules Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 20/71] net: arc_emac: Fix memleak in arc_mdio_probe Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 21/71] dmaengine: pl330: Fix burst length if burst size is smaller than bus width Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 22/71] bnxt_en: Check for zero dir entries in NVRAM Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 23/71] bnxt_en: Fix PCI AER error recovery flow Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 24/71] fix regression in "epoll: Keep a reference on files added to the check list" Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 25/71] tg3: Fix soft lockup when tg3_reset_task() fails Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 26/71] iommu/vt-d: Serialize IOMMU GCMD register modifications Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 27/71] thermal: ti-soc-thermal: Fix bogus thermal shutdowns for omap4430 Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 28/71] include/linux/log2.h: add missing () around n in roundup_pow_of_two() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 29/71] btrfs: drop path before adding new uuid tree entry Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 30/71] btrfs: Remove redundant extent_buffer_get in get_old_root Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 31/71] btrfs: Remove extraneous extent_buffer_get from tree_mod_log_rewind Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 32/71] btrfs: set the lockdep class for log tree extent buffers Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 33/71] uaccess: Add non-pagefault user-space read functions Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 34/71] uaccess: Add non-pagefault user-space write function Greg Kroah-Hartman
2020-09-11 12:46 ` Greg Kroah-Hartman [this message]
2020-09-11 12:46 ` [PATCH 4.9 36/71] net: usb: qmi_wwan: add Telit 0x1050 composition Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 37/71] drivers: net: usb: qmi_wwan: add QMI_QUIRK_SET_DTR for Telit PID 0x1201 Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 38/71] qmi_wwan: new Telewell and Sierra device IDs Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 39/71] usb: qmi_wwan: add D-Link DWM-222 A2 device ID Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 40/71] ALSA: ca0106: fix error code handling Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 41/71] ALSA: pcm: oss: Remove superfluous WARN_ON() for mulaw sanity check Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 42/71] ALSA: firewire-digi00x: exclude Avid Adrenaline from detection Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 43/71] block: allow for_each_bvec to support zero len bvec Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 44/71] block: Move SECTOR_SIZE and SECTOR_SHIFT definitions into <linux/blkdev.h> Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 45/71] libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 46/71] dm cache metadata: Avoid returning cmd->bm wild pointer on error Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 47/71] dm thin " Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 48/71] mm: slub: fix conversion of freelist_corrupted() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 49/71] vfio/type1: Support faulting PFNMAP vmas Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 50/71] vfio-pci: Fault mmaps to enable vma tracking Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 51/71] vfio-pci: Invalidate mmaps and block MMIO access on disabled memory Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 52/71] KVM: arm64: Add kvm_extable for vaxorcism code Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 53/71] KVM: arm64: Defer guest entry when an asynchronous exception is pending Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 54/71] KVM: arm64: Survive synchronous exceptions caused by AT instructions Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 55/71] KVM: arm64: Set HCR_EL2.PTW to prevent AT taking synchronous exception Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 56/71] net: refactor bind_bucket fastreuse into helper Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 57/71] net: initialize fastreuse on inet_inherit_port Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 58/71] vfio/pci: Fix SR-IOV VF handling with MMIO blocking Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 59/71] checkpatch: fix the usage of capture group ( ... ) Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 60/71] mm/hugetlb: fix a race between hugetlb sysctl handlers Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 61/71] cfg80211: regulatory: reject invalid hints Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 62/71] net: usb: Fix uninit-was-stored issue in asix_read_phy_addr() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 63/71] ALSA; firewire-tascam: exclude Tascam FE-8 from detection Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 64/71] fs/affs: use octal for permissions Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 65/71] affs: fix basic permission bits to actually work Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 66/71] net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 67/71] bnxt: dont enable NAPI until rings are ready Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 68/71] netlabel: fix problems with mapping removal Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 69/71] net: usb: dm9601: Add USB ID of Keenetic Plus DSL Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 70/71] sctp: not disable bh in the whole sctp_get_port_local() Greg Kroah-Hartman
2020-09-11 12:46 ` [PATCH 4.9 71/71] net: disable netpoll on fresh napis Greg Kroah-Hartman
2020-09-11 22:31 ` [PATCH 4.9 00/71] 4.9.236-rc1 review Shuah Khan
2020-09-12 2:16 ` Guenter Roeck
2020-09-12 8:06 ` Naresh Kamboju
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200911122506.680591979@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=dsterba@suse.com \
--cc=fdmanana@suse.com \
--cc=josef@toxicpanda.com \
--cc=linux-kernel@vger.kernel.org \
--cc=sashal@kernel.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).