From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Lei Xue <carmark.dlut@gmail.com>,
	Dave Wysochanski <dwysocha@redhat.com>,
	David Howells <dhowells@redhat.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.14 19/77] cachefiles: Fix race between read_waiter and read_copier involving op->to_do
Date: Mon,  1 Jun 2020 19:53:24 +0200
Message-ID: <20200601174019.894338974@linuxfoundation.org>
In-Reply-To: <20200601174016.396817032@linuxfoundation.org>

From: Lei Xue <carmark.dlut@gmail.com>

[ Upstream commit 7bb0c5338436dae953622470d52689265867f032 ]

There is a potential race in fscache operation enqueuing for reading and
copying multiple pages from cachefiles to netfs.  The problem is easy to
reproduce on a heavily loaded system: for example, many processes
continually reading files on an NFS share covered by fscache triggered it
within a few minutes.

The race is due to cachefiles_read_waiter() adding the monitor to the op's
to_do list and then dropping the object->work_lock spinlock before
completing fscache_enqueue_operation().  Once the lock is dropped,
cachefiles_read_copier() grabs the op, completes processing it, and
makes it through fscache_retrieval_complete(), which sets op->state to
the final state of FSCACHE_OP_ST_COMPLETE (4).  When cachefiles_read_waiter()
finally gets through the remainder of fscache_enqueue_operation(),
it sees the invalid state and hits the ASSERTIFCMP, and the following
oops is seen:
[ 2259.612361] FS-Cache:
[ 2259.614785] FS-Cache: Assertion failed
[ 2259.618639] FS-Cache: 4 == 5 is false
[ 2259.622456] ------------[ cut here ]------------
[ 2259.627190] kernel BUG at fs/fscache/operation.c:70!
...
[ 2259.791675] RIP: 0010:[<ffffffffc061b4cf>]  [<ffffffffc061b4cf>] fscache_enqueue_operation+0xff/0x170 [fscache]
[ 2259.802059] RSP: 0000:ffffa0263d543be0  EFLAGS: 00010046
[ 2259.807521] RAX: 0000000000000019 RBX: ffffa01a4d390480 RCX: 0000000000000006
[ 2259.814847] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffa0263d553890
[ 2259.822176] RBP: ffffa0263d543be8 R08: 0000000000000000 R09: ffffa0263c2d8708
[ 2259.829502] R10: 0000000000001e7f R11: 0000000000000000 R12: ffffa01a4d390480
[ 2259.844483] R13: ffff9fa9546c5920 R14: ffffa0263d543c80 R15: ffffa0293ff9bf10
[ 2259.859554] FS:  00007f4b6efbd700(0000) GS:ffffa0263d540000(0000) knlGS:0000000000000000
[ 2259.875571] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2259.889117] CR2: 00007f49e1624ff0 CR3: 0000012b38b38000 CR4: 00000000007607e0
[ 2259.904015] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2259.918764] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2259.933449] PKRU: 55555554
[ 2259.943654] Call Trace:
[ 2259.953592]  <IRQ>
[ 2259.955577]  [<ffffffffc03a7c12>] cachefiles_read_waiter+0x92/0xf0 [cachefiles]
[ 2259.978039]  [<ffffffffa34d3942>] __wake_up_common+0x82/0x120
[ 2259.991392]  [<ffffffffa34d3a63>] __wake_up_common_lock+0x83/0xc0
[ 2260.004930]  [<ffffffffa34d3510>] ? task_rq_unlock+0x20/0x20
[ 2260.017863]  [<ffffffffa34d3ab3>] __wake_up+0x13/0x20
[ 2260.030230]  [<ffffffffa34c72a0>] __wake_up_bit+0x50/0x70
[ 2260.042535]  [<ffffffffa35bdcdb>] unlock_page+0x2b/0x30
[ 2260.054495]  [<ffffffffa35bdd09>] page_endio+0x29/0x90
[ 2260.066184]  [<ffffffffa368fc81>] mpage_end_io+0x51/0x80

CPU1
cachefiles_read_waiter()
 20 static int cachefiles_read_waiter(wait_queue_entry_t *wait, unsigned mode,
 21                                   int sync, void *_key)
 22 {
...
 61         spin_lock(&object->work_lock);
 62         list_add_tail(&monitor->op_link, &op->to_do);
 63         spin_unlock(&object->work_lock);
<begin race window>
 64
 65         fscache_enqueue_retrieval(op);
182 static inline void fscache_enqueue_retrieval(struct fscache_retrieval *op)
183 {
184         fscache_enqueue_operation(&op->op);
185 }
 58 void fscache_enqueue_operation(struct fscache_operation *op)
 59 {
 60         struct fscache_cookie *cookie = op->object->cookie;
 61
 62         _enter("{OBJ%x OP%x,%u}",
 63                op->object->debug_id, op->debug_id, atomic_read(&op->usage));
 64
 65         ASSERT(list_empty(&op->pend_link));
 66         ASSERT(op->processor != NULL);
 67         ASSERT(fscache_object_is_available(op->object));
 68         ASSERTCMP(atomic_read(&op->usage), >, 0);
<end race window>

CPU2
cachefiles_read_copier()
168         while (!list_empty(&op->to_do)) {
...
202                 fscache_end_io(op, monitor->netfs_page, error);
203                 put_page(monitor->netfs_page);
204                 fscache_retrieval_complete(op, 1);

CPU1
 58 void fscache_enqueue_operation(struct fscache_operation *op)
 59 {
...
 69         ASSERTIFCMP(op->state != FSCACHE_OP_ST_IN_PROGRESS,
 70                     op->state, ==,  FSCACHE_OP_ST_CANCELLED);
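
The interleaving above can be replayed deterministically in user space.
The following is only an illustrative sketch, not kernel code: every
model_* name is hypothetical, the spinlock is reduced to a flag, and the
model assumes the copier can only take monitors off to_do while
work_lock is free.  The state values 4 and 5 come from the oops above;
the value 3 for the in-progress state is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the FSCACHE_OP_ST_* states. */
enum model_state {
	MODEL_IN_PROGRESS = 3,	/* assumed value */
	MODEL_COMPLETE = 4,	/* value taken from the oops */
	MODEL_CANCELLED = 5	/* value taken from the oops */
};

struct model_op {
	bool work_lock_held;	/* stands in for object->work_lock */
	int to_do;		/* monitors queued on op->to_do */
	enum model_state state;
};

/*
 * Stand-in for cachefiles_read_copier() on the other CPU.  Assumption of
 * this model: it needs the lock to see queued monitors, and draining the
 * list completes the op.
 */
static void model_copier_tries_to_run(struct model_op *op)
{
	if (op->work_lock_held || op->to_do == 0)
		return;		/* lock is busy, or nothing is queued */
	op->to_do = 0;
	op->state = MODEL_COMPLETE;
}

/*
 * Stand-in for the state check in fscache_enqueue_operation().
 * Returns true if the ASSERTIFCMP would pass.
 */
static bool model_enqueue_check(const struct model_op *op)
{
	return op->state == MODEL_IN_PROGRESS ||
	       op->state == MODEL_CANCELLED;
}

/*
 * Stand-in for cachefiles_read_waiter().  The copier is offered a chance
 * to run at every step; whether it can depends on the lock.
 */
static bool model_waiter(bool enqueue_under_lock)
{
	struct model_op op = { false, 0, MODEL_IN_PROGRESS };
	bool ok;

	op.work_lock_held = true;		/* spin_lock() */
	op.to_do++;				/* list_add_tail() */
	model_copier_tries_to_run(&op);		/* blocked by the lock */

	if (enqueue_under_lock) {
		ok = model_enqueue_check(&op);	/* patched order */
		op.work_lock_held = false;	/* spin_unlock() */
		model_copier_tries_to_run(&op);	/* harmless now */
	} else {
		op.work_lock_held = false;	/* spin_unlock() */
		model_copier_tries_to_run(&op);	/* race window */
		ok = model_enqueue_check(&op);	/* sees state 4 */
	}
	return ok;
}
```

In this model, model_waiter(false) follows the pre-patch order and
returns false (the check would fire), while model_waiter(true) follows
the order introduced by the diff below and returns true.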

Signed-off-by: Lei Xue <carmark.dlut@gmail.com>
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/cachefiles/rdwr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index 5e9176ec0d3a..c073a0f680fd 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -64,9 +64,9 @@ static int cachefiles_read_waiter(wait_queue_entry_t *wait, unsigned mode,
 	object = container_of(op->op.object, struct cachefiles_object, fscache);
 	spin_lock(&object->work_lock);
 	list_add_tail(&monitor->op_link, &op->to_do);
+	fscache_enqueue_retrieval(op);
 	spin_unlock(&object->work_lock);
 
-	fscache_enqueue_retrieval(op);
 	fscache_put_retrieval(op);
 	return 0;
 }
-- 
2.25.1



