linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Jann Horn <jannh@google.com>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: [PATCH 4.17 15/97] fix mntput/mntput race
Date: Tue, 14 Aug 2018 19:16:27 +0200	[thread overview]
Message-ID: <20180814171433.772941377@linuxfoundation.org> (raw)
In-Reply-To: <20180814171433.160434170@linuxfoundation.org>

4.17-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Al Viro <viro@zeniv.linux.org.uk>

commit 9ea0a46ca2c318fcc449c1e6b62a7230a17888f1 upstream.

mntput_no_expire() does the calculation of total refcount under mount_lock;
unfortunately, the decrement (as well as all increments) are done outside
of it, leading to false positives in the "are we dropping the last reference"
test.  Consider the following situation:
	* mnt is a lazy-umounted mount, kept alive by two opened files.  One
of those files gets closed.  Total refcount of mnt is 2.  On CPU 42
mntput(mnt) (called from __fput()) drops one reference, decrementing component
	* After it has looked at component #0, the process on CPU 0 does
mntget(), incrementing component #0, gets preempted and gets to run again -
on CPU 69.  There it does mntput(), which drops the reference (component #69)
and proceeds to spin on mount_lock.
	* On CPU 42 our first mntput() finishes counting.  It observes the
decrement of component #69, but not the increment of component #0.  As the
result, the total it gets is not 1 as it should've been - it's 0.  At which
point we decide that vfsmount needs to be killed and proceed to free it and
shut the filesystem down.  However, there's still another opened file
on that filesystem, with reference to (now freed) vfsmount, etc. and we are
screwed.

It's not a wide race, but it can be reproduced with artificial slowdown of
the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups.

Fix consists of moving the refcount decrement under mount_lock; the tricky
part is that we want (and can) keep the fast case (i.e. mount that still
has non-NULL ->mnt_ns) entirely out of mount_lock.  All places that zero
mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu()
before that mntput().  IOW, if mntput() observes (under rcu_read_lock())
a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to
be dropped.

Reported-by: Jann Horn <jannh@google.com>
Tested-by: Jann Horn <jannh@google.com>
Fixes: 48a066e72d97 ("RCU'd vsfmounts")
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 fs/namespace.c |   14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1195,12 +1195,22 @@ static DECLARE_DELAYED_WORK(delayed_mntp
 static void mntput_no_expire(struct mount *mnt)
 {
 	rcu_read_lock();
-	mnt_add_count(mnt, -1);
-	if (likely(mnt->mnt_ns)) { /* shouldn't be the last one */
+	if (likely(READ_ONCE(mnt->mnt_ns))) {
+		/*
+		 * Since we don't do lock_mount_hash() here,
+		 * ->mnt_ns can change under us.  However, if it's
+		 * non-NULL, then there's a reference that won't
+		 * be dropped until after an RCU delay done after
+		 * turning ->mnt_ns NULL.  So if we observe it
+		 * non-NULL under rcu_read_lock(), the reference
+		 * we are dropping is not the final one.
+		 */
+		mnt_add_count(mnt, -1);
 		rcu_read_unlock();
 		return;
 	}
 	lock_mount_hash();
+	mnt_add_count(mnt, -1);
 	if (mnt_get_count(mnt)) {
 		rcu_read_unlock();
 		unlock_mount_hash();



  parent reply	other threads:[~2018-08-14 17:28 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-14 17:16 [PATCH 4.17 00/97] 4.17.15-stable review Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 01/97] parisc: Enable CONFIG_MLONGCALLS by default Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 03/97] Mark HI and TASKLET softirq synchronous Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 04/97] stop_machine: Disable preemption after queueing stopper threads Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 05/97] sched/deadline: Update rq_clock of later_rq when pushing a task Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 06/97] zram: remove BD_CAP_SYNCHRONOUS_IO with writeback feature Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 07/97] xen/netfront: dont cache skb_shinfo() Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 08/97] bpf, sockmap: fix leak in bpf_tcp_sendmsg wait for mem path Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 09/97] bpf, sockmap: fix bpf_tcp_sendmsg sock error handling Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 10/97] scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management enabled Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 11/97] scsi: qla2xxx: Fix memory leak for allocating abort IOCB Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 12/97] init: rename and re-order boot_cpu_state_init() Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 13/97] root dentries need RCU-delayed freeing Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 14/97] make sure that __dentry_kill() always invalidates d_seq, unhashed or not Greg Kroah-Hartman
2018-08-14 17:16 ` Greg Kroah-Hartman [this message]
2018-08-14 17:16 ` [PATCH 4.17 16/97] fix __legitimize_mnt()/mntput() race Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 17/97] ARM: dts: imx6sx: fix irq for pcie bridge Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 18/97] x86/paravirt: Fix spectre-v2 mitigations for paravirt guests Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 19/97] x86/speculation: Protect against userspace-userspace spectreRSB Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 20/97] kprobes/x86: Fix %p uses in error messages Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 21/97] x86/irqflags: Provide a declaration for native_save_fl Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 22/97] x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_SHIFT Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 23/97] x86/speculation/l1tf: Change order of offset/type in swap entry Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 24/97] x86/speculation/l1tf: Protect swap entries against L1TF Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 25/97] x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 26/97] x86/speculation/l1tf: Make sure the first page is always reserved Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 27/97] x86/speculation/l1tf: Add sysfs reporting for l1tf Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 28/97] x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 29/97] x86/speculation/l1tf: Limit swap file size to MAX_PA/2 Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 30/97] x86/bugs: Move the l1tf function and define pr_fmt properly Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 31/97] sched/smt: Update sched_smt_present at runtime Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 32/97] x86/smp: Provide topology_is_primary_thread() Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 33/97] x86/topology: Provide topology_smt_supported() Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 34/97] cpu/hotplug: Make bringup/teardown of smp threads symmetric Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 35/97] cpu/hotplug: Split do_cpu_down() Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 36/97] cpu/hotplug: Provide knobs to control SMT Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 37/97] x86/cpu: Remove the pointless CPU printout Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 38/97] x86/cpu/AMD: Remove the pointless detect_ht() call Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 39/97] x86/cpu/common: Provide detect_ht_early() Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 40/97] x86/cpu/topology: Provide detect_extended_topology_early() Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 41/97] x86/cpu/intel: Evaluate smp_num_siblings early Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 42/97] x86/CPU/AMD: Do not check CPUID max ext level before parsing SMP info Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 43/97] x86/cpu/AMD: Evaluate smp_num_siblings early Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 45/97] x86/speculation/l1tf: Extend 64bit swap file size limit Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 46/97] x86/cpufeatures: Add detection of L1D cache flush support Greg Kroah-Hartman
2018-08-14 17:16 ` [PATCH 4.17 47/97] x86/CPU/AMD: Move TOPOEXT reenablement before reading smp_num_siblings Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 48/97] x86/speculation/l1tf: Protect PAE swap entries against L1TF Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 49/97] x86/speculation/l1tf: Fix up pte->pfn conversion for PAE Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 50/97] Revert "x86/apic: Ignore secondary threads if nosmt=force" Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 51/97] cpu/hotplug: Boot HT siblings at least once Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 52/97] x86/KVM: Warn user if KVM is loaded SMT and L1TF CPU bug being present Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 53/97] x86/KVM/VMX: Add module argument for L1TF mitigation Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 54/97] x86/KVM/VMX: Add L1D flush algorithm Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 55/97] x86/KVM/VMX: Add L1D MSR based flush Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 56/97] x86/KVM/VMX: Add L1D flush logic Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 57/97] x86/KVM/VMX: Split the VMX MSR LOAD structures to have an host/guest numbers Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 58/97] x86/KVM/VMX: Add find_msr() helper function Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 59/97] x86/KVM/VMX: Separate the VMX AUTOLOAD guest/host number accounting Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 60/97] x86/KVM/VMX: Extend add_atomic_switch_msr() to allow VMENTER only MSRs Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 61/97] x86/KVM/VMX: Use MSR save list for IA32_FLUSH_CMD if required Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 62/97] cpu/hotplug: Online siblings when SMT control is turned on Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 63/97] x86/litf: Introduce vmx status variable Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 64/97] x86/kvm: Drop L1TF MSR list approach Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 65/97] x86/l1tf: Handle EPT disabled state proper Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 66/97] x86/kvm: Move l1tf setup function Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 67/97] x86/kvm: Add static key for flush always Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 68/97] x86/kvm: Serialize L1D flush parameter setter Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 69/97] x86/kvm: Allow runtime control of L1D flush Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 70/97] cpu/hotplug: Expose SMT control init function Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 71/97] cpu/hotplug: Set CPU_SMT_NOT_SUPPORTED early Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 72/97] x86/bugs, kvm: Introduce boot-time control of L1TF mitigations Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 73/97] Documentation: Add section about CPU vulnerabilities Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 74/97] x86/speculation/l1tf: Unbreak !__HAVE_ARCH_PFN_MODIFY_ALLOWED architectures Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 75/97] x86/KVM/VMX: Initialize the vmx_l1d_flush_pages content Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 76/97] Documentation/l1tf: Fix typos Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 77/97] cpu/hotplug: detect SMT disabled by BIOS Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 78/97] x86/KVM/VMX: Dont set l1tf_flush_l1d to true from vmx_l1d_flush() Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 79/97] x86/KVM/VMX: Replace vmx_l1d_flush_always with vmx_l1d_flush_cond Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 80/97] x86/KVM/VMX: Move the l1tf_flush_l1d test to vmx_l1d_flush() Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 81/97] x86/irq: Demote irq_cpustat_t::__softirq_pending to u16 Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 82/97] x86/KVM/VMX: Introduce per-host-cpu analogue of l1tf_flush_l1d Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 83/97] x86: Dont include linux/irq.h from asm/hardirq.h Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 84/97] x86/irq: Let interrupt handlers set kvm_cpu_l1tf_flush_l1d Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 85/97] x86/KVM/VMX: Dont set l1tf_flush_l1d from vmx_handle_external_intr() Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 86/97] Documentation/l1tf: Remove Yonah processors from not vulnerable list Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 87/97] x86/speculation: Simplify sysfs report of VMX L1TF vulnerability Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 88/97] x86/speculation: Use ARCH_CAPABILITIES to skip L1D flush on vmentry Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 89/97] KVM: VMX: Tell the nested hypervisor " Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 90/97] cpu/hotplug: Fix SMT supported evaluation Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 91/97] x86/speculation/l1tf: Invert all not present mappings Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 92/97] x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 93/97] x86/mm/pat: Make set_memory_np() L1TF safe Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 94/97] x86/mm/kmmio: Make the tracer robust against L1TF Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 95/97] tools headers: Synchronize prctl.h ABI header Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 96/97] tools headers: Synchronise x86 cpufeatures.h for L1TF additions Greg Kroah-Hartman
2018-08-14 17:17 ` [PATCH 4.17 97/97] x86/microcode: Allow late microcode loading with SMT disabled Greg Kroah-Hartman
2018-08-15  6:14 ` [PATCH 4.17 00/97] 4.17.15-stable review Greg Kroah-Hartman
2018-08-15 13:15 ` Guenter Roeck
2018-08-15 20:31 ` Dan Rue
2018-08-16 10:08   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180814171433.772941377@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=jannh@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).