All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Tianyu Lan <Tianyu.Lan@microsoft.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Konrad Wilk <konrad.wilk@oracle.com>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Mukesh Ojha <mojha@codeaurora.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Jiri Kosina <jkosina@suse.cz>, Rik van Riel <riel@surriel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Micheal Kelley <michael.h.kelley@microsoft.com>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Borislav Petkov <bp@alien8.de>
Subject: [PATCH 4.9 49/56] cpu/hotplug: Prevent crash when CPU bringup fails on CONFIG_HOTPLUG_CPU=n
Date: Mon,  1 Apr 2019 19:03:05 +0200	[thread overview]
Message-ID: <20190401170106.565033962@linuxfoundation.org> (raw)
In-Reply-To: <20190401170103.398401360@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

commit 206b92353c839c0b27a0b9bec24195f93fd6cf7a upstream.

Tianyu reported a crash in a CPU hotplug teardown callback when booting a
kernel which has CONFIG_HOTPLUG_CPU disabled with the 'nosmt' boot
parameter.

It turns out that the SMP=y CONFIG_HOTPLUG_CPU=n case has been broken
forever in case that a bringup callback fails. Unfortunately this issue was
not recognized when the CPU hotplug code was reworked, so the shortcoming
just stayed in place.

When a bringup callback fails, the CPU hotplug code rolls back the
operation and takes the CPU offline.

The 'nosmt' command line argument uses a bringup failure to abort the
bringup of SMT sibling CPUs. This partial bringup is required due to the
MCE misdesign on Intel CPUs.

With CONFIG_HOTPLUG_CPU=y the rollback works perfectly fine, but
CONFIG_HOTPLUG_CPU=n lacks essential mechanisms to exercise the low level
teardown of a CPU including the synchronizations in various facilities like
RCU, NOHZ and others.

As a consequence the teardown callbacks which must be executed on the
outgoing CPU within stop machine with interrupts disabled are executed on
the control CPU in interrupt enabled and preemptible context causing the
kernel to crash and burn. The pre state machine code has a different
failure mode which is more subtle and resulting in a less obvious use after
free crash because the control side frees resources which are still in use
by the undead CPU.

But this is not a x86 only problem. Any architecture which supports the
SMP=y HOTPLUG_CPU=n combination suffers from the same issue. It's just less
likely to be triggered because in 99.99999% of the cases all bringup
callbacks succeed.

The easy solution of making HOTPLUG_CPU mandatory for SMP is not working on
all architectures as the following architectures have either no hotplug
support at all or not all subarchitectures support it:

 alpha, arc, hexagon, openrisc, riscv, sparc (32bit), mips (partial).

Crashing the kernel in such a situation is not an acceptable state
either.

Implement a minimal rollback variant by limiting the teardown to the point
where all regular teardown callbacks have been invoked and leave the CPU in
the 'dead' idle state. This has the following consequences:

 - the CPU is brought down to the point where the stop_machine takedown
   would happen.

 - the CPU stays there forever and is idle

 - The CPU is cleared in the CPU active mask, but not in the CPU online
   mask which is a legit state.

 - Interrupts are not forced away from the CPU

 - All facilities which only look at online mask would still see it, but
   that is the case during normal hotplug/unplug operations as well. It's
   just a (way) longer time frame.

This will expose issues, which haven't been exposed before or only seldom,
because now the normally transient state of being non active but online is
a permanent state. In testing this exposed already an issue vs. work queues
where the vmstat code schedules work on the almost dead CPU which ends up
in an unbound workqueue and triggers 'preemtible context' warnings. This is
not a problem of this change, it merily exposes an already existing issue.
Still this is better than crashing fully without a chance to debug it.

This is mainly thought as workaround for those architectures which do not
support HOTPLUG_CPU. All others should enforce HOTPLUG_CPU for SMP.

Fixes: 2e1a3483ce74 ("cpu/hotplug: Split out the state walk into functions")
Reported-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Mukesh Ojha <mojha@codeaurora.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Micheal Kelley <michael.h.kelley@microsoft.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190326163811.503390616@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 kernel/cpu.c |   20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -591,6 +591,20 @@ static void undo_cpu_up(unsigned int cpu
 	}
 }
 
+static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
+{
+	if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
+		return true;
+	/*
+	 * When CPU hotplug is disabled, then taking the CPU down is not
+	 * possible because takedown_cpu() and the architecture and
+	 * subsystem specific mechanisms are not available. So the CPU
+	 * which would be completely unplugged again needs to stay around
+	 * in the current state.
+	 */
+	return st->state <= CPUHP_BRINGUP_CPU;
+}
+
 static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
 			      enum cpuhp_state target)
 {
@@ -601,8 +615,10 @@ static int cpuhp_up_callbacks(unsigned i
 		st->state++;
 		ret = cpuhp_invoke_callback(cpu, st->state, true, NULL);
 		if (ret) {
-			st->target = prev_state;
-			undo_cpu_up(cpu, st);
+			if (can_rollback_cpu(st)) {
+				st->target = prev_state;
+				undo_cpu_up(cpu, st);
+			}
 			break;
 		}
 	}



  parent reply	other threads:[~2019-04-01 17:28 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-01 17:02 [PATCH 4.9 00/56] 4.9.167-stable review Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 01/56] Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 02/56] Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 03/56] video: fbdev: Set pixclock = 0 in goldfishfb Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 04/56] cfg80211: size various nl80211 messages correctly Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 05/56] stmmac: copy unicast mac address to MAC registers Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 06/56] dccp: do not use ipv6 header for ipv4 flow Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 07/56] mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 08/56] net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 09/56] net: rose: fix a possible stack overflow Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 10/56] packets: Always register packet sk in the same order Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 11/56] tcp: do not use ipv6 header for ipv4 flow Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 12/56] vxlan: Dont call gro_cells_destroy() before device is unregistered Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 13/56] sctp: get sctphdr by offset in sctp_compute_cksum Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 14/56] mac8390: Fix mmio access size probe Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 15/56] tun: properly test for IFF_UP Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 16/56] tun: add a missing rcu_read_unlock() in error path Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 17/56] btrfs: remove WARN_ON in log_dir_items Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 18/56] btrfs: raid56: properly unmap parity page in finish_parity_scrub() Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 19/56] ARM: imx6q: cpuidle: fix bug that CPU might not wake up at expected time Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 20/56] powerpc: bpf: Fix generation of load/store DW instructions Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 21/56] NFSv4.1 dont free interrupted slot on open Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 22/56] net: dsa: qca8k: remove leftover phy accessors Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 23/56] ALSA: rawmidi: Fix potential Spectre v1 vulnerability Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 24/56] ALSA: seq: oss: Fix " Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 25/56] ALSA: pcm: Fix possible OOB access in PCM oss plugins Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 26/56] ALSA: pcm: Dont suspend stream in unrecoverable PCM state Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 27/56] fs/open.c: allow opening only regular files during execve() Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 28/56] scsi: sd: Fix a race between closing an sd device and sd I/O Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 29/56] scsi: sd: Quiesce warning if device does not report optimal I/O size Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 30/56] scsi: zfcp: fix rport unblock if deleted SCSI devices on Scsi_Host Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 31/56] scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 32/56] tty: atmel_serial: fix a potential NULL pointer dereference Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 33/56] staging: comedi: ni_mio_common: Fix divide-by-zero for DIO cmdtest Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 34/56] staging: vt6655: Remove vif check from vnt_interrupt Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 35/56] staging: vt6655: Fix interrupt race condition on device start up Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 36/56] serial: max310x: Fix to avoid potential NULL pointer dereference Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 37/56] serial: sh-sci: Fix setting SCSCR_TIE while transferring data Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 38/56] USB: serial: cp210x: add new device id Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 39/56] USB: serial: ftdi_sio: add additional NovaTech products Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 40/56] USB: serial: mos7720: fix mos_parport refcount imbalance on error path Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 41/56] USB: serial: option: set driver_info for SIM5218 and compatibles Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 42/56] USB: serial: option: add Olicard 600 Greg Kroah-Hartman
2019-04-01 17:02 ` [PATCH 4.9 43/56] Disable kgdboc failed by echo space to /sys/module/kgdboc/parameters/kgdboc Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 44/56] fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 45/56] gpio: adnp: Fix testing wrong value in adnp_gpio_direction_input Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 46/56] usb: common: Consider only available nodes for dr_mode Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 47/56] usb: host: xhci-rcar: Add XHCI_TRUST_TX_LENGTH quirk Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 48/56] perf intel-pt: Fix TSC slip Greg Kroah-Hartman
2019-04-01 17:03 ` Greg Kroah-Hartman [this message]
2019-04-01 17:03 ` [PATCH 4.9 50/56] x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 51/56] KVM: Reject device ioctls from processes other than the VMs creator Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 52/56] KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 53/56] USB: gadget: f_hid: fix deadlock in f_hidg_write() Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 54/56] xhci: Fix port resume done detection for SS ports with LPM enabled Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 55/56] Revert "USB: core: only clean up what we allocated" Greg Kroah-Hartman
2019-04-01 17:03 ` [PATCH 4.9 56/56] arm64: support keyctl() system call in 32-bit mode Greg Kroah-Hartman
2019-04-01 21:03 ` [PATCH 4.9 00/56] 4.9.167-stable review kernelci.org bot
2019-04-02  3:01 ` Naresh Kamboju
2019-04-02  9:03 ` Jon Hunter
2019-04-02  9:03   ` Jon Hunter
2019-04-02 19:04 ` Guenter Roeck
2019-04-02 23:54 ` shuah

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190401170106.565033962@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=Tianyu.Lan@microsoft.com \
    --cc=bp@alien8.de \
    --cc=jkosina@suse.cz \
    --cc=jpoimboe@redhat.com \
    --cc=konrad.wilk@oracle.com \
    --cc=kys@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=michael.h.kelley@microsoft.com \
    --cc=mojha@codeaurora.org \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.