linux-kernel.vger.kernel.org archive mirror
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Jiri Kosina <jkosina@suse.cz>,
	Pavel Machek <pavel@ucw.cz>, Thomas Gleixner <tglx@linutronix.de>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Subject: [PATCH 5.1 39/70] x86/power: Fix nosmt vs hibernation triple fault during resume
Date: Sun,  9 Jun 2019 18:41:50 +0200
Message-ID: <20190609164130.488111346@linuxfoundation.org>
In-Reply-To: <20190609164127.541128197@linuxfoundation.org>

From: Jiri Kosina <jkosina@suse.cz>

commit ec527c318036a65a083ef68d8ba95789d2212246 upstream.

As explained in

	0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")

we always, no matter what, have to bring up x86 HT siblings during boot at
least once in order to avoid the first MCE bringing the system to its knees.

That means that whenever 'nosmt' is supplied on the kernel command line,
all the HT siblings are as a result sitting in mwait or cpuidle after
going through the online-offline cycle at least once.

This causes a serious issue though when a kernel which saw 'nosmt' on its
command line is going to perform resume from hibernation: if the resume
from the hibernated image is successful, cr3 is flipped in order to point
to the address space of the kernel that is being resumed, which in turn
means that all the HT siblings are all of a sudden mwaiting on an address
which is no longer valid.

That results in a triple fault shortly after cr3 is switched, and the
machine reboots.

Fix this by always waking up all the SMT siblings before initiating the
'restore from hibernation' process; this guarantees that all the HT
siblings will be properly carried over to the resumed kernel waiting in
resume_play_dead(), and acted upon accordingly afterwards, based on the
target kernel configuration.

Symmetrically, the resumed kernel has to push the SMT siblings to mwait
again in case it has SMT disabled; this means it has to online all
the siblings when resuming (so that they come out of hlt) and offline
them again to let them reach mwait.

Cc: 4.19+ <stable@vger.kernel.org> # v4.19+
Debugged-by: Thomas Gleixner <tglx@linutronix.de>
Fixes: 0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 arch/x86/power/cpu.c       |   10 ++++++++++
 arch/x86/power/hibernate.c |   33 +++++++++++++++++++++++++++++++++
 include/linux/cpu.h        |    4 ++++
 kernel/cpu.c               |    4 ++--
 kernel/power/hibernate.c   |    9 +++++++++
 5 files changed, 58 insertions(+), 2 deletions(-)

--- a/arch/x86/power/cpu.c
+++ b/arch/x86/power/cpu.c
@@ -299,7 +299,17 @@ int hibernate_resume_nonboot_cpu_disable
 	 * address in its instruction pointer may not be possible to resolve
 	 * any more at that point (the page tables used by it previously may
 	 * have been overwritten by hibernate image data).
+	 *
+	 * First, make sure that we wake up all the potentially disabled SMT
+	 * threads which have been initially brought up and then put into
+	 * mwait/cpuidle sleep.
+	 * Those will be put to proper (not interfering with hibernation
+	 * resume) sleep afterwards, and the resumed kernel will decide itself
+	 * what to do with them.
 	 */
+	ret = cpuhp_smt_enable();
+	if (ret)
+		return ret;
 	smp_ops.play_dead = resume_play_dead;
 	ret = disable_nonboot_cpus();
 	smp_ops.play_dead = play_dead;
--- a/arch/x86/power/hibernate.c
+++ b/arch/x86/power/hibernate.c
@@ -11,6 +11,7 @@
 #include <linux/suspend.h>
 #include <linux/scatterlist.h>
 #include <linux/kdebug.h>
+#include <linux/cpu.h>
 
 #include <crypto/hash.h>
 
@@ -246,3 +247,35 @@ out:
 	__flush_tlb_all();
 	return 0;
 }
+
+int arch_resume_nosmt(void)
+{
+	int ret = 0;
+	/*
+	 * We reached this while coming out of hibernation. This means
+	 * that SMT siblings are sleeping in hlt, as mwait is not safe
+	 * against control transition during resume (see comment in
+	 * hibernate_resume_nonboot_cpu_disable()).
+	 *
+	 * If the resumed kernel has SMT disabled, we have to take all the
+	 * SMT siblings out of hlt, and offline them again so that they
+	 * end up in mwait proper.
+	 *
+	 * Called with hotplug disabled.
+	 */
+	cpu_hotplug_enable();
+	if (cpu_smt_control == CPU_SMT_DISABLED ||
+			cpu_smt_control == CPU_SMT_FORCE_DISABLED) {
+		enum cpuhp_smt_control old = cpu_smt_control;
+
+		ret = cpuhp_smt_enable();
+		if (ret)
+			goto out;
+		ret = cpuhp_smt_disable(old);
+		if (ret)
+			goto out;
+	}
+out:
+	cpu_hotplug_disable();
+	return ret;
+}
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -183,10 +183,14 @@ enum cpuhp_smt_control {
 extern enum cpuhp_smt_control cpu_smt_control;
 extern void cpu_smt_disable(bool force);
 extern void cpu_smt_check_topology(void);
+extern int cpuhp_smt_enable(void);
+extern int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval);
 #else
 # define cpu_smt_control		(CPU_SMT_ENABLED)
 static inline void cpu_smt_disable(bool force) { }
 static inline void cpu_smt_check_topology(void) { }
+static inline int cpuhp_smt_enable(void) { return 0; }
+static inline int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) { return 0; }
 #endif
 
 /*
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2064,7 +2064,7 @@ static void cpuhp_online_cpu_device(unsi
 	kobject_uevent(&dev->kobj, KOBJ_ONLINE);
 }
 
-static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
+int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 {
 	int cpu, ret = 0;
 
@@ -2096,7 +2096,7 @@ static int cpuhp_smt_disable(enum cpuhp_
 	return ret;
 }
 
-static int cpuhp_smt_enable(void)
+int cpuhp_smt_enable(void)
 {
 	int cpu, ret = 0;
 
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -258,6 +258,11 @@ void swsusp_show_speed(ktime_t start, kt
 		(kps % 1000) / 10);
 }
 
+__weak int arch_resume_nosmt(void)
+{
+	return 0;
+}
+
 /**
  * create_image - Create a hibernation image.
  * @platform_mode: Whether or not to use the platform driver.
@@ -325,6 +330,10 @@ static int create_image(int platform_mod
  Enable_cpus:
 	enable_nonboot_cpus();
 
+	/* Allow architectures to do nosmt-specific post-resume dances */
+	if (!in_suspend)
+		error = arch_resume_nosmt();
+
  Platform_finish:
 	platform_finish(platform_mode);
 



