linux-kernel.vger.kernel.org archive mirror
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Nathan Lynch <nathanl@linux.ibm.com>,
	"Gautham R. Shenoy" <ego@linux.vnet.ibm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 4.4 06/36] powerpc/rtas: use device model APIs and serialization during LPM
Date: Sun,  6 Oct 2019 19:18:48 +0200
Message-ID: <20191006171044.425420392@linuxfoundation.org>
In-Reply-To: <20191006171038.266461022@linuxfoundation.org>

From: Nathan Lynch <nathanl@linux.ibm.com>

[ Upstream commit a6717c01ddc259f6f73364779df058e2c67309f8 ]

The LPAR migration implementation and userspace-initiated cpu hotplug
can interleave their executions like so:

1. Set cpu 7 offline via sysfs.

2. Begin a partition migration, whose implementation requires the OS
   to ensure all present cpus are online; cpu 7 is onlined:

     rtas_ibm_suspend_me -> rtas_online_cpus_mask -> cpu_up

   This sets cpu 7 online in all respects except for the cpu's
   corresponding struct device; dev->offline remains true.

3. Set cpu 7 online via sysfs. _cpu_up() determines that cpu 7 is
   already online and returns success. The driver core (device_online)
   sets dev->offline = false.

4. The migration completes and restores cpu 7 to offline state:

     rtas_ibm_suspend_me -> rtas_offline_cpus_mask -> cpu_down

This leaves cpu 7 in a state where the driver core considers the cpu
device online, but in all other respects it is offline and
unused. Attempts to online the cpu via sysfs appear to succeed but the
driver core actually does not pass the request to the lower-level
cpuhp support code. This makes the cpu unusable until the cpu device
is manually set offline and then online again via sysfs.

Instead of directly calling cpu_up/cpu_down, the migration code should
use the higher-level device core APIs to maintain consistent state and
serialize operations.

Fixes: 120496ac2d2d ("powerpc: Bring all threads online prior to migration/hibernation")
Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802192926.19277-2-nathanl@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/powerpc/kernel/rtas.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 5a753fae8265a..0c42e872d548b 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -857,15 +857,17 @@ static int rtas_cpu_state_change_mask(enum rtas_cpu_state state,
 		return 0;
 
 	for_each_cpu(cpu, cpus) {
+		struct device *dev = get_cpu_device(cpu);
+
 		switch (state) {
 		case DOWN:
-			cpuret = cpu_down(cpu);
+			cpuret = device_offline(dev);
 			break;
 		case UP:
-			cpuret = cpu_up(cpu);
+			cpuret = device_online(dev);
 			break;
 		}
-		if (cpuret) {
+		if (cpuret < 0) {
 			pr_debug("%s: cpu_%s for cpu#%d returned %d.\n",
 					__func__,
 					((state == UP) ? "up" : "down"),
@@ -954,6 +956,8 @@ int rtas_ibm_suspend_me(u64 handle)
 	data.token = rtas_token("ibm,suspend-me");
 	data.complete = &done;
 
+	lock_device_hotplug();
+
 	/* All present CPUs must be online */
 	cpumask_andnot(offline_mask, cpu_present_mask, cpu_online_mask);
 	cpuret = rtas_online_cpus_mask(offline_mask);
@@ -985,6 +989,7 @@ int rtas_ibm_suspend_me(u64 handle)
 				__func__);
 
 out:
+	unlock_device_hotplug();
 	free_cpumask_var(offline_mask);
 	return atomic_read(&data.error);
 }
-- 
2.20.1



