kvmarm.lists.cs.columbia.edu archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
@ 2023-02-03 13:50 James Morse
  2023-02-03 13:50 ` [RFC PATCH 01/32] ia64: Fix build error due to switch case label appearing next to declaration James Morse
                   ` (34 more replies)
  0 siblings, 35 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

Hello!

This series adds what looks like cpuhotplug support to arm64 for use in
virtual machines. It does this by moving the cpu_register() calls for
architectures that support ACPI out of the arch code by using
GENERIC_CPU_DEVICES, then into the ACPI processor driver.

The kubernetes folk really want to be able to add CPUs to an existing VM,
in exactly the same way they do on x86. The use-case is pre-booting guests
with one CPU, then adding the number that were actually needed when the
workload is provisioned.

Wait? Doesn't arm64 support cpuhotplug already!?
In the arm world, cpuhotplug gets used to mean removing the power from a CPU.
The CPU is offline, and remains present. For x86, and ACPI, cpuhotplug
has the additional step of physically removing the CPU, so that it isn't
present anymore.

Arm64 doesn't support this, and can't support it: CPUs are really a slice
of the SoC, and there is not enough information in the existing ACPI tables
to describe which bits of the slice also got removed. Without a reference
machine: adding this support to the spec is a wild goose chase.

Critically: everything described in the firmware tables must remain present.

For a virtual machine this is easy as all the other bits of 'virtual SoC'
are emulated, so they can (and do) remain present when a vCPU is 'removed'.

On a system that supports cpuhotplug the MADT has to describe every possible
CPU at boot. Under KVM, the vGIC needs to know about every possible vCPU before
the guest is started.
With these constraints, virtual-cpuhotplug is really just a hypervisor/firmware
policy about which CPUs can be brought online.

This series adds support for virtual-cpuhotplug as exactly that: firmware
policy. This may even work on a physical machine too; for a guest the part of
firmware is played by the VMM. (typically Qemu).

PSCI support is modified to return 'DENIED' if the CPU can't be brought
online/enabled yet. The CPU object's _STA method's enabled bit is used to
indicate firmware's current disposition. If the CPU has its enabled bit clear,
it will not be registered with sysfs, and attempts to bring it online will
fail. The notifications that _STA has changed its value then work in the same
way as physical hotplug, and firmware can cause the CPU to be registered some
time later, allowing it to be brought online.

This creates something that looks like cpuhotplug to user-space, as the sysfs
files appear and disappear, and the udev notifications look the same.

One notable difference is the CPU present mask, which is exposed via sysfs.
Because the CPUs remain present throughout, they can still be seen in that mask.
This value does get used by webbrowsers to estimate the number of CPUs
as the CPU online mask is constantly changed on mobile phones.

Linux is tolerant of PSCI returning errors, as its always been allowed to do
that. To avoid confusing OS that can't tolerate this, we needed an additional
bit in the MADT GICC flags. This series copies ACPI_MADT_ONLINE_CAPABLE, which
appears to be for this purpose, but calls it ACPI_MADT_GICC_CPU_CAPABLE as it
has a different bit position in the GICC.

This code is unconditionally enabled for all ACPI architectures.
If there are problems with firmware tables on some devices, the CPUs will
already be online by the time the acpi_processor_make_enabled() is called.
A mismatch here causes a firmware-bug message and kernel taint. This should
only affect people with broken firmware who also boot with maxcpus=1, and
bring CPUs online later.

I had a go at switching the remaining architectures over to GENERIC_CPU_DEVICES,
so that the Kconfig symbol can be removed, but I got stuck with powerpc
and s390.


The first patch has already been posted as a fix here:
https://www.spinics.net/lists/linux-ia64/msg21920.html
I've only build tested Loongarch and ia64.


If folk want to play along at home, you'll need a copy of Qemu that supports this.
https://github.com/salil-mehta/qemu.git salil/virt-cpuhp-armv8/rfc-v1-port29092022.psci.present

You'll need to fix the numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
to match your host kernel. Replace your '-smp' argument with something like:
| -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1

then feed the following to the Qemu montior;
| (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
| (qemu) device_del cpu1


This series is based on v6.2-rc3, and can be retrieved from:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v1


Thanks,

James Morse (29):
  ia64: Fix build error due to switch case label appearing next to
    declaration
  ACPI: Move ACPI_HOTPLUG_CPU to be enabled per architecture
  drivers: base: Use present CPUs in GENERIC_CPU_DEVICES
  drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden
  drivers: base: Move cpu_dev_init() after node_dev_init()
  arm64: setup: Switch over to GENERIC_CPU_DEVICES using
    arch_register_cpu()
  ia64/topology: Switch over to GENERIC_CPU_DEVICES
  x86/topology: Switch over to GENERIC_CPU_DEVICES
  LoongArch: Switch over to GENERIC_CPU_DEVICES
  arch_topology: Make register_cpu_capacity_sysctl() tolerant to late
    CPUs
  ACPI: processor: Add support for processors described as container
    packages
  ACPI: processor: Register CPUs that are online, but not described in
    the DSDT
  ACPI: processor: Register all CPUs from acpi_processor_get_info()
  ACPI: Rename ACPI_HOTPLUG_CPU to include 'present'
  ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove()
  ACPI: Rename acpi_processor_hotadd_init and remove pre-processor
    guards
  ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
  ACPI: Check _STA present bit before making CPUs not present
  ACPI: Warn when the present bit changes but the feature is not enabled
  drivers: base: Implement weak arch_unregister_cpu()
  LoongArch: Use the __weak version of arch_unregister_cpu()
  arm64: acpi: Move get_cpu_for_acpi_id() to a header
  ACPICA: Add new MADT GICC flags fields [code first?]
  arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a
    helper
  irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
  irqchip/gic-v3: Add support for ACPI's disabled but 'online capable'
    CPUs
  ACPI: add support to register CPUs based on the _STA enabled bit
  arm64: document virtual CPU hotplug's expectations
  cpumask: Add enabled cpumask for present CPUs that can be brought
    online

Jean-Philippe Brucker (3):
  arm64: psci: Ignore DENIED CPUs
  KVM: arm64: Pass hypercalls to userspace
  KVM: arm64: Pass PSCI calls to userspace

 Documentation/arm64/cpu-hotplug.rst       |  79 ++++++++++++
 Documentation/arm64/index.rst             |   1 +
 Documentation/virt/kvm/api.rst            |  31 ++++-
 Documentation/virt/kvm/arm/hypercalls.rst |   1 +
 arch/arm64/Kconfig                        |   1 +
 arch/arm64/include/asm/acpi.h             |  11 ++
 arch/arm64/include/asm/cpu.h              |   1 -
 arch/arm64/include/asm/kvm_host.h         |   2 +
 arch/arm64/kernel/acpi_numa.c             |  11 --
 arch/arm64/kernel/psci.c                  |   2 +-
 arch/arm64/kernel/setup.c                 |  13 +-
 arch/arm64/kernel/smp.c                   |   5 +-
 arch/arm64/kvm/arm.c                      |  15 ++-
 arch/arm64/kvm/hypercalls.c               |  28 ++++-
 arch/arm64/kvm/psci.c                     |  13 ++
 arch/ia64/Kconfig                         |   2 +
 arch/ia64/include/asm/acpi.h              |   2 +-
 arch/ia64/include/asm/cpu.h               |  11 --
 arch/ia64/kernel/acpi.c                   |   6 +-
 arch/ia64/kernel/setup.c                  |   2 +-
 arch/ia64/kernel/sys_ia64.c               |   7 +-
 arch/ia64/kernel/topology.c               |  35 +-----
 arch/loongarch/Kconfig                    |   2 +
 arch/loongarch/kernel/topology.c          |  31 +----
 arch/x86/Kconfig                          |   2 +
 arch/x86/include/asm/cpu.h                |   6 -
 arch/x86/kernel/acpi/boot.c               |   4 +-
 arch/x86/kernel/topology.c                |  19 +--
 drivers/acpi/Kconfig                      |   5 +-
 drivers/acpi/acpi_processor.c             | 146 +++++++++++++++++-----
 drivers/acpi/processor_core.c             |   2 +-
 drivers/acpi/scan.c                       | 116 +++++++++++------
 drivers/base/arch_topology.c              |  38 ++++--
 drivers/base/cpu.c                        |  31 ++++-
 drivers/base/init.c                       |   2 +-
 drivers/firmware/psci/psci.c              |   2 +
 drivers/irqchip/irq-gic-v3.c              |  38 +++---
 include/acpi/acpi_bus.h                   |   1 +
 include/acpi/actbl2.h                     |   1 +
 include/kvm/arm_hypercalls.h              |   1 +
 include/kvm/arm_psci.h                    |   4 +
 include/linux/acpi.h                      |  10 +-
 include/linux/cpu.h                       |   6 +
 include/linux/cpumask.h                   |  25 ++++
 include/uapi/linux/kvm.h                  |   2 +
 kernel/cpu.c                              |   3 +
 46 files changed, 532 insertions(+), 244 deletions(-)
 create mode 100644 Documentation/arm64/cpu-hotplug.rst

-- 
2.30.2


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [RFC PATCH 01/32] ia64: Fix build error due to switch case label appearing next to declaration
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 02/32] ACPI: Move ACPI_HOTPLUG_CPU to be enabled per architecture James Morse
                   ` (33 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

Since commit aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to
report ITC frequency"), gcc 10.1.0 fails to build ia64 with the gnomic:
| ../arch/ia64/kernel/sys_ia64.c: In function 'ia64_clock_getres':
| ../arch/ia64/kernel/sys_ia64.c:189:3: error: a label can only be part of a statement and a declaration is not a statement
|   189 |   s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);

This line appears immediately after a case label in a switch.

Move the declarations out of the case, to the top of the function.

Signed-off-by: James Morse <james.morse@arm.com>
---
This patch was previously posted as a fix here:
https://www.spinics.net/lists/linux-ia64/msg21920.html
Its included here to keep the kbuild robot quiet.

 arch/ia64/kernel/sys_ia64.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/ia64/kernel/sys_ia64.c b/arch/ia64/kernel/sys_ia64.c
index f6a502e8f02c..6e948d015332 100644
--- a/arch/ia64/kernel/sys_ia64.c
+++ b/arch/ia64/kernel/sys_ia64.c
@@ -170,6 +170,9 @@ ia64_mremap (unsigned long addr, unsigned long old_len, unsigned long new_len, u
 asmlinkage long
 ia64_clock_getres(const clockid_t which_clock, struct __kernel_timespec __user *tp)
 {
+	struct timespec64 rtn_tp;
+	s64 tick_ns;
+
 	/*
 	 * ia64's clock_gettime() syscall is implemented as a vdso call
 	 * fsys_clock_gettime(). Currently it handles only
@@ -185,8 +188,8 @@ ia64_clock_getres(const clockid_t which_clock, struct __kernel_timespec __user *
 	switch (which_clock) {
 	case CLOCK_REALTIME:
 	case CLOCK_MONOTONIC:
-		s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);
-		struct timespec64 rtn_tp = ns_to_timespec64(tick_ns);
+		tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);
+		rtn_tp = ns_to_timespec64(tick_ns);
 		return put_timespec64(&rtn_tp, tp);
 	}
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 02/32] ACPI: Move ACPI_HOTPLUG_CPU to be enabled per architecture
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
  2023-02-03 13:50 ` [RFC PATCH 01/32] ia64: Fix build error due to switch case label appearing next to declaration James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-08-30 18:31   ` Russell King (Oracle)
  2023-02-03 13:50 ` [RFC PATCH 03/32] drivers: base: Use present CPUs in GENERIC_CPU_DEVICES James Morse
                   ` (32 subsequent siblings)
  34 siblings, 1 reply; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

arm64 doesn't support physical hotadd of CPUs that were not present at boot.
Much of the platform description is in static tables which do not have
update methods. arm64 does support HOTPLUG_CPU, which is backed by a
firmware interface to turn CPUs on and off.

acpi_processor_hotadd_init() and acpi_processor_remove() are for adding
and removing CPUs that were not present at boot. arm64 systems that do this
are not supported as there is currently insufficient information in the
platform description. (e.g. did the GICR get removed too?)

arm64 currently relies on the MADT enabled flag check in map_gicc_mpidr()
to prevent CPUs that were not described as present at boot from being
added to the system. Adding support for virtual CPU hotplug (where the
vCPUs have been present the whole time) would require this check to be
removed, possibly allowing physical CPUs to be added.

Disable ACPI_HOTPLUG_CPU for arm64 by removing 'default y' and selecting
it on the other three ACPI architectures. This allows the weak definitions
of some symbols to be removed.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/ia64/Kconfig                |  1 +
 arch/loongarch/Kconfig           |  1 +
 arch/loongarch/include/asm/cpu.h |  7 +++++++
 arch/x86/Kconfig                 |  1 +
 drivers/acpi/Kconfig             |  1 -
 drivers/acpi/acpi_processor.c    | 18 ------------------
 6 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index d7e4a24e8644..deabb8843aea 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -15,6 +15,7 @@ config IA64
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select ACPI
+	select ACPI_HOTPLUG_CPU if ACPI
 	select ACPI_NUMA if NUMA
 	select ARCH_ENABLE_MEMORY_HOTPLUG
 	select ARCH_ENABLE_MEMORY_HOTREMOVE
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 9cc8b84f7eb0..075aa50e5d5f 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -5,6 +5,7 @@ config LOONGARCH
 	select ACPI
 	select ACPI_GENERIC_GSI if ACPI
 	select ACPI_MCFG if ACPI
+	select ACPI_HOTPLUG_CPU if ACPI
 	select ACPI_SYSTEM_POWER_STATES_SUPPORT	if ACPI
 	select ARCH_BINFMT_ELF_STATE
 	select ARCH_ENABLE_MEMORY_HOTPLUG
diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
index 754f28506791..da79862ff1f3 100644
--- a/arch/loongarch/include/asm/cpu.h
+++ b/arch/loongarch/include/asm/cpu.h
@@ -124,4 +124,11 @@ enum cpu_type_enum {
 #define LOONGARCH_CPU_GUESTID		BIT_ULL(CPU_FEATURE_GUESTID)
 #define LOONGARCH_CPU_HYPERVISOR	BIT_ULL(CPU_FEATURE_HYPERVISOR)
 
+#if !defined(__ASSEMBLY__)
+#ifdef CONFIG_HOTPLUG_CPU
+extern int arch_register_cpu(int num);
+extern void arch_unregister_cpu(int);
+#endif
+#endif /* ! __ASSEMBLY__ */
+
 #endif /* _ASM_CPU_H */
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3604074a878b..6a520c22c3eb 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -59,6 +59,7 @@ config X86
 	#
 	select ACPI_LEGACY_TABLES_LOOKUP	if ACPI
 	select ACPI_SYSTEM_POWER_STATES_SUPPORT	if ACPI
+	select ACPI_HOTPLUG_CPU			if ACPI
 	select ARCH_32BIT_OFF_T			if X86_32
 	select ARCH_CLOCKSOURCE_INIT
 	select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index ccbeab9500ec..4845e5b525ac 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -309,7 +309,6 @@ config ACPI_HOTPLUG_CPU
 	bool
 	depends on ACPI_PROCESSOR && HOTPLUG_CPU
 	select ACPI_CONTAINER
-	default y
 
 config ACPI_PROCESSOR_AGGREGATOR
 	tristate "Processor Aggregator"
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 6737b1cbf6d6..16b314340e68 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -150,24 +150,6 @@ static int acpi_processor_errata(void)
 
 /* Initialization */
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
-int __weak acpi_map_cpu(acpi_handle handle,
-		phys_cpuid_t physid, u32 acpi_id, int *pcpu)
-{
-	return -ENODEV;
-}
-
-int __weak acpi_unmap_cpu(int cpu)
-{
-	return -ENODEV;
-}
-
-int __weak arch_register_cpu(int cpu)
-{
-	return -ENODEV;
-}
-
-void __weak arch_unregister_cpu(int cpu) {}
-
 static int acpi_processor_hotadd_init(struct acpi_processor *pr)
 {
 	unsigned long long sta;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 03/32] drivers: base: Use present CPUs in GENERIC_CPU_DEVICES
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
  2023-02-03 13:50 ` [RFC PATCH 01/32] ia64: Fix build error due to switch case label appearing next to declaration James Morse
  2023-02-03 13:50 ` [RFC PATCH 02/32] ACPI: Move ACPI_HOTPLUG_CPU to be enabled per architecture James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 04/32] drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden James Morse
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

The four ACPI architectures only create sysfs entries using register_cpu()
for present CPUs, whereas GENERIC_CPU_DEVICES does this for possible CPUs.

Only two of the eight architectures that use GENERIC_CPU_DEVICES have a
distinction between present and possible CPUs.

To allow all four ACPI architectures to use GENERIC_CPU_DEVICES, change
it to use for_each_present_cpu().

The following architectures use GENERIC_CPU_DEVICES but are not SMP,
so possible == present:
 * m68k
 * microblaze
 * nios2

The following architectures use GENERIC_CPU_DEVICES and consider
possible == present:
 * csky: setup_smp()
 * hexagon: compare smp_start_cpus() and smp_prepare_cpus()
 * parisc: smp_prepare_boot_cpu() marks the boot cpu as present,
   processor_probe() sets possible for all CPUs and present for all CPUs
   except the boot cpu.

um appears to be a subarchitecture of x86.

The remaining architecture using GENERIC_CPU_DEVICES is openrisc,
where smp_init_cpus() makes all CPUs < NR_CPUS possible, whereas
smp_prepare_cpus() only makes CPUs < setup_max_cpus present. After this
change, openrisc systems that boot with max_cpus=1 would not see other
CPUs present in sysfs. This should not be a problem as these CPUs can't
be brought online as _cpu_up() checks cpu_present().

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/base/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 4c98849577d4..cf6407c34ede 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -500,7 +500,7 @@ static void __init cpu_dev_register_generic(void)
 #ifdef CONFIG_GENERIC_CPU_DEVICES
 	int i;
 
-	for_each_possible_cpu(i) {
+	for_each_present_cpu(i) {
 		if (register_cpu(&per_cpu(cpu_devices, i), i))
 			panic("Failed to register CPU device");
 	}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 04/32] drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (2 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 03/32] drivers: base: Use present CPUs in GENERIC_CPU_DEVICES James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 05/32] drivers: base: Move cpu_dev_init() after node_dev_init() James Morse
                   ` (30 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

architectures often have extra per-cpu work that needs doing
before a CPU is registered, often to determine if a CPU is
hotpluggable.

To allow more architectures to use GENERIC_CPU_DEVICES, wrap the call
as a __weak arch_register_cpu(). This aligns with the way x86, ia64
and loongarch register hotplug CPUs when they become present.

ACPI's acpi_processor.c also has a __weak version of this symbol
because arm64 doesn't define one. The duplicate __weak definitions
are only a problem if arm64 selects GENERIC_CPU_DEVICES without
defining one. This gets fixed up in later patches.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/ia64/include/asm/cpu.h      |  1 -
 arch/loongarch/include/asm/cpu.h |  1 -
 arch/x86/include/asm/cpu.h       |  1 -
 drivers/base/cpu.c               | 14 ++++++++++----
 include/linux/cpu.h              |  5 +++++
 5 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/ia64/include/asm/cpu.h b/arch/ia64/include/asm/cpu.h
index db125df9e088..a3e690e685e5 100644
--- a/arch/ia64/include/asm/cpu.h
+++ b/arch/ia64/include/asm/cpu.h
@@ -16,7 +16,6 @@ DECLARE_PER_CPU(struct ia64_cpu, cpu_devices);
 DECLARE_PER_CPU(int, cpu_state);
 
 #ifdef CONFIG_HOTPLUG_CPU
-extern int arch_register_cpu(int num);
 extern void arch_unregister_cpu(int);
 #endif
 
diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
index da79862ff1f3..1e2c7c61dbea 100644
--- a/arch/loongarch/include/asm/cpu.h
+++ b/arch/loongarch/include/asm/cpu.h
@@ -126,7 +126,6 @@ enum cpu_type_enum {
 
 #if !defined(__ASSEMBLY__)
 #ifdef CONFIG_HOTPLUG_CPU
-extern int arch_register_cpu(int num);
 extern void arch_unregister_cpu(int);
 #endif
 #endif /* ! __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index 78796b98a544..a0a62ac00e88 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -28,7 +28,6 @@ struct x86_cpu {
 };
 
 #ifdef CONFIG_HOTPLUG_CPU
-extern int arch_register_cpu(int num);
 extern void arch_unregister_cpu(int);
 extern void start_cpu0(void);
 #ifdef CONFIG_DEBUG_HOTPLUG_CPU0
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index cf6407c34ede..178936533d87 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -492,19 +492,25 @@ bool cpu_is_hotpluggable(unsigned int cpu)
 EXPORT_SYMBOL_GPL(cpu_is_hotpluggable);
 
 #ifdef CONFIG_GENERIC_CPU_DEVICES
-static DEFINE_PER_CPU(struct cpu, cpu_devices);
+DEFINE_PER_CPU(struct cpu, cpu_devices);
+
+int __weak arch_register_cpu(int cpu)
+{
+	return register_cpu(&per_cpu(cpu_devices, cpu), cpu);
+}
 #endif
 
 static void __init cpu_dev_register_generic(void)
 {
-#ifdef CONFIG_GENERIC_CPU_DEVICES
 	int i;
 
+	if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
+		return;
+
 	for_each_present_cpu(i) {
-		if (register_cpu(&per_cpu(cpu_devices, i), i))
+		if (arch_register_cpu(i))
 			panic("Failed to register CPU device");
 	}
-#endif
 }
 
 #ifdef CONFIG_GENERIC_CPU_VULNERABILITIES
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 314802f98b9d..86e79e702325 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -75,12 +75,17 @@ extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,
 				 const struct attribute_group **groups,
 				 const char *fmt, ...);
+extern int arch_register_cpu(int cpu);
 #ifdef CONFIG_HOTPLUG_CPU
 extern void unregister_cpu(struct cpu *cpu);
 extern ssize_t arch_cpu_probe(const char *, size_t);
 extern ssize_t arch_cpu_release(const char *, size_t);
 #endif
 
+#ifdef CONFIG_GENERIC_CPU_DEVICES
+DECLARE_PER_CPU(struct cpu, cpu_devices);
+#endif
+
 /*
  * These states are not related to the core CPU hotplug mechanism. They are
  * used by various (sub)architectures to track internal state
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 05/32] drivers: base: Move cpu_dev_init() after node_dev_init()
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (3 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 04/32] drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 06/32] arm64: setup: Switch over to GENERIC_CPU_DEVICES using arch_register_cpu() James Morse
                   ` (29 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

NUMA systems require the node descriptions to be ready before CPUs are
registered. This is so that the node symlinks can be created in sysfs.

Currently none of these platforms use GENERIC_CPU_DEVICES, meaning
that CPUs aren't registered by cpu_dev_init().

Move cpu_dev_init() after node_dev_init() so that NUMA architectures
can use GENERIC_CPU_DEVICES.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/base/init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/init.c b/drivers/base/init.c
index 397eb9880cec..c4954835128c 100644
--- a/drivers/base/init.c
+++ b/drivers/base/init.c
@@ -35,8 +35,8 @@ void __init driver_init(void)
 	of_core_init();
 	platform_bus_init();
 	auxiliary_bus_init();
-	cpu_dev_init();
 	memory_dev_init();
 	node_dev_init();
+	cpu_dev_init();
 	container_dev_init();
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 06/32] arm64: setup: Switch over to GENERIC_CPU_DEVICES using arch_register_cpu()
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (4 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 05/32] drivers: base: Move cpu_dev_init() after node_dev_init() James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 07/32] ia64/topology: Switch over to GENERIC_CPU_DEVICES James Morse
                   ` (28 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

To allow ACPI's _STA value to hide CPUs that are present, but not
available to online right now due to VMM of firmware policy, the
register_cpu() call needs to be made by the ACPI machinery when ACPI
is in use.

Switching to GENERIC_CPU_DEVICES is an intermediate step to allow all
four ACPI architectures to be modified at once.

Switch over to GENERIC_CPU_DEVICES, and provide an arch_register_cpu()
that populates the hotpluggable flag. arch_register_cpu() is also the
interface the ACPI machinery expects.

The struct cpu in struct cpuinfo_arm64 is never used directly, remove
it to use the one GENERIC_CPU_DEVICES provides.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/Kconfig           |  1 +
 arch/arm64/include/asm/cpu.h |  1 -
 arch/arm64/kernel/setup.c    | 13 ++++---------
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 03934808b2ed..6d272dad8eba 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -127,6 +127,7 @@ config ARM64
 	select GENERIC_ARCH_TOPOLOGY
 	select GENERIC_CLOCKEVENTS_BROADCAST
 	select GENERIC_CPU_AUTOPROBE
+	select GENERIC_CPU_DEVICES
 	select GENERIC_CPU_VULNERABILITIES
 	select GENERIC_EARLY_IOREMAP
 	select GENERIC_IDLE_POLL_SETUP
diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index fd7a92219eea..eca1aea89e81 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -38,7 +38,6 @@ struct cpuinfo_32bit {
 };
 
 struct cpuinfo_arm64 {
-	struct cpu	cpu;
 	struct kobject	kobj;
 	u64		reg_ctr;
 	u64		reg_cntfrq;
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 12cfe9d0d3fa..de6eb242faec 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -395,19 +395,14 @@ static inline bool cpu_can_disable(unsigned int cpu)
 	return false;
 }
 
-static int __init topology_init(void)
+int arch_register_cpu(int num)
 {
-	int i;
+	struct cpu *cpu = &per_cpu(cpu_devices, num);
 
-	for_each_possible_cpu(i) {
-		struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
-		cpu->hotpluggable = cpu_can_disable(i);
-		register_cpu(cpu, i);
-	}
+	cpu->hotpluggable = cpu_can_disable(num);
 
-	return 0;
+	return register_cpu(cpu, num);
 }
-subsys_initcall(topology_init);
 
 static void dump_kernel_offset(void)
 {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 07/32] ia64/topology: Switch over to GENERIC_CPU_DEVICES
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (5 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 06/32] arm64: setup: Switch over to GENERIC_CPU_DEVICES using arch_register_cpu() James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 08/32] x86/topology: " James Morse
                   ` (27 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

ia64 has its own arch specific data structure for cpus: struct ia64_cpu.
This has one member, making ia64's cpu_devices the same as that
provided be GENERIC_CPU_DEVICES.
ia64 craetes a percpu struct ia64_cpu called cpu_devices, which has no
users. Instead it uses the struct ia64_cpu named sysfs_cpus allocated at
boot.

Remove the arch specific structure allocation and initialisation.
ia64's arch_register_cpu() now overrides the weak version from
GENERIC_CPU_DEVICES, and uses the percpu cpu_devices defined by
core code.

All uses of sysfs_cpus are changed to use the percpu cpu_devices.

This is an intermediate step to the logic being moved to drivers/acpi,
where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/ia64/Kconfig           |  1 +
 arch/ia64/include/asm/cpu.h |  6 ------
 arch/ia64/kernel/topology.c | 35 +++++------------------------------
 3 files changed, 6 insertions(+), 36 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index deabb8843aea..146a2226e2a3 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -40,6 +40,7 @@ config IA64
 	select HAVE_FUNCTION_DESCRIPTORS
 	select HAVE_VIRT_CPU_ACCOUNTING
 	select HUGETLB_PAGE_SIZE_VARIABLE if HUGETLB_PAGE
+	select GENERIC_CPU_DEVICES
 	select GENERIC_IRQ_PROBE
 	select GENERIC_PENDING_IRQ if SMP
 	select GENERIC_IRQ_SHOW
diff --git a/arch/ia64/include/asm/cpu.h b/arch/ia64/include/asm/cpu.h
index a3e690e685e5..6e9786c6ec98 100644
--- a/arch/ia64/include/asm/cpu.h
+++ b/arch/ia64/include/asm/cpu.h
@@ -7,12 +7,6 @@
 #include <linux/topology.h>
 #include <linux/percpu.h>
 
-struct ia64_cpu {
-	struct cpu cpu;
-};
-
-DECLARE_PER_CPU(struct ia64_cpu, cpu_devices);
-
 DECLARE_PER_CPU(int, cpu_state);
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/ia64/kernel/topology.c b/arch/ia64/kernel/topology.c
index 94a848b06f15..8f5cafde2bc9 100644
--- a/arch/ia64/kernel/topology.c
+++ b/arch/ia64/kernel/topology.c
@@ -26,8 +26,6 @@
 #include <asm/numa.h>
 #include <asm/cpu.h>
 
-static struct ia64_cpu *sysfs_cpus;
-
 void arch_fix_phys_package_id(int num, u32 slot)
 {
 #ifdef CONFIG_SMP
@@ -41,50 +39,27 @@ EXPORT_SYMBOL_GPL(arch_fix_phys_package_id);
 #ifdef CONFIG_HOTPLUG_CPU
 int __ref arch_register_cpu(int num)
 {
+	struct cpu *cpu = &per_cpu(cpu_devices, num);
+
 	/*
 	 * If CPEI can be re-targeted or if this is not
 	 * CPEI target, then it is hotpluggable
 	 */
 	if (can_cpei_retarget() || !is_cpu_cpei_target(num))
-		sysfs_cpus[num].cpu.hotpluggable = 1;
+		cpu->hotpluggable = 1;
 	map_cpu_to_node(num, node_cpuid[num].nid);
-	return register_cpu(&sysfs_cpus[num].cpu, num);
+	return register_cpu(cpu, num);
 }
 EXPORT_SYMBOL(arch_register_cpu);
 
 void __ref arch_unregister_cpu(int num)
 {
-	unregister_cpu(&sysfs_cpus[num].cpu);
+	unregister_cpu(&per_cpu(cpu_devices, num));
 	unmap_cpu_from_node(num, cpu_to_node(num));
 }
 EXPORT_SYMBOL(arch_unregister_cpu);
-#else
-static int __init arch_register_cpu(int num)
-{
-	return register_cpu(&sysfs_cpus[num].cpu, num);
-}
 #endif /*CONFIG_HOTPLUG_CPU*/
 
-
-static int __init topology_init(void)
-{
-	int i, err = 0;
-
-	sysfs_cpus = kcalloc(NR_CPUS, sizeof(struct ia64_cpu), GFP_KERNEL);
-	if (!sysfs_cpus)
-		panic("kzalloc in topology_init failed - NR_CPUS too big?");
-
-	for_each_present_cpu(i) {
-		if((err = arch_register_cpu(i)))
-			goto out;
-	}
-out:
-	return err;
-}
-
-subsys_initcall(topology_init);
-
-
 /*
  * Export cpu cache information through sysfs
  */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 08/32] x86/topology: Switch over to GENERIC_CPU_DEVICES
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (6 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 07/32] ia64/topology: Switch over to GENERIC_CPU_DEVICES James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 09/32] LoongArch: " James Morse
                   ` (26 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
overridden by the arch code, switch over to this to allow common code
to choose when the register_cpu() call is made.

x86's struct cpus come from struct x86_cpu, which has no other members
or users. Remove this and use the version defined by common code.

This is an intermediate step to the logic being moved to drivers/acpi,
where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/x86/Kconfig           |  1 +
 arch/x86/include/asm/cpu.h |  4 ----
 arch/x86/kernel/topology.c | 19 +++----------------
 3 files changed, 4 insertions(+), 20 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 6a520c22c3eb..e1aaee2f8412 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -144,6 +144,7 @@ config X86
 	select GENERIC_CLOCKEVENTS_MIN_ADJUST
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
+	select GENERIC_CPU_DEVICES
 	select GENERIC_CPU_VULNERABILITIES
 	select GENERIC_EARLY_IOREMAP
 	select GENERIC_ENTRY
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index a0a62ac00e88..2955541abebb 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -23,10 +23,6 @@ static inline void prefill_possible_map(void) {}
 
 #endif /* CONFIG_SMP */
 
-struct x86_cpu {
-	struct cpu cpu;
-};
-
 #ifdef CONFIG_HOTPLUG_CPU
 extern void arch_unregister_cpu(int);
 extern void start_cpu0(void);
diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
index 1b83377274b8..2275e22dfc8b 100644
--- a/arch/x86/kernel/topology.c
+++ b/arch/x86/kernel/topology.c
@@ -35,8 +35,6 @@
 #include <asm/io_apic.h>
 #include <asm/cpu.h>
 
-static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
-
 #ifdef CONFIG_HOTPLUG_CPU
 
 #ifdef CONFIG_BOOTPARAM_HOTPLUG_CPU0
@@ -131,15 +129,15 @@ int arch_register_cpu(int num)
 		}
 	}
 	if (num || cpu0_hotpluggable)
-		per_cpu(cpu_devices, num).cpu.hotpluggable = 1;
+		per_cpu(cpu_devices, num).hotpluggable = 1;
 
-	return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
+	return register_cpu(&per_cpu(cpu_devices, num), num);
 }
 EXPORT_SYMBOL(arch_register_cpu);
 
 void arch_unregister_cpu(int num)
 {
-	unregister_cpu(&per_cpu(cpu_devices, num).cpu);
+	unregister_cpu(&per_cpu(cpu_devices, num));
 }
 EXPORT_SYMBOL(arch_unregister_cpu);
 #else /* CONFIG_HOTPLUG_CPU */
@@ -149,14 +147,3 @@ static int __init arch_register_cpu(int num)
 	return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
 }
 #endif /* CONFIG_HOTPLUG_CPU */
-
-static int __init topology_init(void)
-{
-	int i;
-
-	for_each_present_cpu(i)
-		arch_register_cpu(i);
-
-	return 0;
-}
-subsys_initcall(topology_init);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 09/32] LoongArch: Switch over to GENERIC_CPU_DEVICES
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (7 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 08/32] x86/topology: " James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 10/32] arch_topology: Make register_cpu_capacity_sysctl() tolerant to late CPUs James Morse
                   ` (25 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
overridden by the arch code, switch over to this to allow common code
to choose when the register_cpu() call is made.

This allows topology_init() to be removed.

This is an intermediate step to the logic being moved to drivers/acpi,
where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/loongarch/Kconfig           |  1 +
 arch/loongarch/kernel/topology.c | 22 +---------------------
 2 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 075aa50e5d5f..53f6feba0008 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -64,6 +64,7 @@ config LOONGARCH
 	select GENERIC_CLOCKEVENTS
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
+	select GENERIC_CPU_DEVICES
 	select GENERIC_ENTRY
 	select GENERIC_GETTIMEOFDAY
 	select GENERIC_IOREMAP if !ARCH_IOREMAP
diff --git a/arch/loongarch/kernel/topology.c b/arch/loongarch/kernel/topology.c
index caa7cd859078..a86d51269b09 100644
--- a/arch/loongarch/kernel/topology.c
+++ b/arch/loongarch/kernel/topology.c
@@ -7,15 +7,13 @@
 #include <linux/percpu.h>
 #include <asm/bootinfo.h>
 
-static DEFINE_PER_CPU(struct cpu, cpu_devices);
-
 #ifdef CONFIG_HOTPLUG_CPU
 int arch_register_cpu(int cpu)
 {
 	int ret;
 	struct cpu *c = &per_cpu(cpu_devices, cpu);
 
-	c->hotpluggable = 1;
+	c->hotpluggable = !io_master(cpu);
 	ret = register_cpu(c, cpu);
 	if (ret < 0)
 		pr_warn("register_cpu %d failed (%d)\n", cpu, ret);
@@ -33,21 +31,3 @@ void arch_unregister_cpu(int cpu)
 }
 EXPORT_SYMBOL(arch_unregister_cpu);
 #endif
-
-static int __init topology_init(void)
-{
-	int i, ret;
-
-	for_each_present_cpu(i) {
-		struct cpu *c = &per_cpu(cpu_devices, i);
-
-		c->hotpluggable = !io_master(i);
-		ret = register_cpu(c, i);
-		if (ret < 0)
-			pr_warn("topology_init: register_cpu %d failed (%d)\n", i, ret);
-	}
-
-	return 0;
-}
-
-subsys_initcall(topology_init);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 10/32] arch_topology: Make register_cpu_capacity_sysctl() tolerant to late CPUs
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (8 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 09/32] LoongArch: " James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 11/32] ACPI: processor: Add support for processors described as container packages James Morse
                   ` (24 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

register_cpu_capacity_sysctl() adds a property to sysfs that describes
the CPUs capacity. This is done from a subsys_initcall() that assumes
all possible CPUs are registered.

With CPU hotplug, possible CPUs aren't registered until they become
present, (or for arm64 enabled). This leads to messages during boot:
| register_cpu_capacity_sysctl: too early to get CPU1 device!
and once these CPUs are added to the system, the file is missing.

Move this to a cpuhp callback, so that the file is created once
CPUs are brought online. This covers CPUs that are added late by
mechanisms like hotplug.
One observable difference is the file is now missing for offline CPUs.

Signed-off-by: James Morse <james.morse@arm.com>
---
If the offline CPUs thing is a problem for the tools that consume
this value, we'd need to move cpu_capacity to be part of cpu.c's
common_cpu_attr_groups.
---
 drivers/base/arch_topology.c | 38 ++++++++++++++++++++++++------------
 1 file changed, 26 insertions(+), 12 deletions(-)

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index e7d6e6657ffa..83a46bc3105d 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -220,20 +220,34 @@ static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
 
 static DEVICE_ATTR_RO(cpu_capacity);
 
+static int cpu_capacity_sysctl_add(unsigned int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+
+	if (!cpu_dev)
+		return -ENOENT;
+
+	device_create_file(cpu_dev, &dev_attr_cpu_capacity);
+
+	return 0;
+}
+
+static int cpu_capacity_sysctl_remove(unsigned int cpu)
+{
+	struct device *cpu_dev = get_cpu_device(cpu);
+
+	if (!cpu_dev)
+		return -ENOENT;
+
+	device_remove_file(cpu_dev, &dev_attr_cpu_capacity);
+
+	return 0;
+}
+
 static int register_cpu_capacity_sysctl(void)
 {
-	int i;
-	struct device *cpu;
-
-	for_each_possible_cpu(i) {
-		cpu = get_cpu_device(i);
-		if (!cpu) {
-			pr_err("%s: too early to get CPU%d device!\n",
-			       __func__, i);
-			continue;
-		}
-		device_create_file(cpu, &dev_attr_cpu_capacity);
-	}
+	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "topology/cpu-capacity",
+			  cpu_capacity_sysctl_add, cpu_capacity_sysctl_remove);
 
 	return 0;
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 11/32] ACPI: processor: Add support for processors described as container packages
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (9 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 10/32] arch_topology: Make register_cpu_capacity_sysctl() tolerant to late CPUs James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 12/32] ACPI: processor: Register CPUs that are online, but not described in the DSDT James Morse
                   ` (23 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

ACPI has two ways of describing processors in the DSDT. Either as a device
object with HID ACPI0007, or as a type 'C' package inside a Processor
Container. The ACPI processor driver probes CPUs described as devices, but
not those described as packages.

Duplicate descriptions are not allowed, the ACPI processor driver already
parses the UID from both devices and containers. acpi_processor_get_info()
returns an error if the UID exists twice in the DSDT.

The missing probe for CPUs described as packages creates a problem for
move the cpu_register() calls into the acpi_processor driver, as CPUs
described like this don't get registered, leading to errors from other
subsystems when they try to add new sysfs entries to the CPU node.
(e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)

To fix this, parse the processor container and call acpi_processor_add()
for each processor that is discovered like this. The processor container
handler is added with acpi_scan_add_handler(), so no detach call will
arrive.

Qemu TCG describes CPUs using packages in a processor container.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/acpi_processor.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 16b314340e68..52668fa22c51 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -528,9 +528,33 @@ static struct acpi_scan_handler processor_handler = {
 	},
 };
 
+static acpi_status acpi_processor_container_walk(acpi_handle handle,
+						  u32 lvl,
+						  void *context,
+						  void **rv)
+{
+	struct acpi_device *adev;
+	acpi_status status;
+
+	adev = acpi_get_acpi_dev(handle);
+	if (!adev)
+		return AE_ERROR;
+
+	status = acpi_processor_add(adev, &processor_device_ids[0]);
+	acpi_put_acpi_dev(adev);
+
+	return status;
+}
+
 static int acpi_processor_container_attach(struct acpi_device *dev,
 					   const struct acpi_device_id *id)
 {
+	acpi_walk_namespace(ACPI_TYPE_PROCESSOR, dev->handle,
+						ACPI_UINT32_MAX,
+						acpi_processor_container_walk,
+						NULL, NULL, NULL);
+
+
 	return 1;
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 12/32] ACPI: processor: Register CPUs that are online, but not described in the DSDT
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (10 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 11/32] ACPI: processor: Add support for processors described as container packages James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 13/32] ACPI: processor: Register all CPUs from acpi_processor_get_info() James Morse
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

ACPI has two descriptions of CPUs, on in the MADT/APIC table, the other
in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
says "Each processor in the system must be declared in the ACPI
namespace"). Having two descriptions allows firmware authors to get
this wrong.

If CPUs are described in the MADT/APIC, they will be brought online
early during boot. Once the register_cpu() calls are moved to ACPI,
they will be based on the ACPI description of the CPUs. When CPUs are
missing from the ACPI description, they will end up online, but not
registered.

Add a helper that runs after acpi_init() has completed to register
CPUs that are online, but weren't found in the DSDT. Any CPU that
is registered by this code triggers a firmware-bug warning and kernel
taint.

Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
is configured.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 52668fa22c51..967837b60453 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -694,6 +694,25 @@ void __init acpi_processor_init(void)
 	acpi_scan_add_handler(&processor_container_handler);
 }
 
+static int acpi_processor_register_missing_cpus(void)
+{
+	int cpu;
+
+	if (acpi_disabled)
+		return 0;
+
+	for_each_online_cpu(cpu) {
+		if (!get_cpu_device(cpu)) {
+			pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
+			add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
+			arch_register_cpu(cpu);
+		}
+	}
+
+	return 0;
+}
+subsys_initcall_sync(acpi_processor_register_missing_cpus);
+
 #ifdef CONFIG_ACPI_PROCESSOR_CSTATE
 /**
  * acpi_processor_claim_cst_control - Request _CST control from the platform.
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 13/32] ACPI: processor: Register all CPUs from acpi_processor_get_info()
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (11 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 12/32] ACPI: processor: Register CPUs that are online, but not described in the DSDT James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 14/32] ACPI: Rename ACPI_HOTPLUG_CPU to include 'present' James Morse
                   ` (21 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

To allow ACPI to skip the call to arch_register_cpu() when the _STA
value indicates the CPU can't be brought online right now, move the
arch_register_cpu() call into acpi_processor_get_info().

Systems can still be booted with 'acpi=off', or in the case of arm64,
not include an ACPI description at all. For these, the CPUs are
registered by cpu_dev_register_generic().

This moves the CPU register logic back to a subsys_initcall(),
while the memory nodes will have been registered earlier.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/acpi_processor.c | 6 ++++++
 drivers/base/cpu.c            | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 967837b60453..a93a6c4115c4 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -272,6 +272,12 @@ static int acpi_processor_get_info(struct acpi_device *device)
 			pr->id = 0;
 	}
 
+	if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id)) {
+		int ret = arch_register_cpu(pr->id);
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 *  Extra Processor objects may be enumerated on MP systems with
 	 *  less than the max # of CPUs. They should be ignored _iff
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 178936533d87..0ba646022a5e 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -504,7 +504,7 @@ static void __init cpu_dev_register_generic(void)
 {
 	int i;
 
-	if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
+	if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
 		return;
 
 	for_each_present_cpu(i) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 14/32] ACPI: Rename ACPI_HOTPLUG_CPU to include 'present'
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (12 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 13/32] ACPI: processor: Register all CPUs from acpi_processor_get_info() James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 15/32] ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove() James Morse
                   ` (20 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

The code behind ACPI_HOTPLUG_CPU allows a not-present CPU to become present.
This isn't the only use of HOTPLUG_CPU. On arm64 an offline CPU may be
disabled by firmware, preventing it from being brought back online, but it
remains present throughout.

Adding code to prevent user-space trying to online these disabled CPUs
needs some additional terminology.

Rename the Kconfig symbol CONFIG_ACPI_HOTPLUG_PRESENT_CPU to reflect
that it makes possible CPUs present.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/ia64/Kconfig             |  2 +-
 arch/ia64/include/asm/acpi.h  |  2 +-
 arch/ia64/kernel/acpi.c       |  6 +++---
 arch/ia64/kernel/setup.c      |  2 +-
 arch/x86/Kconfig              |  2 +-
 arch/x86/kernel/acpi/boot.c   |  4 ++--
 drivers/acpi/Kconfig          |  4 ++--
 drivers/acpi/acpi_processor.c | 10 +++++-----
 include/linux/acpi.h          |  6 +++---
 9 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 146a2226e2a3..f02894cafef8 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -15,7 +15,7 @@ config IA64
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select ACPI
-	select ACPI_HOTPLUG_CPU if ACPI
+	select ACPI_HOTPLUG_PRESENT_CPU if ACPI
 	select ACPI_NUMA if NUMA
 	select ARCH_ENABLE_MEMORY_HOTPLUG
 	select ARCH_ENABLE_MEMORY_HOTREMOVE
diff --git a/arch/ia64/include/asm/acpi.h b/arch/ia64/include/asm/acpi.h
index 87927eb824cc..2fcdfd4ecce8 100644
--- a/arch/ia64/include/asm/acpi.h
+++ b/arch/ia64/include/asm/acpi.h
@@ -52,7 +52,7 @@ extern unsigned int is_cpu_cpei_target(unsigned int cpu);
 extern void set_cpei_target_cpu(unsigned int cpu);
 extern unsigned int get_cpei_target_cpu(void);
 extern void prefill_possible_map(void);
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 extern int additional_cpus;
 #else
 #define additional_cpus 0
diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index 96d13cb7c19f..38b3155ce8d9 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -194,7 +194,7 @@ acpi_parse_plat_int_src(union acpi_subtable_headers * header,
 	return 0;
 }
 
-#ifdef CONFIG_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 unsigned int can_cpei_retarget(void)
 {
 	extern int cpe_vector;
@@ -711,7 +711,7 @@ int acpi_isa_irq_to_gsi(unsigned isa_irq, u32 *gsi)
 /*
  *  ACPI based hotplug CPU support
  */
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
 {
 #ifdef CONFIG_ACPI_NUMA
@@ -822,7 +822,7 @@ int acpi_unmap_cpu(int cpu)
 	return (0);
 }
 EXPORT_SYMBOL(acpi_unmap_cpu);
-#endif				/* CONFIG_ACPI_HOTPLUG_CPU */
+#endif				/* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
 
 #ifdef CONFIG_ACPI_NUMA
 static acpi_status acpi_map_iosapic(acpi_handle handle, u32 depth,
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index c05728044272..1e826ed3bd56 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -569,7 +569,7 @@ setup_arch (char **cmdline_p)
 #ifdef CONFIG_ACPI_NUMA
 	acpi_numa_init();
 	acpi_numa_fixup();
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 	prefill_possible_map();
 #endif
 	per_cpu_scan_finalize((cpumask_empty(&early_cpu_possible_map) ?
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e1aaee2f8412..23620969dfc0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -59,7 +59,7 @@ config X86
 	#
 	select ACPI_LEGACY_TABLES_LOOKUP	if ACPI
 	select ACPI_SYSTEM_POWER_STATES_SUPPORT	if ACPI
-	select ACPI_HOTPLUG_CPU			if ACPI
+	select ACPI_HOTPLUG_PRESENT_CPU	if ACPI
 	select ARCH_32BIT_OFF_T			if X86_32
 	select ARCH_CLOCKSOURCE_INIT
 	select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 907cc98b1938..5a4869d6cd87 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -796,7 +796,7 @@ static void __init acpi_set_irq_model_ioapic(void)
 /*
  *  ACPI based hotplug support for CPU
  */
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 #include <acpi/processor.h>
 
 static int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
@@ -845,7 +845,7 @@ int acpi_unmap_cpu(int cpu)
 	return (0);
 }
 EXPORT_SYMBOL(acpi_unmap_cpu);
-#endif				/* CONFIG_ACPI_HOTPLUG_CPU */
+#endif				/* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
 
 int acpi_register_ioapic(acpi_handle handle, u64 phys_addr, u32 gsi_base)
 {
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 4845e5b525ac..702b1ff1b7ad 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -305,7 +305,7 @@ config ACPI_IPMI
 	  To compile this driver as a module, choose M here:
 	  the module will be called as acpi_ipmi.
 
-config ACPI_HOTPLUG_CPU
+config ACPI_HOTPLUG_PRESENT_CPU
 	bool
 	depends on ACPI_PROCESSOR && HOTPLUG_CPU
 	select ACPI_CONTAINER
@@ -399,7 +399,7 @@ config ACPI_PCI_SLOT
 
 config ACPI_CONTAINER
 	bool "Container and Module Devices"
-	default (ACPI_HOTPLUG_MEMORY || ACPI_HOTPLUG_CPU)
+	default (ACPI_HOTPLUG_MEMORY || ACPI_HOTPLUG_PRESENT_CPU)
 	help
 	  This driver supports ACPI Container and Module devices (IDs
 	  ACPI0004, PNP0A05, and PNP0A06).
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index a93a6c4115c4..682721594820 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -149,7 +149,7 @@ static int acpi_processor_errata(void)
 }
 
 /* Initialization */
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 static int acpi_processor_hotadd_init(struct acpi_processor *pr)
 {
 	unsigned long long sta;
@@ -194,7 +194,7 @@ static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
 {
 	return -ENODEV;
 }
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
 
 static int acpi_processor_get_info(struct acpi_device *device)
 {
@@ -413,7 +413,7 @@ static int acpi_processor_add(struct acpi_device *device,
 	return result;
 }
 
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 /* Removal */
 static void acpi_processor_remove(struct acpi_device *device)
 {
@@ -457,7 +457,7 @@ static void acpi_processor_remove(struct acpi_device *device)
 	free_cpumask_var(pr->throttling.shared_cpu_map);
 	kfree(pr);
 }
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
 
 #ifdef CONFIG_X86
 static bool acpi_hwp_native_thermal_lvt_set;
@@ -526,7 +526,7 @@ static const struct acpi_device_id processor_device_ids[] = {
 static struct acpi_scan_handler processor_handler = {
 	.ids = processor_device_ids,
 	.attach = acpi_processor_add,
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 	.detach = acpi_processor_remove,
 #endif
 	.hotplug = {
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 5e6a876e17ba..0bc1eacd0dfc 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -327,12 +327,12 @@ static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
 }
 #endif
 
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 /* Arch dependent functions for cpu hotplug support */
 int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
 		 int *pcpu);
 int acpi_unmap_cpu(int cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
 
 #ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
 int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
@@ -655,7 +655,7 @@ static inline u32 acpi_osc_ctx_get_cxl_control(struct acpi_osc_context *context)
 #define ACPI_GSB_ACCESS_ATTRIB_RAW_PROCESS	0x0000000F
 
 /* Enable _OST when all relevant hotplug operations are enabled */
-#if defined(CONFIG_ACPI_HOTPLUG_CPU) &&			\
+#if defined(CONFIG_ACPI_HOTPLUG_PRESENT_CPU) &&			\
 	defined(CONFIG_ACPI_HOTPLUG_MEMORY) &&		\
 	defined(CONFIG_ACPI_CONTAINER)
 #define ACPI_HOTPLUG_OST
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 15/32] ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove()
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (13 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 14/32] ACPI: Rename ACPI_HOTPLUG_CPU to include 'present' James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 16/32] ACPI: Rename acpi_processor_hotadd_init and remove pre-processor guards James Morse
                   ` (19 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

A subsequent patch will change acpi_scan_hot_remove() to call
acpi_bus_trim_one() instead of acpi_bus_trim(), meaning it can no longer
rely on the prototype in the header file.

Move these functions further up the file.
No change in behaviour.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/scan.c | 76 ++++++++++++++++++++++-----------------------
 1 file changed, 38 insertions(+), 38 deletions(-)

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 274344434282..cecf94192771 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -245,6 +245,44 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
 	return 0;
 }
 
+static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
+{
+	struct acpi_scan_handler *handler = adev->handler;
+
+	acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
+
+	adev->flags.match_driver = false;
+	if (handler) {
+		if (handler->detach)
+			handler->detach(adev);
+
+		adev->handler = NULL;
+	} else {
+		device_release_driver(&adev->dev);
+	}
+	/*
+	 * Most likely, the device is going away, so put it into D3cold before
+	 * that.
+	 */
+	acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
+	adev->flags.initialized = false;
+	acpi_device_clear_enumerated(adev);
+
+	return 0;
+}
+
+/**
+ * acpi_bus_trim - Detach scan handlers and drivers from ACPI device objects.
+ * @adev: Root of the ACPI namespace scope to walk.
+ *
+ * Must be called under acpi_scan_lock.
+ */
+void acpi_bus_trim(struct acpi_device *adev)
+{
+	acpi_bus_trim_one(adev, NULL);
+}
+EXPORT_SYMBOL_GPL(acpi_bus_trim);
+
 static int acpi_scan_hot_remove(struct acpi_device *device)
 {
 	acpi_handle handle = device->handle;
@@ -2453,44 +2491,6 @@ int acpi_bus_scan(acpi_handle handle)
 }
 EXPORT_SYMBOL(acpi_bus_scan);
 
-static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
-{
-	struct acpi_scan_handler *handler = adev->handler;
-
-	acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
-
-	adev->flags.match_driver = false;
-	if (handler) {
-		if (handler->detach)
-			handler->detach(adev);
-
-		adev->handler = NULL;
-	} else {
-		device_release_driver(&adev->dev);
-	}
-	/*
-	 * Most likely, the device is going away, so put it into D3cold before
-	 * that.
-	 */
-	acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
-	adev->flags.initialized = false;
-	acpi_device_clear_enumerated(adev);
-
-	return 0;
-}
-
-/**
- * acpi_bus_trim - Detach scan handlers and drivers from ACPI device objects.
- * @adev: Root of the ACPI namespace scope to walk.
- *
- * Must be called under acpi_scan_lock.
- */
-void acpi_bus_trim(struct acpi_device *adev)
-{
-	acpi_bus_trim_one(adev, NULL);
-}
-EXPORT_SYMBOL_GPL(acpi_bus_trim);
-
 int acpi_bus_register_early_device(int type)
 {
 	struct acpi_device *device = NULL;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 16/32] ACPI: Rename acpi_processor_hotadd_init and remove pre-processor guards
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (14 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 15/32] ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove() James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 17/32] ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug James Morse
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

acpi_processor_hotadd_init() will make a CPU present by mapping it
based on its hardware id.

'hotadd_init' is ambiguous once there are two different behaviours
for cpu hotplug. This is for toggling the _STA present bit. Subsequent
patches will add support for toggling the _STA enabled bit, named
acpi_processor_make_enabled().

Rename it acpi_processor_make_present() to make it clear this is
for CPUs that were not previously present.

Expose the function prototypes it uses to allow the preprocessor
guards to be removed. The IS_ENABLED() check will let the compiler
dead-code elimination pass remove this if it isn't going to be
used.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/acpi_processor.c | 14 +++++---------
 include/linux/acpi.h          |  2 --
 2 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 682721594820..6ab55f109417 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -149,13 +149,15 @@ static int acpi_processor_errata(void)
 }
 
 /* Initialization */
-#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
-static int acpi_processor_hotadd_init(struct acpi_processor *pr)
+static int acpi_processor_make_present(struct acpi_processor *pr)
 {
 	unsigned long long sta;
 	acpi_status status;
 	int ret;
 
+	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
+		return -ENODEV;
+
 	if (invalid_phys_cpuid(pr->phys_id))
 		return -ENODEV;
 
@@ -189,12 +191,6 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
 	cpu_maps_update_done();
 	return ret;
 }
-#else
-static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
-{
-	return -ENODEV;
-}
-#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
 
 static int acpi_processor_get_info(struct acpi_device *device)
 {
@@ -287,7 +283,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	 *  because cpuid <-> apicid mapping is persistent now.
 	 */
 	if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
-		int ret = acpi_processor_hotadd_init(pr);
+		int ret = acpi_processor_make_present(pr);
 
 		if (ret)
 			return ret;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 0bc1eacd0dfc..a59bca4dbcd5 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -327,12 +327,10 @@ static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
 }
 #endif
 
-#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 /* Arch dependent functions for cpu hotplug support */
 int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
 		 int *pcpu);
 int acpi_unmap_cpu(int cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
 
 #ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
 int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 17/32] ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (15 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 16/32] ACPI: Rename acpi_processor_hotadd_init and remove pre-processor guards James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 18/32] ACPI: Check _STA present bit before making CPUs not present James Morse
                   ` (17 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

struct acpi_scan_handler has a detach callback that is used to remove
a driver when a bus is changed. When interacting with an eject-request,
the detach callback is called before _EJ0.

This means the ACPI processor driver can't use _STA to determine if a
CPU has been made not-present, or some of the other _STA bits have been
changed. acpi_processor_remove() needs to know the value of _STA after
_EJ0 has been called.

Add a post_eject callback to struct acpi_scan_handler. This is called
after acpi_scan_hot_remove() has successfully called _EJ0. Because
acpi_bus_trim_one() also clears the handler pointer, it needs to be
told if the caller will go on to call acpi_bus_post_eject(), so
that acpi_device_clear_enumerated() and clearing the handler pointer
can be deferred. The existing not-used pointer is used for this.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/acpi_processor.c |  4 +--
 drivers/acpi/scan.c           | 52 ++++++++++++++++++++++++++++++-----
 include/acpi/acpi_bus.h       |  1 +
 3 files changed, 48 insertions(+), 9 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 6ab55f109417..ab0f80b83773 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -411,7 +411,7 @@ static int acpi_processor_add(struct acpi_device *device,
 
 #ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 /* Removal */
-static void acpi_processor_remove(struct acpi_device *device)
+static void acpi_processor_post_eject(struct acpi_device *device)
 {
 	struct acpi_processor *pr;
 
@@ -523,7 +523,7 @@ static struct acpi_scan_handler processor_handler = {
 	.ids = processor_device_ids,
 	.attach = acpi_processor_add,
 #ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
-	.detach = acpi_processor_remove,
+	.post_eject = acpi_processor_post_eject,
 #endif
 	.hotplug = {
 		.enabled = true,
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index cecf94192771..cd9bedb54393 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -245,18 +245,28 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
 	return 0;
 }
 
-static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
+/**
+ * acpi_bus_trim_one() - Detach scan handlers and drivers from ACPI device
+ *                       objects.
+ * @adev:       Root of the ACPI namespace scope to walk.
+ * @eject:      Pointer to a bool that indicates if this was due to an
+ *              eject-request.
+ *
+ * Must be called under acpi_scan_lock.
+ * If @eject points to true, clearing the device enumeration is deferred until
+ * acpi_bus_post_eject() is called.
+ */
+static int acpi_bus_trim_one(struct acpi_device *adev, void *eject)
 {
 	struct acpi_scan_handler *handler = adev->handler;
+	bool is_eject = *(bool *)eject;
 
-	acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
+	acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, eject);
 
 	adev->flags.match_driver = false;
 	if (handler) {
 		if (handler->detach)
 			handler->detach(adev);
-
-		adev->handler = NULL;
 	} else {
 		device_release_driver(&adev->dev);
 	}
@@ -266,7 +276,12 @@ static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
 	 */
 	acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
 	adev->flags.initialized = false;
-	acpi_device_clear_enumerated(adev);
+
+	/* For eject this is deferred to acpi_bus_post_eject() */
+	if (!is_eject) {
+		adev->handler = NULL;
+		acpi_device_clear_enumerated(adev);
+	}
 
 	return 0;
 }
@@ -279,15 +294,36 @@ static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
  */
 void acpi_bus_trim(struct acpi_device *adev)
 {
-	acpi_bus_trim_one(adev, NULL);
+	bool eject = false;
+
+	acpi_bus_trim_one(adev, &eject);
 }
 EXPORT_SYMBOL_GPL(acpi_bus_trim);
 
+static int acpi_bus_post_eject(struct acpi_device *adev, void *not_used)
+{
+	struct acpi_scan_handler *handler = adev->handler;
+
+	acpi_dev_for_each_child_reverse(adev, acpi_bus_post_eject, NULL);
+
+	if (handler) {
+		if (handler->post_eject)
+			handler->post_eject(adev);
+
+		adev->handler = NULL;
+	}
+
+	acpi_device_clear_enumerated(adev);
+
+	return 0;
+}
+
 static int acpi_scan_hot_remove(struct acpi_device *device)
 {
 	acpi_handle handle = device->handle;
 	unsigned long long sta;
 	acpi_status status;
+	bool eject = true;
 
 	if (device->handler && device->handler->hotplug.demand_offline) {
 		if (!acpi_scan_is_offline(device, true))
@@ -300,7 +336,7 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
 
 	acpi_handle_debug(handle, "Ejecting\n");
 
-	acpi_bus_trim(device);
+	acpi_bus_trim_one(device, &eject);
 
 	acpi_evaluate_lck(handle, 0);
 	/*
@@ -323,6 +359,8 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
 	} else if (sta & ACPI_STA_DEVICE_ENABLED) {
 		acpi_handle_warn(handle,
 			"Eject incomplete - status 0x%llx\n", sta);
+	} else {
+		acpi_bus_post_eject(device, NULL);
 	}
 
 	return 0;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index cd3b75e08ec3..dc4028cafd33 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -126,6 +126,7 @@ struct acpi_scan_handler {
 	bool (*match)(const char *idstr, const struct acpi_device_id **matchid);
 	int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
 	void (*detach)(struct acpi_device *dev);
+	void (*post_eject)(struct acpi_device *dev);
 	void (*bind)(struct device *phys_dev);
 	void (*unbind)(struct device *phys_dev);
 	struct acpi_hotplug_profile hotplug;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 18/32] ACPI: Check _STA present bit before making CPUs not present
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (16 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 17/32] ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 19/32] ACPI: Warn when the present bit changes but the feature is not enabled James Morse
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

When called acpi_processor_post_eject() unconditionally make a CPU
not-present and unregisters it.

To add support for AML events where the CPU has become disabled, but
remains present, the _STA method should be checked before calling
acpi_processor_remove().

Rename acpi_processor_post_eject() acpi_processor_remove_possible(), and
check the _STA before calling.

Adding the function prototype for arch_unregister_cpu() allows the
preprocessor guards to be removed.

After this change CPUs will remain registered and visible to
user-space as offline if buggy firmware triggers an eject-request,
but doesn't clear the corresponding _STA bits after _EJ0 has been
called.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/acpi_processor.c | 31 +++++++++++++++++++++++++------
 include/linux/cpu.h           |  1 +
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index ab0f80b83773..e6419b06cb37 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -409,13 +409,12 @@ static int acpi_processor_add(struct acpi_device *device,
 	return result;
 }
 
-#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 /* Removal */
-static void acpi_processor_post_eject(struct acpi_device *device)
+static void acpi_processor_make_not_present(struct acpi_device *device)
 {
 	struct acpi_processor *pr;
 
-	if (!device || !acpi_driver_data(device))
+	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
 		return;
 
 	pr = acpi_driver_data(device);
@@ -453,7 +452,29 @@ static void acpi_processor_post_eject(struct acpi_device *device)
 	free_cpumask_var(pr->throttling.shared_cpu_map);
 	kfree(pr);
 }
-#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
+
+static void acpi_processor_post_eject(struct acpi_device *device)
+{
+	struct acpi_processor *pr;
+	unsigned long long sta;
+	acpi_status status;
+
+	if (!device)
+		return;
+
+	pr = acpi_driver_data(device);
+	if (!pr || pr->id >= nr_cpu_ids || invalid_phys_cpuid(pr->phys_id))
+		return;
+
+	status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
+	if (ACPI_FAILURE(status))
+		return;
+
+	if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_PRESENT)) {
+		acpi_processor_make_not_present(device);
+		return;
+	}
+}
 
 #ifdef CONFIG_X86
 static bool acpi_hwp_native_thermal_lvt_set;
@@ -522,9 +543,7 @@ static const struct acpi_device_id processor_device_ids[] = {
 static struct acpi_scan_handler processor_handler = {
 	.ids = processor_device_ids,
 	.attach = acpi_processor_add,
-#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
 	.post_eject = acpi_processor_post_eject,
-#endif
 	.hotplug = {
 		.enabled = true,
 	},
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 86e79e702325..f6f198a3972b 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -76,6 +76,7 @@ struct device *cpu_device_create(struct device *parent, void *drvdata,
 				 const struct attribute_group **groups,
 				 const char *fmt, ...);
 extern int arch_register_cpu(int cpu);
+extern void arch_unregister_cpu(int cpu);
 #ifdef CONFIG_HOTPLUG_CPU
 extern void unregister_cpu(struct cpu *cpu);
 extern ssize_t arch_cpu_probe(const char *, size_t);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 19/32] ACPI: Warn when the present bit changes but the feature is not enabled
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (17 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 18/32] ACPI: Check _STA present bit before making CPUs not present James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 20/32] drivers: base: Implement weak arch_unregister_cpu() James Morse
                   ` (15 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

ACPI firmware can trigger the events to add and remove CPUs, but the
OS may not support this.

Print a warning when this happens.

This gives early warning on arm64 systems that don't support
CONFIG_ACPI_HOTPLUG_PRESENT_CPU, as making CPUs not present has
side effects for other parts of the system.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/acpi_processor.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index e6419b06cb37..572a12672c0e 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -155,8 +155,10 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
 	acpi_status status;
 	int ret;
 
-	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
+	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
+		pr_err_once("Changing CPU present bit is not supported\n");
 		return -ENODEV;
+	}
 
 	if (invalid_phys_cpuid(pr->phys_id))
 		return -ENODEV;
@@ -414,8 +416,10 @@ static void acpi_processor_make_not_present(struct acpi_device *device)
 {
 	struct acpi_processor *pr;
 
-	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
+	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
+		pr_err_once("Changing CPU present bit is not supported");
 		return;
+	}
 
 	pr = acpi_driver_data(device);
 	if (pr->id >= nr_cpu_ids)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 20/32] drivers: base: Implement weak arch_unregister_cpu()
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (18 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 19/32] ACPI: Warn when the present bit changes but the feature is not enabled James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 21/32] LoongArch: Use the __weak version of arch_unregister_cpu() James Morse
                   ` (14 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

Add arch_unregister_cpu() to allow the ACPI machinery to call
unregister_cpu(). This is enough for arm64, but needs to be
overridden by x86 and ia64 who need to do more work.

CC: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/ia64/include/asm/cpu.h      | 4 ----
 arch/loongarch/include/asm/cpu.h | 6 ------
 arch/x86/include/asm/cpu.h       | 1 -
 drivers/base/cpu.c               | 5 +++++
 4 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/arch/ia64/include/asm/cpu.h b/arch/ia64/include/asm/cpu.h
index 6e9786c6ec98..3b36c6a382bb 100644
--- a/arch/ia64/include/asm/cpu.h
+++ b/arch/ia64/include/asm/cpu.h
@@ -9,8 +9,4 @@
 
 DECLARE_PER_CPU(int, cpu_state);
 
-#ifdef CONFIG_HOTPLUG_CPU
-extern void arch_unregister_cpu(int);
-#endif
-
 #endif /* _ASM_IA64_CPU_H_ */
diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
index 1e2c7c61dbea..754f28506791 100644
--- a/arch/loongarch/include/asm/cpu.h
+++ b/arch/loongarch/include/asm/cpu.h
@@ -124,10 +124,4 @@ enum cpu_type_enum {
 #define LOONGARCH_CPU_GUESTID		BIT_ULL(CPU_FEATURE_GUESTID)
 #define LOONGARCH_CPU_HYPERVISOR	BIT_ULL(CPU_FEATURE_HYPERVISOR)
 
-#if !defined(__ASSEMBLY__)
-#ifdef CONFIG_HOTPLUG_CPU
-extern void arch_unregister_cpu(int);
-#endif
-#endif /* ! __ASSEMBLY__ */
-
 #endif /* _ASM_CPU_H */
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index 2955541abebb..e5d820be3b72 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -24,7 +24,6 @@ static inline void prefill_possible_map(void) {}
 #endif /* CONFIG_SMP */
 
 #ifdef CONFIG_HOTPLUG_CPU
-extern void arch_unregister_cpu(int);
 extern void start_cpu0(void);
 #ifdef CONFIG_DEBUG_HOTPLUG_CPU0
 extern int _debug_hotplug_cpu(int cpu, int action);
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 0ba646022a5e..bc2ce8c7f383 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -498,6 +498,11 @@ int __weak arch_register_cpu(int cpu)
 {
 	return register_cpu(&per_cpu(cpu_devices, cpu), cpu);
 }
+
+void __weak arch_unregister_cpu(int num)
+{
+	unregister_cpu(&per_cpu(cpu_devices, num));
+}
 #endif
 
 static void __init cpu_dev_register_generic(void)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 21/32] LoongArch: Use the __weak version of arch_unregister_cpu()
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (19 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 20/32] drivers: base: Implement weak arch_unregister_cpu() James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 22/32] arm64: acpi: Move get_cpu_for_acpi_id() to a header James Morse
                   ` (13 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

LoongArch provides its own arch_unregister_cpu(). This clears the
hotpluggable flag, then unregisters the CPU.

It isn't necessary to clear the hotpluggable flag when unregistering
a cpu. unregister_cpu() writes NULL to the percpu cpu_sys_devices
pointer, meaning cpu_is_hotpluggable() will return false, as
get_cpu_device() has returned NULL.

Remove arch_unregister_cpu() and use the __weak version.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/loongarch/kernel/topology.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/arch/loongarch/kernel/topology.c b/arch/loongarch/kernel/topology.c
index a86d51269b09..494044990ba9 100644
--- a/arch/loongarch/kernel/topology.c
+++ b/arch/loongarch/kernel/topology.c
@@ -21,13 +21,4 @@ int arch_register_cpu(int cpu)
 	return ret;
 }
 EXPORT_SYMBOL(arch_register_cpu);
-
-void arch_unregister_cpu(int cpu)
-{
-	struct cpu *c = &per_cpu(cpu_devices, cpu);
-
-	c->hotpluggable = 0;
-	unregister_cpu(c);
-}
-EXPORT_SYMBOL(arch_unregister_cpu);
 #endif
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 22/32] arm64: acpi: Move get_cpu_for_acpi_id() to a header
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (20 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 21/32] LoongArch: Use the __weak version of arch_unregister_cpu() James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 23/32] ACPICA: Add new MADT GICC flags fields [code first?] James Morse
                   ` (12 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

ACPI identifies CPUs by UID. get_cpu_for_acpi_id() maps the ACPI UID
to the linux CPU number.

The helper to retrieve this mapping is only available in arm64's numa
code.

Move it to live next to get_acpi_id_for_cpu().

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/acpi.h | 11 +++++++++++
 arch/arm64/kernel/acpi_numa.c | 11 -----------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index bd68e1b7f29f..0d1da93a5bad 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -97,6 +97,17 @@ static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
 	return	acpi_cpu_get_madt_gicc(cpu)->uid;
 }
 
+static inline int get_cpu_for_acpi_id(u32 uid)
+{
+	int cpu;
+
+	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
+		if (uid == get_acpi_id_for_cpu(cpu))
+			return cpu;
+
+	return -EINVAL;
+}
+
 static inline void arch_fix_phys_package_id(int num, u32 slot) { }
 void __init acpi_init_cpus(void);
 int apei_claim_sea(struct pt_regs *regs);
diff --git a/arch/arm64/kernel/acpi_numa.c b/arch/arm64/kernel/acpi_numa.c
index e51535a5f939..0c036a9a3c33 100644
--- a/arch/arm64/kernel/acpi_numa.c
+++ b/arch/arm64/kernel/acpi_numa.c
@@ -34,17 +34,6 @@ int __init acpi_numa_get_nid(unsigned int cpu)
 	return acpi_early_node_map[cpu];
 }
 
-static inline int get_cpu_for_acpi_id(u32 uid)
-{
-	int cpu;
-
-	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
-		if (uid == get_acpi_id_for_cpu(cpu))
-			return cpu;
-
-	return -EINVAL;
-}
-
 static int __init acpi_parse_gicc_pxm(union acpi_subtable_headers *header,
 				      const unsigned long end)
 {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 23/32] ACPICA: Add new MADT GICC flags fields [code first?]
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (21 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 22/32] arm64: acpi: Move get_cpu_for_acpi_id() to a header James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 24/32] arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper James Morse
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

Add the new flag field to the MADT's GICC structure.

'Online Capable' indicates a disabled CPU can be enabled later.

Signed-off-by: James Morse <james.morse@arm.com>
---
This patch probably needs to go via the upstream acpica project,
but is included here so the feature can be tested.
---
 include/acpi/actbl2.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
index b2973dbe37ee..9b1c15c56db3 100644
--- a/include/acpi/actbl2.h
+++ b/include/acpi/actbl2.h
@@ -1040,6 +1040,7 @@ struct acpi_madt_generic_interrupt {
 /* ACPI_MADT_ENABLED                    (1)      Processor is usable if set */
 #define ACPI_MADT_PERFORMANCE_IRQ_MODE  (1<<1)	/* 01: Performance Interrupt Mode */
 #define ACPI_MADT_VGIC_IRQ_MODE         (1<<2)	/* 02: VGIC Maintenance Interrupt mode */
+#define ACPI_MADT_GICC_CPU_CAPABLE      (1<<3)	/* 03: CPU is online capable */
 
 /* 12: Generic Distributor (ACPI 5.0 + ACPI 6.0 changes) */
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 24/32] arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (22 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 23/32] ACPICA: Add new MADT GICC flags fields [code first?] James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 25/32] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc() James Morse
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

ACPI, irqchip and the architecture code all inspect the MADT
enabled bit for a GICC entry in the MADT.

The addition of an 'online capable' bit means all these sites need updating.

Move the current checks behind a helper to make future updates easier.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/smp.c       |  2 +-
 drivers/acpi/processor_core.c |  2 +-
 drivers/irqchip/irq-gic-v3.c  | 10 ++++------
 include/linux/acpi.h          |  5 +++++
 4 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index ffc5d76cf695..5669b013c2b7 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -525,7 +525,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 {
 	u64 hwid = processor->arm_mpidr;
 
-	if (!(processor->flags & ACPI_MADT_ENABLED)) {
+	if (!acpi_gicc_is_usable(processor)) {
 		pr_debug("skipping disabled CPU entry with 0x%llx MPIDR\n", hwid);
 		return;
 	}
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 2ac48cda5b20..1ba273622faa 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -90,7 +90,7 @@ static int map_gicc_mpidr(struct acpi_subtable_header *entry,
 	struct acpi_madt_generic_interrupt *gicc =
 	    container_of(entry, struct acpi_madt_generic_interrupt, header);
 
-	if (!(gicc->flags & ACPI_MADT_ENABLED))
+	if (!acpi_gicc_is_usable(gicc))
 		return -ENODEV;
 
 	/* device_declaration means Device object in DSDT, in the
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 997104d4338e..d484cccfd612 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2189,8 +2189,7 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
 	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
 	void __iomem *redist_base;
 
-	/* GICC entry which has !ACPI_MADT_ENABLED is not unusable so skip */
-	if (!(gicc->flags & ACPI_MADT_ENABLED))
+	if (!acpi_gicc_is_usable(gicc))
 		return 0;
 
 	redist_base = ioremap(gicc->gicr_base_address, size);
@@ -2240,7 +2239,7 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
 	 * If GICC is enabled and has valid gicr base address, then it means
 	 * GICR base is presented via GICC
 	 */
-	if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address) {
+	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
 		acpi_data.enabled_rdists++;
 		return 0;
 	}
@@ -2249,7 +2248,7 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
 	 * It's perfectly valid firmware can pass disabled GICC entry, driver
 	 * should not treat as errors, skip the entry instead of probe fail.
 	 */
-	if (!(gicc->flags & ACPI_MADT_ENABLED))
+	if (!acpi_gicc_is_usable(gicc))
 		return 0;
 
 	return -ENODEV;
@@ -2308,8 +2307,7 @@ static int __init gic_acpi_parse_virt_madt_gicc(union acpi_subtable_headers *hea
 	int maint_irq_mode;
 	static int first_madt = true;
 
-	/* Skip unusable CPUs */
-	if (!(gicc->flags & ACPI_MADT_ENABLED))
+	if (!acpi_gicc_is_usable(gicc))
 		return 0;
 
 	maint_irq_mode = (gicc->flags & ACPI_MADT_VGIC_IRQ_MODE) ?
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index a59bca4dbcd5..d8e59953a27f 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -267,6 +267,11 @@ acpi_table_parse_cedt(enum acpi_cedt_type id,
 int acpi_parse_mcfg (struct acpi_table_header *header);
 void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
 
+static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
+{
+	return (gicc->flags & ACPI_MADT_ENABLED);
+}
+
 /* the following numa functions are architecture-dependent */
 void acpi_numa_slit_init (struct acpi_table_slit *slit);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 25/32] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (23 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 24/32] arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 26/32] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs James Morse
                   ` (9 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
It should only count the number of enabled redistributors, but it
also tries to sanity check the GICC entry, currently returning an
error if the Enabled bit is set, but the gicr_base_address is zero.

Adding support for the online-capable bit to the sanity check
complictes it, for no benefit. The existing check implicitly
depends on gic_acpi_count_gicr_regions() previous failing to find
any GICR regions (as it is valid to have gicr_base_address of zero if
the redistributors are described via a GICR entry).

Instead of complicating the check, remove it. Failures that happen
at this point cause the irqchip not to register, meaning no irqs
can be requested. The kernel grinds to a panic() pretty quickly.

Without the check, MADT tables that exhibit this problem are still
caught by gic_populate_rdist(), which helpfully also prints what
went wrong:
| CPU4: mpidr 100 has no re-distributor!

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index d484cccfd612..a8b969d652b6 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2237,21 +2237,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
 
 	/*
 	 * If GICC is enabled and has valid gicr base address, then it means
-	 * GICR base is presented via GICC
+	 * GICR base is presented via GICC. The redistributor is only known to
+	 * be accessible if the GICC is marked as enabled. If this bit is not
+	 * set, we'd need to add the redistributor at runtime, which isn't
+	 * supported.
 	 */
-	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
+	if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
 		acpi_data.enabled_rdists++;
-		return 0;
-	}
 
-	/*
-	 * It's perfectly valid firmware can pass disabled GICC entry, driver
-	 * should not treat as errors, skip the entry instead of probe fail.
-	 */
-	if (!acpi_gicc_is_usable(gicc))
-		return 0;
-
-	return -ENODEV;
+	return 0;
 }
 
 static int __init gic_acpi_count_gicr_regions(void)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 26/32] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (24 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 25/32] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc() James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 27/32] arm64: psci: Ignore DENIED CPUs James Morse
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

To support virtual CPU hotplug, ACPI has added an 'online capable' bit
to the MADT GICC entries. This indicates a disabled CPU entry may not
be possible to online via PSCI until firmware has set enabled bit in
_STA.

What about the redistributor in the GICC entry? ACPI doesn't want to say.
Assume the worst: When a redistributor is described in the GICC entry,
but the entry is marked as disabled at boot, assume the redistributor
is inaccessible.

The GICv3 driver doesn't support late online of redistributors, so this
means the corresponding CPU can't be brought online either. Clear the
possible and present bits.

Systems that want CPU hotplug in a VM can ensure their redistributors
are always-on, and describe them that way with a GICR entry in the MADT.

When mapping redistributors found via GICC entries, handle the case
where the arch code believes the CPU is present and possible, but it
does not have an accessible redistributor. Print a warning and clear
the present and possible bits.

Signed-off-by: James Morse <james.morse@arm.com>
----
Disabled but online-capable CPUs cause this message to be printed
if their redistributors are described via GICC:
| GICv3: CPU 3's redistributor is inaccessible: this CPU can't be brought online

If ACPI's _STA tries to make the cpu present later, this message is printed:
| Changing CPU present bit is not supported
---
 drivers/irqchip/irq-gic-v3.c | 14 ++++++++++++++
 include/linux/acpi.h         |  3 ++-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index a8b969d652b6..8c1ea3df0094 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2187,11 +2187,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
 				(struct acpi_madt_generic_interrupt *)header;
 	u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
 	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
+	int cpu = get_cpu_for_acpi_id(gicc->uid);
 	void __iomem *redist_base;
 
 	if (!acpi_gicc_is_usable(gicc))
 		return 0;
 
+	/*
+	 * Capable but disabled CPUs can be brought online later. What about
+	 * the redistributor? ACPI doesn't want to say!
+	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
+	 * Otherwise, prevent such CPUs from being brought online.
+	 */
+	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
+		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
+		set_cpu_present(cpu, false);
+		set_cpu_possible(cpu, false);
+		return 0;
+	}
+
 	redist_base = ioremap(gicc->gicr_base_address, size);
 	if (!redist_base)
 		return -ENOMEM;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index d8e59953a27f..c1b74ebd5774 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -269,7 +269,8 @@ void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
 
 static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
 {
-	return (gicc->flags & ACPI_MADT_ENABLED);
+	return ((gicc->flags & ACPI_MADT_ENABLED ||
+		 gicc->flags & ACPI_MADT_GICC_CPU_CAPABLE));
 }
 
 /* the following numa functions are architecture-dependent */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 27/32] arm64: psci: Ignore DENIED CPUs
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (25 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 26/32] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 28/32] ACPI: add support to register CPUs based on the _STA enabled bit James Morse
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

When a CPU is marked as disabled, but online capable in the MADT, PSCI
applies some firmware policy to control when it can be brought online.
PSCI returns DENIED to a CPU_ON request if this is not currently
permitted. The OS can learn the current policy from the _STA enabled bit.

Handle the PSCI DENIED return code gracefully instead of printing an
error.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
[ morse: Rewrote commit message ]
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/psci.c     | 2 +-
 arch/arm64/kernel/smp.c      | 3 ++-
 drivers/firmware/psci/psci.c | 2 ++
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index 29a8e444db83..4fcc0cdd757b 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -40,7 +40,7 @@ static int cpu_psci_cpu_boot(unsigned int cpu)
 {
 	phys_addr_t pa_secondary_entry = __pa_symbol(secondary_entry);
 	int err = psci_ops.cpu_on(cpu_logical_map(cpu), pa_secondary_entry);
-	if (err)
+	if (err && err != -EPROBE_DEFER)
 		pr_err("failed to boot CPU%d (%d)\n", cpu, err);
 
 	return err;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 5669b013c2b7..ea031641545d 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -125,7 +125,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	/* Now bring the CPU into our world */
 	ret = boot_secondary(cpu, idle);
 	if (ret) {
-		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
+		if (ret != -EPROBE_DEFER)
+			pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
 		return ret;
 	}
 
diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index e7bcfca4159f..3389c913b2ea 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -212,6 +212,8 @@ static int __psci_cpu_on(u32 fn, unsigned long cpuid, unsigned long entry_point)
 	int err;
 
 	err = invoke_psci_fn(fn, cpuid, entry_point, 0);
+	if (err == PSCI_RET_DENIED)
+		return -EPROBE_DEFER;
 	return psci_to_linux_errno(err);
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 28/32] ACPI: add support to register CPUs based on the _STA enabled bit
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (26 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 27/32] arm64: psci: Ignore DENIED CPUs James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace James Morse
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

acpi_processor_get_info() registers all present CPUs. Registering a
CPU is what creates the sysfs entries and triggers the udev
notifications.

arm64 virtual machines that support 'virtual cpu hotplug' use the
enabled bit to indicate whether the CPU can be brought online, as
the existing ACPI tables require all hardware to be described and
present.

If firmware describes a CPU as present, but disabled, skip the
registration. Such CPUs are present, but can't be brought online for
whatever reason. (e.g. firmware/hypervisor policy).

Once firmware sets the enabled bit, the CPU can be registered and
brought online by user-space. Online CPUs, or CPUs that are missing
an _STA method must always be registered.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/acpi/acpi_processor.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 572a12672c0e..20e810a066da 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -194,6 +194,35 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
 	return ret;
 }
 
+static int acpi_processor_make_enabled(struct acpi_processor *pr)
+{
+	unsigned long long sta;
+	acpi_status status;
+	bool present, enabled;
+
+	if (!acpi_has_method(pr->handle, "_STA"))
+		return arch_register_cpu(pr->id);
+
+	status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
+	if (ACPI_FAILURE(status))
+		return -ENODEV;
+
+	present = sta & ACPI_STA_DEVICE_PRESENT;
+	enabled = sta & ACPI_STA_DEVICE_ENABLED;
+
+	if (cpu_online(pr->id) && (!present || !enabled)) {
+		pr_err_once(FW_BUG "CPU %u is online, but described as not present or disabled!\n", pr->id);
+		add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
+	} else if (!present || !enabled) {
+		return -ENODEV;
+	}
+
+	if (get_cpu_device(pr->id))
+		return -EEXIST;
+
+	return arch_register_cpu(pr->id);
+}
+
 static int acpi_processor_get_info(struct acpi_device *device)
 {
 	union acpi_object object = { 0 };
@@ -271,7 +300,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	}
 
 	if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id)) {
-		int ret = arch_register_cpu(pr->id);
+		int ret = acpi_processor_make_enabled(pr);
 		if (ret)
 			return ret;
 	}
@@ -478,6 +507,9 @@ static void acpi_processor_post_eject(struct acpi_device *device)
 		acpi_processor_make_not_present(device);
 		return;
 	}
+
+	if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_ENABLED))
+		arch_unregister_cpu(pr->id);
 }
 
 #ifdef CONFIG_X86
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (27 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 28/32] ACPI: add support to register CPUs based on the _STA enabled bit James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 21:08   ` Oliver Upton
  2023-02-05 10:12   ` Marc Zyngier
  2023-02-03 13:50 ` [RFC PATCH 30/32] KVM: arm64: Pass PSCI calls " James Morse
                   ` (5 subsequent siblings)
  34 siblings, 2 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
request to handle all hypercalls that aren't handled by KVM. With the
help of another capability, this will allow userspace to handle PSCI
calls.

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Signed-off-by: James Morse <james.morse@arm.com>

---

Notes on this implementation:

* A similar mechanism was proposed for SDEI some time ago [1]. This RFC
  generalizes the idea to all hypercalls, since that was suggested on
  the list [2, 3].

* We're reusing kvm_run.hypercall. I copied x0-x5 into
  kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
  this, because:
  - Most user handlers will need to write results back into the
    registers (x0-x3 for SMCCC), so if we keep this shortcut we should
    go all the way and read them back on return to kernel.
  - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
    handling the call.
  - SMCCC uses x0-x16 for parameters.
  x0 does contain the SMCCC function ID and may be useful for fast
  dispatch, we could keep that plus the immediate number.

* Add a flag in the kvm_run.hypercall telling whether this is HVC or
  SMC?  Can be added later in those bottom longmode and pad fields.

* On top of this we could share with userspace which HVC ranges are
  available and which ones are handled by KVM. That can actually be added
  independently, through a vCPU/VM device attribute which doesn't consume
  a new ioctl:
  - userspace issues HAS_ATTR ioctl on the vcpu fd to query whether this
    feature is available.
  - userspace queries the number N of HVC ranges using one GET_ATTR.
  - userspace passes an array of N ranges using another GET_ATTR. The
    array is filled and returned by KVM.

* Enabling this using a vCPU arch feature rather than the whole-VM
  capability would be fine, but it would be difficult to do the same for
  the following psci-in-user capability. So let's enable everything at
  the VM scope.

* No idea whether this work out of the box for AArch32 guests.

[1] https://lore.kernel.org/linux-arm-kernel/20170808164616.25949-12-james.morse@arm.com/
[2] https://lore.kernel.org/linux-arm-kernel/bf7e83f1-c58e-8d65-edd0-d08f27b8b766@arm.com/
[3] https://lore.kernel.org/linux-arm-kernel/f56cf420-affc-35f0-2355-801a924b8a35@arm.com/
---
 Documentation/virt/kvm/api.rst    | 17 +++++++++++++++--
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/arm.c              |  5 +++++
 arch/arm64/kvm/hypercalls.c       | 28 +++++++++++++++++++++++++++-
 include/kvm/arm_psci.h            |  4 ++++
 include/uapi/linux/kvm.h          |  1 +
 6 files changed, 53 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index deb494f759ed..9a28a9cc1163 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6122,8 +6122,12 @@ to the byte array.
 			__u32 pad;
 		} hypercall;
 
-Unused.  This was once used for 'hypercall to userspace'.  To implement
-such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
+On x86 this was once used for 'hypercall to userspace'.  To implement such
+functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
+
+On arm64 it is used for hypercalls, when the KVM_CAP_ARM_HVC_TO_USER capability
+is enabled. 'nr' contains the HVC or SMC immediate. 'args' contains registers
+x0 - x5. The other parameters are unused.
 
 .. note:: KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.
 
@@ -8276,6 +8280,15 @@ structure.
 When getting the Modified Change Topology Report value, the attr->addr
 must point to a byte where the value will be stored or retrieved from.
 
+8.37 KVM_CAP_ARM_HVC_TO_USER
+----------------------------
+
+:Architecture: arm64
+
+This capability indicates that KVM can pass unhandled hypercalls to userspace,
+if the VMM enables it. Hypercalls are passed with KVM_EXIT_HYPERCALL in
+kvm_run::hypercall.
+
 9. Known KVM API problems
 =========================
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 35a159d131b5..40911ebfa710 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -213,6 +213,7 @@ struct kvm_arch {
 #define KVM_ARCH_FLAG_EL1_32BIT				4
 	/* PSCI SYSTEM_SUSPEND enabled for the guest */
 #define KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED		5
+#define KVM_ARCH_FLAG_HVC_TO_USER			6
 
 	unsigned long flags;
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 9c5573bc4614..815b7e8f88e1 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -101,6 +101,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		r = 0;
 		set_bit(KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED, &kvm->arch.flags);
 		break;
+	case KVM_CAP_ARM_HVC_TO_USER:
+		r = 0;
+		set_bit(KVM_ARCH_FLAG_HVC_TO_USER, &kvm->arch.flags);
+		break;
 	default:
 		r = -EINVAL;
 		break;
@@ -230,6 +234,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_VCPU_ATTRIBUTES:
 	case KVM_CAP_PTP_KVM:
 	case KVM_CAP_ARM_SYSTEM_SUSPEND:
+	case KVM_CAP_ARM_HVC_TO_USER:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index c9f401fa01a9..efaf05d40dab 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -121,6 +121,28 @@ static bool kvm_hvc_call_allowed(struct kvm_vcpu *vcpu, u32 func_id)
 	}
 }
 
+static int kvm_hvc_user(struct kvm_vcpu *vcpu)
+{
+	int i;
+	struct kvm_run *run = vcpu->run;
+
+	if (!test_bit(KVM_ARCH_FLAG_HVC_TO_USER, &vcpu->kvm->arch.flags)) {
+		smccc_set_retval(vcpu, SMCCC_RET_NOT_SUPPORTED, 0, 0, 0);
+		return 1;
+	}
+
+	run->exit_reason = KVM_EXIT_HYPERCALL;
+	run->hypercall.nr = kvm_vcpu_hvc_get_imm(vcpu);
+	/* Add the first parameters for fast access. */
+	for (i = 0; i < 6; i++)
+		run->hypercall.args[i] = vcpu_get_reg(vcpu, i);
+	run->hypercall.ret = 0;
+	run->hypercall.longmode = 0;
+	run->hypercall.pad = 0;
+
+	return 0;
+}
+
 int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 {
 	struct kvm_smccc_features *smccc_feat = &vcpu->kvm->arch.smccc_feat;
@@ -219,8 +241,12 @@ int kvm_hvc_call_handler(struct kvm_vcpu *vcpu)
 	case ARM_SMCCC_TRNG_RND32:
 	case ARM_SMCCC_TRNG_RND64:
 		return kvm_trng_call(vcpu);
-	default:
+	case KVM_PSCI_FN_BASE...KVM_PSCI_FN_LAST:
+	case PSCI_0_2_FN_BASE...PSCI_0_2_FN_LAST:
+	case PSCI_0_2_FN64_BASE...PSCI_0_2_FN64_LAST:
 		return kvm_psci_call(vcpu);
+	default:
+		return kvm_hvc_user(vcpu);
 	}
 
 out:
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index 6e55b9283789..7391fa51419a 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -17,6 +17,10 @@
 
 #define KVM_ARM_PSCI_LATEST	KVM_ARM_PSCI_1_1
 
+#define KVM_PSCI_FN_LAST	KVM_PSCI_FN(3)
+#define PSCI_0_2_FN_LAST	PSCI_0_2_FN(0x3f)
+#define PSCI_0_2_FN64_LAST	PSCI_0_2_FN64(0x3f)
+
 static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
 {
 	/*
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 55155e262646..2ead8b9aae56 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1175,6 +1175,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
 #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
+#define KVM_CAP_ARM_HVC_TO_USER 226
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 30/32] KVM: arm64: Pass PSCI calls to userspace
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (28 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-05-23  9:32   ` Salil Mehta
  2023-02-03 13:50 ` [RFC PATCH 31/32] arm64: document virtual CPU hotplug's expectations James Morse
                   ` (4 subsequent siblings)
  34 siblings, 1 reply; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

When the KVM_CAP_ARM_PSCI_TO_USER capability is available, userspace can
request to handle PSCI calls.

This is required for virtual CPU hotplug to allow the VMM to enforce the
online/offline policy it has advertised via ACPI. By managing PSCI in
user-space, the VMM is able to return PSCI_DENIED when the guest attempts
to bring a disabled vCPU online.
Without this, the VMM is only able to not-run the vCPU, the kernel will
have already returned PSCI_SUCCESS to the guest. This results in
timeouts during boot as the OS must wait for the secondary vCPU.

SMCCC probe requires PSCI v1.x. If userspace only implements PSCI v0.2,
the guest won't query SMCCC support through PSCI and won't use the
spectre workarounds. We could hijack PSCI_VERSION and pretend to support
v1.0 if userspace does not, then handle all v1.0 calls ourselves
(including guessing the PSCI feature set implemented by the guest), but
that seems unnecessary. After all the API already allows userspace to
force a version lower than v1.0 using the firmware pseudo-registers.

The KVM_REG_ARM_PSCI_VERSION pseudo-register currently resets to either
v0.1 if userspace doesn't set KVM_ARM_VCPU_PSCI_0_2, or
KVM_ARM_PSCI_LATEST (1.0).

Suggested-by: James Morse <james.morse@arm.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
[morse: Added description of why this is required]
Signed-off-by: James Morse <james.morse@arm.com>
---
 Documentation/virt/kvm/api.rst            | 14 ++++++++++++++
 Documentation/virt/kvm/arm/hypercalls.rst |  1 +
 arch/arm64/include/asm/kvm_host.h         |  1 +
 arch/arm64/kvm/arm.c                      | 10 +++++++---
 arch/arm64/kvm/hypercalls.c               |  2 +-
 arch/arm64/kvm/psci.c                     | 13 +++++++++++++
 include/kvm/arm_hypercalls.h              |  1 +
 include/uapi/linux/kvm.h                  |  1 +
 8 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 9a28a9cc1163..eb99436a1d97 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8289,6 +8289,20 @@ This capability indicates that KVM can pass unhandled hypercalls to userspace,
 if the VMM enables it. Hypercalls are passed with KVM_EXIT_HYPERCALL in
 kvm_run::hypercall.
 
+8.38 KVM_CAP_ARM_PSCI_TO_USER
+-----------------------------
+
+:Architectures: arm64
+
+When the VMM enables this capability, all PSCI calls are passed to userspace
+instead of being handled by KVM. Capability KVM_CAP_ARM_HVC_TO_USER must be
+enabled first.
+
+Userspace should support at least PSCI v1.0. Otherwise SMCCC features won't be
+available to the guest. Userspace does not need to handle the SMCCC_VERSION
+parameter for the PSCI_FEATURES function. The KVM_ARM_VCPU_PSCI_0_2 vCPU
+feature should be set even if this capability is enabled.
+
 9. Known KVM API problems
 =========================
 
diff --git a/Documentation/virt/kvm/arm/hypercalls.rst b/Documentation/virt/kvm/arm/hypercalls.rst
index 3e23084644ba..4c111afa7d74 100644
--- a/Documentation/virt/kvm/arm/hypercalls.rst
+++ b/Documentation/virt/kvm/arm/hypercalls.rst
@@ -34,6 +34,7 @@ The following registers are defined:
   - Allows any PSCI version implemented by KVM and compatible with
     v0.2 to be set with SET_ONE_REG
   - Affects the whole VM (even if the register view is per-vcpu)
+  - Defaults to PSCI 1.0 if userspace enables KVM_CAP_ARM_PSCI_TO_USER.
 
 * KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
     Holds the state of the firmware support to mitigate CVE-2017-5715, as
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 40911ebfa710..a9eff47bcb43 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -214,6 +214,7 @@ struct kvm_arch {
 	/* PSCI SYSTEM_SUSPEND enabled for the guest */
 #define KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED		5
 #define KVM_ARCH_FLAG_HVC_TO_USER			6
+#define KVM_ARCH_FLAG_PSCI_TO_USER			7
 
 	unsigned long flags;
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 815b7e8f88e1..3dba4e01f4d8 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -76,7 +76,7 @@ int kvm_arch_check_processor_compat(void *opaque)
 int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 			    struct kvm_enable_cap *cap)
 {
-	int r;
+	int r = -EINVAL;
 
 	if (cap->flags)
 		return -EINVAL;
@@ -105,8 +105,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		r = 0;
 		set_bit(KVM_ARCH_FLAG_HVC_TO_USER, &kvm->arch.flags);
 		break;
-	default:
-		r = -EINVAL;
+	case KVM_CAP_ARM_PSCI_TO_USER:
+		if (test_bit(KVM_ARCH_FLAG_HVC_TO_USER, &kvm->arch.flags)) {
+			r = 0;
+			set_bit(KVM_ARCH_FLAG_PSCI_TO_USER, &kvm->arch.flags);
+		}
 		break;
 	}
 
@@ -235,6 +238,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_PTP_KVM:
 	case KVM_CAP_ARM_SYSTEM_SUSPEND:
 	case KVM_CAP_ARM_HVC_TO_USER:
+	case KVM_CAP_ARM_PSCI_TO_USER:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
index efaf05d40dab..3c2136cd7a3f 100644
--- a/arch/arm64/kvm/hypercalls.c
+++ b/arch/arm64/kvm/hypercalls.c
@@ -121,7 +121,7 @@ static bool kvm_hvc_call_allowed(struct kvm_vcpu *vcpu, u32 func_id)
 	}
 }
 
-static int kvm_hvc_user(struct kvm_vcpu *vcpu)
+int kvm_hvc_user(struct kvm_vcpu *vcpu)
 {
 	int i;
 	struct kvm_run *run = vcpu->run;
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 7fbc4c1b9df0..8505b26f0a83 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -418,6 +418,16 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static bool kvm_psci_call_is_user(struct kvm_vcpu *vcpu)
+{
+	/* Handle the special case of SMCCC probe through PSCI */
+	if (smccc_get_function(vcpu) == PSCI_1_0_FN_PSCI_FEATURES &&
+	    smccc_get_arg1(vcpu) == ARM_SMCCC_VERSION_FUNC_ID)
+		return false;
+
+	return test_bit(KVM_ARCH_FLAG_PSCI_TO_USER, &vcpu->kvm->arch.flags);
+}
+
 /**
  * kvm_psci_call - handle PSCI call if r0 value is in range
  * @vcpu: Pointer to the VCPU struct
@@ -443,6 +453,9 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
 		return 1;
 	}
 
+	if (kvm_psci_call_is_user(vcpu))
+		return kvm_hvc_user(vcpu);
+
 	switch (kvm_psci_version(vcpu)) {
 	case KVM_ARM_PSCI_1_1:
 		return kvm_psci_1_x_call(vcpu, 1);
diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
index 1188f116cf4e..ea7073d1a82e 100644
--- a/include/kvm/arm_hypercalls.h
+++ b/include/kvm/arm_hypercalls.h
@@ -6,6 +6,7 @@
 
 #include <asm/kvm_emulate.h>
 
+int kvm_hvc_user(struct kvm_vcpu *vcpu);
 int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
 
 static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2ead8b9aae56..c5da9d703a0f 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1176,6 +1176,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
 #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
 #define KVM_CAP_ARM_HVC_TO_USER 226
+#define KVM_CAP_ARM_PSCI_TO_USER 227
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 31/32] arm64: document virtual CPU hotplug's expectations
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (29 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 30/32] KVM: arm64: Pass PSCI calls " James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-02-03 13:50 ` [RFC PATCH 32/32] cpumask: Add enabled cpumask for present CPUs that can be brought online James Morse
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

Add a description of physical and virtual CPU hotplug, explain the
differences and elaborate on what is required in ACPI for a working
virtual hotplug system.

Signed-off-by: James Morse <james.morse@arm.com>
---
 Documentation/arm64/cpu-hotplug.rst | 79 +++++++++++++++++++++++++++++
 Documentation/arm64/index.rst       |  1 +
 2 files changed, 80 insertions(+)
 create mode 100644 Documentation/arm64/cpu-hotplug.rst

diff --git a/Documentation/arm64/cpu-hotplug.rst b/Documentation/arm64/cpu-hotplug.rst
new file mode 100644
index 000000000000..76ba8d932c72
--- /dev/null
+++ b/Documentation/arm64/cpu-hotplug.rst
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. _cpuhp_index:
+
+====================
+CPU Hotplug and ACPI
+====================
+
+CPU hotplug in the arm64 world is commonly used to describe the kernel taking
+CPUs online/offline using PSCI. This document is about ACPI firmware allowing
+CPUs that were not available during boot to be added to the system later.
+
+``possible`` and ``present`` refer to the state of the CPU as seen by linux.
+
+
+CPU Hotplug on physical systems - CPUs not present at boot
+----------------------------------------------------------
+
+Physical systems need to mark a CPU that is ``possible`` but not ``present`` as
+being ``present``. An example would be a dual socket machine, where the package
+in one of the sockets can be replaced while the system is running.
+
+This is not supported.
+
+In the arm64 world CPUs are not a single device but a slice of the system.
+There are no systems that support the physical addition (or removal) of CPUs
+while the system is running, and ACPI is not able to sufficiently describe
+them.
+
+e.g. New CPUs come with new caches, but the platform's cache toplogy is
+described in a static table, the PPTT. How caches are shared between CPUs is
+not discoverable, and must be described by firmware.
+
+e.g. The GIC redistributor for each CPU must be accessed by the driver during
+boot to discover the system wide supported features. ACPI's MADT GICC
+structures can describe a redistributor associated with a disabled CPU, but
+can't describe whether the redistributor is accessible, only that it is not
+'always on'.
+
+arm64's ACPI tables assume that everything described is ``present``.
+
+
+CPU Hotplug on virtual systems - CPUs not enabled at boot
+---------------------------------------------------------
+
+Virtual systems have the advantage that all the properties the system will
+ever have can be described at boot. There are no power-domain considerations
+as such devices are emulated.
+
+CPU Hotplug on virtual systems is supported. It is distinct from physical
+CPU Hotplug as all resources are described as ``present``, but CPUs may be
+marked as disabled by firmware. Only the CPU's online/offline behaviour is
+influenced by firmware. An example is where a virtual machine boots with a
+single CPU, and additional CPUs are added once a cloud orchestrator deploys
+the workload.
+
+For a virtual machine, the VMM (e.g. Qemu) plays the part of firmware.
+
+Virtual hotplug is implemented as a firmware policy affecting which CPUs can be
+brought online. Firmware can enforce its policy via PSCI's return codes. e.g.
+``DENIED``.
+
+The ACPI tables must describe all the resources of the virtual machine. CPUs
+that firmware wishes to disable either from boot (or later) should not be
+``enabled`` in the MADT GICC structures, but should have the ``online capable``
+bit set, to indicate they can be enabled later. The boot CPU must be marked as
+``enabled``.  The 'always on' GICR structure must be used to describe the
+redistributors.
+
+CPUs described as ``online capable`` but not ``enabled`` can be set to enabled
+by the DSDT's Processor object's _STA method. On virtual systems the _STA method
+must always report the CPU as ``present``. Changes to the firmware policy can
+be notified to the OS via device-check or eject-request.
+
+CPUs described as ``enabled`` in the static table, should not have their _STA
+modified dynamically by firmware. Soft-restart features such as kexec will
+re-read the static properties of the system from these static tables, and
+may malfunction if these no longer describe the running system. Linux will
+re-discover the dynamic properties of the system from the _STA method later
+during boot.
diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
index ae21f8118830..54da62534871 100644
--- a/Documentation/arm64/index.rst
+++ b/Documentation/arm64/index.rst
@@ -13,6 +13,7 @@ ARM64 Architecture
     asymmetric-32bit
     booting
     cpu-feature-registers
+    cpu-hotplug
     elf_hwcaps
     hugetlbpage
     legacy_instructions
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [RFC PATCH 32/32] cpumask: Add enabled cpumask for present CPUs that can be brought online
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (30 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 31/32] arm64: document virtual CPU hotplug's expectations James Morse
@ 2023-02-03 13:50 ` James Morse
  2023-03-07 12:00 ` [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug Jonathan Cameron
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-02-03 13:50 UTC (permalink / raw)
  To: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	James Morse, Suzuki K Poulose, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

The 'offline' file in sysfs shows all offline CPUs, including those
that aren't present. User-space is expected to remove not-present CPUs
from thie list to learn which CPUs could be brought online.

CPUs can be present but not-enabled. These CPUs can't be brought online
until the firmware policy changes, which comes with an ACPI notification
that will register the CPUs.

With only the offline and present files, user-space is unable to
determine which CPUs it can try to bring online. Add a new CPU mask
that shows this based on all the registered CPUs.

Signed-off-by: James Morse <james.morse@arm.com>
---
 drivers/base/cpu.c      | 10 ++++++++++
 include/linux/cpumask.h | 25 +++++++++++++++++++++++++
 kernel/cpu.c            |  3 +++
 3 files changed, 38 insertions(+)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index bc2ce8c7f383..3cc06f9fd230 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -76,6 +76,7 @@ void unregister_cpu(struct cpu *cpu)
 {
 	int logical_cpu = cpu->dev.id;
 
+	set_cpu_enabled(logical_cpu, false);
 	unregister_cpu_under_node(logical_cpu, cpu_to_node(logical_cpu));
 
 	device_unregister(&cpu->dev);
@@ -265,6 +266,13 @@ static ssize_t print_cpus_offline(struct device *dev,
 }
 static DEVICE_ATTR(offline, 0444, print_cpus_offline, NULL);
 
+static ssize_t print_cpus_enabled(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%*pbl\n", cpumask_pr_args(cpu_enabled_mask));
+}
+static DEVICE_ATTR(enabled, 0444, print_cpus_enabled, NULL);
+
 static ssize_t print_cpus_isolated(struct device *dev,
 				  struct device_attribute *attr, char *buf)
 {
@@ -384,6 +392,7 @@ int register_cpu(struct cpu *cpu, int num)
 	register_cpu_under_node(num, cpu_to_node(num));
 	dev_pm_qos_expose_latency_limit(&cpu->dev,
 					PM_QOS_RESUME_LATENCY_NO_CONSTRAINT);
+	set_cpu_enabled(num, true);
 
 	return 0;
 }
@@ -465,6 +474,7 @@ static struct attribute *cpu_root_attrs[] = {
 	&cpu_attrs[2].attr.attr,
 	&dev_attr_kernel_max.attr,
 	&dev_attr_offline.attr,
+	&dev_attr_enabled.attr,
 	&dev_attr_isolated.attr,
 #ifdef CONFIG_NO_HZ_FULL
 	&dev_attr_nohz_full.attr,
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index c2aa0aa26b45..1513bb00dee4 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -59,6 +59,7 @@ static inline void set_nr_cpu_ids(unsigned int nr)
  *
  *     cpu_possible_mask- has bit 'cpu' set iff cpu is populatable
  *     cpu_present_mask - has bit 'cpu' set iff cpu is populated
+ *     cpu_enabled_mask  - has bit 'cpu' set iff cpu can be brought online
  *     cpu_online_mask  - has bit 'cpu' set iff cpu available to scheduler
  *     cpu_active_mask  - has bit 'cpu' set iff cpu available to migration
  *
@@ -91,11 +92,13 @@ static inline void set_nr_cpu_ids(unsigned int nr)
 
 extern struct cpumask __cpu_possible_mask;
 extern struct cpumask __cpu_online_mask;
+extern struct cpumask __cpu_enabled_mask;
 extern struct cpumask __cpu_present_mask;
 extern struct cpumask __cpu_active_mask;
 extern struct cpumask __cpu_dying_mask;
 #define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
 #define cpu_online_mask   ((const struct cpumask *)&__cpu_online_mask)
+#define cpu_enabled_mask   ((const struct cpumask *)&__cpu_enabled_mask)
 #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
 #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
 #define cpu_dying_mask    ((const struct cpumask *)&__cpu_dying_mask)
@@ -921,6 +924,7 @@ extern const DECLARE_BITMAP(cpu_all_bits, NR_CPUS);
 #else
 #define for_each_possible_cpu(cpu) for_each_cpu((cpu), cpu_possible_mask)
 #define for_each_online_cpu(cpu)   for_each_cpu((cpu), cpu_online_mask)
+#define for_each_enabled_cpu(cpu)   for_each_cpu((cpu), cpu_enabled_mask)
 #define for_each_present_cpu(cpu)  for_each_cpu((cpu), cpu_present_mask)
 #endif
 
@@ -943,6 +947,15 @@ set_cpu_possible(unsigned int cpu, bool possible)
 		cpumask_clear_cpu(cpu, &__cpu_possible_mask);
 }
 
+static inline void
+set_cpu_enabled(unsigned int cpu, bool can_be_onlined)
+{
+	if (can_be_onlined)
+		cpumask_set_cpu(cpu, &__cpu_enabled_mask);
+	else
+		cpumask_clear_cpu(cpu, &__cpu_enabled_mask);
+}
+
 static inline void
 set_cpu_present(unsigned int cpu, bool present)
 {
@@ -1022,6 +1035,7 @@ static inline unsigned int num_online_cpus(void)
 	return atomic_read(&__num_online_cpus);
 }
 #define num_possible_cpus()	cpumask_weight(cpu_possible_mask)
+#define num_enabled_cpus()	cpumask_weight(cpu_enabled_mask)
 #define num_present_cpus()	cpumask_weight(cpu_present_mask)
 #define num_active_cpus()	cpumask_weight(cpu_active_mask)
 
@@ -1030,6 +1044,11 @@ static inline bool cpu_online(unsigned int cpu)
 	return cpumask_test_cpu(cpu, cpu_online_mask);
 }
 
+static inline bool cpu_enabled(unsigned int cpu)
+{
+	return cpumask_test_cpu(cpu, cpu_enabled_mask);
+}
+
 static inline bool cpu_possible(unsigned int cpu)
 {
 	return cpumask_test_cpu(cpu, cpu_possible_mask);
@@ -1054,6 +1073,7 @@ static inline bool cpu_dying(unsigned int cpu)
 
 #define num_online_cpus()	1U
 #define num_possible_cpus()	1U
+#define num_enabled_cpus()	1U
 #define num_present_cpus()	1U
 #define num_active_cpus()	1U
 
@@ -1067,6 +1087,11 @@ static inline bool cpu_possible(unsigned int cpu)
 	return cpu == 0;
 }
 
+static inline bool cpu_enabled(unsigned int cpu)
+{
+	return cpu == 0;
+}
+
 static inline bool cpu_present(unsigned int cpu)
 {
 	return cpu == 0;
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6c0a92ca6bb5..6f4febf0142e 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2639,6 +2639,9 @@ EXPORT_SYMBOL(__cpu_possible_mask);
 struct cpumask __cpu_online_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_online_mask);
 
+struct cpumask __cpu_enabled_mask __read_mostly;
+EXPORT_SYMBOL(__cpu_enabled_mask);
+
 struct cpumask __cpu_present_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_present_mask);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-03 13:50 ` [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace James Morse
@ 2023-02-03 21:08   ` Oliver Upton
  2023-02-07 17:50     ` James Morse
  2023-02-05 10:12   ` Marc Zyngier
  1 sibling, 1 reply; 59+ messages in thread
From: Oliver Upton @ 2023-02-03 21:08 UTC (permalink / raw)
  To: James Morse
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Marc Zyngier,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Suzuki K Poulose,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

Hi James,

On Fri, Feb 03, 2023 at 01:50:40PM +0000, James Morse wrote:
> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> 
> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
> request to handle all hypercalls that aren't handled by KVM.

I would very much prefer we not go down this route. This capability
effectively constructs an ABI out of what KVM presently does not
implement. What would happen if KVM decides to implement a new set
of hypercalls later down the road that were previously forwarded to
userspace?

Instead of a catch-all I think we should take the approach of having
userspace explicitly request which hypercalls should be forwarded to
userspace. I proposed something similar [1], but never got around to
respinning it (oops).

Let me dust those patches off and align with Marc's suggestions.

[1]: https://lore.kernel.org/kvmarm/20221110015327.3389351-1-oliver.upton@linux.dev/

--
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-03 13:50 ` [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace James Morse
  2023-02-03 21:08   ` Oliver Upton
@ 2023-02-05 10:12   ` Marc Zyngier
  2023-02-06 10:10     ` Suzuki K Poulose
                       ` (2 more replies)
  1 sibling, 3 replies; 59+ messages in thread
From: Marc Zyngier @ 2023-02-05 10:12 UTC (permalink / raw)
  To: James Morse
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Thomas Gleixner,
	Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Borislav Petkov,
	H Peter Anvin, Dave Hansen, Ingo Molnar, Will Deacon,
	Catalin Marinas, Huacai Chen, Suzuki K Poulose, Oliver Upton,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

On Fri, 03 Feb 2023 13:50:40 +0000,
James Morse <james.morse@arm.com> wrote:
> 
> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> 
> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
> request to handle all hypercalls that aren't handled by KVM. With the
> help of another capability, this will allow userspace to handle PSCI
> calls.
> 
> Suggested-by: James Morse <james.morse@arm.com>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> Signed-off-by: James Morse <james.morse@arm.com>
> 
> ---
> 

On top of Oliver's ask not to make this a blanket "steal everything",
but instead to have an actual request for ranges of forwarded
hypercalls:

> Notes on this implementation:
> 
> * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
>   generalizes the idea to all hypercalls, since that was suggested on
>   the list [2, 3].
> 
> * We're reusing kvm_run.hypercall. I copied x0-x5 into
>   kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
>   this, because:
>   - Most user handlers will need to write results back into the
>     registers (x0-x3 for SMCCC), so if we keep this shortcut we should
>     go all the way and read them back on return to kernel.
>   - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
>     handling the call.
>   - SMCCC uses x0-x16 for parameters.
>   x0 does contain the SMCCC function ID and may be useful for fast
>   dispatch, we could keep that plus the immediate number.
> 
> * Add a flag in the kvm_run.hypercall telling whether this is HVC or
>   SMC?  Can be added later in those bottom longmode and pad fields.

We definitely need this. A nested hypervisor can (and does) use SMCs
as the conduit. The question is whether they represent two distinct
namespaces or not. I *think* we can unify them, but someone should
check and maybe get clarification from the owners of the SMCCC spec.

>
> * On top of this we could share with userspace which HVC ranges are
>   available and which ones are handled by KVM. That can actually be added
>   independently, through a vCPU/VM device attribute which doesn't consume
>   a new ioctl:
>   - userspace issues HAS_ATTR ioctl on the vcpu fd to query whether this
>     feature is available.
>   - userspace queries the number N of HVC ranges using one GET_ATTR.
>   - userspace passes an array of N ranges using another GET_ATTR. The
>     array is filled and returned by KVM.

As mentioned above, I think this interface should go both ways.
Userspace should request the forwarding of a certain range of
hypercalls via a similar SET_ATTR interface.

Another question is how we migrate VMs that have these forwarding
requirements. Do we expect the VMM to replay the forwarding as part of
the setting up on the other side? Or do we save/restore this via a
firmware pseudo-register?

> 
> * Enabling this using a vCPU arch feature rather than the whole-VM
>   capability would be fine, but it would be difficult to do the same for
>   the following psci-in-user capability. So let's enable everything at
>   the VM scope.

Absolutely.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-05 10:12   ` Marc Zyngier
@ 2023-02-06 10:10     ` Suzuki K Poulose
  2023-02-06 12:31       ` Marc Zyngier
  2023-02-06 17:19     ` Oliver Upton
  2023-02-07 17:50     ` James Morse
  2 siblings, 1 reply; 59+ messages in thread
From: Suzuki K Poulose @ 2023-02-06 10:10 UTC (permalink / raw)
  To: Marc Zyngier, James Morse
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Thomas Gleixner,
	Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Borislav Petkov,
	H Peter Anvin, Dave Hansen, Ingo Molnar, Will Deacon,
	Catalin Marinas, Huacai Chen, Oliver Upton, Len Brown,
	Rafael Wysocki, WANG Xuerui, Salil Mehta, Russell King,
	Jean-Philippe Brucker

Hi,

A few cents from the Realm support point of view.

On 05/02/2023 10:12, Marc Zyngier wrote:
> On Fri, 03 Feb 2023 13:50:40 +0000,
> James Morse <james.morse@arm.com> wrote:
>>
>> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>
>> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
>> request to handle all hypercalls that aren't handled by KVM. With the
>> help of another capability, this will allow userspace to handle PSCI
>> calls.
>>
>> Suggested-by: James Morse <james.morse@arm.com>
>> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>> Signed-off-by: James Morse <james.morse@arm.com>
>>
>> ---
>>
> 
> On top of Oliver's ask not to make this a blanket "steal everything",
> but instead to have an actual request for ranges of forwarded
> hypercalls:
> 
>> Notes on this implementation:
>>
>> * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
>>    generalizes the idea to all hypercalls, since that was suggested on
>>    the list [2, 3].
>>
>> * We're reusing kvm_run.hypercall. I copied x0-x5 into
>>    kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
>>    this, because:
>>    - Most user handlers will need to write results back into the
>>      registers (x0-x3 for SMCCC), so if we keep this shortcut we should
>>      go all the way and read them back on return to kernel.
>>    - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
>>      handling the call.

This may not be always possible, e.g., for Realms. GET_ONE_REG is
not supported. So using an explicit passing down of the args is
preferrable.

Thanks
Suzuki

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-06 10:10     ` Suzuki K Poulose
@ 2023-02-06 12:31       ` Marc Zyngier
  2023-02-07  9:41         ` Suzuki K Poulose
  0 siblings, 1 reply; 59+ messages in thread
From: Marc Zyngier @ 2023-02-06 12:31 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Oliver Upton,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

On Mon, 06 Feb 2023 10:10:41 +0000,
Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> 
> Hi,
> 
> A few cents from the Realm support point of view.
> 
> On 05/02/2023 10:12, Marc Zyngier wrote:
> > On Fri, 03 Feb 2023 13:50:40 +0000,
> > James Morse <james.morse@arm.com> wrote:
> >> 
> >> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >> 
> >> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
> >> request to handle all hypercalls that aren't handled by KVM. With the
> >> help of another capability, this will allow userspace to handle PSCI
> >> calls.
> >> 
> >> Suggested-by: James Morse <james.morse@arm.com>
> >> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >> Signed-off-by: James Morse <james.morse@arm.com>
> >> 
> >> ---
> >> 
> > 
> > On top of Oliver's ask not to make this a blanket "steal everything",
> > but instead to have an actual request for ranges of forwarded
> > hypercalls:
> > 
> >> Notes on this implementation:
> >> 
> >> * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
> >>    generalizes the idea to all hypercalls, since that was suggested on
> >>    the list [2, 3].
> >> 
> >> * We're reusing kvm_run.hypercall. I copied x0-x5 into
> >>    kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
> >>    this, because:
> >>    - Most user handlers will need to write results back into the
> >>      registers (x0-x3 for SMCCC), so if we keep this shortcut we should
> >>      go all the way and read them back on return to kernel.
> >>    - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
> >>      handling the call.
> 
> This may not be always possible, e.g., for Realms. GET_ONE_REG is
> not supported. So using an explicit passing down of the args is
> preferrable.

What is the blocker for CCA to use GET_ONE_REG? The value obviously
exists and is made available to the host. pKVM is perfectly able to
use GET_ONE_REG and gets a bunch of zeroes for things that the
hypervisor has decided to hide from the host.

Of course, it requires that the hypervisor (the RMM in your case)
knows about the semantics of the hypercall, but that's obviously
already a requirement (or you wouldn't be able to use PSCI at all).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-05 10:12   ` Marc Zyngier
  2023-02-06 10:10     ` Suzuki K Poulose
@ 2023-02-06 17:19     ` Oliver Upton
  2023-02-07 17:50     ` James Morse
  2 siblings, 0 replies; 59+ messages in thread
From: Oliver Upton @ 2023-02-06 17:19 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Suzuki K Poulose,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

On Sun, Feb 05, 2023 at 10:12:49AM +0000, Marc Zyngier wrote:
> On Fri, 03 Feb 2023 13:50:40 +0000,
> James Morse <james.morse@arm.com> wrote:
> > 
> > From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > 
> > When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
> > request to handle all hypercalls that aren't handled by KVM. With the
> > help of another capability, this will allow userspace to handle PSCI
> > calls.
> > 
> > Suggested-by: James Morse <james.morse@arm.com>
> > Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > Signed-off-by: James Morse <james.morse@arm.com>
> > 
> > ---
> > 
> 
> On top of Oliver's ask not to make this a blanket "steal everything",
> but instead to have an actual request for ranges of forwarded
> hypercalls:
> 
> > Notes on this implementation:
> > 
> > * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
> >   generalizes the idea to all hypercalls, since that was suggested on
> >   the list [2, 3].
> > 
> > * We're reusing kvm_run.hypercall. I copied x0-x5 into
> >   kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
> >   this, because:
> >   - Most user handlers will need to write results back into the
> >     registers (x0-x3 for SMCCC), so if we keep this shortcut we should
> >     go all the way and read them back on return to kernel.
> >   - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
> >     handling the call.
> >   - SMCCC uses x0-x16 for parameters.
> >   x0 does contain the SMCCC function ID and may be useful for fast
> >   dispatch, we could keep that plus the immediate number.
> > 
> > * Add a flag in the kvm_run.hypercall telling whether this is HVC or
> >   SMC?  Can be added later in those bottom longmode and pad fields.
> 
> We definitely need this. A nested hypervisor can (and does) use SMCs
> as the conduit. The question is whether they represent two distinct
> namespaces or not. I *think* we can unify them, but someone should
> check and maybe get clarification from the owners of the SMCCC spec.
> 
> >
> > * On top of this we could share with userspace which HVC ranges are
> >   available and which ones are handled by KVM. That can actually be added
> >   independently, through a vCPU/VM device attribute which doesn't consume
> >   a new ioctl:
> >   - userspace issues HAS_ATTR ioctl on the vcpu fd to query whether this
> >     feature is available.
> >   - userspace queries the number N of HVC ranges using one GET_ATTR.
> >   - userspace passes an array of N ranges using another GET_ATTR. The
> >     array is filled and returned by KVM.
> 
> As mentioned above, I think this interface should go both ways.
> Userspace should request the forwarding of a certain range of
> hypercalls via a similar SET_ATTR interface.
> 
> Another question is how we migrate VMs that have these forwarding
> requirements. Do we expect the VMM to replay the forwarding as part of
> the setting up on the other side? Or do we save/restore this via a
> firmware pseudo-register?

Personally I'd prefer we left that job to userspace.

We could also implement GET_ATTR, in case userspace has forgotten what
it wrote to the hypercall filter. The firmware pseudo-registers are
handy for moving KVM state back and forth 'for free', but I don't think
we need to bend over backwards to migrate state userspace is directly
responsible for.

-- 
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-06 12:31       ` Marc Zyngier
@ 2023-02-07  9:41         ` Suzuki K Poulose
  2023-02-07 11:23           ` Marc Zyngier
  0 siblings, 1 reply; 59+ messages in thread
From: Suzuki K Poulose @ 2023-02-07  9:41 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Oliver Upton,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

Hi Marc,

On 06/02/2023 12:31, Marc Zyngier wrote:
> On Mon, 06 Feb 2023 10:10:41 +0000,
> Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>> Hi,
>>
>> A few cents from the Realm support point of view.
>>
>> On 05/02/2023 10:12, Marc Zyngier wrote:
>>> On Fri, 03 Feb 2023 13:50:40 +0000,
>>> James Morse <james.morse@arm.com> wrote:
>>>>
>>>> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>>
>>>> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
>>>> request to handle all hypercalls that aren't handled by KVM. With the
>>>> help of another capability, this will allow userspace to handle PSCI
>>>> calls.
>>>>
>>>> Suggested-by: James Morse <james.morse@arm.com>
>>>> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>>> Signed-off-by: James Morse <james.morse@arm.com>
>>>>
>>>> ---
>>>>
>>>
>>> On top of Oliver's ask not to make this a blanket "steal everything",
>>> but instead to have an actual request for ranges of forwarded
>>> hypercalls:
>>>
>>>> Notes on this implementation:
>>>>
>>>> * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
>>>>     generalizes the idea to all hypercalls, since that was suggested on
>>>>     the list [2, 3].
>>>>
>>>> * We're reusing kvm_run.hypercall. I copied x0-x5 into
>>>>     kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
>>>>     this, because:
>>>>     - Most user handlers will need to write results back into the
>>>>       registers (x0-x3 for SMCCC), so if we keep this shortcut we should
>>>>       go all the way and read them back on return to kernel.
>>>>     - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
>>>>       handling the call.
>>
>> This may not be always possible, e.g., for Realms. GET_ONE_REG is
>> not supported. So using an explicit passing down of the args is
>> preferrable.
> 
> What is the blocker for CCA to use GET_ONE_REG? The value obviously
> exists and is made available to the host. pKVM is perfectly able to
> use GET_ONE_REG and gets a bunch of zeroes for things that the
> hypervisor has decided to hide from the host.
> 

It is not impossible. On a "HOST CALL" (explicit calls to the Host from
Realm), the GPRs are made available to the host and can be stashed into 
the vcpu reg state and the request can be serviced. However, it is a bit
odd, to make this exception - "the GET_ONE_REG is valid now", while in 
almost all other cases it is invalid (exception of MMIO).

Of course we could always return what is stashed in the vcpu state,
which is may be invalid/ 0. But given the construct of "host doesn't
have access to the register state", it may be a good idea to say, 
request always fails, to indicate that the Host is probably doing 
something wrong, than silently passing on incorrect information.


> Of course, it requires that the hypervisor (the RMM in your case)
> knows about the semantics of the hypercall, but that's obviously

RMM doesn't care about the semantics of hypercall, other than
considering it just like an SMCCC compliant call. The hypercall
arguments/results are passed down/up by the Realm in a separate structure.

> already a requirement (or you wouldn't be able to use PSCI at all).

Realm PSCI calls are always serviced by the RMM. RMM may request
the Hyp for specific information in certain cases, but that doesn't
need to go down to the VMM.

Thanks
Suzuki


> 
> Thanks,
> 
> 	M.
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-07  9:41         ` Suzuki K Poulose
@ 2023-02-07 11:23           ` Marc Zyngier
  2023-02-07 12:46             ` Suzuki K Poulose
  0 siblings, 1 reply; 59+ messages in thread
From: Marc Zyngier @ 2023-02-07 11:23 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Oliver Upton,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

On Tue, 07 Feb 2023 09:41:54 +0000,
Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> 
> Hi Marc,
> 
> On 06/02/2023 12:31, Marc Zyngier wrote:
> > On Mon, 06 Feb 2023 10:10:41 +0000,
> > Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
> >> 
> >> This may not be always possible, e.g., for Realms. GET_ONE_REG is
> >> not supported. So using an explicit passing down of the args is
> >> preferrable.
> > 
> > What is the blocker for CCA to use GET_ONE_REG? The value obviously
> > exists and is made available to the host. pKVM is perfectly able to
> > use GET_ONE_REG and gets a bunch of zeroes for things that the
> > hypervisor has decided to hide from the host.
> > 
> 
> It is not impossible. On a "HOST CALL" (explicit calls to the Host
> from Realm), the GPRs are made available to the host and can be
> stashed into the vcpu reg state and the request can be
> serviced. However, it is a bit odd, to make this exception - "the
> GET_ONE_REG is valid now", while in almost all other cases it is
> invalid (exception of MMIO).

But that's an RMM decision. If the RMM decides to forward the
hypercall to the host (irrespective of the potential forwarding to
userspace), it makes the GPRs available.

If the hypercall is forwarded to userspace, then the host is
responsible to check with the RMM that it will be willing to provide
the required information (passed as GPRs or not).

> Of course we could always return what is stashed in the vcpu state,
> which is may be invalid/ 0. But given the construct of "host doesn't
> have access to the register state", it may be a good idea to say,
> request always fails, to indicate that the Host is probably doing
> something wrong, than silently passing on incorrect information.

I disagree. Either you fail at the delegation point, or you don't. On
getting a hypercall exit to userspace, you are guaranteed that the
GPRs are valid.

> > Of course, it requires that the hypervisor (the RMM in your case)
> > knows about the semantics of the hypercall, but that's obviously
> 
> RMM doesn't care about the semantics of hypercall, other than
> considering it just like an SMCCC compliant call. The hypercall
> arguments/results are passed down/up by the Realm in a separate
> structure.

That's because the RMM doesn't use registers to pass the data. But at
the end of the day, this is the same thing. The host gets the data
from the RMM, stashes it in the GPRs, and exit to userspace.

The important thing here is that GET_ONE_REG is valid in the context
where it matters. If the VMM tries to use it outside of the context of
a hypercall, it gets junk. It's not a bug, it's a feature.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-07 11:23           ` Marc Zyngier
@ 2023-02-07 12:46             ` Suzuki K Poulose
  0 siblings, 0 replies; 59+ messages in thread
From: Suzuki K Poulose @ 2023-02-07 12:46 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Oliver Upton,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

On 07/02/2023 11:23, Marc Zyngier wrote:
> On Tue, 07 Feb 2023 09:41:54 +0000,
> Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>
>> Hi Marc,
>>
>> On 06/02/2023 12:31, Marc Zyngier wrote:
>>> On Mon, 06 Feb 2023 10:10:41 +0000,
>>> Suzuki K Poulose <suzuki.poulose@arm.com> wrote:
>>>>
>>>> This may not be always possible, e.g., for Realms. GET_ONE_REG is
>>>> not supported. So using an explicit passing down of the args is
>>>> preferrable.
>>>
>>> What is the blocker for CCA to use GET_ONE_REG? The value obviously
>>> exists and is made available to the host. pKVM is perfectly able to
>>> use GET_ONE_REG and gets a bunch of zeroes for things that the
>>> hypervisor has decided to hide from the host.
>>>
>>
>> It is not impossible. On a "HOST CALL" (explicit calls to the Host
>> from Realm), the GPRs are made available to the host and can be
>> stashed into the vcpu reg state and the request can be
>> serviced. However, it is a bit odd, to make this exception - "the
>> GET_ONE_REG is valid now", while in almost all other cases it is
>> invalid (exception of MMIO).
> 
> But that's an RMM decision. If the RMM decides to forward the
> hypercall to the host (irrespective of the potential forwarding to
> userspace), it makes the GPRs available.
> 
> If the hypercall is forwarded to userspace, then the host is
> responsible to check with the RMM that it will be willing to provide
> the required information (passed as GPRs or not).

Just to be clear, on a hypercall, all the arguments are provided to
the host. And it is always possible for the host to sync the vcpu
GPR state with those arguments and make them available via the 
GET_ONE_REG call.

> 
>> Of course we could always return what is stashed in the vcpu state,
>> which is may be invalid/ 0. But given the construct of "host doesn't
>> have access to the register state", it may be a good idea to say,
>> request always fails, to indicate that the Host is probably doing
>> something wrong, than silently passing on incorrect information.
> 
> I disagree. Either you fail at the delegation point, or you don't. On
> getting a hypercall exit to userspace, you are guaranteed that the
> GPRs are valid.

This is possible, as I mentioned below, the question is bug vs feature.

> 
>>> Of course, it requires that the hypervisor (the RMM in your case)
>>> knows about the semantics of the hypercall, but that's obviously
>>
>> RMM doesn't care about the semantics of hypercall, other than
>> considering it just like an SMCCC compliant call. The hypercall
>> arguments/results are passed down/up by the Realm in a separate
>> structure.
> 
> That's because the RMM doesn't use registers to pass the data. But at
> the end of the day, this is the same thing. The host gets the data
> from the RMM, stashes it in the GPRs, and exit to userspace.

True.

> 
> The important thing here is that GET_ONE_REG is valid in the context
> where it matters. If the VMM tries to use it outside of the context of
> a hypercall, it gets junk. It's not a bug, it's a feature.

This is what I was concerned about.  As long as this "For any exit
other than hypercall (at least for now), you get junk values when using
GET_ONE_REG for confidential guests" is an acceptable feature, that 
should be alright.

Thanks
Suzuki

> 
> Thanks,
> 
> 	M.
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-05 10:12   ` Marc Zyngier
  2023-02-06 10:10     ` Suzuki K Poulose
  2023-02-06 17:19     ` Oliver Upton
@ 2023-02-07 17:50     ` James Morse
  2023-02-08  8:40       ` Marc Zyngier
  2023-02-11  1:44       ` Oliver Upton
  2 siblings, 2 replies; 59+ messages in thread
From: James Morse @ 2023-02-07 17:50 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Thomas Gleixner,
	Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Borislav Petkov,
	H Peter Anvin, Dave Hansen, Ingo Molnar, Will Deacon,
	Catalin Marinas, Huacai Chen, Suzuki K Poulose, Oliver Upton,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

Hi Marc,

On 05/02/2023 10:12, Marc Zyngier wrote:
> On Fri, 03 Feb 2023 13:50:40 +0000,
> James Morse <james.morse@arm.com> wrote:
>>
>> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>
>> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
>> request to handle all hypercalls that aren't handled by KVM. With the
>> help of another capability, this will allow userspace to handle PSCI
>> calls.

> On top of Oliver's ask not to make this a blanket "steal everything",
> but instead to have an actual request for ranges of forwarded
> hypercalls:
> 
>> Notes on this implementation:
>>
>> * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
>>   generalizes the idea to all hypercalls, since that was suggested on
>>   the list [2, 3].
>>
>> * We're reusing kvm_run.hypercall. I copied x0-x5 into
>>   kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
>>   this, because:
>>   - Most user handlers will need to write results back into the
>>     registers (x0-x3 for SMCCC), so if we keep this shortcut we should
>>     go all the way and read them back on return to kernel.
>>   - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
>>     handling the call.
>>   - SMCCC uses x0-x16 for parameters.
>>   x0 does contain the SMCCC function ID and may be useful for fast
>>   dispatch, we could keep that plus the immediate number.
>>
>> * Add a flag in the kvm_run.hypercall telling whether this is HVC or
>>   SMC?  Can be added later in those bottom longmode and pad fields.

> We definitely need this. A nested hypervisor can (and does) use SMCs
> as the conduit.

Christoffer's comments last time round on this was that EL2 guests get SMC with this,
and EL1 guests get HVC. The VMM could never get both...


> The question is whether they represent two distinct
> namespaces or not. I *think* we can unify them, but someone should
> check and maybe get clarification from the owners of the SMCCC spec.

i.e. the VMM requests 0xC400_0000:0xC400_001F regardless of SMC/HVC?

I don't yet see how a VMM could get HVC out of a virtual-EL2 guest....


>> * On top of this we could share with userspace which HVC ranges are
>>   available and which ones are handled by KVM. That can actually be added
>>   independently, through a vCPU/VM device attribute which doesn't consume
>>   a new ioctl:
>>   - userspace issues HAS_ATTR ioctl on the vcpu fd to query whether this
>>     feature is available.
>>   - userspace queries the number N of HVC ranges using one GET_ATTR.
>>   - userspace passes an array of N ranges using another GET_ATTR. The
>>     array is filled and returned by KVM.

> As mentioned above, I think this interface should go both ways.
> Userspace should request the forwarding of a certain range of
> hypercalls via a similar SET_ATTR interface.

Yup, I'll sync up with Oliver about that.


> Another question is how we migrate VMs that have these forwarding
> requirements. Do we expect the VMM to replay the forwarding as part of
> the setting up on the other side? Or do we save/restore this via a
> firmware pseudo-register?

Pfff. VMMs problem. Enabling these things means it has its own internal state to migrate.
(is this vCPU on or off?), I doubt it needs reminding that the state exists.

That said, Salil is looking at making this work with migration in Qemu.


Thanks,

James

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-03 21:08   ` Oliver Upton
@ 2023-02-07 17:50     ` James Morse
  2023-02-08  9:02       ` Marc Zyngier
  0 siblings, 1 reply; 59+ messages in thread
From: James Morse @ 2023-02-07 17:50 UTC (permalink / raw)
  To: Oliver Upton
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Marc Zyngier,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Suzuki K Poulose,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

Hi Oliver,

On 03/02/2023 21:08, Oliver Upton wrote:
> On Fri, Feb 03, 2023 at 01:50:40PM +0000, James Morse wrote:
>> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
>>
>> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
>> request to handle all hypercalls that aren't handled by KVM.

> I would very much prefer we not go down this route. This capability
> effectively constructs an ABI out of what KVM presently does not
> implement. What would happen if KVM decides to implement a new set
> of hypercalls later down the road that were previously forwarded to
> userspace?

The user-space support would never get called. If we have a wild-west allocation of IDs in
this area we have bigger problems. I'd hope in this example it would be a VMM or an
in-kernel implementation of the same feature.

When I floated something like this before for supporting SDEI in guests, Christoffer
didn't like tie-ing KVM to SMC-CC - hence the all or nothing.

Since then we've had things like Spectre, which I don't think the VMM should
ever be allowed to handle, which makes the whole thing much murkier.


> Instead of a catch-all I think we should take the approach of having
> userspace explicitly request which hypercalls should be forwarded to
> userspace. I proposed something similar [1], but never got around to
> respinning it (oops).

> Let me dust those patches off and align with Marc's suggestions.
> 
> [1]: https://lore.kernel.org/kvmarm/20221110015327.3389351-1-oliver.upton@linux.dev/

I've no problem with doing it like this. This approach was based on Christoffer's previous
feedback, but the world has changed since then.

Let me know if you want me to re-spin that series - I need to get this into some
shape next week for Salil to look at the Qemu changes, as I can't test the whole thing
until that is done.



Thanks,

James

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-07 17:50     ` James Morse
@ 2023-02-08  8:40       ` Marc Zyngier
  2023-02-08 14:25         ` Marc Zyngier
  2023-02-11  1:44       ` Oliver Upton
  1 sibling, 1 reply; 59+ messages in thread
From: Marc Zyngier @ 2023-02-08  8:40 UTC (permalink / raw)
  To: James Morse
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Thomas Gleixner,
	Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Borislav Petkov,
	H Peter Anvin, Dave Hansen, Ingo Molnar, Will Deacon,
	Catalin Marinas, Huacai Chen, Suzuki K Poulose, Oliver Upton,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

On Tue, 07 Feb 2023 17:50:58 +0000,
James Morse <james.morse@arm.com> wrote:
> 
> Hi Marc,
> 
> On 05/02/2023 10:12, Marc Zyngier wrote:
> > On Fri, 03 Feb 2023 13:50:40 +0000,
> > James Morse <james.morse@arm.com> wrote:
> >>
> >> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >>
> >> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
> >> request to handle all hypercalls that aren't handled by KVM. With the
> >> help of another capability, this will allow userspace to handle PSCI
> >> calls.
> 
> > On top of Oliver's ask not to make this a blanket "steal everything",
> > but instead to have an actual request for ranges of forwarded
> > hypercalls:
> > 
> >> Notes on this implementation:
> >>
> >> * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
> >>   generalizes the idea to all hypercalls, since that was suggested on
> >>   the list [2, 3].
> >>
> >> * We're reusing kvm_run.hypercall. I copied x0-x5 into
> >>   kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
> >>   this, because:
> >>   - Most user handlers will need to write results back into the
> >>     registers (x0-x3 for SMCCC), so if we keep this shortcut we should
> >>     go all the way and read them back on return to kernel.
> >>   - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
> >>     handling the call.
> >>   - SMCCC uses x0-x16 for parameters.
> >>   x0 does contain the SMCCC function ID and may be useful for fast
> >>   dispatch, we could keep that plus the immediate number.
> >>
> >> * Add a flag in the kvm_run.hypercall telling whether this is HVC or
> >>   SMC?  Can be added later in those bottom longmode and pad fields.
> 
> > We definitely need this. A nested hypervisor can (and does) use SMCs
> > as the conduit.
> 
> Christoffer's comments last time round on this was that EL2 guests
> get SMC with this, and EL1 guests get HVC. The VMM could never get
> both...

I agree with the first half of the statement (EL2 guest using SMC),
but limiting EL1 guests to HVC is annoying. On systems that have a
secure side, it would make sense to be able to route the guest's SMC
calls to userspace and allow it to emulate/proxy/deny such calls.

This would solve the 10 year old question of "how do we allow a guest
to call into secure services...

> 
> 
> > The question is whether they represent two distinct
> > namespaces or not. I *think* we can unify them, but someone should
> > check and maybe get clarification from the owners of the SMCCC spec.
> 
> i.e. the VMM requests 0xC400_0000:0xC400_001F regardless of SMC/HVC?
> 
> I don't yet see how a VMM could get HVC out of a virtual-EL2 guest....

My statement was badly formulated, and I conflated the need for SMC in
EL2 guests with the (separate) need to handle SMC for EL1 guests.

>
> 
> >> * On top of this we could share with userspace which HVC ranges are
> >>   available and which ones are handled by KVM. That can actually be added
> >>   independently, through a vCPU/VM device attribute which doesn't consume
> >>   a new ioctl:
> >>   - userspace issues HAS_ATTR ioctl on the vcpu fd to query whether this
> >>     feature is available.
> >>   - userspace queries the number N of HVC ranges using one GET_ATTR.
> >>   - userspace passes an array of N ranges using another GET_ATTR. The
> >>     array is filled and returned by KVM.
> 
> > As mentioned above, I think this interface should go both ways.
> > Userspace should request the forwarding of a certain range of
> > hypercalls via a similar SET_ATTR interface.
> 
> Yup, I'll sync up with Oliver about that.
> 
> 
> > Another question is how we migrate VMs that have these forwarding
> > requirements. Do we expect the VMM to replay the forwarding as part of
> > the setting up on the other side? Or do we save/restore this via a
> > firmware pseudo-register?
> 
> Pfff. VMMs problem. Enabling these things means it has its own
> internal state to migrate.  (is this vCPU on or off?), I doubt it
> needs reminding that the state exists.

I'm perfectly OK with the VMM being in the driving seat here and that
it'd have to replay its own state. But it needs some level of
documentation.

> That said, Salil is looking at making this work with migration in Qemu.

Yup, that'd be needed.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-07 17:50     ` James Morse
@ 2023-02-08  9:02       ` Marc Zyngier
  0 siblings, 0 replies; 59+ messages in thread
From: Marc Zyngier @ 2023-02-08  9:02 UTC (permalink / raw)
  To: James Morse
  Cc: Oliver Upton, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Suzuki K Poulose,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

On Tue, 07 Feb 2023 17:50:59 +0000,
James Morse <james.morse@arm.com> wrote:
> 
> Hi Oliver,
> 
> On 03/02/2023 21:08, Oliver Upton wrote:
> > On Fri, Feb 03, 2023 at 01:50:40PM +0000, James Morse wrote:
> >> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> >>
> >> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
> >> request to handle all hypercalls that aren't handled by KVM.
> 
> > I would very much prefer we not go down this route. This capability
> > effectively constructs an ABI out of what KVM presently does not
> > implement. What would happen if KVM decides to implement a new set
> > of hypercalls later down the road that were previously forwarded to
> > userspace?
> 
> The user-space support would never get called. If we have a
> wild-west allocation of IDs in this area we have bigger
> problems. I'd hope in this example it would be a VMM or an in-kernel
> implementation of the same feature.
> 
> When I floated something like this before for supporting SDEI in
> guests, Christoffer didn't like tie-ing KVM to SMC-CC - hence the
> all or nothing.
> 
> Since then we've had things like Spectre, which I don't think the
> VMM should ever be allowed to handle, which makes the whole thing
> much murkier.

That ship has sailed a long time ago. We also have grown a bunch of
in-kernel SMCCC services that are KVM specific (the silly PTP stuff,
for example, not to mention all the pKVM hypercalls...).

It is also likely that these ranges will grow over time (it has been a
long time since the last drop of Spectre-like crap, and something must
be brewing somewhere), so a level of discrimination is important.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-08  8:40       ` Marc Zyngier
@ 2023-02-08 14:25         ` Marc Zyngier
  0 siblings, 0 replies; 59+ messages in thread
From: Marc Zyngier @ 2023-02-08 14:25 UTC (permalink / raw)
  To: James Morse
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Thomas Gleixner,
	Lorenzo Pieralisi, Mark Rutland, Sudeep Holla, Borislav Petkov,
	H Peter Anvin, Dave Hansen, Ingo Molnar, Will Deacon,
	Catalin Marinas, Huacai Chen, Suzuki K Poulose, Oliver Upton,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

On Wed, 08 Feb 2023 08:40:09 +0000,
Marc Zyngier <maz@kernel.org> wrote:
> 
> On Tue, 07 Feb 2023 17:50:58 +0000,
> James Morse <james.morse@arm.com> wrote:
> > 
> > Hi Marc,
> > 
> > On 05/02/2023 10:12, Marc Zyngier wrote:
> > > On Fri, 03 Feb 2023 13:50:40 +0000,
> > > James Morse <james.morse@arm.com> wrote:
> > >>
> > >> From: Jean-Philippe Brucker <jean-philippe@linaro.org>
> > >>
> > >> When capability KVM_CAP_ARM_HVC_TO_USER is available, userspace can
> > >> request to handle all hypercalls that aren't handled by KVM. With the
> > >> help of another capability, this will allow userspace to handle PSCI
> > >> calls.
> > 
> > > On top of Oliver's ask not to make this a blanket "steal everything",
> > > but instead to have an actual request for ranges of forwarded
> > > hypercalls:
> > > 
> > >> Notes on this implementation:
> > >>
> > >> * A similar mechanism was proposed for SDEI some time ago [1]. This RFC
> > >>   generalizes the idea to all hypercalls, since that was suggested on
> > >>   the list [2, 3].
> > >>
> > >> * We're reusing kvm_run.hypercall. I copied x0-x5 into
> > >>   kvm_run.hypercall.args[] to help userspace but I'm tempted to remove
> > >>   this, because:
> > >>   - Most user handlers will need to write results back into the
> > >>     registers (x0-x3 for SMCCC), so if we keep this shortcut we should
> > >>     go all the way and read them back on return to kernel.
> > >>   - QEMU doesn't care about this shortcut, it pulls all vcpu regs before
> > >>     handling the call.
> > >>   - SMCCC uses x0-x16 for parameters.
> > >>   x0 does contain the SMCCC function ID and may be useful for fast
> > >>   dispatch, we could keep that plus the immediate number.
> > >>
> > >> * Add a flag in the kvm_run.hypercall telling whether this is HVC or
> > >>   SMC?  Can be added later in those bottom longmode and pad fields.
> > 
> > > We definitely need this. A nested hypervisor can (and does) use SMCs
> > > as the conduit.
> > 
> > Christoffer's comments last time round on this was that EL2 guests
> > get SMC with this, and EL1 guests get HVC. The VMM could never get
> > both...
> 
> I agree with the first half of the statement (EL2 guest using SMC),
> but limiting EL1 guests to HVC is annoying. On systems that have a
> secure side, it would make sense to be able to route the guest's SMC
> calls to userspace and allow it to emulate/proxy/deny such calls.

You also want to look at the TRNG firmware spec (aka DEN0098), which
explicitly calls out for the use of SMC when EL2 and EL3 are
implemented (see 1.5 "Invocation considerations").

Is it mad? Yes. But madness seems to be the direction of travel these
days.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace
  2023-02-07 17:50     ` James Morse
  2023-02-08  8:40       ` Marc Zyngier
@ 2023-02-11  1:44       ` Oliver Upton
  1 sibling, 0 replies; 59+ messages in thread
From: Oliver Upton @ 2023-02-11  1:44 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Suzuki K Poulose,
	Len Brown, Rafael Wysocki, WANG Xuerui, Salil Mehta,
	Russell King, Jean-Philippe Brucker

Hey James,

On Tue, Feb 07, 2023 at 05:50:58PM +0000, James Morse wrote:

[...]

> > As mentioned above, I think this interface should go both ways.
> > Userspace should request the forwarding of a certain range of
> > hypercalls via a similar SET_ATTR interface.
> 
> Yup, I'll sync up with Oliver about that.

Sorry, I was tied up with some unrelated stuff at work this week. I
polished up what I had and sent out an RFC v2 for SMCCC filtering [1].
Mentioning it here because my email service gives up if I Cc too many
recipients so I didn't include everyone on this series.

[1]: https://lore.kernel.org/linux-arm-kernel/20230211013759.3556016-1-oliver.upton@linux.dev/

-- 
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (31 preceding siblings ...)
  2023-02-03 13:50 ` [RFC PATCH 32/32] cpumask: Add enabled cpumask for present CPUs that can be brought online James Morse
@ 2023-03-07 12:00 ` Jonathan Cameron
  2023-03-13 15:50   ` James Morse
  2023-03-29  2:35 ` Gavin Shan
  2023-03-29  5:52 ` Shaoqin Huang
  34 siblings, 1 reply; 59+ messages in thread
From: Jonathan Cameron @ 2023-03-07 12:00 UTC (permalink / raw)
  To: James Morse
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Marc Zyngier,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Suzuki K Poulose,
	Oliver Upton, Len Brown, Rafael Wysocki, WANG Xuerui,
	Salil Mehta, Russell King, Jean-Philippe Brucker, kangkang.shen

On Fri,  3 Feb 2023 13:50:11 +0000
James Morse <james.morse@arm.com> wrote:


...

> On a system that supports cpuhotplug the MADT has to describe every possible
> CPU at boot. Under KVM, the vGIC needs to know about every possible vCPU before
> the guest is started.
> With these constraints, virtual-cpuhotplug is really just a hypervisor/firmware
> policy about which CPUs can be brought online.
> 
> This series adds support for virtual-cpuhotplug as exactly that: firmware
> policy. This may even work on a physical machine too; for a guest the part of
> firmware is played by the VMM. (typically Qemu).
> 
> PSCI support is modified to return 'DENIED' if the CPU can't be brought
> online/enabled yet. The CPU object's _STA method's enabled bit is used to
> indicate firmware's current disposition. If the CPU has its enabled bit clear,
> it will not be registered with sysfs, and attempts to bring it online will
> fail. The notifications that _STA has changed its value then work in the same
> way as physical hotplug, and firmware can cause the CPU to be registered some
> time later, allowing it to be brought online.

Hi James,

As we discussed on an LOD call a while back, I think that we need some path to
find out if the guest supports vCPU HP or not so that info can be queried by
an orchestrator / libvirt etc.  In general the entity responsible for allocating
extra vCPUs may not know what support the VM has for this feature.

There are various ways we could get this information into the VMM.
My immediate thought is to use one of the ACPI interfaces that lets us write
AML that can set an emulated register. A query to the VMM can check if this
register is set.

So options.

_OSI() - Deprecated on ARM64 so lets not use that ;)
_OSC() - Could add a bit to Table 6.13 Platform-Wide Capabilites in ACPI 6.5 spec.
         Given x86 has a similar online capable bit perhaps this is the best option
         though it is the one that requires a formal code first proposal to ASWG.
_OSC() - Could add a new UUID and put it under a suitable device - maybe all CPUs?
         You could definitely argue this feature is an operating system property.
_DSM() - Similar to OSC but always under a device.
         Whilst can be used for this I'm not sure it really matches intended usecase.

Assuming everyone agrees this bit of introspection is useful,
Rafael / other ACPI specialists: Any suggestions on how best to do this?

Jonathan





> 
> This creates something that looks like cpuhotplug to user-space, as the sysfs
> files appear and disappear, and the udev notifications look the same.
> 
> One notable difference is the CPU present mask, which is exposed via sysfs.
> Because the CPUs remain present throughout, they can still be seen in that mask.
> This value does get used by webbrowsers to estimate the number of CPUs
> as the CPU online mask is constantly changed on mobile phones.
> 
> Linux is tolerant of PSCI returning errors, as its always been allowed to do
> that. To avoid confusing OS that can't tolerate this, we needed an additional
> bit in the MADT GICC flags. This series copies ACPI_MADT_ONLINE_CAPABLE, which
> appears to be for this purpose, but calls it ACPI_MADT_GICC_CPU_CAPABLE as it
> has a different bit position in the GICC.
> 
> This code is unconditionally enabled for all ACPI architectures.
> If there are problems with firmware tables on some devices, the CPUs will
> already be online by the time the acpi_processor_make_enabled() is called.
> A mismatch here causes a firmware-bug message and kernel taint. This should
> only affect people with broken firmware who also boot with maxcpus=1, and
> bring CPUs online later.
> 
> I had a go at switching the remaining architectures over to GENERIC_CPU_DEVICES,
> so that the Kconfig symbol can be removed, but I got stuck with powerpc
> and s390.
> 
> 
> The first patch has already been posted as a fix here:
> https://www.spinics.net/lists/linux-ia64/msg21920.html
> I've only build tested Loongarch and ia64.
> 
> 
> If folk want to play along at home, you'll need a copy of Qemu that supports this.
> https://github.com/salil-mehta/qemu.git salil/virt-cpuhp-armv8/rfc-v1-port29092022.psci.present
> 
> You'll need to fix the numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
> to match your host kernel. Replace your '-smp' argument with something like:
> | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
> 
> then feed the following to the Qemu montior;
> | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
> | (qemu) device_del cpu1
> 
> 
> This series is based on v6.2-rc3, and can be retrieved from:
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v1
> 
> 
> Thanks,
> 
> James Morse (29):
>   ia64: Fix build error due to switch case label appearing next to
>     declaration
>   ACPI: Move ACPI_HOTPLUG_CPU to be enabled per architecture
>   drivers: base: Use present CPUs in GENERIC_CPU_DEVICES
>   drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden
>   drivers: base: Move cpu_dev_init() after node_dev_init()
>   arm64: setup: Switch over to GENERIC_CPU_DEVICES using
>     arch_register_cpu()
>   ia64/topology: Switch over to GENERIC_CPU_DEVICES
>   x86/topology: Switch over to GENERIC_CPU_DEVICES
>   LoongArch: Switch over to GENERIC_CPU_DEVICES
>   arch_topology: Make register_cpu_capacity_sysctl() tolerant to late
>     CPUs
>   ACPI: processor: Add support for processors described as container
>     packages
>   ACPI: processor: Register CPUs that are online, but not described in
>     the DSDT
>   ACPI: processor: Register all CPUs from acpi_processor_get_info()
>   ACPI: Rename ACPI_HOTPLUG_CPU to include 'present'
>   ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove()
>   ACPI: Rename acpi_processor_hotadd_init and remove pre-processor
>     guards
>   ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
>   ACPI: Check _STA present bit before making CPUs not present
>   ACPI: Warn when the present bit changes but the feature is not enabled
>   drivers: base: Implement weak arch_unregister_cpu()
>   LoongArch: Use the __weak version of arch_unregister_cpu()
>   arm64: acpi: Move get_cpu_for_acpi_id() to a header
>   ACPICA: Add new MADT GICC flags fields [code first?]
>   arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a
>     helper
>   irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
>   irqchip/gic-v3: Add support for ACPI's disabled but 'online capable'
>     CPUs
>   ACPI: add support to register CPUs based on the _STA enabled bit
>   arm64: document virtual CPU hotplug's expectations
>   cpumask: Add enabled cpumask for present CPUs that can be brought
>     online
> 
> Jean-Philippe Brucker (3):
>   arm64: psci: Ignore DENIED CPUs
>   KVM: arm64: Pass hypercalls to userspace
>   KVM: arm64: Pass PSCI calls to userspace
> 
>  Documentation/arm64/cpu-hotplug.rst       |  79 ++++++++++++
>  Documentation/arm64/index.rst             |   1 +
>  Documentation/virt/kvm/api.rst            |  31 ++++-
>  Documentation/virt/kvm/arm/hypercalls.rst |   1 +
>  arch/arm64/Kconfig                        |   1 +
>  arch/arm64/include/asm/acpi.h             |  11 ++
>  arch/arm64/include/asm/cpu.h              |   1 -
>  arch/arm64/include/asm/kvm_host.h         |   2 +
>  arch/arm64/kernel/acpi_numa.c             |  11 --
>  arch/arm64/kernel/psci.c                  |   2 +-
>  arch/arm64/kernel/setup.c                 |  13 +-
>  arch/arm64/kernel/smp.c                   |   5 +-
>  arch/arm64/kvm/arm.c                      |  15 ++-
>  arch/arm64/kvm/hypercalls.c               |  28 ++++-
>  arch/arm64/kvm/psci.c                     |  13 ++
>  arch/ia64/Kconfig                         |   2 +
>  arch/ia64/include/asm/acpi.h              |   2 +-
>  arch/ia64/include/asm/cpu.h               |  11 --
>  arch/ia64/kernel/acpi.c                   |   6 +-
>  arch/ia64/kernel/setup.c                  |   2 +-
>  arch/ia64/kernel/sys_ia64.c               |   7 +-
>  arch/ia64/kernel/topology.c               |  35 +-----
>  arch/loongarch/Kconfig                    |   2 +
>  arch/loongarch/kernel/topology.c          |  31 +----
>  arch/x86/Kconfig                          |   2 +
>  arch/x86/include/asm/cpu.h                |   6 -
>  arch/x86/kernel/acpi/boot.c               |   4 +-
>  arch/x86/kernel/topology.c                |  19 +--
>  drivers/acpi/Kconfig                      |   5 +-
>  drivers/acpi/acpi_processor.c             | 146 +++++++++++++++++-----
>  drivers/acpi/processor_core.c             |   2 +-
>  drivers/acpi/scan.c                       | 116 +++++++++++------
>  drivers/base/arch_topology.c              |  38 ++++--
>  drivers/base/cpu.c                        |  31 ++++-
>  drivers/base/init.c                       |   2 +-
>  drivers/firmware/psci/psci.c              |   2 +
>  drivers/irqchip/irq-gic-v3.c              |  38 +++---
>  include/acpi/acpi_bus.h                   |   1 +
>  include/acpi/actbl2.h                     |   1 +
>  include/kvm/arm_hypercalls.h              |   1 +
>  include/kvm/arm_psci.h                    |   4 +
>  include/linux/acpi.h                      |  10 +-
>  include/linux/cpu.h                       |   6 +
>  include/linux/cpumask.h                   |  25 ++++
>  include/uapi/linux/kvm.h                  |   2 +
>  kernel/cpu.c                              |   3 +
>  46 files changed, 532 insertions(+), 244 deletions(-)
>  create mode 100644 Documentation/arm64/cpu-hotplug.rst
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
  2023-03-07 12:00 ` [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug Jonathan Cameron
@ 2023-03-13 15:50   ` James Morse
  2023-03-14 11:02     ` Jonathan Cameron
  0 siblings, 1 reply; 59+ messages in thread
From: James Morse @ 2023-03-13 15:50 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Marc Zyngier,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Suzuki K Poulose,
	Oliver Upton, Len Brown, Rafael Wysocki, WANG Xuerui,
	Salil Mehta, Russell King, Jean-Philippe Brucker, kangkang.shen

Hi Jonathan,

On 07/03/2023 12:00, Jonathan Cameron wrote:
> On Fri,  3 Feb 2023 13:50:11 +0000
> James Morse <james.morse@arm.com> wrote:

>> On a system that supports cpuhotplug the MADT has to describe every possible
>> CPU at boot. Under KVM, the vGIC needs to know about every possible vCPU before
>> the guest is started.
>> With these constraints, virtual-cpuhotplug is really just a hypervisor/firmware
>> policy about which CPUs can be brought online.
>>
>> This series adds support for virtual-cpuhotplug as exactly that: firmware
>> policy. This may even work on a physical machine too; for a guest the part of
>> firmware is played by the VMM. (typically Qemu).
>>
>> PSCI support is modified to return 'DENIED' if the CPU can't be brought
>> online/enabled yet. The CPU object's _STA method's enabled bit is used to
>> indicate firmware's current disposition. If the CPU has its enabled bit clear,
>> it will not be registered with sysfs, and attempts to bring it online will
>> fail. The notifications that _STA has changed its value then work in the same
>> way as physical hotplug, and firmware can cause the CPU to be registered some
>> time later, allowing it to be brought online.

> As we discussed on an LOD call a while back, I think that we need some path to
> find out if the guest supports vCPU HP or not so that info can be queried by
> an orchestrator / libvirt etc.  In general the entity responsible for allocating
> extra vCPUs may not know what support the VM has for this feature.

I agree. For arm64 this is going to be important if/when there are machines that do
physical hotplug of CPUs too.


> There are various ways we could get this information into the VMM.
> My immediate thought is to use one of the ACPI interfaces that lets us write
> AML that can set an emulated register. A query to the VMM can check if this
> register is set.
> 
> So options.
> 
> _OSI() - Deprecated on ARM64 so lets not use that ;)

News to me, I've only just discovered it!


> _OSC() - Could add a bit to Table 6.13 Platform-Wide Capabilites in ACPI 6.5 spec.
>          Given x86 has a similar online capable bit perhaps this is the best option
>          though it is the one that requires a formal code first proposal to ASWG.

I've had a go at writing this one:
https://gitlab.arm.com/linux-arm/linux-jm/-/commit/220b0d8b0261d7467c8705e6f614d57325798859

It'll appear in the v1 of the series once the kernel and qemu bits are all lined up again.


Thanks,

James


> _OSC() - Could add a new UUID and put it under a suitable device - maybe all CPUs?
>          You could definitely argue this feature is an operating system property.
> _DSM() - Similar to OSC but always under a device.
>          Whilst can be used for this I'm not sure it really matches intended usecase.
> 
> Assuming everyone agrees this bit of introspection is useful,
> Rafael / other ACPI specialists: Any suggestions on how best to do this?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
  2023-03-13 15:50   ` James Morse
@ 2023-03-14 11:02     ` Jonathan Cameron
  0 siblings, 0 replies; 59+ messages in thread
From: Jonathan Cameron @ 2023-03-14 11:02 UTC (permalink / raw)
  To: James Morse
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Marc Zyngier,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Suzuki K Poulose,
	Oliver Upton, Len Brown, Rafael Wysocki, WANG Xuerui,
	Salil Mehta, Russell King, kangkang.shen

On Mon, 13 Mar 2023 15:50:52 +0000
James Morse <james.morse@arm.com> wrote:

> Hi Jonathan,
> 
> On 07/03/2023 12:00, Jonathan Cameron wrote:
> > On Fri,  3 Feb 2023 13:50:11 +0000
> > James Morse <james.morse@arm.com> wrote:  
> 
> >> On a system that supports cpuhotplug the MADT has to describe every possible
> >> CPU at boot. Under KVM, the vGIC needs to know about every possible vCPU before
> >> the guest is started.
> >> With these constraints, virtual-cpuhotplug is really just a hypervisor/firmware
> >> policy about which CPUs can be brought online.
> >>
> >> This series adds support for virtual-cpuhotplug as exactly that: firmware
> >> policy. This may even work on a physical machine too; for a guest the part of
> >> firmware is played by the VMM. (typically Qemu).
> >>
> >> PSCI support is modified to return 'DENIED' if the CPU can't be brought
> >> online/enabled yet. The CPU object's _STA method's enabled bit is used to
> >> indicate firmware's current disposition. If the CPU has its enabled bit clear,
> >> it will not be registered with sysfs, and attempts to bring it online will
> >> fail. The notifications that _STA has changed its value then work in the same
> >> way as physical hotplug, and firmware can cause the CPU to be registered some
> >> time later, allowing it to be brought online.  
> 
> > As we discussed on an LOD call a while back, I think that we need some path to
> > find out if the guest supports vCPU HP or not so that info can be queried by
> > an orchestrator / libvirt etc.  In general the entity responsible for allocating
> > extra vCPUs may not know what support the VM has for this feature.  
> 
> I agree. For arm64 this is going to be important if/when there are machines that do
> physical hotplug of CPUs too.
> 
> 
> > There are various ways we could get this information into the VMM.
> > My immediate thought is to use one of the ACPI interfaces that lets us write
> > AML that can set an emulated register. A query to the VMM can check if this
> > register is set.
> > 
> > So options.
> > 
> > _OSI() - Deprecated on ARM64 so lets not use that ;)  
> 
> News to me, I've only just discovered it!
> 
> 
> > _OSC() - Could add a bit to Table 6.13 Platform-Wide Capabilites in ACPI 6.5 spec.
> >          Given x86 has a similar online capable bit perhaps this is the best option
> >          though it is the one that requires a formal code first proposal to ASWG.  
> 
> I've had a go at writing this one:
> https://gitlab.arm.com/linux-arm/linux-jm/-/commit/220b0d8b0261d7467c8705e6f614d57325798859

From a quick glance that looks good to me.

> 
> It'll appear in the v1 of the series once the kernel and qemu bits are all lined up again.

We'll also need to kick off the spec change with a code-first proposal.
I think current standard way to do that is a bugzilla entry in EDK2 repo 
https://bugzilla.tianocore.org/buglist.cgi?component=Specification%20Update&product=EDK2%20Code%20First&resolution=---
and the get someone in ASWG to create equivalent tracking issue in mantis.

Great if you already have that in hand via relevant ARM folks.

Jonathan

> 
> 
> Thanks,
> 
> James
> 
> 
> > _OSC() - Could add a new UUID and put it under a suitable device - maybe all CPUs?
> >          You could definitely argue this feature is an operating system property.
> > _DSM() - Similar to OSC but always under a device.
> >          Whilst can be used for this I'm not sure it really matches intended usecase.
> > 
> > Assuming everyone agrees this bit of introspection is useful,
> > Rafael / other ACPI specialists: Any suggestions on how best to do this?  
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (32 preceding siblings ...)
  2023-03-07 12:00 ` [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug Jonathan Cameron
@ 2023-03-29  2:35 ` Gavin Shan
  2023-09-12 17:01   ` James Morse
  2023-03-29  5:52 ` Shaoqin Huang
  34 siblings, 1 reply; 59+ messages in thread
From: Gavin Shan @ 2023-03-29  2:35 UTC (permalink / raw)
  To: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	Suzuki K Poulose, Oliver Upton, Len Brown, Rafael Wysocki,
	WANG Xuerui, Salil Mehta, Russell King, Jean-Philippe Brucker

Hi James,

On 2/3/23 9:50 PM, James Morse wrote:

[...]

> 
> 
> The first patch has already been posted as a fix here:
> https://www.spinics.net/lists/linux-ia64/msg21920.html
> I've only build tested Loongarch and ia64.
> 

It has been merged to upstream.

> 
> If folk want to play along at home, you'll need a copy of Qemu that supports this.
> https://github.com/salil-mehta/qemu.git salil/virt-cpuhp-armv8/rfc-v1-port29092022.psci.present
> 
> You'll need to fix the numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
> to match your host kernel. Replace your '-smp' argument with something like:
> | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
> 
> then feed the following to the Qemu montior;
> | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
> | (qemu) device_del cpu1
> 
> 
> This series is based on v6.2-rc3, and can be retrieved from:
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v1
> 

I give it a try, but the hot-added CPU needs to be put into online
state manually. I'm not sure if it's expected or not.

     /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64                \
     -accel kvm,dirty-ring-size=65536                                       \
     -machine virt,gic-version=host,nvdimm=on                               \
     -cpu host -smp maxcpus=8,cpus=1,sockets=1,clusters=1,cores=8,threads=1 \
     -m 1024M,slots=16,maxmem=64G                                           \
     -object memory-backend-ram,id=mem0,size=1024M                          \
     -numa node,nodeid=0,memdev=mem0                                        \
     -L /home/gavin/sandbox/qemu.main/build/pc-bios                         \
     -monitor none -serial mon:stdio -nographic -gdb tcp::1234              \
     -bios /home/gavin/sandbox/qemu.main/build/pc-bios/edk2-aarch64-code.fd \
     -kernel /home/gavin/sandbox/linux.guest/arch/arm64/boot/Image          \
     -initrd /home/gavin/sandbox/images/rootfs.cpio.xz                      \
     -append memhp_default_state=online_movable                             \
        :
        :
     guest# cat /proc/cpuinfo | grep "CPU implementer" | wc -l
     1
     (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
     guest# cat /proc/cpuinfo | grep "CPU implementer" | wc -l
     1
     guest# echo 1 > /sys/devices/system/cpu/cpu1/online
     guest# cat /proc/cpuinfo | grep "CPU implementer" | wc -l
     2
     (qemu) device_del cpu1
     guest# cat /proc/cpuinfo | grep "CPU implementer" | wc -l
     1

Note that the QEMU binary is directly built from Salil's repository and
the kernel image is built from v6.3-rc4, plus this patchset excluding the
first patch since it has been merged.

Thanks,
Gavin





^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
  2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
                   ` (33 preceding siblings ...)
  2023-03-29  2:35 ` Gavin Shan
@ 2023-03-29  5:52 ` Shaoqin Huang
  2023-04-03  6:25   ` Gavin Shan
  34 siblings, 1 reply; 59+ messages in thread
From: Shaoqin Huang @ 2023-03-29  5:52 UTC (permalink / raw)
  To: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	Suzuki K Poulose, Oliver Upton, Len Brown, Rafael Wysocki,
	WANG Xuerui, Salil Mehta, Russell King, Jean-Philippe Brucker

Hi James,

On 2/3/23 21:50, James Morse wrote:
> Hello!
> 
> This series adds what looks like cpuhotplug support to arm64 for use in
> virtual machines. It does this by moving the cpu_register() calls for
> architectures that support ACPI out of the arch code by using
> GENERIC_CPU_DEVICES, then into the ACPI processor driver.
> 
> The kubernetes folk really want to be able to add CPUs to an existing VM,
> in exactly the same way they do on x86. The use-case is pre-booting guests
> with one CPU, then adding the number that were actually needed when the
> workload is provisioned.
> 
> Wait? Doesn't arm64 support cpuhotplug already!?
> In the arm world, cpuhotplug gets used to mean removing the power from a CPU.
> The CPU is offline, and remains present. For x86, and ACPI, cpuhotplug
> has the additional step of physically removing the CPU, so that it isn't
> present anymore.
> 
> Arm64 doesn't support this, and can't support it: CPUs are really a slice
> of the SoC, and there is not enough information in the existing ACPI tables
> to describe which bits of the slice also got removed. Without a reference
> machine: adding this support to the spec is a wild goose chase.
> 
> Critically: everything described in the firmware tables must remain present.
> 
> For a virtual machine this is easy as all the other bits of 'virtual SoC'
> are emulated, so they can (and do) remain present when a vCPU is 'removed'.
> 
> On a system that supports cpuhotplug the MADT has to describe every possible
> CPU at boot. Under KVM, the vGIC needs to know about every possible vCPU before
> the guest is started.
> With these constraints, virtual-cpuhotplug is really just a hypervisor/firmware
> policy about which CPUs can be brought online.
> 
> This series adds support for virtual-cpuhotplug as exactly that: firmware
> policy. This may even work on a physical machine too; for a guest the part of
> firmware is played by the VMM. (typically Qemu).
> 
> PSCI support is modified to return 'DENIED' if the CPU can't be brought
> online/enabled yet. The CPU object's _STA method's enabled bit is used to
> indicate firmware's current disposition. If the CPU has its enabled bit clear,
> it will not be registered with sysfs, and attempts to bring it online will
> fail. The notifications that _STA has changed its value then work in the same
> way as physical hotplug, and firmware can cause the CPU to be registered some
> time later, allowing it to be brought online.
> 
> This creates something that looks like cpuhotplug to user-space, as the sysfs
> files appear and disappear, and the udev notifications look the same.
> 
> One notable difference is the CPU present mask, which is exposed via sysfs.
> Because the CPUs remain present throughout, they can still be seen in that mask.
> This value does get used by webbrowsers to estimate the number of CPUs
> as the CPU online mask is constantly changed on mobile phones.
> 
> Linux is tolerant of PSCI returning errors, as its always been allowed to do
> that. To avoid confusing OS that can't tolerate this, we needed an additional
> bit in the MADT GICC flags. This series copies ACPI_MADT_ONLINE_CAPABLE, which
> appears to be for this purpose, but calls it ACPI_MADT_GICC_CPU_CAPABLE as it
> has a different bit position in the GICC.
> 
> This code is unconditionally enabled for all ACPI architectures.
> If there are problems with firmware tables on some devices, the CPUs will
> already be online by the time the acpi_processor_make_enabled() is called.
> A mismatch here causes a firmware-bug message and kernel taint. This should
> only affect people with broken firmware who also boot with maxcpus=1, and
> bring CPUs online later.
> 
> I had a go at switching the remaining architectures over to GENERIC_CPU_DEVICES,
> so that the Kconfig symbol can be removed, but I got stuck with powerpc
> and s390.
> 
> 
> The first patch has already been posted as a fix here:
> https://www.spinics.net/lists/linux-ia64/msg21920.html
> I've only build tested Loongarch and ia64.
> 
> 
> If folk want to play along at home, you'll need a copy of Qemu that supports this.
> https://github.com/salil-mehta/qemu.git salil/virt-cpuhp-armv8/rfc-v1-port29092022.psci.present
> 
> You'll need to fix the numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
> to match your host kernel. Replace your '-smp' argument with something like:
> | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
> 
> then feed the following to the Qemu montior;
> | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
> | (qemu) device_del cpu1
> 
> 
> This series is based on v6.2-rc3, and can be retrieved from:
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v1

I applied this patch series on v6.2-rc3 and using the QEMU cloned from 
the salil-mehta/qemu.git repo. But when I try to run the QEMU, it shows:

$ qemu-system-aarch64: -accel kvm: Failed to enable 
KVM_CAP_ARM_PSCI_TO_USER cap.

Here is the command I use:

$ qemu-system-aarch64
-machine virt
-bios /usr/share/qemu-efi-aarch64/QEMU_EFI.fd
-accel kvm
-m 4096
-smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
-cpu host
-qmp unix:./src.socket,server,nowait
-hda ./XXX.qcow2
-serial unix:./src.serial,server,nowait
-monitor stdio

It seems something related to your notice: You'll need to fix the 
numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
to match your host kernel.

But I'm not actually understand what should I fix, since I haven't 
review the patch series. Could you give me some more information? Maybe 
I'm doing something wrong.

Thanks,

> 
> 
> Thanks,
> 
> James Morse (29):
>    ia64: Fix build error due to switch case label appearing next to
>      declaration
>    ACPI: Move ACPI_HOTPLUG_CPU to be enabled per architecture
>    drivers: base: Use present CPUs in GENERIC_CPU_DEVICES
>    drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden
>    drivers: base: Move cpu_dev_init() after node_dev_init()
>    arm64: setup: Switch over to GENERIC_CPU_DEVICES using
>      arch_register_cpu()
>    ia64/topology: Switch over to GENERIC_CPU_DEVICES
>    x86/topology: Switch over to GENERIC_CPU_DEVICES
>    LoongArch: Switch over to GENERIC_CPU_DEVICES
>    arch_topology: Make register_cpu_capacity_sysctl() tolerant to late
>      CPUs
>    ACPI: processor: Add support for processors described as container
>      packages
>    ACPI: processor: Register CPUs that are online, but not described in
>      the DSDT
>    ACPI: processor: Register all CPUs from acpi_processor_get_info()
>    ACPI: Rename ACPI_HOTPLUG_CPU to include 'present'
>    ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove()
>    ACPI: Rename acpi_processor_hotadd_init and remove pre-processor
>      guards
>    ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
>    ACPI: Check _STA present bit before making CPUs not present
>    ACPI: Warn when the present bit changes but the feature is not enabled
>    drivers: base: Implement weak arch_unregister_cpu()
>    LoongArch: Use the __weak version of arch_unregister_cpu()
>    arm64: acpi: Move get_cpu_for_acpi_id() to a header
>    ACPICA: Add new MADT GICC flags fields [code first?]
>    arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a
>      helper
>    irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
>    irqchip/gic-v3: Add support for ACPI's disabled but 'online capable'
>      CPUs
>    ACPI: add support to register CPUs based on the _STA enabled bit
>    arm64: document virtual CPU hotplug's expectations
>    cpumask: Add enabled cpumask for present CPUs that can be brought
>      online
> 
> Jean-Philippe Brucker (3):
>    arm64: psci: Ignore DENIED CPUs
>    KVM: arm64: Pass hypercalls to userspace
>    KVM: arm64: Pass PSCI calls to userspace
> 
>   Documentation/arm64/cpu-hotplug.rst       |  79 ++++++++++++
>   Documentation/arm64/index.rst             |   1 +
>   Documentation/virt/kvm/api.rst            |  31 ++++-
>   Documentation/virt/kvm/arm/hypercalls.rst |   1 +
>   arch/arm64/Kconfig                        |   1 +
>   arch/arm64/include/asm/acpi.h             |  11 ++
>   arch/arm64/include/asm/cpu.h              |   1 -
>   arch/arm64/include/asm/kvm_host.h         |   2 +
>   arch/arm64/kernel/acpi_numa.c             |  11 --
>   arch/arm64/kernel/psci.c                  |   2 +-
>   arch/arm64/kernel/setup.c                 |  13 +-
>   arch/arm64/kernel/smp.c                   |   5 +-
>   arch/arm64/kvm/arm.c                      |  15 ++-
>   arch/arm64/kvm/hypercalls.c               |  28 ++++-
>   arch/arm64/kvm/psci.c                     |  13 ++
>   arch/ia64/Kconfig                         |   2 +
>   arch/ia64/include/asm/acpi.h              |   2 +-
>   arch/ia64/include/asm/cpu.h               |  11 --
>   arch/ia64/kernel/acpi.c                   |   6 +-
>   arch/ia64/kernel/setup.c                  |   2 +-
>   arch/ia64/kernel/sys_ia64.c               |   7 +-
>   arch/ia64/kernel/topology.c               |  35 +-----
>   arch/loongarch/Kconfig                    |   2 +
>   arch/loongarch/kernel/topology.c          |  31 +----
>   arch/x86/Kconfig                          |   2 +
>   arch/x86/include/asm/cpu.h                |   6 -
>   arch/x86/kernel/acpi/boot.c               |   4 +-
>   arch/x86/kernel/topology.c                |  19 +--
>   drivers/acpi/Kconfig                      |   5 +-
>   drivers/acpi/acpi_processor.c             | 146 +++++++++++++++++-----
>   drivers/acpi/processor_core.c             |   2 +-
>   drivers/acpi/scan.c                       | 116 +++++++++++------
>   drivers/base/arch_topology.c              |  38 ++++--
>   drivers/base/cpu.c                        |  31 ++++-
>   drivers/base/init.c                       |   2 +-
>   drivers/firmware/psci/psci.c              |   2 +
>   drivers/irqchip/irq-gic-v3.c              |  38 +++---
>   include/acpi/acpi_bus.h                   |   1 +
>   include/acpi/actbl2.h                     |   1 +
>   include/kvm/arm_hypercalls.h              |   1 +
>   include/kvm/arm_psci.h                    |   4 +
>   include/linux/acpi.h                      |  10 +-
>   include/linux/cpu.h                       |   6 +
>   include/linux/cpumask.h                   |  25 ++++
>   include/uapi/linux/kvm.h                  |   2 +
>   kernel/cpu.c                              |   3 +
>   46 files changed, 532 insertions(+), 244 deletions(-)
>   create mode 100644 Documentation/arm64/cpu-hotplug.rst
> 

-- 
Shaoqin


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
  2023-03-29  5:52 ` Shaoqin Huang
@ 2023-04-03  6:25   ` Gavin Shan
  0 siblings, 0 replies; 59+ messages in thread
From: Gavin Shan @ 2023-04-03  6:25 UTC (permalink / raw)
  To: Shaoqin Huang, James Morse, linux-pm, loongarch, kvmarm, kvm,
	linux-acpi, linux-arch, linux-ia64, linux-kernel,
	linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	Suzuki K Poulose, Oliver Upton, Len Brown, Rafael Wysocki,
	WANG Xuerui, Salil Mehta, Russell King, Jean-Philippe Brucker

Hi Shaoqin,

On 3/29/23 1:52 PM, Shaoqin Huang wrote:
> On 2/3/23 21:50, James Morse wrote:

[...]

>>
>> The first patch has already been posted as a fix here:
>> https://www.spinics.net/lists/linux-ia64/msg21920.html
>> I've only build tested Loongarch and ia64.
>>
>>
>> If folk want to play along at home, you'll need a copy of Qemu that supports this.
>> https://github.com/salil-mehta/qemu.git salil/virt-cpuhp-armv8/rfc-v1-port29092022.psci.present
>>
>> You'll need to fix the numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
>> to match your host kernel. Replace your '-smp' argument with something like:
>> | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
>>
>> then feed the following to the Qemu montior;
>> | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
>> | (qemu) device_del cpu1
>>
>>
>> This series is based on v6.2-rc3, and can be retrieved from:
>> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v1
> 
> I applied this patch series on v6.2-rc3 and using the QEMU cloned from the salil-mehta/qemu.git repo. But when I try to run the QEMU, it shows:
> 
> $ qemu-system-aarch64: -accel kvm: Failed to enable KVM_CAP_ARM_PSCI_TO_USER cap.
> 
> Here is the command I use:
> 
> $ qemu-system-aarch64
> -machine virt
> -bios /usr/share/qemu-efi-aarch64/QEMU_EFI.fd
> -accel kvm
> -m 4096
> -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
> -cpu host
> -qmp unix:./src.socket,server,nowait
> -hda ./XXX.qcow2
> -serial unix:./src.serial,server,nowait
> -monitor stdio
> 
> It seems something related to your notice: You'll need to fix the numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
> to match your host kernel.
> 
> But I'm not actually understand what should I fix, since I haven't review the patch series. Could you give me some more information? Maybe I'm doing something wrong.
> 

When the kernel is rebased to v6.2.rc3, the two capabilities are conflictsing
between QEMU and host kernel. Please adjust them like below and have a try:

In qemu/linux-headers/linux/kvm.h

#define KVM_CAP_ARM_HVC_TO_USER 250 /* TODO: as per linux 6.1-rc2 */
#define KVM_CAP_ARM_PSCI_TO_USER 251 /* TODO: as per linux 6.1-rc2 */

In linux/include/uapi/linux/kvm.h

#define KVM_CAP_ARM_HVC_TO_USER 250
#define KVM_CAP_ARM_PSCI_TO_USER 251

Thanks,
Gavin



^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [RFC PATCH 30/32] KVM: arm64: Pass PSCI calls to userspace
  2023-02-03 13:50 ` [RFC PATCH 30/32] KVM: arm64: Pass PSCI calls " James Morse
@ 2023-05-23  9:32   ` Salil Mehta
  2023-09-12 17:01     ` James Morse
  0 siblings, 1 reply; 59+ messages in thread
From: Salil Mehta @ 2023-05-23  9:32 UTC (permalink / raw)
  To: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	Suzuki K Poulose, Oliver Upton, Len Brown, Rafael Wysocki,
	WANG Xuerui, Russell King, Jean-Philippe Brucker

Hi James,
After Oliver Upton changes, I think we don't need most of the stuff in
[Patch 29/32] and [Patch 30/32].

I have few questions related to the PSCI Version. Please scroll below.


> From: James Morse <james.morse@arm.com>
> Sent: Friday, February 3, 2023 1:51 PM
> To: linux-pm@vger.kernel.org; loongarch@lists.linux.dev;
> kvmarm@lists.linux.dev; kvm@vger.kernel.org; linux-acpi@vger.kernel.org;
> linux-arch@vger.kernel.org; linux-ia64@vger.kernel.org; linux-
> kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> x86@kernel.org

[...]


> 
> When the KVM_CAP_ARM_PSCI_TO_USER capability is available, userspace can
> request to handle PSCI calls.
> 
> This is required for virtual CPU hotplug to allow the VMM to enforce the
> online/offline policy it has advertised via ACPI. By managing PSCI in
> user-space, the VMM is able to return PSCI_DENIED when the guest attempts
> to bring a disabled vCPU online.
> Without this, the VMM is only able to not-run the vCPU, the kernel will
> have already returned PSCI_SUCCESS to the guest. This results in
> timeouts during boot as the OS must wait for the secondary vCPU.
> 
> SMCCC probe requires PSCI v1.x. If userspace only implements PSCI v0.2,
> the guest won't query SMCCC support through PSCI and won't use the
> spectre workarounds. We could hijack PSCI_VERSION and pretend to support
> v1.0 if userspace does not, then handle all v1.0 calls ourselves
> (including guessing the PSCI feature set implemented by the guest), but
> that seems unnecessary. After all the API already allows userspace to
> force a version lower than v1.0 using the firmware pseudo-registers.
> 
> The KVM_REG_ARM_PSCI_VERSION pseudo-register currently resets to either
> v0.1 if userspace doesn't set KVM_ARM_VCPU_PSCI_0_2, or
> KVM_ARM_PSCI_LATEST (1.0).


I just saw the latest PSCI standard issue (Mar 2023 E Non-Confidential
PSCI 1.2 issue E) and it contains the DENIED return value for the CPU_ON. 

Should we *explicitly* check for PSCI 1.2 support before allowing vCPU
Hot plug support? For this we would need KVM changes.


@James, Since Oliver's patches have now got merged with the kernel I think
you would need to rebase your RFC on the latest kernel as patches 29/32
And 30/32 will create conflicts.


Many thanks
Salil



> 
> Suggested-by: James Morse <james.morse@arm.com>
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> [morse: Added description of why this is required]
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  Documentation/virt/kvm/api.rst            | 14 ++++++++++++++
>  Documentation/virt/kvm/arm/hypercalls.rst |  1 +
>  arch/arm64/include/asm/kvm_host.h         |  1 +
>  arch/arm64/kvm/arm.c                      | 10 +++++++---
>  arch/arm64/kvm/hypercalls.c               |  2 +-
>  arch/arm64/kvm/psci.c                     | 13 +++++++++++++
>  include/kvm/arm_hypercalls.h              |  1 +
>  include/uapi/linux/kvm.h                  |  1 +
>  8 files changed, 39 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst
> b/Documentation/virt/kvm/api.rst
> index 9a28a9cc1163..eb99436a1d97 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -8289,6 +8289,20 @@ This capability indicates that KVM can pass
> unhandled hypercalls to userspace,
>  if the VMM enables it. Hypercalls are passed with KVM_EXIT_HYPERCALL in
>  kvm_run::hypercall.
> 
> +8.38 KVM_CAP_ARM_PSCI_TO_USER
> +-----------------------------
> +
> +:Architectures: arm64
> +
> +When the VMM enables this capability, all PSCI calls are passed to
> userspace
> +instead of being handled by KVM. Capability KVM_CAP_ARM_HVC_TO_USER must
> be
> +enabled first.
> +
> +Userspace should support at least PSCI v1.0. Otherwise SMCCC features
> won't be
> +available to the guest. Userspace does not need to handle the
> SMCCC_VERSION
> +parameter for the PSCI_FEATURES function. The KVM_ARM_VCPU_PSCI_0_2 vCPU
> +feature should be set even if this capability is enabled.
> +
>  9. Known KVM API problems
>  =========================
> 
> diff --git a/Documentation/virt/kvm/arm/hypercalls.rst
> b/Documentation/virt/kvm/arm/hypercalls.rst
> index 3e23084644ba..4c111afa7d74 100644
> --- a/Documentation/virt/kvm/arm/hypercalls.rst
> +++ b/Documentation/virt/kvm/arm/hypercalls.rst
> @@ -34,6 +34,7 @@ The following registers are defined:
>    - Allows any PSCI version implemented by KVM and compatible with
>      v0.2 to be set with SET_ONE_REG
>    - Affects the whole VM (even if the register view is per-vcpu)
> +  - Defaults to PSCI 1.0 if userspace enables KVM_CAP_ARM_PSCI_TO_USER.
> 
>  * KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
>      Holds the state of the firmware support to mitigate CVE-2017-5715, as
> diff --git a/arch/arm64/include/asm/kvm_host.h
> b/arch/arm64/include/asm/kvm_host.h
> index 40911ebfa710..a9eff47bcb43 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -214,6 +214,7 @@ struct kvm_arch {
>  	/* PSCI SYSTEM_SUSPEND enabled for the guest */
>  #define KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED		5
>  #define KVM_ARCH_FLAG_HVC_TO_USER			6
> +#define KVM_ARCH_FLAG_PSCI_TO_USER			7
> 
>  	unsigned long flags;
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 815b7e8f88e1..3dba4e01f4d8 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -76,7 +76,7 @@ int kvm_arch_check_processor_compat(void *opaque)
>  int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  			    struct kvm_enable_cap *cap)
>  {
> -	int r;
> +	int r = -EINVAL;
> 
>  	if (cap->flags)
>  		return -EINVAL;
> @@ -105,8 +105,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		r = 0;
>  		set_bit(KVM_ARCH_FLAG_HVC_TO_USER, &kvm->arch.flags);
>  		break;
> -	default:
> -		r = -EINVAL;
> +	case KVM_CAP_ARM_PSCI_TO_USER:
> +		if (test_bit(KVM_ARCH_FLAG_HVC_TO_USER, &kvm->arch.flags)) {
> +			r = 0;
> +			set_bit(KVM_ARCH_FLAG_PSCI_TO_USER, &kvm->arch.flags);
> +		}
>  		break;
>  	}
> 
> @@ -235,6 +238,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long
> ext)
>  	case KVM_CAP_PTP_KVM:
>  	case KVM_CAP_ARM_SYSTEM_SUSPEND:
>  	case KVM_CAP_ARM_HVC_TO_USER:
> +	case KVM_CAP_ARM_PSCI_TO_USER:
>  		r = 1;
>  		break;
>  	case KVM_CAP_SET_GUEST_DEBUG2:
> diff --git a/arch/arm64/kvm/hypercalls.c b/arch/arm64/kvm/hypercalls.c
> index efaf05d40dab..3c2136cd7a3f 100644
> --- a/arch/arm64/kvm/hypercalls.c
> +++ b/arch/arm64/kvm/hypercalls.c
> @@ -121,7 +121,7 @@ static bool kvm_hvc_call_allowed(struct kvm_vcpu *vcpu,
> u32 func_id)
>  	}
>  }
> 
> -static int kvm_hvc_user(struct kvm_vcpu *vcpu)
> +int kvm_hvc_user(struct kvm_vcpu *vcpu)
>  {
>  	int i;
>  	struct kvm_run *run = vcpu->run;
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index 7fbc4c1b9df0..8505b26f0a83 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -418,6 +418,16 @@ static int kvm_psci_0_1_call(struct kvm_vcpu *vcpu)
>  	return 1;
>  }
> 
> +static bool kvm_psci_call_is_user(struct kvm_vcpu *vcpu)
> +{
> +	/* Handle the special case of SMCCC probe through PSCI */
> +	if (smccc_get_function(vcpu) == PSCI_1_0_FN_PSCI_FEATURES &&
> +	    smccc_get_arg1(vcpu) == ARM_SMCCC_VERSION_FUNC_ID)
> +		return false;
> +
> +	return test_bit(KVM_ARCH_FLAG_PSCI_TO_USER, &vcpu->kvm->arch.flags);
> +}
> +
>  /**
>   * kvm_psci_call - handle PSCI call if r0 value is in range
>   * @vcpu: Pointer to the VCPU struct
> @@ -443,6 +453,9 @@ int kvm_psci_call(struct kvm_vcpu *vcpu)
>  		return 1;
>  	}
> 
> +	if (kvm_psci_call_is_user(vcpu))
> +		return kvm_hvc_user(vcpu);
> +
>  	switch (kvm_psci_version(vcpu)) {
>  	case KVM_ARM_PSCI_1_1:
>  		return kvm_psci_1_x_call(vcpu, 1);
> diff --git a/include/kvm/arm_hypercalls.h b/include/kvm/arm_hypercalls.h
> index 1188f116cf4e..ea7073d1a82e 100644
> --- a/include/kvm/arm_hypercalls.h
> +++ b/include/kvm/arm_hypercalls.h
> @@ -6,6 +6,7 @@
> 
>  #include <asm/kvm_emulate.h>
> 
> +int kvm_hvc_user(struct kvm_vcpu *vcpu);
>  int kvm_hvc_call_handler(struct kvm_vcpu *vcpu);
> 
>  static inline u32 smccc_get_function(struct kvm_vcpu *vcpu)
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 2ead8b9aae56..c5da9d703a0f 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1176,6 +1176,7 @@ struct kvm_ppc_resize_hpt {
>  #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
>  #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
>  #define KVM_CAP_ARM_HVC_TO_USER 226
> +#define KVM_CAP_ARM_PSCI_TO_USER 227
> 
>  #ifdef KVM_CAP_IRQ_ROUTING
> 
> --
> 2.30.2
> 


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 02/32] ACPI: Move ACPI_HOTPLUG_CPU to be enabled per architecture
  2023-02-03 13:50 ` [RFC PATCH 02/32] ACPI: Move ACPI_HOTPLUG_CPU to be enabled per architecture James Morse
@ 2023-08-30 18:31   ` Russell King (Oracle)
  0 siblings, 0 replies; 59+ messages in thread
From: Russell King (Oracle) @ 2023-08-30 18:31 UTC (permalink / raw)
  To: James Morse
  Cc: linux-pm, loongarch, kvmarm, kvm, linux-acpi, linux-arch,
	linux-ia64, linux-kernel, linux-arm-kernel, x86, Marc Zyngier,
	Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland, Sudeep Holla,
	Borislav Petkov, H Peter Anvin, Dave Hansen, Ingo Molnar,
	Will Deacon, Catalin Marinas, Huacai Chen, Suzuki K Poulose,
	Oliver Upton, Len Brown, Rafael Wysocki, WANG Xuerui,
	Salil Mehta, Jean-Philippe Brucker

On Fri, Feb 03, 2023 at 01:50:13PM +0000, James Morse wrote:
> --- a/arch/ia64/Kconfig
> +++ b/arch/ia64/Kconfig
> @@ -15,6 +15,7 @@ config IA64
>  	select ARCH_MIGHT_HAVE_PC_PARPORT
>  	select ARCH_MIGHT_HAVE_PC_SERIO
>  	select ACPI
> +	select ACPI_HOTPLUG_CPU if ACPI
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -5,6 +5,7 @@ config LOONGARCH
>  	select ACPI
>  	select ACPI_GENERIC_GSI if ACPI
>  	select ACPI_MCFG if ACPI
> +	select ACPI_HOTPLUG_CPU if ACPI
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -59,6 +59,7 @@ config X86
>  	#
>  	select ACPI_LEGACY_TABLES_LOOKUP	if ACPI
>  	select ACPI_SYSTEM_POWER_STATES_SUPPORT	if ACPI
> +	select ACPI_HOTPLUG_CPU			if ACPI
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -309,7 +309,6 @@ config ACPI_HOTPLUG_CPU
>  	bool
>  	depends on ACPI_PROCESSOR && HOTPLUG_CPU
>  	select ACPI_CONTAINER
> -	default y

When selecting the symbol, it's a good idea to ensure that its
dependencies are satisfied. So here, ACPI_HOTPLUG_CPU depends on
ACPI_PROCESSOR and HOTPLUG_CPU.

For x86, you're selecting ACPI_HOTPLUG_CPU if ACPI is enabled,
and ACPI can be freely enabled. HOTPLUG_CPU depends on SMP,
which is also a freely selectable option. Lastly,
ACPI_PROCESSOR depends on X86 || IA64 || ARM64 || LOONGARCH,
and is a user selectable, defaulting-y option if ACPI is
enabled.

So, shouldn't the x86 select be:

	select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU

?

I suspect similar issues exist for the other architecture Kconfig files
modified above.

This seems to also be in the latest rfc too, which is why I'm bringing
it up.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
  2023-03-29  2:35 ` Gavin Shan
@ 2023-09-12 17:01   ` James Morse
  2023-09-12 22:38     ` Gavin Shan
  0 siblings, 1 reply; 59+ messages in thread
From: James Morse @ 2023-09-12 17:01 UTC (permalink / raw)
  To: Gavin Shan, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	Suzuki K Poulose, Oliver Upton, Len Brown, Rafael Wysocki,
	WANG Xuerui, Salil Mehta, Russell King, Jean-Philippe Brucker

Hi Gavin,

On 29/03/2023 03:35, Gavin Shan wrote:
> On 2/3/23 9:50 PM, James Morse wrote:

>> If folk want to play along at home, you'll need a copy of Qemu that supports this.
>> https://github.com/salil-mehta/qemu.git
>> salil/virt-cpuhp-armv8/rfc-v1-port29092022.psci.present
>>
>> You'll need to fix the numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
>> to match your host kernel. Replace your '-smp' argument with something like:
>> | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
>>
>> then feed the following to the Qemu montior;
>> | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
>> | (qemu) device_del cpu1
>>
>>
>> This series is based on v6.2-rc3, and can be retrieved from:
>> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v1

> I give it a try, but the hot-added CPU needs to be put into online
> state manually. I'm not sure if it's expected or not.

This is expected. If you want the CPUs to be brought online automatically, you can add
udev rules to do that.


Thanks,

James

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 30/32] KVM: arm64: Pass PSCI calls to userspace
  2023-05-23  9:32   ` Salil Mehta
@ 2023-09-12 17:01     ` James Morse
  0 siblings, 0 replies; 59+ messages in thread
From: James Morse @ 2023-09-12 17:01 UTC (permalink / raw)
  To: Salil Mehta, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	Suzuki K Poulose, Oliver Upton, Len Brown, Rafael Wysocki,
	WANG Xuerui, Russell King, Jean-Philippe Brucker

Hi Salil,

On 23/05/2023 10:32, Salil Mehta wrote:
>> From: James Morse <james.morse@arm.com>
>> Sent: Friday, February 3, 2023 1:51 PM
>> To: linux-pm@vger.kernel.org; loongarch@lists.linux.dev;
>> kvmarm@lists.linux.dev; kvm@vger.kernel.org; linux-acpi@vger.kernel.org;
>> linux-arch@vger.kernel.org; linux-ia64@vger.kernel.org; linux-
>> kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>> x86@kernel.org
> 
> [...]
> 
> 
>>
>> When the KVM_CAP_ARM_PSCI_TO_USER capability is available, userspace can
>> request to handle PSCI calls.
>>
>> This is required for virtual CPU hotplug to allow the VMM to enforce the
>> online/offline policy it has advertised via ACPI. By managing PSCI in
>> user-space, the VMM is able to return PSCI_DENIED when the guest attempts
>> to bring a disabled vCPU online.
>> Without this, the VMM is only able to not-run the vCPU, the kernel will
>> have already returned PSCI_SUCCESS to the guest. This results in
>> timeouts during boot as the OS must wait for the secondary vCPU.
>>
>> SMCCC probe requires PSCI v1.x. If userspace only implements PSCI v0.2,
>> the guest won't query SMCCC support through PSCI and won't use the
>> spectre workarounds. We could hijack PSCI_VERSION and pretend to support
>> v1.0 if userspace does not, then handle all v1.0 calls ourselves
>> (including guessing the PSCI feature set implemented by the guest), but
>> that seems unnecessary. After all the API already allows userspace to
>> force a version lower than v1.0 using the firmware pseudo-registers.
>>
>> The KVM_REG_ARM_PSCI_VERSION pseudo-register currently resets to either
>> v0.1 if userspace doesn't set KVM_ARM_VCPU_PSCI_0_2, or
>> KVM_ARM_PSCI_LATEST (1.0).

> I just saw the latest PSCI standard issue (Mar 2023 E Non-Confidential
> PSCI 1.2 issue E) and it contains the DENIED return value for the CPU_ON. 
> 
> Should we *explicitly* check for PSCI 1.2 support before allowing vCPU
> Hot plug support? For this we would need KVM changes.

The VMM should certainly check which version of PSCI it supports, to make sure it doesn't
return an error code that the spec says that version of PSCI doesn't use.

Moving the PSCI support to the VMM is a pre-requisite for supporting this mechanism,
otherwise KVM will allow the CPUs to come online immediately.


Thanks,

James

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
  2023-09-12 17:01   ` James Morse
@ 2023-09-12 22:38     ` Gavin Shan
  2023-09-13 15:28       ` Russell King (Oracle)
  0 siblings, 1 reply; 59+ messages in thread
From: Gavin Shan @ 2023-09-12 22:38 UTC (permalink / raw)
  To: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	Suzuki K Poulose, Oliver Upton, Len Brown, Rafael Wysocki,
	WANG Xuerui, Salil Mehta, Russell King, Jean-Philippe Brucker


Hi James,

On 9/13/23 03:01, James Morse wrote:
> On 29/03/2023 03:35, Gavin Shan wrote:
>> On 2/3/23 9:50 PM, James Morse wrote:
> 
>>> If folk want to play along at home, you'll need a copy of Qemu that supports this.
>>> https://github.com/salil-mehta/qemu.git
>>> salil/virt-cpuhp-armv8/rfc-v1-port29092022.psci.present
>>>
>>> You'll need to fix the numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
>>> to match your host kernel. Replace your '-smp' argument with something like:
>>> | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
>>>
>>> then feed the following to the Qemu montior;
>>> | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
>>> | (qemu) device_del cpu1
>>>
>>>
>>> This series is based on v6.2-rc3, and can be retrieved from:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v1
> 
>> I give it a try, but the hot-added CPU needs to be put into online
>> state manually. I'm not sure if it's expected or not.
> 
> This is expected. If you want the CPUs to be brought online automatically, you can add
> udev rules to do that.
> 

Yeah, I usually execute the following command to bring the CPU into online state,
after the vCPU is hot added by QMP command.

(qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
guest# echo 1 > /sys/devices/system/cpu/cpux/online

James, the series was posted a while ago and do you have plan to respin
and post RFCv2 in near future? :)

Thanks,
Gavin


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug
  2023-09-12 22:38     ` Gavin Shan
@ 2023-09-13 15:28       ` Russell King (Oracle)
  0 siblings, 0 replies; 59+ messages in thread
From: Russell King (Oracle) @ 2023-09-13 15:28 UTC (permalink / raw)
  To: Gavin Shan
  Cc: James Morse, linux-pm, loongarch, kvmarm, kvm, linux-acpi,
	linux-arch, linux-ia64, linux-kernel, linux-arm-kernel, x86,
	Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Mark Rutland,
	Sudeep Holla, Borislav Petkov, H Peter Anvin, Dave Hansen,
	Ingo Molnar, Will Deacon, Catalin Marinas, Huacai Chen,
	Suzuki K Poulose, Oliver Upton, Len Brown, Rafael Wysocki,
	WANG Xuerui, Salil Mehta, Jean-Philippe Brucker

On Wed, Sep 13, 2023 at 08:38:51AM +1000, Gavin Shan wrote:
> 
> Hi James,
> 
> On 9/13/23 03:01, James Morse wrote:
> > On 29/03/2023 03:35, Gavin Shan wrote:
> > > On 2/3/23 9:50 PM, James Morse wrote:
> > 
> > > > If folk want to play along at home, you'll need a copy of Qemu that supports this.
> > > > https://github.com/salil-mehta/qemu.git
> > > > salil/virt-cpuhp-armv8/rfc-v1-port29092022.psci.present
> > > > 
> > > > You'll need to fix the numbers of KVM_CAP_ARM_HVC_TO_USER and KVM_CAP_ARM_PSCI_TO_USER
> > > > to match your host kernel. Replace your '-smp' argument with something like:
> > > > | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
> > > > 
> > > > then feed the following to the Qemu montior;
> > > > | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
> > > > | (qemu) device_del cpu1
> > > > 
> > > > 
> > > > This series is based on v6.2-rc3, and can be retrieved from:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v1
> > 
> > > I give it a try, but the hot-added CPU needs to be put into online
> > > state manually. I'm not sure if it's expected or not.
> > 
> > This is expected. If you want the CPUs to be brought online automatically, you can add
> > udev rules to do that.
> > 
> 
> Yeah, I usually execute the following command to bring the CPU into online state,
> after the vCPU is hot added by QMP command.
> 
> (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
> guest# echo 1 > /sys/devices/system/cpu/cpux/online
> 
> James, the series was posted a while ago and do you have plan to respin
> and post RFCv2 in near future? :)

I'll pipe up here, because I've been discussing this topic with James
privately.

In James' last email to me, he indicated that he's hoping to publish
the next iteration of the patches towards the end of this week. I
suspect that's conditional on there being no major issues coming up.

One of the things that I think would help this patch set along is if
people could test it on x86, to make sure that there aren't any
regressions on random x86 hardware - and report successes and
failures so confidence in the patch series can be gained.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2023-09-13 15:37 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-03 13:50 [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug James Morse
2023-02-03 13:50 ` [RFC PATCH 01/32] ia64: Fix build error due to switch case label appearing next to declaration James Morse
2023-02-03 13:50 ` [RFC PATCH 02/32] ACPI: Move ACPI_HOTPLUG_CPU to be enabled per architecture James Morse
2023-08-30 18:31   ` Russell King (Oracle)
2023-02-03 13:50 ` [RFC PATCH 03/32] drivers: base: Use present CPUs in GENERIC_CPU_DEVICES James Morse
2023-02-03 13:50 ` [RFC PATCH 04/32] drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden James Morse
2023-02-03 13:50 ` [RFC PATCH 05/32] drivers: base: Move cpu_dev_init() after node_dev_init() James Morse
2023-02-03 13:50 ` [RFC PATCH 06/32] arm64: setup: Switch over to GENERIC_CPU_DEVICES using arch_register_cpu() James Morse
2023-02-03 13:50 ` [RFC PATCH 07/32] ia64/topology: Switch over to GENERIC_CPU_DEVICES James Morse
2023-02-03 13:50 ` [RFC PATCH 08/32] x86/topology: " James Morse
2023-02-03 13:50 ` [RFC PATCH 09/32] LoongArch: " James Morse
2023-02-03 13:50 ` [RFC PATCH 10/32] arch_topology: Make register_cpu_capacity_sysctl() tolerant to late CPUs James Morse
2023-02-03 13:50 ` [RFC PATCH 11/32] ACPI: processor: Add support for processors described as container packages James Morse
2023-02-03 13:50 ` [RFC PATCH 12/32] ACPI: processor: Register CPUs that are online, but not described in the DSDT James Morse
2023-02-03 13:50 ` [RFC PATCH 13/32] ACPI: processor: Register all CPUs from acpi_processor_get_info() James Morse
2023-02-03 13:50 ` [RFC PATCH 14/32] ACPI: Rename ACPI_HOTPLUG_CPU to include 'present' James Morse
2023-02-03 13:50 ` [RFC PATCH 15/32] ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove() James Morse
2023-02-03 13:50 ` [RFC PATCH 16/32] ACPI: Rename acpi_processor_hotadd_init and remove pre-processor guards James Morse
2023-02-03 13:50 ` [RFC PATCH 17/32] ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug James Morse
2023-02-03 13:50 ` [RFC PATCH 18/32] ACPI: Check _STA present bit before making CPUs not present James Morse
2023-02-03 13:50 ` [RFC PATCH 19/32] ACPI: Warn when the present bit changes but the feature is not enabled James Morse
2023-02-03 13:50 ` [RFC PATCH 20/32] drivers: base: Implement weak arch_unregister_cpu() James Morse
2023-02-03 13:50 ` [RFC PATCH 21/32] LoongArch: Use the __weak version of arch_unregister_cpu() James Morse
2023-02-03 13:50 ` [RFC PATCH 22/32] arm64: acpi: Move get_cpu_for_acpi_id() to a header James Morse
2023-02-03 13:50 ` [RFC PATCH 23/32] ACPICA: Add new MADT GICC flags fields [code first?] James Morse
2023-02-03 13:50 ` [RFC PATCH 24/32] arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a helper James Morse
2023-02-03 13:50 ` [RFC PATCH 25/32] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc() James Morse
2023-02-03 13:50 ` [RFC PATCH 26/32] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs James Morse
2023-02-03 13:50 ` [RFC PATCH 27/32] arm64: psci: Ignore DENIED CPUs James Morse
2023-02-03 13:50 ` [RFC PATCH 28/32] ACPI: add support to register CPUs based on the _STA enabled bit James Morse
2023-02-03 13:50 ` [RFC PATCH 29/32] KVM: arm64: Pass hypercalls to userspace James Morse
2023-02-03 21:08   ` Oliver Upton
2023-02-07 17:50     ` James Morse
2023-02-08  9:02       ` Marc Zyngier
2023-02-05 10:12   ` Marc Zyngier
2023-02-06 10:10     ` Suzuki K Poulose
2023-02-06 12:31       ` Marc Zyngier
2023-02-07  9:41         ` Suzuki K Poulose
2023-02-07 11:23           ` Marc Zyngier
2023-02-07 12:46             ` Suzuki K Poulose
2023-02-06 17:19     ` Oliver Upton
2023-02-07 17:50     ` James Morse
2023-02-08  8:40       ` Marc Zyngier
2023-02-08 14:25         ` Marc Zyngier
2023-02-11  1:44       ` Oliver Upton
2023-02-03 13:50 ` [RFC PATCH 30/32] KVM: arm64: Pass PSCI calls " James Morse
2023-05-23  9:32   ` Salil Mehta
2023-09-12 17:01     ` James Morse
2023-02-03 13:50 ` [RFC PATCH 31/32] arm64: document virtual CPU hotplug's expectations James Morse
2023-02-03 13:50 ` [RFC PATCH 32/32] cpumask: Add enabled cpumask for present CPUs that can be brought online James Morse
2023-03-07 12:00 ` [RFC PATCH 00/32] ACPI/arm64: add support for virtual cpuhotplug Jonathan Cameron
2023-03-13 15:50   ` James Morse
2023-03-14 11:02     ` Jonathan Cameron
2023-03-29  2:35 ` Gavin Shan
2023-09-12 17:01   ` James Morse
2023-09-12 22:38     ` Gavin Shan
2023-09-13 15:28       ` Russell King (Oracle)
2023-03-29  5:52 ` Shaoqin Huang
2023-04-03  6:25   ` Gavin Shan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).