linux-arch.vger.kernel.org archive mirror
* [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug
@ 2024-04-12 14:37 Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 01/18] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER Jonathan Cameron
                   ` (17 more replies)
  0 siblings, 18 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

This patch set changes hands again in an attempt to set a new record for
most people who have worked on a single problem.

Miguel has been working on a patch set that renames and factors out
arch specific code, and it will clash with this one.
https://lore.kernel.org/linux-acpi/20240409150536.9933-1-miguel.luis@oracle.com/
[RFC PATCH 0/4] ACPI: processor: refactor acpi_processor_{get_info|remove}

v5 changes:
- Rebase on Rafael's rework of acpi_scan_check_and_detach() series that
  superseded the original first patch.
  https://lore.kernel.org/linux-acpi/6021126.lOV4Wx5bFT@kreacher/
  That dealt with what I thought was the most controversial part of the
  series - checking the enabled bit of ACPI _STA for CPUs.
- Change the overall handling so that arch_register_cpu() returns
  -EPROBE_DEFER if the particular architecture is not yet ready to
  answer the question of whether a particular CPU may be used.
  This occurs for ARM64 + ACPI in 2 cases.
  1) At the initial callsite early in boot, before the AML interpreter
     is available and so the code can't query _STA.
  2) If _STA is queried but a particular CPU is present but not enabled.
     Those are the ones we are going to hotplug later.
  For all other architectures and ARM64 DT boots this deferred
  flow is not used.
- Make the _make_enabled() and _make_not_enabled() flows more similar
  to the _make_present() and _make_not_present(). There are still
  sufficient differences that I don't think it makes sense to combine
  the code, but ensuring the locking and NUMA handling brings them
  closer together.  Note that an additional series will address the
  question of onlining and offlining the NUMA node as for now it
  will always be present (that series is not necessary for initial
  merge of this feature).
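
To make the two deferral cases above concrete, here is a minimal userspace
sketch. It is not the arm64 implementation from this series: struct
model_cpu, model_arch_register_cpu() and the interpreter_ready flag are
hypothetical names, and EPROBE_DEFER is defined locally because userspace
headers do not export it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal errno value, defined here only for the sketch. */
#define EPROBE_DEFER 517

/* Hypothetical model of one CPU as seen by the registration code. */
struct model_cpu {
	bool sta_present;	/* _STA present bit */
	bool sta_enabled;	/* _STA enabled bit */
	bool registered;	/* set once registration succeeds */
};

/*
 * Hypothetical stand-in for an arm64 + ACPI arch_register_cpu().
 * interpreter_ready models whether AML, and therefore _STA, can be
 * evaluated yet. Only present CPUs are expected to be passed in.
 */
static int model_arch_register_cpu(struct model_cpu *cpu, bool interpreter_ready)
{
	assert(cpu != NULL);

	/* Case 1: too early in boot to ask firmware anything. */
	if (!interpreter_ready)
		return -EPROBE_DEFER;

	/* Case 2: present but not enabled, to be hotplugged later. */
	if (cpu->sta_present && !cpu->sta_enabled)
		return -EPROBE_DEFER;

	cpu->registered = true;
	return 0;
}
```

In this model the caller simply tries again later: once when the ACPI
processor driver runs, and again whenever firmware reports that _STA has
changed.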

Dropped RFC because I think this is getting close to ready for merging
and now we are interested in normal review rather than calling out
significant remaining questions.

Updated version of James' original introduction.

This series adds what looks like cpuhotplug support to arm64 for use in
virtual machines. It does this by moving the register_cpu() calls for
architectures that support ACPI into an arch specific call made from
the ACPI processor driver.
 
The kubernetes folk really want to be able to add CPUs to an existing VM,
in exactly the same way they do on x86. The use-case is pre-booting guests
with one CPU, then adding the number that were actually needed when the
workload is provisioned.

Wait? Doesn't arm64 support cpuhotplug already!?
In the arm world, cpuhotplug gets used to mean removing the power from a CPU.
The CPU is offline, and remains present. For x86, and ACPI, cpuhotplug
has the additional step of physically removing the CPU, so that it isn't
present anymore.
 
Arm64 doesn't support this, and can't support it: CPUs are really a slice
of the SoC, and there is not enough information in the existing ACPI tables
to describe which bits of the slice also got removed. Without a reference
machine, adding this support to the spec is a wild goose chase.
 
Critically: everything described in the firmware tables must remain present.
 
For a virtual machine this is easy as all the other bits of 'virtual SoC'
are emulated, so they can (and do) remain present when a vCPU is 'removed'.

On a system that supports cpuhotplug the MADT has to describe every possible
CPU at boot. Under KVM, the vGIC needs to know about every possible vCPU before
the guest is started.
With these constraints, virtual-cpuhotplug is really just a hypervisor/firmware
policy about which CPUs can be brought online.
 
This series adds support for virtual-cpuhotplug as exactly that: firmware
policy. This may even work on a physical machine too; for a guest the part of
firmware is played by the VMM (typically Qemu).
 
PSCI support is modified to return 'DENIED' if the CPU can't be brought
online/enabled yet. The CPU object's _STA method's enabled bit is used to
indicate firmware's current disposition. If the CPU has its enabled bit clear,
it will not be registered with sysfs, and attempts to bring it online will
fail. The notifications that _STA has changed its value then work in the same
way as physical hotplug, and firmware can cause the CPU to be registered some
time later, allowing it to be brought online.
 
This creates something that looks like cpuhotplug to user-space, as the sysfs
files appear and disappear, and the udev notifications look the same.
 
One notable difference is the CPU present mask, which is exposed via sysfs.
Because the CPUs remain present throughout, they can still be seen in that mask.
This value does get used by web browsers to estimate the number of CPUs
as the CPU online mask is constantly changed on mobile phones.
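
As a rough illustration of how the masks relate, the sketch below models
present, enabled and online as plain 64-bit bitmaps. All names are
hypothetical and none of this is kernel code; it only shows that enabling
and disabling a vCPU leaves the present mask untouched, and that bringing a
CPU online fails while its enabled bit is clear.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, up to 64 CPUs. */
struct model_cpumasks {
	uint64_t present;	/* described by firmware; fixed after boot here */
	uint64_t enabled;	/* firmware policy currently allows use */
	uint64_t online;	/* actually running */
};

/* Firmware sets the enabled bit; only present CPUs can become enabled. */
static void model_enable(struct model_cpumasks *m, unsigned int cpu)
{
	assert(cpu < 64);
	m->enabled |= m->present & (UINT64_C(1) << cpu);
}

/*
 * Firmware clears the enabled bit. For brevity the model also clears the
 * online bit rather than requiring the CPU to be offlined first.
 */
static void model_disable(struct model_cpumasks *m, unsigned int cpu)
{
	assert(cpu < 64);
	uint64_t bit = UINT64_C(1) << cpu;

	m->online &= ~bit;
	m->enabled &= ~bit;
}

/* Returns false when the request is refused, as with PSCI 'DENIED'. */
static bool model_online(struct model_cpumasks *m, unsigned int cpu)
{
	assert(cpu < 64);
	uint64_t bit = UINT64_C(1) << cpu;

	if (!(m->enabled & bit))
		return false;

	m->online |= bit;
	return true;
}
```

Starting from present = 0x7 and enabled = online = 0x1 (three possible
vCPUs, one booted), enabling and onlining CPU 1 gives online = 0x3 while
present stays 0x7 throughout.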
 
Linux is tolerant of PSCI returning errors, as it's always been allowed to do
that. To avoid confusing OSes that can't tolerate this, we needed an additional
bit in the MADT GICC flags. This series copies ACPI_MADT_ONLINE_CAPABLE, which
appears to be for this purpose, but calls it ACPI_MADT_GICC_CPU_CAPABLE as it
has a different bit position in the GICC.
 
This code is unconditionally enabled for all ACPI architectures, though for
now only arm64 will have deferred the register_cpu() calls.

If there are problems with firmware tables on some devices, the CPUs will
already be online by the time the acpi_processor_make_enabled() is called.
A mismatch here causes a firmware-bug message and kernel taint. This should
only affect people with broken firmware who also boot with maxcpus=1, and
bring CPUs online later.
 
If folk want to play along at home, you'll need a copy of Qemu that supports this.
https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2

Replace your '-smp' argument with something like:
 | -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
 
 then feed the following to the Qemu monitor:
 | (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
 | (qemu) device_del cpu1

James Morse (11):
  ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  ACPI: Rename acpi_processor_hotadd_init and  remove pre-processor
    guards
  ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
  ACPI: Check _STA present bit before making CPUs not present
  ACPI: Warn when the present bit changes but the feature is not enabled
  arm64: acpi: Move get_cpu_for_acpi_id() to a header
  irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
  irqchip/gic-v3: Add support for ACPI's disabled but 'online capable'
    CPUs
  ACPI: add support to (un)register CPUs based on the _STA enabled bit
  arm64: document virtual CPU hotplug's expectations
  cpumask: Add enabled cpumask for present CPUs that can be brought
    online

Jean-Philippe Brucker (1):
  arm64: psci: Ignore DENIED CPUs

Jonathan Cameron (5):
  cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER
  ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  ACPI: utils: Add an acpi_sta_enabled() helper and use it in
    acpi_processor_make_present()
  ACPI: scan: Add parameter to allow defering some actions in
    acpi_scan_check_and_detach.
  arm64: arch_register_cpu() variant to allow checking of ACPI _STA

Russell King (1):
  ACPI: convert acpi_processor_post_eject() to use IS_ENABLED()

 .../ABI/testing/sysfs-devices-system-cpu      |   6 +
 Documentation/arch/arm64/cpu-hotplug.rst      |  79 ++++++++++++
 Documentation/arch/arm64/index.rst            |   1 +
 arch/arm64/include/asm/acpi.h                 |  11 ++
 arch/arm64/kernel/acpi_numa.c                 |  11 --
 arch/arm64/kernel/psci.c                      |   2 +-
 arch/arm64/kernel/smp.c                       |  23 +++-
 drivers/acpi/acpi_processor.c                 | 112 +++++++++++++++---
 drivers/acpi/scan.c                           |  57 +++++++--
 drivers/acpi/utils.c                          |  21 ++++
 drivers/base/cpu.c                            |  12 +-
 drivers/irqchip/irq-gic-v3.c                  |  32 +++--
 include/acpi/acpi_bus.h                       |   2 +
 include/linux/acpi.h                          |   5 +-
 include/linux/cpumask.h                       |  25 ++++
 kernel/cpu.c                                  |   3 +
 16 files changed, 346 insertions(+), 56 deletions(-)
 create mode 100644 Documentation/arch/arm64/cpu-hotplug.rst

-- 
2.39.2



* [PATCH v5 01/18] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 17:42   ` Rafael J. Wysocki
  2024-04-22  3:53   ` Gavin Shan
  2024-04-12 14:37 ` [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance Jonathan Cameron
                   ` (16 subsequent siblings)
  17 siblings, 2 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

For arm64 the CPU registration cannot complete until the ACPI interpreter
is up and running, so in those cases the arch specific
arch_register_cpu() will return -EPROBE_DEFER at this stage and the
registration will be attempted later.
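
The intent described above can be sketched as a small predicate: success
and deferral are both silent, and anything else is a real failure worth a
warning. model_should_warn() is a hypothetical helper for illustration only,
and EPROBE_DEFER is defined locally because userspace headers do not
provide it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno value, defined here only for the sketch. */
#define EPROBE_DEFER 517

/*
 * Hypothetical helper: decide whether a registration result deserves a
 * warning. 0 means success, -EPROBE_DEFER means "ask again later".
 */
static bool model_should_warn(int ret)
{
	return ret != 0 && ret != -EPROBE_DEFER;
}
```

So model_should_warn(0) and model_should_warn(-EPROBE_DEFER) are both
false, while model_should_warn(-ENODEV) is true.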

Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v5: New patch.
    Note that for now no arch_register_cpu() calls return -EPROBE_DEFER
    so it has no impact until the arm64 one is added later in this series.
---
 drivers/base/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 56fba44ba391..b9d0d14e5960 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -558,7 +558,7 @@ static void __init cpu_dev_register_generic(void)
 
 	for_each_present_cpu(i) {
 		ret = arch_register_cpu(i);
-		if (ret)
+		if (ret && ret != -EPROBE_DEFER)
 			pr_warn("register_cpu %d failed (%d)\n", i, ret);
 	}
 }
-- 
2.39.2



* [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 01/18] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 18:10   ` Rafael J. Wysocki
  2024-04-12 14:37 ` [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info() Jonathan Cameron
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

The arm64 specific arch_register_cpu() needs to access the _STA
method of the DSDT object so make it available by assigning the
appropriate handle to the struct cpu instance.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/acpi/acpi_processor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 7a0dd35d62c9..93e029403d05 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	union acpi_object object = { 0 };
 	struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
 	struct acpi_processor *pr = acpi_driver_data(device);
+	struct cpu *c;
 	int device_declaration = 0;
 	acpi_status status = AE_OK;
 	static int cpu0_initialized;
@@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
 			cpufreq_add_device("acpi-cpufreq");
 	}
 
+	c = &per_cpu(cpu_devices, pr->id);
+	ACPI_COMPANION_SET(&c->dev, device);
 	/*
 	 *  Extra Processor objects may be enumerated on MP systems with
 	 *  less than the max # of CPUs. They should be ignored _iff
-- 
2.39.2



* [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 01/18] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 18:30   ` Rafael J. Wysocki
  2024-04-16 14:00   ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 04/18] ACPI: Rename acpi_processor_hotadd_init and remove pre-processor guards Jonathan Cameron
                   ` (14 subsequent siblings)
  17 siblings, 2 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

The arm64 specific arch_register_cpu() call may defer CPU registration
until the ACPI interpreter is available and the _STA method can
be evaluated.

If this occurs, then a second attempt is made in
acpi_processor_get_info(). Note that the arm64 specific call has
not yet been added so for now this will never be successfully
called.

Systems can still be booted with 'acpi=off', or not include an
ACPI description at all as in these cases arch_register_cpu()
will not have deferred registration when first called.

This moves the CPU register logic back to a subsys_initcall(),
while the memory nodes will have been registered earlier.
Note this is where the call was prior to the cleanup series so
there should be no side effects of moving it back again for this
specific case.

[PATCH 00/21] Initial cleanups for vCPU HP.
https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/

e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v5: Update commit message to make it clear this is moving the
    init back to where it was until very recently.

    No longer change the condition in the earlier registration point
    as that will be handled by the arm64 registration routine
    deferring until called again here.
---
 drivers/acpi/acpi_processor.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 93e029403d05..c78398cdd060 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
 
 	c = &per_cpu(cpu_devices, pr->id);
 	ACPI_COMPANION_SET(&c->dev, device);
+	/*
+	 * Register CPUs that are present. get_cpu_device() is used to skip
+	 * duplicate CPU descriptions from firmware.
+	 */
+	if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
+	    !get_cpu_device(pr->id)) {
+		int ret = arch_register_cpu(pr->id);
+
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 *  Extra Processor objects may be enumerated on MP systems with
 	 *  less than the max # of CPUs. They should be ignored _iff
-- 
2.39.2



* [PATCH v5 04/18] ACPI: Rename acpi_processor_hotadd_init and  remove pre-processor guards
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (2 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info() Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 05/18] ACPI: utils: Add an acpi_sta_enabled() helper and use it in acpi_processor_make_present() Jonathan Cameron
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

acpi_processor_hotadd_init() will make a CPU present by mapping it
based on its hardware id.

'hotadd_init' is ambiguous once there are two different behaviours
for cpu hotplug. This is for toggling the _STA present bit. Subsequent
patches will add support for toggling the _STA enabled bit, named
acpi_processor_make_enabled().

Rename it acpi_processor_make_present() to make it clear this is
for CPUs that were not previously present.

Expose the function prototypes it uses to allow the preprocessor
guards to be removed. The IS_ENABLED() check will let the compiler
dead-code elimination pass remove this if it isn't going to be
used.
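
For readers unfamiliar with the pattern, here is a simplified userspace
sketch of trading #ifdef guards for a plain C condition. MODEL_HOTPLUG_CPU,
model_map_cpu() and model_make_present() are hypothetical, and the macro is
a much simpler stand-in than the kernel's IS_ENABLED(), which also copes
with config symbols that are not defined at all.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical build-time switch; always defined, to 0 or 1. */
#ifndef MODEL_HOTPLUG_CPU
#define MODEL_HOTPLUG_CPU 0
#endif

/*
 * Must be declared unconditionally: the call below is always parsed and
 * type-checked, even when the feature is compiled out.
 */
static int model_map_cpu(int phys_id)
{
	return phys_id;	/* pretend the logical id equals the physical id */
}

static int model_make_present(int phys_id)
{
	/* Constant condition: an optimising compiler can drop the rest. */
	if (!MODEL_HOTPLUG_CPU)
		return -ENODEV;

	if (phys_id < 0)
		return -ENODEV;

	return model_map_cpu(phys_id);
}
```

The benefit over #ifdef is that both configurations are compiled and
type-checked, so the disabled path cannot silently rot. The cost is that
every function it calls needs a visible prototype, which is why the guards
around the prototypes have to go as well.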

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v5: Rebase.
---
 drivers/acpi/acpi_processor.c | 14 +++++---------
 include/linux/acpi.h          |  2 --
 2 files changed, 5 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index c78398cdd060..05264722c207 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -183,13 +183,15 @@ static void __init acpi_pcc_cpufreq_init(void) {}
 #endif /* CONFIG_X86 */
 
 /* Initialization */
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
-static int acpi_processor_hotadd_init(struct acpi_processor *pr)
+static int acpi_processor_make_present(struct acpi_processor *pr)
 {
 	unsigned long long sta;
 	acpi_status status;
 	int ret;
 
+	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
+		return -ENODEV;
+
 	if (invalid_phys_cpuid(pr->phys_id))
 		return -ENODEV;
 
@@ -223,12 +225,6 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
 	cpu_maps_update_done();
 	return ret;
 }
-#else
-static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
-{
-	return -ENODEV;
-}
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
 
 static int acpi_processor_get_info(struct acpi_device *device)
 {
@@ -338,7 +334,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	 *  because cpuid <-> apicid mapping is persistent now.
 	 */
 	if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
-		int ret = acpi_processor_hotadd_init(pr);
+		int ret = acpi_processor_make_present(pr);
 
 		if (ret)
 			return ret;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 34829f2c517a..2629c459738a 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -302,12 +302,10 @@ static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
 }
 #endif
 
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
 /* Arch dependent functions for cpu hotplug support */
 int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
 		 int *pcpu);
 int acpi_unmap_cpu(int cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
 
 #ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
 int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
-- 
2.39.2



* [PATCH v5 05/18] ACPI: utils: Add an acpi_sta_enabled() helper and use it in acpi_processor_make_present()
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (3 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 04/18] ACPI: Rename acpi_processor_hotadd_init and remove pre-processor guards Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 06/18] ACPI: scan: Add parameter to allow defering some actions in acpi_scan_check_and_detach Jonathan Cameron
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

A device is enabled only if both the present and enabled bits
are set in the result of calling the _STA method, or the
_STA method is not present (in which case the device is always
present and enabled).
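
The rule above can be sketched in userspace as follows. The MODEL_STA_*
names and model_sta_enabled() are hypothetical; the bit positions follow the
_STA definition (bit 0 present, bit 1 enabled). The sketch leaves out the
case where evaluating _STA itself fails, and EPROBE_DEFER is defined locally
because userspace headers do not provide it.

```c
#include <assert.h>
#include <stdbool.h>

/* Kernel-internal errno value, defined here only for the sketch. */
#define EPROBE_DEFER 517

#define MODEL_STA_PRESENT 0x01ULL	/* _STA bit 0 */
#define MODEL_STA_ENABLED 0x02ULL	/* _STA bit 1 */

/*
 * Hypothetical decoder: has_sta says whether the device has a _STA
 * method at all, sta is the value that method returned.
 */
static int model_sta_enabled(bool has_sta, unsigned long long sta)
{
	/* No _STA means the device is always present and enabled. */
	if (!has_sta)
		return 0;

	/* Both bits are required; otherwise ask again later. */
	if (!(sta & MODEL_STA_PRESENT) || !(sta & MODEL_STA_ENABLED))
		return -EPROBE_DEFER;

	return 0;
}
```

With this model a _STA value of 0x01 (present only) or 0x02 (enabled only)
defers, while 0x03 or the common 0x0F succeeds.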

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v5: New patch
---
 drivers/acpi/acpi_processor.c |  8 +++-----
 drivers/acpi/utils.c          | 21 +++++++++++++++++++++
 include/acpi/acpi_bus.h       |  1 +
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 05264722c207..3aa43dee4391 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -185,8 +185,6 @@ static void __init acpi_pcc_cpufreq_init(void) {}
 /* Initialization */
 static int acpi_processor_make_present(struct acpi_processor *pr)
 {
-	unsigned long long sta;
-	acpi_status status;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
@@ -195,9 +193,9 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
 	if (invalid_phys_cpuid(pr->phys_id))
 		return -ENODEV;
 
-	status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
-	if (ACPI_FAILURE(status) || !(sta & ACPI_STA_DEVICE_PRESENT))
-		return -ENODEV;
+	ret = acpi_sta_enabled(pr->handle);
+	if (ret)
+		return ret;
 
 	cpu_maps_update_begin();
 	cpus_write_lock();
diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
index 202234ba54bd..3004426b218c 100644
--- a/drivers/acpi/utils.c
+++ b/drivers/acpi/utils.c
@@ -744,6 +744,27 @@ acpi_status acpi_evaluate_reg(acpi_handle handle, u8 space_id, u32 function)
 }
 EXPORT_SYMBOL(acpi_evaluate_reg);
 
+int acpi_sta_enabled(acpi_handle handle)
+{
+	unsigned long long sta;
+	bool present, enabled;
+	acpi_status status;
+
+	if (acpi_has_method(handle, "_STA")) {
+		status = acpi_evaluate_integer(handle, "_STA", NULL, &sta);
+		if (ACPI_FAILURE(status))
+			return -ENODEV;
+
+		present = sta & ACPI_STA_DEVICE_PRESENT;
+		enabled = sta & ACPI_STA_DEVICE_ENABLED;
+		if (!present || !enabled) {
+			return -EPROBE_DEFER;
+		}
+		return 0;
+	}
+	return 0; /* No _STA means always on! */
+}
+
 /**
  * acpi_evaluate_dsm - evaluate device's _DSM method
  * @handle: ACPI device handle
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 5de954e2b18a..e193507fd743 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -51,6 +51,7 @@ bool acpi_ata_match(acpi_handle handle);
 bool acpi_bay_match(acpi_handle handle);
 bool acpi_dock_match(acpi_handle handle);
 
+int acpi_sta_enabled(acpi_handle handle);
 bool acpi_check_dsm(acpi_handle handle, const guid_t *guid, u64 rev, u64 funcs);
 union acpi_object *acpi_evaluate_dsm(acpi_handle handle, const guid_t *guid,
 			u64 rev, u64 func, union acpi_object *argv4);
-- 
2.39.2



* [PATCH v5 06/18] ACPI: scan: Add parameter to allow defering some actions in acpi_scan_check_and_detach.
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (4 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 05/18] ACPI: utils: Add an acpi_sta_enabled() helper and use it in acpi_processor_make_present() Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 07/18] ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug Jonathan Cameron
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

Precursor patch adds the ability to pass a flag (not yet used) into
acpi_scan_check_and_detach().  Done in a separate patch with no
functional changes to reduce complexity of the actual deferral which
follows.
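
As a rough userspace sketch of the idea, the walk below passes a parameter
structure through the opaque pointer instead of overloading the pointer
value as a boolean. struct model_node, struct model_param and
model_check_and_detach() are hypothetical, and the tree is reduced to a
single child per node to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device node; one child keeps the sketch short. */
struct model_node {
	struct model_node *child;
	bool enabled;	/* what a status check would report */
	bool detached;
};

/* Mirrors the shape of the parameter struct: named flags, room to grow. */
struct model_param {
	bool check_status;
	bool eject;	/* carried along but not used yet */
};

/* Callback-style signature: the second argument is opaque to the walker. */
static int model_check_and_detach(struct model_node *node, void *p)
{
	const struct model_param *param = p;

	/* Children are handled before their parent. */
	if (node->child)
		model_check_and_detach(node->child, p);

	/* When asked to check status, skip devices that are still enabled. */
	if (param->check_status && node->enabled)
		return 0;

	node->detached = true;
	return 0;
}
```

Compared with passing (void *)true, adding another flag later does not
change the callback signature or any call site that already fills in the
structure.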

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v5: New patch resulting from rebase.
    - Internal review suggested we could also do this with flags
      so I'm looking for feedback on which option people find
      more readable.
---
 drivers/acpi/scan.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 7c157bf92695..79b1f4d2b6bd 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -244,13 +244,19 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
 	return 0;
 }
 
-static int acpi_scan_check_and_detach(struct acpi_device *adev, void *check)
+struct acpi_scan_c_and_d_param {
+	bool check_status;
+	bool eject;
+};
+
+static int acpi_scan_check_and_detach(struct acpi_device *adev, void *p)
 {
 	struct acpi_scan_handler *handler = adev->handler;
+	struct acpi_scan_c_and_d_param *param = p;
 
-	acpi_dev_for_each_child_reverse(adev, acpi_scan_check_and_detach, check);
+	acpi_dev_for_each_child_reverse(adev, acpi_scan_check_and_detach, p);
 
-	if (check) {
+	if (param->check_status) {
 		acpi_bus_get_status(adev);
 		/*
 		 * Skip devices that are still there and take the enabled
@@ -288,7 +294,11 @@ static int acpi_scan_check_and_detach(struct acpi_device *adev, void *check)
 
 static void acpi_scan_check_subtree(struct acpi_device *adev)
 {
-	acpi_scan_check_and_detach(adev, (void *)true);
+	struct acpi_scan_c_and_d_param p = {
+		.check_status = true, /* Not update until after ej0 */
+		.eject = false,
+	};
+	acpi_scan_check_and_detach(adev, &p);
 }
 
 static int acpi_scan_hot_remove(struct acpi_device *device)
@@ -2600,7 +2610,12 @@ EXPORT_SYMBOL(acpi_bus_scan);
  */
 void acpi_bus_trim(struct acpi_device *adev)
 {
-	acpi_scan_check_and_detach(adev, NULL);
+	struct acpi_scan_c_and_d_param p = {
+		.check_status = false,
+		.eject = false,
+	};
+
+	acpi_scan_check_and_detach(adev, &p);
 }
 EXPORT_SYMBOL_GPL(acpi_bus_trim);
 
-- 
2.39.2



* [PATCH v5 07/18] ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (5 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 06/18] ACPI: scan: Add parameter to allow defering some actions in acpi_scan_check_and_detach Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 08/18] ACPI: convert acpi_processor_post_eject() to use IS_ENABLED() Jonathan Cameron
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

struct acpi_scan_handler has a detach callback that is used to remove
a driver when a bus is changed. When interacting with an eject-request,
the detach callback is called before _EJ0.

This means the ACPI processor driver can't use _STA to determine if a
CPU has been made not-present, or some of the other _STA bits have been
changed. acpi_processor_remove() needs to know the value of _STA after
_EJ0 has been called.

Add a post_eject callback to struct acpi_scan_handler. This is called
after acpi_scan_hot_remove() has successfully called _EJ0. Because
acpi_scan_check_and_detach() also clears the handler pointer,
it needs to be told if the caller will go on to call
acpi_bus_post_eject(), so that acpi_device_clear_enumerated()
and clearing the handler pointer can be deferred.
The extra eject flag added in the previous patch is used for this
purpose.
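
The ordering described above can be sketched in userspace like this. Every
name here is hypothetical and the model is far simpler than the ACPI scan
code: it only records the call order in a log string, keeps the handler
pointer across the eject, and runs post_eject once the modelled _STA value
shows the device is no longer enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct model_dev;

struct model_handler {
	void (*detach)(struct model_dev *dev);
	void (*post_eject)(struct model_dev *dev);
};

struct model_dev {
	const struct model_handler *handler;
	bool sta_enabled;	/* what _STA would report */
	bool firmware_ejects;	/* whether the modelled _EJ0 takes effect */
	char log[64];		/* records the call order */
};

static void model_log(struct model_dev *dev, const char *what)
{
	strncat(dev->log, what, sizeof(dev->log) - strlen(dev->log) - 1);
}

/* Runs after _EJ0, so it sees the post-eject _STA value. */
static void model_post_eject_cb(struct model_dev *dev)
{
	model_log(dev, dev->sta_enabled ? "post_eject(enabled);"
					: "post_eject(disabled);");
}

static const struct model_handler model_cpu_handler = {
	.detach = NULL,
	.post_eject = model_post_eject_cb,
};

static void model_hot_remove(struct model_dev *dev)
{
	/* Detach phase: for an eject the handler pointer is kept. */
	if (dev->handler && dev->handler->detach)
		dev->handler->detach(dev);

	/* Modelled _EJ0: firmware may or may not complete the eject. */
	model_log(dev, "_EJ0;");
	if (dev->firmware_ejects)
		dev->sta_enabled = false;

	if (dev->sta_enabled)
		return;	/* eject incomplete: no post_eject, handler kept */

	if (dev->handler) {
		if (dev->handler->post_eject)
			dev->handler->post_eject(dev);
		dev->handler = NULL;
	}
}
```

For a device whose firmware completes the eject, the log ends up as
"_EJ0;post_eject(disabled);" and the handler pointer is cleared only at the
very end.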

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
Russell, you hadn't signed off on this when posting last time.
Do you want to insert a suitable tag now?
v5:
 - Rebase to take into account the changes to scan handling in the
   meantime.
---
 drivers/acpi/acpi_processor.c |  4 ++--
 drivers/acpi/scan.c           | 32 +++++++++++++++++++++++++++++---
 include/acpi/acpi_bus.h       |  1 +
 3 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 3aa43dee4391..6b2ee0643d11 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -463,7 +463,7 @@ static int acpi_processor_add(struct acpi_device *device,
 
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 /* Removal */
-static void acpi_processor_remove(struct acpi_device *device)
+static void acpi_processor_post_eject(struct acpi_device *device)
 {
 	struct acpi_processor *pr;
 
@@ -631,7 +631,7 @@ static struct acpi_scan_handler processor_handler = {
 	.ids = processor_device_ids,
 	.attach = acpi_processor_add,
 #ifdef CONFIG_ACPI_HOTPLUG_CPU
-	.detach = acpi_processor_remove,
+	.post_eject = acpi_processor_post_eject,
 #endif
 	.hotplug = {
 		.enabled = true,
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 79b1f4d2b6bd..992779ac31d4 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -276,8 +276,6 @@ static int acpi_scan_check_and_detach(struct acpi_device *adev, void *p)
 	if (handler) {
 		if (handler->detach)
 			handler->detach(adev);
-
-		adev->handler = NULL;
 	} else {
 		device_release_driver(&adev->dev);
 	}
@@ -287,6 +285,28 @@ static int acpi_scan_check_and_detach(struct acpi_device *adev, void *p)
 	 */
 	acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
 	adev->flags.initialized = false;
+
+	/* For eject this is deferred to acpi_bus_post_eject() */
+	if (!param->eject) {
+		adev->handler = NULL;
+		acpi_device_clear_enumerated(adev);
+	}
+	return 0;
+}
+
+static int acpi_bus_post_eject(struct acpi_device *adev, void *not_used)
+{
+	struct acpi_scan_handler *handler = adev->handler;
+
+	acpi_dev_for_each_child_reverse(adev, acpi_bus_post_eject, NULL);
+
+	if (handler) {
+		if (handler->post_eject)
+			handler->post_eject(adev);
+
+		adev->handler = NULL;
+	}
+
 	acpi_device_clear_enumerated(adev);
 
 	return 0;
@@ -306,6 +326,10 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
 	acpi_handle handle = device->handle;
 	unsigned long long sta;
 	acpi_status status;
+	struct acpi_scan_c_and_d_param p = {
+		.check_status = false, /* Not update until after ej0 */
+		.eject = true,
+	};
 
 	if (device->handler && device->handler->hotplug.demand_offline) {
 		if (!acpi_scan_is_offline(device, true))
@@ -318,7 +342,7 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
 
 	acpi_handle_debug(handle, "Ejecting\n");
 
-	acpi_bus_trim(device);
+	acpi_scan_check_and_detach(device, &p);
 
 	acpi_evaluate_lck(handle, 0);
 	/*
@@ -341,6 +365,8 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
 	} else if (sta & ACPI_STA_DEVICE_ENABLED) {
 		acpi_handle_warn(handle,
 			"Eject incomplete - status 0x%llx\n", sta);
+	} else {
+		acpi_bus_post_eject(device, NULL);
 	}
 
 	return 0;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index e193507fd743..27fdef17abe5 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -130,6 +130,7 @@ struct acpi_scan_handler {
 	bool (*match)(const char *idstr, const struct acpi_device_id **matchid);
 	int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
 	void (*detach)(struct acpi_device *dev);
+	void (*post_eject)(struct acpi_device *dev);
 	void (*bind)(struct device *phys_dev);
 	void (*unbind)(struct device *phys_dev);
 	struct acpi_hotplug_profile hotplug;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v5 08/18] ACPI: convert acpi_processor_post_eject() to use IS_ENABLED()
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (6 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 07/18] ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 09/18] ACPI: Check _STA present bit before making CPUs not present Jonathan Cameron
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: Russell King <rmk+kernel@armlinux.org.uk>

Rather than ifdef'ing acpi_processor_post_eject() and its use site, use
IS_ENABLED() to increase compile coverage.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v5: No change
---
 drivers/acpi/acpi_processor.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 6b2ee0643d11..15d89f80857b 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -461,12 +461,14 @@ static int acpi_processor_add(struct acpi_device *device,
 	return result;
 }
 
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
 /* Removal */
 static void acpi_processor_post_eject(struct acpi_device *device)
 {
 	struct acpi_processor *pr;
 
+	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
+		return;
+
 	if (!device || !acpi_driver_data(device))
 		return;
 
@@ -505,7 +507,6 @@ static void acpi_processor_post_eject(struct acpi_device *device)
 	free_cpumask_var(pr->throttling.shared_cpu_map);
 	kfree(pr);
 }
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
 
 #ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
 bool __init processor_physically_present(acpi_handle handle)
@@ -630,9 +631,7 @@ static const struct acpi_device_id processor_device_ids[] = {
 static struct acpi_scan_handler processor_handler = {
 	.ids = processor_device_ids,
 	.attach = acpi_processor_add,
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
 	.post_eject = acpi_processor_post_eject,
-#endif
 	.hotplug = {
 		.enabled = true,
 	},
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v5 09/18] ACPI: Check _STA present bit before making CPUs not present
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (7 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 08/18] ACPI: convert acpi_processor_post_eject() to use IS_ENABLED() Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 10/18] ACPI: Warn when the present bit changes but the feature is not enabled Jonathan Cameron
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

When called, acpi_processor_post_eject() unconditionally makes a CPU
not-present and unregisters it.

To add support for AML events where the CPU has become disabled, but
remains present, the _STA method should be checked before making the
CPU not-present.

Rename acpi_processor_post_eject() to acpi_processor_make_not_present(),
and add a new acpi_processor_post_eject() that checks _STA before
calling it.

Adding the function prototype for arch_unregister_cpu() allows the
preprocessor guards to be removed.

After this change CPUs will remain registered and visible to
user-space as offline if buggy firmware triggers an eject-request,
but doesn't clear the corresponding _STA bits after _EJ0 has been
called.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v5: No change
---
 drivers/acpi/acpi_processor.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 15d89f80857b..0403eddb3f80 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -462,16 +462,13 @@ static int acpi_processor_add(struct acpi_device *device,
 }
 
 /* Removal */
-static void acpi_processor_post_eject(struct acpi_device *device)
+static void acpi_processor_make_not_present(struct acpi_device *device)
 {
 	struct acpi_processor *pr;
 
 	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
 		return;
 
-	if (!device || !acpi_driver_data(device))
-		return;
-
 	pr = acpi_driver_data(device);
 	if (pr->id >= nr_cpu_ids)
 		goto out;
@@ -508,6 +505,29 @@ static void acpi_processor_post_eject(struct acpi_device *device)
 	kfree(pr);
 }
 
+static void acpi_processor_post_eject(struct acpi_device *device)
+{
+	struct acpi_processor *pr;
+	unsigned long long sta;
+	acpi_status status;
+
+	if (!device)
+		return;
+
+	pr = acpi_driver_data(device);
+	if (!pr || pr->id >= nr_cpu_ids || invalid_phys_cpuid(pr->phys_id))
+		return;
+
+	status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
+	if (ACPI_FAILURE(status))
+		return;
+
+	if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_PRESENT)) {
+		acpi_processor_make_not_present(device);
+		return;
+	}
+}
+
 #ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
 bool __init processor_physically_present(acpi_handle handle)
 {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v5 10/18] ACPI: Warn when the present bit changes but the feature is not enabled
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (8 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 09/18] ACPI: Check _STA present bit before making CPUs not present Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 11/18] arm64: acpi: Move get_cpu_for_acpi_id() to a header Jonathan Cameron
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

ACPI firmware can trigger the events to add and remove CPUs, but the
OS may not support this.

Print an error message when this happens.

This gives early warning on arm64 systems that don't support
CONFIG_ACPI_HOTPLUG_CPU, as making CPUs not present has
side effects for other parts of the system.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v5: No change
---
 drivers/acpi/acpi_processor.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 0403eddb3f80..3fb167ee9807 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -187,8 +187,10 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
 {
 	int ret;
 
-	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
+	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU)) {
+		pr_err_once("Changing CPU present bit is not supported\n");
 		return -ENODEV;
+	}
 
 	if (invalid_phys_cpuid(pr->phys_id))
 		return -ENODEV;
@@ -466,8 +468,10 @@ static void acpi_processor_make_not_present(struct acpi_device *device)
 {
 	struct acpi_processor *pr;
 
-	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU))
+	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU)) {
+		pr_err_once("Changing CPU present bit is not supported\n");
 		return;
+	}
 
 	pr = acpi_driver_data(device);
 	if (pr->id >= nr_cpu_ids)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v5 11/18] arm64: acpi: Move get_cpu_for_acpi_id() to a header
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (9 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 10/18] ACPI: Warn when the present bit changes but the feature is not enabled Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 12/18] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc() Jonathan Cameron
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

ACPI identifies CPUs by UID. get_cpu_for_acpi_id() maps the ACPI UID
to the Linux CPU number.

The helper to retrieve this mapping is only available in arm64's numa
code.

Move it to live next to get_acpi_id_for_cpu().

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v5: No change
---
 arch/arm64/include/asm/acpi.h | 11 +++++++++++
 arch/arm64/kernel/acpi_numa.c | 11 -----------
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 6792a1f83f2a..bc9a6656fc0c 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -119,6 +119,17 @@ static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
 	return	acpi_cpu_get_madt_gicc(cpu)->uid;
 }
 
+static inline int get_cpu_for_acpi_id(u32 uid)
+{
+	int cpu;
+
+	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
+		if (uid == get_acpi_id_for_cpu(cpu))
+			return cpu;
+
+	return -EINVAL;
+}
+
 static inline void arch_fix_phys_package_id(int num, u32 slot) { }
 void __init acpi_init_cpus(void);
 int apei_claim_sea(struct pt_regs *regs);
diff --git a/arch/arm64/kernel/acpi_numa.c b/arch/arm64/kernel/acpi_numa.c
index e51535a5f939..0c036a9a3c33 100644
--- a/arch/arm64/kernel/acpi_numa.c
+++ b/arch/arm64/kernel/acpi_numa.c
@@ -34,17 +34,6 @@ int __init acpi_numa_get_nid(unsigned int cpu)
 	return acpi_early_node_map[cpu];
 }
 
-static inline int get_cpu_for_acpi_id(u32 uid)
-{
-	int cpu;
-
-	for (cpu = 0; cpu < nr_cpu_ids; cpu++)
-		if (uid == get_acpi_id_for_cpu(cpu))
-			return cpu;
-
-	return -EINVAL;
-}
-
 static int __init acpi_parse_gicc_pxm(union acpi_subtable_headers *header,
 				      const unsigned long end)
 {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v5 12/18] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (10 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 11/18] arm64: acpi: Move get_cpu_for_acpi_id() to a header Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 13/18] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs Jonathan Cameron
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
It should only count the number of enabled redistributors, but it
also tries to sanity check the GICC entry, currently returning an
error if the Enabled bit is set, but the gicr_base_address is zero.

Adding support for the online-capable bit to the sanity check will
complicate it, for no benefit. The existing check implicitly depends on
gic_acpi_count_gicr_regions() previously failing to find any GICR regions
(as it is valid to have gicr_base_address of zero if the redistributors
are described via a GICR entry).

Instead of complicating the check, remove it. Failures that happen at
this point cause the irqchip not to register, meaning no irqs can be
requested. The kernel grinds to a panic() pretty quickly.

Without the check, MADT tables that exhibit this problem are still
caught by gic_populate_rdist(), which helpfully also prints what went
wrong:
| CPU4: mpidr 100 has no re-distributor!

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v5: No change
---
 drivers/irqchip/irq-gic-v3.c | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 6fb276504bcc..10af15f93d4d 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2415,19 +2415,10 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
 	 * If GICC is enabled and has valid gicr base address, then it means
 	 * GICR base is presented via GICC
 	 */
-	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
+	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address)
 		acpi_data.enabled_rdists++;
-		return 0;
-	}
 
-	/*
-	 * It's perfectly valid firmware can pass disabled GICC entry, driver
-	 * should not treat as errors, skip the entry instead of probe fail.
-	 */
-	if (!acpi_gicc_is_usable(gicc))
-		return 0;
-
-	return -ENODEV;
+	return 0;
 }
 
 static int __init gic_acpi_count_gicr_regions(void)
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v5 13/18] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (11 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 12/18] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc() Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 14/18] arm64: psci: Ignore DENIED CPUs Jonathan Cameron
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

To support virtual CPU hotplug, ACPI has added an 'online capable' bit
to the MADT GICC entries. This indicates that a disabled CPU entry may
not be possible to bring online via PSCI until firmware has set the
enabled bit in _STA.

This means that a "usable" GICC entry is one that is marked as either
enabled or online capable. Therefore, change acpi_gicc_is_usable() to
check both bits. However, we need to change the test in
gic_acpi_match_gicc() back to testing just the enabled bit so the count
of enabled redistributors is correct.

What about the redistributor in the GICC entry? ACPI doesn't want to say.
Assume the worst: When a redistributor is described in the GICC entry,
but the entry is marked as disabled at boot, assume the redistributor
is inaccessible.

The GICv3 driver doesn't support late online of redistributors, so this
means the corresponding CPU can't be brought online either. Clear the
possible and present bits.

Systems that want CPU hotplug in a VM can ensure their redistributors
are always-on, and describe them that way with a GICR entry in the MADT.

When mapping redistributors found via GICC entries, handle the case
where the arch code believes the CPU is present and possible, but it
does not have an accessible redistributor. Print a warning and clear
the present and possible bits.
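
The difference between "usable" and "counts as an enabled redistributor"
can be sketched in plain C. The flag constants below are local to this
sketch and assume the ACPI GICC flag layout (Enabled in bit 0, Online
Capable in bit 3); both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed GICC flag bits, defined locally for this sketch. */
#define GICC_FLAG_ENABLED		(1u << 0)
#define GICC_FLAG_ONLINE_CAPABLE	(1u << 3)

/* Models acpi_gicc_is_usable(): either bit makes the entry usable. */
static bool gicc_is_usable(uint32_t flags)
{
	return (flags & (GICC_FLAG_ENABLED | GICC_FLAG_ONLINE_CAPABLE)) != 0;
}

/*
 * Models the test in gic_acpi_match_gicc(): only entries enabled at
 * boot with a non-zero GICR base count as enabled redistributors.
 */
static bool gicc_counts_as_enabled_rdist(uint32_t flags, uint64_t gicr_base)
{
	return (flags & GICC_FLAG_ENABLED) != 0 && gicr_base != 0;
}
```

An entry that is only online capable is therefore usable, yet its
redistributor is not counted.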

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v5: No Change.
---
 drivers/irqchip/irq-gic-v3.c | 21 +++++++++++++++++++--
 include/linux/acpi.h         |  3 ++-
 2 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 10af15f93d4d..66132251c1bb 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2363,11 +2363,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
 				(struct acpi_madt_generic_interrupt *)header;
 	u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
 	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
+	int cpu = get_cpu_for_acpi_id(gicc->uid);
 	void __iomem *redist_base;
 
 	if (!acpi_gicc_is_usable(gicc))
 		return 0;
 
+	/*
+	 * Capable but disabled CPUs can be brought online later. What about
+	 * the redistributor? ACPI doesn't want to say!
+	 * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
+	 * Otherwise, prevent such CPUs from being brought online.
+	 */
+	if (!(gicc->flags & ACPI_MADT_ENABLED)) {
+		pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
+		set_cpu_present(cpu, false);
+		set_cpu_possible(cpu, false);
+		return 0;
+	}
+
 	redist_base = ioremap(gicc->gicr_base_address, size);
 	if (!redist_base)
 		return -ENOMEM;
@@ -2413,9 +2427,12 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
 
 	/*
 	 * If GICC is enabled and has valid gicr base address, then it means
-	 * GICR base is presented via GICC
+	 * GICR base is presented via GICC. The redistributor is only known to
+	 * be accessible if the GICC is marked as enabled. If this bit is not
+	 * set, we'd need to add the redistributor at runtime, which isn't
+	 * supported.
 	 */
-	if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address)
+	if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
 		acpi_data.enabled_rdists++;
 
 	return 0;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 2629c459738a..6dd396825bb5 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -239,7 +239,8 @@ void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
 
 static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
 {
-	return gicc->flags & ACPI_MADT_ENABLED;
+	return gicc->flags & (ACPI_MADT_ENABLED |
+			      ACPI_MADT_GICC_ONLINE_CAPABLE);
 }
 
 /* the following numa functions are architecture-dependent */
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v5 14/18] arm64: psci: Ignore DENIED CPUs
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (12 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 13/18] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 15/18] arm64: arch_register_cpu() variant to allow checking of ACPI _STA Jonathan Cameron
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: Jean-Philippe Brucker <jean-philippe@linaro.org>

When a CPU is marked as disabled, but online capable in the MADT, PSCI
applies some firmware policy to control when it can be brought online.
PSCI returns DENIED to a CPU_ON request if this is not currently
permitted. The OS can learn the current policy from the _STA enabled bit.

Handle the PSCI DENIED return code gracefully instead of printing an
error.

See https://developer.arm.com/documentation/den0022/f/?lang=en page 58.
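
A minimal sketch of the reporting rule, assuming (as the diff below
implies) that a PSCI DENIED result reaches the callers as -EPERM. The
helper name is hypothetical and exists only for this illustration.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: should a failed CPU boot be logged as an error?
 * -EPERM is taken to mean "firmware policy currently forbids this",
 * which is expected for disabled but online-capable CPUs.
 */
static bool should_report_boot_failure(int err)
{
	return err != 0 && err != -EPERM;
}
```

The error is still returned to the caller in every case; only the
message is suppressed.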

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
[ morse: Rewrote commit message ]
Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v5: No change
---
 arch/arm64/kernel/psci.c | 2 +-
 arch/arm64/kernel/smp.c  | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index 29a8e444db83..fabd732d0a2d 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -40,7 +40,7 @@ static int cpu_psci_cpu_boot(unsigned int cpu)
 {
 	phys_addr_t pa_secondary_entry = __pa_symbol(secondary_entry);
 	int err = psci_ops.cpu_on(cpu_logical_map(cpu), pa_secondary_entry);
-	if (err)
+	if (err && err != -EPERM)
 		pr_err("failed to boot CPU%d (%d)\n", cpu, err);
 
 	return err;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 4ced34f62dab..dc0e0b3ec2d4 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -132,7 +132,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	/* Now bring the CPU into our world */
 	ret = boot_secondary(cpu, idle);
 	if (ret) {
-		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
+		if (ret != -EPERM)
+			pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
 		return ret;
 	}
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v5 15/18] arm64: arch_register_cpu() variant to allow checking of ACPI _STA
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (13 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 14/18] arm64: psci: Ignore DENIED CPUs Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 16/18] ACPI: add support to (un)register CPUs based on the _STA enabled bit Jonathan Cameron
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

On ARM64, virtual CPU hotplug relies on the status value that can be
queried via the AML method _STA for the CPU object.

There are two conditions under which the CPU can be registered:
1) ACPI is disabled.
2) ACPI is enabled, the acpi_handle is available, and _STA reports
   that the CPU is both enabled and present.
   (Note that in the absence of the _STA method, CPUs are always in
    this state.)

If neither of these conditions is met, the CPU is not yet ready
to be used and -EPROBE_DEFER is returned.

Registration succeeds in the early attempt to register the CPUs if we
are booting with DT (which has no concept of vCPU hotplug yet).
Otherwise, it succeeds for already enabled CPUs when the ACPI Processor
driver attaches to them. Finally, it may succeed via the CPU hotplug
code once that indicates the CPU is now enabled.
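
The conditions above can be modelled as a small decision function. This
is a userspace sketch only: the function name and its boolean inputs are
hypothetical, and EPROBE_DEFER is defined locally because it is a
kernel-internal errno value.

```c
#include <stdbool.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal value, assumed here */
#endif

/*
 * Returns 0 if the CPU may be registered now, or -EPROBE_DEFER if the
 * question cannot be answered yet or the answer is "not yet".
 */
static int cpu_registration_status(bool acpi_is_disabled, bool have_handle,
				   bool sta_enabled_and_present)
{
	if (acpi_is_disabled)
		return 0;		/* e.g. a DT boot */
	if (!have_handle)
		return -EPROBE_DEFER;	/* too early to evaluate _STA */
	if (!sta_enabled_and_present)
		return -EPROBE_DEFER;	/* candidate for later hotplug */
	return 0;
}
```

A CPU without an _STA method would be passed in with
sta_enabled_and_present set, matching the note in the list above.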

Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v5: New patch.
---
 arch/arm64/kernel/smp.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index dc0e0b3ec2d4..68f2e7974815 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -504,6 +504,26 @@ static int __init smp_cpu_setup(int cpu)
 static bool bootcpu_valid __initdata;
 static unsigned int cpu_count = 1;
 
+int arch_register_cpu(int cpu)
+{
+	struct cpu *c = &per_cpu(cpu_devices, cpu);
+	acpi_handle acpi_handle = ACPI_HANDLE(&c->dev);
+	int ret;
+
+	if (!acpi_disabled && !acpi_handle)
+		return -EPROBE_DEFER;
+	if (acpi_handle) {
+		ret = acpi_sta_enabled(acpi_handle);
+		if (ret) {
+			/* Not enabled */
+			return ret;
+		}
+	}
+	c->hotpluggable = arch_cpu_is_hotpluggable(cpu);
+
+	return register_cpu(c, cpu);
+}
+
 #ifdef CONFIG_ACPI
 static struct acpi_madt_generic_interrupt cpu_madt_gicc[NR_CPUS];
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v5 16/18] ACPI: add support to (un)register CPUs based on the _STA enabled bit
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (14 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 15/18] arm64: arch_register_cpu() variant to allow checking of ACPI _STA Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 17/18] arm64: document virtual CPU hotplug's expectations Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 18/18] cpumask: Add enabled cpumask for present CPUs that can be brought online Jonathan Cameron
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

acpi_processor_get_info() registers all present CPUs. Registering a
CPU is what creates the sysfs entries and triggers the udev
notifications.

arm64 virtual machines that support 'virtual cpu hotplug' use the
enabled bit to indicate whether the CPU can be brought online, as
the existing ACPI tables require all hardware to be described and
present.

If firmware describes a CPU as present, but disabled, skip the
registration. Such CPUs are present, but can't be brought online for
whatever reason (e.g. firmware/hypervisor policy).

Once firmware sets the enabled bit, the CPU can be registered and
brought online by user-space. Online CPUs, or CPUs that are missing
an _STA method, must always be registered.

When firmware clears the enabled bit, we need to unregister the CPU
for symmetry. As this is dependent on hotplug CPU being supported, and
arch_unregister_cpu() only exists when hotplug CPU is supported,
we need to add a check for that configuration symbol.

Note that some elements in the *make_present() and *make_not_present()
paths are not appropriate for the *enabled() paths because they
are related to elements such as interrupt controller setup that
are done for all present (but not enabled) CPUs at boot.
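
One way to picture how the eject path chooses between the two teardown
flavours is the sketch below. The enum and function are hypothetical;
the _STA bit values follow the ACPI specification (present is bit 0,
enabled is bit 1) and are defined locally.

```c
#include <stdbool.h>

/* _STA bits, defined locally for this sketch. */
#define STA_PRESENT	0x01ULL
#define STA_ENABLED	0x02ULL

enum eject_action {
	EJECT_NONE,
	EJECT_MAKE_NOT_PRESENT,
	EJECT_MAKE_NOT_ENABLED,
};

/* Mirrors the order of the checks: the present bit is tested first. */
static enum eject_action classify_eject(bool cpu_is_present,
					unsigned long long sta)
{
	if (!cpu_is_present)
		return EJECT_NONE;
	if (!(sta & STA_PRESENT))
		return EJECT_MAKE_NOT_PRESENT;
	if (!(sta & STA_ENABLED))
		return EJECT_MAKE_NOT_ENABLED;
	return EJECT_NONE;
}
```

If firmware leaves both bits set after _EJ0, the sketch returns
EJECT_NONE and the CPU stays registered.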

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v5:
   Make the enable and present paths look much more like each other.
   Whilst similar, I think combining the two paths any more will
   lead to less readable code by implying they are more similar than
   they actually should be.
---
 drivers/acpi/acpi_processor.c | 46 +++++++++++++++++++++++++++++++++--
 1 file changed, 44 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 3fb167ee9807..ffa2bc63da40 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -226,6 +226,24 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
 	return ret;
 }
 
+static int acpi_processor_make_enabled(struct acpi_processor *pr)
+{
+	int ret;
+
+	if (invalid_phys_cpuid(pr->phys_id))
+		return -ENODEV;
+
+	cpus_write_lock();
+	ret = arch_register_cpu(pr->id);
+	cpus_write_unlock();
+
+	if (ret)
+		return ret;
+
+	pr_info("CPU%d has been hot-added (onlined)\n", pr->id);
+	return 0;
+}
+
 static int acpi_processor_get_info(struct acpi_device *device)
 {
 	union acpi_object object = { 0 };
@@ -319,7 +337,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
 	 */
 	if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
 	    !get_cpu_device(pr->id)) {
-		int ret = arch_register_cpu(pr->id);
+		int ret = acpi_processor_make_enabled(pr);
 
 		if (ret)
 			return ret;
@@ -463,6 +481,27 @@ static int acpi_processor_add(struct acpi_device *device,
 	return result;
 }
 
+static void acpi_processor_make_not_enabled(struct acpi_device *device)
+{
+	struct acpi_processor *pr;
+
+	pr = acpi_driver_data(device);
+	if (pr->id >= nr_cpu_ids)
+		goto out;
+
+	device_release_driver(pr->dev);
+	per_cpu(processor_device_array, pr->id) = NULL;
+	per_cpu(processors, pr->id) = NULL;
+	cpus_write_lock();
+	arch_unregister_cpu(pr->id);
+	cpus_write_unlock();
+
+	try_offline_node(cpu_to_node(pr->id));
+out:
+	free_cpumask_var(pr->throttling.shared_cpu_map);
+	kfree(pr);
+}
+
 /* Removal */
 static void acpi_processor_make_not_present(struct acpi_device *device)
 {
@@ -515,7 +554,7 @@ static void acpi_processor_post_eject(struct acpi_device *device)
 	unsigned long long sta;
 	acpi_status status;
 
-	if (!device)
+	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) || !device)
 		return;
 
 	pr = acpi_driver_data(device);
@@ -530,6 +569,9 @@ static void acpi_processor_post_eject(struct acpi_device *device)
 		acpi_processor_make_not_present(device);
 		return;
 	}
+
+	if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_ENABLED))
+		acpi_processor_make_not_enabled(device);
 }
 
 #ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
-- 
2.39.2



* [PATCH v5 17/18] arm64: document virtual CPU hotplug's expectations
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (15 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 16/18] ACPI: add support to (un)register CPUs based on the _STA enabled bit Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  2024-04-12 14:37 ` [PATCH v5 18/18] cpumask: Add enabled cpumask for present CPUs that can be brought online Jonathan Cameron
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

Add a description of physical and virtual CPU hotplug, explain the
differences and elaborate on what is required in ACPI for a working
virtual hotplug system.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v5: No change.
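
Illustration only, not part of the patch: a small sketch of how the MADT
GICC flags described in the new document could be classified. The bit
positions (Enabled in bit 0, Online Capable in bit 3) are my reading of
the ACPI 6.5 GICC structure and should be checked against the
specification; the enum and function names are hypothetical.

```c
#include <stdint.h>

/* GICC flag bits; positions assumed from the ACPI 6.5 MADT GICC structure. */
#define GICC_ENABLED		(1u << 0)
#define GICC_ONLINE_CAPABLE	(1u << 3)

/* Hypothetical classification, used only for this sketch. */
enum gicc_cpu_state {
	GICC_UNUSABLE,		/* neither bit set: never usable */
	GICC_BOOT_ENABLED,	/* usable from boot */
	GICC_HOTPLUGGABLE,	/* disabled now, may be enabled later via _STA */
};

static enum gicc_cpu_state classify_gicc(uint32_t flags)
{
	if (flags & GICC_ENABLED)
		return GICC_BOOT_ENABLED;
	if (flags & GICC_ONLINE_CAPABLE)
		return GICC_HOTPLUGGABLE;
	return GICC_UNUSABLE;
}
```

The boot CPU would be expected to classify as GICC_BOOT_ENABLED, and any
vCPU the VMM intends to add later as GICC_HOTPLUGGABLE.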
---
 Documentation/arch/arm64/cpu-hotplug.rst | 79 ++++++++++++++++++++++++
 Documentation/arch/arm64/index.rst       |  1 +
 2 files changed, 80 insertions(+)

diff --git a/Documentation/arch/arm64/cpu-hotplug.rst b/Documentation/arch/arm64/cpu-hotplug.rst
new file mode 100644
index 000000000000..76ba8d932c72
--- /dev/null
+++ b/Documentation/arch/arm64/cpu-hotplug.rst
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. _cpuhp_index:
+
+====================
+CPU Hotplug and ACPI
+====================
+
+CPU hotplug in the arm64 world is commonly used to describe the kernel taking
+CPUs online/offline using PSCI. This document is about ACPI firmware allowing
+CPUs that were not available during boot to be added to the system later.
+
+``possible`` and ``present`` refer to the state of the CPU as seen by Linux.
+
+
+CPU Hotplug on physical systems - CPUs not present at boot
+----------------------------------------------------------
+
+Physical systems need to mark a CPU that is ``possible`` but not ``present`` as
+being ``present``. An example would be a dual socket machine, where the package
+in one of the sockets can be replaced while the system is running.
+
+This is not supported.
+
+In the arm64 world CPUs are not a single device but a slice of the system.
+There are no systems that support the physical addition (or removal) of CPUs
+while the system is running, and ACPI is not able to sufficiently describe
+them.
+
+e.g. New CPUs come with new caches, but the platform's cache topology is
+described in a static table, the PPTT. How caches are shared between CPUs is
+not discoverable, and must be described by firmware.
+
+e.g. The GIC redistributor for each CPU must be accessed by the driver during
+boot to discover the system wide supported features. ACPI's MADT GICC
+structures can describe a redistributor associated with a disabled CPU, but
+can't describe whether the redistributor is accessible, only that it is not
+'always on'.
+
+arm64's ACPI tables assume that everything described is ``present``.
+
+
+CPU Hotplug on virtual systems - CPUs not enabled at boot
+---------------------------------------------------------
+
+Virtual systems have the advantage that all the properties the system will
+ever have can be described at boot. There are no power-domain considerations
+as such devices are emulated.
+
+CPU Hotplug on virtual systems is supported. It is distinct from physical
+CPU Hotplug as all resources are described as ``present``, but CPUs may be
+marked as disabled by firmware. Only the CPU's online/offline behaviour is
+influenced by firmware. An example is where a virtual machine boots with a
+single CPU, and additional CPUs are added once a cloud orchestrator deploys
+the workload.
+
+For a virtual machine, the VMM (e.g. QEMU) plays the part of firmware.
+
+Virtual hotplug is implemented as a firmware policy affecting which CPUs can be
+brought online. Firmware can enforce its policy via PSCI's return codes. e.g.
+``DENIED``.
+
+The ACPI tables must describe all the resources of the virtual machine. CPUs
+that firmware wishes to disable either from boot (or later) should not be
+``enabled`` in the MADT GICC structures, but should have the ``online capable``
+bit set, to indicate they can be enabled later. The boot CPU must be marked as
+``enabled``.  The 'always on' GICR structure must be used to describe the
+redistributors.
+
+CPUs described as ``online capable`` but not ``enabled`` can be set to enabled
+by the DSDT's Processor object's _STA method. On virtual systems the _STA method
+must always report the CPU as ``present``. Changes to the firmware policy can
+be notified to the OS via device-check or eject-request.
+
+CPUs described as ``enabled`` in the static table should not have their _STA
+modified dynamically by firmware. Soft-restart features such as kexec will
+re-read the static properties of the system from these static tables, and
+may malfunction if these no longer describe the running system. Linux will
+re-discover the dynamic properties of the system from the _STA method later
+during boot.
diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/index.rst
index d08e924204bf..78544de0a8a9 100644
--- a/Documentation/arch/arm64/index.rst
+++ b/Documentation/arch/arm64/index.rst
@@ -13,6 +13,7 @@ ARM64 Architecture
     asymmetric-32bit
     booting
     cpu-feature-registers
+    cpu-hotplug
     elf_hwcaps
     hugetlbpage
     kdump
-- 
2.39.2



* [PATCH v5 18/18] cpumask: Add enabled cpumask for present CPUs that can be brought online
  2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
                   ` (16 preceding siblings ...)
  2024-04-12 14:37 ` [PATCH v5 17/18] arm64: document virtual CPU hotplug's expectations Jonathan Cameron
@ 2024-04-12 14:37 ` Jonathan Cameron
  17 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-12 14:37 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu

From: James Morse <james.morse@arm.com>

The 'offline' file in sysfs shows all offline CPUs, including those
that aren't present. User-space is expected to remove not-present CPUs
from this list to learn which CPUs could be brought online.

CPUs can be present but not-enabled. These CPUs can't be brought online
until the firmware policy changes, which comes with an ACPI notification
that will register the CPUs.

With only the offline and present files, user-space is unable to
determine which CPUs it can try to bring online. Add a new CPU mask
that shows this based on all the registered CPUs.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

---
v5: No change
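
Illustration only, not part of the patch: a user-space sketch of how the
new 'enabled' file might be combined with 'offline' to find the CPUs
worth trying to online. It assumes both files use the usual sysfs
cpulist format (e.g. "0-3,8"). MAX_CPUS and both function names are
hypothetical, and reading the files from
/sys/devices/system/cpu/ is left out.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_CPUS 1024	/* arbitrary bound for this sketch */

/*
 * Parse a cpulist string such as "0-3,8\n" into a per-CPU flag array.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_cpulist(const char *s, bool set[MAX_CPUS])
{
	for (int i = 0; i < MAX_CPUS; i++)
		set[i] = false;

	while (*s && *s != '\n') {
		char *end;
		long lo, hi;

		lo = hi = strtol(s, &end, 10);
		if (end == s)
			return -1;
		if (*end == '-') {
			/* Range "lo-hi": parse the upper bound. */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (lo < 0 || hi < lo || hi >= MAX_CPUS)
			return -1;
		for (long c = lo; c <= hi; c++)
			set[c] = true;

		if (*end == ',')
			end++;
		else if (*end != '\0' && *end != '\n')
			return -1;
		s = end;
	}
	return 0;
}

/*
 * Mark in 'out' every CPU that is both enabled and offline.
 * Returns the number of such CPUs, or -1 on malformed input.
 */
static int onlinable_cpus(const char *enabled, const char *offline,
			  bool out[MAX_CPUS])
{
	bool en[MAX_CPUS], off[MAX_CPUS];
	int n = 0;

	if (parse_cpulist(enabled, en) || parse_cpulist(offline, off))
		return -1;

	for (int i = 0; i < MAX_CPUS; i++) {
		out[i] = en[i] && off[i];
		n += out[i];
	}
	return n;
}
```

With enabled = "0-3" and offline = "2-7", only CPUs 2 and 3 are
candidates; CPUs 4-7 are offline but not registered, so writing to their
'online' file cannot succeed until firmware policy changes.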
---
 .../ABI/testing/sysfs-devices-system-cpu      |  6 +++++
 drivers/base/cpu.c                            | 10 ++++++++
 include/linux/cpumask.h                       | 25 +++++++++++++++++++
 kernel/cpu.c                                  |  3 +++
 4 files changed, 44 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 710d47be11e0..808efb5b860a 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -694,3 +694,9 @@ Description:
 		(RO) indicates whether or not the kernel directly supports
 		modifying the crash elfcorehdr for CPU hot un/plug and/or
 		on/offline changes.
+
+What:		/sys/devices/system/cpu/enabled
+Date:		Nov 2022
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:
+		(RO) the list of CPUs that can be brought online.
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index b9d0d14e5960..4713b86d20f2 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -95,6 +95,7 @@ void unregister_cpu(struct cpu *cpu)
 {
 	int logical_cpu = cpu->dev.id;
 
+	set_cpu_enabled(logical_cpu, false);
 	unregister_cpu_under_node(logical_cpu, cpu_to_node(logical_cpu));
 
 	device_unregister(&cpu->dev);
@@ -273,6 +274,13 @@ static ssize_t print_cpus_offline(struct device *dev,
 }
 static DEVICE_ATTR(offline, 0444, print_cpus_offline, NULL);
 
+static ssize_t print_cpus_enabled(struct device *dev,
+				  struct device_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%*pbl\n", cpumask_pr_args(cpu_enabled_mask));
+}
+static DEVICE_ATTR(enabled, 0444, print_cpus_enabled, NULL);
+
 static ssize_t print_cpus_isolated(struct device *dev,
 				  struct device_attribute *attr, char *buf)
 {
@@ -413,6 +421,7 @@ int register_cpu(struct cpu *cpu, int num)
 	register_cpu_under_node(num, cpu_to_node(num));
 	dev_pm_qos_expose_latency_limit(&cpu->dev,
 					PM_QOS_RESUME_LATENCY_NO_CONSTRAINT);
+	set_cpu_enabled(num, true);
 
 	return 0;
 }
@@ -494,6 +503,7 @@ static struct attribute *cpu_root_attrs[] = {
 	&cpu_attrs[2].attr.attr,
 	&dev_attr_kernel_max.attr,
 	&dev_attr_offline.attr,
+	&dev_attr_enabled.attr,
 	&dev_attr_isolated.attr,
 #ifdef CONFIG_NO_HZ_FULL
 	&dev_attr_nohz_full.attr,
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 1c29947db848..4b202b94c97a 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -93,6 +93,7 @@ static inline void set_nr_cpu_ids(unsigned int nr)
  *
  *     cpu_possible_mask- has bit 'cpu' set iff cpu is populatable
  *     cpu_present_mask - has bit 'cpu' set iff cpu is populated
+ *     cpu_enabled_mask - has bit 'cpu' set iff cpu can be brought online
  *     cpu_online_mask  - has bit 'cpu' set iff cpu available to scheduler
  *     cpu_active_mask  - has bit 'cpu' set iff cpu available to migration
  *
@@ -125,11 +126,13 @@ static inline void set_nr_cpu_ids(unsigned int nr)
 
 extern struct cpumask __cpu_possible_mask;
 extern struct cpumask __cpu_online_mask;
+extern struct cpumask __cpu_enabled_mask;
 extern struct cpumask __cpu_present_mask;
 extern struct cpumask __cpu_active_mask;
 extern struct cpumask __cpu_dying_mask;
 #define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
 #define cpu_online_mask   ((const struct cpumask *)&__cpu_online_mask)
+#define cpu_enabled_mask  ((const struct cpumask *)&__cpu_enabled_mask)
 #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
 #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
 #define cpu_dying_mask    ((const struct cpumask *)&__cpu_dying_mask)
@@ -1009,6 +1012,7 @@ extern const DECLARE_BITMAP(cpu_all_bits, NR_CPUS);
 #else
 #define for_each_possible_cpu(cpu) for_each_cpu((cpu), cpu_possible_mask)
 #define for_each_online_cpu(cpu)   for_each_cpu((cpu), cpu_online_mask)
+#define for_each_enabled_cpu(cpu)  for_each_cpu((cpu), cpu_enabled_mask)
 #define for_each_present_cpu(cpu)  for_each_cpu((cpu), cpu_present_mask)
 #endif
 
@@ -1031,6 +1035,15 @@ set_cpu_possible(unsigned int cpu, bool possible)
 		cpumask_clear_cpu(cpu, &__cpu_possible_mask);
 }
 
+static inline void
+set_cpu_enabled(unsigned int cpu, bool can_be_onlined)
+{
+	if (can_be_onlined)
+		cpumask_set_cpu(cpu, &__cpu_enabled_mask);
+	else
+		cpumask_clear_cpu(cpu, &__cpu_enabled_mask);
+}
+
 static inline void
 set_cpu_present(unsigned int cpu, bool present)
 {
@@ -1112,6 +1125,7 @@ static __always_inline unsigned int num_online_cpus(void)
 	return raw_atomic_read(&__num_online_cpus);
 }
 #define num_possible_cpus()	cpumask_weight(cpu_possible_mask)
+#define num_enabled_cpus()	cpumask_weight(cpu_enabled_mask)
 #define num_present_cpus()	cpumask_weight(cpu_present_mask)
 #define num_active_cpus()	cpumask_weight(cpu_active_mask)
 
@@ -1120,6 +1134,11 @@ static inline bool cpu_online(unsigned int cpu)
 	return cpumask_test_cpu(cpu, cpu_online_mask);
 }
 
+static inline bool cpu_enabled(unsigned int cpu)
+{
+	return cpumask_test_cpu(cpu, cpu_enabled_mask);
+}
+
 static inline bool cpu_possible(unsigned int cpu)
 {
 	return cpumask_test_cpu(cpu, cpu_possible_mask);
@@ -1144,6 +1163,7 @@ static inline bool cpu_dying(unsigned int cpu)
 
 #define num_online_cpus()	1U
 #define num_possible_cpus()	1U
+#define num_enabled_cpus()	1U
 #define num_present_cpus()	1U
 #define num_active_cpus()	1U
 
@@ -1157,6 +1177,11 @@ static inline bool cpu_possible(unsigned int cpu)
 	return cpu == 0;
 }
 
+static inline bool cpu_enabled(unsigned int cpu)
+{
+	return cpu == 0;
+}
+
 static inline bool cpu_present(unsigned int cpu)
 {
 	return cpu == 0;
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 8f6affd051f7..537099bf5d02 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -3117,6 +3117,9 @@ EXPORT_SYMBOL(__cpu_possible_mask);
 struct cpumask __cpu_online_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_online_mask);
 
+struct cpumask __cpu_enabled_mask __read_mostly;
+EXPORT_SYMBOL(__cpu_enabled_mask);
+
 struct cpumask __cpu_present_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_present_mask);
 
-- 
2.39.2



* Re: [PATCH v5 01/18] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER
  2024-04-12 14:37 ` [PATCH v5 01/18] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER Jonathan Cameron
@ 2024-04-12 17:42   ` Rafael J. Wysocki
  2024-04-22  3:53   ` Gavin Shan
  1 sibling, 0 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-12 17:42 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon, linuxarm, justin.he, jianyong.wu

On Fri, Apr 12, 2024 at 4:37 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> For arm64 the CPU registration cannot complete until the ACPI interpreter
> is up and running so in those cases the arch specific
> arch_register_cpu() will return -EPROBE_DEFER at this stage and the
> registration will be attempted later.
>
> Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Acked-by: Rafael J. Wysocki <rafael@kernel.org>

> ---
> v5: New patch.
>     Note that for now no arch_register_cpu() calls return -EPROBE_DEFER
>     so it has no impact until the arm64 one is added later in this series.
> ---
>  drivers/base/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 56fba44ba391..b9d0d14e5960 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -558,7 +558,7 @@ static void __init cpu_dev_register_generic(void)
>
>         for_each_present_cpu(i) {
>                 ret = arch_register_cpu(i);
> -               if (ret)
> +               if (ret && ret != -EPROBE_DEFER)
>                         pr_warn("register_cpu %d failed (%d)\n", i, ret);
>         }
>  }
> --
> 2.39.2
>
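
For reference, the warning policy the commit message describes (stay
silent on success and on deferral, warn on any other error) can be
modelled in user space as below. This is a sketch only: should_warn() is
a hypothetical helper, and EPROBE_DEFER is defined locally because it is
a kernel-internal errno value that user-space <errno.h> does not provide.

```c
#include <stdbool.h>

/* Kernel-internal errno value (include/linux/errno.h). */
#define EPROBE_DEFER 517

/*
 * Hypothetical helper: return true if the result of a CPU registration
 * attempt should be reported as a failure.
 */
static bool should_warn(int ret)
{
	/* 0 is success; -EPROBE_DEFER means "try again later". */
	return ret && ret != -EPROBE_DEFER;
}
```

Note that the test for a non-zero return value is needed as well as the
-EPROBE_DEFER comparison, otherwise a successful registration would be
reported as a failure.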


* Re: [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  2024-04-12 14:37 ` [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance Jonathan Cameron
@ 2024-04-12 18:10   ` Rafael J. Wysocki
  2024-04-15 15:48     ` Jonathan Cameron
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-12 18:10 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon, linuxarm, justin.he, jianyong.wu

On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> The arm64 specific arch_register_cpu() needs to access the _STA
> method of the DSDT object so make it available by assigning the
> appropriate handle to the struct cpu instance.
>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  drivers/acpi/acpi_processor.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 7a0dd35d62c9..93e029403d05 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
>         union acpi_object object = { 0 };
>         struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
>         struct acpi_processor *pr = acpi_driver_data(device);
> +       struct cpu *c;
>         int device_declaration = 0;
>         acpi_status status = AE_OK;
>         static int cpu0_initialized;
> @@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
>                         cpufreq_add_device("acpi-cpufreq");
>         }
>
> +       c = &per_cpu(cpu_devices, pr->id);
> +       ACPI_COMPANION_SET(&c->dev, device);

This is also set for per_cpu(cpu_sys_devices, pr->id) in
acpi_processor_add(), via acpi_bind_one().

Moreover, there is some pr->id validation in acpi_processor_add(), so
it seems premature to use it here this way.

I think that ACPI_COMPANION_SET() should be called from here on
per_cpu(cpu_sys_devices, pr->id) after validating pr->id (so the
pr->id validation should all be done here) and then NULL can be passed
as acpi_dev to acpi_bind_one() in acpi_processor_add().  Then, there
will be one physical device corresponding to the processor ACPI device
and no confusion.

>         /*
>          *  Extra Processor objects may be enumerated on MP systems with
>          *  less than the max # of CPUs. They should be ignored _iff
> --
> 2.39.2
>


* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 14:37 ` [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info() Jonathan Cameron
@ 2024-04-12 18:30   ` Rafael J. Wysocki
  2024-04-12 20:16     ` Russell King (Oracle)
                       ` (2 more replies)
  2024-04-16 14:00   ` Jonathan Cameron
  1 sibling, 3 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-12 18:30 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon, linuxarm, justin.he, jianyong.wu

On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> From: James Morse <james.morse@arm.com>
>
> The arm64 specific arch_register_cpu() call may defer CPU registration
> until the ACPI interpreter is available and the _STA method can
> be evaluated.
>
> If this occurs, then a second attempt is made in
> acpi_processor_get_info(). Note that the arm64 specific call has
> not yet been added so for now this will never be successfully
> called.
>
> Systems can still be booted with 'acpi=off', or not include an
> ACPI description at all as in these cases arch_register_cpu()
> will not have deferred registration when first called.
>
> This moves the CPU register logic back to a subsys_initcall(),
> while the memory nodes will have been registered earlier.
> Note this is where the call was prior to the cleanup series so
> there should be no side effects of moving it back again for this
> specific case.
>
> [PATCH 00/21] Initial cleanups for vCPU HP.
> https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
>
> e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> v5: Update commit message to make it clear this is moving the
>     init back to where it was until very recently.
>
>     No longer change the condition in the earlier registration point
>     as that will be handled by the arm64 registration routine
>     deferring until called again here.
> ---
>  drivers/acpi/acpi_processor.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 93e029403d05..c78398cdd060 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
>
>         c = &per_cpu(cpu_devices, pr->id);
>         ACPI_COMPANION_SET(&c->dev, device);
> +       /*
> +        * Register CPUs that are present. get_cpu_device() is used to skip
> +        * duplicate CPU descriptions from firmware.
> +        */
> +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> +           !get_cpu_device(pr->id)) {
> +               int ret = arch_register_cpu(pr->id);
> +
> +               if (ret)
> +                       return ret;
> +       }
> +
>         /*
>          *  Extra Processor objects may be enumerated on MP systems with
>          *  less than the max # of CPUs. They should be ignored _iff
> --

I am still unsure why there need to be two paths calling
arch_register_cpu() in acpi_processor_get_info().

Just below the comment partially pulled into the patch context above,
there is this code:

if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
        int ret = acpi_processor_hotadd_init(pr);

        if (ret)
                return ret;
}

For the sake of the argument, fold acpi_processor_hotadd_init() into
it and drop the redundant _STA check from it:

if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
        int ret;

        if (invalid_phys_cpuid(pr->phys_id))
                return -ENODEV;

        cpu_maps_update_begin();
        cpus_write_lock();

        ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
        if (ret) {
                cpus_write_unlock();
                cpu_maps_update_done();
                return ret;
        }
        ret = arch_register_cpu(pr->id);
        if (ret) {
                acpi_unmap_cpu(pr->id);

                cpus_write_unlock();
                cpu_maps_update_done();
                return ret;
        }
        pr_info("CPU%d has been hot-added\n", pr->id);
        pr->flags.need_hotplug_init = 1;

        cpus_write_unlock();
        cpu_maps_update_done();
}

so I'm not sure why this cannot be combined with the new code.

Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
What's the difference then?  The locking, which should be fine if I'm
not mistaken, and need_hotplug_init, which needs to be set if this code
runs after the processor driver has loaded AFAICS.


* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 18:30   ` Rafael J. Wysocki
@ 2024-04-12 20:16     ` Russell King (Oracle)
  2024-04-12 20:54       ` Thomas Gleixner
  2024-04-15 10:52     ` Jonathan Cameron
  2024-04-15 11:07     ` Salil Mehta
  2 siblings, 1 reply; 58+ messages in thread
From: Russell King (Oracle) @ 2024-04-12 20:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jonathan Cameron, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Fri, Apr 12, 2024 at 08:30:40PM +0200, Rafael J. Wysocki wrote:
> On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > From: James Morse <james.morse@arm.com>
> >
> > The arm64 specific arch_register_cpu() call may defer CPU registration
> > until the ACPI interpreter is available and the _STA method can
> > be evaluated.
> >
> > If this occurs, then a second attempt is made in
> > acpi_processor_get_info(). Note that the arm64 specific call has
> > not yet been added so for now this will never be successfully
> > called.
> >
> > Systems can still be booted with 'acpi=off', or not include an
> > ACPI description at all as in these cases arch_register_cpu()
> > will not have deferred registration when first called.
> >
> > This moves the CPU register logic back to a subsys_initcall(),
> > while the memory nodes will have been registered earlier.
> > Note this is where the call was prior to the cleanup series so
> > there should be no side effects of moving it back again for this
> > specific case.
> >
> > [PATCH 00/21] Initial cleanups for vCPU HP.
> > https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
> >
> > e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
> >
> > Signed-off-by: James Morse <james.morse@arm.com>
> > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > v5: Update commit message to make it clear this is moving the
> >     init back to where it was until very recently.
> >
> >     No longer change the condition in the earlier registration point
> >     as that will be handled by the arm64 registration routine
> >     deferring until called again here.
> > ---
> >  drivers/acpi/acpi_processor.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 93e029403d05..c78398cdd060 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >
> >         c = &per_cpu(cpu_devices, pr->id);
> >         ACPI_COMPANION_SET(&c->dev, device);
> > +       /*
> > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > +        * duplicate CPU descriptions from firmware.
> > +        */
> > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > +           !get_cpu_device(pr->id)) {
> > +               int ret = arch_register_cpu(pr->id);
> > +
> > +               if (ret)
> > +                       return ret;
> > +       }
> > +
> >         /*
> >          *  Extra Processor objects may be enumerated on MP systems with
> >          *  less than the max # of CPUs. They should be ignored _iff
> > --
> 
> I am still unsure why there need to be two paths calling
> arch_register_cpu() in acpi_processor_get_info().
> 
> Just below the comment partially pulled into the patch context above,
> there is this code:
> 
> if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>          int ret = acpi_processor_hotadd_init(pr);
> 
>         if (ret)
>                 return ret;
> }
> 
> For the sake of the argument, fold acpi_processor_hotadd_init() into
> it and drop the redundant _STA check from it:
> 
> if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>         if (invalid_phys_cpuid(pr->phys_id))
>                 return -ENODEV;
> 
>         cpu_maps_update_begin();
>         cpus_write_lock();
> 
>        ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
>        if (ret) {
>                 cpus_write_unlock();
>                 cpu_maps_update_done();
>                 return ret;
>        }
>        ret = arch_register_cpu(pr->id);
>        if (ret) {
>                 acpi_unmap_cpu(pr->id);
> 
>                 cpus_write_unlock();
>                 cpu_maps_update_done();
>                 return ret;
>        }
>       pr_info("CPU%d has been hot-added\n", pr->id);
>       pr->flags.need_hotplug_init = 1;
> 
>       cpus_write_unlock();
>       cpu_maps_update_done();
> }
> 
> so I'm not sure why this cannot be combined with the new code.
> 
> Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
> What's the difference then?  The locking, which should be fine if I'm
> not mistaken, and need_hotplug_init, which needs to be set if this code
> runs after the processor driver has loaded AFAICS.

It is over this that I walked away from progressing this code, because
I don't think it's quite as simple as you make it out to be.

Yes, acpi_map_cpu() and acpi_unmap_cpu() are already arch implemented
functions, so Arm64 can easily provide stubs for these that do nothing.
That never caused me any concern.

What does cause me great concern though are the finer details. For
example, above you seem to drop the evaluation of _STA for the
"make_present" case - I've no idea whether that is something that
should be deleted or not (if it is something that can be deleted,
then why not delete it now?)

As for the cpu locking, I couldn't find anything in arch_register_cpu()
that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
being taken - so I've no idea why the "make_present" case takes these
locks.

Finally, the "pr->flags.need_hotplug_init = 1" thing... it's not
obvious that this is required - remember that with Arm64's "enabled"
toggling, the "processor" is a slice of the system and doesn't
actually go away - it's just "not enabled" for use.

Again, as "processors" in Arm64 are slices of the system, they have
to be fully described in ACPI before the OS boots, and they will be
marked as being "present", which means they will be enumerated, and
the driver will be probed. Any processor that is not to be used will
not have its enabled bit set. It is my understanding that every
processor will result in the ACPI processor driver being bound to it
whether it's enabled or not.

The difference between real hotplug and Arm64 hotplug is that real
hotplug makes stuff not-present (and thus unenumerable). Arm64 hotplug
makes stuff not-enabled which is still enumerable.

... or at least that is my understanding which may not be entirely
correct (which is why I stepped down because I feel totally out of
my depth with ACPI stuff.)
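
The present/enabled distinction above can be sketched in a few lines of
C. This is a minimal illustration, not kernel code: the two bit
positions are the _STA "present" (bit 0) and "enabled" (bit 1) bits from
the ACPI specification, while the enum and the toy_classify_sta() helper
are hypothetical names made up for this example.

```c
#include <stdint.h>

/* _STA result bits as defined by the ACPI specification. */
#define STA_PRESENT (1u << 0) /* bit 0: device is present */
#define STA_ENABLED (1u << 1) /* bit 1: device is enabled */

/* Hypothetical classification, for illustration only. */
enum toy_cpu_state {
	TOY_CPU_ABSENT,              /* not present: not enumerated */
	TOY_CPU_PRESENT_NOT_ENABLED, /* enumerated, but must not be used */
	TOY_CPU_USABLE,              /* present and enabled */
};

/* Map a raw _STA value onto the three cases discussed above. */
static enum toy_cpu_state toy_classify_sta(uint32_t sta)
{
	if (!(sta & STA_PRESENT))
		return TOY_CPU_ABSENT;
	if (!(sta & STA_ENABLED))
		return TOY_CPU_PRESENT_NOT_ENABLED;
	return TOY_CPU_USABLE;
}
```

In this model the Arm64 case only ever moves a CPU between the last two
states, whereas a flow that makes a CPU not-present would move it to the
first one.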

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 20:16     ` Russell King (Oracle)
@ 2024-04-12 20:54       ` Thomas Gleixner
  2024-04-12 21:52         ` Russell King (Oracle)
  2024-04-15 11:51         ` Salil Mehta
  0 siblings, 2 replies; 58+ messages in thread
From: Thomas Gleixner @ 2024-04-12 20:54 UTC (permalink / raw)
  To: Russell King (Oracle), Rafael J. Wysocki
  Cc: Jonathan Cameron, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Fri, Apr 12 2024 at 21:16, Russell King (Oracle) wrote:
> On Fri, Apr 12, 2024 at 08:30:40PM +0200, Rafael J. Wysocki wrote:
>> Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
>> What's the difference then?  The locking, which should be fine if I'm
>> not mistaken and need_hotplug_init that needs to be set if this code
>> runs after the processor driver has loaded AFAICS.
>
> It is over this that I walked away from progressing this code, because
> I don't think it's quite as simple as you make it out to be.
>
> Yes, acpi_map_cpu() and acpi_unmap_cpu() are already arch implemented
> functions, so Arm64 can easily provide stubs for these that do nothing.
> That never caused me any concern.
>
> What does cause me great concern though are the finer details. For
> example, above you seem to drop the evaluation of _STA for the
> "make_present" case - I've no idea whether that is something that
> should be deleted or not (if it is something that can be deleted,
> then why not delete it now?)
>
> As for the cpu locking, I couldn't find anything in arch_register_cpu()
> that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> being taken - so I've no idea why the "make_present" case takes these
> locks.

Anything which updates a CPU mask, e.g. cpu_present_mask, after early
boot must hold the appropriate write locks. Otherwise it would be
possible to online a CPU which just got marked present, but the
registration has not completed yet.
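
The race described above can be modelled with a single-threaded toy in
C. This is only a sketch under stated assumptions: the struct, the
toy_* helpers and the "lock held" flag are all hypothetical stand-ins
(for cpus_write_lock(), cpu_present_mask and the registration step), and
a real lock would block the caller rather than return -EBUSY.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names are kernel API. */
struct toy_cpu_state {
	bool hotplug_lock_held;   /* stands in for the hotplug write lock */
	unsigned long present;    /* stands in for cpu_present_mask */
	unsigned long registered; /* CPUs whose registration has finished */
};

/* Hot add, step 1: take the lock, then mark the CPU present. */
static void toy_hot_add_begin(struct toy_cpu_state *s, unsigned int cpu)
{
	s->hotplug_lock_held = true;
	s->present |= 1UL << cpu; /* cpu assumed below the width of long */
}

/* Hot add, step 2: finish registration, then drop the lock. */
static void toy_hot_add_finish(struct toy_cpu_state *s, unsigned int cpu)
{
	s->registered |= 1UL << cpu;
	s->hotplug_lock_held = false;
}

/* Onlining takes the same lock, so it never sees a half-added CPU. */
static int toy_online(const struct toy_cpu_state *s, unsigned int cpu)
{
	if (s->hotplug_lock_held)
		return -EBUSY; /* a real lock would make the caller wait */
	if (!(s->present & (1UL << cpu)))
		return -ENODEV;
	/* Without the lock, present-but-unregistered would be reachable. */
	return (s->registered & (1UL << cpu)) ? 0 : -EINVAL;
}
```

With the lock discipline in place the -EINVAL branch is unreachable in
this model, which is the point of holding the write lock across both the
mask update and the registration.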

> Finally, the "pr->flags.need_hotplug_init = 1" thing... it's not
> obvious that this is required - remember that with Arm64's "enabled"
> toggling, the "processor" is a slice of the system and doesn't
> actually go away - it's just "not enabled" for use.
>
> Again, as "processors" in Arm64 are slices of the system, they have
> to be fully described in ACPI before the OS boots, and they will be
> marked as being "present", which means they will be enumerated, and
> the driver will be probed. Any processor that is not to be used will
> not have its enabled bit set. It is my understanding that every
> processor will result in the ACPI processor driver being bound to it
> whether it's enabled or not.
>
> The difference between real hotplug and Arm64 hotplug is that real
> hotplug makes stuff not-present (and thus unenumerable). Arm64 hotplug
> makes stuff not-enabled which is still enumerable.

Define "real hotplug" :)

Real physical hotplug does not really exist. That's at least true for
x86, where the physical hotplug support was chased for a while, but
never ended up in production.

Though virtualization happily jumped on it to hot add/remove CPUs
to/from a guest.

There are limitations to this and we learned it the hard way on X86. At
the end we came up with the following restrictions:

    1) All possible CPUs have to be advertised at boot time via firmware
       (ACPI/DT/whatever) independent of them being present at boot time
       or not.

       That guarantees proper sizing and ensures that associations
       between hardware entities and software representations and the
       resulting topology are stable for the lifetime of a system.

       It is really required to know the full topology of the system at
       boot time especially with hybrid CPUs where some of the cores
       have hyperthreading and the others do not.


    2) Hot add can only mark an already registered (possible) CPU
       present. Adding non-registered CPUs after boot is not possible.

       The CPU must have been registered in #1 already to ensure that
       the system topology does not suddenly change in an incompatible
       way at run-time.

The same restriction would apply to real physical hotplug. I don't think
that's any different for ARM64 or any other architecture.
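
The two restrictions above can be sketched as a toy model in C. This is
an illustration under stated assumptions, not the kernel implementation:
the struct and toy_hot_add() are hypothetical names, and plain
unsigned long bitmasks stand in for the possible and present CPU masks.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-ins for the possible and present CPU masks. */
struct toy_masks {
	unsigned long possible; /* fixed at boot from firmware tables */
	unsigned long present;  /* subset of possible, may grow later */
};

/*
 * Hot add may only mark an already advertised (possible) CPU present.
 * A CPU that was not advertised at boot is rejected.
 */
static int toy_hot_add(struct toy_masks *m, unsigned int cpu)
{
	unsigned long bit;

	if (cpu >= sizeof(unsigned long) * CHAR_BIT)
		return -EINVAL;
	bit = 1UL << cpu;

	if (!(m->possible & bit))
		return -ENODEV; /* restriction 1: unknown at boot */
	if (m->present & bit)
		return -EEXIST; /* already present */

	m->present |= bit;      /* restriction 2: possible -> present */
	return 0;
}
```

The possible mask is never written after initialisation in this model,
which mirrors the requirement that the topology stays stable for the
lifetime of the system.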

Hope that helps.

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 20:54       ` Thomas Gleixner
@ 2024-04-12 21:52         ` Russell King (Oracle)
  2024-04-12 23:23           ` Thomas Gleixner
  2024-04-15 11:51         ` Salil Mehta
  1 sibling, 1 reply; 58+ messages in thread
From: Russell King (Oracle) @ 2024-04-12 21:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Rafael J. Wysocki, Jonathan Cameron, linux-pm, loongarch,
	linux-acpi, linux-arch, linux-kernel, linux-arm-kernel, kvmarm,
	x86, Miguel Luis, James Morse, Salil Mehta,
	Jean-Philippe Brucker, Catalin Marinas, Will Deacon, linuxarm,
	justin.he, jianyong.wu

On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:
> On Fri, Apr 12 2024 at 21:16, Russell King (Oracle) wrote:
> > On Fri, Apr 12, 2024 at 08:30:40PM +0200, Rafael J. Wysocki wrote:
> >> Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
> >> What's the difference then?  The locking, which should be fine if I'm
> >> not mistaken and need_hotplug_init that needs to be set if this code
> >> runs after the processor driver has loaded AFAICS.
> >
> > It is over this that I walked away from progressing this code, because
> > I don't think it's quite as simple as you make it out to be.
> >
> > Yes, acpi_map_cpu() and acpi_unmap_cpu() are already arch implemented
> > functions, so Arm64 can easily provide stubs for these that do nothing.
> > That never caused me any concern.
> >
> > What does cause me great concern though are the finer details. For
> > example, above you seem to drop the evaluation of _STA for the
> > "make_present" case - I've no idea whether that is something that
> > should be deleted or not (if it is something that can be deleted,
> > then why not delete it now?)
> >
> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > being taken - so I've no idea why the "make_present" case takes these
> > locks.
> 
> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> boot must hold the appropriate write locks. Otherwise it would be
> possible to online a CPU which just got marked present, but the
> registration has not completed yet.

Yes. As far as I've been able to determine, arch_register_cpu()
doesn't manipulate any of the CPU masks. All it seems to be doing
is initialising the struct cpu, registering the embedded struct
device, and setting up the sysfs links to its NUMA node.

There is nothing obvious in there which manipulates any CPU masks, and
this is rather my fundamental point when I said "I couldn't find
anything in arch_register_cpu() that depends on ...".

If there is something, then comments in the code would be a useful aid
because it's highly non-obvious where such a manipulation is located,
and hence why the locks are necessary.

> > Finally, the "pr->flags.need_hotplug_init = 1" thing... it's not
> > obvious that this is required - remember that with Arm64's "enabled"
> > toggling, the "processor" is a slice of the system and doesn't
> > actually go away - it's just "not enabled" for use.
> >
> > Again, as "processors" in Arm64 are slices of the system, they have
> > to be fully described in ACPI before the OS boots, and they will be
> > marked as being "present", which means they will be enumerated, and
> > the driver will be probed. Any processor that is not to be used will
> > not have its enabled bit set. It is my understanding that every
> > processor will result in the ACPI processor driver being bound to it
> > whether it's enabled or not.
> >
> > The difference between real hotplug and Arm64 hotplug is that real
> > hotplug makes stuff not-present (and thus unenumerable). Arm64 hotplug
> > makes stuff not-enabled which is still enumerable.
> 
> Define "real hotplug" :)
> 
> Real physical hotplug does not really exist. That's at least true for
> x86, where the physical hotplug support was chased for a while, but
> never ended up in production.
> 
> Though virtualization happily jumped on it to hot add/remove CPUs
> to/from a guest.
> 
> There are limitations to this and we learned it the hard way on X86. At
> the end we came up with the following restrictions:
> 
>     1) All possible CPUs have to be advertised at boot time via firmware
>        (ACPI/DT/whatever) independent of them being present at boot time
>        or not.
> 
>        That guarantees proper sizing and ensures that associations
>        between hardware entities and software representations and the
>        resulting topology are stable for the lifetime of a system.
> 
>        It is really required to know the full topology of the system at
>        boot time especially with hybrid CPUs where some of the cores
>        have hyperthreading and the others do not.
> 
> 
>     2) Hot add can only mark an already registered (possible) CPU
>        present. Adding non-registered CPUs after boot is not possible.
> 
>        The CPU must have been registered in #1 already to ensure that
>        the system topology does not suddenly change in an incompatible
>        way at run-time.
> 
> The same restriction would apply to real physical hotplug. I don't think
> that's any different for ARM64 or any other architecture.

This makes me wonder whether the Arm64 has been barking up the wrong
tree then, and whether the whole "present" vs "enabled" thing comes
from a misunderstanding as far as a CPU goes.

However, there is a big difference between the two. On x86, a processor
is just a processor. On Arm64, a "processor" is a slice of the system
(includes the interrupt controller, PMUs etc) and we must enumerate
those even when the processor itself is not enabled. This is the whole
reason there's a difference between "present" and "enabled" and why
there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
The processor never actually goes away in arm64, it's just prevented
from being used.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 21:52         ` Russell King (Oracle)
@ 2024-04-12 23:23           ` Thomas Gleixner
  2024-04-15  8:45             ` Jonathan Cameron
  0 siblings, 1 reply; 58+ messages in thread
From: Thomas Gleixner @ 2024-04-12 23:23 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Rafael J. Wysocki, Jonathan Cameron, linux-pm, loongarch,
	linux-acpi, linux-arch, linux-kernel, linux-arm-kernel, kvmarm,
	x86, Miguel Luis, James Morse, Salil Mehta,
	Jean-Philippe Brucker, Catalin Marinas, Will Deacon, linuxarm,
	justin.he, jianyong.wu

Russell!

On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:
> On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:
>> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
>> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
>> > being taken - so I've no idea why the "make_present" case takes these
>> > locks.
>> 
>> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
>> boot must hold the appropriate write locks. Otherwise it would be
>> possible to online a CPU which just got marked present, but the
>> registration has not completed yet.
>
> Yes. As far as I've been able to determine, arch_register_cpu()
> doesn't manipulate any of the CPU masks. All it seems to be doing
> is initialising the struct cpu, registering the embedded struct
> device, and setting up the sysfs links to its NUMA node.
>
> There is nothing obvious in there which manipulates any CPU masks, and
> this is rather my fundamental point when I said "I couldn't find
> anything in arch_register_cpu() that depends on ...".
>
> If there is something, then comments in the code would be a useful aid
> because it's highly non-obvious where such a manipulation is located,
> and hence why the locks are necessary.

acpi_processor_hotadd_init()
...
         acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);

That ends up in fiddling with cpu_present_mask.

I grant you that arch_register_cpu() is not, but it might rely on the
external locking too. I could not be bothered to figure that out.

>> Define "real hotplug" :)
>> 
>> Real physical hotplug does not really exist. That's at least true for
>> x86, where the physical hotplug support was chased for a while, but
>> never ended up in production.
>> 
>> Though virtualization happily jumped on it to hot add/remove CPUs
>> to/from a guest.
>> 
>> There are limitations to this and we learned it the hard way on X86. At
>> the end we came up with the following restrictions:
>> 
>>     1) All possible CPUs have to be advertised at boot time via firmware
>>        (ACPI/DT/whatever) independent of them being present at boot time
>>        or not.
>> 
>>        That guarantees proper sizing and ensures that associations
>>        between hardware entities and software representations and the
>>        resulting topology are stable for the lifetime of a system.
>> 
>>        It is really required to know the full topology of the system at
>>        boot time especially with hybrid CPUs where some of the cores
>>        have hyperthreading and the others do not.
>> 
>> 
>>     2) Hot add can only mark an already registered (possible) CPU
>>        present. Adding non-registered CPUs after boot is not possible.
>> 
>>        The CPU must have been registered in #1 already to ensure that
>>        the system topology does not suddenly change in an incompatible
>>        way at run-time.
>> 
>> The same restriction would apply to real physical hotplug. I don't think
>> that's any different for ARM64 or any other architecture.
>
> This makes me wonder whether the Arm64 has been barking up the wrong
> tree then, and whether the whole "present" vs "enabled" thing comes
> from a misunderstanding as far as a CPU goes.
>
> However, there is a big difference between the two. On x86, a processor
> is just a processor. On Arm64, a "processor" is a slice of the system
> (includes the interrupt controller, PMUs etc) and we must enumerate
> those even when the processor itself is not enabled. This is the whole
> reason there's a difference between "present" and "enabled" and why
> there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> The processor never actually goes away in arm64, it's just prevented
> from being used.

It's the same on X86 at least in the physical world.

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 23:23           ` Thomas Gleixner
@ 2024-04-15  8:45             ` Jonathan Cameron
  2024-04-15  9:16               ` Jonathan Cameron
  2024-04-15 11:37               ` Rafael J. Wysocki
  0 siblings, 2 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15  8:45 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Russell King (Oracle),
	Rafael J. Wysocki, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Sat, 13 Apr 2024 01:23:48 +0200
Thomas Gleixner <tglx@linutronix.de> wrote:

> Russell!
> 
> On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:
> > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:  
> >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> >> > being taken - so I've no idea why the "make_present" case takes these
> >> > locks.  
> >> 
> >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> >> boot must hold the appropriate write locks. Otherwise it would be
> >> possible to online a CPU which just got marked present, but the
> >> registration has not completed yet.  
> >
> > Yes. As far as I've been able to determine, arch_register_cpu()
> > doesn't manipulate any of the CPU masks. All it seems to be doing
> > is initialising the struct cpu, registering the embedded struct
> > device, and setting up the sysfs links to its NUMA node.
> >
> > There is nothing obvious in there which manipulates any CPU masks, and
> > this is rather my fundamental point when I said "I couldn't find
> > anything in arch_register_cpu() that depends on ...".
> >
> > If there is something, then comments in the code would be a useful aid
> > because it's highly non-obvious where such a manipulation is located,
> > and hence why the locks are necessary.  
> 
> acpi_processor_hotadd_init()
> ...
>          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> 
> That ends up in fiddling with cpu_present_mask.
> 
> I grant you that arch_register_cpu() is not, but it might rely on the
> external locking too. I could not be bothered to figure that out.
> 
> >> Define "real hotplug" :)
> >> 
> >> Real physical hotplug does not really exist. That's at least true for
> >> x86, where the physical hotplug support was chased for a while, but
> >> never ended up in production.
> >> 
> >> Though virtualization happily jumped on it to hot add/remove CPUs
> >> to/from a guest.
> >> 
> >> There are limitations to this and we learned it the hard way on X86. At
> >> the end we came up with the following restrictions:
> >> 
> >>     1) All possible CPUs have to be advertised at boot time via firmware
> >>        (ACPI/DT/whatever) independent of them being present at boot time
> >>        or not.
> >> 
> >>        That guarantees proper sizing and ensures that associations
> >>        between hardware entities and software representations and the
> >>        resulting topology are stable for the lifetime of a system.
> >> 
> >>        It is really required to know the full topology of the system at
> >>        boot time especially with hybrid CPUs where some of the cores
> >>        have hyperthreading and the others do not.
> >> 
> >> 
> >>     2) Hot add can only mark an already registered (possible) CPU
> >>        present. Adding non-registered CPUs after boot is not possible.
> >> 
> >>        The CPU must have been registered in #1 already to ensure that
> >>        the system topology does not suddenly change in an incompatible
> >>        way at run-time.
> >> 
> >> The same restriction would apply to real physical hotplug. I don't think
> >> that's any different for ARM64 or any other architecture.  
> >
> > This makes me wonder whether the Arm64 has been barking up the wrong
> > tree then, and whether the whole "present" vs "enabled" thing comes
> > from a misunderstanding as far as a CPU goes.
> >
> > However, there is a big difference between the two. On x86, a processor
> > is just a processor. On Arm64, a "processor" is a slice of the system
> > (includes the interrupt controller, PMUs etc) and we must enumerate
> > those even when the processor itself is not enabled. This is the whole
> > reason there's a difference between "present" and "enabled" and why
> > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > The processor never actually goes away in arm64, it's just prevented
> > from being used.  
> 
> It's the same on X86 at least in the physical world.

There were public calls on this via the Linaro Open Discussions group,
so I can talk a little about how we ended up here.  Note that (in my
opinion) there is zero chance of this changing - it took us well over
a year to get to this conclusion.  So if we ever want ARM vCPU HP
we need to work within these constraints. 

The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
specs etc, not the kernel maintainers) are determined that they want
to retain the option to do real physical CPU hotplug in the future
with all the necessary work around dynamic interrupt controller
initialization, debug and many other messy corners.

Thus anything defined had to be structured in a way that was 'different'
from that.

I don't mind the proposed flattening of the 2 paths if the ARM kernel
maintainers are fine with it but it will remove the distinctions and
we will need to be very careful with the CPU masks - we can't handle
them the same as x86 does.

I'll get on with doing that, but do need input from Will / Catalin / James.
There are some quirks that need calling out, as it's not quite as simple
as it appears from a high level.

Another part of that long discussion established that there is userspace
(Android IIRC) in which the CPU present mask must include all CPUs
at boot. To change that would be userspace ABI breakage so we can't
do that.  Hence the dance around adding yet another mask to allow the
OS to understand which CPUs are 'present' but not possible to online.
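
As a rough sketch of that idea in C (all names here are hypothetical and
the masks are plain bitmasks, so this is not the code in the series):
the present mask keeps every CPU from boot, so userspace sees no change,
and a second mask records which of those may actually be onlined.

```c
#include <stdbool.h>

/* Hypothetical model; "enabled" stands for the extra mask. */
struct toy_cpu_masks {
	unsigned long present; /* every CPU described at boot, unchanged */
	unsigned long enabled; /* subset that may currently be onlined */
};

/* A CPU may be onlined only if it is both present and enabled. */
static bool toy_can_online(const struct toy_cpu_masks *m, unsigned int cpu)
{
	unsigned long bit = 1UL << cpu; /* cpu assumed below width of long */

	return (m->present & bit) && (m->enabled & bit);
}

/* vCPU hot add/remove toggles only the extra mask, never "present". */
static void toy_set_enabled(struct toy_cpu_masks *m, unsigned int cpu,
			    bool on)
{
	unsigned long bit = 1UL << cpu;

	if (on)
		m->enabled |= bit;
	else
		m->enabled &= ~bit;
}
```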

Flattening the two paths removes any distinction between calls that
are for real hotplug and those that are for this online capable path.
As a side note, the indicating bit for these flows is defined in ACPI
for x86 from ACPI 6.3 as a flag in Processor Local APIC
(the ARM64 definition is a cut and paste of that text).  So someone
is interested in this distinction on x86. I can't say who but if
you have a mantis account you can easily follow the history, and it
might be instructive as to the fact that not everyone considers the
current x86 flow the right way to do it.

Jonathan


> 
> Thanks,
> 
>         tglx
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15  8:45             ` Jonathan Cameron
@ 2024-04-15  9:16               ` Jonathan Cameron
  2024-04-15  9:31                 ` Jonathan Cameron
  2024-04-15 11:57                 ` Jonathan Cameron
  2024-04-15 11:37               ` Rafael J. Wysocki
  1 sibling, 2 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15  9:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Russell King (Oracle),
	Rafael J. Wysocki, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Mon, 15 Apr 2024 09:45:52 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Sat, 13 Apr 2024 01:23:48 +0200
> Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> > Russell!
> > 
> > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:  
> > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:    
> > >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > >> > being taken - so I've no idea why the "make_present" case takes these
> > >> > locks.    
> > >> 
> > >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> > >> boot must hold the appropriate write locks. Otherwise it would be
> > >> possible to online a CPU which just got marked present, but the
> > >> registration has not completed yet.    
> > >
> > > Yes. As far as I've been able to determine, arch_register_cpu()
> > > doesn't manipulate any of the CPU masks. All it seems to be doing
> > > is initialising the struct cpu, registering the embedded struct
> > > device, and setting up the sysfs links to its NUMA node.
> > >
> > > There is nothing obvious in there which manipulates any CPU masks, and
> > > this is rather my fundamental point when I said "I couldn't find
> > > anything in arch_register_cpu() that depends on ...".
> > >
> > > If there is something, then comments in the code would be a useful aid
> > > because it's highly non-obvious where such a manipulation is located,
> > > and hence why the locks are necessary.    
> > 
> > acpi_processor_hotadd_init()
> > ...
> >          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> > 
> > That ends up in fiddling with cpu_present_mask.
> > 
> > I grant you that arch_register_cpu() is not, but it might rely on the
> > external locking too. I could not be bothered to figure that out.
> >   
> > >> Define "real hotplug" :)
> > >> 
> > >> Real physical hotplug does not really exist. That's at least true for
> > >> x86, where the physical hotplug support was chased for a while, but
> > >> never ended up in production.
> > >> 
> > >> Though virtualization happily jumped on it to hot add/remove CPUs
> > >> to/from a guest.
> > >> 
> > >> There are limitations to this and we learned it the hard way on X86. At
> > >> the end we came up with the following restrictions:
> > >> 
> > >>     1) All possible CPUs have to be advertised at boot time via firmware
> > >>        (ACPI/DT/whatever) independent of them being present at boot time
> > >>        or not.
> > >> 
> > >>        That guarantees proper sizing and ensures that associations
> > >>        between hardware entities and software representations and the
> > >>        resulting topology are stable for the lifetime of a system.
> > >> 
> > >>        It is really required to know the full topology of the system at
> > >>        boot time especially with hybrid CPUs where some of the cores
> > >>        have hyperthreading and the others do not.
> > >> 
> > >> 
> > >>     2) Hot add can only mark an already registered (possible) CPU
> > >>        present. Adding non-registered CPUs after boot is not possible.
> > >> 
> > >>        The CPU must have been registered in #1 already to ensure that
> > >>        the system topology does not suddenly change in an incompatible
> > >>        way at run-time.
> > >> 
> > >> The same restriction would apply to real physical hotplug. I don't think
> > >> that's any different for ARM64 or any other architecture.    
> > >
> > > This makes me wonder whether the Arm64 has been barking up the wrong
> > > tree then, and whether the whole "present" vs "enabled" thing comes
> > > from a misunderstanding as far as a CPU goes.
> > >
> > > However, there is a big difference between the two. On x86, a processor
> > > is just a processor. On Arm64, a "processor" is a slice of the system
> > > (includes the interrupt controller, PMUs etc) and we must enumerate
> > > those even when the processor itself is not enabled. This is the whole
> > > reason there's a difference between "present" and "enabled" and why
> > > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > > The processor never actually goes away in arm64, it's just prevented
> > > from being used.    
> > 
> > It's the same on X86 at least in the physical world.  
> 
> There were public calls on this via the Linaro Open Discussions group,
> so I can talk a little about how we ended up here.  Note that (in my
> opinion) there is zero chance of this changing - it took us well over
> a year to get to this conclusion.  So if we ever want ARM vCPU HP
> we need to work within these constraints. 
> 
> The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
> specs etc, not the kernel maintainers) are determined that they want
> to retain the option to do real physical CPU hotplug in the future
> with all the necessary work around dynamic interrupt controller
> initialization, debug and many other messy corners.
> 
> Thus anything defined had to be structured in a way that was 'different'
> from that.
> 
> I don't mind the proposed flattening of the 2 paths if the ARM kernel
> maintainers are fine with it but it will remove the distinctions and
> we will need to be very careful with the CPU masks - we can't handle
> them the same as x86 does.
> 
> I'll get on with doing that, but do need input from Will / Catalin / James.
> There are some quirks that need calling out, as it's not quite as simple
> as it appears from a high level.
> 
> Another part of that long discussion established that there is userspace
> (Android IIRC) in which the CPU present mask must include all CPUs
> at boot. To change that would be userspace ABI breakage so we can't
> do that.  Hence the dance around adding yet another mask to allow the
> OS to understand which CPUs are 'present' but not possible to online.
> 
> Flattening the two paths removes any distinction between calls that
> are for real hotplug and those that are for this online capable path.
> As a side note, the indicating bit for these flows is defined in ACPI
> for x86 from ACPI 6.3 as a flag in Processor Local APIC
> (the ARM64 definition is a cut and paste of that text).  So someone
> is interested in this distinction on x86. I can't say who but if
> you have a mantis account you can easily follow the history, and it
> might be instructive as to the fact that not everyone considers the
> current x86 flow the right way to do it.

Would a higher level check to catch that we are hitting undefined
territory on arm64 be acceptable? That might satisfy the constraint
that we should not have any software for arm64 that would run if
physical CPU HP is added to the arch in future.  Something like:

@@ -331,6 +331,13 @@ static int acpi_processor_get_info(struct acpi_device *device)

        c = &per_cpu(cpu_devices, pr->id);
        ACPI_COMPANION_SET(&c->dev, device);
+
+       if (!IS_ENABLED(CONFIG_ACPI_CPU_HOTPLUG_CPU) &&
+           (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id))) {
+               pr_err_once("Changing CPU present bit is not supported\n");
+               return -ENODEV;
+       }
+

This is basically lifting the check out of the acpi_processor_make_present()
call in this patch set.

With that in place before the new shared call I think we should be fine
wrt the ARM Architecture requirements.

Jonathan


> 
> Jonathan
> 
> 
> > 
> > Thanks,
> > 
> >         tglx
> >   
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15  9:16               ` Jonathan Cameron
@ 2024-04-15  9:31                 ` Jonathan Cameron
  2024-04-15 11:57                 ` Jonathan Cameron
  1 sibling, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15  9:31 UTC (permalink / raw)
  To: Thomas Gleixner, linuxarm
  Cc: Russell King (Oracle),
	Rafael J. Wysocki, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, justin.he, jianyong.wu

On Mon, 15 Apr 2024 10:16:37 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Mon, 15 Apr 2024 09:45:52 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Sat, 13 Apr 2024 01:23:48 +0200
> > Thomas Gleixner <tglx@linutronix.de> wrote:
> >   
> > > Russell!
> > > 
> > > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:    
> > > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:      
> > > >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > > >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > > >> > being taken - so I've no idea why the "make_present" case takes these
> > > >> > locks.      
> > > >> 
> > > >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> > > >> boot must hold the appropriate write locks. Otherwise it would be
> > > >> possible to online a CPU which just got marked present, but the
> > > >> registration has not completed yet.      
> > > >
> > > > Yes. As far as I've been able to determine, arch_register_cpu()
> > > > doesn't manipulate any of the CPU masks. All it seems to be doing
> > > > is initialising the struct cpu, registering the embedded struct
> > > > device, and setting up the sysfs links to its NUMA node.
> > > >
> > > > There is nothing obvious in there which manipulates any CPU masks, and
> > > > this is rather my fundamental point when I said "I couldn't find
> > > > anything in arch_register_cpu() that depends on ...".
> > > >
> > > > If there is something, then comments in the code would be a useful aid
> > > > because it's highly non-obvious where such a manipulation is located,
> > > > and hence why the locks are necessary.      
> > > 
> > > acpi_processor_hotadd_init()
> > > ...
> > >          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> > > 
> > > That ends up in fiddling with cpu_present_mask.
> > > 
> > > I grant you that arch_register_cpu() is not, but it might rely on the
> > > external locking too. I could not be bothered to figure that out.
> > >     
> > > >> Define "real hotplug" :)
> > > >> 
> > > >> Real physical hotplug does not really exist. That's at least true for
> > > >> x86, where the physical hotplug support was chased for a while, but
> > > >> never ended up in production.
> > > >> 
> > > >> Though virtualization happily jumped on it to hot add/remove CPUs
> > > >> to/from a guest.
> > > >> 
> > > >> There are limitations to this and we learned it the hard way on X86. At
> > > >> the end we came up with the following restrictions:
> > > >> 
> > > >>     1) All possible CPUs have to be advertised at boot time via firmware
> > > >>        (ACPI/DT/whatever) independent of them being present at boot time
> > > >>        or not.
> > > >> 
> > > >>        That guarantees proper sizing and ensures that associations
> > > >>        between hardware entities and software representations and the
> > > >>        resulting topology are stable for the lifetime of a system.
> > > >> 
> > > >>        It is really required to know the full topology of the system at
> > > >>        boot time especially with hybrid CPUs where some of the cores
> > > >>        have hyperthreading and the others do not.
> > > >> 
> > > >> 
> > > >>     2) Hot add can only mark an already registered (possible) CPU
> > > >>        present. Adding non-registered CPUs after boot is not possible.
> > > >> 
> > > >>        The CPU must have been registered in #1 already to ensure that
> > > >>        the system topology does not suddenly change in an incompatible
> > > >>        way at run-time.
> > > >> 
> > > >> The same restriction would apply to real physical hotplug. I don't think
> > > >> that's any different for ARM64 or any other architecture.      
> > > >
> > > > This makes me wonder whether the Arm64 has been barking up the wrong
> > > > tree then, and whether the whole "present" vs "enabled" thing comes
> > > > from a misunderstanding as far as a CPU goes.
> > > >
> > > > However, there is a big difference between the two. On x86, a processor
> > > > is just a processor. On Arm64, a "processor" is a slice of the system
> > > > (includes the interrupt controller, PMUs etc) and we must enumerate
> > > > those even when the processor itself is not enabled. This is the whole
> > > > reason there's a difference between "present" and "enabled" and why
> > > > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > > > The processor never actually goes away in arm64, it's just prevented
> > > > from being used.      
> > > 
> > > It's the same on X86 at least in the physical world.    
> > 
> > There were public calls on this via the Linaro Open Discussions group,
> > so I can talk a little about how we ended up here.  Note that (in my
> > opinion) there is zero chance of this changing - it took us well over
> > a year to get to this conclusion.  So if we ever want ARM vCPU HP
> > we need to work within these constraints. 
> > 
> > The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
> > specs etc, not the kernel maintainers) are determined that they want
> > to retain the option to do real physical CPU hotplug in the future
> > with all the necessary work around dynamic interrupt controller
> > initialization, debug and many other messy corners.
> > 
> > Thus anything defined had to be structured in a way that was 'different'
> > from that.
> > 
> > I don't mind the proposed flattening of the 2 paths if the ARM kernel
> > maintainers are fine with it but it will remove the distinctions and
> > we will need to be very careful with the CPU masks - we can't handle
> > them the same as x86 does.
> > 
> > I'll get on with doing that, but do need input from Will / Catalin / James.
> > There are some quirks that need calling out as it's not quite as simple
> > as it appears from a high level.
> > 
> > Another part of that long discussion established that there is userspace
> > (Android IIRC) in which the CPU present mask must include all CPUs
> > at boot. To change that would be userspace ABI breakage so we can't
> > do that.  Hence the dance around adding yet another mask to allow the
> > OS to understand which CPUs are 'present' but not possible to online.
> > 
> > Flattening the two paths removes any distinction between calls that
> > are for real hotplug and those that are for this online capable path.
> > As a side note, the indicating bit for these flows is defined in ACPI
> > for x86 from ACPI 6.3 as a flag in Processor Local APIC
> > (the ARM64 definition is a cut and paste of that text).  So someone
> > is interested in this distinction on x86. I can't say who, but if
> > you have a mantis account you can easily follow the history, and it
> > might be instructive: not everyone considers the current x86
> > flow the right way to do it.
> 
> Would a higher level check to catch that we are hitting undefined
> territory on arm64 be acceptable? That might satisfy the constraint
> that we should not have any software for arm64 that would run if
> physical CPU HP is added to the arch in future.  Something like:
> 
> @@ -331,6 +331,13 @@ static int acpi_processor_get_info(struct acpi_device *device)
> 
>         c = &per_cpu(cpu_devices, pr->id);
>         ACPI_COMPANION_SET(&c->dev, device);
> +
> +       if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_CPU) &&
> +           (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id))) {
> +               pr_err_once("Changing CPU present bit is not supported\n");
> +               return -ENODEV;
> +       }
> +
> 
> This is basically lifting the check out of the acpi_processor_make_present()
> call in this patch set.
> 
> With that in place before the new shared call, I think we should be fine
> wrt the ARM architecture requirements.
> 

Thomas, one result of using the same code in both paths is that we end up
calling acpi_map_cpu() on x86 in paths that are no longer under
CONFIG_ACPI_HOTPLUG_CPU. If anyone ever implements the x86 version of
online capable, this might be valid.

For now I've dropped the CONFIG_ACPI_HOTPLUG_CPU guard in
arch/x86/kernel/acpi/boot.c, but is that the right thing to do, or should
we stub it out with an error return for now?

> Jonathan
> 
> 
>         /*
> > 
> > Jonathan
> > 
> >   
> > > 
> > > Thanks,
> > > 
> > >         tglx
> > >     


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 18:30   ` Rafael J. Wysocki
  2024-04-12 20:16     ` Russell King (Oracle)
@ 2024-04-15 10:52     ` Jonathan Cameron
  2024-04-15 11:11       ` Jonathan Cameron
  2024-04-15 11:52       ` Rafael J. Wysocki
  2024-04-15 11:07     ` Salil Mehta
  2 siblings, 2 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15 10:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Fri, 12 Apr 2024 20:30:40 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > From: James Morse <james.morse@arm.com>
> >
> > The arm64 specific arch_register_cpu() call may defer CPU registration
> > until the ACPI interpreter is available and the _STA method can
> > be evaluated.
> >
> > If this occurs, then a second attempt is made in
> > acpi_processor_get_info(). Note that the arm64 specific call has
> > not yet been added so for now this will never be successfully
> > called.
> >
> > Systems can still be booted with 'acpi=off', or not include an
> > ACPI description at all as in these cases arch_register_cpu()
> > will not have deferred registration when first called.
> >
> > This moves the CPU register logic back to a subsys_initcall(),
> > while the memory nodes will have been registered earlier.
> > Note this is where the call was prior to the cleanup series so
> > there should be no side effects of moving it back again for this
> > specific case.
> >
> > [PATCH 00/21] Initial cleanups for vCPU HP.
> > https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
> >
> > e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
> >
> > Signed-off-by: James Morse <james.morse@arm.com>
> > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > v5: Update commit message to make it clear this is moving the
> >     init back to where it was until very recently.
> >
> >     No longer change the condition in the earlier registration point
> >     as that will be handled by the arm64 registration routine
> >     deferring until called again here.
> > ---
> >  drivers/acpi/acpi_processor.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 93e029403d05..c78398cdd060 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >
> >         c = &per_cpu(cpu_devices, pr->id);
> >         ACPI_COMPANION_SET(&c->dev, device);
> > +       /*
> > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > +        * duplicate CPU descriptions from firmware.
> > +        */
> > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > +           !get_cpu_device(pr->id)) {
> > +               int ret = arch_register_cpu(pr->id);
> > +
> > +               if (ret)
> > +                       return ret;
> > +       }
> > +
> >         /*
> >          *  Extra Processor objects may be enumerated on MP systems with
> >          *  less than the max # of CPUs. They should be ignored _iff
> > --  
> 
> I am still unsure why there need to be two paths calling
> arch_register_cpu() in acpi_processor_get_info().

I replied further down the thread, but the key point was to maintain
the strong distinction between what is done in a real hotplug path and
what is done in a path where onlining is all that happens.  We can relax
that, but it runs contrary to the careful dance that was needed to get
any agreement on the ARM architecture aspects of this.

> 
> Just below the comment partially pulled into the patch context above,
> there is this code:
> 
> if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>          int ret = acpi_processor_hotadd_init(pr);
> 
>         if (ret)
>                 return ret;
> }
> 
> For the sake of the argument, fold acpi_processor_hotadd_init() into
> it and drop the redundant _STA check from it:

If we combine these, the _STA check is necessary because we will call this
path for delayed onlining of ARM64 CPUs (if the earlier registration code's
call of arch_register_cpu() returned -EPROBE_DEFER). That's the only way
we know that a given CPU is online capable but firmware is saying we can't
bring it online yet (it may be vCPU hotplugged later).
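
A user-space sketch of the deferral flow being discussed, under stated
assumptions. The two -EPROBE_DEFER cases follow the cover letter (AML
interpreter not yet available; CPU present but not enabled). Everything
else is hypothetical: model_arch_register_cpu(), the model_* state, and the
-ENODEV return for a CPU that is not present at all are illustrative
choices, not the arm64 implementation. EPROBE_DEFER is defined locally
because it is a kernel-internal errno value that userspace <errno.h> does
not provide.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define EPROBE_DEFER 517	/* kernel-internal value, not in userspace errno.h */
#define MODEL_NR_CPUS 4

#define STA_PRESENT 0x1u	/* ACPI _STA bit 0 */
#define STA_ENABLED 0x2u	/* ACPI _STA bit 1 */

static bool model_aml_ready;			/* false during early boot */
static unsigned int model_sta[MODEL_NR_CPUS];	/* modelled _STA results */
static bool model_registered[MODEL_NR_CPUS];

int model_arch_register_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	/* Case 1: _STA cannot be evaluated yet, so ask to be called again. */
	if (!model_aml_ready)
		return -EPROBE_DEFER;

	/* Assumption of this sketch: a CPU that is not present is an error. */
	if (!(model_sta[cpu] & STA_PRESENT))
		return -ENODEV;

	/* Case 2: present but not enabled, a candidate for later vCPU hotplug. */
	if (!(model_sta[cpu] & STA_ENABLED))
		return -EPROBE_DEFER;

	model_registered[cpu] = true;
	return 0;
}
```

In this model the early-boot call always defers, and the second attempt
(the one made from acpi_processor_get_info() in the patch) registers only
the CPUs whose _STA reports both present and enabled.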

> 
> if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>         if (invalid_phys_cpuid(pr->phys_id))
>                 return -ENODEV;
> 
>         cpu_maps_update_begin();
>         cpus_write_lock();
> 
>        ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);

I read that call as
	acpi_map_cpu_for_physical_cpu_hotplug()
but we could make it the equivalent of
	acpi_map_cpu_for_whatever_cpu_hotplug()
(I'm not proposing those names though ;)

in which case it is fine to just stub it out on ARM64.
>        if (ret) {
>                 cpus_write_unlock();
>                 cpu_maps_update_done();
>                 return ret;
>        }
>        ret = arch_register_cpu(pr->id);
>        if (ret) {
>                 acpi_unmap_cpu(pr->id);
> 
>                 cpus_write_unlock();
>                 cpu_maps_update_done();
>                 return ret;
>        }
>       pr_info("CPU%d has been hot-added\n", pr->id);
>       pr->flags.need_hotplug_init = 1;
This one needs more careful handling because on arm64 we also call this
for non-hotplug cases, so we end up setting it for initially online CPUs.
If we then offline and online them again via sysfs
(/sys/bus/cpu/devices/cpuX/online), that goes through the hotplug path
when it should not.

So I need a way to detect if we are hotplugging the cpu or not.
Is there a standard way to do this?  I haven't figured out how
to use flags in drivers to communicate this state.
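
To make the problem concrete, here is a small user-space model. It is not a
proposed fix and does not use any kernel API: struct model_processor, enum
model_reg_origin, model_register() and model_online() are all hypothetical.
The only thing it demonstrates is that the flag has to depend on how the
CPU was registered, and the model assumes the flag is consumed (cleared)
the first time the CPU is onlined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of struct acpi_processor. */
struct model_processor {
	int id;
	bool need_hotplug_init;
};

/* How the registration was reached; an assumption of this sketch. */
enum model_reg_origin {
	MODEL_REG_BOOT_ENUMERATION,	/* CPU was usable at boot */
	MODEL_REG_HOT_ADD,		/* CPU arrived after boot */
};

void model_register(struct model_processor *pr, enum model_reg_origin origin)
{
	assert(pr != NULL);
	/* Only a genuine hot-add should request the hotplug init step. */
	pr->need_hotplug_init = (origin == MODEL_REG_HOT_ADD);
}

/*
 * Models an online request. Returns true if the hotplug init path ran,
 * and clears the flag so that later offline/online cycles skip it.
 */
bool model_online(struct model_processor *pr)
{
	assert(pr != NULL);
	bool took_hotplug_path = pr->need_hotplug_init;

	pr->need_hotplug_init = false;
	return took_hotplug_path;
}
```

If model_register() set the flag unconditionally, a CPU that was online at
boot would take the hotplug path the first time it was toggled through
sysfs, which is the behaviour described above as wrong.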

> 
>       cpus_write_unlock();
>       cpu_maps_update_done();
> }
> 
> so I'm not sure why this cannot be combined with the new code.
> 
> Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
> What's the difference then?  The locking, which should be fine if I'm
> not mistaken and need_hotplug_init that needs to be set if this code
> runs after the processor driver has loaded AFAICS.

That's the bit that I'm currently finding a challenge. Is there a clean
way to detect that?

Jonathan






^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 18:30   ` Rafael J. Wysocki
  2024-04-12 20:16     ` Russell King (Oracle)
  2024-04-15 10:52     ` Jonathan Cameron
@ 2024-04-15 11:07     ` Salil Mehta
  2 siblings, 0 replies; 58+ messages in thread
From: Salil Mehta @ 2024-04-15 11:07 UTC (permalink / raw)
  To: Rafael J. Wysocki, Jonathan Cameron
  Cc: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Miguel Luis,
	James Morse, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	Linuxarm, justin.he, jianyong.wu

Hello,

Engaging after a long time (I've been almost off grid due to very challenging personal circumstances).
Please find my response inline.

Thanks
Salil

>  From: Rafael J. Wysocki <rafael@kernel.org>
>  Sent: Friday, April 12, 2024 7:31 PM
>  To: Jonathan Cameron <jonathan.cameron@huawei.com>
>  Cc: linux-pm@vger.kernel.org; loongarch@lists.linux.dev; linux-
>  acpi@vger.kernel.org; linux-arch@vger.kernel.org; linux-
>  kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>  kvmarm@lists.linux.dev; x86@kernel.org; Russell King
>  <linux@armlinux.org.uk>; Rafael J . Wysocki <rafael@kernel.org>; Miguel
>  Luis <miguel.luis@oracle.com>; James Morse <james.morse@arm.com>;
>  Salil Mehta <salil.mehta@huawei.com>; Jean-Philippe Brucker <jean-
>  philippe@linaro.org>; Catalin Marinas <catalin.marinas@arm.com>; Will
>  Deacon <will@kernel.org>; Linuxarm <linuxarm@huawei.com>;
>  justin.he@arm.com; jianyong.wu@arm.com
>  Subject: Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from
>  acpi_processor_get_info()
>  
>  On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
>  <Jonathan.Cameron@huawei.com> wrote:
>  >
>  > From: James Morse <james.morse@arm.com>
>  >
>  > The arm64 specific arch_register_cpu() call may defer CPU registration
>  > until the ACPI interpreter is available and the _STA method can be
>  > evaluated.
>  >
>  > If this occurs, then a second attempt is made in
>  > acpi_processor_get_info(). Note that the arm64 specific call has not
>  > yet been added so for now this will never be successfully called.
>  >
>  > Systems can still be booted with 'acpi=off', or not include an ACPI
>  > description at all as in these cases arch_register_cpu() will not have
>  > deferred registration when first called.
>  >
>  > This moves the CPU register logic back to a subsys_initcall(), while
>  > the memory nodes will have been registered earlier.
>  > Note this is where the call was prior to the cleanup series so there
>  > should be no side effects of moving it back again for this specific
>  > case.
>  >
>  > [PATCH 00/21] Initial cleanups for vCPU HP.
>  >
>  https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
>  >
>  > e.g. 5b95f94c3b9f ("x86/topology: Switch over to
>  GENERIC_CPU_DEVICES")
>  >
>  > Signed-off-by: James Morse <james.morse@arm.com>
>  > Reviewed-by: Gavin Shan <gshan@redhat.com>
>  > Tested-by: Miguel Luis <miguel.luis@oracle.com>
>  > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
>  > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
>  > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>  > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>  > ---
>  > v5: Update commit message to make it clear this is moving the
>  >     init back to where it was until very recently.
>  >
>  >     No longer change the condition in the earlier registration point
>  >     as that will be handled by the arm64 registration routine
>  >     deferring until called again here.
>  > ---
>  >  drivers/acpi/acpi_processor.c | 12 ++++++++++++
>  >  1 file changed, 12 insertions(+)
>  >
>  > diff --git a/drivers/acpi/acpi_processor.c
>  > b/drivers/acpi/acpi_processor.c index 93e029403d05..c78398cdd060
>  > 100644
>  > --- a/drivers/acpi/acpi_processor.c
>  > +++ b/drivers/acpi/acpi_processor.c
>  > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct
>  > acpi_device *device)
>  >
>  >         c = &per_cpu(cpu_devices, pr->id);
>  >         ACPI_COMPANION_SET(&c->dev, device);
>  > +       /*
>  > +        * Register CPUs that are present. get_cpu_device() is used to skip
>  > +        * duplicate CPU descriptions from firmware.
>  > +        */
>  > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
>  > +           !get_cpu_device(pr->id)) {
>  > +               int ret = arch_register_cpu(pr->id);
>  > +
>  > +               if (ret)
>  > +                       return ret;
>  > +       }
>  > +
>  >         /*
>  >          *  Extra Processor objects may be enumerated on MP systems with
>  >          *  less than the max # of CPUs. They should be ignored _iff
>  > --
>  
>  I am still unsure why there need to be two paths calling
>  arch_register_cpu() in acpi_processor_get_info().


This is because all CPUs are expected to be 'present' at boot time on the ARM64 arch.
This is not true in the x86 world, i.e. the logical_cpuid could be invalid (and the present
mask bit not set) on x86 at boot time.  Faking the 'present' behavior at the virtualizer
level for ARM amounts to interfering with the architecture and then tweaking the kernel to fit
that unauthorized hack. This has the potential to break existing and future versions of the
ARM arch. (By the way, I was one of the initial offenders in doing that, but later corrected the
approach after many discussions and KVM Forum presentations along with ARM.)

Therefore, on ARM we keep all the processors present and just use the _STA enabled bit to
decide whether a processor can be onlined, and this requires separate handling.
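
A minimal user-space sketch of the distinction drawn above, assuming a
simple two-mask model: presence is fixed at enumeration time, and only the
_STA enabled bit moves when a vCPU is later made available. The _STA bit
positions (bit 0 present, bit 1 enabled) are from the ACPI specification;
struct model_cpu_masks and the model_* functions are hypothetical and do
not correspond to kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STA_PRESENT (1u << 0)	/* ACPI _STA bit 0 */
#define STA_ENABLED (1u << 1)	/* ACPI _STA bit 1 */

/* Hypothetical model: one bit per CPU in each mask. */
struct model_cpu_masks {
	uint64_t present;	/* fixed after boot in this model */
	uint64_t enabled;	/* may change when firmware updates _STA */
};

/* Build both masks from the boot-time _STA values. */
void model_enumerate(struct model_cpu_masks *m, const unsigned int *sta,
		     int ncpus)
{
	assert(m != NULL && sta != NULL && ncpus >= 0 && ncpus <= 64);
	m->present = 0;
	m->enabled = 0;
	for (int cpu = 0; cpu < ncpus; cpu++) {
		if (!(sta[cpu] & STA_PRESENT))
			continue;
		m->present |= UINT64_C(1) << cpu;
		if (sta[cpu] & STA_ENABLED)
			m->enabled |= UINT64_C(1) << cpu;
	}
}

bool model_can_online(const struct model_cpu_masks *m, int cpu)
{
	assert(m != NULL && cpu >= 0 && cpu < 64);
	return (m->enabled >> cpu) & 1;
}

/* A later _STA change may flip 'enabled' but never touches 'present'. */
void model_sta_changed(struct model_cpu_masks *m, int cpu,
		       unsigned int new_sta)
{
	assert(m != NULL && cpu >= 0 && cpu < 64);
	assert((m->present >> cpu) & 1);	/* must already be present */
	assert(new_sta & STA_PRESENT);		/* presence cannot be revoked */
	if (new_sta & STA_ENABLED)
		m->enabled |= UINT64_C(1) << cpu;
	else
		m->enabled &= ~(UINT64_C(1) << cpu);
}
```

In this model a four-CPU guest with two enabled vCPUs reports all four as
present from boot, and making a third vCPU available changes only the
enabled mask.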


>  
>  Just below the comment partially pulled into the patch context above, there
>  is this code:
>  
>  if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>           int ret = acpi_processor_hotadd_init(pr);
>  
>          if (ret)
>                  return ret;
>  }
>  
>  For the sake of the argument, fold acpi_processor_hotadd_init() into it and
>  drop the redundant _STA check from it:
>  
>  if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>          if (invalid_phys_cpuid(pr->phys_id))
>                  return -ENODEV;
>  
>          cpu_maps_update_begin();
>          cpus_write_lock();
>  
>         ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
>         if (ret) {
>                  cpus_write_unlock();
>                  cpu_maps_update_done();
>                  return ret;
>         }
>         ret = arch_register_cpu(pr->id);
>         if (ret) {
>                  acpi_unmap_cpu(pr->id);
>  
>                  cpus_write_unlock();
>                  cpu_maps_update_done();
>                  return ret;
>         }
>        pr_info("CPU%d has been hot-added\n", pr->id);
>        pr->flags.need_hotplug_init = 1;
>  
>        cpus_write_unlock();
>        cpu_maps_update_done();
>  }
>  
>  so I'm not sure why this cannot be combined with the new code.
>  
>  Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.


We cannot, because on the ARM arch the logical CPU ID can never be invalid
and CPUs can never be in the not-present state.


>  What's the difference then?  


The above is the precise difference. Changing the behavior of 'presence' in
the ARM architecture after boot is not allowed. With the latest efforts, we
have added the concept of an 'online-capable' bit, which can help defer
onlining of the CPUs, but this is not the same as not being present at
boot time.


>  The locking, which should be fine if I'm not
>  mistaken and need_hotplug_init that needs to be set if this code runs after
>  the processor driver has loaded AFAICS.

AFAICS, the locking looks okay to me as well.

Best regards
Salil.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 10:52     ` Jonathan Cameron
@ 2024-04-15 11:11       ` Jonathan Cameron
  2024-04-15 11:52       ` Rafael J. Wysocki
  1 sibling, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15 11:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Mon, 15 Apr 2024 11:52:03 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Fri, 12 Apr 2024 20:30:40 +0200
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> 
> > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:  
> > >
> > > From: James Morse <james.morse@arm.com>
> > >
> > > The arm64 specific arch_register_cpu() call may defer CPU registration
> > > until the ACPI interpreter is available and the _STA method can
> > > be evaluated.
> > >
> > > If this occurs, then a second attempt is made in
> > > acpi_processor_get_info(). Note that the arm64 specific call has
> > > not yet been added so for now this will never be successfully
> > > called.
> > >
> > > Systems can still be booted with 'acpi=off', or not include an
> > > ACPI description at all as in these cases arch_register_cpu()
> > > will not have deferred registration when first called.
> > >
> > > This moves the CPU register logic back to a subsys_initcall(),
> > > while the memory nodes will have been registered earlier.
> > > Note this is where the call was prior to the cleanup series so
> > > there should be no side effects of moving it back again for this
> > > specific case.
> > >
> > > [PATCH 00/21] Initial cleanups for vCPU HP.
> > > https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
> > >
> > > e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
> > >
> > > Signed-off-by: James Morse <james.morse@arm.com>
> > > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > ---
> > > v5: Update commit message to make it clear this is moving the
> > >     init back to where it was until very recently.
> > >
> > >     No longer change the condition in the earlier registration point
> > >     as that will be handled by the arm64 registration routine
> > >     deferring until called again here.
> > > ---
> > >  drivers/acpi/acpi_processor.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > index 93e029403d05..c78398cdd060 100644
> > > --- a/drivers/acpi/acpi_processor.c
> > > +++ b/drivers/acpi/acpi_processor.c
> > > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > >
> > >         c = &per_cpu(cpu_devices, pr->id);
> > >         ACPI_COMPANION_SET(&c->dev, device);
> > > +       /*
> > > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > > +        * duplicate CPU descriptions from firmware.
> > > +        */
> > > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > > +           !get_cpu_device(pr->id)) {
> > > +               int ret = arch_register_cpu(pr->id);
> > > +
> > > +               if (ret)
> > > +                       return ret;
> > > +       }
> > > +
> > >         /*
> > >          *  Extra Processor objects may be enumerated on MP systems with
> > >          *  less than the max # of CPUs. They should be ignored _iff
> > > --    
> > 
> > I am still unsure why there need to be two paths calling
> > arch_register_cpu() in acpi_processor_get_info().  
> 
> I replied further down the thread, but the key point was to maintain
> the strong distinction between 'what' was done in a real hotplug
> path vs one where onlining was all.  We can relax that but it goes
> contrary to the careful dance that was needed to get any agreement
> to the ARM architecture aspects of this.
> 
> > 
> > Just below the comment partially pulled into the patch context above,
> > there is this code:
> > 
> > if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> >          int ret = acpi_processor_hotadd_init(pr);
> > 
> >         if (ret)
> >                 return ret;
> > }
> > 
> > For the sake of the argument, fold acpi_processor_hotadd_init() into
> > it and drop the redundant _STA check from it:  
> 
> If we combine these, the _STA check is necessary because we will call this
> path for delayed onlining of ARM64 CPUs (if the earlier registration code's
> call of arch_register_cpu() returned -EPROBE_DEFER). That's the only way
> we know that a given CPU is online capable but firmware is saying we can't
> bring it online yet (it may be vCPU hotplugged later).

For arm64, ignore this comment. I'd forgotten that we moved it into the
arch-specific code last week.  We may need something similar on x86; I'm
not 100% sure.

> 
> > 
> > if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> >         if (invalid_phys_cpuid(pr->phys_id))
> >                 return -ENODEV;
> > 
> >         cpu_maps_update_begin();
> >         cpus_write_lock();
> > 
> >        ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);  
> 
> I read that call as
> 	acpi_map_cpu_for_physical_cpu_hotplug()
> but we could make it the equivalent of
> 	acpi_map_cpu_for_whatever_cpu_hotplug()
> (I'm not proposing those names though ;)
> 
> in which case it is fine to just stub it out on ARM64.
> >        if (ret) {
> >                 cpus_write_unlock();
> >                 cpu_maps_update_done();
> >                 return ret;
> >        }
> >        ret = arch_register_cpu(pr->id);
> >        if (ret) {
> >                 acpi_unmap_cpu(pr->id);
> > 
> >                 cpus_write_unlock();
> >                 cpu_maps_update_done();
> >                 return ret;
> >        }
> >       pr_info("CPU%d has been hot-added\n", pr->id);
> >       pr->flags.need_hotplug_init = 1;  
> This one needs more careful handling because on arm64 we also call this
> for non-hotplug cases, so we end up setting it for initially online CPUs.
> If we then offline and online them again via sysfs
> (/sys/bus/cpu/devices/cpuX/online), that goes through the hotplug path
> when it should not.
> 
> So I need a way to detect if we are hotplugging the cpu or not.
> Is there a standard way to do this?  I haven't figured out how
> to use flags in drivers to communicate this state.
> 
> > 
> >       cpus_write_unlock();
> >       cpu_maps_update_done();
> > }
> > 
> > so I'm not sure why this cannot be combined with the new code.
> > 
> > Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
> > What's the difference then?  The locking, which should be fine if I'm
> > not mistaken and need_hotplug_init that needs to be set if this code
> > runs after the processor driver has loaded AFAICS.  
> 
> That's the bit that I'm currently finding a challenge. Is there a clean
> way to detect that?
> 
> Jonathan


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15  8:45             ` Jonathan Cameron
  2024-04-15  9:16               ` Jonathan Cameron
@ 2024-04-15 11:37               ` Rafael J. Wysocki
  2024-04-15 11:56                 ` Jonathan Cameron
  1 sibling, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-15 11:37 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Thomas Gleixner, Russell King (Oracle),
	Rafael J. Wysocki, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Mon, Apr 15, 2024 at 10:46 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Sat, 13 Apr 2024 01:23:48 +0200
> Thomas Gleixner <tglx@linutronix.de> wrote:
>
> > Russell!
> >
> > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:
> > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:
> > >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > >> > being taken - so I've no idea why the "make_present" case takes these
> > >> > locks.
> > >>
> > >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> > >> boot must hold the appropriate write locks. Otherwise it would be
> > >> possible to online a CPU which just got marked present, but the
> > >> registration has not completed yet.
> > >
> > > Yes. As far as I've been able to determine, arch_register_cpu()
> > > doesn't manipulate any of the CPU masks. All it seems to be doing
> > > is initialising the struct cpu, registering the embedded struct
> > > device, and setting up the sysfs links to its NUMA node.
> > >
> > > There is nothing obvious in there which manipulates any CPU masks, and
> > > this is rather my fundamental point when I said "I couldn't find
> > > anything in arch_register_cpu() that depends on ...".
> > >
> > > If there is something, then comments in the code would be a useful aid
> > > because it's highly non-obvious where such a manipulation is located,
> > > and hence why the locks are necessary.
> >
> > acpi_processor_hotadd_init()
> > ...
> >          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> >
> > That ends up in fiddling with cpu_present_mask.
> >
> > I grant you that arch_register_cpu() is not, but it might rely on the
> > external locking too. I could not be bothered to figure that out.
> >
> > >> Define "real hotplug" :)
> > >>
> > >> Real physical hotplug does not really exist. That's at least true for
> > >> x86, where the physical hotplug support was chased for a while, but
> > >> never ended up in production.
> > >>
> > >> Though virtualization happily jumped on it to hot add/remove CPUs
> > >> to/from a guest.
> > >>
> > >> There are limitations to this and we learned it the hard way on X86. At
> > >> the end we came up with the following restrictions:
> > >>
> > >>     1) All possible CPUs have to be advertised at boot time via firmware
> > >>        (ACPI/DT/whatever) independent of them being present at boot time
> > >>        or not.
> > >>
> > >>        That guarantees proper sizing and ensures that associations
> > >>        between hardware entities and software representations and the
> > >>        resulting topology are stable for the lifetime of a system.
> > >>
> > >>        It is really required to know the full topology of the system at
> > >>        boot time especially with hybrid CPUs where some of the cores
> > >>        have hyperthreading and the others do not.
> > >>
> > >>
> > >>     2) Hot add can only mark an already registered (possible) CPU
> > >>        present. Adding non-registered CPUs after boot is not possible.
> > >>
> > >>        The CPU must have been registered in #1 already to ensure that
> > >>        the system topology does not suddenly change in an incompatible
> > >>        way at run-time.
> > >>
> > >> The same restriction would apply to real physical hotplug. I don't think
> > >> that's any different for ARM64 or any other architecture.
> > >
> > > This makes me wonder whether the Arm64 has been barking up the wrong
> > > tree then, and whether the whole "present" vs "enabled" thing comes
> > > from a misunderstanding as far as a CPU goes.
> > >
> > > However, there is a big difference between the two. On x86, a processor
> > > is just a processor. On Arm64, a "processor" is a slice of the system
> > > (includes the interrupt controller, PMUs etc) and we must enumerate
> > > those even when the processor itself is not enabled. This is the whole
> > > reason there's a difference between "present" and "enabled" and why
> > > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > > The processor never actually goes away in arm64, it's just prevented
> > > from being used.
> >
> > It's the same on X86 at least in the physical world.
>
> There were public calls on this via the Linaro Open Discussions group,
> so I can talk a little about how we ended up here.  Note that (in my
> opinion) there is zero chance of this changing - it took us well over
> a year to get to this conclusion.  So if we ever want ARM vCPU HP
> we need to work within these constraints.
>
> The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
> specs etc, not the kernel maintainers) are determined that they want
> to retain the option to do real physical CPU hotplug in the future
> with all the necessary work around dynamic interrupt controller
> initialization, debug and many other messy corners.

That's OK, but the difference is not in the ACPI CPU enumeration/removal code.

> Thus anything defined had to be structured in a way that was 'different'
> from that.

Apparently, that's where things got confused.

> I don't mind the proposed flattening of the 2 paths if the ARM kernel
> maintainers are fine with it but it will remove the distinctions and
> we will need to be very careful with the CPU masks - we can't handle
> them the same as x86 does.

At the ACPI code level, there is no distinction.

A CPU that was not available before has just become available.  The
platform firmware has notified the kernel about it and now
acpi_processor_add() runs.  Why would it need to use different code
paths depending on what _STA bits were clear before?

Yes, there is some arch stuff to be called and that arch stuff should
figure out what to do to make things actually work.

> I'll get on with doing that, but do need input from Will / Catalin / James.
> There are some quirks that need calling out as it's not quite as simple
> as it appears from a high level.
>
> Another part of that long discussion established that there is userspace
> (Android IIRC) in which the CPU present mask must include all CPUs
> at boot. To change that would be userspace ABI breakage so we can't
> do that.  Hence the dance around adding yet another mask to allow the
> OS to understand which CPUs are 'present' but not possible to online.
>
> Flattening the two paths removes any distinction between calls that
> are for real hotplug and those that are for this online capable path.

Which calls exactly do you mean?

> As a side note, the indicating bit for these flows is defined in ACPI
> for x86 from ACPI 6.3 as a flag in Processor Local APIC
> (the ARM64 definition is a cut and paste of that text).  So someone
> is interested in this distinction on x86. I can't say who but if
> you have a mantis account you can easily follow the history and it
> might be instructive to not everyone considering the current x86
> flow the right way to do it.

So a physically absent processor is different from a physically
present processor that has not been enabled.  No doubt about this.

That said, I'm still unsure why these two cases require two different
code paths in acpi_processor_add().

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 20:54       ` Thomas Gleixner
  2024-04-12 21:52         ` Russell King (Oracle)
@ 2024-04-15 11:51         ` Salil Mehta
  2024-04-15 12:51           ` Rafael J. Wysocki
  1 sibling, 1 reply; 58+ messages in thread
From: Salil Mehta @ 2024-04-15 11:51 UTC (permalink / raw)
  To: Thomas Gleixner, Russell King (Oracle), Rafael J. Wysocki
  Cc: Jonathan Cameron, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	Linuxarm, justin.he, jianyong.wu

Hello,

>  From: Thomas Gleixner <tglx@linutronix.de>
>  Sent: Friday, April 12, 2024 9:55 PM
>  
>  On Fri, Apr 12 2024 at 21:16, Russell King (Oracle) wrote:
>  > On Fri, Apr 12, 2024 at 08:30:40PM +0200, Rafael J. Wysocki wrote:
>  >> Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
>  >> What's the difference then?  The locking, which should be fine if I'm
>  >> not mistaken and need_hotplug_init that needs to be set if this code
>  >> runs after the processor driver has loaded AFAICS.
>  >
>  > It is over this that I walked away from progressing this code, because
>  > I don't think it's quite as simple as you make it out to be.
>  >
>  > Yes, acpi_map_cpu() and acpi_unmap_cpu() are already arch
>  implemented
>  > functions, so Arm64 can easily provide stubs for these that do nothing.
>  > That never caused me any concern.
>  >
>  > What does cause me great concern though are the finer details. For
>  > example, above you seem to drop the evaluation of _STA for the
>  > "make_present" case - I've no idea whether that is something that
>  > should be deleted or not (if it is something that can be deleted, then
>  > why not delete it now?)
>  >
>  > As for the cpu locking, I couldn't find anything in
>  > arch_register_cpu() that depends on the cpu_maps_update stuff nor
>  > needs the cpus_write_lock being taken - so I've no idea why the
>  > "make_present" case takes these locks.
>  
>  Anything which updates a CPU mask, e.g. cpu_present_mask, after early
>  boot must hold the appropriate write locks. Otherwise it would be possible
>  to online a CPU which just got marked present, but the registration has not
>  completed yet.
>  
>  > Finally, the "pr->flags.need_hotplug_init = 1" thing... it's not
>  > obvious that this is required - remember that with Arm64's "enabled"
>  > toggling, the "processor" is a slice of the system and doesn't
>  > actually go away - it's just "not enabled" for use.
>  >
>  > Again, as "processors" in Arm64 are slices of the system, they have to
>  > be fully described in ACPI before the OS boots, and they will be
>  > marked as being "present", which means they will be enumerated, and
>  > the driver will be probed. Any processor that is not to be used will
>  > not have its enabled bit set. It is my understanding that every
>  > processor will result in the ACPI processor driver being bound to it
>  > whether its enabled or not.
>  >
>  > The difference between real hotplug and Arm64 hotplug is that real
>  > hotplug makes stuff not-present (and thus unenumerable). Arm64
>  hotplug
>  > makes stuff not-enabled which is still enumerable.
>  
>  Define "real hotplug" :)
>  
>  Real physical hotplug does not really exist. That's at least true for x86, where
>  the physical hotplug support was chased for a while, but never ended up in
>  production.
>  
>  Though virtualization happily jumped on it to hot add/remove CPUs to/from
>  a guest.
>  
>  There are limitations to this and we learned it the hard way on X86. At the
>  end we came up with the following restrictions:
>  
>      1) All possible CPUs have to be advertised at boot time via firmware
>         (ACPI/DT/whatever) independent of them being present at boot time
>         or not.
>  
>         That guarantees proper sizing and ensures that associations
>         between hardware entities and software representations and the
>         resulting topology are stable for the lifetime of a system.
>  
>         It is really required to know the full topology of the system at
>         boot time especially with hybrid CPUs where some of the cores
>         have hyperthreading and the others do not.
>  
>  
>      2) Hot add can only mark an already registered (possible) CPU
>         present. Adding non-registered CPUs after boot is not possible.
>  
>         The CPU must have been registered in #1 already to ensure that
>         the system topology does not suddenly change in an incompatible
>         way at run-time.
>  
>  The same restriction would apply to real physical hotplug. I don't think that's
>  any different for ARM64 or any other architecture.


There is a difference:

1.   The ARM architecture does not allow any processor to be NOT present. Because of
this restriction, all of its related per-cpu components must be present and enumerated
at boot time as well (exposed by firmware and ACPI). This means all the enumerated
processors will be marked as 'present', but they might exist in a NOT enabled
(_STA.enabled=0) state.
 
There was one clear difference, and please correct me if I'm wrong here: for x86, the LAPIC
associated with the x86 core can be brought online later, even after boot?

But for the ARM architecture, processors and their corresponding per-cpu components, such as
redistributors, all need to be present and enumerated at boot time. Redistributors are part
of the ALWAYS-ON power domain.

2.  Agreed regarding the topology. Are you suggesting that we must call arch_register_cpu()
at boot time for all the 'present' CPUs? Even if that's the case, we might still want to defer
registration of the cpu device (register_cpu() API) with the Linux device model. The latter is
what we are doing to hide/unhide the CPUs from the user while the _STA.Enabled bit is toggled
due to a CPU (un)plug action.
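
To make the present vs. enabled split above concrete, here is a minimal
userspace sketch. It is only a model under stated assumptions: the struct
and the model_*() helpers are hypothetical names, not kernel APIs, and the
"registered" mask merely stands in for which CPUs have a device visible to
the user. The point it tries to show is that the present mask is fixed at
boot and only the enabled/registered state follows the (un)plug action.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_NR_CPUS 4

struct cpu_model {
	uint32_t present;    /* fixed at boot: every CPU described by firmware */
	uint32_t enabled;    /* mirrors the _STA enabled bit, toggled by (un)plug */
	uint32_t registered; /* CPUs currently exposed to the user as devices */
};

/* At boot every described CPU is present; only some are enabled. */
static void model_boot(struct cpu_model *m, uint32_t enabled_at_boot)
{
	m->present = (1u << MODEL_NR_CPUS) - 1;
	m->enabled = enabled_at_boot & m->present;
	m->registered = m->enabled;
}

/* A (un)plug action toggles enabled and (un)hides the CPU device. */
static void model_set_enabled(struct cpu_model *m, unsigned int cpu, bool on)
{
	uint32_t bit;

	assert(cpu < MODEL_NR_CPUS);
	bit = 1u << cpu;

	if (on) {
		m->enabled |= bit;
		m->registered |= bit;
	} else {
		m->enabled &= ~bit;
		m->registered &= ~bit;
	}
	/* m->present is deliberately never modified after boot. */
}

/* True if the CPU is currently visible to the user in this model. */
static bool model_visible(const struct cpu_model *m, unsigned int cpu)
{
	return (m->registered >> cpu) & 1u;
}
```

In this model a CPU that is plugged and then unplugged goes from hidden to
visible and back, while the present mask stays at its boot-time value.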


Best regards
Salil.


>  
>  Hope that helps.
>  
>  Thanks,
>  
>          tglx


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 10:52     ` Jonathan Cameron
  2024-04-15 11:11       ` Jonathan Cameron
@ 2024-04-15 11:52       ` Rafael J. Wysocki
  1 sibling, 0 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-15 11:52 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Rafael J. Wysocki, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Russell King,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon, linuxarm, justin.he, jianyong.wu

On Mon, Apr 15, 2024 at 12:52 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Fri, 12 Apr 2024 20:30:40 +0200
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
>
> > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > From: James Morse <james.morse@arm.com>
> > >
> > > The arm64 specific arch_register_cpu() call may defer CPU registration
> > > until the ACPI interpreter is available and the _STA method can
> > > be evaluated.
> > >
> > > If this occurs, then a second attempt is made in
> > > acpi_processor_get_info(). Note that the arm64 specific call has
> > > not yet been added so for now this will never be successfully
> > > called.
> > >
> > > Systems can still be booted with 'acpi=off', or not include an
> > > ACPI description at all as in these cases arch_register_cpu()
> > > will not have deferred registration when first called.
> > >
> > > This moves the CPU register logic back to a subsys_initcall(),
> > > while the memory nodes will have been registered earlier.
> > > Note this is where the call was prior to the cleanup series so
> > > there should be no side effects of moving it back again for this
> > > specific case.
> > >
> > > [PATCH 00/21] Initial cleanups for vCPU HP.
> > > https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
> > >
> > > e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
> > >
> > > Signed-off-by: James Morse <james.morse@arm.com>
> > > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Signed-off-by: Joanthan Cameron <Jonathan.Cameron@huawei.com>
> > > ---
> > > v5: Update commit message to make it clear this is moving the
> > >     init back to where it was until very recently.
> > >
> > >     No longer change the condition in the earlier registration point
> > >     as that will be handled by the arm64 registration routine
> > >     deferring until called again here.
> > > ---
> > >  drivers/acpi/acpi_processor.c | 12 ++++++++++++
> > >  1 file changed, 12 insertions(+)
> > >
> > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > index 93e029403d05..c78398cdd060 100644
> > > --- a/drivers/acpi/acpi_processor.c
> > > +++ b/drivers/acpi/acpi_processor.c
> > > @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > >
> > >         c = &per_cpu(cpu_devices, pr->id);
> > >         ACPI_COMPANION_SET(&c->dev, device);
> > > +       /*
> > > +        * Register CPUs that are present. get_cpu_device() is used to skip
> > > +        * duplicate CPU descriptions from firmware.
> > > +        */
> > > +       if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> > > +           !get_cpu_device(pr->id)) {
> > > +               int ret = arch_register_cpu(pr->id);
> > > +
> > > +               if (ret)
> > > +                       return ret;
> > > +       }
> > > +
> > >         /*
> > >          *  Extra Processor objects may be enumerated on MP systems with
> > >          *  less than the max # of CPUs. They should be ignored _iff
> > > --
> >
> > I am still unsure why there need to be two paths calling
> > arch_register_cpu() in acpi_processor_get_info().
>
> I replied further down the thread, but the key point was to maintain
> the strong distinction between 'what' was done in a real hotplug
> path vs one where onlining was all.  We can relax that but it goes
> contrary to the careful dance that was needed to get any agreement
> to the ARM architecture aspects of this.

This seems to go beyond technical territory.

As a general rule, we tend to combine code paths that look similar
instead of making them separate on purpose.  Especially with little
to no explanation of the technical reason.

> >
> > Just below the comment partially pulled into the patch context above,
> > there is this code:
> >
> > if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> >          int ret = acpi_processor_hotadd_init(pr);
> >
> >         if (ret)
> >                 return ret;
> > }
> >
> > For the sake of the argument, fold acpi_processor_hotadd_init() into
> > it and drop the redundant _STA check from it:
>
> If we combine these, the _STA check is necessary because we will call this
> path for delayed onlining of ARM64 CPUs (if the earlier registration code
> call or arch_register_cpu() returned -EPROBE_DEFER). That's the only way
> we know that a given CPU is online capable but firmware is saying we can't
> bring it online yet (it may be vHP later).

Ignoring the above as per the other message.

> >
> > if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> >         if (invalid_phys_cpuid(pr->phys_id))
> >                 return -ENODEV;
> >
> >         cpu_maps_update_begin();
> >         cpus_write_lock();
> >
> >        ret = acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
>
> I read that call as
>         acpi_map_cpu_for_physical_cpu_hotplug()
> but we could make it equivalent of.
>         acpi_map_cpu_for_whatever_cpu_hotplug()

Yes.

> (I'm not proposing those names though ;)

Sure.

> in which case it is fine to just stub it out on ARM64.
> >        if (ret) {
> >                 cpus_write_unlock();
> >                 cpu_maps_update_done();
> >                 return ret;
> >        }
> >        ret = arch_register_cpu(pr->id);
> >        if (ret) {
> >                 acpi_unmap_cpu(pr->id);
> >
> >                 cpus_write_unlock();
> >                 cpu_maps_update_done();
> >                 return ret;
> >        }
> >       pr_info("CPU%d has been hot-added\n", pr->id);
> >       pr->flags.need_hotplug_init = 1;
> This one needs more careful handling because we are calling this
> for non hotplug cases on arm64 in which case we end up setting this
> for initially online CPUs - thus if we offline and online them
> again via sysfs /sys/bus/cpu/device/cpuX/online it goes through the
> hotplug path and should not.
>
> So I need a way to detect if we are hotplugging the cpu or not.
> Is there a standard way to do this?

If you mean physical hot-add, I don't think so.

What exactly do you need to do differently in the two cases?

>  I haven't figured out how to use flags in drivers to communicate this state.
>
> >
> >       cpus_write_unlock();
> >       cpu_maps_update_done();
> > }
> >
> > so I'm not sure why this cannot be combined with the new code.
> >
> > Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
> > What's the difference then?  The locking, which should be fine if I'm
> > not mistaken and need_hotplug_init that needs to be set if this code
> > runs after the processor driver has loaded AFAICS.
>
> That's the bit that I'm currently finding a challenge. Is there a clean
> way to detect that?

If you mean the need_hotplug_init flag, I'm currently struggling a bit
to convince myself that it is really needed.
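
For reference, the flattened path quoted above can be modelled in
userspace to check that both locks are released on every exit. This is a
rough sketch only: the model_*() functions are hypothetical stand-ins for
the kernel helpers named in the thread, the fail_* knobs and the -EBUSY
value are arbitrary test scaffolding, and the single "out" label is a
restructuring of the quoted code rather than what it does today. It leaves
need_hotplug_init out entirely, since that is the open question.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel helpers named in the thread. */
static int lock_depth;         /* number of model locks currently held */
static int fail_map, fail_reg; /* test knobs: force a step to fail */
static int mapped;             /* 1 while the model CPU is mapped */

static void model_maps_update_begin(void) { lock_depth++; }
static void model_cpus_write_lock(void)   { lock_depth++; }

static void model_cpus_write_unlock(void)
{
	assert(lock_depth > 0); /* must not unlock what is not held */
	lock_depth--;
}

static void model_maps_update_done(void)
{
	assert(lock_depth > 0);
	lock_depth--;
}

static int model_map_cpu(void)
{
	if (fail_map)
		return -ENODEV;
	mapped = 1;
	return 0;
}

static void model_unmap_cpu(void) { mapped = 0; }

static int model_register_cpu(void)
{
	return fail_reg ? -EBUSY : 0;
}

/* Flattened hot-add path with a single unlock sequence at the end. */
static int model_hotadd(void)
{
	int ret;

	model_maps_update_begin();
	model_cpus_write_lock();

	ret = model_map_cpu();
	if (ret)
		goto out;

	ret = model_register_cpu();
	if (ret)
		model_unmap_cpu(); /* undo the mapping if registration fails */
out:
	model_cpus_write_unlock();
	model_maps_update_done();
	return ret;
}
```

With an arch that stubs out the map/unmap step, the same shape would still
hold; only model_map_cpu() and model_unmap_cpu() would become no-ops.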

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 11:37               ` Rafael J. Wysocki
@ 2024-04-15 11:56                 ` Jonathan Cameron
  2024-04-15 12:04                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15 11:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Russell King (Oracle),
	linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Miguel Luis, James Morse,
	Salil Mehta, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	linuxarm, justin.he, jianyong.wu

On Mon, 15 Apr 2024 13:37:08 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Mon, Apr 15, 2024 at 10:46 AM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Sat, 13 Apr 2024 01:23:48 +0200
> > Thomas Gleixner <tglx@linutronix.de> wrote:
> >  
> > > Russell!
> > >
> > > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:  
> > > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:  
> > > >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > > >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > > >> > being taken - so I've no idea why the "make_present" case takes these
> > > >> > locks.  
> > > >>
> > > >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> > > >> boot must hold the appropriate write locks. Otherwise it would be
> > > >> possible to online a CPU which just got marked present, but the
> > > >> registration has not completed yet.  
> > > >
> > > > Yes. As far as I've been able to determine, arch_register_cpu()
> > > > doesn't manipulate any of the CPU masks. All it seems to be doing
> > > > is initialising the struct cpu, registering the embedded struct
> > > > device, and setting up the sysfs links to its NUMA node.
> > > >
> > > > There is nothing obvious in there which manipulates any CPU masks, and
> > > > this is rather my fundamental point when I said "I couldn't find
> > > > anything in arch_register_cpu() that depends on ...".
> > > >
> > > > If there is something, then comments in the code would be a useful aid
> > > > because it's highly non-obvious where such a manipulation is located,
> > > > and hence why the locks are necessary.  
> > >
> > > acpi_processor_hotadd_init()
> > > ...
> > >          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> > >
> > > That ends up in fiddling with cpu_present_mask.
> > >
> > > I grant you that arch_register_cpu() is not, but it might rely on the
> > > external locking too. I could not be bothered to figure that out.
> > >  
> > > >> Define "real hotplug" :)
> > > >>
> > > >> Real physical hotplug does not really exist. That's at least true for
> > > >> x86, where the physical hotplug support was chased for a while, but
> > > >> never ended up in production.
> > > >>
> > > >> Though virtualization happily jumped on it to hot add/remove CPUs
> > > >> to/from a guest.
> > > >>
> > > >> There are limitations to this and we learned it the hard way on X86. At
> > > >> the end we came up with the following restrictions:
> > > >>
> > > >>     1) All possible CPUs have to be advertised at boot time via firmware
> > > >>        (ACPI/DT/whatever) independent of them being present at boot time
> > > >>        or not.
> > > >>
> > > >>        That guarantees proper sizing and ensures that associations
> > > >>        between hardware entities and software representations and the
> > > >>        resulting topology are stable for the lifetime of a system.
> > > >>
> > > >>        It is really required to know the full topology of the system at
> > > >>        boot time especially with hybrid CPUs where some of the cores
> > > >>        have hyperthreading and the others do not.
> > > >>
> > > >>
> > > >>     2) Hot add can only mark an already registered (possible) CPU
> > > >>        present. Adding non-registered CPUs after boot is not possible.
> > > >>
> > > >>        The CPU must have been registered in #1 already to ensure that
> > > >>        the system topology does not suddenly change in an incompatible
> > > >>        way at run-time.
> > > >>
> > > >> The same restriction would apply to real physical hotplug. I don't think
> > > >> that's any different for ARM64 or any other architecture.  
> > > >
> > > > This makes me wonder whether the Arm64 has been barking up the wrong
> > > > tree then, and whether the whole "present" vs "enabled" thing comes
> > > > from a misunderstanding as far as a CPU goes.
> > > >
> > > > However, there is a big difference between the two. On x86, a processor
> > > > is just a processor. On Arm64, a "processor" is a slice of the system
> > > > (includes the interrupt controller, PMUs etc) and we must enumerate
> > > > those even when the processor itself is not enabled. This is the whole
> > > > reason there's a difference between "present" and "enabled" and why
> > > > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > > > The processor never actually goes away in arm64, it's just prevented
> > > > from being used.  
> > >
> > > It's the same on X86 at least in the physical world.  
> >
> > There were public calls on this via the Linaro Open Discussions group,
> > so I can talk a little about how we ended up here.  Note that (in my
> > opinion) there is zero chance of this changing - it took us well over
> > a year to get to this conclusion.  So if we ever want ARM vCPU HP
> > we need to work within these constraints.
> >
> > The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
> > specs etc, not the kernel maintainers) are determined that they want
> > to retain the option to do real physical CPU hotplug in the future
> > with all the necessary work around dynamic interrupt controller
> > initialization, debug and many other messy corners.  
> 
> That's OK, but the difference is not in the ACPI CPU enumeration/removal code.
> 
> > Thus anything defined had to be structured in a way that was 'different'
> > from that.  
> 
> Apparently, that's where things got confused.
> 
> > I don't mind the proposed flattening of the 2 paths if the ARM kernel
> > maintainers are fine with it but it will remove the distinctions and
> > we will need to be very careful with the CPU masks - we can't handle
> > them the same as x86 does.  
> 
> At the ACPI code level, there is no distinction.
> 
> A CPU that was not available before has just become available.  The
> platform firmware has notified the kernel about it and now
> acpi_processor_add() runs.  Why would it need to use different code
> paths depending on what _STA bits were clear before?

I think we will continue to disagree on this.  To my mind and from the
ACPI specification, they are two different state transitions with different
required actions.   Those state transitions are an ACPI level thing not
an arch level one.  However, I want a solution that moves things forwards
so I'll give pushing that entirely into the arch code a try.

> 
> Yes, there is some arch stuff to be called and that arch stuff should
> figure out what to do to make things actually work.
> 
> > I'll get on with doing that, but do need input from Will / Catalin / James.
> > There are some quirks that need calling out as it's not quite as simple
> > as it appears from a high level.
> >
> > Another part of that long discussion established that there is userspace
> > (Android IIRC) in which the CPU present mask must include all CPUs
> > at boot. To change that would be userspace ABI breakage so we can't
> > do that.  Hence the dance around adding yet another mask to allow the
> > OS to understand which CPUs are 'present' but not possible to online.
> >
> > Flattening the two paths removes any distinction between calls that
> > are for real hotplug and those that are for this online capable path.  
> 
> Which calls exactly do you mean?

At the moment the distinction does not exist (because x86 only supports
fake physical CPU HP and arm64 only vCPU HP / online capable), but if
the architecture is defined for arm64 physical hotplug in the future
we would need to do interrupt controller bring up + a lot of other stuff.

It may be possible to do that in the arch code - it will be hard to verify
that until that arch is defined.  Today all I need to do is ensure that
any attempt to do present bit setting for ARM64 returns an error.
That looks to be straightforward.


> 
> > As a side note, the indicating bit for these flows is defined in ACPI
> > for x86 from ACPI 6.3 as a flag in Processor Local APIC
> > (the ARM64 definition is a cut and paste of that text).  So someone
> > is interested in this distinction on x86. I can't say who but if
> > you have a mantis account you can easily follow the history and it
> > might be instructive to not everyone considering the current x86
> > flow the right way to do it.  
> 
> So a physically absent processor is different from a physically
> present processor that has not been enabled.  No doubt about this.
> 
> That said, I'm still unsure why these two cases require two different
> code paths in acpi_processor_add().

It might be possible to push the checking down into arch_register_cpu()
and have that for now reject any attempt to do physical CPU HP on arm64.
It is that gate that is vital to getting this accepted by ARM.
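
Roughly, the gate I have in mind could look like the userspace sketch
below. Treat it as an illustration under assumptions rather than the
actual arm64 code: struct model_sta and model_arch_register_cpu() are
made-up names, the choice of -ENODEV for the "not present" case is just a
placeholder for "an error", and EPROBE_DEFER is defined locally because
it is a kernel-internal errno that userspace headers do not provide.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal errno value; not exported by userspace <errno.h>. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Hypothetical firmware-reported state for one CPU (think _STA bits). */
struct model_sta {
	bool present;
	bool enabled;
};

/*
 * Hypothetical arm64-style gate:
 *  - not present: would be physical hotplug, rejected for now
 *  - present but not enabled: defer, it may be hot-added later
 *  - present and enabled: register
 */
static int model_arch_register_cpu(const struct model_sta *sta,
				   bool *registered)
{
	assert(sta && registered);

	if (!sta->present)
		return -ENODEV;       /* placeholder error, an assumption */
	if (!sta->enabled)
		return -EPROBE_DEFER; /* try again once firmware enables it */

	*registered = true;
	return 0;
}
```

The common ACPI code would then only need to propagate the return value,
with the physical vs. virtual decision kept inside the arch hook.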

I'm still very much stuck on the hotadd_init flag however, so any suggestions
on that would be very welcome! 

Jonathan




^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15  9:16               ` Jonathan Cameron
  2024-04-15  9:31                 ` Jonathan Cameron
@ 2024-04-15 11:57                 ` Jonathan Cameron
  1 sibling, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15 11:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Russell King (Oracle),
	Rafael J. Wysocki, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Mon, 15 Apr 2024 10:16:37 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Mon, 15 Apr 2024 09:45:52 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> 
> > On Sat, 13 Apr 2024 01:23:48 +0200
> > Thomas Gleixner <tglx@linutronix.de> wrote:
> >   
> > > Russell!
> > > 
> > > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:    
> > > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:      
> > > >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > > >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > > >> > being taken - so I've no idea why the "make_present" case takes these
> > > >> > locks.      
> > > >> 
> > > >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> > > >> boot must hold the appropriate write locks. Otherwise it would be
> > > >> possible to online a CPU which just got marked present, but the
> > > >> registration has not completed yet.      
> > > >
> > > > Yes. As far as I've been able to determine, arch_register_cpu()
> > > > doesn't manipulate any of the CPU masks. All it seems to be doing
> > > > is initialising the struct cpu, registering the embedded struct
> > > > device, and setting up the sysfs links to its NUMA node.
> > > >
> > > > There is nothing obvious in there which manipulates any CPU masks, and
> > > > this is rather my fundamental point when I said "I couldn't find
> > > > anything in arch_register_cpu() that depends on ...".
> > > >
> > > > If there is something, then comments in the code would be a useful aid
> > > > because it's highly non-obvious where such a manipulation is located,
> > > > and hence why the locks are necessary.      
> > > 
> > > acpi_processor_hotadd_init()
> > > ...
> > >          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> > > 
> > > That ends up in fiddling with cpu_present_mask.
> > > 
> > > I grant you that arch_register_cpu() is not, but it might rely on the
> > > external locking too. I could not be bothered to figure that out.
> > >     
> > > >> Define "real hotplug" :)
> > > >> 
> > > >> Real physical hotplug does not really exist. That's at least true for
> > > >> x86, where the physical hotplug support was chased for a while, but
> > > >> never ended up in production.
> > > >> 
> > > >> Though virtualization happily jumped on it to hot add/remove CPUs
> > > >> to/from a guest.
> > > >> 
> > > >> There are limitations to this and we learned it the hard way on X86. At
> > > >> the end we came up with the following restrictions:
> > > >> 
> > > >>     1) All possible CPUs have to be advertised at boot time via firmware
> > > >>        (ACPI/DT/whatever) independent of them being present at boot time
> > > >>        or not.
> > > >> 
> > > >>        That guarantees proper sizing and ensures that associations
> > > >>        between hardware entities and software representations and the
> > > >>        resulting topology are stable for the lifetime of a system.
> > > >> 
> > > >>        It is really required to know the full topology of the system at
> > > >>        boot time especially with hybrid CPUs where some of the cores
> > > >>        have hyperthreading and the others do not.
> > > >> 
> > > >> 
> > > >>     2) Hot add can only mark an already registered (possible) CPU
> > > >>        present. Adding non-registered CPUs after boot is not possible.
> > > >> 
> > > >>        The CPU must have been registered in #1 already to ensure that
> > > >>        the system topology does not suddenly change in an incompatible
> > > >>        way at run-time.
> > > >> 
> > > >> The same restriction would apply to real physical hotplug. I don't think
> > > >> that's any different for ARM64 or any other architecture.      
> > > >
> > > > This makes me wonder whether the Arm64 has been barking up the wrong
> > > > tree then, and whether the whole "present" vs "enabled" thing comes
> > > > from a misunderstanding as far as a CPU goes.
> > > >
> > > > However, there is a big difference between the two. On x86, a processor
> > > > is just a processor. On Arm64, a "processor" is a slice of the system
> > > > (includes the interrupt controller, PMUs etc) and we must enumerate
> > > > those even when the processor itself is not enabled. This is the whole
> > > > reason there's a difference between "present" and "enabled" and why
> > > > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > > > The processor never actually goes away in arm64, it's just prevented
> > > > from being used.      
> > > 
> > > It's the same on X86 at least in the physical world.    
> > 
> > There were public calls on this via the Linaro Open Discussions group,
> > so I can talk a little about how we ended up here.  Note that (in my
> > opinion) there is zero chance of this changing - it took us well over
> > a year to get to this conclusion.  So if we ever want ARM vCPU HP
> > we need to work within these constraints. 
> > 
> > The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
> > specs etc, not the kernel maintainers) are determined that they want
> > to retain the option to do real physical CPU hotplug in the future
> > with all the necessary work around dynamic interrupt controller
> > initialization, debug and many other messy corners.
> > 
> > Thus anything defined had to be structured in a way that was 'different'
> > from that.
> > 
> > I don't mind the proposed flattening of the 2 paths if the ARM kernel
> > maintainers are fine with it but it will remove the distinctions and
> > we will need to be very careful with the CPU masks - we can't handle
> > them the same as x86 does.
> > 
> > I'll get on with doing that, but do need input from Will / Catalin / James.
> > There are some quirks that need calling out as it's not quite as simple
> > as it appears from a high level.
> > 
> > Another part of that long discussion established that there is userspace
> > (Android IIRC) in which the CPU present mask must include all CPUs
> > at boot. To change that would be userspace ABI breakage so we can't
> > do that.  Hence the dance around adding yet another mask to allow the
> > OS to understand which CPUs are 'present' but not possible to online.
> > 
> > Flattening the two paths removes any distinction between calls that
> > are for real hotplug and those that are for this online capable path.
> > As a side note, the indicating bit for these flows is defined in ACPI
> > for x86 from ACPI 6.3 as a flag in Processor Local APIC
> > (the ARM64 definition is a cut and paste of that text).  So someone
> > is interested in this distinction on x86. I can't say who but if
> > you have a mantis account you can easily follow the history and it
> > might be instructive to not everyone considering the current x86
> > flow the right way to do it.  
> 
> Would a higher level check to catch that we are hitting undefined
> territory on arm64 be acceptable? That might satisfy the constraint
> that we should not have any software for arm64 that would run if
> physical CPU HP is added to the arch in future.  Something like:
> 
> @@ -331,6 +331,13 @@ static int acpi_processor_get_info(struct acpi_device *device)
> 
>         c = &per_cpu(cpu_devices, pr->id);
>         ACPI_COMPANION_SET(&c->dev, device);
> +
> +       if (!IS_ENABLED(CONFIG_ACPI_CPU_HOTPLUG_CPU) &&
> +           (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id))) {
> +               pr_err_once("Changing CPU present bit is not supported\n");
> +               return -ENODEV;
> +       }
> +
> 
> This is basically lifting the check out of the acpi_processor_make_present()
> call in this patch set.
> 
> With that in place before the new shared call I think we should be fine
> wrt the ARM Architecture requirements.

As discussed elsewhere in this thread, I'll push this into the arm64
specific arch_register_cpu() definition.
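
A minimal userspace sketch of what such a gate might look like, purely to
illustrate the check proposed in the hunk quoted above.  Everything prefixed
mock_ or MOCK_ is a hypothetical stand-in and not a kernel API; the real
arm64 arch_register_cpu() may well end up structured differently.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define MOCK_NR_CPUS 8	/* hypothetical limit, for this sketch only */

/*
 * Stand-in for cpu_present_mask.  It is fixed at "boot" and never
 * modified afterwards, which is the property the gate relies on.
 * CPUs 0-3 are present; the remaining entries default to false.
 */
static const bool mock_cpu_present[MOCK_NR_CPUS] = { true, true, true, true };

/* Stand-in for invalid_logical_cpuid(): reject ids outside the table. */
static bool mock_invalid_logical_cpuid(int cpu)
{
	return cpu < 0 || cpu >= MOCK_NR_CPUS;
}

/*
 * Stand-in for an arm64-specific arch_register_cpu().  Registering a CPU
 * that is not already present would require setting its present bit,
 * i.e. physical CPU hotplug, so that case is rejected.
 */
static int mock_arch_register_cpu(int cpu)
{
	if (mock_invalid_logical_cpuid(cpu) || !mock_cpu_present[cpu]) {
		fprintf(stderr, "Changing CPU present bit is not supported\n");
		return -ENODEV;
	}

	/* Real code would go on to register the CPU device here. */
	return 0;
}
```

In this sketch a call for an already present CPU (0 to 3) returns 0, while a
call for a CPU that is not present, or for an id outside the table, returns
-ENODEV and leaves the mask untouched.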

> 
> Jonathan
> 
> 
>         /*
> > 
> > Jonathan
> > 
> >   
> > > 
> > > Thanks,
> > > 
> > >         tglx
> > >     
> > 
> > 
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel  
> 
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 11:56                 ` Jonathan Cameron
@ 2024-04-15 12:04                   ` Rafael J. Wysocki
  2024-04-15 12:23                     ` Jonathan Cameron
  2024-04-15 12:37                     ` Salil Mehta
  0 siblings, 2 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-15 12:04 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Rafael J. Wysocki, Thomas Gleixner, Russell King (Oracle),
	linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Miguel Luis, James Morse,
	Salil Mehta, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	linuxarm, justin.he, jianyong.wu

On Mon, Apr 15, 2024 at 1:56 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 15 Apr 2024 13:37:08 +0200
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
>
> > On Mon, Apr 15, 2024 at 10:46 AM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > On Sat, 13 Apr 2024 01:23:48 +0200
> > > Thomas Gleixner <tglx@linutronix.de> wrote:
> > >
> > > > Russell!
> > > >
> > > > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:
> > > > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:
> > > > >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > > > >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > > > >> > being taken - so I've no idea why the "make_present" case takes these
> > > > >> > locks.
> > > > >>
> > > > >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> > > > >> boot must hold the appropriate write locks. Otherwise it would be
> > > > >> possible to online a CPU which just got marked present, but the
> > > > >> registration has not completed yet.
> > > > >
> > > > > Yes. As far as I've been able to determine, arch_register_cpu()
> > > > > doesn't manipulate any of the CPU masks. All it seems to be doing
> > > > > is initialising the struct cpu, registering the embedded struct
> > > > > device, and setting up the sysfs links to its NUMA node.
> > > > >
> > > > > There is nothing obvious in there which manipulates any CPU masks, and
> > > > > this is rather my fundamental point when I said "I couldn't find
> > > > > anything in arch_register_cpu() that depends on ...".
> > > > >
> > > > > If there is something, then comments in the code would be a useful aid
> > > > > because it's highly non-obvious where such a manipulation is located,
> > > > > and hence why the locks are necessary.
> > > >
> > > > acpi_processor_hotadd_init()
> > > > ...
> > > >          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> > > >
> > > > That ends up in fiddling with cpu_present_mask.
> > > >
> > > > I grant you that arch_register_cpu() is not, but it might rely on the
> > > > external locking too. I could not be bothered to figure that out.
> > > >
> > > > >> Define "real hotplug" :)
> > > > >>
> > > > >> Real physical hotplug does not really exist. That's at least true for
> > > > >> x86, where the physical hotplug support was chased for a while, but
> > > > >> never ended up in production.
> > > > >>
> > > > >> Though virtualization happily jumped on it to hot add/remove CPUs
> > > > >> to/from a guest.
> > > > >>
> > > > >> There are limitations to this and we learned it the hard way on X86. At
> > > > >> the end we came up with the following restrictions:
> > > > >>
> > > > >>     1) All possible CPUs have to be advertised at boot time via firmware
> > > > >>        (ACPI/DT/whatever) independent of them being present at boot time
> > > > >>        or not.
> > > > >>
> > > > >>        That guarantees proper sizing and ensures that associations
> > > > >>        between hardware entities and software representations and the
> > > > >>        resulting topology are stable for the lifetime of a system.
> > > > >>
> > > > >>        It is really required to know the full topology of the system at
> > > > >>        boot time especially with hybrid CPUs where some of the cores
> > > > >>        have hyperthreading and the others do not.
> > > > >>
> > > > >>
> > > > >>     2) Hot add can only mark an already registered (possible) CPU
> > > > >>        present. Adding non-registered CPUs after boot is not possible.
> > > > >>
> > > > >>        The CPU must have been registered in #1 already to ensure that
> > > > >>        the system topology does not suddenly change in an incompatible
> > > > >>        way at run-time.
> > > > >>
> > > > >> The same restriction would apply to real physical hotplug. I don't think
> > > > >> that's any different for ARM64 or any other architecture.
> > > > >
> > > > > This makes me wonder whether the Arm64 has been barking up the wrong
> > > > > tree then, and whether the whole "present" vs "enabled" thing comes
> > > > > from a misunderstanding as far as a CPU goes.
> > > > >
> > > > > However, there is a big difference between the two. On x86, a processor
> > > > > is just a processor. On Arm64, a "processor" is a slice of the system
> > > > > (includes the interrupt controller, PMUs etc) and we must enumerate
> > > > > those even when the processor itself is not enabled. This is the whole
> > > > > reason there's a difference between "present" and "enabled" and why
> > > > > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > > > > The processor never actually goes away in arm64, it's just prevented
> > > > > from being used.
> > > >
> > > > It's the same on X86 at least in the physical world.
> > >
> > > There were public calls on this via the Linaro Open Discussions group,
> > > so I can talk a little about how we ended up here.  Note that (in my
> > > opinion) there is zero chance of this changing - it took us well over
> > > a year to get to this conclusion.  So if we ever want ARM vCPU HP
> > > we need to work within these constraints.
> > >
> > > The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
> > > specs etc, not the kernel maintainers) are determined that they want
> > > to retain the option to do real physical CPU hotplug in the future
> > > with all the necessary work around dynamic interrupt controller
> > > initialization, debug and many other messy corners.
> >
> > That's OK, but the difference is not in the ACPI CPU enumeration/removal code.
> >
> > > Thus anything defined had to be structured in a way that was 'different'
> > > from that.
> >
> > Apparently, that's where things got confused.
> >
> > > I don't mind the proposed flattening of the 2 paths if the ARM kernel
> > > maintainers are fine with it but it will remove the distinctions and
> > > we will need to be very careful with the CPU masks - we can't handle
> > > them the same as x86 does.
> >
> > At the ACPI code level, there is no distinction.
> >
> > A CPU that was not available before has just become available.  The
> > platform firmware has notified the kernel about it and now
> > acpi_processor_add() runs.  Why would it need to use different code
> > paths depending on what _STA bits were clear before?
>
> I think we will continue to disagree on this.  To my mind and from the
> ACPI specification, they are two different state transitions with different
> required actions.

Well, please be specific: What exactly do you mean here and which
parts of the spec are you talking about?

> Those state transitions are an ACPI level thing not
> an arch level one.  However, I want a solution that moves things forwards
> so I'll give pushing that entirely into the arch code a try.

Thanks!

Though I think that there is a disconnect between us that needs to be
clarified first.

> >
> > Yes, there is some arch stuff to be called and that arch stuff should
> > figure out what to do to make things actually work.
> >
> > > I'll get on with doing that, but do need input from Will / Catalin / James.
> > > There are some quirks that need calling out as it's not quite as simple
> > > as it appears from a high level.
> > >
> > > Another part of that long discussion established that there is userspace
> > > (Android IIRC) in which the CPU present mask must include all CPUs
> > > at boot. To change that would be userspace ABI breakage so we can't
> > > do that.  Hence the dance around adding yet another mask to allow the
> > > OS to understand which CPUs are 'present' but not possible to online.
> > >
> > > Flattening the two paths removes any distinction between calls that
> > > are for real hotplug and those that are for this online capable path.
> >
> > Which calls exactly do you mean?
>
> At the moment the distinction does not exist (because x86 only supports
> fake physical CPU HP and arm64 only vCPU HP / online capable), but if
> the architecture is defined for arm64 physical hotplug in the future
> we would need to do interrupt controller bring up + a lot of other stuff.
>
> It may be possible to do that in the arch code - will be hard to verify
> that until that arch is defined.  Today all I need to do is ensure that
> any attempt to do present bit setting for ARM64 returns an error.
> That looks to be straightforward.

OK

>
> >
> > > As a side note, the indicating bit for these flows is defined in ACPI
> > > for x86 from ACPI 6.3 as a flag in Processor Local APIC
> > > (the ARM64 definition is a cut and paste of that text).  So someone
> > > is interested in this distinction on x86. I can't say who but if
> > > you have a mantis account you can easily follow the history and it
> > > might be instructive to not everyone considering the current x86
> > > flow the right way to do it.
> >
> > So a physically absent processor is different from a physically
> > present processor that has not been disabled.  No doubt about this.
> >
> > That said, I'm still unsure why these two cases require two different
> > code paths in acpi_processor_add().
>
> It might be possible to push the checking down into arch_register_cpu()
> and have that for now reject any attempt to do physical CPU HP on arm64.
> It is that gate that is vital to getting this accepted by ARM.
>
> I'm still very much stuck on the hotadd_init flag however, so any suggestions
> on that would be very welcome!

I need to do some investigation which will take some time I suppose.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 12:04                   ` Rafael J. Wysocki
@ 2024-04-15 12:23                     ` Jonathan Cameron
  2024-04-16 17:41                       ` Jonathan Cameron
  2024-04-15 12:37                     ` Salil Mehta
  1 sibling, 1 reply; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15 12:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Russell King (Oracle),
	linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Miguel Luis, James Morse,
	Salil Mehta, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	linuxarm, justin.he, jianyong.wu

On Mon, 15 Apr 2024 14:04:26 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Mon, Apr 15, 2024 at 1:56 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Mon, 15 Apr 2024 13:37:08 +0200
> > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> >  
> > > On Mon, Apr 15, 2024 at 10:46 AM Jonathan Cameron
> > > <Jonathan.Cameron@huawei.com> wrote:  
> > > >
> > > > On Sat, 13 Apr 2024 01:23:48 +0200
> > > > Thomas Gleixner <tglx@linutronix.de> wrote:
> > > >  
> > > > > Russell!
> > > > >
> > > > > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:  
> > > > > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:  
> > > > > >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > > > > >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > > > > >> > being taken - so I've no idea why the "make_present" case takes these
> > > > > >> > locks.  
> > > > > >>
> > > > > >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> > > > > >> boot must hold the appropriate write locks. Otherwise it would be
> > > > > >> possible to online a CPU which just got marked present, but the
> > > > > >> registration has not completed yet.  
> > > > > >
> > > > > > Yes. As far as I've been able to determine, arch_register_cpu()
> > > > > > doesn't manipulate any of the CPU masks. All it seems to be doing
> > > > > > is initialising the struct cpu, registering the embedded struct
> > > > > > device, and setting up the sysfs links to its NUMA node.
> > > > > >
> > > > > > There is nothing obvious in there which manipulates any CPU masks, and
> > > > > > this is rather my fundamental point when I said "I couldn't find
> > > > > > anything in arch_register_cpu() that depends on ...".
> > > > > >
> > > > > > If there is something, then comments in the code would be a useful aid
> > > > > > because it's highly non-obvious where such a manipulation is located,
> > > > > > and hence why the locks are necessary.  
> > > > >
> > > > > acpi_processor_hotadd_init()
> > > > > ...
> > > > >          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> > > > >
> > > > > That ends up in fiddling with cpu_present_mask.
> > > > >
> > > > > I grant you that arch_register_cpu() is not, but it might rely on the
> > > > > external locking too. I could not be bothered to figure that out.
> > > > >  
> > > > > >> Define "real hotplug" :)
> > > > > >>
> > > > > >> Real physical hotplug does not really exist. That's at least true for
> > > > > >> x86, where the physical hotplug support was chased for a while, but
> > > > > >> never ended up in production.
> > > > > >>
> > > > > >> Though virtualization happily jumped on it to hot add/remove CPUs
> > > > > >> to/from a guest.
> > > > > >>
> > > > > >> There are limitations to this and we learned it the hard way on X86. At
> > > > > >> the end we came up with the following restrictions:
> > > > > >>
> > > > > >>     1) All possible CPUs have to be advertised at boot time via firmware
> > > > > >>        (ACPI/DT/whatever) independent of them being present at boot time
> > > > > >>        or not.
> > > > > >>
> > > > > >>        That guarantees proper sizing and ensures that associations
> > > > > >>        between hardware entities and software representations and the
> > > > > >>        resulting topology are stable for the lifetime of a system.
> > > > > >>
> > > > > >>        It is really required to know the full topology of the system at
> > > > > >>        boot time especially with hybrid CPUs where some of the cores
> > > > > >>        have hyperthreading and the others do not.
> > > > > >>
> > > > > >>
> > > > > >>     2) Hot add can only mark an already registered (possible) CPU
> > > > > >>        present. Adding non-registered CPUs after boot is not possible.
> > > > > >>
> > > > > >>        The CPU must have been registered in #1 already to ensure that
> > > > > >>        the system topology does not suddenly change in an incompatible
> > > > > >>        way at run-time.
> > > > > >>
> > > > > >> The same restriction would apply to real physical hotplug. I don't think
> > > > > >> that's any different for ARM64 or any other architecture.  
> > > > > >
> > > > > > This makes me wonder whether the Arm64 has been barking up the wrong
> > > > > > tree then, and whether the whole "present" vs "enabled" thing comes
> > > > > > from a misunderstanding as far as a CPU goes.
> > > > > >
> > > > > > However, there is a big difference between the two. On x86, a processor
> > > > > > is just a processor. On Arm64, a "processor" is a slice of the system
> > > > > > (includes the interrupt controller, PMUs etc) and we must enumerate
> > > > > > those even when the processor itself is not enabled. This is the whole
> > > > > > reason there's a difference between "present" and "enabled" and why
> > > > > > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > > > > > The processor never actually goes away in arm64, it's just prevented
> > > > > > from being used.  
> > > > >
> > > > > It's the same on X86 at least in the physical world.  
> > > >
> > > > There were public calls on this via the Linaro Open Discussions group,
> > > > so I can talk a little about how we ended up here.  Note that (in my
> > > > opinion) there is zero chance of this changing - it took us well over
> > > > a year to get to this conclusion.  So if we ever want ARM vCPU HP
> > > > we need to work within these constraints.
> > > >
> > > > The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
> > > > specs etc, not the kernel maintainers) are determined that they want
> > > > to retain the option to do real physical CPU hotplug in the future
> > > > with all the necessary work around dynamic interrupt controller
> > > > initialization, debug and many other messy corners.  
> > >
> > > That's OK, but the difference is not in the ACPI CPU enumeration/removal code.
> > >  
> > > > Thus anything defined had to be structured in a way that was 'different'
> > > > from that.  
> > >
> > > Apparently, that's where things got confused.
> > >  
> > > > I don't mind the proposed flattening of the 2 paths if the ARM kernel
> > > > maintainers are fine with it but it will remove the distinctions and
> > > > we will need to be very careful with the CPU masks - we can't handle
> > > > them the same as x86 does.  
> > >
> > > At the ACPI code level, there is no distinction.
> > >
> > > A CPU that was not available before has just become available.  The
> > > platform firmware has notified the kernel about it and now
> > > acpi_processor_add() runs.  Why would it need to use different code
> > > paths depending on what _STA bits were clear before?  
> >
> > I think we will continue to disagree on this.  To my mind and from the
> > ACPI specification, they are two different state transitions with different
> > required actions.  
> 
> Well, please be specific: What exactly do you mean here and which
> parts of the spec are you talking about?

Given we are moving on with your suggestion, let's leave this for now - too many
other things to do! :)

> 
> > Those state transitions are an ACPI level thing not
> > an arch level one.  However, I want a solution that moves things forwards
> > so I'll give pushing that entirely into the arch code a try.  
> 
> Thanks!
> 
> Though I think that there is a disconnect between us that needs to be
> clarified first.

I'm fine with accepting your approach if it works and is acceptable
to the arm kernel folk. They are getting a non-trivial arch_register_cpu()
with a bunch of ACPI specific handling in it that may come as a surprise.

> 
> > >
> > > Yes, there is some arch stuff to be called and that arch stuff should
> > > figure out what to do to make things actually work.
> > >  
> > > > I'll get on with doing that, but do need input from Will / Catalin / James.
> > > > There are some quirks that need calling out as it's not quite as simple
> > > > as it appears from a high level.
> > > >
> > > > Another part of that long discussion established that there is userspace
> > > > (Android IIRC) in which the CPU present mask must include all CPUs
> > > > at boot. To change that would be userspace ABI breakage so we can't
> > > > do that.  Hence the dance around adding yet another mask to allow the
> > > > OS to understand which CPUs are 'present' but not possible to online.
> > > >
> > > > Flattening the two paths removes any distinction between calls that
> > > > are for real hotplug and those that are for this online capable path.  
> > >
> > > Which calls exactly do you mean?  
> >
> > At the moment the distinction does not exist (because x86 only supports
> > fake physical CPU HP and arm64 only vCPU HP / online capable), but if
> > the architecture is defined for arm64 physical hotplug in the future
> > we would need to do interrupt controller bring up + a lot of other stuff.
> >
> > It may be possible to do that in the arch code - will be hard to verify
> > that until that arch is defined.  Today all I need to do is ensure that
> > any attempt to do present bit setting for ARM64 returns an error.
> > That looks to be straightforward.
> 
> OK
> 
> >  
> > >  
> > > > As a side note, the indicating bit for these flows is defined in ACPI
> > > > for x86 from ACPI 6.3 as a flag in Processor Local APIC
> > > > (the ARM64 definition is a cut and paste of that text).  So someone
> > > > is interested in this distinction on x86. I can't say who but if
> > > > you have a mantis account you can easily follow the history and it
> > > > might be instructive to not everyone considering the current x86
> > > > flow the right way to do it.  
> > >
> > > So a physically absent processor is different from a physically
> > > present processor that has not been disabled.  No doubt about this.
> > >
> > > That said, I'm still unsure why these two cases require two different
> > > code paths in acpi_processor_add().  
> >
> > It might be possible to push the checking down into arch_register_cpu()
> > and have that for now reject any attempt to do physical CPU HP on arm64.
> > It is that gate that is vital to getting this accepted by ARM.
> >
> > I'm still very much stuck on the hotadd_init flag however, so any suggestions
> > on that would be very welcome!  
> 
> I need to do some investigation which will take some time I suppose.

I'll do so as well once I've gotten the rest sorted out.  That whole
structure seems overly complex and liable to race, though maybe sufficient
locking happens to be held that it's not a problem.

Jonathan




^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 12:04                   ` Rafael J. Wysocki
  2024-04-15 12:23                     ` Jonathan Cameron
@ 2024-04-15 12:37                     ` Salil Mehta
  2024-04-15 12:41                       ` Rafael J. Wysocki
  1 sibling, 1 reply; 58+ messages in thread
From: Salil Mehta @ 2024-04-15 12:37 UTC (permalink / raw)
  To: Rafael J. Wysocki, Jonathan Cameron
  Cc: Thomas Gleixner, Russell King (Oracle),
	linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Miguel Luis, James Morse,
	Jean-Philippe Brucker, Catalin Marinas, Will Deacon, Linuxarm,
	justin.he, jianyong.wu

Hi Rafael,

>  From: Rafael J. Wysocki <rafael@kernel.org>
>  Sent: Monday, April 15, 2024 1:04 PM
>  
>  On Mon, Apr 15, 2024 at 1:56 PM Jonathan Cameron
>  <Jonathan.Cameron@huawei.com> wrote:
>  >
>  > On Mon, 15 Apr 2024 13:37:08 +0200
>  > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
>  >
>  > > On Mon, Apr 15, 2024 at 10:46 AM Jonathan Cameron
>  > > <Jonathan.Cameron@huawei.com> wrote:
>  > > >
>  > > > On Sat, 13 Apr 2024 01:23:48 +0200 Thomas Gleixner
>  > > > <tglx@linutronix.de> wrote:
>  > > >
>  > > > > Russell!
>  > > > >
>  > > > > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:
>  > > > > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:
>  > > > > >> > As for the cpu locking, I couldn't find anything in
>  > > > > >> > arch_register_cpu() that depends on the cpu_maps_update
>  > > > > >> > stuff nor needs the cpus_write_lock being taken - so I've
>  > > > > >> > no idea why the "make_present" case takes these locks.
>  > > > > >>
>  > > > > >> Anything which updates a CPU mask, e.g. cpu_present_mask,
>  > > > > >> after early boot must hold the appropriate write locks.
>  > > > > >> Otherwise it would be possible to online a CPU which just got
>  > > > > >> marked present, but the registration has not completed yet.
>  > > > > >
>  > > > > > Yes. As far as I've been able to determine,
>  > > > > > arch_register_cpu() doesn't manipulate any of the CPU masks.
>  > > > > > All it seems to be doing is initialising the struct cpu,
>  > > > > > registering the embedded struct device, and setting up the
>  > > > > > sysfs links to its NUMA node.
>  > > > > >
>  > > > > > There is nothing obvious in there which manipulates any CPU
>  > > > > > masks, and this is rather my fundamental point when I said "I
>  > > > > > couldn't find anything in arch_register_cpu() that depends on ...".
>  > > > > >
>  > > > > > If there is something, then comments in the code would be a
>  > > > > > useful aid because it's highly non-obvious where such a
>  > > > > > manipulation is located, and hence why the locks are necessary.
>  > > > >
>  > > > > acpi_processor_hotadd_init()
>  > > > > ...
>  > > > >          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id,
>  > > > > &pr->id);
>  > > > >
>  > > > > That ends up in fiddling with cpu_present_mask.
>  > > > >
>  > > > > I grant you that arch_register_cpu() is not, but it might rely
>  > > > > on the external locking too. I could not be bothered to figure that  out.
>  > > > >
>  > > > > >> Define "real hotplug" :)
>  > > > > >>
>  > > > > >> Real physical hotplug does not really exist. That's at least
>  > > > > >> true for x86, where the physical hotplug support was chased
>  > > > > >> for a while, but never ended up in production.
>  > > > > >>
>  > > > > >> Though virtualization happily jumped on it to hot add/remove
>  > > > > >> CPUs to/from a guest.
>  > > > > >>
>  > > > > >> There are limitations to this and we learned it the hard way
>  > > > > >> on X86. At the end we came up with the following restrictions:
>  > > > > >>
>  > > > > >>     1) All possible CPUs have to be advertised at boot time via firmware
>  > > > > >>        (ACPI/DT/whatever) independent of them being present at boot time
>  > > > > >>        or not.
>  > > > > >>
>  > > > > >>        That guarantees proper sizing and ensures that associations
>  > > > > >>        between hardware entities and software representations and the
>  > > > > >>        resulting topology are stable for the lifetime of a system.
>  > > > > >>
>  > > > > >>        It is really required to know the full topology of the system at
>  > > > > >>        boot time especially with hybrid CPUs where some of the cores
>  > > > > >>        have hyperthreading and the others do not.
>  > > > > >>
>  > > > > >>
>  > > > > >>     2) Hot add can only mark an already registered (possible) CPU
>  > > > > >>        present. Adding non-registered CPUs after boot is not  possible.
>  > > > > >>
>  > > > > >>        The CPU must have been registered in #1 already to ensure  that
>  > > > > >>        the system topology does not suddenly change in an incompatible
>  > > > > >>        way at run-time.
>  > > > > >>
>  > > > > >> The same restriction would apply to real physical hotplug. I
>  > > > > >> don't think that's any different for ARM64 or any other architecture.
>  > > > > >
>  > > > > > This makes me wonder whether the Arm64 has been barking up the
>  > > > > > wrong tree then, and whether the whole "present" vs "enabled"
>  > > > > > thing comes from a misunderstanding as far as a CPU goes.
>  > > > > >
>  > > > > > However, there is a big difference between the two. On x86, a
>  > > > > > processor is just a processor. On Arm64, a "processor" is a
>  > > > > > slice of the system (includes the interrupt controller, PMUs
>  > > > > > etc) and we must enumerate those even when the processor
>  > > > > > itself is not enabled. This is the whole reason there's a
>  > > > > > difference between "present" and "enabled" and why there's a
>  > > > > > difference between x86 cpu hotplug and arm64 cpu hotplug.
>  > > > > > The processor never actually goes away in arm64, it's just
>  > > > > > prevented from being used.
>  > > > >
>  > > > > It's the same on X86 at least in the physical world.
>  > > >
>  > > > There were public calls on this via the Linaro Open Discussions
>  > > > group, so I can talk a little about how we ended up here.  Note
>  > > > that (in my
>  > > > opinion) there is zero chance of this changing - it took us well
>  > > > over a year to get to this conclusion.  So if we ever want ARM
>  > > > vCPU HP we need to work within these constraints.
>  > > >
>  > > > The ARM architecture folk (the ones defining the ARM ARM, relevant
>  > > > ACPI specs etc, not the kernel maintainers) are determined that
>  > > > they want to retain the option to do real physical CPU hotplug in
>  > > > the future with all the necessary work around dynamic interrupt
>  > > > controller initialization, debug and many other messy corners.
>  > >
>  > > That's OK, but the difference is not in the ACPI CPU enumeration/removal code.
>  > >
>  > > > Thus anything defined had to be structured in a way that was  'different'
>  > > > from that.
>  > >
>  > > Apparently, that's where things got confused.
>  > >
>  > > > I don't mind the proposed flattening of the 2 paths if the ARM
>  > > > kernel maintainers are fine with it but it will remove the
>  > > > distinctions and we will need to be very careful with the CPU
>  > > > masks - we can't handle them the same as x86 does.
>  > >
>  > > At the ACPI code level, there is no distinction.
>  > >
>  > > A CPU that was not available before has just become available.  The
>  > > platform firmware has notified the kernel about it and now
>  > > acpi_processor_add() runs.  Why would it need to use different code
>  > > paths depending on what _STA bits were clear before?
>  >
>  > I think we will continue to disagree on this.  To my mind and from the
>  > ACPI specification, they are two different state transitions with
>  > different required actions.
>  
>  Well, please be specific: What exactly do you mean here and which parts of
>  the spec are you talking about?
>  
>  > Those state transitions are an ACPI level thing not an arch level one.
>  > However, I want a solution that moves things forwards so I'll give
>  > pushing that entirely into the arch code a try.
>  
>  Thanks!
>  
>  Though I think that there is a disconnect between us that needs to be
>  clarified first.
>  
>  > >
>  > > Yes, there is some arch stuff to be called and that arch stuff
>  > > should figure out what to do to make things actually work.
>  > >
>  > > > I'll get on with doing that, but do need input from Will / Catalin / James.
>  > > > There are some quirks that need calling out as it's not quite as
>  > > > simple as it appears from a high level.
>  > > >
>  > > > Another part of that long discussion established that there is
>  > > > userspace (Android IIRC) in which the CPU present mask must
>  > > > include all CPUs at boot. To change that would be userspace ABI
>  > > > breakage so we can't do that.  Hence the dance around adding yet
>  > > > another mask to allow the OS to understand which CPUs are 'present'
>  > > > but not possible to online.
>  > > >
>  > > > Flattening the two paths removes any distinction between calls
>  > > > that are for real hotplug and those that are for this online capable path.
>  > >
>  > > Which calls exactly do you mean?
>  >
>  > At the moment the distinction does not exist (because x86 only supports
>  > fake physical CPU HP and arm64 only vCPU HP / online capable), but if
>  > the architecture is defined for arm64 physical hotplug in the future
>  > we would need to do interrupt controller bring up + a lot of other stuff.
>  >
>  > It may be possible to do that in the arch code - will be hard to
>  > verify that until that arch is defined.  Today all I need to do is
>  > ensure that any attempt to do present bit setting for ARM64 returns an error.
>  > That looks to be straightforward.
>  
>  OK
>  
>  >
>  > >
>  > > > As a side note, the indicating bit for these flows is defined in
>  > > > ACPI for x86 from ACPI 6.3 as a flag in Processor Local APIC (the
>  > > > ARM64 definition is a cut and paste of that text).  So someone is
>  > > > interested in this distinction on x86. I can't say who but if you
>  > > > have a mantis account you can easily follow the history and it
>  > > > might be instructive to not everyone considering the current x86
>  > > > flow the right way to do it.
>  > >
>  > > So a physically absent processor is different from a physically
>  > > present processor that has not been disabled.  No doubt about this.
>  > >
>  > > That said, I'm still unsure why these two cases require two
>  > > different code paths in acpi_processor_add().
>  >
>  > It might be possible to push the checking down into
>  > arch_register_cpu() and have that for now reject any attempt to do
>  > physical CPU HP on arm64.
>  > It is that gate that is vital to getting this accepted by ARM.
>  >
>  > I'm still very much stuck on the hotadd_init flag however, so any
>  > suggestions on that would be very welcome!
>  
>  I need to do some investigation which will take some time I suppose.


You might find the cover letter and presentation links below useful:

1. https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.mehta@huawei.com/
2. https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
3. https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
4. https://sched.co/eE4m


Best regards
Salil.



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 12:37                     ` Salil Mehta
@ 2024-04-15 12:41                       ` Rafael J. Wysocki
  0 siblings, 0 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-15 12:41 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Rafael J. Wysocki, Jonathan Cameron, Thomas Gleixner,
	Russell King (Oracle),
	linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Miguel Luis, James Morse,
	Jean-Philippe Brucker, Catalin Marinas, Will Deacon, Linuxarm,
	justin.he, jianyong.wu

On Mon, Apr 15, 2024 at 2:37 PM Salil Mehta <salil.mehta@huawei.com> wrote:
>
> Hi Rafael,
>
> >  From: Rafael J. Wysocki <rafael@kernel.org>
> >  Sent: Monday, April 15, 2024 1:04 PM
> >

[cut]

> >
> >  I need to do some investigation which will take some time I suppose.
>
>
> You might find the cover letter and presentation links below useful:
>
> 1. https://lore.kernel.org/qemu-devel/20230926100436.28284-1-salil.mehta@huawei.com/
> 2. https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
> 3. https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
> 4. https://sched.co/eE4m

Thanks, I'll go through this, but I kind of doubt if it helps me with
finding out what to do with the hotadd_init flag.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 11:51         ` Salil Mehta
@ 2024-04-15 12:51           ` Rafael J. Wysocki
  2024-04-15 15:31             ` Salil Mehta
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-15 12:51 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Thomas Gleixner, Russell King (Oracle),
	Rafael J. Wysocki, Jonathan Cameron, linux-pm, loongarch,
	linux-acpi, linux-arch, linux-kernel, linux-arm-kernel, kvmarm,
	x86, Miguel Luis, James Morse, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon, Linuxarm, justin.he, jianyong.wu

On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta <salil.mehta@huawei.com> wrote:
>
> Hello,
>
> >  From: Thomas Gleixner <tglx@linutronix.de>
> >  Sent: Friday, April 12, 2024 9:55 PM
> >
> >  On Fri, Apr 12 2024 at 21:16, Russell King (Oracle) wrote:
> >  > On Fri, Apr 12, 2024 at 08:30:40PM +0200, Rafael J. Wysocki wrote:
> >  >> Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
> >  >> What's the difference then?  The locking, which should be fine if I'm
> >  >> not mistaken and need_hotplug_init that needs to be set if this code
> >  >> runs after the processor driver has loaded AFAICS.
> >  >
> >  > It is over this that I walked away from progressing this code, because
> >  > I don't think it's quite as simple as you make it out to be.
> >  >
> >  > Yes, acpi_map_cpu() and acpi_unmap_cpu() are already arch implemented
> >  > functions, so Arm64 can easily provide stubs for these that do nothing.
> >  > That never caused me any concern.
> >  >
> >  > What does cause me great concern though are the finer details. For
> >  > example, above you seem to drop the evaluation of _STA for the
> >  > "make_present" case - I've no idea whether that is something that
> >  > should be deleted or not (if it is something that can be deleted, then
> >  > why not delete it now?)
> >  >
> >  > As for the cpu locking, I couldn't find anything in
> >  > arch_register_cpu() that depends on the cpu_maps_update stuff nor
> >  > needs the cpus_write_lock being taken - so I've no idea why the
> >  > "make_present" case takes these locks.
> >
> >  Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> >  boot must hold the appropriate write locks. Otherwise it would be possible
> >  to online a CPU which just got marked present, but the registration has not
> >  completed yet.
> >
> >  > Finally, the "pr->flags.need_hotplug_init = 1" thing... it's not
> >  > obvious that this is required - remember that with Arm64's "enabled"
> >  > toggling, the "processor" is a slice of the system and doesn't
> >  > actually go away - it's just "not enabled" for use.
> >  >
> >  > Again, as "processors" in Arm64 are slices of the system, they have to
> >  > be fully described in ACPI before the OS boots, and they will be
> >  > marked as being "present", which means they will be enumerated, and
> >  > the driver will be probed. Any processor that is not to be used will
> >  > not have its enabled bit set. It is my understanding that every
> >  > processor will result in the ACPI processor driver being bound to it
> >  > whether its enabled or not.
> >  >
> >  > The difference between real hotplug and Arm64 hotplug is that real
> >  > hotplug makes stuff not-present (and thus unenumerable). Arm64 hotplug
> >  > makes stuff not-enabled which is still enumerable.
> >
> >  Define "real hotplug" :)
> >
> >  Real physical hotplug does not really exist. That's at least true for x86, where
> >  the physical hotplug support was chased for a while, but never ended up in
> >  production.
> >
> >  Though virtualization happily jumped on it to hot add/remove CPUs to/from
> >  a guest.
> >
> >  There are limitations to this and we learned it the hard way on X86. At the
> >  end we came up with the following restrictions:
> >
> >      1) All possible CPUs have to be advertised at boot time via firmware
> >         (ACPI/DT/whatever) independent of them being present at boot time
> >         or not.
> >
> >         That guarantees proper sizing and ensures that associations
> >         between hardware entities and software representations and the
> >         resulting topology are stable for the lifetime of a system.
> >
> >         It is really required to know the full topology of the system at
> >         boot time especially with hybrid CPUs where some of the cores
> >         have hyperthreading and the others do not.
> >
> >
> >      2) Hot add can only mark an already registered (possible) CPU
> >         present. Adding non-registered CPUs after boot is not possible.
> >
> >         The CPU must have been registered in #1 already to ensure that
> >         the system topology does not suddenly change in an incompatible
> >         way at run-time.
> >
> >  The same restriction would apply to real physical hotplug. I don't think that's
> >  any different for ARM64 or any other architecture.
>
>
> There is a difference:
>
> 1.   ARM arch does not allow any processor to be NOT present. Hence, because of
> this restriction, any of its related per-cpu components must be present and enumerated
> at boot time as well (exposed by firmware and ACPI). This means all the enumerated
> processors will be marked as 'present' but they might exist in NOT enabled (_STA.enabled=0)
> state.
>
> There was one clear difference, and please correct me if I'm wrong here: for x86, the LAPIC
> associated with the x86 core can be brought online later, even after boot?
>
> But for ARM Arch, processors and their corresponding per-cpu components like redistributors
> all need to be present and enumerated during boot time. Redistributors are part of
> the ALWAYS-ON power domain.

OK

So what exactly is the problem with this and what does
acpi_processor_add() have to do with it?

Do you want the per-CPU structures etc. to be created from the
acpi_processor_add() path?

This plain won't work because acpi_processor_add(), as defined in the
mainline kernel today (and Jonathan's patches don't change that
AFAICS), returns an error for processor devices with the "enabled" bit
clear in _STA (it returns an error if the "present" bit is clear too,
but that's obvious), so it only gets to calling arch_register_cpu() if
*both* "present" and "enabled" _STA bits are set for the given
processor device.
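
To make the condition above concrete, here is a minimal standalone sketch in
plain C. It is not the kernel code: the helper and macro names are made up for
illustration, and only the two _STA bit positions (bit 0 for "present", bit 1
for "enabled") are taken from the ACPI specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* _STA bit values from the ACPI specification (bit 0 and bit 1). */
#define SKETCH_STA_PRESENT 0x01u
#define SKETCH_STA_ENABLED 0x02u

/*
 * Hypothetical helper: true only when both bits are set, which is the
 * condition described above for getting as far as arch_register_cpu().
 */
static bool sketch_sta_allows_register(uint32_t sta)
{
	const uint32_t required = SKETCH_STA_PRESENT | SKETCH_STA_ENABLED;

	return (sta & required) == required;
}
```

With this check, a processor that is "present" but not "enabled" is rejected
in exactly the same way as one that is not present at all.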

That, BTW, is why I keep saying that from the ACPI CPU enumeration
code perspective, there is no difference between "present" and
"enabled".

> 2.  Agreed regarding the topology. Are you suggesting that we must call arch_register_cpu()
> during boot time for all the 'present' CPUs? Even if that's the case, we might still want to defer
> registration of the cpu device (register_cpu() API) with the Linux device model. The latter is what
> we are doing to hide/unhide the CPUs from the user while the _STA.Enabled bit is toggled due to
> a CPU (un)plug action.

There are two ways to approach this IMV, and both seem to be valid in principle.

One would be to treat CPUs with the "enabled" bit clear as not present
and create all of the requisite data structures for them when they
become available (in analogy with the "real hot-add" case).

The alternative one is to create all of the requisite data structures
for the CPUs that you find during boot, but register CPU devices for
those having the "enabled" _STA bit set.
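
A rough standalone sketch of this alternative, assuming the _STA-based
rejection described earlier in this message, could look as follows. Every name
here is hypothetical and nothing below is an existing kernel interface; it
only models the ordering: data structures first for every CPU found, device
registration only once both _STA bits are set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state; none of these names exist in the kernel. */
struct sketch_cpu {
	bool sta_present;	/* _STA "present" bit */
	bool sta_enabled;	/* _STA "enabled" bit */
	bool data_ready;	/* arch data structures have been set up */
	bool registered;	/* CPU device has been registered */
};

/* Stand-in for the add path: reject unless both _STA bits are set. */
static int sketch_processor_add(struct sketch_cpu *cpu)
{
	if (!cpu->sta_present || !cpu->sta_enabled)
		return -1;
	if (!cpu->data_ready)
		return -1;	/* arch code did not prepare this CPU at boot */
	cpu->registered = true;
	return 0;
}

/*
 * Boot-time pass: set up data for every CPU found, then try to register
 * each one. Returns the number of CPUs that were registered.
 */
static size_t sketch_boot(struct sketch_cpu *cpus, size_t n)
{
	size_t registered = 0;

	for (size_t i = 0; i < n; i++) {
		if (cpus[i].sta_present)
			cpus[i].data_ready = true;
		if (sketch_processor_add(&cpus[i]) == 0)
			registered++;
	}
	return registered;
}
```

A CPU left unregistered at boot can go through sketch_processor_add() again
once its "enabled" bit has been set, and it succeeds then because its data was
already prepared during the boot-time pass.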

It looks like you have chosen the second approach, which is fine with
me (although personally, I would rather choose the first one), but
then your arch code needs to arrange for the requisite CPU data
structures etc. to be set up before acpi_processor_add() gets called
because, as per the above, the latter just rejects CPUs with the
"enabled" _STA bit clear.

Thanks!

^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 12:51           ` Rafael J. Wysocki
@ 2024-04-15 15:31             ` Salil Mehta
  2024-04-15 16:38               ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Salil Mehta @ 2024-04-15 15:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Russell King (Oracle),
	Jonathan Cameron, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	Linuxarm, justin.he, jianyong.wu

>  From: Rafael J. Wysocki <rafael@kernel.org>
>  Sent: Monday, April 15, 2024 1:51 PM
>  
>  On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta <salil.mehta@huawei.com>
>  wrote:
>  >
>  > Hello,
>  >
>  > >  From: Thomas Gleixner <tglx@linutronix.de>
>  > >  Sent: Friday, April 12, 2024 9:55 PM
>  > >
>  > >  On Fri, Apr 12 2024 at 21:16, Russell King (Oracle) wrote:
>  > >  > On Fri, Apr 12, 2024 at 08:30:40PM +0200, Rafael J. Wysocki wrote:
>  > >  >> Say acpi_map_cpu() / acpi_unmap_cpu() are turned into arch calls.
>  > >  >> What's the difference then?  The locking, which should be fine if I'm
>  > >  >> not mistaken and need_hotplug_init that needs to be set if this code
>  > >  >> runs after the processor driver has loaded AFAICS.
>  > >  >
>  > >  > It is over this that I walked away from progressing this code, because
>  > >  > I don't think it's quite as simple as you make it out to be.
>  > >  >
>  > >  > Yes, acpi_map_cpu() and acpi_unmap_cpu() are already arch implemented
>  > >  > functions, so Arm64 can easily provide stubs for these that do nothing.
>  > >  > That never caused me any concern.
>  > >  >
>  > >  > What does cause me great concern though are the finer details. For
>  > >  > example, above you seem to drop the evaluation of _STA for the
>  > >  > "make_present" case - I've no idea whether that is something that
>  > >  > should be deleted or not (if it is something that can be deleted, then
>  > >  > why not delete it now?)
>  > >  >
>  > >  > As for the cpu locking, I couldn't find anything in
>  > >  > arch_register_cpu() that depends on the cpu_maps_update stuff nor
>  > >  > needs the cpus_write_lock being taken - so I've no idea why the
>  > >  > "make_present" case takes these locks.
>  > >
>  > >  Anything which updates a CPU mask, e.g. cpu_present_mask, after early
>  > >  boot must hold the appropriate write locks. Otherwise it would be
>  > >  possible to online a CPU which just got marked present, but the
>  > >  registration has not completed yet.
>  > >
>  > >  > Finally, the "pr->flags.need_hotplug_init = 1" thing... it's not
>  > >  > obvious that this is required - remember that with Arm64's "enabled"
>  > >  > toggling, the "processor" is a slice of the system and doesn't
>  > >  > actually go away - it's just "not enabled" for use.
>  > >  >
>  > >  > Again, as "processors" in Arm64 are slices of the system, they have to
>  > >  > be fully described in ACPI before the OS boots, and they will be
>  > >  > marked as being "present", which means they will be enumerated, and
>  > >  > the driver will be probed. Any processor that is not to be used will
>  > >  > not have its enabled bit set. It is my understanding that every
>  > >  > processor will result in the ACPI processor driver being bound to it
>  > >  > whether its enabled or not.
>  > >  >
>  > >  > The difference between real hotplug and Arm64 hotplug is that real
>  > >  > hotplug makes stuff not-present (and thus unenumerable). Arm64 hotplug
>  > >  > makes stuff not-enabled which is still enumerable.
>  > >
>  > >  Define "real hotplug" :)
>  > >
>  > >  Real physical hotplug does not really exist. That's at least true
>  > >  for x86, where the physical hotplug support was chased for a while,
>  > >  but never ended up in production.
>  > >
>  > >  Though virtualization happily jumped on it to hot add/remove CPUs
>  > >  to/from a guest.
>  > >
>  > >  There are limitations to this and we learned it the hard way on
>  > >  X86. At the end we came up with the following restrictions:
>  > >
>  > >      1) All possible CPUs have to be advertised at boot time via firmware
>  > >         (ACPI/DT/whatever) independent of them being present at boot time
>  > >         or not.
>  > >
>  > >         That guarantees proper sizing and ensures that associations
>  > >         between hardware entities and software representations and the
>  > >         resulting topology are stable for the lifetime of a system.
>  > >
>  > >         It is really required to know the full topology of the system at
>  > >         boot time especially with hybrid CPUs where some of the cores
>  > >         have hyperthreading and the others do not.
>  > >
>  > >
>  > >      2) Hot add can only mark an already registered (possible) CPU
>  > >         present. Adding non-registered CPUs after boot is not possible.
>  > >
>  > >         The CPU must have been registered in #1 already to ensure that
>  > >         the system topology does not suddenly change in an incompatible
>  > >         way at run-time.
>  > >
>  > >  The same restriction would apply to real physical hotplug. I don't
>  > >  think that's any different for ARM64 or any other architecture.
>  >
>  >
>  > There is a difference:
>  >
>  > 1.   ARM arch does not allow any processor to be NOT present. Hence, because
>  > of this restriction, any of its related per-cpu components must be present
>  > and enumerated at boot time as well (exposed by firmware and
>  > ACPI). This means all the enumerated processors will be marked as
>  > 'present' but they might exist in NOT enabled (_STA.enabled=0) state.
>  >
>  > There was one clear difference, and please correct me if I'm wrong
>  > here: for x86, the LAPIC associated with the x86 core can be brought
>  > online later, even after boot?
>  >
>  > But for ARM Arch, processors and their corresponding per-cpu components
>  > like redistributors all need to be present and enumerated during
>  > boot time. Redistributors are part of the ALWAYS-ON power domain.
>  
>  OK
>  
>  So what exactly is the problem with this and what does
>  acpi_processor_add() have to do with it?


On the ARM arch, at boot time it should add processors as if no hotplug existed. Later,
on a (fake) hotplug trigger from the virtualizer caused by a CPU (un)plug action, it
should just end up registering the already present CPU with the Linux driver model.


>  
>  Do you want the per-CPU structures etc. to be created from the
>  acpi_processor_add() path?


I was referring to the components tied to the ARM CPU architecture, such as
redistributors, which get initialized in the arch-specific _init code, not here.
acpi_processor_add() is arch-agnostic code common to all architectures.

[ A digression: there are __weak functions that can be overridden for arch-specific
 handling, like arch_(un)map_cpu() etc., but we can't use those to defer initialization
 of the CPU-related components - an ARM arch constraint! ]


>  
>  This plain won't work because acpi_processor_add(), as defined in the
>  mainline kernel today (and Jonathan's patches don't change that
>  AFAICS), returns an error for processor devices with the "enabled" bit clear
>  in _STA (it returns an error if the "present" bit is clear too, but that's
>  obvious), so it only gets to calling arch_register_cpu() if
>  *both* "present" and "enabled" _STA bits are set for the given processor
>  device.


If you are referring to the _STA check in XX_hot_add_init(), then the current
kernel code only checks the ACPI_STA_DEVICE_PRESENT flag and not
the ACPI_STA_DEVICE_ENABLED flag(?). The code under review changes
exactly that behavior for the two legs, i.e. the make-present and make-enabled legs.

I'm happy to address your point further.

>  
>  That, BTW, is why I keep saying that from the ACPI CPU enumeration code
>  perspective, there is no difference between "present" and "enabled".


Agreed, but there is still a subtle difference.  Enumeration happens once, for
all the processors, at boot time. As confirmed by yourself and Thomas, even on
x86 all the processors are discovered and their topology is fixed at boot time,
which is effectively the same behavior as on the ARM arch. But ARM assumes the
'present' bits in the present mask are set at boot time, which is not like
x86(?).  Hence, the 'present cpu' bits will always be equal to the 'possible cpu'
bits. This is a constraint set by the ARM maintainers and it looks unshakable.
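
As a purely illustrative model of that constraint (standalone C, not the
kernel's cpumask API; the struct, the 64-CPU limit and all function names are
assumptions made for this sketch), the masks could behave like this: present
always equals possible, and only a separate "enabled" mask changes on
(un)plug.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 64-CPU masks; the kernel's cpumask API is not used here. */
struct sketch_masks {
	uint64_t possible;
	uint64_t present;
	uint64_t enabled;
};

/* Boot: every CPU described by firmware is both possible and present. */
static struct sketch_masks sketch_boot_masks(uint64_t described,
					     uint64_t sta_enabled)
{
	struct sketch_masks m;

	m.possible = described;
	m.present = described;
	m.enabled = described & sta_enabled;
	return m;
}

/* A CPU may be onlined only if it is present and currently enabled. */
static bool sketch_can_online(const struct sketch_masks *m, unsigned int cpu)
{
	if (cpu >= 64)
		return false;
	return (((m->present & m->enabled) >> cpu) & 1u) != 0;
}

/* (Un)plug toggles only the enabled bit; the present mask is never touched. */
static int sketch_set_enabled(struct sketch_masks *m, unsigned int cpu, bool on)
{
	if (cpu >= 64 || !((m->possible >> cpu) & 1u))
		return -1;
	if (on)
		m->enabled |= UINT64_C(1) << cpu;
	else
		m->enabled &= ~(UINT64_C(1) << cpu);
	return 0;
}
```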


>  
>  > 2.  Agreed regarding the topology. Are you suggesting that we must
>  > call arch_register_cpu() during boot time for all the 'present' CPUs?
>  > Even if that's the case, we might still want to defer registration of
>  > the cpu device (register_cpu() API) with the Linux device model. The
>  > latter is what we are doing to hide/unhide the CPUs from the user while
>  > the _STA.Enabled bit is toggled due to a CPU (un)plug action.
>  
>  There are two ways to approach this IMV, and both seem to be valid in
>  principle.
>  
>  One would be to treat CPUs with the "enabled" bit clear as not present and
>  create all of the requisite data structures for them when they become
>  available (in analogy with the "real hot-add" case).


Right. This one is out of scope for the ARM arch, as we cannot defer any
arch-specific sizing and initialization until after boot, i.e. until processor_add()
gets called again later, triggered by a CPU plug action at the virtualizer.


>  
>  The alternative one is to create all of the requisite data structures for the
>  CPUs that you find during boot, but register CPU devices for those having
>  the "enabled" _STA bit set.


Correct, and we defer the registration for CPUs with the online-capable bit
set in the ACPI MADT/GICC data structure. These CPUs basically form
the set of hot-pluggable CPUs on ARM.


>  
>  It looks like you have chosen the second approach, which is fine with me
>  (although personally, I would rather choose the first one), but then your
>  arch code needs to arrange for the requisite CPU data structures etc. to be
>  set up before acpi_processor_add() gets called because, as per the above,
>  the latter just rejects CPUs with the "enabled" _STA bit clear.

Yes, correct. The first one is not possible, at least for now, and getting there
would require a lot of rework, especially in the GIC. There are also many other arch
components (like timers, PMUs, etc.) whose behavior would need to be specified
somewhere in the architecture as well. All of these are closely coupled with the ARM
CPU architecture (that is beyond this discussion, so let's leave it to the ARM folks).

This patch set has a change to deal with the ACPI _STA.Enabled bit accordingly.


Best regards
Salil.

>  
>  Thanks!

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  2024-04-12 18:10   ` Rafael J. Wysocki
@ 2024-04-15 15:48     ` Jonathan Cameron
  2024-04-15 16:16       ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15 15:48 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Fri, 12 Apr 2024 20:10:54 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > The arm64 specific arch_register_cpu() needs to access the _STA
> > method of the DSDT object so make it available by assigning the
> > appropriate handle to the struct cpu instance.
> >
> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> >  drivers/acpi/acpi_processor.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 7a0dd35d62c9..93e029403d05 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >         union acpi_object object = { 0 };
> >         struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
> >         struct acpi_processor *pr = acpi_driver_data(device);
> > +       struct cpu *c;
> >         int device_declaration = 0;
> >         acpi_status status = AE_OK;
> >         static int cpu0_initialized;
> > @@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >                         cpufreq_add_device("acpi-cpufreq");
> >         }
> >
> > +       c = &per_cpu(cpu_devices, pr->id);
> > +       ACPI_COMPANION_SET(&c->dev, device);  
> 
> This is also set for per_cpu(cpu_sys_devices, pr->id) in
> acpi_processor_add(), via acpi_bind_one().

Hi Rafael,

cpu_sys_devices gets filled with a pointer to this same structure.
The contents get set in register_cpu(), so at this point
it doesn't point anywhere.  As a side note, register_cpu()
memsets to zero the value I set in the code above, which isn't
great, particularly as I want to use this in post_eject for
arm64.

We could make a copy of the handle and put it back after
the memset in register_cpu() but that is also ugly.
It's the best I've come up with to make sure this is still set
come remove time but is rather odd.
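
To make the clobbering concrete, here is a minimal user-space sketch. It
assumes nothing beyond what is described above: register_cpu() zeroes the
embedded device, so anything stored there beforehand is lost unless it is
saved and restored. All model_* names are hypothetical stand-ins, not
kernel structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-ins for struct device / struct cpu. */
struct model_device {
	void *companion;	/* plays the role of the ACPI companion pointer */
	int id;
};

struct model_cpu {
	struct model_device dev;
};

/* Like register_cpu(): wipes the embedded device before filling it in. */
static void model_register_cpu(struct model_cpu *cpu, int num)
{
	assert(cpu != NULL);
	memset(&cpu->dev, 0, sizeof(cpu->dev));
	cpu->dev.id = num;
}

/* The "copy the handle and put it back" workaround, in model form only. */
static void model_register_cpu_keep_companion(struct model_cpu *cpu, int num)
{
	void *saved = cpu->dev.companion;

	model_register_cpu(cpu, num);
	cpu->dev.companion = saved;
}
```

This models only the memset; it says nothing about any binding work done
between the two points in the real code.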

> 
> Moreover, there is some pr->id validation in acpi_processor_add(), so
> it seems premature to use it here this way.
> 
> I think that ACPI_COMPANION_SET() should be called from here on
> per_cpu(cpu_sys_devices, pr->id) after validating pr->id (so the
> pr->id validation should all be done here) and then NULL can be passed
> as acpi_dev to acpi_bind_one() in acpi_processor_add().  Then, there
> will be one physical device corresponding to the processor ACPI device
> and no confusion.

I'm fairly sure this is pointing to the same device, but agreed, this
is a tiny bit confusing. However, we can't use cpu_sys_devices at this point,
so I'm not immediately seeing a cleaner solution :(

Jonathan

> 
> >         /*
> >          *  Extra Processor objects may be enumerated on MP systems with
> >          *  less than the max # of CPUs. They should be ignored _iff
> > --
> > 2.39.2
> >  


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  2024-04-15 15:48     ` Jonathan Cameron
@ 2024-04-15 16:16       ` Rafael J. Wysocki
  2024-04-15 16:19         ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-15 16:16 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Rafael J. Wysocki, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Russell King,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon, linuxarm, justin.he, jianyong.wu

On Mon, Apr 15, 2024 at 5:49 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Fri, 12 Apr 2024 20:10:54 +0200
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
>
> > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > The arm64 specific arch_register_cpu() needs to access the _STA
> > > method of the DSDT object so make it available by assigning the
> > > appropriate handle to the struct cpu instance.
> > >
> > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > ---
> > >  drivers/acpi/acpi_processor.c | 3 +++
> > >  1 file changed, 3 insertions(+)
> > >
> > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > index 7a0dd35d62c9..93e029403d05 100644
> > > --- a/drivers/acpi/acpi_processor.c
> > > +++ b/drivers/acpi/acpi_processor.c
> > > @@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > >         union acpi_object object = { 0 };
> > >         struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
> > >         struct acpi_processor *pr = acpi_driver_data(device);
> > > +       struct cpu *c;
> > >         int device_declaration = 0;
> > >         acpi_status status = AE_OK;
> > >         static int cpu0_initialized;
> > > @@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > >                         cpufreq_add_device("acpi-cpufreq");
> > >         }
> > >
> > > +       c = &per_cpu(cpu_devices, pr->id);
> > > +       ACPI_COMPANION_SET(&c->dev, device);
> >
> > This is also set for per_cpu(cpu_sys_devices, pr->id) in
> > acpi_processor_add(), via acpi_bind_one().
>
> Hi Rafael,
>
> cpu_sys_devices gets filled with a pointer to this same structure.
> The contents gets set in register_cpu() so at this point
> it doesn't point anywhere.  As a side note register_cpu()
> memsets to zero the value I set it to in the code above which isn't
> great, particularly as I want to use this in post_eject for
> arm64.
>
> We could make a copy of the handle and put it back after
> the memset in register_cpu() but that is also ugly.
> It's the best I've come up with to make sure this is still set
> come remove time but is rather odd.
> >
> > Moreover, there is some pr->id validation in acpi_processor_add(), so
> > it seems premature to use it here this way.
> >
> > I think that ACPI_COMPANION_SET() should be called from here on
> > per_cpu(cpu_sys_devices, pr->id) after validating pr->id (so the
> > pr->id validation should all be done here) and then NULL can be passed
> > as acpi_dev to acpi_bind_one() in acpi_processor_add().  Then, there
> > will be one physical device corresponding to the processor ACPI device
> > and no confusion.
>
> I'm fairly sure this is pointing to the same device but agreed this
> is a tiny bit confusing. However we can't use cpu_sys_devices at this point
> so I'm not immediately seeing a cleaner solution :(

Well, OK.

Please at least consider doing the pr->id validation checks before
setting the ACPI companion for &per_cpu(cpu_devices, pr->id).

Also, acpi_bind_one() needs to be called on the "physical" devices
passed to ACPI_COMPANION_SET() (with NULL as the second argument) for
the reference counting and physical device lookup to work.

Please also note that acpi_primary_dev_companion() should return
per_cpu(cpu_sys_devices, pr->id) for the processor ACPI device, which
depends on the order of acpi_bind_one() calls involving the same ACPI
device.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  2024-04-15 16:16       ` Rafael J. Wysocki
@ 2024-04-15 16:19         ` Rafael J. Wysocki
  2024-04-15 16:50           ` Jonathan Cameron
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-15 16:19 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Rafael J. Wysocki, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Russell King,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon, linuxarm, justin.he, jianyong.wu

On Mon, Apr 15, 2024 at 6:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Apr 15, 2024 at 5:49 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Fri, 12 Apr 2024 20:10:54 +0200
> > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> >
> > > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > > <Jonathan.Cameron@huawei.com> wrote:
> > > >
> > > > The arm64 specific arch_register_cpu() needs to access the _STA
> > > > method of the DSDT object so make it available by assigning the
> > > > appropriate handle to the struct cpu instance.
> > > >
> > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > ---
> > > >  drivers/acpi/acpi_processor.c | 3 +++
> > > >  1 file changed, 3 insertions(+)
> > > >
> > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > index 7a0dd35d62c9..93e029403d05 100644
> > > > --- a/drivers/acpi/acpi_processor.c
> > > > +++ b/drivers/acpi/acpi_processor.c
> > > > @@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > >         union acpi_object object = { 0 };
> > > >         struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
> > > >         struct acpi_processor *pr = acpi_driver_data(device);
> > > > +       struct cpu *c;
> > > >         int device_declaration = 0;
> > > >         acpi_status status = AE_OK;
> > > >         static int cpu0_initialized;
> > > > @@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > >                         cpufreq_add_device("acpi-cpufreq");
> > > >         }
> > > >
> > > > +       c = &per_cpu(cpu_devices, pr->id);
> > > > +       ACPI_COMPANION_SET(&c->dev, device);
> > >
> > > This is also set for per_cpu(cpu_sys_devices, pr->id) in
> > > acpi_processor_add(), via acpi_bind_one().
> >
> > Hi Rafael,
> >
> > cpu_sys_devices gets filled with a pointer to this same structure.
> > The contents gets set in register_cpu() so at this point
> > it doesn't point anywhere.  As a side note register_cpu()
> > memsets to zero the value I set it to in the code above which isn't
> > great, particularly as I want to use this in post_eject for
> > arm64.
> >
> > We could make a copy of the handle and put it back after
> > the memset in register_cpu() but that is also ugly.
> > It's the best I've come up with to make sure this is still set
> > come remove time but is rather odd.
> > >
> > > Moreover, there is some pr->id validation in acpi_processor_add(), so
> > > it seems premature to use it here this way.
> > >
> > > I think that ACPI_COMPANION_SET() should be called from here on
> > > per_cpu(cpu_sys_devices, pr->id) after validating pr->id (so the
> > > pr->id validation should all be done here) and then NULL can be passed
> > > as acpi_dev to acpi_bind_one() in acpi_processor_add().  Then, there
> > > will be one physical device corresponding to the processor ACPI device
> > > and no confusion.
> >
> > I'm fairly sure this is pointing to the same device but agreed this
> > is a tiny bit confusing. However we can't use cpu_sys_devices at this point
> > so I'm not immediately seeing a cleaner solution :(
>
> Well, OK.
>
> Please at least consider doing the pr->id validation checks before
> setting the ACPI companion for &per_cpu(cpu_devices, pr->id).
>
> Also, acpi_bind_one() needs to be called on the "physical" devices
> passed to ACPI_COMPANION_SET() (with NULL as the second argument) for
> the reference counting and physical device lookup to work.
>
> Please also note that acpi_primary_dev_companion() should return
> per_cpu(cpu_sys_devices, pr->id) for the processor ACPI device, which
> depends on the order of acpi_bind_one() calls involving the same ACPI
> device.

Of course, if the value set by ACPI_COMPANION_SET() is cleared
subsequently, the above is not needed, but then using
ACPI_COMPANION_SET() is questionable overall.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 15:31             ` Salil Mehta
@ 2024-04-15 16:38               ` Rafael J. Wysocki
  2024-04-17 15:01                 ` Salil Mehta
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-15 16:38 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Rafael J. Wysocki, Thomas Gleixner, Russell King (Oracle),
	Jonathan Cameron, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	Linuxarm, justin.he, jianyong.wu

On Mon, Apr 15, 2024 at 5:31 PM Salil Mehta <salil.mehta@huawei.com> wrote:
>
> >  From: Rafael J. Wysocki <rafael@kernel.org>
> >  Sent: Monday, April 15, 2024 1:51 PM
> >
> >  On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta <salil.mehta@huawei.com>
> >  wrote:
> >  >

[cut]

> >  > >  Though virtualization happily jumped on it to hot add/remove CPUs
> >  > > to/from  a guest.
> >  > >
> >  > >  There are limitations to this and we learned it the hard way on
> >  > > X86. At the  end we came up with the following restrictions:
> >  > >
> >  > >      1) All possible CPUs have to be advertised at boot time via firmware
> >  > >         (ACPI/DT/whatever) independent of them being present at boot time
> >  > >         or not.
> >  > >
> >  > >         That guarantees proper sizing and ensures that associations
> >  > >         between hardware entities and software representations and the
> >  > >         resulting topology are stable for the lifetime of a system.
> >  > >
> >  > >         It is really required to know the full topology of the system at
> >  > >         boot time especially with hybrid CPUs where some of the cores
> >  > >         have hyperthreading and the others do not.
> >  > >
> >  > >
> >  > >      2) Hot add can only mark an already registered (possible) CPU
> >  > >         present. Adding non-registered CPUs after boot is not possible.
> >  > >
> >  > >         The CPU must have been registered in #1 already to ensure that
> >  > >         the system topology does not suddenly change in an incompatible
> >  > >         way at run-time.
> >  > >
> >  > >  The same restriction would apply to real physical hotplug. I don't
> >  > > think that's  any different for ARM64 or any other architecture.
> >  >
> >  >
> >  > There is a difference:
> >  >
> >  > 1.   ARM arch does not allows for any processor to be NOT present. Hence,  because of
> >  > this restriction any of its related per-cpu components must be present
> >  > and enumerated at the boot time as well (exposed by firmware and
> >  > ACPI). This means all the enumerated processors will be marked as
> >  > 'present' but they might exist in NOT enabled (_STA.enabled=0) state.
> >  >
> >  > There was one clear difference and please correct me if I'm wrong
> >  > here,  for x86, the LAPIC associated with the x86 core can be brought online later even after boot?
> >  >
> >  > But for ARM Arch, processors and its corresponding per-cpu components
> >  > like redistributors all need to be present and enumerated during the
> >  > boot time. Redistributors are part of ALWAYS-ON power domain.
> >
> >  OK
> >
> >  So what exactly is the problem with this and what does
> >  acpi_processor_add() have to do with it?
>
>
> For ARM Arch, during boot time, it should add processor as if no hotplug exists. But later,
> in context to the (fake) hotplug trigger from the virtualizer as a result of the CPU (un)plug
> action  it should just end up in registering the already present CPU with the Linux Driver Model.

So let me repeat this one last time: acpi_processor_add() cannot do that,
because (as defined today) it rejects CPUs with the "enabled" bit
clear in _STA.

> >
> >  Do you want the per-CPU structures etc. to be created from the
> >  acpi_processor_add() path?
>
>
> I referred to the components related to ARM CPU Arch like redistributors etc.
> which will get initialized in context to Arch specific _init code not here. This
> i.e. acpi_processor_add() is arch agnostic code common to all architectures.
>
> [ A digression: You do have _weak functions which can be overridden to arch specific
>  handling like  arch_(un)map_cpu() etc. but we can't use those to defer initialize
>  the CPU related components - ARM Arch constraint!]

Not right now, but they can be added I suppose.

>
> >
> >  This plain won't work because acpi_processor_add(), as defined in the
> >  mainline kernel today (and the Jonathan's patches don't change that
> >  AFAICS), returns an error for processor devices with the "enabled" bit clear
> >  in _STA (it returns an error if the "present" bit is clear too, but that's
> >  obvious), so it only gets to calling arch_register_cpu() if
> >  *both* "present" and "enabled" _STA bits are set for the given processor
> >  device.
>
>
> If you are referring to the _STA check in the XX_hot_add_init() then in the current
> kernel code it only checks for the ACPI_STA_DEVICE_PRESENT flag and not
> the ACPI_STA_DEVICE_ENABLED flag(?).

No, I am not.  I'm referring to this code in 6.9-rc4:

static int acpi_processor_add(struct acpi_device *device,
                    const struct acpi_device_id *id)
{
    struct acpi_processor *pr;
    struct device *dev;
    int result = 0;

    if (!acpi_device_is_enabled(device))
        return -ENODEV;

    ...
}

where acpi_device_is_enabled() is defined as follows:

bool acpi_device_is_enabled(const struct acpi_device *adev)
{
    return adev->status.present && adev->status.enabled;
}

> The code being reviewed has changed
> exactly that behavior for 2 legs i.e. make-present and make-enabled legs.

I'm not sure what you mean here, but the code above means that
acpi_processor_add() does not distinguish between CPUs with the
"present" bit clear (in which case the "enabled" bit must also be
clear as per the spec) and CPUs with the "present" bit set and the
"enabled" bit clear.  These two cases are handled in the same way.
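
As an illustration only, the point can be reproduced with a small
user-space model of the check quoted above. The struct and the model_*
function names are hypothetical; only the "present && enabled" logic and
the -ENODEV return are taken from the quoted code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the _STA status bits kept in struct acpi_device. */
struct sta_model {
	bool present;
	bool enabled;
};

/* Mirrors the acpi_device_is_enabled() logic quoted above. */
static bool model_is_enabled(const struct sta_model *sta)
{
	return sta->present && sta->enabled;
}

/* Models only the entry check of acpi_processor_add(). */
static int model_processor_add(const struct sta_model *sta)
{
	assert(sta != NULL);
	if (!model_is_enabled(sta))
		return -ENODEV;
	return 0;
}
```

In this model a CPU that is absent and a CPU that is present but not
enabled both get -ENODEV, so the caller cannot tell them apart.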

> I'm open to further address your point clearly.

I hope that the above is clear enough.

> >
> >  That, BTW, is why I keep saying that from the ACPI CPU enumeration code
> >  perspective, there is no difference between "present" and "enabled".
>
>
> Agreed but there is still a subtle difference.  Enumeration happens once and
> for all the processors during the boot time. And as confirmed by yourself and
> Thomas as well that even in x86 arch all the processors will be discovered and
> their topology fixed during the boot time which is effectively the same behavior
> as in the ARM Arch. But ARM assumes those 'present' bits in the present masks
> to be set during the boot time which is not like x86(?).  Hence, 'present cpu' Bits
> will always be equal to 'possible cpu' Bits. This is a constraint put by the ARM
> maintainers and looks unshakable.

Yes, there are differences between architectures, but the ACPI code is
(or at least should be) architecture-agnostic (as you said somewhere
above).  So why does this matter for the ACPI code?

> >
> >  > 2.  Agreed regarding the topology. Are you suggesting that we must
> >  > call arch_register_cpu() during boot time for all the 'present' CPUs?
> >  > Even if that's the case, we might still want to defer registration of
> >  > the cpu device (register_cpu() API) with the Linux device model. Later
> >  > is what we are doing to hide/unhide the CPUs from the user while
> >  STA.Enabled Bit is toggled due to CPU (un)plug action.
> >
> >  There are two ways to approach this IMV, and both seem to be valid in
> >  principle.
> >
> >  One would be to treat CPUs with the "enabled" bit clear as not present and
> >  create all of the requisite data structures for them when they become
> >  available (in analogy with the "real hot-add" case).
>
>
> Right. This one is out-of-scope for ARM Arch as we cannot defer any Arch
> specific sizing and initializations after boot i.e. when processor_add() gets
> called again later as a trigger of CPU plug action at the Virtualizer.
>
>
> >
> >  The alternative one is to create all of the requisite data structures for the
> >  CPUs that you find during boot, but register CPU devices for those having
> >  the "enabled" _STA bit set.
>
>
> Correct. and we defer the registration for CPUs with online-capable Bit
> set in the ACPI MADT/GICC data structure. These CPUs basically form
> set of hot-pluggable CPUs on ARM.
>
>
> >
> >  It looks like you have chosen the second approach, which is fine with me
> >  (although personally, I would rather choose the first one), but then your
> >  arch code needs to arrange for the requisite CPU data structures etc. to be
> >  set up before acpi_processor_add() gets called because, as per the above,
> >  the latter just rejects CPUs with the "enabled" _STA bit clear.
>
> Yes, correct. First one is not possible - at least for now and to have that it will
> require lot of rework especially at GIC. But there are many other arch components
> (like timers, PMUs, etc.) whose behavior needs to be specified somewhere in the
> architecture as well. All these are closely coupled with the ARM CPU architecture.
> (it's beyond this discussion and lets leave that to ARM folks).
>
> This patch-set has a change to deal with ACPI _STA.Enabled Bit accordingly.

Well, I'm having a hard time with this.

As far as CPU enumeration goes, if the "enabled" bit is clear in _STA,
it does not happen at all.  Both on ARM and on x86.

Now tell me why there need to be two separate code paths calling
arch_register_cpu() in acpi_processor_add()?

I see no reason whatsoever.

Moreover, I see reasons why there needs to be only one such code path.

Please feel free to prove me wrong.

Thanks!

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  2024-04-15 16:19         ` Rafael J. Wysocki
@ 2024-04-15 16:50           ` Jonathan Cameron
  2024-04-15 17:34             ` Jonathan Cameron
  0 siblings, 1 reply; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15 16:50 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, linuxarm, justin.he, jianyong.wu

On Mon, 15 Apr 2024 18:19:17 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Mon, Apr 15, 2024 at 6:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Mon, Apr 15, 2024 at 5:49 PM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:  
> > >
> > > On Fri, 12 Apr 2024 20:10:54 +0200
> > > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> > >  
> > > > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > > > <Jonathan.Cameron@huawei.com> wrote:  
> > > > >
> > > > > The arm64 specific arch_register_cpu() needs to access the _STA
> > > > > method of the DSDT object so make it available by assigning the
> > > > > appropriate handle to the struct cpu instance.
> > > > >
> > > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > > ---
> > > > >  drivers/acpi/acpi_processor.c | 3 +++
> > > > >  1 file changed, 3 insertions(+)
> > > > >
> > > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > > index 7a0dd35d62c9..93e029403d05 100644
> > > > > --- a/drivers/acpi/acpi_processor.c
> > > > > +++ b/drivers/acpi/acpi_processor.c
> > > > > @@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > >         union acpi_object object = { 0 };
> > > > >         struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
> > > > >         struct acpi_processor *pr = acpi_driver_data(device);
> > > > > +       struct cpu *c;
> > > > >         int device_declaration = 0;
> > > > >         acpi_status status = AE_OK;
> > > > >         static int cpu0_initialized;
> > > > > @@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > >                         cpufreq_add_device("acpi-cpufreq");
> > > > >         }
> > > > >
> > > > > +       c = &per_cpu(cpu_devices, pr->id);
> > > > > +       ACPI_COMPANION_SET(&c->dev, device);  
> > > >
> > > > This is also set for per_cpu(cpu_sys_devices, pr->id) in
> > > > acpi_processor_add(), via acpi_bind_one().  
> > >
> > > Hi Rafael,
> > >
> > > cpu_sys_devices gets filled with a pointer to this same structure.
> > > The contents gets set in register_cpu() so at this point
> > > it doesn't point anywhere.  As a side note register_cpu()
> > > memsets to zero the value I set it to in the code above which isn't
> > > great, particularly as I want to use this in post_eject for
> > > arm64.
> > >
> > > We could make a copy of the handle and put it back after
> > > the memset in register_cpu() but that is also ugly.
> > > It's the best I've come up with to make sure this is still set
> > > come remove time but is rather odd.  
> > > >
> > > > Moreover, there is some pr->id validation in acpi_processor_add(), so
> > > > it seems premature to use it here this way.
> > > >
> > > > I think that ACPI_COMPANION_SET() should be called from here on
> > > > per_cpu(cpu_sys_devices, pr->id) after validating pr->id (so the
> > > > pr->id validation should all be done here) and then NULL can be passed
> > > > as acpi_dev to acpi_bind_one() in acpi_processor_add().  Then, there
> > > > will be one physical device corresponding to the processor ACPI device
> > > > and no confusion.  
> > >
> > > I'm fairly sure this is pointing to the same device but agreed this
> > > is a tiny bit confusing. However we can't use cpu_sys_devices at this point
> > > so I'm not immediately seeing a cleaner solution :(  
> >
> > Well, OK.
> >
> > Please at least consider doing the pr->id validation checks before
> > setting the ACPI companion for &per_cpu(cpu_devices, pr->id).
> >
> > Also, acpi_bind_one() needs to be called on the "physical" devices
> > passed to ACPI_COMPANION_SET() (with NULL as the second argument) for
> > the reference counting and physical device lookup to work.
> >
> > Please also note that acpi_primary_dev_companion() should return
> > per_cpu(cpu_sys_devices, pr->id) for the processor ACPI device, which
> > depends on the order of acpi_bind_one() calls involving the same ACPI
> > device.  
> 
> Of course, if the value set by ACPI_COMPANION_SET() is cleared
> subsequently, the above is not needed, but then using
> ACPI_COMPANION_SET() is questionable overall.

Agreed. Also, smoothing over that by stashing the handle and putting it back
doesn't work, because there is an additional call to acpi_bind_one() in between
here and the one you reference.

The arch_register_cpu() calls end up calling register_cpu() /
device_register() / acpi_device_notify() / acpi_bind_one()

With the current code that fails (silently).
If I make sure the handle is set before register_cpu() then it
succeeds, but we end up with duplicate sysfs files etc. because we
bind twice.

I think the only way around this is a larger reorganization of the
CPU hotplug code to pull the arch_register_cpu() call to where
the acpi_bind_one() call is.  However, that changes a lot more than I'd like
(and I don't have it working yet).

Alternatively, find somewhere else to stash the handle, or just add it as
a parameter to arch_register_cpu(). Right now this feels like the easier
path to me: arch_register_cpu(int cpu, acpi_handle handle)

Would that be a path you'd consider?
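
A rough user-space sketch of what that signature could look like follows.
It is a model under stated assumptions, not the proposed patch: the
model_* names and the handle type are hypothetical, the -ENODEV case for a
non-present CPU is my assumption, and MODEL_EPROBE_DEFER just reuses the
numeric value of the kernel's EPROBE_DEFER. The two deferral cases follow
the cover letter (no way to query _STA yet, or present but not enabled).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_EPROBE_DEFER 517	/* same numeric value as the kernel's EPROBE_DEFER */
#define MODEL_NR_CPUS 8

/* Hypothetical stand-in for the object behind an acpi_handle. */
struct model_acpi_node {
	bool sta_present;
	bool sta_enabled;
};
typedef struct model_acpi_node *model_acpi_handle;

static bool model_cpu_registered[MODEL_NR_CPUS];

static int model_arch_register_cpu(int cpu, model_acpi_handle handle)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	/* Early boot: no handle yet, so _STA cannot be queried. */
	if (!handle)
		return -MODEL_EPROBE_DEFER;

	/* Assumption: a CPU that is not present at all is an error. */
	if (!handle->sta_present)
		return -ENODEV;

	/* Present but not enabled: a CPU to be hotplugged later. */
	if (!handle->sta_enabled)
		return -MODEL_EPROBE_DEFER;

	model_cpu_registered[cpu] = true;
	return 0;
}
```

Passing the handle explicitly means the arch code does not depend on
anything stored in the struct cpu before register_cpu() runs.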

Jonathan



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  2024-04-15 16:50           ` Jonathan Cameron
@ 2024-04-15 17:34             ` Jonathan Cameron
  2024-04-15 17:41               ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-15 17:34 UTC (permalink / raw)
  To: Rafael J. Wysocki, linuxarm
  Cc: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Miguel Luis,
	James Morse, Salil Mehta, Jean-Philippe Brucker, Catalin Marinas,
	Will Deacon, justin.he, jianyong.wu

On Mon, 15 Apr 2024 17:50:57 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:

> On Mon, 15 Apr 2024 18:19:17 +0200
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> 
> > On Mon, Apr 15, 2024 at 6:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:  
> > >
> > > On Mon, Apr 15, 2024 at 5:49 PM Jonathan Cameron
> > > <Jonathan.Cameron@huawei.com> wrote:    
> > > >
> > > > On Fri, 12 Apr 2024 20:10:54 +0200
> > > > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> > > >    
> > > > > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > > > > <Jonathan.Cameron@huawei.com> wrote:    
> > > > > >
> > > > > > The arm64 specific arch_register_cpu() needs to access the _STA
> > > > > > method of the DSDT object so make it available by assigning the
> > > > > > appropriate handle to the struct cpu instance.
> > > > > >
> > > > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > > > ---
> > > > > >  drivers/acpi/acpi_processor.c | 3 +++
> > > > > >  1 file changed, 3 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > > > index 7a0dd35d62c9..93e029403d05 100644
> > > > > > --- a/drivers/acpi/acpi_processor.c
> > > > > > +++ b/drivers/acpi/acpi_processor.c
> > > > > > @@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > > >         union acpi_object object = { 0 };
> > > > > >         struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
> > > > > >         struct acpi_processor *pr = acpi_driver_data(device);
> > > > > > +       struct cpu *c;
> > > > > >         int device_declaration = 0;
> > > > > >         acpi_status status = AE_OK;
> > > > > >         static int cpu0_initialized;
> > > > > > @@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > > >                         cpufreq_add_device("acpi-cpufreq");
> > > > > >         }
> > > > > >
> > > > > > +       c = &per_cpu(cpu_devices, pr->id);
> > > > > > +       ACPI_COMPANION_SET(&c->dev, device);    
> > > > >
> > > > > This is also set for per_cpu(cpu_sys_devices, pr->id) in
> > > > > acpi_processor_add(), via acpi_bind_one().    
> > > >
> > > > Hi Rafael,
> > > >
> > > > cpu_sys_devices gets filled with a pointer to this same structure.
> > > > The contents gets set in register_cpu() so at this point
> > > > it doesn't point anywhere.  As a side note register_cpu()
> > > > memsets to zero the value I set it to in the code above which isn't
> > > > great, particularly as I want to use this in post_eject for
> > > > arm64.
> > > >
> > > > We could make a copy of the handle and put it back after
> > > > the memset in register_cpu() but that is also ugly.
> > > > It's the best I've come up with to make sure this is still set
> > > > come remove time but is rather odd.    
> > > > >
> > > > > Moreover, there is some pr->id validation in acpi_processor_add(), so
> > > > > it seems premature to use it here this way.
> > > > >
> > > > > I think that ACPI_COMPANION_SET() should be called from here on
> > > > > per_cpu(cpu_sys_devices, pr->id) after validating pr->id (so the
> > > > > pr->id validation should all be done here) and then NULL can be passed
> > > > > as acpi_dev to acpi_bind_one() in acpi_processor_add().  Then, there
> > > > > will be one physical device corresponding to the processor ACPI device
> > > > > and no confusion.    
> > > >
> > > > I'm fairly sure this is pointing to the same device but agreed this
> > > > is a tiny bit confusing. However we can't use cpu_sys_devices at this point
> > > > so I'm not immediately seeing a cleaner solution :(    
> > >
> > > Well, OK.
> > >
> > > Please at least consider doing the pr->id validation checks before
> > > setting the ACPI companion for &per_cpu(cpu_devices, pr->id).
> > >
> > > Also, acpi_bind_one() needs to be called on the "physical" devices
> > > passed to ACPI_COMPANION_SET() (with NULL as the second argument) for
> > > the reference counting and physical device lookup to work.
> > >
> > > Please also note that acpi_primary_dev_companion() should return
> > > per_cpu(cpu_sys_devices, pr->id) for the processor ACPI device, which
> > > depends on the order of acpi_bind_one() calls involving the same ACPI
> > > device.    
> > 
> > Of course, if the value set by ACPI_COMPANION_SET() is cleared
> > subsequently, the above is not needed, but then using
> > ACPI_COMPANION_SET() is questionable overall.  
> 
> Agreed + smoothing over that by stashing and putting it back doesn't
> work because there is an additional call to acpi_bind_one() in between
> here and the one you reference.
> 
> The arch_register_cpu() calls end up calling register_cpu() /
> device_register() / acpi_device_notify() / acpi_bind_one()
> 
> With current code that fails (silently)
> If I make sure the handle is set before register_cpu() then it
> succeeds, but we end up with duplicate sysfs files etc because we
> bind twice.
> 
> I think the only way around this is larger reorganization of the
> CPU hotplug code to pull the arch_register_cpu() call to where
> the acpi_bind_one() call is.  However that changes a lot more than I'd like
> (and I don't have it working yet).
> 
> Alternatively find somewhere else to stash the handle, or just add it as
> a parameter to arch_register_cpu(). Right now this feels the easier
> path to me. arch_register_cpu(int cpu, acpi_handle handle) 
> 
> Would that be a path you'd consider?

Another option would be to do the per_cpu(processors, pr->id) = pr
a few lines earlier than currently and access that directly from the
arch_register_cpu() call.  Similarly remove that reference a bit later and
use it in arch_unregister_cpu().

This seems like the simplest solution, but I may be missing something.

Jonathan

> 
> Jonathan
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  2024-04-15 17:34             ` Jonathan Cameron
@ 2024-04-15 17:41               ` Rafael J. Wysocki
  2024-04-16 17:35                 ` Jonathan Cameron
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-15 17:41 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Rafael J. Wysocki, linuxarm, linux-pm, loongarch, linux-acpi,
	linux-arch, linux-kernel, linux-arm-kernel, kvmarm, x86,
	Russell King, Miguel Luis, James Morse, Salil Mehta,
	Jean-Philippe Brucker, Catalin Marinas, Will Deacon, justin.he,
	jianyong.wu

On Mon, Apr 15, 2024 at 7:35 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 15 Apr 2024 17:50:57 +0100
> Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
>
> > On Mon, 15 Apr 2024 18:19:17 +0200
> > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> >
> > > On Mon, Apr 15, 2024 at 6:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> > > >
> > > > On Mon, Apr 15, 2024 at 5:49 PM Jonathan Cameron
> > > > <Jonathan.Cameron@huawei.com> wrote:
> > > > >
> > > > > On Fri, 12 Apr 2024 20:10:54 +0200
> > > > > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> > > > >
> > > > > > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > > > > > <Jonathan.Cameron@huawei.com> wrote:
> > > > > > >
> > > > > > > The arm64 specific arch_register_cpu() needs to access the _STA
> > > > > > > method of the DSDT object so make it available by assigning the
> > > > > > > appropriate handle to the struct cpu instance.
> > > > > > >
> > > > > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > > > > ---
> > > > > > >  drivers/acpi/acpi_processor.c | 3 +++
> > > > > > >  1 file changed, 3 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > > > > index 7a0dd35d62c9..93e029403d05 100644
> > > > > > > --- a/drivers/acpi/acpi_processor.c
> > > > > > > +++ b/drivers/acpi/acpi_processor.c
> > > > > > > @@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > > > >         union acpi_object object = { 0 };
> > > > > > >         struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
> > > > > > >         struct acpi_processor *pr = acpi_driver_data(device);
> > > > > > > +       struct cpu *c;
> > > > > > >         int device_declaration = 0;
> > > > > > >         acpi_status status = AE_OK;
> > > > > > >         static int cpu0_initialized;
> > > > > > > @@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > > > >                         cpufreq_add_device("acpi-cpufreq");
> > > > > > >         }
> > > > > > >
> > > > > > > +       c = &per_cpu(cpu_devices, pr->id);
> > > > > > > +       ACPI_COMPANION_SET(&c->dev, device);
> > > > > >
> > > > > > This is also set for per_cpu(cpu_sys_devices, pr->id) in
> > > > > > acpi_processor_add(), via acpi_bind_one().
> > > > >
> > > > > Hi Rafael,
> > > > >
> > > > > cpu_sys_devices gets filled with a pointer to this same structure.
> > > > > The contents gets set in register_cpu() so at this point
> > > > > it doesn't point anywhere.  As a side note register_cpu()
> > > > > memsets to zero the value I set it to in the code above which isn't
> > > > > great, particularly as I want to use this in post_eject for
> > > > > arm64.
> > > > >
> > > > > We could make a copy of the handle and put it back after
> > > > > the memset in register_cpu() but that is also ugly.
> > > > > It's the best I've come up with to make sure this is still set
> > > > > come remove time but is rather odd.
> > > > > >
> > > > > > Moreover, there is some pr->id validation in acpi_processor_add(), so
> > > > > > it seems premature to use it here this way.
> > > > > >
> > > > > > I think that ACPI_COMPANION_SET() should be called from here on
> > > > > > per_cpu(cpu_sys_devices, pr->id) after validating pr->id (so the
> > > > > > pr->id validation should all be done here) and then NULL can be passed
> > > > > > as acpi_dev to acpi_bind_one() in acpi_processor_add().  Then, there
> > > > > > will be one physical device corresponding to the processor ACPI device
> > > > > > and no confusion.
> > > > >
> > > > > I'm fairly sure this is pointing to the same device but agreed this
> > > > > is a tiny bit confusing. However we can't use cpu_sys_devices at this point
> > > > > so I'm not immediately seeing a cleaner solution :(
> > > >
> > > > Well, OK.
> > > >
> > > > Please at least consider doing the pr->id validation checks before
> > > > setting the ACPI companion for &per_cpu(cpu_devices, pr->id).
> > > >
> > > > Also, acpi_bind_one() needs to be called on the "physical" devices
> > > > passed to ACPI_COMPANION_SET() (with NULL as the second argument) for
> > > > the reference counting and physical device lookup to work.
> > > >
> > > > Please also note that acpi_primary_dev_companion() should return
> > > > per_cpu(cpu_sys_devices, pr->id) for the processor ACPI device, which
> > > > depends on the order of acpi_bind_one() calls involving the same ACPI
> > > > device.
> > >
> > > Of course, if the value set by ACPI_COMPANION_SET() is cleared
> > > subsequently, the above is not needed, but then using
> > > ACPI_COMPANION_SET() is questionable overall.
> >
> > Agreed + smoothing over that by stashing and putting it back doesn't
> > work because there is an additional call to acpi_bind_one() in between
> > here and the one you reference.
> >
> > The arch_register_cpu() calls end up calling register_cpu() /
> > device_register() / acpi_device_notify() / acpi_bind_one()
> >
> > With current code that fails (silently)

And that's why there is an explicit acpi_bind_one() invocation in
acpi_processor_add().

> > If I make sure the handle is set before register_cpu() then it
> > succeeds, but we end up with duplicate sysfs files etc because we
> > bind twice.

Right, I should have recalled that earlier.

> > I think the only way around this is larger reorganization of the
> > CPU hotplug code to pull the arch_register_cpu() call to where
> > the acpi_bind_one() call is.  However that changes a lot more than I'd like
> > (and I don't have it working yet).

I see.

> > Alternatively find somewhere else to stash the handle, or just add it as
> > a parameter to arch_register_cpu(). Right now this feels the easier
> > path to me. arch_register_cpu(int cpu, acpi_handle handle)
> >
> > Would that be a path you'd consider?
>
> Another option would be to do the per_cpu(processors, pr->id) = pr
> a few lines earlier than currently and access that directly from the
> arch_register_cpu() call.  Similarly remove that reference a bit later and
> use it in arch_unregister_cpu().
>
> This seems like the simplest solution, but I may be missing something.

This should work AFAICS, but I'd move the entire piece of code between
BUG_ON() and setting per_cpu(processors, pr->id) inclusive:

    BUG_ON(pr->id >= nr_cpu_ids);

    /*
     * Buggy BIOS check.
     * ACPI id of processors can be reported wrongly by the BIOS.
     * Don't trust it blindly
     */
    if (per_cpu(processor_device_array, pr->id) != NULL &&
        per_cpu(processor_device_array, pr->id) != device) {
        dev_warn(&device->dev,
            "BIOS reported wrong ACPI id %d for the processor\n",
            pr->id);
        /* Give up, but do not abort the namespace scan. */
        goto err;
    }
    /*
     * processor_device_array is not cleared on errors to allow buggy BIOS
     * checks.
     */
    per_cpu(processor_device_array, pr->id) = device;
    per_cpu(processors, pr->id) = pr;

into acpi_processor_get_info(), right after the point where pr->id is set.
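
A minimal userspace sketch of the block above, only to illustrate the
ordering of the check and the two assignments.  Everything here is a
stand-in: the arrays model the per_cpu() variables, assert() models
BUG_ON(), and claim_processor_id(), NR_CPU_IDS and the *_model types are
hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPU_IDS 4	/* hypothetical stand-in for nr_cpu_ids */

struct acpi_device_model { int unused; };		/* stand-in for struct acpi_device */
struct acpi_processor_model { unsigned int id; };	/* stand-in for struct acpi_processor */

/* Plain arrays standing in for the per_cpu() variables. */
static struct acpi_device_model *processor_device_array[NR_CPU_IDS];
static struct acpi_processor_model *processors[NR_CPU_IDS];

/*
 * Returns 0 on success, -1 if a different device has already claimed
 * this id (the "buggy BIOS" case).
 */
static int claim_processor_id(struct acpi_device_model *device,
			      struct acpi_processor_model *pr)
{
	assert(pr->id < NR_CPU_IDS);	/* models BUG_ON(pr->id >= nr_cpu_ids) */

	if (processor_device_array[pr->id] != NULL &&
	    processor_device_array[pr->id] != device)
		return -1;	/* give up, but leave the earlier claim intact */

	/* The device array is deliberately never cleared on errors. */
	processor_device_array[pr->id] = device;
	processors[pr->id] = pr;
	return 0;
}
```

Once this has run for a given id, anything called later (the arch code in
the real series) could look up processors[id] instead of needing the
handle passed to it.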

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-12 14:37 ` [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info() Jonathan Cameron
  2024-04-12 18:30   ` Rafael J. Wysocki
@ 2024-04-16 14:00   ` Jonathan Cameron
  1 sibling, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-16 14:00 UTC (permalink / raw)
  To: linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Russell King, Rafael J . Wysocki,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon, linuxarm
  Cc: justin.he, jianyong.wu

On Fri, 12 Apr 2024 15:37:04 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> From: James Morse <james.morse@arm.com>
> 
> The arm64 specific arch_register_cpu() call may defer CPU registration
> until the ACPI interpreter is available and the _STA method can
> be evaluated.
> 
> If this occurs, then a second attempt is made in
> acpi_processor_get_info(). Note that the arm64 specific call has
> not yet been added so for now this will never be successfully
> called.
> 
> Systems can still be booted with 'acpi=off', or not include an
> ACPI description at all as in these cases arch_register_cpu()
> will not have deferred registration when first called.
> 
> This moves the CPU register logic back to a subsys_initcall(),
> while the memory nodes will have been registered earlier.
> Note this is where the call was prior to the cleanup series so
> there should be no side effects of moving it back again for this
> specific case.
> 
> [PATCH 00/21] Initial cleanups for vCPU HP.
> https://lore.kernel.org/all/ZVyz%2FVe5pPu8AWoA@shell.armlinux.org.uk/
> 
> e.g. 5b95f94c3b9f ("x86/topology: Switch over to GENERIC_CPU_DEVICES")
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> v5: Update commit message to make it clear this is moving the
>     init back to where it was until very recently.
> 
>     No longer change the condition in the earlier registration point
>     as that will be handled by the arm64 registration routine
>     deferring until called again here.
> ---
>  drivers/acpi/acpi_processor.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 93e029403d05..c78398cdd060 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -317,6 +317,18 @@ static int acpi_processor_get_info(struct acpi_device *device)
>  
>  	c = &per_cpu(cpu_devices, pr->id);
>  	ACPI_COMPANION_SET(&c->dev, device);
> +	/*
> +	 * Register CPUs that are present. get_cpu_device() is used to skip
> +	 * duplicate CPU descriptions from firmware.
> +	 */
> +	if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> +	    !get_cpu_device(pr->id)) {

Just a quick note to call out that this case of 'duplicate' firmware
description needs an updated comment.  Now that we are not deferring
registration on x86, this check detects that arch_register_cpu()
has already been called successfully and should not be called again.

I've added rather more detailed comments enumerating the paths we
can take to hit acpi_processor_hotadd_init() in the v6 series
(tests ongoing).
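
For anyone following along, a rough userspace model of the deferral plus
the "already registered" guard.  This is a sketch under assumptions, not
the kernel code: interpreter_ready, cpu_registered[] and the model_*
functions are hypothetical stand-ins, the cpu_present() part of the real
condition is left out, and EPROBE_DEFER is defined locally because the
kernel-internal value (517) is not exposed by userspace headers.

```c
#include <assert.h>
#include <stdbool.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno value */
#endif

#define MODEL_NR_CPUS 4

static bool interpreter_ready;			/* models "AML interpreter available" */
static bool cpu_registered[MODEL_NR_CPUS];	/* models get_cpu_device() != NULL */
static int register_calls;			/* counts successful registrations */

/* Models an arch_register_cpu() that defers until _STA can be evaluated. */
static int model_arch_register_cpu(int cpu)
{
	if (!interpreter_ready)
		return -EPROBE_DEFER;
	cpu_registered[cpu] = true;
	register_calls++;
	return 0;
}

/* Models the guarded second attempt in acpi_processor_get_info(). */
static int model_register_if_needed(int cpu)
{
	if (cpu < 0 || cpu >= MODEL_NR_CPUS)
		return 0;	/* invalid logical id: nothing to register here */
	if (cpu_registered[cpu])
		return 0;	/* earlier call already succeeded: do not repeat */
	return model_arch_register_cpu(cpu);
}
```

The early call defers, the later guarded call registers the CPU, and any
further guarded call for the same CPU is a no-op.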

Jonathan


> +		int ret = arch_register_cpu(pr->id);
> +
> +		if (ret)
> +			return ret;
> +	}
> +
>  	/*
>  	 *  Extra Processor objects may be enumerated on MP systems with
>  	 *  less than the max # of CPUs. They should be ignored _iff


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance
  2024-04-15 17:41               ` Rafael J. Wysocki
@ 2024-04-16 17:35                 ` Jonathan Cameron
  0 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-16 17:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: linuxarm, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Russell King,
	Miguel Luis, James Morse, Salil Mehta, Jean-Philippe Brucker,
	Catalin Marinas, Will Deacon, justin.he, jianyong.wu

On Mon, 15 Apr 2024 19:41:43 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Mon, Apr 15, 2024 at 7:35 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Mon, 15 Apr 2024 17:50:57 +0100
> > Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:
> >  
> > > On Mon, 15 Apr 2024 18:19:17 +0200
> > > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> > >  
> > > > On Mon, Apr 15, 2024 at 6:16 PM Rafael J. Wysocki <rafael@kernel.org> wrote:  
> > > > >
> > > > > On Mon, Apr 15, 2024 at 5:49 PM Jonathan Cameron
> > > > > <Jonathan.Cameron@huawei.com> wrote:  
> > > > > >
> > > > > > On Fri, 12 Apr 2024 20:10:54 +0200
> > > > > > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> > > > > >  
> > > > > > > On Fri, Apr 12, 2024 at 4:38 PM Jonathan Cameron
> > > > > > > <Jonathan.Cameron@huawei.com> wrote:  
> > > > > > > >
> > > > > > > > The arm64 specific arch_register_cpu() needs to access the _STA
> > > > > > > > method of the DSDT object so make it available by assigning the
> > > > > > > > appropriate handle to the struct cpu instance.
> > > > > > > >
> > > > > > > > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > > > > > ---
> > > > > > > >  drivers/acpi/acpi_processor.c | 3 +++
> > > > > > > >  1 file changed, 3 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > > > > > > index 7a0dd35d62c9..93e029403d05 100644
> > > > > > > > --- a/drivers/acpi/acpi_processor.c
> > > > > > > > +++ b/drivers/acpi/acpi_processor.c
> > > > > > > > @@ -235,6 +235,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > > > > >         union acpi_object object = { 0 };
> > > > > > > >         struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
> > > > > > > >         struct acpi_processor *pr = acpi_driver_data(device);
> > > > > > > > +       struct cpu *c;
> > > > > > > >         int device_declaration = 0;
> > > > > > > >         acpi_status status = AE_OK;
> > > > > > > >         static int cpu0_initialized;
> > > > > > > > @@ -314,6 +315,8 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > > > > > >                         cpufreq_add_device("acpi-cpufreq");
> > > > > > > >         }
> > > > > > > >
> > > > > > > > +       c = &per_cpu(cpu_devices, pr->id);
> > > > > > > > +       ACPI_COMPANION_SET(&c->dev, device);  
> > > > > > >
> > > > > > > This is also set for per_cpu(cpu_sys_devices, pr->id) in
> > > > > > > acpi_processor_add(), via acpi_bind_one().  
> > > > > >
> > > > > > Hi Rafael,
> > > > > >
> > > > > > cpu_sys_devices gets filled with a pointer to this same structure.
> > > > > > The contents gets set in register_cpu() so at this point
> > > > > > it doesn't point anywhere.  As a side note register_cpu()
> > > > > > memsets to zero the value I set it to in the code above which isn't
> > > > > > great, particularly as I want to use this in post_eject for
> > > > > > arm64.
> > > > > >
> > > > > > We could make a copy of the handle and put it back after
> > > > > > the memset in register_cpu() but that is also ugly.
> > > > > > It's the best I've come up with to make sure this is still set
> > > > > > come remove time but is rather odd.  
> > > > > > >
> > > > > > > Moreover, there is some pr->id validation in acpi_processor_add(), so
> > > > > > > it seems premature to use it here this way.
> > > > > > >
> > > > > > > I think that ACPI_COMPANION_SET() should be called from here on
> > > > > > > per_cpu(cpu_sys_devices, pr->id) after validating pr->id (so the
> > > > > > > pr->id validation should all be done here) and then NULL can be passed
> > > > > > > as acpi_dev to acpi_bind_one() in acpi_processor_add().  Then, there
> > > > > > > will be one physical device corresponding to the processor ACPI device
> > > > > > > and no confusion.  
> > > > > >
> > > > > > I'm fairly sure this is pointing to the same device but agreed this
> > > > > > is a tiny bit confusing. However we can't use cpu_sys_devices at this point
> > > > > > so I'm not immediately seeing a cleaner solution :(  
> > > > >
> > > > > Well, OK.
> > > > >
> > > > > Please at least consider doing the pr->id validation checks before
> > > > > setting the ACPI companion for &per_cpu(cpu_devices, pr->id).
> > > > >
> > > > > Also, acpi_bind_one() needs to be called on the "physical" devices
> > > > > passed to ACPI_COMPANION_SET() (with NULL as the second argument) for
> > > > > the reference counting and physical device lookup to work.
> > > > >
> > > > > Please also note that acpi_primary_dev_companion() should return
> > > > > per_cpu(cpu_sys_devices, pr->id) for the processor ACPI device, which
> > > > > depends on the order of acpi_bind_one() calls involving the same ACPI
> > > > > device.  
> > > >
> > > > Of course, if the value set by ACPI_COMPANION_SET() is cleared
> > > > subsequently, the above is not needed, but then using
> > > > ACPI_COMPANION_SET() is questionable overall.  
> > >
> > > Agreed + smoothing over that by stashing and putting it back doesn't
> > > work because there is an additional call to acpi_bind_one() in between
> > > here and the one you reference.
> > >
> > > The arch_register_cpu() calls end up calling register_cpu() /
> > > device_register() / acpi_device_notify() / acpi_bind_one()
> > >
> > > With current code that fails (silently)  
> 
> And that's why there is an explicit acpi_bind_one() invocation in
> acpi_processor_add().
> 
> > > If I make sure the handle is set before register_cpu() then it
> > > succeeds, but we end up with duplicate sysfs files etc because we
> > > bind twice.  
> 
> Right, I should have recalled that earlier.
> 
> > > I think the only way around this is larger reorganization of the
> > > CPU hotplug code to pull the arch_register_cpu() call to where
> > > the acpi_bind_one() call is.  However that changes a lot more than I'd like
> > > (and I don't have it working yet).  
> 
> I see.
> 
> > > Alternatively find somewhere else to stash the handle, or just add it as
> > > a parameter to arch_register_cpu(). Right now this feels the easier
> > > path to me. arch_register_cpu(int cpu, acpi_handle handle)
> > >
> > > Would that be a path you'd consider?  
> >
> > Another option would be to do the per_cpu(processors, pr->id) = pr
> > a few lines earlier than currently and access that directly from the
> > arch_register_cpu() call.  Similarly remove that reference a bit later and
> > use it in arch_unregister_cpu().
> >
> > This seems like the simplest solution, but I may be missing something.  
> 
> This should work AFAICS, but I'd move the entire piece of code between
> BUG_ON() and setting per_cpu(processors, pr->id) inclusive:

Hi Rafael,

Unfortunately this is more complex on x86 than I realized :(

On x86 the initial pr->id is invalid, which is one of the conditions
that leads to acpi_processor_hotadd_init() being called.
It only becomes valid after acpi_map_cpu() in acpi_processor_hotadd_init().

So the best I can immediately come up with is to factor out these checks and the
setting of the per_cpu structures and set them either in acpi_processor_hotadd_init()
or in an else for the non-hotplug / normal registration path (where the pr->id is valid).

Naturally found this on my final set of tests...

A little ugly but not 'too bad'. 
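
Roughly the shape I have in mind, as a userspace sketch only.  All names
here (sketch_set_per_cpu() and friends) are hypothetical, the array stands
in for per_cpu(processors, ...), and assigning mapped_id stands in for
acpi_map_cpu() filling in pr->id.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4
#define SKETCH_INVALID_ID (-1)

struct sketch_processor { int id; };

static struct sketch_processor *sketch_processors[SKETCH_NR_CPUS];

/* Hypothetical helper: the factored-out checks and per-CPU bookkeeping. */
static int sketch_set_per_cpu(struct sketch_processor *pr)
{
	if (pr->id < 0 || pr->id >= SKETCH_NR_CPUS)
		return -1;
	if (sketch_processors[pr->id] != NULL &&
	    sketch_processors[pr->id] != pr)
		return -1;	/* id already claimed by another processor */
	sketch_processors[pr->id] = pr;
	return 0;
}

/* Models the hotplug path: the id becomes valid only after mapping. */
static int sketch_hotadd_init(struct sketch_processor *pr, int mapped_id)
{
	pr->id = mapped_id;	/* stands in for acpi_map_cpu() */
	return sketch_set_per_cpu(pr);
}

/* Models the caller choosing between the hotplug and normal paths. */
static int sketch_get_info(struct sketch_processor *pr, int mapped_id)
{
	if (pr->id == SKETCH_INVALID_ID)
		return sketch_hotadd_init(pr, mapped_id);
	else
		return sketch_set_per_cpu(pr);	/* id was valid from the start */
}
```

Either way the per-CPU pointer is only set once pr->id is known to be
valid, which is the property the arch code would rely on.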

Jonathan
p.s. No one minds if I break x86, right?





> 
>     BUG_ON(pr->id >= nr_cpu_ids);
> 
>     /*
>      * Buggy BIOS check.
>      * ACPI id of processors can be reported wrongly by the BIOS.
>      * Don't trust it blindly
>      */
>     if (per_cpu(processor_device_array, pr->id) != NULL &&
>         per_cpu(processor_device_array, pr->id) != device) {
>         dev_warn(&device->dev,
>             "BIOS reported wrong ACPI id %d for the processor\n",
>             pr->id);
>         /* Give up, but do not abort the namespace scan. */
>         goto err;
>     }
>     /*
>      * processor_device_array is not cleared on errors to allow buggy BIOS
>      * checks.
>      */
>     per_cpu(processor_device_array, pr->id) = device;
>     per_cpu(processors, pr->id) = pr;
> 
> into acpi_processor_get_info(), right after the point where pr->id is set.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 12:23                     ` Jonathan Cameron
@ 2024-04-16 17:41                       ` Jonathan Cameron
  2024-04-16 19:02                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-16 17:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Russell King (Oracle),
	linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Miguel Luis, James Morse,
	Salil Mehta, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	linuxarm, justin.he, jianyong.wu

On Mon, 15 Apr 2024 13:23:51 +0100
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Mon, 15 Apr 2024 14:04:26 +0200
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> 
> > On Mon, Apr 15, 2024 at 1:56 PM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:  
> > >
> > > On Mon, 15 Apr 2024 13:37:08 +0200
> > > "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> > >    
> > > > On Mon, Apr 15, 2024 at 10:46 AM Jonathan Cameron
> > > > <Jonathan.Cameron@huawei.com> wrote:    
> > > > >
> > > > > On Sat, 13 Apr 2024 01:23:48 +0200
> > > > > Thomas Gleixner <tglx@linutronix.de> wrote:
> > > > >    
> > > > > > Russell!
> > > > > >
> > > > > > On Fri, Apr 12 2024 at 22:52, Russell King (Oracle) wrote:    
> > > > > > > On Fri, Apr 12, 2024 at 10:54:32PM +0200, Thomas Gleixner wrote:    
> > > > > > >> > As for the cpu locking, I couldn't find anything in arch_register_cpu()
> > > > > > >> > that depends on the cpu_maps_update stuff nor needs the cpus_write_lock
> > > > > > >> > being taken - so I've no idea why the "make_present" case takes these
> > > > > > >> > locks.    
> > > > > > >>
> > > > > > >> Anything which updates a CPU mask, e.g. cpu_present_mask, after early
> > > > > > >> boot must hold the appropriate write locks. Otherwise it would be
> > > > > > >> possible to online a CPU which just got marked present, but the
> > > > > > >> registration has not completed yet.    
> > > > > > >
> > > > > > > Yes. As far as I've been able to determine, arch_register_cpu()
> > > > > > > doesn't manipulate any of the CPU masks. All it seems to be doing
> > > > > > > is initialising the struct cpu, registering the embedded struct
> > > > > > > device, and setting up the sysfs links to its NUMA node.
> > > > > > >
> > > > > > > There is nothing obvious in there which manipulates any CPU masks, and
> > > > > > > this is rather my fundamental point when I said "I couldn't find
> > > > > > > anything in arch_register_cpu() that depends on ...".
> > > > > > >
> > > > > > > If there is something, then comments in the code would be a useful aid
> > > > > > > because it's highly non-obvious where such a manipulation is located,
> > > > > > > and hence why the locks are necessary.    
> > > > > >
> > > > > > acpi_processor_hotadd_init()
> > > > > > ...
> > > > > >          acpi_map_cpu(pr->handle, pr->phys_id, pr->acpi_id, &pr->id);
> > > > > >
> > > > > > That ends up in fiddling with cpu_present_mask.
> > > > > >
> > > > > > I grant you that arch_register_cpu() is not, but it might rely on the
> > > > > > external locking too. I could not be bothered to figure that out.
> > > > > >    
> > > > > > >> Define "real hotplug" :)
> > > > > > >>
> > > > > > >> Real physical hotplug does not really exist. That's at least true for
> > > > > > >> x86, where the physical hotplug support was chased for a while, but
> > > > > > >> never ended up in production.
> > > > > > >>
> > > > > > >> Though virtualization happily jumped on it to hot add/remove CPUs
> > > > > > >> to/from a guest.
> > > > > > >>
> > > > > > >> There are limitations to this and we learned it the hard way on X86. At
> > > > > > >> the end we came up with the following restrictions:
> > > > > > >>
> > > > > > >>     1) All possible CPUs have to be advertised at boot time via firmware
> > > > > > >>        (ACPI/DT/whatever) independent of them being present at boot time
> > > > > > >>        or not.
> > > > > > >>
> > > > > > >>        That guarantees proper sizing and ensures that associations
> > > > > > >>        between hardware entities and software representations and the
> > > > > > >>        resulting topology are stable for the lifetime of a system.
> > > > > > >>
> > > > > > >>        It is really required to know the full topology of the system at
> > > > > > >>        boot time especially with hybrid CPUs where some of the cores
> > > > > > >>        have hyperthreading and the others do not.
> > > > > > >>
> > > > > > >>
> > > > > > >>     2) Hot add can only mark an already registered (possible) CPU
> > > > > > >>        present. Adding non-registered CPUs after boot is not possible.
> > > > > > >>
> > > > > > >>        The CPU must have been registered in #1 already to ensure that
> > > > > > >>        the system topology does not suddenly change in an incompatible
> > > > > > >>        way at run-time.
> > > > > > >>
> > > > > > >> The same restriction would apply to real physical hotplug. I don't think
> > > > > > >> that's any different for ARM64 or any other architecture.    
> > > > > > >
> > > > > > > This makes me wonder whether the Arm64 has been barking up the wrong
> > > > > > > tree then, and whether the whole "present" vs "enabled" thing comes
> > > > > > > from a misunderstanding as far as a CPU goes.
> > > > > > >
> > > > > > > However, there is a big difference between the two. On x86, a processor
> > > > > > > is just a processor. On Arm64, a "processor" is a slice of the system
> > > > > > > (includes the interrupt controller, PMUs etc) and we must enumerate
> > > > > > > those even when the processor itself is not enabled. This is the whole
> > > > > > > reason there's a difference between "present" and "enabled" and why
> > > > > > > there's a difference between x86 cpu hotplug and arm64 cpu hotplug.
> > > > > > > The processor never actually goes away in arm64, it's just prevented
> > > > > > > from being used.    
> > > > > >
> > > > > > It's the same on X86 at least in the physical world.    
> > > > >
> > > > > There were public calls on this via the Linaro Open Discussions group,
> > > > > so I can talk a little about how we ended up here.  Note that (in my
> > > > > opinion) there is zero chance of this changing - it took us well over
> > > > > a year to get to this conclusion.  So if we ever want ARM vCPU HP
> > > > > we need to work within these constraints.
> > > > >
> > > > > The ARM architecture folk (the ones defining the ARM ARM, relevant ACPI
> > > > > specs etc, not the kernel maintainers) are determined that they want
> > > > > to retain the option to do real physical CPU hotplug in the future
> > > > > with all the necessary work around dynamic interrupt controller
> > > > > initialization, debug and many other messy corners.    
> > > >
> > > > That's OK, but the difference is not in the ACPI CPU enumeration/removal code.
> > > >    
> > > > > Thus anything defined had to be structured in a way that was 'different'
> > > > > from that.    
> > > >
> > > > Apparently, that's where things got confused.
> > > >    
> > > > > I don't mind the proposed flattening of the 2 paths if the ARM kernel
> > > > > maintainers are fine with it but it will remove the distinctions and
> > > > > we will need to be very careful with the CPU masks - we can't handle
> > > > > them the same as x86 does.    
> > > >
> > > > At the ACPI code level, there is no distinction.
> > > >
> > > > A CPU that was not available before has just become available.  The
> > > > platform firmware has notified the kernel about it and now
> > > > acpi_processor_add() runs.  Why would it need to use different code
> > > > paths depending on what _STA bits were clear before?    
> > >
> > > I think we will continue to disagree on this.  To my mind and from the
> > > ACPI specification, they are two different state transitions with different
> > > required actions.    
> > 
> > Well, please be specific: What exactly do you mean here and which
> > parts of the spec are you talking about?  
> 
> Given we are moving on with your suggestion, let's leave this for now - too many
> other things to do! :)
> 
> >   
> > > Those state transitions are an ACPI level thing not
> > > an arch level one.  However, I want a solution that moves things forwards
> > > so I'll give pushing that entirely into the arch code a try.    
> > 
> > Thanks!
> > 
> > Though I think that there is a disconnect between us that needs to be
> > clarified first.  
> 
> I'm fine with accepting your approach if it works and is acceptable
> to the arm kernel folk. They are getting a non trivial arch_register_cpu()
> with a bunch of ACPI specific handling in it that may come as a surprise.
> 
> >   
> > > >
> > > > Yes, there is some arch stuff to be called and that arch stuff should
> > > > figure out what to do to make things actually work.
> > > >    
> > > > > I'll get on with doing that, but do need input from Will / Catalin / James.
> > > > > There are some quirks that need calling out as it's not quite as simple
> > > > > as it appears from a high level.
> > > > >
> > > > > Another part of that long discussion established that there is userspace
> > > > > (Android IIRC) in which the CPU present mask must include all CPUs
> > > > > at boot. To change that would be userspace ABI breakage so we can't
> > > > > do that.  Hence the dance around adding yet another mask to allow the
> > > > > OS to understand which CPUs are 'present' but not possible to online.
> > > > >
> > > > > Flattening the two paths removes any distinction between calls that
> > > > > are for real hotplug and those that are for this online capable path.    
> > > >
> > > > Which calls exactly do you mean?    
> > >
> > > At the moment he distinction does not exist (because x86 only supports
> > > fake physical CPU HP and arm64 only vCPU HP / online capable), but if
> > > the architecture is defined for arm64 physical hotplug in the future
> > > we would need to do interrupt controller bring up + a lot of other stuff.
> > >
> > > It may be possible to do that in the arch code - will be hard to verify
> > > that until that arch is defined  Today all I need to do is ensure that
> > > any attempt to do present bit setting for ARM64 returns an error.
> > > That looks to be straight forward.    
> > 
> > OK
> >   
> > >    
> > > >    
> > > > > As a side note, the indicating bit for these flows is defined in ACPI
> > > > > for x86 from ACPI 6.3 as a flag in Processor Local APIC
> > > > > (the ARM64 definition is a cut and paste of that text).  So someone
> > > > > is interested in this distinction on x86. I can't say who but if
> > > > > you have a mantis account you can easily follow the history and it
> > > > > might be instructive to not everyone considering the current x86
> > > > > flow the right way to do it.    
> > > >
> > > > So a physically absent processor is different from a physically
> > > > present processor that has not been disabled.  No doubt about this.
> > > >
> > > > That said, I'm still unsure why these two cases require two different
> > > > code paths in acpi_processor_add().    
> > >
> > > It might be possible to push the checking down into arch_register_cpu()
> > > and have that for now reject any attempt to do physical CPU HP on arm64.
> > > It is that gate that is vital to getting this accepted by ARM.
> > >
> > > I'm still very much stuck on the hotadd_init flag however, so any suggestions
> > > on that would be very welcome!    
> > 
> > I need to do some investigation which will take some time I suppose.  
> 
> I'll do so as well once I've gotten the rest sorted out.  That whole
> structure seems overly complex and liable to race, though maybe sufficient
> locking happens to be held that it's not a problem.

Back to this, a (maybe) last outstanding problem.

Superficially I think we might be able to get around this by always
doing the setup in the initial online. In brief that looks something like
the code below, relying on the cpu hotplug callback registration calling
acpi_soft_cpu_online() for all instances that are already online.

Very lightly tested on arm64 and x86 with cold and hotplugged CPUs.
However this is all in emulation and I don't have access to any significant
x86 test farms :( So help will be needed if it's not immediately obvious why
we can't do this.

Of course, I'm open to other suggestions!

For now I'll put a tidied version of this one in as an RFC with the rest of v6.

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 06e718b650e5..97ca53b516d0 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -340,7 +340,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
         */
        per_cpu(processor_device_array, pr->id) = device;
        per_cpu(processors, pr->id) = pr;
-
+       pr->flags.need_hotplug_init = 1;
        /*
         *  Extra Processor objects may be enumerated on MP systems with
         *  less than the max # of CPUs. They should be ignored _iff
diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
index 67db60eda370..930f911fc435 100644
--- a/drivers/acpi/processor_driver.c
+++ b/drivers/acpi/processor_driver.c
@@ -206,7 +206,7 @@ static int acpi_processor_start(struct device *dev)

        /* Protect against concurrent CPU hotplug operations */
        cpu_hotplug_disable();
-       ret = __acpi_processor_start(device);
+       //      ret = __acpi_processor_start(device);
        cpu_hotplug_enable();
        return ret;
 }
@@ -279,7 +279,7 @@ static int __init acpi_processor_driver_init(void)
        if (result < 0)
                return result;

-       result = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+       result = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
                                           "acpi/cpu-drv:online",
                                           acpi_soft_cpu_online, NULL);
        if (result < 0)
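
To make the point of that last hunk concrete: cpuhp_setup_state() invokes
the startup callback for CPUs that are already online when the state is
registered, whereas cpuhp_setup_state_nocalls() only installs it for later
online events. The following is a userspace toy model of that difference,
not kernel code. Every toy_* name is made up for illustration, and the
error handling is simplified (the real code rolls back more carefully).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4

typedef int (*toy_online_fn)(unsigned int cpu);

static bool toy_cpu_online[TOY_NR_CPUS];	/* hypothetical online mask */
static toy_online_fn toy_online_cb;		/* installed online callback */
static unsigned int toy_init_count[TOY_NR_CPUS];

/* Models the _nocalls variant: install the callback, call nothing now. */
static int toy_setup_state_nocalls(toy_online_fn fn)
{
	toy_online_cb = fn;
	return 0;
}

/* Models the plain variant: install, then run it for CPUs already online. */
static int toy_setup_state(toy_online_fn fn)
{
	unsigned int cpu;
	int ret;

	toy_online_cb = fn;
	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!toy_cpu_online[cpu])
			continue;
		ret = fn(cpu);
		if (ret) {
			/* Simplification: just uninstall on failure. */
			toy_online_cb = NULL;
			return ret;
		}
	}
	return 0;
}

/* Models a later hotplug online of one CPU. */
static int toy_cpu_up(unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	toy_cpu_online[cpu] = true;
	return toy_online_cb ? toy_online_cb(cpu) : 0;
}

/* Stand-in for acpi_soft_cpu_online(): counts per-CPU setup runs. */
static int toy_soft_cpu_online(unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	toy_init_count[cpu]++;
	return 0;
}
```

In this model, cold-plugged CPUs get their setup at registration time and
hotplugged ones get it from toy_cpu_up(), so both go through the same
callback.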
> 
> Jonathan
> 


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-16 17:41                       ` Jonathan Cameron
@ 2024-04-16 19:02                         ` Rafael J. Wysocki
  2024-04-17 10:39                           ` Jonathan Cameron
  0 siblings, 1 reply; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-16 19:02 UTC (permalink / raw)
  To: Jonathan Cameron
  Cc: Rafael J. Wysocki, Thomas Gleixner, Russell King (Oracle),
	linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Miguel Luis, James Morse,
	Salil Mehta, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	linuxarm, justin.he, jianyong.wu

On Tue, Apr 16, 2024 at 7:41 PM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 15 Apr 2024 13:23:51 +0100
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>
> > On Mon, 15 Apr 2024 14:04:26 +0200
> > "Rafael J. Wysocki" <rafael@kernel.org> wrote:

[cut]

> > > > I'm still very much stuck on the hotadd_init flag however, so any suggestions
> > > > on that would be very welcome!
> > >
> > > I need to do some investigation which will take some time I suppose.
> >
> > I'll do so as well once I've gotten the rest sorted out.  That whole
> > structure seems overly complex and liable to race, though maybe sufficient
> > locking happens to be held that it's not a problem.
>
> Back to this a (maybe) last outstanding problem.
>
> Superficially I think we might be able to get around this by always
> doing the setup in the initial online. In brief that looks something the
> below code.  Relying on the cpu hotplug callback registration calling
> the acpi_soft_cpu_online for all instances that are already online.
>
> Very lightly tested on arm64 and x86 with cold and hotplugged CPUs.
> However this is all in emulation and I don't have access to any significant
> x86 test farms :( So help will be needed if it's not immediately obvious why
> we can't do this.

AFAICS, this should work.  At least I don't see why it wouldn't.

> Of course, I'm open to other suggestions!
>
> For now I'll put a tidied version of this one is as an RFC with the rest of v6.
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 06e718b650e5..97ca53b516d0 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -340,7 +340,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
>          */
>         per_cpu(processor_device_array, pr->id) = device;
>         per_cpu(processors, pr->id) = pr;
> -
> +       pr->flags.need_hotplug_init = 1;
>         /*
>          *  Extra Processor objects may be enumerated on MP systems with
>          *  less than the max # of CPUs. They should be ignored _iff
> diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
> index 67db60eda370..930f911fc435 100644
> --- a/drivers/acpi/processor_driver.c
> +++ b/drivers/acpi/processor_driver.c
> @@ -206,7 +206,7 @@ static int acpi_processor_start(struct device *dev)
>
>         /* Protect against concurrent CPU hotplug operations */
>         cpu_hotplug_disable();
> -       ret = __acpi_processor_start(device);
> +       //      ret = __acpi_processor_start(device);
>         cpu_hotplug_enable();
>         return ret;
>  }

So it looks like acpi_processor_start() is not necessary any more, is it?

> @@ -279,7 +279,7 @@ static int __init acpi_processor_driver_init(void)
>         if (result < 0)
>                 return result;
>
> -       result = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> +       result = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
>                                            "acpi/cpu-drv:online",
>                                            acpi_soft_cpu_online, NULL);
>         if (result < 0)
> >
> > Jonathan

Thanks!

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-16 19:02                         ` Rafael J. Wysocki
@ 2024-04-17 10:39                           ` Jonathan Cameron
  0 siblings, 0 replies; 58+ messages in thread
From: Jonathan Cameron @ 2024-04-17 10:39 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Russell King (Oracle),
	linux-pm, loongarch, linux-acpi, linux-arch, linux-kernel,
	linux-arm-kernel, kvmarm, x86, Miguel Luis, James Morse,
	Salil Mehta, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	linuxarm, justin.he, jianyong.wu

On Tue, 16 Apr 2024 21:02:02 +0200
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Tue, Apr 16, 2024 at 7:41 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Mon, 15 Apr 2024 13:23:51 +0100
> > Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> >  
> > > On Mon, 15 Apr 2024 14:04:26 +0200
> > > "Rafael J. Wysocki" <rafael@kernel.org> wrote:  
> 
> [cut]
> 
> > > > > I'm still very much stuck on the hotadd_init flag however, so any suggestions
> > > > > on that would be very welcome!  
> > > >
> > > > I need to do some investigation which will take some time I suppose.  
> > >
> > > I'll do so as well once I've gotten the rest sorted out.  That whole
> > > structure seems overly complex and liable to race, though maybe sufficient
> > > locking happens to be held that it's not a problem.  
> >
> > Back to this a (maybe) last outstanding problem.
> >
> > Superficially I think we might be able to get around this by always
> > doing the setup in the initial online. In brief that looks something the
> > below code.  Relying on the cpu hotplug callback registration calling
> > the acpi_soft_cpu_online for all instances that are already online.
> >
> > Very lightly tested on arm64 and x86 with cold and hotplugged CPUs.
> > However this is all in emulation and I don't have access to any significant
> > x86 test farms :( So help will be needed if it's not immediately obvious why
> > we can't do this.  
> 
> AFAICS, this should work.  At least I don't see why it wouldn't.
> 
> > Of course, I'm open to other suggestions!
> >
> > For now I'll put a tidied version of this one is as an RFC with the rest of v6.
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 06e718b650e5..97ca53b516d0 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -340,7 +340,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> >          */
> >         per_cpu(processor_device_array, pr->id) = device;
> >         per_cpu(processors, pr->id) = pr;
> > -
> > +       pr->flags.need_hotplug_init = 1;
> >         /*
> >          *  Extra Processor objects may be enumerated on MP systems with
> >          *  less than the max # of CPUs. They should be ignored _iff
> > diff --git a/drivers/acpi/processor_driver.c b/drivers/acpi/processor_driver.c
> > index 67db60eda370..930f911fc435 100644
> > --- a/drivers/acpi/processor_driver.c
> > +++ b/drivers/acpi/processor_driver.c
> > @@ -206,7 +206,7 @@ static int acpi_processor_start(struct device *dev)
> >
> >         /* Protect against concurrent CPU hotplug operations */
> >         cpu_hotplug_disable();
> > -       ret = __acpi_processor_start(device);
> > +       //      ret = __acpi_processor_start(device);
> >         cpu_hotplug_enable();
> >         return ret;
> >  }  
> 
> So it looks like acpi_processor_start() is not necessary any more, is it?

Absolutely.  This needs cleaning up beyond this hack.

Given pr has been initialized to 0, flipping the flag to be something
like 'initialized' and having the driver set it on first online rather than
in acpi_processor.c will clean it up further.
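
Roughly what I have in mind, as a userspace sketch rather than the real
driver. The struct, the flag name 'initialized' and the toy_* helpers are
placeholders, and what a repeat online actually needs to do is only
assumed here:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down stand-in for struct acpi_processor. */
struct toy_processor {
	struct {
		unsigned int initialized : 1;	/* zero after a zeroed alloc */
	} flags;
	unsigned int start_calls;	/* times the one-time setup ran */
	unsigned int reeval_calls;	/* times a repeat online was handled */
};

/* Stand-in for the one-time setup done on the first online. */
static int toy_processor_start(struct toy_processor *pr)
{
	pr->start_calls++;
	return 0;
}

/* Stand-in for whatever a repeat online needs to refresh. */
static void toy_processor_reevaluate(struct toy_processor *pr)
{
	pr->reeval_calls++;
}

/* Models the online callback: the driver owns and sets the flag. */
static int toy_soft_cpu_online(struct toy_processor *pr)
{
	int ret;

	assert(pr != NULL);

	if (!pr->flags.initialized) {
		ret = toy_processor_start(pr);
		if (ret)
			return ret;
		/* Only mark initialized once the setup succeeded. */
		pr->flags.initialized = 1;
		return 0;
	}

	toy_processor_reevaluate(pr);
	return 0;
}
```

Because the flag starts out clear, nothing outside the driver has to
touch it, which is the clean-up compared with setting need_hotplug_init
in acpi_processor.c.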

Jonathan
> 
> > @@ -279,7 +279,7 @@ static int __init acpi_processor_driver_init(void)
> >         if (result < 0)
> >                 return result;
> >
> > -       result = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> > +       result = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> >                                            "acpi/cpu-drv:online",
> >                                            acpi_soft_cpu_online, NULL);
> >         if (result < 0)  
> > >
> > > Jonathan  
> 
> Thanks!


^ permalink raw reply	[flat|nested] 58+ messages in thread

* RE: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-15 16:38               ` Rafael J. Wysocki
@ 2024-04-17 15:01                 ` Salil Mehta
  2024-04-17 16:19                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 58+ messages in thread
From: Salil Mehta @ 2024-04-17 15:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Russell King (Oracle),
	Jonathan Cameron, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	Linuxarm, justin.he, jianyong.wu

Hi Rafael,

>  From: Rafael J. Wysocki <rafael@kernel.org>
>  Sent: Monday, April 15, 2024 5:39 PM
>  
>  On Mon, Apr 15, 2024 at 5:31 PM Salil Mehta <salil.mehta@huawei.com>  wrote:
>  >
>  > >  From: Rafael J. Wysocki <rafael@kernel.org>
>  > >  Sent: Monday, April 15, 2024 1:51 PM
>  > >
>  > >  On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta
>  > > <salil.mehta@huawei.com>
>  > >  wrote:
>  > >  >
>  
>  [cut]
>  
>  > >  > >  Though virtualization happily jumped on it to hot add/remove
>  > > CPUs  > > to/from  a guest.
>  > >  > >
>  > >  > >  There are limitations to this and we learned it the hard way
>  > > on  > > X86. At the  end we came up with the following restrictions:
>  > >  > >
>  > >  > >      1) All possible CPUs have to be advertised at boot time via  firmware
>  > >  > >         (ACPI/DT/whatever) independent of them being present at boot time
>  > >  > >         or not.
>  > >  > >
>  > >  > >         That guarantees proper sizing and ensures that associations
>  > >  > >         between hardware entities and software representations and the
>  > >  > >         resulting topology are stable for the lifetime of a system.
>  > >  > >
>  > >  > >         It is really required to know the full topology of the system at
>  > >  > >         boot time especially with hybrid CPUs where some of the cores
>  > >  > >         have hyperthreading and the others do not.
>  > >  > >
>  > >  > >
>  > >  > >      2) Hot add can only mark an already registered (possible) CPU
>  > >  > >         present. Adding non-registered CPUs after boot is not possible.
>  > >  > >
>  > >  > >         The CPU must have been registered in #1 already to ensure that
>  > >  > >         the system topology does not suddenly change in an incompatible
>  > >  > >         way at run-time.
>  > >  > >
>  > >  > >  The same restriction would apply to real physical hotplug. I
>  > > don't  > > think that's  any different for ARM64 or any other architecture.
>  > >  >
>  > >  >
>  > >  > There is a difference:
>  > >  >
>  > >  > 1.   ARM arch does not allows for any processor to be NOT present. Hence,
>  > >  > because of this restriction any of its related per-cpu components must be
>  > >  > present and enumerated at the boot time as well (exposed by firmware and
>  > >  > ACPI). This means all the enumerated processors will be marked as
>  > >  > 'present' but they might exist in NOT enabled (_STA.enabled=0) state.
>  > >  >
>  > >  > There was one clear difference and please correct me if I'm wrong here,
>  > >  > for x86, the LAPIC associated with the x86 core can be brought online
>  > >  > later even after boot?
>  > >  >
>  > >  > But for ARM Arch, processors and its corresponding per-cpu components
>  > >  > like redistributors all need to be present and enumerated during the
>  > >  > boot time. Redistributors are part of ALWAYS-ON power domain.
>  > >
>  > >  OK
>  > >
>  > >  So what exactly is the problem with this and what does
>  > >  acpi_processor_add() have to do with it?
>  >
>  >
>  > For ARM Arch, during boot time, it should add processor as if no
>  > hotplug exists. But later, in context to the (fake) hotplug trigger
>  > from the virtualizer as a result of the CPU (un)plug action  it should just
>  end up in registering the already present CPU with the Linux Driver Model.
>  
>  So let me repeat this last time: acpi_processor_add() cannot do that,
>  because (as defined today) it rejects CPUs with the "enabled" bit clear in  _STA.


I understand that now because you have placed a check recently. Sorry for stretching
it a bit, but I wanted to clearly understand the reason for this behavior. Is it because:

1. It does not make sense to add a disabled but present/functional processor, or
    perhaps there are repercussions to supporting such a behavior?

Or

2. None of the existing processors need such a behavior?



>  > >  Do you want the per-CPU structures etc. to be created from the
>  > >  acpi_processor_add() path?
>  >
>  >
>  > I referred to the components related to ARM CPU Arch like redistributors etc.
>  > which will get initialized in context to Arch specific _init code not
>  > here. This i.e. acpi_processor_add() is arch agnostic code common to all architectures.
>  >
>  > [ A digression: You do have _weak functions which can be overridden to
>  > arch specific  handling like  arch_(un)map_cpu() etc. but we can't use
>  > those to defer initialize  the CPU related components - ARM Arch
>  > constraint!]
>  
>  Not right now, but they can be added I suppose.
>  
>  >
>  > >
>  > >  This plain won't work because acpi_processor_add(), as defined in
>  > > the  mainline kernel today (and the Jonathan's patches don't change
>  > > that  AFAICS), returns an error for processor devices with the
>  > > "enabled" bit clear  in _STA (it returns an error if the "present"
>  > > bit is clear too, but that's  obvious), so it only gets to calling
>  > > arch_register_cpu() if
>  > >  *both* "present" and "enabled" _STA bits are set for the given
>  > > processor  device.
>  >
>  >
>  > If you are referring to the _STA check in the XX_hot_add_init() then
>  > in the current kernel code it only checks for the
>  > ACPI_STA_DEVICE_PRESENT flag and not the ACPI_STA_DEVICE_ENABLED flag(?).
>  
>  No, I am not.  I'm referring to this code in 6.9-rc4:
>  
>  static int acpi_processor_add(struct acpi_device *device,
>                      const struct acpi_device_id *id) {
>      struct acpi_processor *pr;
>      struct device *dev;
>      int result = 0;
>  
>      if (!acpi_device_is_enabled(device))
>          return -ENODEV;


Ahh, sorry, I missed this check as this has been added recently. Yes, now your
logic of having common legs makes more sense.


>  
>      ...
>  }
>  
>  where acpi_device_is_enabled() is defined as follows:
>  
>  bool acpi_device_is_enabled(const struct acpi_device *adev) {
>      return adev->status.present && adev->status.enabled; }


Got it. 


[digression note:]
BTW, I'm wondering why we are checking adev->status.present,
as having adev->status.enabled set and adev->status.present
unset would mean the firmware has a BUG. If we really want to
check this, shouldn't we rather flag a warning on detection
of this condition?


Either this:
bool acpi_device_is_enabled(const struct acpi_device *adev) {

     if (!acpi_device_is_present(adev)) {
            if (adev->status.enabled)
                       pr_debug("Device [%s] status inconsistent: Enabled but not Present\n",
                                          adev->pnp.bus_id);
            return false;
     }
     return adev->status.enabled;
}

Or, ideally, this inconsistency would be checked in acpi_bus_get_status()
and the above function would be just:

file: drivers/acpi/scan.c
bool acpi_device_is_enabled(const struct acpi_device *adev) {
      return !!adev->status.enabled; }


file: drivers/acpi/bus.c
int acpi_bus_get_status(struct acpi_device *device)
{
       [...]

	status = acpi_bus_get_status_handle(device->handle, &sta);
	if (ACPI_FAILURE(status))
		return -ENODEV;

	acpi_set_device_status(device, sta);

	if (device->status.functional && !device->status.present) {
		pr_debug("Device [%s] status [%08x]: functional but not present\n",
			 device->pnp.bus_id, (u32)sta);
	}

+	if (device->status.enabled && !device->status.present) {
+		pr_debug("BUG: Device [%s] status [%08x]: enabled but not present\n",
+			 device->pnp.bus_id, (u32)sta);
+                         /* any specific handling here? */
+	}

	pr_debug("Device [%s] status [%08x]\n", device->pnp.bus_id, (u32)sta);
	return 0;
}
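
To spell out the combinations I mean, here is a small standalone sketch
(userspace, not kernel code). The two bit values follow the ACPI _STA
definition (bit 0 present, bit 1 enabled); the toy_* names are made up
for illustration only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* _STA bit positions as defined by the ACPI specification. */
#define TOY_STA_PRESENT	0x01u
#define TOY_STA_ENABLED	0x02u

/* Hypothetical cut-down stand-in for the status bits of acpi_device. */
struct toy_status {
	bool present;
	bool enabled;
};

/* Decode a raw _STA value into the two bits discussed here. */
static struct toy_status toy_decode_sta(uint32_t sta)
{
	struct toy_status st = {
		.present = (sta & TOY_STA_PRESENT) != 0,
		.enabled = (sta & TOY_STA_ENABLED) != 0,
	};

	return st;
}

/* Enabled without present is what I am calling a firmware BUG. */
static bool toy_sta_inconsistent(const struct toy_status *st)
{
	assert(st != NULL);
	return st->enabled && !st->present;
}

/* Mirrors the current check: both bits must be set. */
static bool toy_device_is_enabled(const struct toy_status *st)
{
	assert(st != NULL);
	return st->present && st->enabled;
}
```

With the inconsistency flagged at decode time, the present test in the
enabled check only matters for the one combination that should never be
reported by firmware.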

>  
>  > The code being reviewed has changed
>  > exactly that behavior for 2 legs i.e. make-present and make-enabled legs.
>  
>  I'm not sure what you mean here, but the code above means that
>  acpi_processor_add) does not distinguish between CPU with the "present"
>  bit clear (in which case the "enabled" bit must also be clear as per the spec)
>  and CPUs with the "present" bit set and the "enabled" bit clear.  These two
>  cases are handled in the same way.
>  
>  > I'm open to further address your point clearly.
>  
>  I hope that the above is clear enough.


Yes, clear now. I missed the new check.

>  
>  > >
>  > >  That, BTW, is why I keep saying that from the ACPI CPU enumeration
>  > > code  perspective, there is no difference between "present" and
>  "enabled".
>  >
>  >
>  > Agreed but there is still a subtle difference.  Enumeration happens
>  > once and for all the processors during the boot time. And as confirmed
>  > by yourself and Thomas as well that even in x86 arch all the
>  > processors will be discovered and their topology fixed during the boot
>  > time which is effectively the same behavior as in the ARM Arch. But
>  > ARM assumes those 'present' bits in the present masks to be set during
>  > the boot time which is not like x86(?).  Hence, 'present cpu' Bits
>  > will always be equal to 'possible cpu' Bits. This is a constraint put by the
>  ARM maintainers and looks unshakable.
>  
>  Yes, there are differences between architectures, but the ACPI code is (or
>  at least should be) architecture-agnostic (as you said somewhere above).
>  So why does this matter for the ACPI code?


It should not. There were a few bits, like overriding arch_register_cpu(), which
were not allowed by the ARM folks in 2020 when I floated the first prototype.


>  > >  > 2.  Agreed regarding the topology. Are you suggesting that we
>  > > must  > call arch_register_cpu() during boot time for all the 'present'  CPUs?
>  > >  > Even if that's the case, we might still want to defer
>  > > registration of  > the cpu device (register_cpu() API) with the
>  > > Linux device model. Later  > is what we are doing to hide/unhide the
>  > > CPUs from the user while  STA.Enabled Bit is toggled due to CPU  (un)plug action.
>  > >
>  > >  There are two ways to approach this IMV, and both seem to be valid
>  > > in  principle.
>  > >
>  > >  One would be to treat CPUs with the "enabled" bit clear as not
>  > > present and  create all of the requisite data structures for them
>  > > when they become  available (in analogy with the "real hot-add" case).
>  >
>  >
>  > Right. This one is out-of-scope for ARM Arch as we cannot defer any
>  > Arch specific sizing and initializations after boot i.e. when
>  > processor_add() gets called again later as a trigger of CPU plug action at the Virtualizer.
>  >
>  >
>  > >
>  > >  The alternative one is to create all of the requisite data
>  > > structures for the  CPUs that you find during boot, but register CPU
>  > > devices for those having  the "enabled" _STA bit set.
>  >
>  >
>  > Correct. and we defer the registration for CPUs with online-capable
>  > Bit set in the ACPI MADT/GICC data structure. These CPUs basically
>  > form set of hot-pluggable CPUs on ARM.
>  >
>  >
>  > >
>  > >  It looks like you have chosen the second approach, which is fine
>  > > with me  (although personally, I would rather choose the first one),
>  > > but then your  arch code needs to arrange for the requisite CPU data
>  > > structures etc. to be  set up before acpi_processor_add() gets
>  > > called because, as per the above,  the latter just rejects CPUs with the  "enabled" _STA bit clear.
>  >
>  > Yes, correct. First one is not possible - at least for now and to have
>  > that it will require lot of rework especially at GIC. But there are
>  > many other arch components (like timers, PMUs, etc.) whose behavior
>  > needs to be specified somewhere in the architecture as well. All these are closely coupled with the ARM CPU architecture.
>  > (it's beyond this discussion and lets leave that to ARM folks).
>  >
>  > This patch-set has a change to deal with ACPI _STA.Enabled Bit accordingly.
>  
>  Well, I'm having a hard time with this.
>  
>  As far as CPU enumeration goes, if the "enabled" bit is clear in _STA, it does
>  not happen at all.  Both on ARM and on x86.

Sure, I can see that now.

>  
>  Now tell me why there need to be two separate code paths calling
>  arch_register_cpu() in acpi_processor_add()?


As mentioned above, in the first prototype I floated in 2020, any attempt
to override the __weak arch_register_cpu() for ARM64 was discouraged.
Though, the reasons might have changed now as some code has been moved.

Once we are allowed to override the calls, many more possibilities
open up to simplify the code further.


>  
>  I see no reason whatsoever.
>  
>  Moreover, I see reasons why there needs to be only one such code path.
>  
>  Please feel free to prove me wrong.
>  
>  Thanks!

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info()
  2024-04-17 15:01                 ` Salil Mehta
@ 2024-04-17 16:19                   ` Rafael J. Wysocki
  0 siblings, 0 replies; 58+ messages in thread
From: Rafael J. Wysocki @ 2024-04-17 16:19 UTC (permalink / raw)
  To: Salil Mehta
  Cc: Rafael J. Wysocki, Thomas Gleixner, Russell King (Oracle),
	Jonathan Cameron, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Miguel Luis,
	James Morse, Jean-Philippe Brucker, Catalin Marinas, Will Deacon,
	Linuxarm, justin.he, jianyong.wu

On Wed, Apr 17, 2024 at 5:01 PM Salil Mehta <salil.mehta@huawei.com> wrote:
>
> HI Rafael,
>
> >  From: Rafael J. Wysocki <rafael@kernel.org>
> >  Sent: Monday, April 15, 2024 5:39 PM
> >
> >  On Mon, Apr 15, 2024 at 5:31 PM Salil Mehta <salil.mehta@huawei.com>  wrote:
> >  >
> >  > >  From: Rafael J. Wysocki <rafael@kernel.org>
> >  > >  Sent: Monday, April 15, 2024 1:51 PM
> >  > >
> >  > >  On Mon, Apr 15, 2024 at 1:51 PM Salil Mehta
> >  > > <salil.mehta@huawei.com>
> >  > >  wrote:
> >  > >  >
> >
> >  [cut]
> >
> >  > >  > >  Though virtualization happily jumped on it to hot add/remove
> >  > > CPUs  > > to/from  a guest.
> >  > >  > >
> >  > >  > >  There are limitations to this and we learned it the hard way
> >  > > on  > > X86. At the  end we came up with the following restrictions:
> >  > >  > >
> >  > >  > >      1) All possible CPUs have to be advertised at boot time via  firmware
> >  > >  > >         (ACPI/DT/whatever) independent of them being present at boot time
> >  > >  > >         or not.
> >  > >  > >
> >  > >  > >         That guarantees proper sizing and ensures that associations
> >  > >  > >         between hardware entities and software representations and the
> >  > >  > >         resulting topology are stable for the lifetime of a system.
> >  > >  > >
> >  > >  > >         It is really required to know the full topology of the system at
> >  > >  > >         boot time especially with hybrid CPUs where some of the cores
> >  > >  > >         have hyperthreading and the others do not.
> >  > >  > >
> >  > >  > >
> >  > >  > >      2) Hot add can only mark an already registered (possible) CPU
> >  > >  > >         present. Adding non-registered CPUs after boot is not possible.
> >  > >  > >
> >  > >  > >         The CPU must have been registered in #1 already to ensure that
> >  > >  > >         the system topology does not suddenly change in an incompatible
> >  > >  > >         way at run-time.
> >  > >  > >
> >  > >  > >  The same restriction would apply to real physical hotplug. I
> >  > > don't  > > think that's  any different for ARM64 or any other architecture.
> >  > >  >
> >  > >  >
> >  > >  > There is a difference:
> >  > >  >
> >  > >  > 1.   ARM arch does not allows for any processor to be NOT present. Hence,
> >  > >  > because of this restriction any of its related per-cpu components must be
> >  > >  > present and enumerated at the boot time as well (exposed by firmware and
> >  > >  > ACPI). This means all the enumerated processors will be marked as
> >  > >  > 'present' but they might exist in NOT enabled (_STA.enabled=0) state.
> >  > >  >
> >  > >  > There was one clear difference and please correct me if I'm wrong here,
> >  > >  > for x86, the LAPIC associated with the x86 core can be brought online
> >  > >  > later even after boot?
> >  > >  >
> >  > >  > But for ARM Arch, processors and its corresponding per-cpu components
> >  > >  > like redistributors all need to be present and enumerated during the
> >  > >  > boot time. Redistributors are part of ALWAYS-ON power domain.
> >  > >
> >  > >  OK
> >  > >
> >  > >  So what exactly is the problem with this and what does
> >  > >  acpi_processor_add() have to do with it?
> >  >
> >  >
> >  > For ARM Arch, during boot time, it should add processor as if no
> >  > hotplug exists. But later, in context to the (fake) hotplug trigger
> >  > from the virtualizer as a result of the CPU (un)plug action  it should just
> >  end up in registering the already present CPU with the Linux Driver Model.
> >
> >  So let me repeat this last time: acpi_processor_add() cannot do that,
> >  because (as defined today) it rejects CPUs with the "enabled" bit clear in  _STA.
>
>
> I understand that now because you have placed a check recently. sorry for stretching
> it a bit but I wanted to clearly understand the reason for this behavior. Is it because,
>
> 1. It does not makes sense to add a disabled but present/functional processor or
>     perhaps there are repercussions to support such a behavior?

Yes, because it is unusable.

> Or
>
> 2. None of the existing processors need such a behavior?
>
>
>
> >  > >  Do you want the per-CPU structures etc. to be created from the
> >  > >  acpi_processor_add() path?
> >  >
> >  >
> >  > I referred to the components related to ARM CPU Arch like redistributors etc.
> >  > which will get initialized in context to Arch specific _init code not
> >  > here. This i.e. acpi_processor_add() is arch agnostic code common to all architectures.
> >  >
> >  > [ A digression: You do have _weak functions which can be overridden to
> >  > arch specific  handling like  arch_(un)map_cpu() etc. but we can't use
> >  > those to defer initialize  the CPU related components - ARM Arch
> >  > constraint!]
> >
> >  Not right now, but they can be added I suppose.
> >
> >  >
> >  > >
> >  > >  This plain won't work because acpi_processor_add(), as defined in
> >  > >  the mainline kernel today (and Jonathan's patches don't change
> >  > >  that AFAICS), returns an error for processor devices with the
> >  > >  "enabled" bit clear in _STA (it returns an error if the "present"
> >  > >  bit is clear too, but that's obvious), so it only gets to calling
> >  > >  arch_register_cpu() if *both* "present" and "enabled" _STA bits
> >  > >  are set for the given processor device.
> >  >
> >  >
> >  > If you are referring to the _STA check in the XX_hot_add_init() then
> >  > in the current kernel code it only checks for the
> >  > ACPI_STA_DEVICE_PRESENT flag and not the ACPI_STA_DEVICE_ENABLED flag(?).
> >
> >  No, I am not.  I'm referring to this code in 6.9-rc4:
> >
> >  static int acpi_processor_add(struct acpi_device *device,
> >                                const struct acpi_device_id *id)
> >  {
> >      struct acpi_processor *pr;
> >      struct device *dev;
> >      int result = 0;
> >
> >      if (!acpi_device_is_enabled(device))
> >          return -ENODEV;
>
>
> Ahh, sorry, I missed this check as this has been added recently. Yes, now your
> logic of having common legs makes more sense.
>
>
> >
> >      ...
> >  }
> >
> >  where acpi_device_is_enabled() is defined as follows:
> >
> >  bool acpi_device_is_enabled(const struct acpi_device *adev)
> >  {
> >      return adev->status.present && adev->status.enabled;
> >  }
>
>
> Got it.
>
>
> [digression note:]
> BTW, I'm wondering why we are checking adev->status.present
> as having adev->status.enabled as set and adev->status.present
> as unset would mean firmware has a BUG. If we really want to
> check this then we should rather flag a warning on detection
> of this condition?

Adding a warning would be fine with me.

> Either this:
>
>  bool acpi_device_is_enabled(const struct acpi_device *adev)
>  {
>      if (!acpi_device_is_present(adev)) {
>              /* "enabled" without "present" means the firmware is inconsistent */
>              if (adev->status.enabled)
>                      pr_debug("Device [%s] status inconsistent: Enabled but not Present\n",
>                               adev->pnp.bus_id);
>              return false;
>      }
>      return adev->status.enabled;
>  }
>
> Ideally this inconsistency should have been checked in acpi_bus_get_status()
> and above function should have been just,

Yes, it can be added there.  It can even clear 'enabled' if 'present' is clear.

> file: drivers/acpi/scan.c
> bool acpi_device_is_enabled(const struct acpi_device *adev)
> {
>       return !!adev->status.enabled;
> }

Sure.

> file: drivers/acpi/bus.c
> int acpi_bus_get_status(struct acpi_device *device)
> {
>        [...]
>
>         status = acpi_bus_get_status_handle(device->handle, &sta);
>         if (ACPI_FAILURE(status))
>                 return -ENODEV;
>
>         acpi_set_device_status(device, sta);
>
>         if (device->status.functional && !device->status.present) {
>                 pr_debug("Device [%s] status [%08x]: functional but not present\n",
>                          device->pnp.bus_id, (u32)sta);
>         }
>
> +       if (device->status.enabled && !device->status.present) {
> +               pr_debug("BUG: Device [%s] status [%08x]: enabled but not present\n",
> +                        device->pnp.bus_id, (u32)sta);
> +               /* any specific handling here? */
> +       }
>
>         pr_debug("Device [%s] status [%08x]\n", device->pnp.bus_id, (u32)sta);
>         return 0;
> }
>
> >
> >  > The code being reviewed has changed
> >  > exactly that behavior for 2 legs i.e. make-present and make-enabled legs.
> >
> >  I'm not sure what you mean here, but the code above means that
> >  acpi_processor_add() does not distinguish between CPUs with the "present"
> >  bit clear (in which case the "enabled" bit must also be clear as per the spec)
> >  and CPUs with the "present" bit set and the "enabled" bit clear.  These two
> >  cases are handled in the same way.
> >
> >  > I'm open to further address your point clearly.
> >
> >  I hope that the above is clear enough.
>
>
> Yes, clear now. I missed the new check.
>
> >
> >  > >
> >  > >  That, BTW, is why I keep saying that from the ACPI CPU enumeration
> >  > >  code perspective, there is no difference between "present" and
> >  > >  "enabled".
> >  >
> >  >
> >  > Agreed but there is still a subtle difference.  Enumeration happens
> >  > once and for all the processors during the boot time. And as confirmed
> >  > by yourself and Thomas as well that even in x86 arch all the
> >  > processors will be discovered and their topology fixed during the boot
> >  > time which is effectively the same behavior as in the ARM Arch. But
> >  > ARM assumes those 'present' bits in the present masks to be set during
> >  > the boot time which is not like x86(?).  Hence, 'present cpu' Bits
> >  > will always be equal to 'possible cpu' Bits. This is a constraint put by the
> >  > ARM maintainers and looks unshakable.
> >
> >  Yes, there are differences between architectures, but the ACPI code is (or
> >  at least should be) architecture-agnostic (as you said somewhere above).
> >  So why does this matter for the ACPI code?
>
>
> It should not. There were a few bits like overriding of arch_register_cpu() which
> were not allowed by ARM folks in 2020 when I floated the first prototype.
>
>
> >  > >  > 2.  Agreed regarding the topology. Are you suggesting that we must
> >  > >  > call arch_register_cpu() during boot time for all the 'present' CPUs?
> >  > >  > Even if that's the case, we might still want to defer registration of
> >  > >  > the cpu device (register_cpu() API) with the Linux device model. The
> >  > >  > latter is what we are doing to hide/unhide the CPUs from the user while
> >  > >  > STA.Enabled Bit is toggled due to CPU (un)plug action.
> >  > >
> >  > >  There are two ways to approach this IMV, and both seem to be valid
> >  > >  in principle.
> >  > >
> >  > >  One would be to treat CPUs with the "enabled" bit clear as not
> >  > >  present and create all of the requisite data structures for them
> >  > >  when they become available (in analogy with the "real hot-add" case).
> >  >
> >  >
> >  > Right. This one is out-of-scope for ARM Arch as we cannot defer any
> >  > Arch specific sizing and initializations after boot i.e. when
> >  > processor_add() gets called again later as a trigger of CPU plug action at the Virtualizer.
> >  >
> >  >
> >  > >
> >  > >  The alternative one is to create all of the requisite data
> >  > >  structures for the CPUs that you find during boot, but register CPU
> >  > >  devices for those having the "enabled" _STA bit set.
> >  >
> >  >
> >  > Correct. And we defer the registration for CPUs with online-capable
> >  > Bit set in the ACPI MADT/GICC data structure. These CPUs basically
> >  > form the set of hot-pluggable CPUs on ARM.
> >  >
> >  >
> >  > >
> >  > >  It looks like you have chosen the second approach, which is fine
> >  > >  with me (although personally, I would rather choose the first one),
> >  > >  but then your arch code needs to arrange for the requisite CPU data
> >  > >  structures etc. to be set up before acpi_processor_add() gets
> >  > >  called because, as per the above, the latter just rejects CPUs with
> >  > >  the "enabled" _STA bit clear.
> >  >
> >  > Yes, correct. First one is not possible - at least for now and to have
> >  > that it will require a lot of rework especially at GIC. But there are
> >  > many other arch components (like timers, PMUs, etc.) whose behavior
> >  > needs to be specified somewhere in the architecture as well. All these
> >  > are closely coupled with the ARM CPU architecture.
> >  > (it's beyond this discussion and let's leave that to ARM folks).
> >  >
> >  > This patch-set has a change to deal with ACPI _STA.Enabled Bit accordingly.
> >
> >  Well, I'm having a hard time with this.
> >
> >  As far as CPU enumeration goes, if the "enabled" bit is clear in _STA, it does
> >  not happen at all.  Both on ARM and on x86.
>
> sure, I can see that now.
>
> >
> >  Now tell me why there need to be two separate code paths calling
> >  arch_register_cpu() in acpi_processor_add()?
>
>
> As mentioned above, in the first prototype I floated in the year 2020 any attempt
> to override the __weak call of arch_register_cpu() for ARM64 was discouraged.
> Though, the reasons might have changed now as some code has been moved.
>
> Once we are allowed to override the calls then there are many more possibilities
> which open up to simplify the code further.

Well, IMV this should just be an arch function with no __weak
defaults, because the default would probably be unusable in practice
anyway.
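
To make the _STA handling discussed above concrete, here is a minimal
userspace sketch in C. It is not the kernel implementation: the names
model_status, model_set_status() and model_device_is_enabled() are
hypothetical stand-ins for the status bits in struct acpi_device,
acpi_bus_get_status() and acpi_device_is_enabled(). It assumes the
suggestion made in this message is followed, i.e. that "enabled" is
cleared at status-update time whenever "present" is clear, so the later
check only has to look at one bit. The bit values are the ones the ACPI
specification defines for _STA.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* _STA bit values as defined by the ACPI specification. */
#define STA_PRESENT    0x01u
#define STA_ENABLED    0x02u
#define STA_FUNCTIONAL 0x08u

/* Hypothetical stand-in for the status bits cached per ACPI device. */
struct model_status {
	bool present;
	bool enabled;
	bool functional;
};

/*
 * Decode a raw _STA value into the cached bits.
 *
 * Returns true if the firmware reported the inconsistent combination
 * "enabled but not present". In that case the "enabled" bit is cleared,
 * so callers can treat such a device like any other disabled device.
 */
static bool model_set_status(struct model_status *st, uint32_t sta)
{
	bool inconsistent;

	st->present = (sta & STA_PRESENT) != 0;
	st->enabled = (sta & STA_ENABLED) != 0;
	st->functional = (sta & STA_FUNCTIONAL) != 0;

	inconsistent = st->enabled && !st->present;
	if (inconsistent) {
		fprintf(stderr, "status [%08x]: enabled but not present\n",
			(unsigned int)sta);
		st->enabled = false;
	}

	return inconsistent;
}

/* With the sanitizing above, checking "enabled" alone is sufficient. */
static bool model_device_is_enabled(const struct model_status *st)
{
	return st->enabled;
}
```

With this model, a CPU that is present but not enabled (_STA = 0x01) and
a CPU that is not present at all are both rejected by the same check,
which matches the point that the enumeration code does not distinguish
between the two cases.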


* Re: [PATCH v5 01/18] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER
  2024-04-12 14:37 ` [PATCH v5 01/18] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER Jonathan Cameron
  2024-04-12 17:42   ` Rafael J. Wysocki
@ 2024-04-22  3:53   ` Gavin Shan
  1 sibling, 0 replies; 58+ messages in thread
From: Gavin Shan @ 2024-04-22  3:53 UTC (permalink / raw)
  To: Jonathan Cameron, linux-pm, loongarch, linux-acpi, linux-arch,
	linux-kernel, linux-arm-kernel, kvmarm, x86, Russell King,
	Rafael J . Wysocki, Miguel Luis, James Morse, Salil Mehta,
	Jean-Philippe Brucker, Catalin Marinas, Will Deacon
  Cc: linuxarm, justin.he, jianyong.wu


On 4/13/24 00:37, Jonathan Cameron wrote:
> For arm64 the CPU registration cannot complete until the ACPI intepretter
> us up and running so in those cases the arch specific
   ^^

typo: s/us/is

> arch_register_cpu() will return -EPROBE_DEFER at this stage and the
> registration will be attempted later.
> 
> Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> ---
> v5: New patch.
>      Note that for now no arch_register_cpu() calls return -EPROBE_DEFER
>      so it has no impact until the arm64 one is added later in this series.
> ---
>   drivers/base/cpu.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: Gavin Shan <gshan@redhat.com>
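
As a rough illustration of the behaviour the patch description above
refers to, the following userspace sketch in C models a caller that
stays quiet when registration is deferred and warns on any other
failure. It is an assumption-laden model, not the actual change to
drivers/base/cpu.c: model_arch_register_cpu(), model_register_cpu(),
model_interpreter_ready and model_warnings are hypothetical names, and
MODEL_EPROBE_DEFER mirrors the kernel-internal EPROBE_DEFER value (517),
which userspace <errno.h> does not provide.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Mirrors the kernel-internal EPROBE_DEFER value; not in userspace errno.h. */
#define MODEL_EPROBE_DEFER 517

/* Hypothetical flag: is the ACPI interpreter up and running yet? */
static bool model_interpreter_ready;

/* Counts warnings that would have been printed. */
static int model_warnings;

/*
 * Hypothetical arch hook: defers while the interpreter is not ready,
 * fails with -EINVAL for an invalid CPU number, succeeds otherwise.
 */
static int model_arch_register_cpu(int cpu)
{
	if (cpu < 0)
		return -EINVAL;
	if (!model_interpreter_ready)
		return -MODEL_EPROBE_DEFER;
	return 0;
}

/*
 * Caller side: a deferral is expected and will be retried later,
 * so only genuine failures produce a warning.
 */
static int model_register_cpu(int cpu)
{
	int ret = model_arch_register_cpu(cpu);

	if (ret && ret != -MODEL_EPROBE_DEFER) {
		fprintf(stderr, "register_cpu %d failed (%d)\n", cpu, ret);
		model_warnings++;
	}

	return ret;
}
```

In this model an early call returns -MODEL_EPROBE_DEFER without a
warning, and the same call succeeds once model_interpreter_ready has
been set.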



Thread overview: 58+ messages
2024-04-12 14:37 [PATCH v5 00/18] ACPI/arm64: add support for virtual cpu hotplug Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 01/18] cpu: Do not warn on arch_register_cpu() returning -EPROBE_DEFER Jonathan Cameron
2024-04-12 17:42   ` Rafael J. Wysocki
2024-04-22  3:53   ` Gavin Shan
2024-04-12 14:37 ` [PATCH v5 02/18] ACPI: processor: Set the ACPI_COMPANION for the struct cpu instance Jonathan Cameron
2024-04-12 18:10   ` Rafael J. Wysocki
2024-04-15 15:48     ` Jonathan Cameron
2024-04-15 16:16       ` Rafael J. Wysocki
2024-04-15 16:19         ` Rafael J. Wysocki
2024-04-15 16:50           ` Jonathan Cameron
2024-04-15 17:34             ` Jonathan Cameron
2024-04-15 17:41               ` Rafael J. Wysocki
2024-04-16 17:35                 ` Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 03/18] ACPI: processor: Register deferred CPUs from acpi_processor_get_info() Jonathan Cameron
2024-04-12 18:30   ` Rafael J. Wysocki
2024-04-12 20:16     ` Russell King (Oracle)
2024-04-12 20:54       ` Thomas Gleixner
2024-04-12 21:52         ` Russell King (Oracle)
2024-04-12 23:23           ` Thomas Gleixner
2024-04-15  8:45             ` Jonathan Cameron
2024-04-15  9:16               ` Jonathan Cameron
2024-04-15  9:31                 ` Jonathan Cameron
2024-04-15 11:57                 ` Jonathan Cameron
2024-04-15 11:37               ` Rafael J. Wysocki
2024-04-15 11:56                 ` Jonathan Cameron
2024-04-15 12:04                   ` Rafael J. Wysocki
2024-04-15 12:23                     ` Jonathan Cameron
2024-04-16 17:41                       ` Jonathan Cameron
2024-04-16 19:02                         ` Rafael J. Wysocki
2024-04-17 10:39                           ` Jonathan Cameron
2024-04-15 12:37                     ` Salil Mehta
2024-04-15 12:41                       ` Rafael J. Wysocki
2024-04-15 11:51         ` Salil Mehta
2024-04-15 12:51           ` Rafael J. Wysocki
2024-04-15 15:31             ` Salil Mehta
2024-04-15 16:38               ` Rafael J. Wysocki
2024-04-17 15:01                 ` Salil Mehta
2024-04-17 16:19                   ` Rafael J. Wysocki
2024-04-15 10:52     ` Jonathan Cameron
2024-04-15 11:11       ` Jonathan Cameron
2024-04-15 11:52       ` Rafael J. Wysocki
2024-04-15 11:07     ` Salil Mehta
2024-04-16 14:00   ` Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 04/18] ACPI: Rename acpi_processor_hotadd_init and remove pre-processor guards Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 05/18] ACPI: utils: Add an acpi_sta_enabled() helper and use it in acpi_processor_make_present() Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 06/18] ACPI: scan: Add parameter to allow defering some actions in acpi_scan_check_and_detach Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 07/18] ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 08/18] ACPI: convert acpi_processor_post_eject() to use IS_ENABLED() Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 09/18] ACPI: Check _STA present bit before making CPUs not present Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 10/18] ACPI: Warn when the present bit changes but the feature is not enabled Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 11/18] arm64: acpi: Move get_cpu_for_acpi_id() to a header Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 12/18] irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc() Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 13/18] irqchip/gic-v3: Add support for ACPI's disabled but 'online capable' CPUs Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 14/18] arm64: psci: Ignore DENIED CPUs Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 15/18] arm64: arch_register_cpu() variant to allow checking of ACPI _STA Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 16/18] ACPI: add support to (un)register CPUs based on the _STA enabled bit Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 17/18] arm64: document virtual CPU hotplug's expectations Jonathan Cameron
2024-04-12 14:37 ` [PATCH v5 18/18] cpumask: Add enabled cpumask for present CPUs that can be brought online Jonathan Cameron
