QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF
@ 2019-10-08 10:52 Laszlo Ersek
  2019-10-08 10:52 ` [PATCH 1/4] fw_cfg: bump file slots to 40 Laszlo Ersek
                   ` (4 more replies)
  0 siblings, 5 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-08 10:52 UTC (permalink / raw)
  To: qemu devel list
  Cc: Eduardo Habkost, Michael S. Tsirkin, Gerd Hoffmann,
	Paolo Bonzini, Igor Mammedov, Philippe Mathieu-Daudé,
	Richard Henderson

https://bugzilla.tianocore.org/show_bug.cgi?id=1515

For enabling VCPU hotplug with OVMF, OVMF needs to know the value of
"max_cpus". This quantity is not passed to the firmware currently --
SeaBIOS receives the (exclusive) maximum APICID, which is not useful to
OVMF --, so introduce a new fw_cfg file for this purpose.

Note: when OVMF is built with -D SMM_REQUIRE, this patch series is just
a small building block, towards the full VCPU hotplug feature. However,
when OVMF is built without -D SMM_REQUIRE, this series (together with
the counterpart patch set for OVMF) completes the VCPU hotplug feature:
it allows S3 resume to work with VCPUs hot-plugged previously (at OS
runtime, of course).

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>

Thanks
Laszlo

Laszlo Ersek (4):
  fw_cfg: bump file slots to 40
  target/i386: remove useless enable_compat_apic_id_mode() prototype
  hw/i386: add facility to expose CPU topology over fw-cfg
  hw/i386/pc: expose CPU topology over fw-cfg

 hw/core/machine.c    |  2 ++
 hw/i386/fw_cfg.c     | 26 ++++++++++++++++++++++++--
 hw/i386/fw_cfg.h     | 30 +++++++++++++++++++++++++++++-
 hw/i386/pc.c         |  5 +++--
 hw/i386/pc_piix.c    |  2 ++
 hw/i386/pc_q35.c     |  2 ++
 hw/nvram/fw_cfg.c    |  2 +-
 include/hw/i386/pc.h |  3 +++
 target/i386/cpu.h    |  2 --
 9 files changed, 66 insertions(+), 8 deletions(-)

-- 
2.19.1.3.g30247aa5d201



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/4] fw_cfg: bump file slots to 40
  2019-10-08 10:52 [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF Laszlo Ersek
@ 2019-10-08 10:52 ` Laszlo Ersek
  2019-10-08 10:52 ` [PATCH 2/4] target/i386: remove useless enable_compat_apic_id_mode() prototype Laszlo Ersek
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-08 10:52 UTC (permalink / raw)
  To: qemu devel list
  Cc: Eduardo Habkost, Gerd Hoffmann, Paolo Bonzini, Igor Mammedov,
	Philippe Mathieu-Daudé

I count approximately 30 additions of named fw_cfg files, with
fw_cfg_add_file() / fw_cfg_add_file_callback() / fw_cfg_modify_file(). On
top of those, vgaroms/* and genroms/* files are added with invocations of
rom_add_vga() and rom_add_option().

While it's pretty unlikely that a QEMU cmdline causes all of those named
files to be added simultaneously, it now seems prudent to bump
FW_CFG_FILE_SLOTS_DFLT from 32 to a higher value. Increment it by 8.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 hw/core/machine.c | 2 ++
 hw/nvram/fw_cfg.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index 1689ad3bf8af..fefc9e1f0dd0 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -29,6 +29,8 @@
 
 GlobalProperty hw_compat_4_1[] = {
     { "virtio-pci", "x-pcie-flr-init", "off" },
+    { "fw_cfg_mem", "x-file-slots", "0x20" },
+    { "fw_cfg_io", "x-file-slots", "0x20" },
 };
 const size_t hw_compat_4_1_len = G_N_ELEMENTS(hw_compat_4_1);
 
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 7dc3ac378ee0..cd785fad93b1 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -40,7 +40,7 @@
 #include "qemu/cutils.h"
 #include "qapi/error.h"
 
-#define FW_CFG_FILE_SLOTS_DFLT 0x20
+#define FW_CFG_FILE_SLOTS_DFLT 0x28
 
 /* FW_CFG_VERSION bits */
 #define FW_CFG_VERSION      0x01
-- 
2.19.1.3.g30247aa5d201




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 2/4] target/i386: remove useless enable_compat_apic_id_mode() prototype
  2019-10-08 10:52 [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF Laszlo Ersek
  2019-10-08 10:52 ` [PATCH 1/4] fw_cfg: bump file slots to 40 Laszlo Ersek
@ 2019-10-08 10:52 ` Laszlo Ersek
  2019-10-08 13:35   ` Philippe Mathieu-Daudé
  2019-10-08 10:52 ` [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg Laszlo Ersek
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-08 10:52 UTC (permalink / raw)
  To: qemu devel list
  Cc: Igor Mammedov, Richard Henderson, Eduardo Habkost, Paolo Bonzini

The enable_compat_apic_id_mode() function definition was removed earlier;
there are no callers left. Remove the function declaration too.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 target/i386/cpu.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index eaa5395aa539..c9ab1a367939 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2126,8 +2126,6 @@ void x86_cpu_set_default_version(X86CPUVersion version);
 /* Return name of 32-bit register, from a R_* constant */
 const char *get_register_name_32(unsigned int reg);
 
-void enable_compat_apic_id_mode(void);
-
 #define APIC_DEFAULT_ADDRESS 0xfee00000
 #define APIC_SPACE_SIZE      0x100000
 
-- 
2.19.1.3.g30247aa5d201




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-08 10:52 [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF Laszlo Ersek
  2019-10-08 10:52 ` [PATCH 1/4] fw_cfg: bump file slots to 40 Laszlo Ersek
  2019-10-08 10:52 ` [PATCH 2/4] target/i386: remove useless enable_compat_apic_id_mode() prototype Laszlo Ersek
@ 2019-10-08 10:52 ` Laszlo Ersek
  2019-10-08 13:29   ` Philippe Mathieu-Daudé
                     ` (2 more replies)
  2019-10-08 10:52 ` [PATCH 4/4] hw/i386/pc: " Laszlo Ersek
  2019-10-08 14:23 ` [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF no-reply
  4 siblings, 3 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-08 10:52 UTC (permalink / raw)
  To: qemu devel list
  Cc: Eduardo Habkost, Michael S. Tsirkin, Paolo Bonzini,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Richard Henderson

FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
due to historical reasons. That value is not useful to edk2, however. For
supporting VCPU hotplug, edk2 needs:

- the boot CPU count (already exposed in FW_CFG_NB_CPUS),

- and the maximum foreseen CPU count (tracked in
  "MachineState.smp.max_cpus", but not currently exposed).

Add a new fw-cfg file to expose "max_cpus".

While at it, expose the rest of the topology too (die / core / thread
counts), because I expect that the VCPU hotplug feature for OVMF will
ultimately need those too, and the data size is not large. This is
slightly complicated by the fact that the die count is specific to
PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
commit 149c50cabcc4).

For now, the feature is temporarily disabled.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++-
 hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++--
 hw/i386/pc.c     |  4 ++--
 3 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
index e0856a376996..d742435b9793 100644
--- a/hw/i386/fw_cfg.h
+++ b/hw/i386/fw_cfg.h
@@ -18,9 +18,37 @@
 #define FW_CFG_E820_TABLE       (FW_CFG_ARCH_LOCAL + 3)
 #define FW_CFG_HPET             (FW_CFG_ARCH_LOCAL + 4)
 
+/**
+ * FWCfgX86Topology: expose the X86 CPU topology to guest firmware over fw-cfg.
+ *
+ * All fields have little-endian encoding.
+ *
+ * @dies:     Number of dies per package (aka socket). Set it to 1 unless the
+ *            concrete MachineState subclass defines it differently.
+ * @cores:    Corresponds to @CpuTopology.@cores.
+ * @threads:  Corresponds to @CpuTopology.@threads.
+ * @max_cpus: Corresponds to @CpuTopology.@max_cpus.
+ *
+ * Firmware can derive the package (aka socket) count with the following
+ * formula:
+ *
+ *   DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads)
+ *
+ * Firmware can derive APIC ID field widths and offsets per the standard
+ * calculations in "include/hw/i386/topology.h".
+ */
+typedef struct FWCfgX86Topology {
+  uint32_t dies;
+  uint32_t cores;
+  uint32_t threads;
+  uint32_t max_cpus;
+} QEMU_PACKED FWCfgX86Topology;
+
 FWCfgState *fw_cfg_arch_create(MachineState *ms,
                                uint16_t boot_cpus,
-                               uint16_t apic_id_limit);
+                               uint16_t apic_id_limit,
+                               unsigned smp_dies,
+                               bool expose_topology);
 void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
 void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
 
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index 39b6bc60520c..33d09875014f 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg)
     }
 }
 
+static void fw_cfg_expose_topology(FWCfgState *fw_cfg,
+                                   unsigned dies,
+                                   unsigned cores,
+                                   unsigned threads,
+                                   unsigned max_cpus)
+{
+    FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1);
+
+    topo->dies     = cpu_to_le32(dies);
+    topo->cores    = cpu_to_le32(cores);
+    topo->threads  = cpu_to_le32(threads);
+    topo->max_cpus = cpu_to_le32(max_cpus);
+    fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo);
+}
+
 FWCfgState *fw_cfg_arch_create(MachineState *ms,
-                                      uint16_t boot_cpus,
-                                      uint16_t apic_id_limit)
+                               uint16_t boot_cpus,
+                               uint16_t apic_id_limit,
+                               unsigned smp_dies,
+                               bool expose_topology)
 {
     FWCfgState *fw_cfg;
     uint64_t *numa_fw_cfg;
@@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
                      (1 + apic_id_limit + nb_numa_nodes) *
                      sizeof(*numa_fw_cfg));
 
+    if (expose_topology) {
+        fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores,
+                               ms->smp.threads, ms->smp.max_cpus);
+    }
+
     return fw_cfg;
 }
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index bcda50efcc23..bb72b12edad2 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms,
                                         option_rom_mr,
                                         1);
 
-    fw_cfg = fw_cfg_arch_create(machine,
-                                pcms->boot_cpus, pcms->apic_id_limit);
+    fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, pcms->apic_id_limit,
+                                pcms->smp_dies, false);
 
     rom_set_fw(fw_cfg);
 
-- 
2.19.1.3.g30247aa5d201




^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 4/4] hw/i386/pc: expose CPU topology over fw-cfg
  2019-10-08 10:52 [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF Laszlo Ersek
                   ` (2 preceding siblings ...)
  2019-10-08 10:52 ` [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg Laszlo Ersek
@ 2019-10-08 10:52 ` " Laszlo Ersek
  2019-10-08 14:23 ` [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF no-reply
  4 siblings, 0 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-08 10:52 UTC (permalink / raw)
  To: qemu devel list
  Cc: Eduardo Habkost, Michael S. Tsirkin, Paolo Bonzini,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Richard Henderson

Enable the small feature added in the last patch on 4.2+ PC machine types.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
---
 include/hw/i386/pc.h | 3 +++
 hw/i386/pc.c         | 3 ++-
 hw/i386/pc_piix.c    | 2 ++
 hw/i386/pc_q35.c     | 2 ++
 4 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 6df4f4b6fbe7..96af441453bc 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -144,6 +144,9 @@ typedef struct PCMachineClass {
 
     /* Enables contiguous-apic-ID mode */
     bool compat_apic_id_mode;
+
+    /* expose x86 CPU topology over fw_cfg */
+    bool fwcfg_topology;
 } PCMachineClass;
 
 #define TYPE_PC_MACHINE "generic-pc-machine"
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index bb72b12edad2..12b13ba0e1c5 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1739,7 +1739,7 @@ void pc_memory_init(PCMachineState *pcms,
                                         1);
 
     fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, pcms->apic_id_limit,
-                                pcms->smp_dies, false);
+                                pcms->smp_dies, pcmc->fwcfg_topology);
 
     rom_set_fw(fw_cfg);
 
@@ -2798,6 +2798,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     pcmc->save_tsc_khz = true;
     pcmc->linuxboot_dma_enabled = true;
     pcmc->pvh_enabled = true;
+    pcmc->fwcfg_topology = true;
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = pc_get_hotplug_handler;
     mc->hotplug_allowed = pc_hotplug_allowed;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 6824b72124de..7fb87452889b 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -445,9 +445,11 @@ DEFINE_I440FX_MACHINE(v4_2, "pc-i440fx-4.2", NULL,
 
 static void pc_i440fx_4_1_machine_options(MachineClass *m)
 {
+    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
     pc_i440fx_4_2_machine_options(m);
     m->alias = NULL;
     m->is_default = 0;
+    pcmc->fwcfg_topology = false;
     compat_props_add(m->compat_props, hw_compat_4_1, hw_compat_4_1_len);
     compat_props_add(m->compat_props, pc_compat_4_1, pc_compat_4_1_len);
 }
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 8fad20f3146a..3ec45659da02 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -377,8 +377,10 @@ DEFINE_Q35_MACHINE(v4_2, "pc-q35-4.2", NULL,
 
 static void pc_q35_4_1_machine_options(MachineClass *m)
 {
+    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
     pc_q35_4_2_machine_options(m);
     m->alias = NULL;
+    pcmc->fwcfg_topology = false;
     compat_props_add(m->compat_props, hw_compat_4_1, hw_compat_4_1_len);
     compat_props_add(m->compat_props, pc_compat_4_1, pc_compat_4_1_len);
 }
-- 
2.19.1.3.g30247aa5d201



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-08 10:52 ` [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg Laszlo Ersek
@ 2019-10-08 13:29   ` Philippe Mathieu-Daudé
  2019-10-08 18:31     ` Laszlo Ersek
  2019-10-08 15:59   ` Igor Mammedov
  2019-10-08 18:58   ` Laszlo Ersek
  2 siblings, 1 reply; 24+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-10-08 13:29 UTC (permalink / raw)
  To: Laszlo Ersek, qemu devel list
  Cc: Eduardo Habkost, Michael S. Tsirkin, Paolo Bonzini,
	Igor Mammedov, Richard Henderson

Hi Laszlo,

On 10/8/19 12:52 PM, Laszlo Ersek wrote:
> FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
> due to historical reasons. That value is not useful to edk2, however. For
> supporting VCPU hotplug, edk2 needs:
> 
> - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
> 
> - and the maximum foreseen CPU count (tracked in
>    "MachineState.smp.max_cpus", but not currently exposed).
> 
> Add a new fw-cfg file to expose "max_cpus".
> 
> While at it, expose the rest of the topology too (die / core / thread
> counts), because I expect that the VCPU hotplug feature for OVMF will
> ultimately need those too, and the data size is not large. This is
> slightly complicated by the fact that the die count is specific to
> PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
> commit 149c50cabcc4).

The X86 topology is generic to the architecture (not machine specific) 
so it is well placed in fw_cfg_arch_create().

> 
> For now, the feature is temporarily disabled.

I see you enable it in the PC machine in the next patch.
Do you plan to remove the 'expose_topology' argument and expose the key 
later, or is this comment simply related to this patch?

Ah, now I see you disable it previous to pc-4.2, OK.

> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Richard Henderson <rth@twiddle.net>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>   hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++-
>   hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++--
>   hw/i386/pc.c     |  4 ++--
>   3 files changed, 55 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
> index e0856a376996..d742435b9793 100644
> --- a/hw/i386/fw_cfg.h
> +++ b/hw/i386/fw_cfg.h
> @@ -18,9 +18,37 @@
>   #define FW_CFG_E820_TABLE       (FW_CFG_ARCH_LOCAL + 3)
>   #define FW_CFG_HPET             (FW_CFG_ARCH_LOCAL + 4)
>   
> +/**
> + * FWCfgX86Topology: expose the X86 CPU topology to guest firmware over fw-cfg.
> + *
> + * All fields have little-endian encoding.
> + *
> + * @dies:     Number of dies per package (aka socket). Set it to 1 unless the
> + *            concrete MachineState subclass defines it differently.
> + * @cores:    Corresponds to @CpuTopology.@cores.
> + * @threads:  Corresponds to @CpuTopology.@threads.
> + * @max_cpus: Corresponds to @CpuTopology.@max_cpus.
> + *
> + * Firmware can derive the package (aka socket) count with the following
> + * formula:
> + *
> + *   DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads)
> + *
> + * Firmware can derive APIC ID field widths and offsets per the standard
> + * calculations in "include/hw/i386/topology.h".
> + */
> +typedef struct FWCfgX86Topology {
> +  uint32_t dies;
> +  uint32_t cores;
> +  uint32_t threads;
> +  uint32_t max_cpus;
> +} QEMU_PACKED FWCfgX86Topology;
> +
>   FWCfgState *fw_cfg_arch_create(MachineState *ms,
>                                  uint16_t boot_cpus,
> -                               uint16_t apic_id_limit);
> +                               uint16_t apic_id_limit,
> +                               unsigned smp_dies,
> +                               bool expose_topology);
>   void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
>   void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
>   
> diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
> index 39b6bc60520c..33d09875014f 100644
> --- a/hw/i386/fw_cfg.c
> +++ b/hw/i386/fw_cfg.c
> @@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg)
>       }
>   }
>   
> +static void fw_cfg_expose_topology(FWCfgState *fw_cfg,
> +                                   unsigned dies,
> +                                   unsigned cores,
> +                                   unsigned threads,
> +                                   unsigned max_cpus)
> +{
> +    FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1);
> +
> +    topo->dies     = cpu_to_le32(dies);
> +    topo->cores    = cpu_to_le32(cores);
> +    topo->threads  = cpu_to_le32(threads);
> +    topo->max_cpus = cpu_to_le32(max_cpus);
> +    fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo);
> +}
> +
>   FWCfgState *fw_cfg_arch_create(MachineState *ms,
> -                                      uint16_t boot_cpus,
> -                                      uint16_t apic_id_limit)
> +                               uint16_t boot_cpus,
> +                               uint16_t apic_id_limit,
> +                               unsigned smp_dies,
> +                               bool expose_topology)
>   {
>       FWCfgState *fw_cfg;
>       uint64_t *numa_fw_cfg;
> @@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
>                        (1 + apic_id_limit + nb_numa_nodes) *
>                        sizeof(*numa_fw_cfg));
>   
> +    if (expose_topology) {
> +        fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores,
> +                               ms->smp.threads, ms->smp.max_cpus);
> +    }
> +
>       return fw_cfg;
>   }
>   
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index bcda50efcc23..bb72b12edad2 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms,
>                                           option_rom_mr,
>                                           1);
>   
> -    fw_cfg = fw_cfg_arch_create(machine,
> -                                pcms->boot_cpus, pcms->apic_id_limit);
> +    fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, pcms->apic_id_limit,
> +                                pcms->smp_dies, false);
>   
>       rom_set_fw(fw_cfg);
>   

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] target/i386: remove useless enable_compat_apic_id_mode() prototype
  2019-10-08 10:52 ` [PATCH 2/4] target/i386: remove useless enable_compat_apic_id_mode() prototype Laszlo Ersek
@ 2019-10-08 13:35   ` Philippe Mathieu-Daudé
  2019-10-08 18:22     ` Laszlo Ersek
  0 siblings, 1 reply; 24+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-10-08 13:35 UTC (permalink / raw)
  To: Laszlo Ersek, qemu devel list
  Cc: Igor Mammedov, Paolo Bonzini, Eduardo Habkost, Richard Henderson

On 10/8/19 12:52 PM, Laszlo Ersek wrote:
> The enable_compat_apic_id_mode() function definition was removed earlier;

"in 457cfcccdd1"

> there are no callers left. Remove the function declaration too.
> 
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Richard Henderson <rth@twiddle.net>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>   target/i386/cpu.h | 2 --
>   1 file changed, 2 deletions(-)
> 
> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> index eaa5395aa539..c9ab1a367939 100644
> --- a/target/i386/cpu.h
> +++ b/target/i386/cpu.h
> @@ -2126,8 +2126,6 @@ void x86_cpu_set_default_version(X86CPUVersion version);
>   /* Return name of 32-bit register, from a R_* constant */
>   const char *get_register_name_32(unsigned int reg);
>   
> -void enable_compat_apic_id_mode(void);
> -
>   #define APIC_DEFAULT_ADDRESS 0xfee00000
>   #define APIC_SPACE_SIZE      0x100000
>   
> 

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF
  2019-10-08 10:52 [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF Laszlo Ersek
                   ` (3 preceding siblings ...)
  2019-10-08 10:52 ` [PATCH 4/4] hw/i386/pc: " Laszlo Ersek
@ 2019-10-08 14:23 ` no-reply
  4 siblings, 0 replies; 24+ messages in thread
From: no-reply @ 2019-10-08 14:23 UTC (permalink / raw)
  To: lersek; +Cc: ehabkost, mst, qemu-devel, kraxel, imammedo, pbonzini, philmd, rth

Patchew URL: https://patchew.org/QEMU/20191008105259.5378-1-lersek@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      io/channel-tls.o
  CC      io/channel-watch.o

Encoding error:
'utf-8' codec can't decode byte 0x95 in position 799: invalid start byte
The full traceback has been saved in /tmp/sphinx-err-x00li0bh.log, if you want to report the issue to the developers.
  CC      io/channel-util.o
  CC      io/channel-websock.o
make: *** [Makefile:994: docs/interop/index.html] Error 2
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=9f05429fd2334f609b3ce170f0bc0879', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-pybykcuy/src/docker-src.2019-10-08-10.19.05.32206:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=9f05429fd2334f609b3ce170f0bc0879
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-pybykcuy/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    4m32.332s
user    0m8.670s


The full log is available at
http://patchew.org/logs/20191008105259.5378-1-lersek@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-08 10:52 ` [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg Laszlo Ersek
  2019-10-08 13:29   ` Philippe Mathieu-Daudé
@ 2019-10-08 15:59   ` Igor Mammedov
  2019-10-09 21:01     ` Laszlo Ersek
  2019-10-10 10:01     ` Michael S. Tsirkin
  2019-10-08 18:58   ` Laszlo Ersek
  2 siblings, 2 replies; 24+ messages in thread
From: Igor Mammedov @ 2019-10-08 15:59 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Eduardo Habkost, Michael S. Tsirkin, qemu devel list,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson

On Tue,  8 Oct 2019 12:52:58 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
> due to historical reasons. That value is not useful to edk2, however. For
> supporting VCPU hotplug, edk2 needs:
> 
> - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
> 
> - and the maximum foreseen CPU count (tracked in
>   "MachineState.smp.max_cpus", but not currently exposed).
one can get it with current QEMU without adding new fgcfg
(albeit in a bit awkward way)

max_cpu count can be derived indirectly as result of cpu hotplug
enumeration (IO interface at 0x0cd8-0xcf7) by writing/reading
to/from selector register (see ACPI_CPU_SELECTOR_OFFSET_WR)
until read value stops changing values (i.e. max cpu count
is reached). One also can figure out present/non-present
cpu status by reading flags register.

> Add a new fw-cfg file to expose "max_cpus".
> 
> While at it, expose the rest of the topology too (die / core / thread
> counts), because I expect that the VCPU hotplug feature for OVMF will
> ultimately need those too, and the data size is not large. This is
> slightly complicated by the fact that the die count is specific to
> PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
> commit 149c50cabcc4).
Could you clarify why topology info is necessary?

Potentially it's possible to extend cpu hotplug ABI to report
arch specific apic-id (x86) or mpidr (arm) if firmware really
needs to know topology and let guest to decode it according
to CPU's spec.

So far there were no need for it as all possible cpus are
described in ACPI tables passed to guest, but I'm not going
to suggest to parse them on firmware side as it's too complicated :)

PS:
The reason we started building ACPI tables in QEMU, was never
ending story of adding more ABI and supporting it afterwards.
So I'd try to avoid doing it if it can be helped.

> For now, the feature is temporarily disabled.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Richard Henderson <rth@twiddle.net>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++-
>  hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++--
>  hw/i386/pc.c     |  4 ++--
>  3 files changed, 55 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
> index e0856a376996..d742435b9793 100644
> --- a/hw/i386/fw_cfg.h
> +++ b/hw/i386/fw_cfg.h
> @@ -18,9 +18,37 @@
>  #define FW_CFG_E820_TABLE       (FW_CFG_ARCH_LOCAL + 3)
>  #define FW_CFG_HPET             (FW_CFG_ARCH_LOCAL + 4)
>  
> +/**
> + * FWCfgX86Topology: expose the X86 CPU topology to guest firmware over fw-cfg.
> + *
> + * All fields have little-endian encoding.
> + *
> + * @dies:     Number of dies per package (aka socket). Set it to 1 unless the
> + *            concrete MachineState subclass defines it differently.
> + * @cores:    Corresponds to @CpuTopology.@cores.
> + * @threads:  Corresponds to @CpuTopology.@threads.
> + * @max_cpus: Corresponds to @CpuTopology.@max_cpus.
> + *
> + * Firmware can derive the package (aka socket) count with the following
> + * formula:
> + *
> + *   DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads)
> + *
> + * Firmware can derive APIC ID field widths and offsets per the standard
> + * calculations in "include/hw/i386/topology.h".
> + */
> +typedef struct FWCfgX86Topology {
> +  uint32_t dies;
> +  uint32_t cores;
> +  uint32_t threads;
> +  uint32_t max_cpus;
> +} QEMU_PACKED FWCfgX86Topology;
> +
>  FWCfgState *fw_cfg_arch_create(MachineState *ms,
>                                 uint16_t boot_cpus,
> -                               uint16_t apic_id_limit);
> +                               uint16_t apic_id_limit,
> +                               unsigned smp_dies,
> +                               bool expose_topology);
>  void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
>  void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
>  
> diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
> index 39b6bc60520c..33d09875014f 100644
> --- a/hw/i386/fw_cfg.c
> +++ b/hw/i386/fw_cfg.c
> @@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg)
>      }
>  }
>  
> +static void fw_cfg_expose_topology(FWCfgState *fw_cfg,
> +                                   unsigned dies,
> +                                   unsigned cores,
> +                                   unsigned threads,
> +                                   unsigned max_cpus)
> +{
> +    FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1);
> +
> +    topo->dies     = cpu_to_le32(dies);
> +    topo->cores    = cpu_to_le32(cores);
> +    topo->threads  = cpu_to_le32(threads);
> +    topo->max_cpus = cpu_to_le32(max_cpus);
> +    fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo);
> +}
> +
>  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> -                                      uint16_t boot_cpus,
> -                                      uint16_t apic_id_limit)
> +                               uint16_t boot_cpus,
> +                               uint16_t apic_id_limit,
> +                               unsigned smp_dies,
> +                               bool expose_topology)
>  {
>      FWCfgState *fw_cfg;
>      uint64_t *numa_fw_cfg;
> @@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
>                       (1 + apic_id_limit + nb_numa_nodes) *
>                       sizeof(*numa_fw_cfg));
>  
> +    if (expose_topology) {
> +        fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores,
> +                               ms->smp.threads, ms->smp.max_cpus);
> +    }
> +
>      return fw_cfg;
>  }
>  
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index bcda50efcc23..bb72b12edad2 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms,
>                                          option_rom_mr,
>                                          1);
>  
> -    fw_cfg = fw_cfg_arch_create(machine,
> -                                pcms->boot_cpus, pcms->apic_id_limit);
> +    fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, pcms->apic_id_limit,
> +                                pcms->smp_dies, false);
>  
>      rom_set_fw(fw_cfg);
>  



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/4] target/i386: remove useless enable_compat_apic_id_mode() prototype
  2019-10-08 13:35   ` Philippe Mathieu-Daudé
@ 2019-10-08 18:22     ` Laszlo Ersek
  0 siblings, 0 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-08 18:22 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu devel list
  Cc: Igor Mammedov, Paolo Bonzini, Eduardo Habkost, Richard Henderson

On 10/08/19 15:35, Philippe Mathieu-Daudé wrote:
> On 10/8/19 12:52 PM, Laszlo Ersek wrote:
>> The enable_compat_apic_id_mode() function definition was removed earlier;
> 
> "in 457cfcccdd1"

Thanks. I'll add that to the commit message if I need to post a v2.

> 
>> there are no callers left. Remove the function declaration too.
>>
>> Cc: Eduardo Habkost <ehabkost@redhat.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Richard Henderson <rth@twiddle.net>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>   target/i386/cpu.h | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/target/i386/cpu.h b/target/i386/cpu.h
>> index eaa5395aa539..c9ab1a367939 100644
>> --- a/target/i386/cpu.h
>> +++ b/target/i386/cpu.h
>> @@ -2126,8 +2126,6 @@ void x86_cpu_set_default_version(X86CPUVersion
>> version);
>>   /* Return name of 32-bit register, from a R_* constant */
>>   const char *get_register_name_32(unsigned int reg);
>>   -void enable_compat_apic_id_mode(void);
>> -
>>   #define APIC_DEFAULT_ADDRESS 0xfee00000
>>   #define APIC_SPACE_SIZE      0x100000
>>  
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

Cheers!
Laszlo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-08 13:29   ` Philippe Mathieu-Daudé
@ 2019-10-08 18:31     ` Laszlo Ersek
  0 siblings, 0 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-08 18:31 UTC (permalink / raw)
  To: Philippe Mathieu-Daudé, qemu devel list
  Cc: Eduardo Habkost, Michael S. Tsirkin, Paolo Bonzini,
	Igor Mammedov, Richard Henderson

On 10/08/19 15:29, Philippe Mathieu-Daudé wrote:
> Hi Laszlo,
> 
> On 10/8/19 12:52 PM, Laszlo Ersek wrote:
>> FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest
>> firmware,
>> due to historical reasons. That value is not useful to edk2, however. For
>> supporting VCPU hotplug, edk2 needs:
>>
>> - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
>>
>> - and the maximum foreseen CPU count (tracked in
>>    "MachineState.smp.max_cpus", but not currently exposed).
>>
>> Add a new fw-cfg file to expose "max_cpus".
>>
>> While at it, expose the rest of the topology too (die / core / thread
>> counts), because I expect that the VCPU hotplug feature for OVMF will
>> ultimately need those too, and the data size is not large. This is
>> slightly complicated by the fact that the die count is specific to
>> PCMachineState, but fw_cfg_arch_create() intends to be PC-independent
>> (see
>> commit 149c50cabcc4).
> 
> The X86 topology is generic to the architecture (not machine specific)
> so it is well placed in fw_cfg_arch_create().

Certainly -- I didn't mean to "complain" in the commit message, just to
point out why I added a new parameter to fw_cfg_arch_create().

Because, my first instinct was to change fw_cfg_arch_create() to simply
take a (PCMachineState*), and then fw_cfg_arch_create() could fetch
whatever it needed, internally.

But, upon finding your commit 149c50cabcc4, I realized that adding a new
parameter was the correct approach (just "slightly complicated" relative
to passing (PCMachineState*) whole-sale.)

To wit, I didn't write "*tries* to be PC-independent", but "*intends* to
be PC-independent". I agree with the intent :)

> 
>>
>> For now, the feature is temporarily disabled.
> 
> I see you enable it in the PC machine in the next patch.
> Do you plan to remove the 'expose_topology' argument and expose the key
> later, or is this comment simply related to this patch?
> 
> Ah, now I see you disable it previous to pc-4.2, OK.

Right, the sentence only refers to the "false" argument in this patch,
for the "expose_topology" parameter.

It took me some time to come up with this solution. I certainly wanted
to separate the feature from the versioned machine type changes. One
approach could have been to introduce the fw_cfg_expose_topology()
function in itself, in a separate patch -- but then bisection would
break at that commit, because gcc doesn't like static functions
(functions with internal linkage) that are never called. So I wanted to
call the new function at once, but short-circuit it too.

(I tend to build patch series at every stage, before posting them.)

> 
>> Cc: "Michael S. Tsirkin" <mst@redhat.com>
>> Cc: Eduardo Habkost <ehabkost@redhat.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
>> Cc: Richard Henderson <rth@twiddle.net>
>> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
>> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>> ---
>>   hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++-
>>   hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++--
>>   hw/i386/pc.c     |  4 ++--
>>   3 files changed, 55 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
>> index e0856a376996..d742435b9793 100644
>> --- a/hw/i386/fw_cfg.h
>> +++ b/hw/i386/fw_cfg.h
>> @@ -18,9 +18,37 @@
>>   #define FW_CFG_E820_TABLE       (FW_CFG_ARCH_LOCAL + 3)
>>   #define FW_CFG_HPET             (FW_CFG_ARCH_LOCAL + 4)
>>   +/**
>> + * FWCfgX86Topology: expose the X86 CPU topology to guest firmware
>> over fw-cfg.
>> + *
>> + * All fields have little-endian encoding.
>> + *
>> + * @dies:     Number of dies per package (aka socket). Set it to 1
>> unless the
>> + *            concrete MachineState subclass defines it differently.
>> + * @cores:    Corresponds to @CpuTopology.@cores.
>> + * @threads:  Corresponds to @CpuTopology.@threads.
>> + * @max_cpus: Corresponds to @CpuTopology.@max_cpus.
>> + *
>> + * Firmware can derive the package (aka socket) count with the following
>> + * formula:
>> + *
>> + *   DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads)
>> + *
>> + * Firmware can derive APIC ID field widths and offsets per the standard
>> + * calculations in "include/hw/i386/topology.h".
>> + */
>> +typedef struct FWCfgX86Topology {
>> +  uint32_t dies;
>> +  uint32_t cores;
>> +  uint32_t threads;
>> +  uint32_t max_cpus;
>> +} QEMU_PACKED FWCfgX86Topology;
>> +
>>   FWCfgState *fw_cfg_arch_create(MachineState *ms,
>>                                  uint16_t boot_cpus,
>> -                               uint16_t apic_id_limit);
>> +                               uint16_t apic_id_limit,
>> +                               unsigned smp_dies,
>> +                               bool expose_topology);
>>   void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
>>   void fw_cfg_build_feature_control(MachineState *ms, FWCfgState
>> *fw_cfg);
>>   diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
>> index 39b6bc60520c..33d09875014f 100644
>> --- a/hw/i386/fw_cfg.c
>> +++ b/hw/i386/fw_cfg.c
>> @@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms,
>> FWCfgState *fw_cfg)
>>       }
>>   }
>>   +static void fw_cfg_expose_topology(FWCfgState *fw_cfg,
>> +                                   unsigned dies,
>> +                                   unsigned cores,
>> +                                   unsigned threads,
>> +                                   unsigned max_cpus)
>> +{
>> +    FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1);
>> +
>> +    topo->dies     = cpu_to_le32(dies);
>> +    topo->cores    = cpu_to_le32(cores);
>> +    topo->threads  = cpu_to_le32(threads);
>> +    topo->max_cpus = cpu_to_le32(max_cpus);
>> +    fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo);
>> +}
>> +
>>   FWCfgState *fw_cfg_arch_create(MachineState *ms,
>> -                                      uint16_t boot_cpus,
>> -                                      uint16_t apic_id_limit)
>> +                               uint16_t boot_cpus,
>> +                               uint16_t apic_id_limit,
>> +                               unsigned smp_dies,
>> +                               bool expose_topology)
>>   {
>>       FWCfgState *fw_cfg;
>>       uint64_t *numa_fw_cfg;
>> @@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
>>                        (1 + apic_id_limit + nb_numa_nodes) *
>>                        sizeof(*numa_fw_cfg));
>>   +    if (expose_topology) {
>> +        fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores,
>> +                               ms->smp.threads, ms->smp.max_cpus);
>> +    }
>> +
>>       return fw_cfg;
>>   }
>>   diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index bcda50efcc23..bb72b12edad2 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms,
>>                                           option_rom_mr,
>>                                           1);
>>   -    fw_cfg = fw_cfg_arch_create(machine,
>> -                                pcms->boot_cpus, pcms->apic_id_limit);
>> +    fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus,
>> pcms->apic_id_limit,
>> +                                pcms->smp_dies, false);
>>         rom_set_fw(fw_cfg);
>>   
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

Thank you!
Laszlo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-08 10:52 ` [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg Laszlo Ersek
  2019-10-08 13:29   ` Philippe Mathieu-Daudé
  2019-10-08 15:59   ` Igor Mammedov
@ 2019-10-08 18:58   ` Laszlo Ersek
  2019-10-09 11:13     ` Igor Mammedov
  2019-10-09 21:09     ` Eduardo Habkost
  2 siblings, 2 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-08 18:58 UTC (permalink / raw)
  To: Eduardo Habkost, Igor Mammedov
  Cc: Paolo Bonzini, Richard Henderson, Philippe Mathieu-Daudé,
	qemu devel list, Michael S. Tsirkin

Eduardo, Igor,

On 10/08/19 12:52, Laszlo Ersek wrote:
> FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
> due to historical reasons. That value is not useful to edk2, however. For
> supporting VCPU hotplug, edk2 needs:
> 
> - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
> 
> - and the maximum foreseen CPU count (tracked in
>   "MachineState.smp.max_cpus", but not currently exposed).
> 
> Add a new fw-cfg file to expose "max_cpus".
> 
> While at it, expose the rest of the topology too (die / core / thread
> counts), because I expect that the VCPU hotplug feature for OVMF will
> ultimately need those too, and the data size is not large.

In fact, it seems like OVMF will have to synthesize the new
(hot-plugged) VCPU's *APIC-ID* from the following information sources:

- the topology information described above (die / core / thread counts), and

- the "modern" CPU hotplug interface (docs/specs/acpi_cpu_hotplug.txt).

Now, if I understand correctly, the "CPU selector" ([0x0-0x3]) specifies
a CPU *index*. Therefore, in the hotplug SMI handler (running on one of
the pre-existent CPUs), OVMF will have to translate the new CPU's
selector (the new CPU's *index*) to its *APIC-ID*, based on the topology
information (numbers of dies / cores / threads).

(That's because existent SMM infrastructure in edk2 uses the initial
APIC-ID as the key for referencing CPUs.)

Algorithmically, I think this translation is doable in OVMF  -- after
all, it is implemented in the x86_apicid_from_cpu_idx() function
already, in "include/hw/i386/topology.h". And that function does not
need more information either:

static inline apic_id_t x86_apicid_from_cpu_idx(unsigned nr_dies,
                                                unsigned nr_cores,
                                                unsigned nr_threads,
                                                unsigned cpu_index)

Therefore, my plan is to implement the same translation logic in OVMF.

Now, the questions:

- Am I right to think that the "CPU selector" register in the "modern"
ACPI hotplug interface operates in the *same domain* as the "cpu_index"
parameter of x86_apicid_from_cpu_idx()?

- As we progress through CPU indices, x86_apicid_from_cpu_idx() first
fills threads in a core, then cores in a die, then dies in a socket.
Will this logic remain the same, forever?

If any of the two questions above is answered with "no", then OVMF will
need an fw_cfg blob that is different from the current proposal.

Namely, OVMF will need a *full* "cpu_index -> APIC-ID" map, up to (and
excluding) "max_cpus".

The pc_possible_cpu_arch_ids() function in "hw/i386/pc.c" already
calculates a similar map:

        ms->possible_cpus->cpus[i].arch_id =
x86_cpu_apic_id_from_index(pcms, i);

So, basically that map is what OVMF would have to receive over fw_cfg,
*if* the "cpu_index -> APIC-ID" mapping is not considered guest ABI.
Should I write v2 for that?

Please comment!

Thanks,
Laszlo


> This is
> slightly complicated by the fact that the die count is specific to
> PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
> commit 149c50cabcc4).
> 
> For now, the feature is temporarily disabled.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Eduardo Habkost <ehabkost@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> Cc: Richard Henderson <rth@twiddle.net>
> Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
> Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> ---
>  hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++-
>  hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++--
>  hw/i386/pc.c     |  4 ++--
>  3 files changed, 55 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
> index e0856a376996..d742435b9793 100644
> --- a/hw/i386/fw_cfg.h
> +++ b/hw/i386/fw_cfg.h
> @@ -18,9 +18,37 @@
>  #define FW_CFG_E820_TABLE       (FW_CFG_ARCH_LOCAL + 3)
>  #define FW_CFG_HPET             (FW_CFG_ARCH_LOCAL + 4)
>  
> +/**
> + * FWCfgX86Topology: expose the X86 CPU topology to guest firmware over fw-cfg.
> + *
> + * All fields have little-endian encoding.
> + *
> + * @dies:     Number of dies per package (aka socket). Set it to 1 unless the
> + *            concrete MachineState subclass defines it differently.
> + * @cores:    Corresponds to @CpuTopology.@cores.
> + * @threads:  Corresponds to @CpuTopology.@threads.
> + * @max_cpus: Corresponds to @CpuTopology.@max_cpus.
> + *
> + * Firmware can derive the package (aka socket) count with the following
> + * formula:
> + *
> + *   DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads)
> + *
> + * Firmware can derive APIC ID field widths and offsets per the standard
> + * calculations in "include/hw/i386/topology.h".
> + */
> +typedef struct FWCfgX86Topology {
> +  uint32_t dies;
> +  uint32_t cores;
> +  uint32_t threads;
> +  uint32_t max_cpus;
> +} QEMU_PACKED FWCfgX86Topology;
> +
>  FWCfgState *fw_cfg_arch_create(MachineState *ms,
>                                 uint16_t boot_cpus,
> -                               uint16_t apic_id_limit);
> +                               uint16_t apic_id_limit,
> +                               unsigned smp_dies,
> +                               bool expose_topology);
>  void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
>  void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
>  
> diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
> index 39b6bc60520c..33d09875014f 100644
> --- a/hw/i386/fw_cfg.c
> +++ b/hw/i386/fw_cfg.c
> @@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg)
>      }
>  }
>  
> +static void fw_cfg_expose_topology(FWCfgState *fw_cfg,
> +                                   unsigned dies,
> +                                   unsigned cores,
> +                                   unsigned threads,
> +                                   unsigned max_cpus)
> +{
> +    FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1);
> +
> +    topo->dies     = cpu_to_le32(dies);
> +    topo->cores    = cpu_to_le32(cores);
> +    topo->threads  = cpu_to_le32(threads);
> +    topo->max_cpus = cpu_to_le32(max_cpus);
> +    fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo);
> +}
> +
>  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> -                                      uint16_t boot_cpus,
> -                                      uint16_t apic_id_limit)
> +                               uint16_t boot_cpus,
> +                               uint16_t apic_id_limit,
> +                               unsigned smp_dies,
> +                               bool expose_topology)
>  {
>      FWCfgState *fw_cfg;
>      uint64_t *numa_fw_cfg;
> @@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
>                       (1 + apic_id_limit + nb_numa_nodes) *
>                       sizeof(*numa_fw_cfg));
>  
> +    if (expose_topology) {
> +        fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores,
> +                               ms->smp.threads, ms->smp.max_cpus);
> +    }
> +
>      return fw_cfg;
>  }
>  
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index bcda50efcc23..bb72b12edad2 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms,
>                                          option_rom_mr,
>                                          1);
>  
> -    fw_cfg = fw_cfg_arch_create(machine,
> -                                pcms->boot_cpus, pcms->apic_id_limit);
> +    fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, pcms->apic_id_limit,
> +                                pcms->smp_dies, false);
>  
>      rom_set_fw(fw_cfg);
>  
> 



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-08 18:58   ` Laszlo Ersek
@ 2019-10-09 11:13     ` Igor Mammedov
  2019-10-09 21:03       ` Laszlo Ersek
  2019-10-09 21:09     ` Eduardo Habkost
  1 sibling, 1 reply; 24+ messages in thread
From: Igor Mammedov @ 2019-10-09 11:13 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Eduardo Habkost, Michael S. Tsirkin, qemu devel list,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson

On Tue, 8 Oct 2019 20:58:30 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> Eduardo, Igor,
> 
> On 10/08/19 12:52, Laszlo Ersek wrote:
> > FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
> > due to historical reasons. That value is not useful to edk2, however. For
> > supporting VCPU hotplug, edk2 needs:
> > 
> > - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
> > 
> > - and the maximum foreseen CPU count (tracked in
> >   "MachineState.smp.max_cpus", but not currently exposed).
> > 
> > Add a new fw-cfg file to expose "max_cpus".
> > 
> > While at it, expose the rest of the topology too (die / core / thread
> > counts), because I expect that the VCPU hotplug feature for OVMF will
> > ultimately need those too, and the data size is not large.  
> 
> In fact, it seems like OVMF will have to synthesize the new
> (hot-plugged) VCPU's *APIC-ID* from the following information sources:
> 
> - the topology information described above (die / core / thread counts), and
> 
> - the "modern" CPU hotplug interface (docs/specs/acpi_cpu_hotplug.txt).

In general duplicating cpu_index+topo => apic id in firmware I
consider as a really bad idea (even ignoring cpu_index 
which I were trying to get rid of in QEMU), it's going to break
when algorithms diverge and it will be never ending race.

Topology is rather messy business, not only arch specific but also
cpu specific (ex: on my review TODO list, there is a series for
fixing broken EPYCs topo). Who knows what other variables would be
add dependencies for calculating APIC IDs down the road.

I also consider to re-use CPU hotplug interface on ARM, which will
bring its own set of algorithms.

Let's instead add a command to CPU hotplug interface to return
APIC ID (which QEMU already calculated) and later MPIDR (ARM)
for selected CPU, so firmware could get it while enumeration CPUs
via that interface.

> 
> Now, if I understand correctly, the "CPU selector" ([0x0-0x3]) specifies
> a CPU *index*. Therefore, in the hotplug SMI handler (running on one of
> the pre-existent CPUs), OVMF will have to translate the new CPU's
> selector (the new CPU's *index*) to its *APIC-ID*, based on the topology
> information (numbers of dies / cores / threads).
> 
> (That's because existent SMM infrastructure in edk2 uses the initial
> APIC-ID as the key for referencing CPUs.)
> 
> Algorithmically, I think this translation is doable in OVMF  -- after
> all, it is implemented in the x86_apicid_from_cpu_idx() function
> already, in "include/hw/i386/topology.h". And that function does not
> need more information either:
> 
> static inline apic_id_t x86_apicid_from_cpu_idx(unsigned nr_dies,
>                                                 unsigned nr_cores,
>                                                 unsigned nr_threads,
>                                                 unsigned cpu_index)
> 
> Therefore, my plan is to implement the same translation logic in OVMF.
> 
> Now, the questions:
> 
> - Am I right to think that the "CPU selector" register in the "modern"
> ACPI hotplug interface operates in the *same domain* as the "cpu_index"
> parameter of x86_apicid_from_cpu_idx()?
>
> - As we progress through CPU indices, x86_apicid_from_cpu_idx() first
> fills threads in a core, then cores in a die, then dies in a socket.
> Will this logic remain the same, forever?
> 
> If any of the two questions above is answered with "no", then OVMF will
> need an fw_cfg blob that is different from the current proposal.
> 
> Namely, OVMF will need a *full* "cpu_index -> APIC-ID" map, up to (and
> excluding) "max_cpus".
> 
> The pc_possible_cpu_arch_ids() function in "hw/i386/pc.c" already
> calculates a similar map:
> 
>         ms->possible_cpus->cpus[i].arch_id =
> x86_cpu_apic_id_from_index(pcms, i);
> 
> So, basically that map is what OVMF would have to receive over fw_cfg,
> *if* the "cpu_index -> APIC-ID" mapping is not considered guest ABI.
> Should I write v2 for that?
> 
> Please comment!
> 
> Thanks,
> Laszlo
> 
> 
> > This is
> > slightly complicated by the fact that the die count is specific to
> > PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
> > commit 149c50cabcc4).
> > 
> > For now, the feature is temporarily disabled.
> > 
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Eduardo Habkost <ehabkost@redhat.com>
> > Cc: Igor Mammedov <imammedo@redhat.com>
> > Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> > Cc: Richard Henderson <rth@twiddle.net>
> > Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
> > Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> > ---
> >  hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++-
> >  hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++--
> >  hw/i386/pc.c     |  4 ++--
> >  3 files changed, 55 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
> > index e0856a376996..d742435b9793 100644
> > --- a/hw/i386/fw_cfg.h
> > +++ b/hw/i386/fw_cfg.h
> > @@ -18,9 +18,37 @@
> >  #define FW_CFG_E820_TABLE       (FW_CFG_ARCH_LOCAL + 3)
> >  #define FW_CFG_HPET             (FW_CFG_ARCH_LOCAL + 4)
> >  
> > +/**
> > + * FWCfgX86Topology: expose the X86 CPU topology to guest firmware over fw-cfg.
> > + *
> > + * All fields have little-endian encoding.
> > + *
> > + * @dies:     Number of dies per package (aka socket). Set it to 1 unless the
> > + *            concrete MachineState subclass defines it differently.
> > + * @cores:    Corresponds to @CpuTopology.@cores.
> > + * @threads:  Corresponds to @CpuTopology.@threads.
> > + * @max_cpus: Corresponds to @CpuTopology.@max_cpus.
> > + *
> > + * Firmware can derive the package (aka socket) count with the following
> > + * formula:
> > + *
> > + *   DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads)
> > + *
> > + * Firmware can derive APIC ID field widths and offsets per the standard
> > + * calculations in "include/hw/i386/topology.h".
> > + */
> > +typedef struct FWCfgX86Topology {
> > +  uint32_t dies;
> > +  uint32_t cores;
> > +  uint32_t threads;
> > +  uint32_t max_cpus;
> > +} QEMU_PACKED FWCfgX86Topology;
> > +
> >  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> >                                 uint16_t boot_cpus,
> > -                               uint16_t apic_id_limit);
> > +                               uint16_t apic_id_limit,
> > +                               unsigned smp_dies,
> > +                               bool expose_topology);
> >  void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
> >  void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
> >  
> > diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
> > index 39b6bc60520c..33d09875014f 100644
> > --- a/hw/i386/fw_cfg.c
> > +++ b/hw/i386/fw_cfg.c
> > @@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg)
> >      }
> >  }
> >  
> > +static void fw_cfg_expose_topology(FWCfgState *fw_cfg,
> > +                                   unsigned dies,
> > +                                   unsigned cores,
> > +                                   unsigned threads,
> > +                                   unsigned max_cpus)
> > +{
> > +    FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1);
> > +
> > +    topo->dies     = cpu_to_le32(dies);
> > +    topo->cores    = cpu_to_le32(cores);
> > +    topo->threads  = cpu_to_le32(threads);
> > +    topo->max_cpus = cpu_to_le32(max_cpus);
> > +    fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo);
> > +}
> > +
> >  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> > -                                      uint16_t boot_cpus,
> > -                                      uint16_t apic_id_limit)
> > +                               uint16_t boot_cpus,
> > +                               uint16_t apic_id_limit,
> > +                               unsigned smp_dies,
> > +                               bool expose_topology)
> >  {
> >      FWCfgState *fw_cfg;
> >      uint64_t *numa_fw_cfg;
> > @@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
> >                       (1 + apic_id_limit + nb_numa_nodes) *
> >                       sizeof(*numa_fw_cfg));
> >  
> > +    if (expose_topology) {
> > +        fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores,
> > +                               ms->smp.threads, ms->smp.max_cpus);
> > +    }
> > +
> >      return fw_cfg;
> >  }
> >  
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index bcda50efcc23..bb72b12edad2 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms,
> >                                          option_rom_mr,
> >                                          1);
> >  
> > -    fw_cfg = fw_cfg_arch_create(machine,
> > -                                pcms->boot_cpus, pcms->apic_id_limit);
> > +    fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, pcms->apic_id_limit,
> > +                                pcms->smp_dies, false);
> >  
> >      rom_set_fw(fw_cfg);
> >  
> >   
> 



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-08 15:59   ` Igor Mammedov
@ 2019-10-09 21:01     ` Laszlo Ersek
  2019-10-10  9:45       ` Igor Mammedov
  2019-10-10 10:01     ` Michael S. Tsirkin
  1 sibling, 1 reply; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-09 21:01 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Eduardo Habkost, Michael S. Tsirkin, qemu devel list,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson

On 10/08/19 17:59, Igor Mammedov wrote:
> On Tue,  8 Oct 2019 12:52:58 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
>> due to historical reasons. That value is not useful to edk2, however. For
>> supporting VCPU hotplug, edk2 needs:
>>
>> - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
>>
>> - and the maximum foreseen CPU count (tracked in
>>   "MachineState.smp.max_cpus", but not currently exposed).
> one can get it with current QEMU without adding new fgcfg
> (albeit in a bit awkward way)
> 
> max_cpu count can be derived indirectly as result of cpu hotplug
> enumeration (IO interface at 0x0cd8-0xcf7) by writing/reading
> to/from selector register (see ACPI_CPU_SELECTOR_OFFSET_WR)
> until read value stops changing values (i.e. max cpu count
> is reached). One also can figure out present/non-present
> cpu status by reading flags register.

What do you mean by "read value stops changing values"?

I assume I have to write the CPU index (in incrementing fashion) to
offset 0 in the register block.

- What byte order?
- What offset / width do I need to read back? What endianness? :)
- What is the expected value once I run out of the possible CPU range?

(I tried to figure these out from "docs/specs/acpi_cpu_hotplug.txt", but
I can't find the answers in it. Apologies.)

Other than that, I'm fine with this method. Hopefully the IO port
accesses (on every boot) won't slow down the boot much (esp. in SEV
guests, where they are more costly).


>> Add a new fw-cfg file to expose "max_cpus".
>>
>> While at it, expose the rest of the topology too (die / core / thread
>> counts), because I expect that the VCPU hotplug feature for OVMF will
>> ultimately need those too, and the data size is not large. This is
>> slightly complicated by the fact that the die count is specific to
>> PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
>> commit 149c50cabcc4).
> Could you clarify why topology info is necessary?

(Done in the subsequent message, but I'll answer here too, below.)


> Potentially it's possible to extend cpu hotplug ABI to report
> arch specific apic-id (x86) or mpidr (arm) if firmware really
> needs to know topology and let guest to decode it according
> to CPU's spec.

This would be very nice.

For the hotplug use case, the internal structure / topology of the
APIC-ID actually appears irrelevant. What's needed is that the "host
CPU", handling the hotplug SMI, can *somehow* deduce the APIC-ID of the
new CPU. (The edk2 code suggests that, on physical platforms, the RAS
controller passes the new APIC-ID the the "host CPU".) The edk2
infrastructure uses APIC-ID's as the unique key for identifying CPUs.

The topology info was supposed to allow OVMF to calculate the APIC-ID
from scratch, based on the sequential CPU index (retrieved from the ACPI
hotplug register block).

> So far there were no need for it as all possible cpus are
> described in ACPI tables passed to guest, but I'm not going
> to suggest to parse them on firmware side as it's too complicated :)

Thanks, that's appreciated :)

> PS:
> The reason we started building ACPI tables in QEMU, was never
> ending story of adding more ABI and supporting it afterwards.
> So I'd try to avoid doing it if it can be helped.

Sure, I don't insist.

If the hotplug register block can expose the APIC-IDs as "opaque"
integers, and they match the APIC-IDs read on the actual processors,
things should work.

Thanks,
Laszlo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-09 11:13     ` Igor Mammedov
@ 2019-10-09 21:03       ` Laszlo Ersek
  0 siblings, 0 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-09 21:03 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Eduardo Habkost, Michael S. Tsirkin, qemu devel list,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson

On 10/09/19 13:13, Igor Mammedov wrote:
> On Tue, 8 Oct 2019 20:58:30 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> Eduardo, Igor,
>>
>> On 10/08/19 12:52, Laszlo Ersek wrote:
>>> FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
>>> due to historical reasons. That value is not useful to edk2, however. For
>>> supporting VCPU hotplug, edk2 needs:
>>>
>>> - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
>>>
>>> - and the maximum foreseen CPU count (tracked in
>>>   "MachineState.smp.max_cpus", but not currently exposed).
>>>
>>> Add a new fw-cfg file to expose "max_cpus".
>>>
>>> While at it, expose the rest of the topology too (die / core / thread
>>> counts), because I expect that the VCPU hotplug feature for OVMF will
>>> ultimately need those too, and the data size is not large.  
>>
>> In fact, it seems like OVMF will have to synthesize the new
>> (hot-plugged) VCPU's *APIC-ID* from the following information sources:
>>
>> - the topology information described above (die / core / thread counts), and
>>
>> - the "modern" CPU hotplug interface (docs/specs/acpi_cpu_hotplug.txt).
> 
> In general duplicating cpu_index+topo => apic id in firmware I
> consider as a really bad idea (even ignoring cpu_index 
> which I were trying to get rid of in QEMU), it's going to break
> when algorithms diverge and it will be never ending race.

OK.

> Topology is rather messy business, not only arch specific but also
> cpu specific (ex: on my review TODO list, there is a series for
> fixing broken EPYCs topo). Who knows what other variables would be
> add dependencies for calculating APIC IDs down the road.
> 
> I also consider to re-use CPU hotplug interface on ARM, which will
> bring its own set of algorithms.
> 
> Let's instead add a command to CPU hotplug interface to return
> APIC ID (which QEMU already calculated) and later MPIDR (ARM)
> for selected CPU, so firmware could get it while enumeration CPUs
> via that interface.

Sounds good to me, thanks.

I'll stay tuned for your patches! :)

Thanks!
Laszlo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-08 18:58   ` Laszlo Ersek
  2019-10-09 11:13     ` Igor Mammedov
@ 2019-10-09 21:09     ` Eduardo Habkost
  1 sibling, 0 replies; 24+ messages in thread
From: Eduardo Habkost @ 2019-10-09 21:09 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Michael S. Tsirkin, qemu devel list, Paolo Bonzini,
	Igor Mammedov, Philippe Mathieu-Daudé,
	Richard Henderson

On Tue, Oct 08, 2019 at 08:58:30PM +0200, Laszlo Ersek wrote:
> Eduardo, Igor,
> 
> On 10/08/19 12:52, Laszlo Ersek wrote:
> > FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
> > due to historical reasons. That value is not useful to edk2, however. For
> > supporting VCPU hotplug, edk2 needs:
> > 
> > - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
> > 
> > - and the maximum foreseen CPU count (tracked in
> >   "MachineState.smp.max_cpus", but not currently exposed).
> > 
> > Add a new fw-cfg file to expose "max_cpus".
> > 
> > While at it, expose the rest of the topology too (die / core / thread
> > counts), because I expect that the VCPU hotplug feature for OVMF will
> > ultimately need those too, and the data size is not large.
> 
> In fact, it seems like OVMF will have to synthesize the new
> (hot-plugged) VCPU's *APIC-ID* from the following information sources:
> 
> - the topology information described above (die / core / thread counts), and
> 
> - the "modern" CPU hotplug interface (docs/specs/acpi_cpu_hotplug.txt).
> 
> Now, if I understand correctly, the "CPU selector" ([0x0-0x3]) specifies
> a CPU *index*. Therefore, in the hotplug SMI handler (running on one of
> the pre-existent CPUs), OVMF will have to translate the new CPU's
> selector (the new CPU's *index*) to its *APIC-ID*, based on the topology
> information (numbers of dies / cores / threads).

I thought we had already fixed all our guest interfaces to make
CPU indexes never visible to guests.  If it is still visible
somewhere, it's a bug we need to fix.

-- 
Eduardo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-09 21:01     ` Laszlo Ersek
@ 2019-10-10  9:45       ` Igor Mammedov
  0 siblings, 0 replies; 24+ messages in thread
From: Igor Mammedov @ 2019-10-10  9:45 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Eduardo Habkost, Michael S. Tsirkin, qemu devel list,
	Paolo Bonzini, Philippe Mathieu-Daudé,
	Richard Henderson

On Wed, 9 Oct 2019 23:01:21 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

> On 10/08/19 17:59, Igor Mammedov wrote:
> > On Tue,  8 Oct 2019 12:52:58 +0200
> > Laszlo Ersek <lersek@redhat.com> wrote:
> >   
> >> FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
> >> due to historical reasons. That value is not useful to edk2, however. For
> >> supporting VCPU hotplug, edk2 needs:
> >>
> >> - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
> >>
> >> - and the maximum foreseen CPU count (tracked in
> >>   "MachineState.smp.max_cpus", but not currently exposed).  
> > one can get it with current QEMU without adding new fgcfg
> > (albeit in a bit awkward way)
> > 
> > max_cpu count can be derived indirectly as result of cpu hotplug
> > enumeration (IO interface at 0x0cd8-0xcf7) by writing/reading
> > to/from selector register (see ACPI_CPU_SELECTOR_OFFSET_WR)
> > until read value stops changing values (i.e. max cpu count
> > is reached). One also can figure out present/non-present
> > cpu status by reading flags register.  
> 
> What do you mean by "read value stops changing values"?
> 
> I assume I have to write the CPU index (in incrementing fashion) to
> offset 0 in the register block.
> 
> - What byte order?
> - What offset / width do I need to read back? What endianness? :)
Since it's ACPI oriented oriented, it's supposed to be little-endian.
But spec doesn't mention it and apparently code I wrote back then
have bugs in this regard.



> - What is the expected value once I run out of the possible CPU range?
> (I tried to figure these out from "docs/specs/acpi_cpu_hotplug.txt", but
> I can't find the answers in it. Apologies.)

The apology is all mine. I should've written better spec/code.
I'll fix it and update spec/code to match expected byte-order.

As for a way to enumerate CPUs an APIC ID, I've just posted patches
updating spec with example workflows and exposing APIC ID
in the interface.

  [RFC 0/3] acpi: cphp: add CPHP_GET_CPU_ID_CMD command to cpu hotplug  MMIO interface


> Other than that, I'm fine with this method. Hopefully the IO port
> accesses (on every boot) won't slow down the boot much (esp. in SEV
> guests, where they are more costly).
> 
> 
> >> Add a new fw-cfg file to expose "max_cpus".
> >>
> >> While at it, expose the rest of the topology too (die / core / thread
> >> counts), because I expect that the VCPU hotplug feature for OVMF will
> >> ultimately need those too, and the data size is not large. This is
> >> slightly complicated by the fact that the die count is specific to
> >> PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
> >> commit 149c50cabcc4).  
> > Could you clarify why topology info is necessary?  
> 
> (Done in the subsequent message, but I'll answer here too, below.)
> 
> 
> > Potentially it's possible to extend cpu hotplug ABI to report
> > arch specific apic-id (x86) or mpidr (arm) if firmware really
> > needs to know topology and let guest to decode it according
> > to CPU's spec.  
> 
> This would be very nice.
> 
> For the hotplug use case, the internal structure / topology of the
> APIC-ID actually appears irrelevant. What's needed is that the "host
> CPU", handling the hotplug SMI, can *somehow* deduce the APIC-ID of the
> new CPU. (The edk2 code suggests that, on physical platforms, the RAS
> controller passes the new APIC-ID the the "host CPU".) The edk2
> infrastructure uses APIC-ID's as the unique key for identifying CPUs.
> 
> The topology info was supposed to allow OVMF to calculate the APIC-ID
> from scratch, based on the sequential CPU index (retrieved from the ACPI
> hotplug register block).
> 
> > So far there were no need for it as all possible cpus are
> > described in ACPI tables passed to guest, but I'm not going
> > to suggest to parse them on firmware side as it's too complicated :)  
> 
> Thanks, that's appreciated :)
> 
> > PS:
> > The reason we started building ACPI tables in QEMU, was never
> > ending story of adding more ABI and supporting it afterwards.
> > So I'd try to avoid doing it if it can be helped.  
> 
> Sure, I don't insist.
> 
> If the hotplug register block can expose the APIC-IDs as "opaque"
> integers, and they match the APIC-IDs read on the actual processors,
> things should work.
> 
> Thanks,
> Laszlo



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-08 15:59   ` Igor Mammedov
  2019-10-09 21:01     ` Laszlo Ersek
@ 2019-10-10 10:01     ` Michael S. Tsirkin
  2019-10-10 12:48       ` Igor Mammedov
  2019-10-10 16:15       ` Laszlo Ersek
  1 sibling, 2 replies; 24+ messages in thread
From: Michael S. Tsirkin @ 2019-10-10 10:01 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Eduardo Habkost, Laszlo Ersek, qemu devel list, Paolo Bonzini,
	Philippe Mathieu-Daudé,
	Richard Henderson

On Tue, Oct 08, 2019 at 05:59:31PM +0200, Igor Mammedov wrote:
> On Tue,  8 Oct 2019 12:52:58 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
> > FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
> > due to historical reasons. That value is not useful to edk2, however. For
> > supporting VCPU hotplug, edk2 needs:
> > 
> > - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
> > 
> > - and the maximum foreseen CPU count (tracked in
> >   "MachineState.smp.max_cpus", but not currently exposed).
> one can get it with current QEMU without adding new fgcfg
> (albeit in a bit awkward way)
> 
> max_cpu count can be derived indirectly as result of cpu hotplug
> enumeration (IO interface at 0x0cd8-0xcf7) by writing/reading
> to/from selector register (see ACPI_CPU_SELECTOR_OFFSET_WR)
> until read value stops changing values (i.e. max cpu count
> is reached). One also can figure out present/non-present
> cpu status by reading flags register.

Right but so far ACPI was the only driver of CPU hotplug, right?
I don't think firmware should poke it directly, it is
not really clean or especially scalable.
Fine for ACPI but not great as a HV/guest interface.

> > Add a new fw-cfg file to expose "max_cpus".
> > 
> > While at it, expose the rest of the topology too (die / core / thread
> > counts), because I expect that the VCPU hotplug feature for OVMF will
> > ultimately need those too, and the data size is not large. This is
> > slightly complicated by the fact that the die count is specific to
> > PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
> > commit 149c50cabcc4).
> Could you clarify why topology info is necessary?
> 
> Potentially it's possible to extend cpu hotplug ABI to report
> arch specific apic-id (x86) or mpidr (arm) if firmware really
> needs to know topology and let guest to decode it according
> to CPU's spec.
> 
> So far there were no need for it as all possible cpus are
> described in ACPI tables passed to guest, but I'm not going
> to suggest to parse them on firmware side as it's too complicated :)

We can always add a QEMU specific data table by the way.
Format would be up to us and would be easy to parse.
I don't see a big advantage as compared to fw cfg though.

> PS:
> The reason we started building ACPI tables in QEMU, was never
> ending story of adding more ABI and supporting it afterwards.
> So I'd try to avoid doing it if it can be helped.

Absolutely. We should try to keep all ACPI generation in QEMU.
But this value is not for ACPI, is it?



> > For now, the feature is temporarily disabled.
> > 
> > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > Cc: Eduardo Habkost <ehabkost@redhat.com>
> > Cc: Igor Mammedov <imammedo@redhat.com>
> > Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> > Cc: Richard Henderson <rth@twiddle.net>
> > Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
> > Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> > ---
> >  hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++-
> >  hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++--
> >  hw/i386/pc.c     |  4 ++--
> >  3 files changed, 55 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
> > index e0856a376996..d742435b9793 100644
> > --- a/hw/i386/fw_cfg.h
> > +++ b/hw/i386/fw_cfg.h
> > @@ -18,9 +18,37 @@
> >  #define FW_CFG_E820_TABLE       (FW_CFG_ARCH_LOCAL + 3)
> >  #define FW_CFG_HPET             (FW_CFG_ARCH_LOCAL + 4)
> >  
> > +/**
> > + * FWCfgX86Topology: expose the X86 CPU topology to guest firmware over fw-cfg.
> > + *
> > + * All fields have little-endian encoding.
> > + *
> > + * @dies:     Number of dies per package (aka socket). Set it to 1 unless the
> > + *            concrete MachineState subclass defines it differently.
> > + * @cores:    Corresponds to @CpuTopology.@cores.
> > + * @threads:  Corresponds to @CpuTopology.@threads.
> > + * @max_cpus: Corresponds to @CpuTopology.@max_cpus.
> > + *
> > + * Firmware can derive the package (aka socket) count with the following
> > + * formula:
> > + *
> > + *   DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads)
> > + *
> > + * Firmware can derive APIC ID field widths and offsets per the standard
> > + * calculations in "include/hw/i386/topology.h".
> > + */
> > +typedef struct FWCfgX86Topology {
> > +  uint32_t dies;
> > +  uint32_t cores;
> > +  uint32_t threads;
> > +  uint32_t max_cpus;
> > +} QEMU_PACKED FWCfgX86Topology;
> > +
> >  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> >                                 uint16_t boot_cpus,
> > -                               uint16_t apic_id_limit);
> > +                               uint16_t apic_id_limit,
> > +                               unsigned smp_dies,
> > +                               bool expose_topology);
> >  void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
> >  void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
> >  
> > diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
> > index 39b6bc60520c..33d09875014f 100644
> > --- a/hw/i386/fw_cfg.c
> > +++ b/hw/i386/fw_cfg.c
> > @@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg)
> >      }
> >  }
> >  
> > +static void fw_cfg_expose_topology(FWCfgState *fw_cfg,
> > +                                   unsigned dies,
> > +                                   unsigned cores,
> > +                                   unsigned threads,
> > +                                   unsigned max_cpus)
> > +{
> > +    FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1);
> > +
> > +    topo->dies     = cpu_to_le32(dies);
> > +    topo->cores    = cpu_to_le32(cores);
> > +    topo->threads  = cpu_to_le32(threads);
> > +    topo->max_cpus = cpu_to_le32(max_cpus);
> > +    fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo);
> > +}
> > +
> >  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> > -                                      uint16_t boot_cpus,
> > -                                      uint16_t apic_id_limit)
> > +                               uint16_t boot_cpus,
> > +                               uint16_t apic_id_limit,
> > +                               unsigned smp_dies,
> > +                               bool expose_topology)
> >  {
> >      FWCfgState *fw_cfg;
> >      uint64_t *numa_fw_cfg;
> > @@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
> >                       (1 + apic_id_limit + nb_numa_nodes) *
> >                       sizeof(*numa_fw_cfg));
> >  
> > +    if (expose_topology) {
> > +        fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores,
> > +                               ms->smp.threads, ms->smp.max_cpus);
> > +    }
> > +
> >      return fw_cfg;
> >  }
> >  
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index bcda50efcc23..bb72b12edad2 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms,
> >                                          option_rom_mr,
> >                                          1);
> >  
> > -    fw_cfg = fw_cfg_arch_create(machine,
> > -                                pcms->boot_cpus, pcms->apic_id_limit);
> > +    fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, pcms->apic_id_limit,
> > +                                pcms->smp_dies, false);
> >  
> >      rom_set_fw(fw_cfg);
> >  


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-10 10:01     ` Michael S. Tsirkin
@ 2019-10-10 12:48       ` Igor Mammedov
  2019-10-10 16:23         ` Laszlo Ersek
  2019-10-10 16:15       ` Laszlo Ersek
  1 sibling, 1 reply; 24+ messages in thread
From: Igor Mammedov @ 2019-10-10 12:48 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, Laszlo Ersek, qemu devel list, Paolo Bonzini,
	Philippe Mathieu-Daudé,
	Richard Henderson

On Thu, 10 Oct 2019 06:01:51 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Oct 08, 2019 at 05:59:31PM +0200, Igor Mammedov wrote:
> > On Tue,  8 Oct 2019 12:52:58 +0200
> > Laszlo Ersek <lersek@redhat.com> wrote:
> >   
> > > FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
> > > due to historical reasons. That value is not useful to edk2, however. For
> > > supporting VCPU hotplug, edk2 needs:
> > > 
> > > - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
> > > 
> > > - and the maximum foreseen CPU count (tracked in
> > >   "MachineState.smp.max_cpus", but not currently exposed).  
> > one can get it with current QEMU without adding new fgcfg
> > (albeit in a bit awkward way)
> > 
> > max_cpu count can be derived indirectly as result of cpu hotplug
> > enumeration (IO interface at 0x0cd8-0xcf7) by writing/reading
> > to/from selector register (see ACPI_CPU_SELECTOR_OFFSET_WR)
> > until read value stops changing values (i.e. max cpu count
> > is reached). One also can figure out present/non-present
> > cpu status by reading flags register.  
> 
> Right but so far ACPI was the only driver of CPU hotplug, right?
even if its only user is ACPI now, like any other ABI it's
fixed and given that requirements to CPU hotplug ABI from ACPI
and firmware pretty much the same, I don't really see a reason
not to reuse it, since whole hotplug process orchestrated
by ACPI anyway.

> I don't think firmware should poke it directly, it is
> not really clean or especially scalable.
> Fine for ACPI but not great as a HV/guest interface.
Building an alternative ABI that will duplicate already
existing one, doesn't sound like a great idea to me.

Could you elaborate why do you think it's not a good idea
to re-use already existing CPU hotplug ABI in firmware?

Or even better suggest an algorithm how CPU hotplug should
work with SMM enabled OVMF.

 * in generic case, CPUs are asynchronously hotplugged and
   OSMP also asynchronously processing hotplug events.
   (So at the moment OSPM tells firmware to handle hotplug
    there are 1-n CPUs to handle and more might be coming)

> > > Add a new fw-cfg file to expose "max_cpus".
> > > 
> > > While at it, expose the rest of the topology too (die / core / thread
> > > counts), because I expect that the VCPU hotplug feature for OVMF will
> > > ultimately need those too, and the data size is not large. This is
> > > slightly complicated by the fact that the die count is specific to
> > > PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
> > > commit 149c50cabcc4).  
> > Could you clarify why topology info is necessary?
> > 
> > Potentially it's possible to extend cpu hotplug ABI to report
> > arch specific apic-id (x86) or mpidr (arm) if firmware really
> > needs to know topology and let guest to decode it according
> > to CPU's spec.
> > 
> > So far there were no need for it as all possible cpus are
> > described in ACPI tables passed to guest, but I'm not going
> > to suggest to parse them on firmware side as it's too complicated :)  
> 
> We can always add a QEMU specific data table by the way.
> Format would be up to us and would be easy to parse.
> I don't see a big advantage as compared to fw cfg though.

it doesn't really matter if it's ACPI blob or fw_cfg,
what firmware needs is a table of possible CPUs with APIC IDs.

But if we go this route (i.e. not reuse CPU hotplug interface),
the table alone is not enough, one would need to build a protocol
between ACPI and firmware to communicate what CPUs to (un)hotplug.
It's basically repeating what current CPU hotplug interface does.

> > PS:
> > The reason we started building ACPI tables in QEMU, was never
> > ending story of adding more ABI and supporting it afterwards.
> > So I'd try to avoid doing it if it can be helped.  
> 
> Absolutely. We should try to keep all ACPI generation in QEMU.
> But this value is not for ACPI, is it?
I'm not sure what you are trying to say here.
 
 
> 
> > > For now, the feature is temporarily disabled.
> > > 
> > > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > > Cc: Eduardo Habkost <ehabkost@redhat.com>
> > > Cc: Igor Mammedov <imammedo@redhat.com>
> > > Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > Cc: Richard Henderson <rth@twiddle.net>
> > > Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
> > > Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> > > ---
> > >  hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++-
> > >  hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++--
> > >  hw/i386/pc.c     |  4 ++--
> > >  3 files changed, 55 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
> > > index e0856a376996..d742435b9793 100644
> > > --- a/hw/i386/fw_cfg.h
> > > +++ b/hw/i386/fw_cfg.h
> > > @@ -18,9 +18,37 @@
> > >  #define FW_CFG_E820_TABLE       (FW_CFG_ARCH_LOCAL + 3)
> > >  #define FW_CFG_HPET             (FW_CFG_ARCH_LOCAL + 4)
> > >  
> > > +/**
> > > + * FWCfgX86Topology: expose the X86 CPU topology to guest firmware over fw-cfg.
> > > + *
> > > + * All fields have little-endian encoding.
> > > + *
> > > + * @dies:     Number of dies per package (aka socket). Set it to 1 unless the
> > > + *            concrete MachineState subclass defines it differently.
> > > + * @cores:    Corresponds to @CpuTopology.@cores.
> > > + * @threads:  Corresponds to @CpuTopology.@threads.
> > > + * @max_cpus: Corresponds to @CpuTopology.@max_cpus.
> > > + *
> > > + * Firmware can derive the package (aka socket) count with the following
> > > + * formula:
> > > + *
> > > + *   DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads)
> > > + *
> > > + * Firmware can derive APIC ID field widths and offsets per the standard
> > > + * calculations in "include/hw/i386/topology.h".
> > > + */
> > > +typedef struct FWCfgX86Topology {
> > > +  uint32_t dies;
> > > +  uint32_t cores;
> > > +  uint32_t threads;
> > > +  uint32_t max_cpus;
> > > +} QEMU_PACKED FWCfgX86Topology;
> > > +
> > >  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> > >                                 uint16_t boot_cpus,
> > > -                               uint16_t apic_id_limit);
> > > +                               uint16_t apic_id_limit,
> > > +                               unsigned smp_dies,
> > > +                               bool expose_topology);
> > >  void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
> > >  void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
> > >  
> > > diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
> > > index 39b6bc60520c..33d09875014f 100644
> > > --- a/hw/i386/fw_cfg.c
> > > +++ b/hw/i386/fw_cfg.c
> > > @@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg)
> > >      }
> > >  }
> > >  
> > > +static void fw_cfg_expose_topology(FWCfgState *fw_cfg,
> > > +                                   unsigned dies,
> > > +                                   unsigned cores,
> > > +                                   unsigned threads,
> > > +                                   unsigned max_cpus)
> > > +{
> > > +    FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1);
> > > +
> > > +    topo->dies     = cpu_to_le32(dies);
> > > +    topo->cores    = cpu_to_le32(cores);
> > > +    topo->threads  = cpu_to_le32(threads);
> > > +    topo->max_cpus = cpu_to_le32(max_cpus);
> > > +    fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo);
> > > +}
> > > +
> > >  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> > > -                                      uint16_t boot_cpus,
> > > -                                      uint16_t apic_id_limit)
> > > +                               uint16_t boot_cpus,
> > > +                               uint16_t apic_id_limit,
> > > +                               unsigned smp_dies,
> > > +                               bool expose_topology)
> > >  {
> > >      FWCfgState *fw_cfg;
> > >      uint64_t *numa_fw_cfg;
> > > @@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
> > >                       (1 + apic_id_limit + nb_numa_nodes) *
> > >                       sizeof(*numa_fw_cfg));
> > >  
> > > +    if (expose_topology) {
> > > +        fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores,
> > > +                               ms->smp.threads, ms->smp.max_cpus);
> > > +    }
> > > +
> > >      return fw_cfg;
> > >  }
> > >  
> > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > index bcda50efcc23..bb72b12edad2 100644
> > > --- a/hw/i386/pc.c
> > > +++ b/hw/i386/pc.c
> > > @@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms,
> > >                                          option_rom_mr,
> > >                                          1);
> > >  
> > > -    fw_cfg = fw_cfg_arch_create(machine,
> > > -                                pcms->boot_cpus, pcms->apic_id_limit);
> > > +    fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, pcms->apic_id_limit,
> > > +                                pcms->smp_dies, false);
> > >  
> > >      rom_set_fw(fw_cfg);
> > >    



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-10 10:01     ` Michael S. Tsirkin
  2019-10-10 12:48       ` Igor Mammedov
@ 2019-10-10 16:15       ` Laszlo Ersek
  1 sibling, 0 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-10 16:15 UTC (permalink / raw)
  To: Michael S. Tsirkin, Igor Mammedov
  Cc: Paolo Bonzini, Richard Henderson, Philippe Mathieu-Daudé,
	qemu devel list, Eduardo Habkost

On 10/10/19 12:01, Michael S. Tsirkin wrote:
> On Tue, Oct 08, 2019 at 05:59:31PM +0200, Igor Mammedov wrote:

>> So far there were no need for it as all possible cpus are
>> described in ACPI tables passed to guest, but I'm not going
>> to suggest to parse them on firmware side as it's too complicated :)
> 
> We can always add a QEMU specific data table by the way.
> Format would be up to us and would be easy to parse.
> I don't see a big advantage as compared to fw cfg though.

I'd like to comment just on this part.

*If* we decide to expose the information through some kind of data table
(as opposed to the modern CPU hotplug register block), then the
representation *must* be a dedicated fw_cfg blob. It cannot be an ACPI
table.

The reason is that *selecting* the fw_cfg blob that contains the ACPI
linker/loader script is a very specific action (it re-generates the ACPI
payload, with dependencies on assigned PCI resources). Therefore, it is
done in a super-particular spot in the firmware.

On the other hand, the "possible CPUs count" is needed much earlier than
that. I need to fetch that info way before PCI resource assignment
appears on the radar.

So please let us stick with "ACPI is only for the guest OS to read" rule
-- it's not only a parsing convenience for the firmware, but a real
necessity.

Thanks
Laszlo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-10 12:48       ` Igor Mammedov
@ 2019-10-10 16:23         ` Laszlo Ersek
  2019-10-10 18:08           ` Michael S. Tsirkin
  0 siblings, 1 reply; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-10 16:23 UTC (permalink / raw)
  To: Igor Mammedov, Michael S. Tsirkin
  Cc: Paolo Bonzini, Richard Henderson, Philippe Mathieu-Daudé,
	qemu devel list, Eduardo Habkost

On 10/10/19 14:48, Igor Mammedov wrote:

> it doesn't really matter if it's ACPI blob or fw_cfg,
> what firmware needs is a table of possible CPUs with APIC IDs.

To repeat my previous point:

Not necessarily taking sides between "data table" and "register block",
but *if* we opt for "data table", then it *must* be fw_cfg.

> But if we go this route (i.e. not reuse CPU hotplug interface),
> the table alone is not enough, one would need to build a protocol
> between ACPI and firmware to communicate what CPUs to (un)hotplug.

That's for sure, yes -- for finding out what CPU has been hotplugged,
the hotplug SMI handler in the firmware has to look at the register
block no matter what.

The "data table" vs "register block" question only arises *afterwards*,
for translating the CPU selector (fetched from the register block) to
the APIC-ID domain. (The generic edk2 infrastructure requires APIC-IDs).

Thanks
Laszlo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-10 16:23         ` Laszlo Ersek
@ 2019-10-10 18:08           ` Michael S. Tsirkin
  2019-10-11  6:50             ` Laszlo Ersek
  0 siblings, 1 reply; 24+ messages in thread
From: Michael S. Tsirkin @ 2019-10-10 18:08 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: Eduardo Habkost, qemu devel list, Paolo Bonzini, Igor Mammedov,
	Philippe Mathieu-Daudé,
	Richard Henderson

On Thu, Oct 10, 2019 at 06:23:00PM +0200, Laszlo Ersek wrote:
> On 10/10/19 14:48, Igor Mammedov wrote:
> 
> > it doesn't really matter if it's ACPI blob or fw_cfg,
> > what firmware needs is a table of possible CPUs with APIC IDs.
> 
> To repeat my previous point:
> 
> Not necessarily taking sides between "data table" and "register block",
> but *if* we opt for "data table", then it *must* be fw_cfg.
> 
> > But if we go this route (i.e. not reuse CPU hotplug interface),
> > the table alone is not enough, one would need to build a protocol
> > between ACPI and firmware to communicate what CPUs to (un)hotplug.
> 
> That's for sure, yes -- for finding out what CPU has been hotplugged,
> the hotplug SMI handler in the firmware has to look at the register
> block no matter what.

I thought all that's done by ACPI, with ACPI returning an event
to the OSPM reporting what happened.

> The "data table" vs "register block" question only arises *afterwards*,
> for translating the CPU selector (fetched from the register block) to
> the APIC-ID domain. (The generic edk2 infrastructure requires APIC-IDs).
> 
> Thanks
> Laszlo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-10 18:08           ` Michael S. Tsirkin
@ 2019-10-11  6:50             ` Laszlo Ersek
  2019-10-11  7:46               ` Laszlo Ersek
  0 siblings, 1 reply; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-11  6:50 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, qemu devel list, Paolo Bonzini, Igor Mammedov,
	Philippe Mathieu-Daudé,
	Richard Henderson

On 10/10/19 20:08, Michael S. Tsirkin wrote:
> On Thu, Oct 10, 2019 at 06:23:00PM +0200, Laszlo Ersek wrote:
>> On 10/10/19 14:48, Igor Mammedov wrote:
>>
>>> it doesn't really matter if it's ACPI blob or fw_cfg,
>>> what firmware needs is a table of possible CPUs with APIC IDs.
>>
>> To repeat my previous point:
>>
>> Not necessarily taking sides between "data table" and "register block",
>> but *if* we opt for "data table", then it *must* be fw_cfg.
>>
>>> But if we go this route (i.e. not reuse CPU hotplug interface),
>>> the table alone is not enough, one would need to build a protocol
>>> between ACPI and firmware to communicate what CPUs to (un)hotplug.
>>
>> That's for sure, yes -- for finding out what CPU has been hotplugged,
>> the hotplug SMI handler in the firmware has to look at the register
>> block no matter what.
> 
> I thought all that's done by ACPI, with ACPI returning an event
> to the OSPM reporting what happened.

That works if only the OS needs to care -- the OS can rely on ACPI.

But with SMM in the picture, the firmware has to care too (the new CPU's
SMBASE has to be relocated, and the SMM data structures need to account
for the new CPU). The firmware cannot rely on any AML generated by QEMU.

Thanks
Laszlo

> 
>> The "data table" vs "register block" question only arises *afterwards*,
>> for translating the CPU selector (fetched from the register block) to
>> the APIC-ID domain. (The generic edk2 infrastructure requires APIC-IDs).
>>
>> Thanks
>> Laszlo



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
  2019-10-11  6:50             ` Laszlo Ersek
@ 2019-10-11  7:46               ` Laszlo Ersek
  0 siblings, 0 replies; 24+ messages in thread
From: Laszlo Ersek @ 2019-10-11  7:46 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Eduardo Habkost, qemu devel list, Paolo Bonzini, Igor Mammedov,
	Philippe Mathieu-Daudé,
	Richard Henderson

On 10/11/19 08:50, Laszlo Ersek wrote:
> On 10/10/19 20:08, Michael S. Tsirkin wrote:
>> On Thu, Oct 10, 2019 at 06:23:00PM +0200, Laszlo Ersek wrote:
>>> On 10/10/19 14:48, Igor Mammedov wrote:
>>>
>>>> it doesn't really matter if it's ACPI blob or fw_cfg,
>>>> what firmware needs is a table of possible CPUs with APIC IDs.
>>>
>>> To repeat my previous point:
>>>
>>> Not necessarily taking sides between "data table" and "register block",
>>> but *if* we opt for "data table", then it *must* be fw_cfg.
>>>
>>>> But if we go this route (i.e. not reuse CPU hotplug interface),
>>>> the table alone is not enough, one would need to build a protocol
>>>> between ACPI and firmware to communicate what CPUs to (un)hotplug.
>>>
>>> That's for sure, yes -- for finding out what CPU has been hotplugged,
>>> the hotplug SMI handler in the firmware has to look at the register
>>> block no matter what.
>>
>> I thought all that's done by ACPI, with ACPI returning an event
>> to the OSPM reporting what happened.
> 
> That works if only the OS needs to care -- the OS can rely on ACPI.
> 
> But with SMM in the picture, the firmware has to care too (the new CPU's
> SMBASE has to be relocated, and the SMM data structures need to account
> for the new CPU). The firmware cannot rely on any AML generated by QEMU.

To clarify, I mean that the firmware cannot call AML methods, plus the
firmware can take a limited amount of parameters when the AML raises the
hotplug SMI. Basically everything has to be passed in chipset registers,
and due to SEV, those registers had better all be IO ports.

>>
>>> The "data table" vs "register block" question only arises *afterwards*,
>>> for translating the CPU selector (fetched from the register block) to
>>> the APIC-ID domain. (The generic edk2 infrastructure requires APIC-IDs).
>>>
>>> Thanks
>>> Laszlo
> 



^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, back to index

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-08 10:52 [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF Laszlo Ersek
2019-10-08 10:52 ` [PATCH 1/4] fw_cfg: bump file slots to 40 Laszlo Ersek
2019-10-08 10:52 ` [PATCH 2/4] target/i386: remove useless enable_compat_apic_id_mode() prototype Laszlo Ersek
2019-10-08 13:35   ` Philippe Mathieu-Daudé
2019-10-08 18:22     ` Laszlo Ersek
2019-10-08 10:52 ` [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg Laszlo Ersek
2019-10-08 13:29   ` Philippe Mathieu-Daudé
2019-10-08 18:31     ` Laszlo Ersek
2019-10-08 15:59   ` Igor Mammedov
2019-10-09 21:01     ` Laszlo Ersek
2019-10-10  9:45       ` Igor Mammedov
2019-10-10 10:01     ` Michael S. Tsirkin
2019-10-10 12:48       ` Igor Mammedov
2019-10-10 16:23         ` Laszlo Ersek
2019-10-10 18:08           ` Michael S. Tsirkin
2019-10-11  6:50             ` Laszlo Ersek
2019-10-11  7:46               ` Laszlo Ersek
2019-10-10 16:15       ` Laszlo Ersek
2019-10-08 18:58   ` Laszlo Ersek
2019-10-09 11:13     ` Igor Mammedov
2019-10-09 21:03       ` Laszlo Ersek
2019-10-09 21:09     ` Eduardo Habkost
2019-10-08 10:52 ` [PATCH 4/4] hw/i386/pc: " Laszlo Ersek
2019-10-08 14:23 ` [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF no-reply

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
		qemu-devel@nongnu.org qemu-devel@archiver.kernel.org
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.nongnu.qemu-devel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox