All of lore.kernel.org
 help / color / mirror / Atom feed
From: Igor Mammedov <imammedo@redhat.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Eduardo Habkost" <ehabkost@redhat.com>,
	"Laszlo Ersek" <lersek@redhat.com>,
	"qemu devel list" <qemu-devel@nongnu.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Philippe Mathieu-Daudé" <philmd@redhat.com>,
	"Richard Henderson" <rth@twiddle.net>
Subject: Re: [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg
Date: Thu, 10 Oct 2019 14:48:12 +0200	[thread overview]
Message-ID: <20191010144812.20fd8b5d@redhat.com> (raw)
In-Reply-To: <20191010055733-mutt-send-email-mst@kernel.org>

On Thu, 10 Oct 2019 06:01:51 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Oct 08, 2019 at 05:59:31PM +0200, Igor Mammedov wrote:
> > On Tue,  8 Oct 2019 12:52:58 +0200
> > Laszlo Ersek <lersek@redhat.com> wrote:
> >   
> > > FW_CFG_MAX_CPUS exposes the (exclusive) maximum APIC ID to guest firmware,
> > > due to historical reasons. That value is not useful to edk2, however. For
> > > supporting VCPU hotplug, edk2 needs:
> > > 
> > > - the boot CPU count (already exposed in FW_CFG_NB_CPUS),
> > > 
> > > - and the maximum foreseen CPU count (tracked in
> > >   "MachineState.smp.max_cpus", but not currently exposed).  
> > one can get it with current QEMU without adding new fgcfg
> > (albeit in a bit awkward way)
> > 
> > max_cpu count can be derived indirectly as result of cpu hotplug
> > enumeration (IO interface at 0x0cd8-0xcf7) by writing/reading
> > to/from selector register (see ACPI_CPU_SELECTOR_OFFSET_WR)
> > until read value stops changing values (i.e. max cpu count
> > is reached). One also can figure out present/non-present
> > cpu status by reading flags register.  
> 
> Right but so far ACPI was the only driver of CPU hotplug, right?
even if its only user is ACPI now, like any other ABI it's
fixed and given that requirements to CPU hotplug ABI from ACPI
and firmware pretty much the same, I don't really see a reason
not to reuse it, since whole hotplug process orchestrated
by ACPI anyway.

> I don't think firmware should poke it directly, it is
> not really clean or especially scalable.
> Fine for ACPI but not great as a HV/guest interface.
Building an alternative ABI that will duplicate already
existing one, doesn't sound like a great idea to me.

Could you elaborate why do you think it's not a good idea
to re-use already existing CPU hotplug ABI in firmware?

Or even better suggest an algorithm how CPU hotplug should
work with SMM enabled OVMF.

 * in generic case, CPUs are asynchronously hotplugged and
   OSMP also asynchronously processing hotplug events.
   (So at the moment OSPM tells firmware to handle hotplug
    there are 1-n CPUs to handle and more might be coming)

> > > Add a new fw-cfg file to expose "max_cpus".
> > > 
> > > While at it, expose the rest of the topology too (die / core / thread
> > > counts), because I expect that the VCPU hotplug feature for OVMF will
> > > ultimately need those too, and the data size is not large. This is
> > > slightly complicated by the fact that the die count is specific to
> > > PCMachineState, but fw_cfg_arch_create() intends to be PC-independent (see
> > > commit 149c50cabcc4).  
> > Could you clarify why topology info is necessary?
> > 
> > Potentially it's possible to extend cpu hotplug ABI to report
> > arch specific apic-id (x86) or mpidr (arm) if firmware really
> > needs to know topology and let guest to decode it according
> > to CPU's spec.
> > 
> > So far there were no need for it as all possible cpus are
> > described in ACPI tables passed to guest, but I'm not going
> > to suggest to parse them on firmware side as it's too complicated :)  
> 
> We can always add a QEMU specific data table by the way.
> Format would be up to us and would be easy to parse.
> I don't see a big advantage as compared to fw cfg though.

it doesn't really matter if it's ACPI blob or fw_cfg,
what firmware needs is a table of possible CPUs with APIC IDs.

But if we go this route (i.e. not reuse CPU hotplug interface),
the table alone is not enough, one would need to build a protocol
between ACPI and firmware to communicate what CPUs to (un)hotplug.
It's basically repeating what current CPU hotplug interface does.

> > PS:
> > The reason we started building ACPI tables in QEMU, was never
> > ending story of adding more ABI and supporting it afterwards.
> > So I'd try to avoid doing it if it can be helped.  
> 
> Absolutely. We should try to keep all ACPI generation in QEMU.
> But this value is not for ACPI, is it?
I'm not sure what you are trying to say here.
 
 
> 
> > > For now, the feature is temporarily disabled.
> > > 
> > > Cc: "Michael S. Tsirkin" <mst@redhat.com>
> > > Cc: Eduardo Habkost <ehabkost@redhat.com>
> > > Cc: Igor Mammedov <imammedo@redhat.com>
> > > Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
> > > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > > Cc: Philippe Mathieu-Daudé <philmd@redhat.com>
> > > Cc: Richard Henderson <rth@twiddle.net>
> > > Ref: https://bugzilla.tianocore.org/show_bug.cgi?id=1515
> > > Signed-off-by: Laszlo Ersek <lersek@redhat.com>
> > > ---
> > >  hw/i386/fw_cfg.h | 30 +++++++++++++++++++++++++++++-
> > >  hw/i386/fw_cfg.c | 26 ++++++++++++++++++++++++--
> > >  hw/i386/pc.c     |  4 ++--
> > >  3 files changed, 55 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/hw/i386/fw_cfg.h b/hw/i386/fw_cfg.h
> > > index e0856a376996..d742435b9793 100644
> > > --- a/hw/i386/fw_cfg.h
> > > +++ b/hw/i386/fw_cfg.h
> > > @@ -18,9 +18,37 @@
> > >  #define FW_CFG_E820_TABLE       (FW_CFG_ARCH_LOCAL + 3)
> > >  #define FW_CFG_HPET             (FW_CFG_ARCH_LOCAL + 4)
> > >  
> > > +/**
> > > + * FWCfgX86Topology: expose the X86 CPU topology to guest firmware over fw-cfg.
> > > + *
> > > + * All fields have little-endian encoding.
> > > + *
> > > + * @dies:     Number of dies per package (aka socket). Set it to 1 unless the
> > > + *            concrete MachineState subclass defines it differently.
> > > + * @cores:    Corresponds to @CpuTopology.@cores.
> > > + * @threads:  Corresponds to @CpuTopology.@threads.
> > > + * @max_cpus: Corresponds to @CpuTopology.@max_cpus.
> > > + *
> > > + * Firmware can derive the package (aka socket) count with the following
> > > + * formula:
> > > + *
> > > + *   DIV_ROUND_UP(@max_cpus, @dies * @cores * @threads)
> > > + *
> > > + * Firmware can derive APIC ID field widths and offsets per the standard
> > > + * calculations in "include/hw/i386/topology.h".
> > > + */
> > > +typedef struct FWCfgX86Topology {
> > > +  uint32_t dies;
> > > +  uint32_t cores;
> > > +  uint32_t threads;
> > > +  uint32_t max_cpus;
> > > +} QEMU_PACKED FWCfgX86Topology;
> > > +
> > >  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> > >                                 uint16_t boot_cpus,
> > > -                               uint16_t apic_id_limit);
> > > +                               uint16_t apic_id_limit,
> > > +                               unsigned smp_dies,
> > > +                               bool expose_topology);
> > >  void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg);
> > >  void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg);
> > >  
> > > diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
> > > index 39b6bc60520c..33d09875014f 100644
> > > --- a/hw/i386/fw_cfg.c
> > > +++ b/hw/i386/fw_cfg.c
> > > @@ -85,9 +85,26 @@ void fw_cfg_build_smbios(MachineState *ms, FWCfgState *fw_cfg)
> > >      }
> > >  }
> > >  
> > > +static void fw_cfg_expose_topology(FWCfgState *fw_cfg,
> > > +                                   unsigned dies,
> > > +                                   unsigned cores,
> > > +                                   unsigned threads,
> > > +                                   unsigned max_cpus)
> > > +{
> > > +    FWCfgX86Topology *topo = g_new(FWCfgX86Topology, 1);
> > > +
> > > +    topo->dies     = cpu_to_le32(dies);
> > > +    topo->cores    = cpu_to_le32(cores);
> > > +    topo->threads  = cpu_to_le32(threads);
> > > +    topo->max_cpus = cpu_to_le32(max_cpus);
> > > +    fw_cfg_add_file(fw_cfg, "etc/x86-smp-topology", topo, sizeof *topo);
> > > +}
> > > +
> > >  FWCfgState *fw_cfg_arch_create(MachineState *ms,
> > > -                                      uint16_t boot_cpus,
> > > -                                      uint16_t apic_id_limit)
> > > +                               uint16_t boot_cpus,
> > > +                               uint16_t apic_id_limit,
> > > +                               unsigned smp_dies,
> > > +                               bool expose_topology)
> > >  {
> > >      FWCfgState *fw_cfg;
> > >      uint64_t *numa_fw_cfg;
> > > @@ -143,6 +160,11 @@ FWCfgState *fw_cfg_arch_create(MachineState *ms,
> > >                       (1 + apic_id_limit + nb_numa_nodes) *
> > >                       sizeof(*numa_fw_cfg));
> > >  
> > > +    if (expose_topology) {
> > > +        fw_cfg_expose_topology(fw_cfg, smp_dies, ms->smp.cores,
> > > +                               ms->smp.threads, ms->smp.max_cpus);
> > > +    }
> > > +
> > >      return fw_cfg;
> > >  }
> > >  
> > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > index bcda50efcc23..bb72b12edad2 100644
> > > --- a/hw/i386/pc.c
> > > +++ b/hw/i386/pc.c
> > > @@ -1738,8 +1738,8 @@ void pc_memory_init(PCMachineState *pcms,
> > >                                          option_rom_mr,
> > >                                          1);
> > >  
> > > -    fw_cfg = fw_cfg_arch_create(machine,
> > > -                                pcms->boot_cpus, pcms->apic_id_limit);
> > > +    fw_cfg = fw_cfg_arch_create(machine, pcms->boot_cpus, pcms->apic_id_limit,
> > > +                                pcms->smp_dies, false);
> > >  
> > >      rom_set_fw(fw_cfg);
> > >    



  reply	other threads:[~2019-10-10 13:01 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-08 10:52 [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF Laszlo Ersek
2019-10-08 10:52 ` [PATCH 1/4] fw_cfg: bump file slots to 40 Laszlo Ersek
2019-10-08 10:52 ` [PATCH 2/4] target/i386: remove useless enable_compat_apic_id_mode() prototype Laszlo Ersek
2019-10-08 13:35   ` Philippe Mathieu-Daudé
2019-10-08 18:22     ` Laszlo Ersek
2019-10-08 10:52 ` [PATCH 3/4] hw/i386: add facility to expose CPU topology over fw-cfg Laszlo Ersek
2019-10-08 13:29   ` Philippe Mathieu-Daudé
2019-10-08 18:31     ` Laszlo Ersek
2019-10-08 15:59   ` Igor Mammedov
2019-10-09 21:01     ` Laszlo Ersek
2019-10-10  9:45       ` Igor Mammedov
2019-10-10 10:01     ` Michael S. Tsirkin
2019-10-10 12:48       ` Igor Mammedov [this message]
2019-10-10 16:23         ` Laszlo Ersek
2019-10-10 18:08           ` Michael S. Tsirkin
2019-10-11  6:50             ` Laszlo Ersek
2019-10-11  7:46               ` Laszlo Ersek
2019-10-10 16:15       ` Laszlo Ersek
2019-10-08 18:58   ` Laszlo Ersek
2019-10-09 11:13     ` Igor Mammedov
2019-10-09 21:03       ` Laszlo Ersek
2019-10-09 21:09     ` Eduardo Habkost
2019-10-08 10:52 ` [PATCH 4/4] hw/i386/pc: " Laszlo Ersek
2019-10-08 14:23 ` [PATCH 0/4] hw/i386: pass "MachineState.smp.max_cpus" to OVMF no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191010144812.20fd8b5d@redhat.com \
    --to=imammedo@redhat.com \
    --cc=ehabkost@redhat.com \
    --cc=lersek@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=philmd@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.