All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus'
@ 2016-11-11 12:57 Igor Mammedov
  2016-11-11 14:02 ` Laszlo Ersek
  0 siblings, 1 reply; 6+ messages in thread
From: Igor Mammedov @ 2016-11-11 12:57 UTC (permalink / raw)
  To: qemu-devel
  Cc: Laszlo Ersek, mst, Stefan Hajnoczi, Kevin O'Connor,
	Gerd Hoffmann, Eduardo Habkost

While looking at OVMF and how it handles CPUs (ACPI/AP wakeup),
I've noticed that it uses legacy FW_CFG_NB_CPUS(0x05) to get
the number of present at start CPUs.
Variable was introduced back in 2008 by fbfcf955ba and is/was used
by ppc/sparc/arm/x86 and a bunch of firmwares (but not by SeaBIOS).

However in 2.8 I've just added similar fwcfg file 'etc/boot-cpus'
which is used for the same purpose \-)

Both variables are UINT16 and the only difference that FW_CFG_NB_CPUS
doesn't account for CPUs added with -device which might make
a firmware to wait indefinitely in FW_CFG_NB_CPUS == Woken-up-APs loop
due to extra not expected APs being woken up.

FW_CFG_NB_CPUS should be fixed to mimic 'etc/boot-cpus' behavior anyway,
so question arises why not to reuse fixed legacy FW_CFG_NB_CPUS and
drop just added but not yet released 'etc/boot-cpus' from QEMU and SeaBIOS.

Any opinions?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus'
  2016-11-11 12:57 [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus' Igor Mammedov
@ 2016-11-11 14:02 ` Laszlo Ersek
  2016-11-11 14:36   ` Igor Mammedov
  0 siblings, 1 reply; 6+ messages in thread
From: Laszlo Ersek @ 2016-11-11 14:02 UTC (permalink / raw)
  To: Igor Mammedov, qemu-devel
  Cc: mst, Stefan Hajnoczi, Kevin O'Connor, Gerd Hoffmann,
	Eduardo Habkost, Jeff Fan, Jordan Justen (Intel address)

adding Jeff Fan and Jordan Justen

On 11/11/16 13:57, Igor Mammedov wrote:
> While looking at OVMF and how it handles CPUs (ACPI/AP wakeup),
> I've noticed that it uses legacy FW_CFG_NB_CPUS(0x05) to get
> the number of present at start CPUs.

Where exactly do you see this?

The only place where I see this fw_cfg key being accessed is
"OvmfPkg/AcpiPlatformDxe/Qemu.c", function QemuInstallAcpiMadtTable().

That function is only called by OVMF if QEMU does not provide the
"etc/table-loader" interface. Which means, in practical terms, that the
QemuInstallAcpiMadtTable() function is historical / dead.

IOW, if you run OVMF on a non-historical machine type, then fw_cfg key
0x05 is unused.

> Variable was introduced back in 2008 by fbfcf955ba and is/was used
> by ppc/sparc/arm/x86 and a bunch of firmwares (but not by SeaBIOS).
> 
> However in 2.8 I've just added similar fwcfg file 'etc/boot-cpus'
> which is used for the same purpose \-)
> 
> Both variables are UINT16 and the only difference that FW_CFG_NB_CPUS
> doesn't account for CPUs added with -device which might make
> a firmware to wait indefinitely in FW_CFG_NB_CPUS == Woken-up-APs loop
> due to extra not expected APs being woken up.
> 
> FW_CFG_NB_CPUS should be fixed to mimic 'etc/boot-cpus' behavior anyway,
> so question arises why not to reuse fixed legacy FW_CFG_NB_CPUS and
> drop just added but not yet released 'etc/boot-cpus' from QEMU and SeaBIOS.
> 
> Any opinions?
> 

I'm fine either way -- I don't have a preference for either of key 0x05
or file "etc/boot-cpus".

For starting up APs, OVMF includes UefiCpuPkg modules. Recently, a good
chunk of the multiprocessing code has been extracted to
"UefiCpuPkg/Library/MpInitLib". This directory has two INF files:

- PeiMpInitLib.inf, which customizes the library for PEI phase modules
  (PEIMs),

- DxeMpInitLib.inf, which customizes the library for DXE_DRIVER modules.

The most important drivers that use this library are:

- UefiCpuPkg/CpuDxe, which is a DXE_DRIVER and provides (among other
  things) the EFI_MP_SERVICES_PROTOCOL, for DXE- and later phase
  clients,

- UefiCpuPkg/CpuMpPei, which is a PEIM and provides the
  EFI_PEI_MP_SERVICES_PPI for PEI-phase clients.

For example, when OVMF programs the MSR_IA32_FEATURE_CONTROL MSR on each
CPU during boot (and S3 resume too), it calls
EFI_PEI_MP_SERVICES_PPI.StartupAllAPs() for this. Please see commit
dbab994991c7 for details.

(The parallel SeaBIOS commits are 20f83d5c7c0f and 54e3a88609da.)

The number of processors in the system is apparently determined in
MpInitLibInitialize() --> CollectProcessorCount(). It does not use any
fw_cfg hints from OVMF. What it uses is called

  PcdCpuMaxLogicalProcessorNumber

which is an absolute upper bound on the number of logical processors in
the system. (It is not required that this many CPUs be available: there
may be fewer. However, there mustn't be more -- in that case, things
will break.)

This value is currently set statically, at build time, to 64. We can
raise it statically. If you want to set it dynamically, that's possible
as well.

Normally, this would not be trivial, because the PCD is consumed early.
Usually we set such dynamic PCDs in OVMF's PlatformPei, but that doesn't
work -- generally -- when another PEIM would like to consume the PCD.
The dispatch order between PEIMs is generally unspecified, so we
couldn't be sure that PlatformPei would set the PCD before CpuMpPei
consumed it at startup, via PeiMpInitLib.

Normally, we solve this with a plugin library instance that we hook into
all the PCD consumer modules (without their explicit knowledge), so the
PCD gets set (unbeknownst to them) right when they start up, from
fw_cfg. However, in this case we can simplify things a bit, because
CpuMpPei is serialized strictly after PlatformPei through the
installation of the permanent PEI RAM (the
gEfiPeiMemoryDiscoveredPpiGuid depex). CpuMpPei cannot launch until
PlatformPei exposes the permanent PEI RAM, hence PlatformPei can set the
PCD as first guy too.

(If you are interested why PlatformPei can use CpuMpPei's services then:
because we register a callback in PlatformPei so that CpuMpPei's PPI
installation will inform us.)

Okay, okay, this is likely too much information. Let me summarize:

- The use of fw_cfg key 0x05 in OVMF is his historical, and only
  related to OVMF's built-in ACPI tables. Which are never used with
  halfway recent QEMU machine types.

- OVMF -- via UefiCpuPkg's MpInitLib, CpuMpPei, CpuDxe and
  PiSmmCpuDxeSmm -- starts up all APs without looking for a dynamically
  set "present at boot" count, or absolute limit. The current static
  limit is 64 processors.

- If you want to make the limit dynamic, from fw_cfg, we can do that.
  We just have to know where to read it from; key 0x05, or file
  "etc/boot-cpus".

- I don't know how high we can go with the maximum. 255 should work. I
  seem to remember that we might want 288, and for that we need X2Apic.
  OVMF already uses the "UefiCpuPkg/Library/BaseXApicX2ApicLib"
  instance exclusively, for the LocalApicLib class, so if there's no
  other limitation, things should be fine.

- Note that with the SMM driver stack built into OVMF, the maximum CPU
  count influences SMRAM consumption. 8MB -- the maximum that Q35
  provides -- might not be enough for a huge number of processors
  (I even suspect it's no longer enough for 255).

Thanks
Laszlo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus'
  2016-11-11 14:02 ` Laszlo Ersek
@ 2016-11-11 14:36   ` Igor Mammedov
  2016-11-11 14:44     ` Kevin O'Connor
  2016-11-11 16:37     ` Laszlo Ersek
  0 siblings, 2 replies; 6+ messages in thread
From: Igor Mammedov @ 2016-11-11 14:36 UTC (permalink / raw)
  To: Laszlo Ersek
  Cc: qemu-devel, Eduardo Habkost, mst, Stefan Hajnoczi,
	Kevin O'Connor, Gerd Hoffmann, Jordan Justen (Intel address),
	Jeff Fan

On Fri, 11 Nov 2016 15:02:36 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> adding Jeff Fan and Jordan Justen
> 
> On 11/11/16 13:57, Igor Mammedov wrote:
> > While looking at OVMF and how it handles CPUs (ACPI/AP wakeup),
> > I've noticed that it uses legacy FW_CFG_NB_CPUS(0x05) to get
> > the number of present at start CPUs.  
> 
> Where exactly do you see this?
That's the only place in OVMF, but according to google there are
other firmwares that use FW_CFG_NB_CPUS so we are not free to remove
it and break guests.

So I'd just drop not yet released 'etc/boot-cpus',
of cause SeaBIOS should be fixed and its blob in QEMU updated.


> The only place where I see this fw_cfg key being accessed is
> "OvmfPkg/AcpiPlatformDxe/Qemu.c", function QemuInstallAcpiMadtTable().
> 
> That function is only called by OVMF if QEMU does not provide the
> "etc/table-loader" interface. Which means, in practical terms, that the
> QemuInstallAcpiMadtTable() function is historical / dead.
> 
> IOW, if you run OVMF on a non-historical machine type, then fw_cfg key
> 0x05 is unused.
> 
> > Variable was introduced back in 2008 by fbfcf955ba and is/was used
> > by ppc/sparc/arm/x86 and a bunch of firmwares (but not by SeaBIOS).
> > 
> > However in 2.8 I've just added similar fwcfg file 'etc/boot-cpus'
> > which is used for the same purpose \-)
> > 
> > Both variables are UINT16 and the only difference that FW_CFG_NB_CPUS
> > doesn't account for CPUs added with -device which might make
> > a firmware to wait indefinitely in FW_CFG_NB_CPUS == Woken-up-APs loop
> > due to extra not expected APs being woken up.
> > 
> > FW_CFG_NB_CPUS should be fixed to mimic 'etc/boot-cpus' behavior anyway,
> > so question arises why not to reuse fixed legacy FW_CFG_NB_CPUS and
> > drop just added but not yet released 'etc/boot-cpus' from QEMU and SeaBIOS.
> > 
> > Any opinions?
> >   
> 
> I'm fine either way -- I don't have a preference for either of key 0x05
> or file "etc/boot-cpus".
I'll post fixup patches  (removing"etc/boot-cpus" ) today, after I've tested
them a little bit more.



> For starting up APs, OVMF includes UefiCpuPkg modules. Recently, a good
> chunk of the multiprocessing code has been extracted to
> "UefiCpuPkg/Library/MpInitLib". This directory has two INF files:
> 
> - PeiMpInitLib.inf, which customizes the library for PEI phase modules
>   (PEIMs),
> 
> - DxeMpInitLib.inf, which customizes the library for DXE_DRIVER modules.
> 
> The most important drivers that use this library are:
> 
> - UefiCpuPkg/CpuDxe, which is a DXE_DRIVER and provides (among other
>   things) the EFI_MP_SERVICES_PROTOCOL, for DXE- and later phase
>   clients,
> 
> - UefiCpuPkg/CpuMpPei, which is a PEIM and provides the
>   EFI_PEI_MP_SERVICES_PPI for PEI-phase clients.
> 
> For example, when OVMF programs the MSR_IA32_FEATURE_CONTROL MSR on each
> CPU during boot (and S3 resume too), it calls
> EFI_PEI_MP_SERVICES_PPI.StartupAllAPs() for this. Please see commit
> dbab994991c7 for details.
> 
> (The parallel SeaBIOS commits are 20f83d5c7c0f and 54e3a88609da.)
> 
> The number of processors in the system is apparently determined in
> MpInitLibInitialize() --> CollectProcessorCount(). It does not use any
> fw_cfg hints from OVMF. What it uses is called
> 
>   PcdCpuMaxLogicalProcessorNumber
> 
> which is an absolute upper bound on the number of logical processors in
> the system. (It is not required that this many CPUs be available: there
> may be fewer. However, there mustn't be more -- in that case, things
> will break.)
> 
> This value is currently set statically, at build time, to 64. We can
> raise it statically. If you want to set it dynamically, that's possible
> as well.
> 
> Normally, this would not be trivial, because the PCD is consumed early.
> Usually we set such dynamic PCDs in OVMF's PlatformPei, but that doesn't
> work -- generally -- when another PEIM would like to consume the PCD.
> The dispatch order between PEIMs is generally unspecified, so we
> couldn't be sure that PlatformPei would set the PCD before CpuMpPei
> consumed it at startup, via PeiMpInitLib.
> 
> Normally, we solve this with a plugin library instance that we hook into
> all the PCD consumer modules (without their explicit knowledge), so the
> PCD gets set (unbeknownst to them) right when they start up, from
> fw_cfg. However, in this case we can simplify things a bit, because
> CpuMpPei is serialized strictly after PlatformPei through the
> installation of the permanent PEI RAM (the
> gEfiPeiMemoryDiscoveredPpiGuid depex). CpuMpPei cannot launch until
> PlatformPei exposes the permanent PEI RAM, hence PlatformPei can set the
> PCD as first guy too.
> 
> (If you are interested why PlatformPei can use CpuMpPei's services then:
> because we register a callback in PlatformPei so that CpuMpPei's PPI
> installation will inform us.)
> 
> Okay, okay, this is likely too much information. Let me summarize:
> 
> - The use of fw_cfg key 0x05 in OVMF is his historical, and only
>   related to OVMF's built-in ACPI tables. Which are never used with
>   halfway recent QEMU machine types.
> 
> - OVMF -- via UefiCpuPkg's MpInitLib, CpuMpPei, CpuDxe and
>   PiSmmCpuDxeSmm -- starts up all APs without looking for a dynamically
>   set "present at boot" count, or absolute limit. The current static
>   limit is 64 processors.
> 
> - If you want to make the limit dynamic, from fw_cfg, we can do that.
>   We just have to know where to read it from; key 0x05, or file
>   "etc/boot-cpus".
> 
> - I don't know how high we can go with the maximum. 255 should work. I
>   seem to remember that we might want 288, and for that we need X2Apic.
>   OVMF already uses the "UefiCpuPkg/Library/BaseXApicX2ApicLib"
>   instance exclusively, for the LocalApicLib class, so if there's no
>   other limitation, things should be fine.
I've already have a patch that makes upstream OVMF work for more
than 64 CPUs and upto 288 (hopefully written in correct way).
So expect me bothering you about boring stuff like rules where/how
to post patches for edk2 and asking for review.


> - Note that with the SMM driver stack built into OVMF, the maximum CPU
>   count influences SMRAM consumption. 8MB -- the maximum that Q35
>   provides -- might not be enough for a huge number of processors
>   (I even suspect it's no longer enough for 255).
I haven't looked at SMM yet, maybe I shouldn't after all :)

 
> Thanks
> Laszlo
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus'
  2016-11-11 14:36   ` Igor Mammedov
@ 2016-11-11 14:44     ` Kevin O'Connor
  2016-11-11 15:05       ` Igor Mammedov
  2016-11-11 16:37     ` Laszlo Ersek
  1 sibling, 1 reply; 6+ messages in thread
From: Kevin O'Connor @ 2016-11-11 14:44 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Laszlo Ersek, qemu-devel, Eduardo Habkost, mst, Stefan Hajnoczi,
	Gerd Hoffmann, Jordan Justen (Intel address),
	Jeff Fan

On Fri, Nov 11, 2016 at 03:36:19PM +0100, Igor Mammedov wrote:
> On Fri, 11 Nov 2016 15:02:36 +0100
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
> > adding Jeff Fan and Jordan Justen
> > 
> > On 11/11/16 13:57, Igor Mammedov wrote:
> > > While looking at OVMF and how it handles CPUs (ACPI/AP wakeup),
> > > I've noticed that it uses legacy FW_CFG_NB_CPUS(0x05) to get
> > > the number of present at start CPUs.  
> > 
> > Where exactly do you see this?
> That's the only place in OVMF, but according to google there are
> other firmwares that use FW_CFG_NB_CPUS so we are not free to remove
> it and break guests.
> 
> So I'd just drop not yet released 'etc/boot-cpus',
> of cause SeaBIOS should be fixed and its blob in QEMU updated.

I have no preference between FW_CFG_NB_CPUS and 'etc/boot-cpus'.  Note
that SeaBIOS v1.10 was released with the code using 'etc/boot-cpus'
though.  If it's to go away, we'll need a SeaBIOS stable branch
release with the change.

-Kevin

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus'
  2016-11-11 14:44     ` Kevin O'Connor
@ 2016-11-11 15:05       ` Igor Mammedov
  0 siblings, 0 replies; 6+ messages in thread
From: Igor Mammedov @ 2016-11-11 15:05 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Eduardo Habkost, mst, Stefan Hajnoczi, qemu-devel, Gerd Hoffmann,
	Jordan Justen (Intel address),
	Laszlo Ersek, Jeff Fan

On Fri, 11 Nov 2016 09:44:03 -0500
"Kevin O'Connor" <kevin@koconnor.net> wrote:

> On Fri, Nov 11, 2016 at 03:36:19PM +0100, Igor Mammedov wrote:
> > On Fri, 11 Nov 2016 15:02:36 +0100
> > Laszlo Ersek <lersek@redhat.com> wrote:
> >   
> > > adding Jeff Fan and Jordan Justen
> > > 
> > > On 11/11/16 13:57, Igor Mammedov wrote:  
> > > > While looking at OVMF and how it handles CPUs (ACPI/AP wakeup),
> > > > I've noticed that it uses legacy FW_CFG_NB_CPUS(0x05) to get
> > > > the number of present at start CPUs.    
> > > 
> > > Where exactly do you see this?  
> > That's the only place in OVMF, but according to google there are
> > other firmwares that use FW_CFG_NB_CPUS so we are not free to remove
> > it and break guests.
> > 
> > So I'd just drop not yet released 'etc/boot-cpus',
> > of cause SeaBIOS should be fixed and its blob in QEMU updated.  
> 
> I have no preference between FW_CFG_NB_CPUS and 'etc/boot-cpus'.  Note
> that SeaBIOS v1.10 was released with the code using 'etc/boot-cpus'
> though.  If it's to go away, we'll need a SeaBIOS stable branch
> release with the change.
Yep, we'd need a stable for that.
It still probably better then keeping 2 exactly same interfaces
when we have a chance to maintain only one.

Without supplier of 'etc/boot-cpus' (QEMU) it will behave like
it used before, i.e. hang with x2apic configuration.

> 
> -Kevin
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus'
  2016-11-11 14:36   ` Igor Mammedov
  2016-11-11 14:44     ` Kevin O'Connor
@ 2016-11-11 16:37     ` Laszlo Ersek
  1 sibling, 0 replies; 6+ messages in thread
From: Laszlo Ersek @ 2016-11-11 16:37 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: qemu-devel, Eduardo Habkost, mst, Stefan Hajnoczi,
	Kevin O'Connor, Gerd Hoffmann, Jordan Justen (Intel address),
	Jeff Fan

On 11/11/16 15:36, Igor Mammedov wrote:

> I've already have a patch that makes upstream OVMF work for more
> than 64 CPUs and upto 288 (hopefully written in correct way).
> So expect me bothering you about boring stuff like rules where/how
> to post patches for edk2 and asking for review.

Great :)

Here's the list info:

https://lists.01.org/mailman/listinfo/edk2-devel

In particular, you have to be subscribed to be able to post. It's not
optimal, but that's how the list works for now.

Here's the official rules for contributing:

https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Development-Process

And here's my "somewhat more verbose" take on them:

https://github.com/tianocore/tianocore.github.io/wiki/Laszlo's-unkempt-git-guide-for-edk2-contributors-and-maintainers

Thanks,
Laszlo

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-11-11 16:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-11 12:57 [Qemu-devel] FW_CFG_NB_CPUS vs fwcfg file 'etc/boot-cpus' Igor Mammedov
2016-11-11 14:02 ` Laszlo Ersek
2016-11-11 14:36   ` Igor Mammedov
2016-11-11 14:44     ` Kevin O'Connor
2016-11-11 15:05       ` Igor Mammedov
2016-11-11 16:37     ` Laszlo Ersek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.