qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Yao, Jiewen" <jiewen.yao@intel.com>
To: Laszlo Ersek <lersek@redhat.com>,
	edk2-devel-groups-io <devel@edk2.groups.io>
Cc: "Chen, Yingwen" <yingwen.chen@intel.com>,
	Phillip Goerl <phillip.goerl@oracle.com>,
	qemu devel list <qemu-devel@nongnu.org>,
	"Nakajima,  Jun" <jun.nakajima@intel.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	edk2-rfc-groups-io <rfc@edk2.groups.io>,
	Joao Marcal Lemos Martins <joao.m.martins@oracle.com>
Subject: Re: [Qemu-devel] CPU hotplug using SMM with QEMU+OVMF
Date: Wed, 14 Aug 2019 13:20:55 +0000	[thread overview]
Message-ID: <74D8A39837DF1E4DA445A8C0B3885C503F75B680@shsmsx102.ccr.corp.intel.com> (raw)
In-Reply-To: <effa5e32-be1e-4703-4419-8866b7754e2d@redhat.com>

My comments below.

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Wednesday, August 14, 2019 12:09 AM
> To: edk2-devel-groups-io <devel@edk2.groups.io>
> Cc: edk2-rfc-groups-io <rfc@edk2.groups.io>; qemu devel list
> <qemu-devel@nongnu.org>; Igor Mammedov <imammedo@redhat.com>;
> Paolo Bonzini <pbonzini@redhat.com>; Yao, Jiewen
> <jiewen.yao@intel.com>; Chen, Yingwen <yingwen.chen@intel.com>;
> Nakajima, Jun <jun.nakajima@intel.com>; Boris Ostrovsky
> <boris.ostrovsky@oracle.com>; Joao Marcal Lemos Martins
> <joao.m.martins@oracle.com>; Phillip Goerl <phillip.goerl@oracle.com>
> Subject: Re: CPU hotplug using SMM with QEMU+OVMF
> 
> On 08/13/19 16:16, Laszlo Ersek wrote:
> 
> > Yingwen and Jiewen suggested the following process.
> >
> > Legend:
> >
> > - "New CPU":  CPU being hot-added
> > - "Host CPU": existing CPU
> > - (Flash):    code running from flash
> > - (SMM):      code running from SMRAM
> >
> > Steps:
> >
> > (01) New CPU: (Flash) enter reset vector, Global SMI disabled by
> >      default.
> 
> - What does "Global SMI disabled by default" mean? In particular, what
>   is "global" here?
[Jiewen] OK. Let's don’t use the term "global".


>   Do you mean that the CPU being hot-plugged should mask (by default)
>   broadcast SMIs? What about directed SMIs? (An attacker could try that
>   too.)
[Jiewen] I mean all SMIs are blocked for this specific hot-added CPU.


>   And what about other processors? (I'd assume step (01)) is not
>   relevant for other processors, but "global" is quite confusing here.)
[Jiewen] No impact to other processors.


> - Does this part require a new branch somewhere in the OVMF SEC code?
>   How do we determine whether the CPU executing SEC is BSP or
>   hot-plugged AP?
[Jiewen] I think this is blocked from hardware perspective, since the first instruction.
There are some hardware specific registers can be used to determine if the CPU is new added.
I don’t think this must be same as the real hardware.
You are free to invent some registers in device model to be used in OVMF hot plug driver.


> - How do we tell the hot-plugged AP where to start execution? (I.e. that
>   it should execute code at a particular pflash location.)
[Jiewen] Same real mode reset vector at FFFF:FFF0.


>   For example, in MpInitLib, we start a specific AP with INIT-SIPI-SIPI,
>   where "SIPI" stores the startup address in the "Interrupt Command
>   Register" (which is memory-mapped in xAPIC mode, and an MSR in x2APIC
>   mode, apparently). That doesn't apply here -- should QEMU auto-start
>   the new CPU?
[Jiewen] You can send INIT-SIPI-SIPI to new CPU only after it can access memory.
SIPI need give AP an below 1M memory address as waking vector.


> - What memory is used as stack by the new CPU, when it runs code from
>   flash?
[Jiewen] Same as other CPU in normal boot. You can use special reserved memory.


>   QEMU does not emulate CAR (Cache As RAM). The new CPU doesn't have
>   access to SMRAM. And we cannot use AcpiNVS or Reserved memory,
> because
>   a malicious OS could use other CPUs -- or PCI device DMA -- to attack
>   the stack (unless QEMU forcibly paused other CPUs upon hotplug; I'm
>   not sure).
[Jiewen] Excellent point!
I don’t think there is problem for real hardware, who always has CAR.
Can QEMU provide some CPU specific space, such as MMIO region?


> - If an attempt is made to hotplug multiple CPUs in quick succession,
>   does something serialize those attempts?
[Jiewen] The BIOS need consider this as availability requirement.
I don’t have strong opinion.
You can design a system that required hotplug must be one-by-one, or fail the hot-add.
Or you can design a system that did not have such restriction.
Again, all we need to do is to maintain the integrity of SMM.
The availability should be considered as separate requirement.


>   Again, stack usage could be a concern, even with Cache-As-RAM --
>   HyperThreads (logical processors) on a single core don't have
>   dedicated cache.
[Jiewen] Agree with you on the virtual environment.
For real hardware, we do socket level hot-add only. So HT is not the concern.
But if you want to do that in virtual environment, a processor specific memory
should be considered.


>   Does CPU hotplug apply only at the socket level? If the CPU is
>   multi-core, what is responsible for hot-plugging all cores present in
>   the socket?
[Jiewen] Ditto.


> > (02) New CPU: (Flash) configure memory control to let it access global
> >      host memory.
> 
> In QEMU/KVM guests, we don't have to enable memory explicitly, it just
> exists and works.
> 
> In OVMF X64 SEC, we can't access RAM above 4GB, but that shouldn't be an
> issue per se.
[Jiewen] Agree. I do not see the issue.


> > (03) New CPU: (Flash) send board message to tell host CPU (GPIO->SCI)
> >      -- I am waiting for hot-add message.
> 
> Maybe we can simplify this in QEMU by broadcasting an SMI to existent
> processors immediately upon plugging the new CPU.
> 
> 
> >                                        (NOTE: Host CPU can only
> send
> >      instruction in SMM mode. -- The register is SMM only)
> 
> Sorry, I don't follow -- what register are we talking about here, and
> why is the BSP needed to send anything at all? What "instruction" do you
> have in mind?
[Jiewen] The new CPU does not enable SMI at reset.
At some point of time later, the CPU need enable SMI, right?
The "instruction" here means, the host CPUs need tell to CPU to enable SMI.


> > (04) Host CPU: (OS) get message from board that a new CPU is added.
> >      (GPIO -> SCI)
> >
> > (05) Host CPU: (OS) All CPUs enter SMM (SCI->SWSMI) (NOTE: New CPU
> >      will not enter CPU because SMI is disabled)
> 
> I don't understand the OS involvement here. But, again, perhaps QEMU can
> force all existent CPUs into SMM immediately upon adding the new CPU.
[Jiewen] OS here means the Host CPU running code in OS environment, not in SMM environment.


> > (06) Host CPU: (SMM) Save 38000, Update 38000 -- fill simple SMM
> >      rebase code.
> >
> > (07) Host CPU: (SMM) Send message to New CPU to Enable SMI.
> 
> Aha, so this is the SMM-only register you mention in step (03). Is the
> register specified in the Intel SDM?
[Jiewen] Right. That is the register to let host CPU tell new CPU to enable SMI.
It is platform specific register. Not defined in SDM.
You may invent one in device model.


> > (08) New CPU: (Flash) Get message - Enable SMI.
> >
> > (09) Host CPU: (SMM) Send SMI to the new CPU only.
> >
> > (10) New CPU: (SMM) Response first SMI at 38000, and rebase SMBASE to
> >      TSEG.
> 
> What code does the new CPU execute after it completes step (10)? Does it
> halt?
[Jiewen] The new CPU exits SMM and return to original place - where it is
interrupted to enter SMM - running code on the flash.


> > (11) Host CPU: (SMM) Restore 38000.
> 
> These steps (i.e., (06) through (11)) don't appear RAS-specific. The
> only platform-specific feature seems to be SMI masking register, which
> could be extracted into a new SmmCpuFeaturesLib API.
> 
> Thus, would you please consider open sourcing firmware code for steps
> (06) through (11)?
> 
> Alternatively -- and in particular because the stack for step (01)
> concerns me --, we could approach this from a high-level, functional
> perspective. The states that really matter are the relocated SMBASE for
> the new CPU, and the state of the full system, right at the end of step
> (11).
> 
> When the SMM setup quiesces during normal firmware boot, OVMF could
> use
> existent (finalized) SMBASE infomation to *pre-program* some virtual
> QEMU hardware, with such state that would be expected, as "final" state,
> of any new hotplugged CPU. Afterwards, if / when the hotplug actually
> happens, QEMU could blanket-apply this state to the new CPU, and
> broadcast a hardware SMI to all CPUs except the new one.
> 
> The hardware SMI should tell the firmware that the rest of the process
> -- step (12) below, and onward -- is being requested.
> 
> If I understand right, this approach would produce an firmware & system
> state that's identical to what's expected right after step (11):
> 
> - all SMBASEs relocated
> - all preexistent CPUs in SMM
> - new CPU halted / blocked from launch
> - DRAM at 0x30000 / 0x38000 contains OS-owned data
> 
> Is my understanding correct that this is the expected state after step
> (11)?
[Jiewen] I think you are correct.


> Three more comments on the "SMBASE pre-config" approach:
> 
> - the virtual hardware providing this feature should become locked after
>   the configuration, until next platform reset
> 
> - the pre-config should occur via simple hardware accesses, so that it
>   can be replayed at S3 resume, i.e. as part of the S3 boot script
> 
> - from the pre-configured state, and the APIC ID, QEMU itself could
>   perhaps calculate the SMI stack location for the new processor.
> 
> 
> > (12) Host CPU: (SMM) Update located data structure to add the new CPU
> >      information. (This step will involve CPU_SERVICE protocol)
> 
> I commented on EFI_SMM_CPU_SERVICE_PROTOCOL in upon bullet (4) of
> <https://bugzilla.tianocore.org/show_bug.cgi?id=1512#c4>.
> 
> Calling EFI_SMM_ADD_PROCESSOR looks justified.
[Jiewen] I think you are correct.
Also REMOVE_PROCESSOR will be used for hot-remove action.


> What are some of the other member functions used for? The scary one is
> EFI_SMM_REGISTER_EXCEPTION_HANDLER.
[Jiewen] This is to register a new exception handler in SMM.
I don’t think this API is involved in hot-add.


> > ===================== (now, the next SMI will bring all CPU into TSEG)
> 
> OK... but what component injects that SMI, and when?
[Jiewen] Any SMI event. It could be synchronized SMI or asynchronized SMI.
It could from software such as IO write, or hardware such as thermal event.


> > (13) New CPU: (Flash) run MRC code, to init its own memory.
> 
> Why is this needed esp. after step (10)? The new CPU has accessed DRAM
> already. And why are we executing code from pflash, rather than from
> SMRAM, given that we're past SMBASE relocation?
[Jiewen] On real hardware, it is needed because different CPU may have different capability to access different DIMM.
I do not think your virtual platform need it.


> > (14) New CPU: (Flash) Deadloop, and wait for INIT-SIPI-SIPI.
> >
> > (15) Host CPU: (OS) Send INIT-SIPI-SIPI to pull new CPU in.
> 
> I'm confused by these steps. I thought that step (12) would complete the
> hotplug, by updating the administrative data structures internally. And
> the next SMI -- raised for the usual purposes, such as a software SMI
> for variable access -- would be handled like it always is, except it
> would also pull the new CPU into SMM too.
[Jiewen] The OS need use the new CPU at some point of time, right?
As such, the OS need pull the new CPU into its own environment by INIT-SIPI-SIPI.


> Thanks!
> Laszlo

  parent reply	other threads:[~2019-08-14 13:21 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-13 14:16 [Qemu-devel] CPU hotplug using SMM with QEMU+OVMF Laszlo Ersek
2019-08-13 16:09 ` Laszlo Ersek
2019-08-13 16:18   ` Laszlo Ersek
2019-08-14 13:20   ` Yao, Jiewen [this message]
2019-08-14 14:04     ` Paolo Bonzini
2019-08-15  9:55       ` Yao, Jiewen
2019-08-15 16:04         ` Paolo Bonzini
2019-08-15 15:00       ` [Qemu-devel] [edk2-devel] " Laszlo Ersek
2019-08-15 16:16         ` Igor Mammedov
2019-08-15 16:21         ` Paolo Bonzini
2019-08-16  2:46           ` Yao, Jiewen
2019-08-16  7:20             ` Paolo Bonzini
2019-08-16  7:49               ` Yao, Jiewen
2019-08-16 20:15                 ` Laszlo Ersek
2019-08-16 22:19                   ` Alex Williamson
2019-08-17  0:20                     ` Yao, Jiewen
2019-08-18 19:50                       ` Paolo Bonzini
2019-08-18 23:00                         ` Yao, Jiewen
2019-08-19 14:10                           ` Paolo Bonzini
2019-08-21 12:07                             ` Laszlo Ersek
     [not found]                           ` <E92EE9817A31E24EB0585FDF735412F5B9D9C671@ORSMSX113.amr.corp.intel.com>
2019-08-21 17:05                             ` [Qemu-devel] [edk2-rfc] " Paolo Bonzini
     [not found]                               ` <E92EE9817A31E24EB0585FDF735412F5B9D9D74A@ORSMSX113.amr.corp.intel.com>
2019-08-21 17:39                                 ` Paolo Bonzini
2019-08-21 20:17                                   ` Kinney, Michael D
2019-08-22  6:18                                     ` Paolo Bonzini
2019-08-22 18:29                                       ` Laszlo Ersek
2019-08-22 18:51                                         ` Paolo Bonzini
2019-08-23 14:53                                           ` Laszlo Ersek
2019-08-22 20:13                                         ` Kinney, Michael D
2019-08-22 17:59                               ` Laszlo Ersek
2019-08-22 18:43                                 ` Paolo Bonzini
2019-08-22 20:06                                   ` Kinney, Michael D
2019-08-22 22:18                                     ` Paolo Bonzini
2019-08-22 22:32                                       ` Kinney, Michael D
2019-08-22 23:11                                         ` Paolo Bonzini
2019-08-23  1:02                                           ` Kinney, Michael D
2019-08-23  5:00                                             ` Yao, Jiewen
2019-08-23 15:25                                               ` Kinney, Michael D
2019-08-24  1:48                                                 ` Yao, Jiewen
2019-08-27 18:31                                                   ` Igor Mammedov
2019-08-29 17:01                                                     ` Laszlo Ersek
2019-08-30 14:48                                                       ` Igor Mammedov
2019-08-30 18:46                                                         ` Laszlo Ersek
2019-09-02  8:45                                                           ` Igor Mammedov
2019-09-02 19:09                                                             ` Laszlo Ersek
2019-09-03 14:53                                                               ` Igor Mammedov
2019-09-03 17:20                                                                 ` Laszlo Ersek
2019-09-04  9:52                                                                   ` Igor Mammedov
2019-09-05 13:08                                                                     ` Laszlo Ersek
2019-09-05 15:45                                                                       ` Igor Mammedov
2019-09-05 15:49                                                                       ` [Qemu-devel] [PATCH] q35: lpc: allow to lock down 128K RAM at default SMBASE address Igor Mammedov
2019-09-09 19:15                                                                         ` Laszlo Ersek
2019-09-09 19:20                                                                           ` Laszlo Ersek
2019-09-10 15:58                                                                           ` Igor Mammedov
2019-09-11 17:30                                                                             ` Laszlo Ersek
2019-09-17 13:11                                                                               ` [Qemu-devel] [edk2-devel] " Igor Mammedov
2019-08-26 15:30                                                 ` [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF Laszlo Ersek
2019-08-27 16:23                                                   ` Igor Mammedov
2019-08-27 20:11                                                     ` Laszlo Ersek
2019-08-28 12:01                                                       ` Igor Mammedov
2019-08-29 16:25                                                         ` Laszlo Ersek
2019-08-30 13:49                                                           ` Igor Mammedov
2019-08-22 17:53                             ` Laszlo Ersek
2019-08-16 20:00           ` [Qemu-devel] " Laszlo Ersek
2019-08-15 16:07       ` [Qemu-devel] " Igor Mammedov
2019-08-15 16:24         ` Paolo Bonzini
2019-08-16  7:42           ` Igor Mammedov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=74D8A39837DF1E4DA445A8C0B3885C503F75B680@shsmsx102.ccr.corp.intel.com \
    --to=jiewen.yao@intel.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=devel@edk2.groups.io \
    --cc=imammedo@redhat.com \
    --cc=joao.m.martins@oracle.com \
    --cc=jun.nakajima@intel.com \
    --cc=lersek@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=phillip.goerl@oracle.com \
    --cc=qemu-devel@nongnu.org \
    --cc=rfc@edk2.groups.io \
    --cc=yingwen.chen@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).