From: "Yao, Jiewen" <jiewen.yao@intel.com>
To: Alex Williamson <alex.williamson@redhat.com>,
	Laszlo Ersek <lersek@redhat.com>
Cc: "Chen, Yingwen" <yingwen.chen@intel.com>,
	"devel@edk2.groups.io" <devel@edk2.groups.io>,
	Phillip Goerl <phillip.goerl@oracle.com>,
	qemu devel list <qemu-devel@nongnu.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	edk2-rfc-groups-io <rfc@edk2.groups.io>,
	Joao Marcal Lemos Martins <joao.m.martins@oracle.com>
Subject: Re: [Qemu-devel] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
Date: Sat, 17 Aug 2019 00:20:25 +0000
Message-ID: <74D8A39837DF1E4DA445A8C0B3885C503F761B96@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <20190816161933.7d30a881@x1.home>



> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Saturday, August 17, 2019 6:20 AM
> To: Laszlo Ersek <lersek@redhat.com>
> Cc: Yao, Jiewen <jiewen.yao@intel.com>; Paolo Bonzini
> <pbonzini@redhat.com>; devel@edk2.groups.io; edk2-rfc-groups-io
> <rfc@edk2.groups.io>; qemu devel list <qemu-devel@nongnu.org>; Igor
> Mammedov <imammedo@redhat.com>; Chen, Yingwen
> <yingwen.chen@intel.com>; Nakajima, Jun <jun.nakajima@intel.com>; Boris
> Ostrovsky <boris.ostrovsky@oracle.com>; Joao Marcal Lemos Martins
> <joao.m.martins@oracle.com>; Phillip Goerl <phillip.goerl@oracle.com>
> Subject: Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
> 
> On Fri, 16 Aug 2019 22:15:15 +0200
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
> > +Alex (direct question at the bottom)
> >
> > On 08/16/19 09:49, Yao, Jiewen wrote:
> > > below
> > >
> > >> -----Original Message-----
> > >> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> > >> Sent: Friday, August 16, 2019 3:20 PM
> > >> To: Yao, Jiewen <jiewen.yao@intel.com>; Laszlo Ersek
> > >> <lersek@redhat.com>; devel@edk2.groups.io
> > >> Cc: edk2-rfc-groups-io <rfc@edk2.groups.io>; qemu devel list
> > >> <qemu-devel@nongnu.org>; Igor Mammedov <imammedo@redhat.com>;
> > >> Chen, Yingwen <yingwen.chen@intel.com>; Nakajima, Jun
> > >> <jun.nakajima@intel.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>;
> > >> Joao Marcal Lemos Martins <joao.m.martins@oracle.com>; Phillip Goerl
> > >> <phillip.goerl@oracle.com>
> > >> Subject: Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
> > >>
> > >> On 16/08/19 04:46, Yao, Jiewen wrote:
> > >>> Comment below:
> > >>>
> > >>>
> > >>>> -----Original Message-----
> > >>>> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> > >>>> Sent: Friday, August 16, 2019 12:21 AM
> > >>>> To: Laszlo Ersek <lersek@redhat.com>; devel@edk2.groups.io; Yao,
> > >>>> Jiewen <jiewen.yao@intel.com>
> > >>>> Cc: edk2-rfc-groups-io <rfc@edk2.groups.io>; qemu devel list
> > >>>> <qemu-devel@nongnu.org>; Igor Mammedov <imammedo@redhat.com>;
> > >>>> Chen, Yingwen <yingwen.chen@intel.com>; Nakajima, Jun
> > >>>> <jun.nakajima@intel.com>; Boris Ostrovsky <boris.ostrovsky@oracle.com>;
> > >>>> Joao Marcal Lemos Martins <joao.m.martins@oracle.com>; Phillip Goerl
> > >>>> <phillip.goerl@oracle.com>
> > >>>> Subject: Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF
> > >>>>
> > >>>> On 15/08/19 17:00, Laszlo Ersek wrote:
> > >>>>> On 08/14/19 16:04, Paolo Bonzini wrote:
> > >>>>>> On 14/08/19 15:20, Yao, Jiewen wrote:
> > >>>>>>>> - Does this part require a new branch somewhere in the OVMF
> > >>>>>>>>   SEC code? How do we determine whether the CPU executing SEC
> > >>>>>>>>   is BSP or hot-plugged AP?
> > >>>>>>> [Jiewen] I think this is blocked from the hardware perspective,
> > >>>>>>> since the first instruction.
> > >>>>>>> There are some hardware-specific registers that can be used to
> > >>>>>>> determine whether the CPU is newly added.
> > >>>>>>> I don’t think this must be the same as on real hardware.
> > >>>>>>> You are free to invent some registers in the device model to be
> > >>>>>>> used in the OVMF hot-plug driver.
> > >>>>>>
> > >>>>>> Yes, this would be a new operation mode for QEMU, one that only
> > >>>>>> applies to hot-plugged CPUs.  In this mode the AP doesn't reply
> > >>>>>> to INIT or SMI; in fact it doesn't reply to anything at all.
> > >>>>>>
> > >>>>>>>> - How do we tell the hot-plugged AP where to start execution?
> > >>>>>>>>   (I.e. that it should execute code at a particular pflash
> > >>>>>>>>   location.)
> > >>>>>>> [Jiewen] Same real mode reset vector at FFFF:FFF0.
> > >>>>>>
> > >>>>>> You do not need a reset vector or INIT/SIPI/SIPI sequence at all
> > >>>>>> in QEMU.  The AP does not start execution at all when it is
> > >>>>>> unplugged, so no cache-as-RAM etc.
> > >>>>>>
> > >>>>>> We only need to modify QEMU so that hot-plugged APs do not reply
> > >>>>>> to INIT/SIPI/SMI.
> > >>>>>>
> > >>>>>>> I don’t think there is a problem for real hardware, which
> > >>>>>>> always has CAR.
> > >>>>>>> Can QEMU provide some CPU-specific space, such as an MMIO
> > >>>>>>> region?
> > >>>>>>
> > >>>>>> Why is a CPU-specific region needed if every other processor is
> > >>>>>> in SMM and thus trusted?
> > >>>>>
> > >>>>> I was going through the steps Jiewen and Yingwen recommended.
> > >>>>>
> > >>>>> In step (02), the new CPU is expected to set up RAM access. In step
> > >>>>> (03), the new CPU, executing code from flash, is expected to "send
> > >>>>> board message to tell host CPU (GPIO->SCI) -- I am waiting for
> > >>>>> hot-add message." For that action, the new CPU may need a stack
> > >>>>> (minimally if we want to use C function calls).
> > >>>>>
> > >>>>> Until step (03), there had been no word about any other (=
> > >>>>> pre-plugged) CPUs (more precisely, Jiewen even confirmed "No
> > >>>>> impact to other processors"), so I didn't assume that other CPUs
> > >>>>> had entered SMM.
> > >>>>>
> > >>>>> Paolo, I've attempted to read Jiewen's response, and yours, as
> > >>>>> carefully as I can. I'm still very confused. If you have a better
> > >>>>> understanding, could you please write up the 15-step process from
> > >>>>> the thread starter again, with all QEMU customizations applied?
> > >>>>> Such as, unnecessary steps removed, and platform specifics filled
> > >>>>> in.
> > >>>>
> > >>>> Sure.
> > >>>>
> > >>>> (01a) QEMU: create new CPU.  The CPU already exists, but it does
> > >>>>      not start running code until unparked by the CPU hotplug
> > >>>>      controller.
> > >>>>
> > >>>> (01b) QEMU: trigger SCI
> > >>>>
> > >>>> (02-03) no equivalent
> > >>>>
> > >>>> (04) Host CPU: (OS) execute GPE handler from DSDT
> > >>>>
> > >>>> (05) Host CPU: (OS) Port 0xB2 write, all CPUs enter SMM (NOTE: new
> > >>>>      CPU will not enter SMM because SMI is disabled for it)
> > >>>>
> > >>>> (06) Host CPU: (SMM) Save 38000, Update 38000 -- fill simple SMM
> > >>>>      rebase code.
> > >>>>
> > >>>> (07a) Host CPU: (SMM) Write to CPU hotplug controller to enable
> > >>>>      new CPU
> > >>>>
> > >>>> (07b) Host CPU: (SMM) Send INIT/SIPI/SIPI to new CPU.
> > >>> [Jiewen] NOTE: INIT/SIPI/SIPI can be sent by a malicious CPU. There
> > >>> is no restriction that INIT/SIPI/SIPI can only be sent in SMM.
> > >>
> > >> All of the CPUs are now in SMM, and INIT/SIPI/SIPI will be discarded
> > >> before 07a, so this is okay.
> > > [Jiewen] May I know why INIT/SIPI/SIPI is discarded before 07a but is
> > > delivered at 07a?
> > > I don’t see any extra step between 06 and 07a.
> > > What is the magic here?
> >
> > The magic is 07a itself, IIUC. The CPU hotplug controller would be
> > accessible only in SMM. And until 07a happens, the new CPU ignores
> > INIT/SIPI/SIPI even if another CPU sends it those, simply because QEMU
> > would implement the new CPU's behavior like that.
[Jiewen] Got it. Looks fine to me.
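
For illustration, the parking behavior agreed on above could look roughly
like this in QEMU-style C. This is a minimal sketch only; the
HotplugCPUState type, the "parked" field, and both helper functions are
invented names, not existing QEMU code:

  /* Minimal sketch, not existing QEMU code: all names below are
   * hypothetical. A hot-plugged CPU starts parked and discards
   * INIT/SIPI/SMI until the hotplug controller (07a) unparks it. */
  #include <stdbool.h>

  typedef struct HotplugCPUState {
      bool parked;              /* true from (01a) until (07a) */
  } HotplugCPUState;

  /* Consulted for every INIT/SIPI/SMI aimed at this CPU. */
  static bool cpu_accepts_ipi(HotplugCPUState *cpu)
  {
      /* A parked CPU ignores INIT/SIPI/SMI entirely, so a malicious
       * IPI sent before step (07a) is simply discarded. */
      return !cpu->parked;
  }

  /* Step (07a): reached only via the hotplug controller register,
   * which is writable exclusively from SMM. */
  static void hotplug_controller_enable_cpu(HotplugCPUState *cpu)
  {
      cpu->parked = false;  /* from now on, INIT/SIPI/SIPI is delivered */
  }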



> > >> However, I do see a problem, because a PCI device's DMA could
> > >> overwrite 0x38000 between (06) and (10) and hijack the code that is
> > >> executed in SMM.  How is this avoided on real hardware?  By the time
> > >> the new CPU enters SMM, it doesn't run off cache-as-RAM anymore.
> > > [Jiewen] Interesting question.
> > > I don’t think the DMA attack is considered in the threat model for
> > > the virtual environment. We only list the adversaries below:
> > > -- Adversary: System Software Attacker, who can control any OS memory
> > > or silicon register from the OS level, or read/write BIOS data.
> > > -- Adversary: Simple hardware attacker, who can hot add or hot remove
> > > a CPU.
> >
> > We do have physical PCI(e) device assignment; sorry for not
> > highlighting that earlier.
[Jiewen] That is OK. Then we MUST add a third adversary:
-- Adversary: Simple hardware attacker, who can use a device to perform a DMA attack in the virtual world.
NOTE: The DMA attack in the real world is out of scope. That is handled by the IOMMU in the real world, such as VT-d. -- Please do clarify if this is TRUE.

In the real world:
#1: the SMM region MUST be a non-DMA-capable region.
#2: the MMIO region MUST be a non-DMA-capable region.
#3: the stolen memory MIGHT be a DMA-capable or a non-DMA-capable region, depending upon the silicon design.
#4: the normal OS-accessible memory -- including ACPI reclaim, ACPI NVS, and reserved memory not covered by #3 -- MUST be a DMA-capable region.
As such, IOMMU protection is NOT required for #1 and #2. IOMMU protection MIGHT be required for #3 and MUST be required for #4.
I assume the virtual environment is designed in the same way. Please correct me if I am wrong.
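
To make #1 concrete for the 0x38000 case discussed above, here is a sketch
of programming the VT-d Protected Low-Memory Region so that the range
becomes non-DMA capable. The register offsets come from the VT-d
specification, but the MMIO helpers and the base/limit values are
illustrative only; real firmware must honor the implementation-specific
alignment of PLMBASE/PLMLIMIT:

  #include <stdint.h>

  /* VT-d DMAR register offsets (per the VT-d specification). */
  #define PMEN_REG       0x64          /* Protected Memory Enable      */
  #define PLMBASE_REG    0x68          /* Protected Low-Memory Base    */
  #define PLMLIMIT_REG   0x6C          /* Protected Low-Memory Limit   */
  #define PMEN_EPM       (1u << 31)    /* Enable Protected Memory      */
  #define PMEN_PRS       (1u << 0)     /* Protected Region Status      */

  static void mmio_write32(volatile uint8_t *bar, uint32_t off, uint32_t v)
  {
      *(volatile uint32_t *)(bar + off) = v;
  }

  static uint32_t mmio_read32(volatile uint8_t *bar, uint32_t off)
  {
      return *(volatile uint32_t *)(bar + off);
  }

  /* Sketch: block DMA to the low range containing 0x38000; 'dmar' is
   * the MMIO base of one DMAR unit (platform-specific). Alignment of
   * base/limit is implementation-defined -- illustrative values only. */
  static void protect_low_ram(volatile uint8_t *dmar)
  {
      mmio_write32(dmar, PLMBASE_REG,  0x00030000);
      mmio_write32(dmar, PLMLIMIT_REG, 0x0003FFFF);
      mmio_write32(dmar, PMEN_REG, PMEN_EPM);
      while (!(mmio_read32(dmar, PMEN_REG) & PMEN_PRS)) {
          ;  /* wait until hardware reports the region as protected */
      }
  }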



> > That feature (VFIO) does rely on the (physical) IOMMU, and
> > it makes sure that the assigned device can only access physical frames
> > that belong to the virtual machine that the device is assigned to.
[Jiewen] Thank you! Good to know.
I found https://www.kernel.org/doc/Documentation/vfio.txt
Is that what you described above?
Anyway, I believe the problem is clear, and the solution in the real world is clear.
I will leave the virtual world discussion to Alex, Paolo, and Laszlo.
If you need any of my input, please let me know.



> > However, as far as I know, VFIO doesn't try to restrict PCI DMA to
> > subsets of guest RAM... I could be wrong about that; I vaguely recall
> > RMRR support, which seems somewhat related.
> >
> > > I agree it is a threat from the real hardware perspective. SMM may
> > > check VT-d to make sure the 38000 region is blocked.
> > > I doubt whether it is a threat in the virtual environment. Do we have
> > > a way to block DMA in the virtual environment?
> >
> > I think that would be a VFIO feature.
> >
> > Alex: if we wanted to block PCI(e) DMA to a specific part of guest RAM
> > (expressed with guest-physical RAM addresses), perhaps permanently,
> > perhaps just for a while -- not sure about coordination though --, could
> > VFIO accommodate that (I guess by "punching holes" in the IOMMU page
> > tables)?
> 
> It depends.  For starters, the vfio mapping API does not allow
> unmapping arbitrary sub-ranges of previous mappings.  So the hole you
> want to punch would need to be independently mapped.  From there you
> get into the issue of whether this range is a potential DMA target.  If
> it is, then this is the path to data corruption.  We cannot interfere
> with the operation of the device and we have little to no visibility of
> active DMA targets.
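
For reference, this is how that constraint shows up in the existing VFIO
type1 userspace API: VFIO_IOMMU_UNMAP_DMA cannot split a prior mapping, so
a hole at 0x38000 is only possible if the surrounding ranges were mapped
as separate entries to begin with. A sketch, with made-up addresses and
with container/group setup omitted:

  #include <stdint.h>
  #include <string.h>
  #include <sys/ioctl.h>
  #include <linux/vfio.h>

  /* Map guest RAM in two pieces so the page at IOVA 0x38000 can later
   * be unmapped on its own. vaddr0/vaddr1 are the process virtual
   * addresses backing the two guest-physical ranges (made-up layout). */
  static int map_around_hole(int container, void *vaddr0, void *vaddr1)
  {
      struct vfio_iommu_type1_dma_map map;

      memset(&map, 0, sizeof(map));
      map.argsz = sizeof(map);
      map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;

      map.vaddr = (uintptr_t)vaddr0;            /* [0x0, 0x38000) */
      map.iova  = 0;
      map.size  = 0x38000;
      if (ioctl(container, VFIO_IOMMU_MAP_DMA, &map))
          return -1;

      map.vaddr = (uintptr_t)vaddr1;            /* [0x39000, 1 MiB) */
      map.iova  = 0x39000;                      /* hole: one 4 KiB page */
      map.size  = 0x100000 - 0x39000;
      return ioctl(container, VFIO_IOMMU_MAP_DMA, &map);
  }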
> 
> If we're talking about RAM that is never a DMA target, perhaps e820
> reserved memory, then we can make sure certain MemoryRegions are
> skipped when mapped by QEMU and would expect the guest to never map
> them through a vIOMMU as well.  Maybe then it's a question of where
> we're trying to provide security (it might be more difficult if QEMU
> needs to sanitize vIOMMU mappings to actively prevent mapping
> reserved areas).
> 
> Is there anything unique about the VM case here?  Bare metal SMM needs
> to be concerned about protecting itself from I/O devices that operate
> outside of the realm of SMM mode as well, right?  Is something "simple"
> like an AddressSpace switch necessary here, such that an I/O device
> always has a mapping to a safe guest RAM page while the vCPU
> AddressSpace can switch to some protected page?  The IOMMU and vCPU
> mappings don't need to be the same.  The vCPU is more under our control
> than the assigned device.
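
That split-view idea could be expressed with QEMU's memory API roughly as
follows. The API calls (memory_region_init_alias,
memory_region_add_subregion_overlap, memory_region_set_enabled) are real,
but the region layout and both functions are a hypothetical sketch, not
how QEMU currently handles 0x38000:

  /* Hypothetical sketch against QEMU's memory API (qemu/osdep.h,
   * exec/memory.h): overlay a protected page at 0x38000 in the vCPU
   * view only; the device/IOMMU address space keeps seeing plain RAM. */
  static MemoryRegion smram_at_38000;

  static void install_cpu_only_view(MemoryRegion *cpu_root,
                                    MemoryRegion *protected_backing)
  {
      /* Alias the protected backing store over 0x38000 in the vCPU
       * root region; priority 1 beats the underlying RAM. */
      memory_region_init_alias(&smram_at_38000, NULL, "smram-at-38000",
                               protected_backing, 0, 0x1000);
      memory_region_add_subregion_overlap(cpu_root, 0x38000,
                                          &smram_at_38000, 1);
      memory_region_set_enabled(&smram_at_38000, false);
  }

  /* Toggle: vCPUs see the protected page; assigned devices still see
   * the safe RAM page, so in-flight DMA is never redirected. */
  static void set_cpu_view_protected(bool on)
  {
      memory_region_set_enabled(&smram_at_38000, on);
  }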
> 
> FWIW, RMRRs are a VT-d specific mechanism to define an address range as
> persistently identity-mapped for one or more devices.  IOW, the device
> would always map that range.  I don't think that's what you're after
> here.  RMRRs are also an abomination that I hope we never find a
> requirement for in a VM.  Thanks,
> 
> Alex

