* [Qemu-devel] Native Memory Virtualization in qemu-system-aarch64
@ 2018-07-12 16:48 Kevin Loughlin
  2018-07-13 15:22 ` Peter Maydell
  0 siblings, 1 reply; 5+ messages in thread
From: Kevin Loughlin @ 2018-07-12 16:48 UTC (permalink / raw)
  To: qemu-devel, qemu-arm, qemu-discuss

I know TrustZone has support for memory virtualization in AArch64, but I'm
looking to create a different model. Namely, I'd like to fully virtualize
the memory map for the "virt" board.

As a basic example of what I want, assuming an execution environment that
runs in a 1GB physical address space (0x0 - 0x3FFFFFFF), I'd like to be
able to switch to a second execution environment with a distinct SW stack
that runs in the second GB of board memory (0x40000000 - 0x7FFFFFFF). The
key points for my desired memory virtualization are the following...

   1. Both of these environments should have distinct virtual address spaces
   2. The OS in each environment should believe it is running on physical
   addresses 0x0 - 0x3FFFFFFF in both cases.
   3. Neither environment should have access to the physical memory state
   of the other

I initialize distinct AddressSpace and MemoryRegion structures for each of
these GB blocks. Because all I want is a simple shift of physical address
for one environment, I hesitate to mirror the (relatively) complex address
translation process for TrustZone. Does anyone know if it would be better
to either (a) provide custom read/write functions for the shifted
MemoryRegion object, or (b) modify the target/arm code, such as adding a
shift to get_phys_addr() in target/arm/helper.c?
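
For reference, the kind of setup I have in mind looks roughly like the
following (a minimal sketch only; the names and the use of a RAM alias
are my own illustration, not existing virt board code):

    /* Sketch: expose the second GB of board RAM as an alias that
     * appears to start at physical address 0x0, and give it its own
     * AddressSpace. "system_ram" is assumed to be the board RAM region. */
    MemoryRegion *env2_root  = g_new(MemoryRegion, 1);
    MemoryRegion *env2_alias = g_new(MemoryRegion, 1);

    memory_region_init(env2_root, NULL, "env2-root", 0x40000000ULL);
    memory_region_init_alias(env2_alias, NULL, "env2-ram-alias",
                             system_ram, 0x40000000ULL, 0x40000000ULL);
    memory_region_add_subregion(env2_root, 0, env2_alias);

    AddressSpace *env2_as = g_new(AddressSpace, 1);
    address_space_init(env2_as, env2_root, "env2-as");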

Thanks in advance,

Kevin


* Re: [Qemu-devel] Native Memory Virtualization in qemu-system-aarch64
  2018-07-12 16:48 [Qemu-devel] Native Memory Virtualization in qemu-system-aarch64 Kevin Loughlin
@ 2018-07-13 15:22 ` Peter Maydell
  2018-07-18  1:34   ` Kevin Loughlin
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Maydell @ 2018-07-13 15:22 UTC (permalink / raw)
  To: Kevin Loughlin; +Cc: QEMU Developers, qemu-arm, qemu-discuss

On 12 July 2018 at 17:48, Kevin Loughlin <kevlough@umich.edu> wrote:
> I know TrustZone has support for memory virtualization in AArch64, but I'm
> looking to create a different model. Namely, I'd like to fully virtualize
> the memory map for the "virt" board.
>
> As a basic example of what I want, assuming an execution environment that
> runs in a 1GB physical address space (0x0 - 0x3FFFFFFF), I'd like to be
> able to switch to a second execution environment with a distinct SW stack
> that runs in the second GB of board memory (0x40000000 - 0x7FFFFFFF). The
> key points for my desired memory virtualization are the following...
>
>    1. Both of these environments should have distinct virtual address spaces
>    2. The OS in each environment should believe it is running on physical
>    addresses 0x0 - 0x3FFFFFFF in both cases.
>    3. Neither environment should have access to the physical memory state
>    of the other
>
> I initialize distinct AddressSpace and MemoryRegion structures for each of
> these GB blocks. Because all I want is a simple shift of physical address
> for one environment, I hesitate to mirror the (relatively) complex address
> translation process for TrustZone. Does anyone know if it would be better
> to either (a) provide custom read/write functions for the shifted
> MemoryRegion object, or (b) modify the target/arm code, such as adding a
> shift to get_phys_addr() in target/arm/helper.c?

I'm a bit confused about what you're trying to do. Without TrustZone,
by definition there is only one physical address space (ie all of
memory/devices/etc are addressed by a single 64-bit physaddr).
There's no way to cause the CPU to not have access to it.
With TrustZone, you can think of the system as having two physical
address spaces (so to access something you need to specify both
a 64-bit physaddr and the TZ secure/nonsecure bit), and the CPU
and the system design cooperate to enforce that code running in the
nonsecure world can't get at things in the system it should not have
access to.

The whole point of TZ is to allow you to do this sort of partitioning.
Without it there's no way for the system (RAM or whatever) to know which
environment is running on the CPU.

You could in theory design and implement a non-standard extension to
the architecture to do equivalent things to what TZ is doing I suppose,
but that would be a lot of work and a lot of fragile modifications
to QEMU.

thanks
-- PMM


* Re: [Qemu-devel] Native Memory Virtualization in qemu-system-aarch64
  2018-07-13 15:22 ` Peter Maydell
@ 2018-07-18  1:34   ` Kevin Loughlin
  2018-07-18 17:58     ` Peter Maydell
  0 siblings, 1 reply; 5+ messages in thread
From: Kevin Loughlin @ 2018-07-18  1:34 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, qemu-discuss, Mickens, James

I am indeed attempting to implement a non-standard extension to the ARMv8
architecture for experimental purposes. My high-level goal for the
extension is to completely isolate *N* execution environments (even
inter-environment communication is prohibited, for example) using purely
HW-based isolation mechanisms (i.e. no monitor software to help
enforce/configure isolation).

As part of my design, I want to take a single set of physical memory
hardware (e.g., RAM chips, MMUs, etc.) and

   1. partition the resources *N* ways, creating *N* views of the available
   physical resources, and then
   2. be able to dynamically switch the current view that is "active,"
   i.e., visible to the CPU and other devices

Under my setup, the CPU's MMU translates from VAs to IPAs, and an external
memory controller then intercepts all memory transactions and translates
these IPAs to true PAs. This allows the memory controller to enforce
physical isolation of environments, and does not expose true PAs to the
CPU/system software.

The CPU object would initialize and store an AddressSpace object for each
environment in its "cpu_ases" field. Additionally, each environment's
memory map would follow identical offsets. That is, if RAM/flash/etc
starts at offset X in one environment, it will start at offset X in the
others as well. Therefore, my controller only ever needs to perform
IPA-to-PA translations via a simple, hard-wired base+bounds policy based
on the active environment.
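
In other words, the controller's translation step would be nothing more
than something like the following (an illustrative sketch only; env_base[]
and ENV_SIZE are hypothetical controller state, not existing QEMU code):

    /* Hypothetical base+bounds IPA->PA step for the active environment. */
    static hwaddr controller_translate(unsigned active_env, hwaddr ipa)
    {
        assert(ipa < ENV_SIZE);            /* bounds: stay inside the slot */
        return env_base[active_env] + ipa; /* base: shift into the slot */
    }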

My question is how best to emulate the memory controller given this desired
setup. I have three primary ideas, and I would love to get feedback on
their feasibility.

   1. Implement the controller as an IOMMU region. I would be responsible
   for writing the controller's operations to shift and forward the target
   address to the appropriate subregion. Would it be possible to trigger
   the IOMMU region on every access to system_memory? For example, even during
   QEMU's loading process? Or would I only be able to trigger the IOMMU
   operations on access to the subregions that represent my environments? My
   understanding of the IOMMU regions is shaky. Nonetheless, this sounds
   like the most promising approach, assuming I can provide the shifting and
   forwarding operations and hide the PAs from the CPU's TLB as desired.

   2. Go into the target/arm code, find every instance of accesses to
   address spaces, and shift the target physical address accordingly. This
   seems ugly and unlikely to work.

   3. Use overlapping subregions with differing priorities, as is done in
   QEMU's TrustZone implementation (see the sketch after this list).
   However, these priorities would have to change on an environment context
   switch, and I don't know if that would lead to chaos.
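
To illustrate idea 3, the context switch might look something like this
(a rough sketch with made-up names; env_mr[] would be the per-environment
RAM regions, and sysmem the system memory region):

    /* Map both environments at the same address; only one is visible. */
    memory_region_add_subregion_overlap(sysmem, 0, env_mr[0], 1);
    memory_region_add_subregion_overlap(sysmem, 0, env_mr[1], 0);

    /* On a context switch to environment 1, flip which region is
     * enabled (equivalently, the priorities could be swapped). */
    memory_region_transaction_begin();
    memory_region_set_enabled(env_mr[0], false);
    memory_region_set_enabled(env_mr[1], true);
    memory_region_transaction_commit();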


Thanks,

Kevin

P.S. Note that my virtualization actually occurs *beneath* the TrustZone
layer. While creating "nested" TrustZones within each of my partitions is
theoretically possible, it's not an explicit goal of my design. Naturally,
I do use some isolation techniques similar to those deployed in TrustZone,
but ultimately my extension is designed for different purposes than
TrustZone.



* Re: [Qemu-devel] Native Memory Virtualization in qemu-system-aarch64
  2018-07-18  1:34   ` Kevin Loughlin
@ 2018-07-18 17:58     ` Peter Maydell
  2018-07-24 21:23       ` Kevin Loughlin
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Maydell @ 2018-07-18 17:58 UTC (permalink / raw)
  To: Kevin Loughlin; +Cc: QEMU Developers, qemu-arm, qemu-discuss, Mickens, James

On 18 July 2018 at 02:34, Kevin Loughlin <kevlough@umich.edu> wrote:
> Under my setup, the CPU's MMU translates from VAs to IPAs, and an external
> memory controller then intercepts all memory transactions and translates
> these IPAs to true PAs. This allows the memory controller to enforce
> physical isolation of environments, and does not expose true PAs to the
> CPU/system software.

Ah, right, "external custom memory controller" makes sense.

> My question is how best to emulate the memory controller given this desired
> setup. I have three primary ideas, and I would love to get feedback on their
> feasibility.
>
> Implement the controller as an IOMMU region. I would be responsible for
> writing the controller's operations to shift and forward the target address
> to the appropriate subregion. Would it be possible to trigger the IOMMU
> region on every access to system_memory? For example, even during QEMU's
> loading process? Or would I only be able to trigger the IOMMU operations on
> access to the subregions that represent my environments? My understanding of
> the IOMMU regions is shaky. Nonetheless, this sounds like the most promising
> approach, assuming I can provide the shifting and forwarding operations and
> hide the PAs from the CPU's TLB as desired.

I would probably go with implementing it as an IOMMU region. We recently
added code to QEMU that allows you to put IOMMUs in the CPU's
memory-access path, so this works now. The example we have of
that at the moment is hw/misc/tz-mpc.c (which is a simple device
which configurably controls access to the thing "downstream" of it
based on a lookup table and whether the access is S or NS).

As you've guessed, the way the IOMMU stuff works is that it gates
access to the things sat behind it: the device has a MemoryRegion
"upstream" which it exposes to the code which creates it, and a
MemoryRegion property "downstream". The creating code (ie the board)
passes in whatever the "downstream" is (likely a container MemoryRegion
with RAM and so on), and maps the "upstream" end into the address
space that the CPU sees. (You would probably have one downstream
for each separate subregion). You could either have the IOMMU
only "in front" of one part of the overall address space, or
in front of the whole of the address space, as you liked.
(Assuming you have some kind of "control register" memory mapped
interface for programming it, it can't be behind itself; that
would be "an interesting topological exercise", to quote nethack.)

What the CPU sees is whatever is in the MemoryRegion passed to it
via
        object_property_set_link(cpuobj, ..., "memory",
                                 &error_abort);

The virt board happens to currently use get_system_memory()
for that, but you can use a custom container MemoryRegion if you
like.
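
Schematically, the wiring might look like this (again just a sketch;
"ctrl" and "cpu-view" are invented names):

    /* Give the CPU a private root MR containing the controller's
     * upstream end, instead of get_system_memory(). */
    MemoryRegion *cpu_view = g_new(MemoryRegion, 1);
    memory_region_init(cpu_view, NULL, "cpu-view", UINT64_MAX);
    memory_region_add_subregion(cpu_view, 0,
                                MEMORY_REGION(&ctrl->upstream));
    object_property_set_link(cpuobj, OBJECT(cpu_view), "memory",
                             &error_abort);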

> Go into the target/arm code, find every instance of accesses to address
> spaces, and shift the target physical address accordingly. This seems ugly
> and unlikely to work.

That's very fragile, and I don't recommend it.

> Use overlapping subregions with differing priorities, as is done in QEMU's
> TrustZone implementation. However, these priorities would have to change on
> an environment context switch, and I don't know if that would lead to chaos.

You can model things this way too, yes (both by changing priorities, and by
simply enabling/disabling/mapping/unmapping memory regions). The main
advantages that using IOMMU regions gets you are:
 * you can do things at a finer granularity, say changing
   permissions at a page-by-page level, without creating a
   ton of MemoryRegion objects. Swapping MRs in and out would
   work if you only wanted to do it for big chunks of space at once
 * you can make the choice of what to do based on the memory
   transaction attributes (eg secure/nonsecure, user/privileged)
   rather than having to provide a single mapping only
If you really do only want to map a whole GB of RAM in and out at once,
the subregion approach might be simpler.

Note that for both the IOMMU approach and the MemoryRegion map/unmap
approach, changing the mapping will blow away the emulated CPU's
cached TLB entirely. So if you do it very often you'll see a
performance hit. (In the IOMMU case it might in theory be possible
to get some of that performance back by being cleverer in the core
memory subsystem code so as to only drop the bits of the TLB that
are affected; but if you're remapping all-of-RAM then that probably
covers all the interesting cached TLB entries anyhow.)

thanks
-- PMM


* Re: [Qemu-devel] Native Memory Virtualization in qemu-system-aarch64
  2018-07-18 17:58     ` Peter Maydell
@ 2018-07-24 21:23       ` Kevin Loughlin
  0 siblings, 0 replies; 5+ messages in thread
From: Kevin Loughlin @ 2018-07-24 21:23 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, qemu-arm, qemu-discuss, Mickens, James

Thanks! That was super helpful.

To confirm, support for IOMMU regions in the CPU's memory access path did
NOT exist prior to recent releases, correct? My QEMU version is 2.11, and I
believe you're up to 3.0 now. If that's the case, I may stick with the
"changing priorities" approach, since I know you've also updated the virt
board and refactored the system bus code since I branched. Additionally,
you correctly pointed out that I simply want to map a huge chunk of memory
in and out at once. However, the IOMMU solution does have the benefit of
being a more realistic approach than changing priorities.


