* kvm stage 2 mapping logic
From: Janne Karhunen @ 2020-03-31 11:17 UTC
  To: kvmarm

Hi,

I'm experimenting with KVM to see how it would work in co-existence
with a tiny external hypervisor that also runs the host in EL1/VMID 0.
More about this later on in case it turns out to be anything generally
useful, but I've been stuck for a few days now trying to understand
the KVM stage-2 (IPA-to-phys) mapping when the guest is being created.
Things I think I've understood so far:

- qemu mmaps the guest memory per the machine type (virt in my case)
- qemu pushes the machine's physical memory model into the kernel
through kvm_vm_ioctl_set_memory_region() (see the sketch after this
list)
- KVM has an MMU notifier block set up to listen for changes to these
regions, and it becomes active once the machine memory model arrives.
The MMU notifier calls handle_hva_to_gpa(), which dispatches the call
to the appropriate map or unmap handler, and these make the S2 mapping
changes for the VM as needed
- prior to starting the VM, kvm_arch_prepare_memory_region() gets a
chance to see whether any IO areas could be S2 mapped before the guest
is allowed to execute. This is mostly an optimization?
- the vcpu is started
- as pages are touched once the vcpu starts executing, page faults get
generated and the real S2 mappings slowly start to get created. The
LRU keeps the active pages resident in memory; the others get evicted
and their S2 mappings eventually disappear
- all in all, the VM runs and behaves pretty much like a normal
userspace process
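
For concreteness, here is a minimal sketch of the memslot registration
qemu effectively does for a RAM region. Untested, error handling
omitted, and the slot number, size and anonymous mmap backing are made
up for illustration:

#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

int main(void)
{
        int kvm_fd = open("/dev/kvm", O_RDWR);
        int vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, 0);

        /* Back 128M of guest RAM with anonymous host memory (the hva). */
        size_t size = 128UL << 20;
        void *hva = mmap(NULL, size, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        /* Tell KVM which slice of guest physical (ipa) space this host
         * mapping backs; no S2 entries are created at this point. */
        struct kvm_userspace_memory_region region = {
                .slot = 0,
                .flags = 0,
                .guest_phys_addr = 0x40000000,  /* virt RAM base */
                .memory_size = size,
                .userspace_addr = (unsigned long)hva,
        };
        return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}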

Is this roughly the story? If it is, I'm a bit lost as to where the
stage-2 page fault handler that is supposed to generate the S2
mappings lives. It was surprisingly easy to get the external
hypervisor (with very minimal changes to KVM) to the point where the
guest is entered and VMID 1 starts fetching instructions at the VM
RAM base (0x40000000 for virt). Those, of course, currently scream
bloody murder as the S2 mapping does not exist.


--
Janne

* Re: kvm stage 2 mapping logic
From: Marc Zyngier @ 2020-03-31 12:36 UTC
  To: Janne Karhunen; +Cc: kvmarm

On Tue, 31 Mar 2020 14:17:51 +0300
Janne Karhunen <janne.karhunen@gmail.com> wrote:

Hi Janne,

> Hi,
> 
> I'm experimenting with KVM to see how it would work in co-existence
> with a tiny external hypervisor that also runs the host in EL1/VMID 0.

Popular theme these days...

> More about this later on in case it turns out to be anything
> generally useful, but I've been stuck for a few days now trying to
> understand the KVM stage-2 (IPA-to-phys) mapping when the guest is
> being created. Things I think I've understood so far:
> 
> - qemu mmaps the guest memory per the machine type (virt in my case)
> - qemu pushes the machine's physical memory model into the kernel
> through kvm_vm_ioctl_set_memory_region()
> - KVM has an MMU notifier block set up to listen for changes to these
> regions, and it becomes active once the machine memory model arrives.
> The MMU notifier calls handle_hva_to_gpa(), which dispatches the call
> to the appropriate map or unmap handler, and these make the S2 mapping
> changes for the VM as needed

Note that these MMU notifiers only make sense when something happens on
the host: attribute change (for page aging, for example) or unmap
(e.g. page being swapped out).
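
For anyone tracing this: those notifier paths -- kvm_unmap_hva_range()
for unmaps, kvm_age_hva()/kvm_set_spte_hva() for aging and attribute
changes -- all funnel through the helper Janne mentioned. It is roughly
this shape, paraphrased from memory of virt/kvm/arm/mmu.c (not
verbatim):

static int handle_hva_to_gpa(struct kvm *kvm,
                             unsigned long start, unsigned long end,
                             int (*handler)(struct kvm *kvm, gpa_t gpa,
                                            u64 size, void *data),
                             void *data)
{
        struct kvm_memslots *slots = kvm_memslots(kvm);
        struct kvm_memory_slot *memslot;
        int ret = 0;

        /* The notifier hands us a host VA range; walk the memslots and
         * turn each overlapping chunk into a guest PA range. */
        kvm_for_each_memslot(memslot, slots) {
                unsigned long hva_start, hva_end;
                gpa_t gpa;

                hva_start = max(start, memslot->userspace_addr);
                hva_end = min(end, memslot->userspace_addr +
                                   (memslot->npages << PAGE_SHIFT));
                if (hva_start >= hva_end)
                        continue;

                gpa = hva_to_gfn_memslot(hva_start, memslot) << PAGE_SHIFT;
                ret |= handler(kvm, gpa, (u64)(hva_end - hva_start), data);
        }

        return ret;
}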

> - prior to starting the VM, kvm_arch_prepare_memory_region() gets a
> chance to see whether any IO areas could be S2 mapped before the guest
> is allowed to execute. This is mostly an optimization?

Yes, and not necessarily a useful one. I think I have a patch to drop
that.

> - the vcpu is started
> - as pages are touched once the vcpu starts executing, page faults get
> generated and the real S2 mappings slowly start to get created. The
> LRU keeps the active pages resident in memory; the others get evicted
> and their S2 mappings eventually disappear
> - all in all, the VM runs and behaves pretty much like a normal
> userspace process

Indeed, just with a different set of page tables.

> Is this roughly the story? If it is, I'm a bit lost as to where the
> stage-2 page fault handler that is supposed to generate the S2
> mappings lives.

user_mem_abort() is your friend (or not, it's a very nasty piece of
code). If you trace the fault handling path all the way from the EL2
vectors, you will eventually get there.
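
To save you some time, the rough shape of that trace, from memory
(simplified, and the details move around between kernel versions):

/*
 * Simplified call-path outline, not literal code -- see
 * arch/arm64/kvm/handle_exit.c and virt/kvm/arm/mmu.c for the real
 * thing:
 *
 *   guest access faults at stage 2, the EL2 vectors take the exception
 *     -> exit back into the kvm_arch_vcpu_ioctl_run() loop
 *     -> handle_exit(): ESR_EL2 reports a data/instruction abort
 *     -> kvm_handle_guest_abort():
 *          - the faulting IPA comes from HPFAR_EL2
 *          - no memslot backing it? off to MMIO emulation
 *          - otherwise:
 *     -> user_mem_abort():
 *          - gfn_to_pfn_prot() pulls the backing page into the host
 *          - a stage-2 PTE/PMD is built and installed under the
 *            mmu_lock (stage2_set_pte()/stage2_set_pmd_huge())
 *     -> the guest is re-entered and the faulting access is replayed
 */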

> It was surprisingly easy to get the external hypervisor (with very
> minimal changes to KVM) to the point where the guest is entered and
> VMID 1 starts fetching instructions at the VM RAM base (0x40000000
> for virt). Those, of course, currently scream bloody murder as the S2
> mapping does not exist.

Well, *something* must be handling the fault, right? But if you've
wrapped the host with its own S2 page tables, it may not be able to
populate the S2 pages for the guest (wild guess...).

Hope this helps,

	M.
-- 
Jazz is not dead. It just smells funny...

* Re: kvm stage 2 mapping logic
From: Janne Karhunen @ 2020-03-31 13:14 UTC
  To: Marc Zyngier; +Cc: kvmarm

On Tue, Mar 31, 2020 at 3:36 PM Marc Zyngier <maz@kernel.org> wrote:

> > I'm experimenting with KVM to see how it would work in co-existence
> > with a tiny external hypervisor that also runs the host in EL1/VMID 0.
>
> Popular theme these days...

Indeed. In my dream world I would end up with a config where the host
and the guest have very minimal understanding of each other. How small
that can be remains to be seen...


> > More about this later on in case it turns out to be anything
> > generally useful, but I've been stuck for a few days now trying to
> > understand the KVM stage-2 (IPA-to-phys) mapping when the guest is
> > being created. Things I think I've understood so far:
> >
> > - qemu mmaps the guest memory per the machine type (virt in my case)
> > - qemu pushes the machine's physical memory model into the kernel
> > through kvm_vm_ioctl_set_memory_region()
> > - KVM has an MMU notifier block set up to listen for changes to these
> > regions, and it becomes active once the machine memory model arrives.
> > The MMU notifier calls handle_hva_to_gpa(), which dispatches the call
> > to the appropriate map or unmap handler, and these make the S2 mapping
> > changes for the VM as needed
>
> Note that these MMU notifiers only make sense when something happens on
> the host: attribute change (for page aging, for example) or unmap
> (e.g. page being swapped out).

Yes. This fooled me for a while, as I thought it actually did the job,
but no. It was my second miss; the first place I looked was the ioctl
call itself.


> > - prior to starting the VM, kvm_arch_prepare_memory_region() gets a
> > chance to see whether any IO areas could be S2 mapped before the guest
> > is allowed to execute. This is mostly an optimization?
>
> Yes, and not necessarily a useful one. I think I have a patch to drop
> that.

Ack.


> > - the vcpu is started
> > - as pages are touched once the vcpu starts executing, page faults get
> > generated and the real S2 mappings slowly start to get created. The
> > LRU keeps the active pages resident in memory; the others get evicted
> > and their S2 mappings eventually disappear
> > - all in all, the VM runs and behaves pretty much like a normal
> > userspace process
>
> Indeed, just with a different set of page tables.

Awesome. Took a while to understand the construction.


> > Is this roughly the story? If it is, I'm a bit lost as to where the
> > stage-2 page fault handler that is supposed to generate the S2
> > mappings lives.
>
> user_mem_abort() is your friend (or not, it's a very nasty piece of
> code). If you trace the fault handling path all the way from the EL2
> vectors, you will eventually get there.

THANK YOU! My missing piece.


> > It was surprisingly easy to get the external hypervisor (with very
> > minimal changes to KVM) to the point where the guest is entered and
> > VMID 1 starts fetching instructions at the VM RAM base (0x40000000
> > for virt). Those, of course, currently scream bloody murder as the
> > S2 mapping does not exist.
>
> Well, *something* must be handling the fault, right? But if you've
> wrapped the host with its own S2 page tables, it may not be able to
> populate the S2 pages for the guest (wild guess...).

Let's see. Yes, the 'virt' host is already nicely wrapped and running.
To make this easier for starters, I use a 1:1 IPA:phys mapping to get
one VM (besides the host) going.


--
Janne