Re: Demand paging for VM on KVM

* Re: Demand paging for VM on KVM
       [not found] <CAJMTq5=LXMp2jBaxPMBWX_3-+RC5j98n=Nz8TRe3AXFwRY1Beg@mail.gmail.com>
@ 2014-03-20 13:18 ` Paolo Bonzini
  2014-03-20 17:32   ` Andrea Arcangeli
  0 siblings, 1 reply; 4+ messages in thread
From: Paolo Bonzini @ 2014-03-20 13:18 UTC (permalink / raw)
  To: Grigory Makarevich, kvm, gleb; +Cc: Eric Northup, Andrea Arcangeli

Il 20/03/2014 00:27, Grigory Makarevich ha scritto:
> Hi All,
>
> I have been exploring different ways to implement on-demand paging for
> VMs running in KVM.
>
> The core of the idea is to introduce an additional exit
>  KVM_EXIT_MEMORY_NOT_PRESENT to inform VMM's user space to process
> access to "not yet present" guest's page.
> Each memory slot may be instructed to keep track of ondemand bit per
> page. If the page is marked as "ondemand", page fault  will generate
> exit to the host's
> user-space with the information about the faulting page. Once the page
> is filled, VMM instructs the KVM to clear "ondemand" bit for the page.
>
> I have working prototype and would like to consider upstreaming
> corresponding KVM changes.
>
> To start up the discussion before sending the actual patch-set, I'd like
> to send the patch for the kvm's api.txt.  Please, let me know what you
> think.

Hi, Andrea Arcangeli is considering a similar infrastructure at the 
generic mm level.  Last time I discussed it with him, his idea was 
roughly to have:

* a "userfaultfd" syscall that would take a memory range and return a 
file descriptor; the file descriptor becomes readable when the first 
access happens on a page in the region, and the read gives the address 
of the access.  Any thread that accesses a still-unmapped region remains 
blocked until the address of the faulting page is written back to the 
userfaultfd, or gets a SIGBUS if the userfaultfd is closed.

* a remap_anon_pages syscall that would be used in the userfaultfd I/O 
handler to make the page accessible.  The handler would build the page 
in a "shadow" area with the actual contents of guest memory, and then 
remap the shadow area onto the actual guest memory.

Andrea, please correct me.

QEMU would use this infrastructure for post-copy migration and possibly 
also for live snapshotting of the guests.  The advantage in making this 
generic rather than KVM-based is that QEMU could use it also in 
system-emulation mode (and of course anything else needing a read 
barrier could use it too).

Paolo

^ permalink raw reply	[flat|nested] 4+ messages in thread