* [Qemu-devel] Migration without memory page transfer
From: Eric Wheeler @ 2018-04-26 23:33 UTC
  To: qemu-devel

Hello all,

This is my first time inside of the qemu code, so your help is greatly 
appreciated!

I have been experimenting with stop/start of VMs to/from a migration 
stream that excludes RAM pages and lets the RAM pages come from the 
memory file provided by the memory-backend-file at '/dev/shm/mem'.

To disable writing of memory pages to the migration stream, I've disabled 
calls to ram_find_and_save_block in ram_save_iterate() and 
ram_save_complete() (see patch below).  Thus, the migration stream has the 
"ram" SaveStateEntry section start/ends, but no pages:

qemu-system-x86_64 \
	-object memory-backend-file,prealloc=no,mem-path=/dev/shm/mem,id=ram-node0,host-nodes=0,policy=bind,size=64m,share=on \
	-numa node,nodeid=0,cpus=0,memdev=ram-node0\
	-m 64 -vnc 0:0

Once the VM is running, I press ctrl-B to get the IPXE prompt and then 
run 'kernel http://192.168.0.1/foo' to start a network request and watch 
it in tcpdump.

Once the download starts, I save the migration file:
	migrate "exec:cat > /dev/shm/t"

	# ls -lh /dev/shm/t
	-rw-r--r-- 1 root root 321K Apr 26 16:06 /dev/shm/t

Now I can kill qemu and boot it again with -incoming:

qemu-system-x86_64 \
	-object memory-backend-file,prealloc=no,mem-path=/dev/shm/mem,id=ram-node0,host-nodes=0,policy=bind,size=64m,share=on \
	-numa node,nodeid=0,cpus=0,memdev=ram-node0\
	-m 64 -vnc 0:0 \
	-incoming 'exec:cat /dev/shm/t'

It seems to work.  That is, network traffic continues (http from IPXE) 
which I can see from tcpdump.  I can type into the console and it moves 
the cursor around---but there is nothing on the screen except the blinking 
text-mode cursor!  I can even blindly start a new transfer in ipxe: kernel 
http://192.168.0.222/foo2 and see it in tcpdump.

So what am I missing here?  Is the video memory not saved to /dev/shm/mem?

Or perhaps it is saved, but VGA isn't initialized to use what is 
already in /dev/shm/mem?  I've tried the cirrus, std, and vmware drivers 
to see if they behave differently, but they do not seem to.

Thanks for your help!

--
Eric Wheeler


diff --git a/migration/ram.c b/migration/ram.c
index 021d583..9f4bfff 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2267,9 +2267,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
     t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     i = 0;
     while ((ret = qemu_file_rate_limit(f)) == 0) {
-        int pages;
+        int pages = 0;
 
-        pages = ram_find_and_save_block(rs, false);
+        if (0) pages = ram_find_and_save_block(rs, false);
         /* no more pages to sent */
         if (pages == 0) {
             done = 1;
@@ -2338,9 +2338,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 
     /* flush all remaining blocks regardless of rate limiting */
     while (true) {
-        int pages;
+        int pages = 0;
 
-        pages = ram_find_and_save_block(rs, !migration_in_colo_state());
+        if (0) pages = ram_find_and_save_block(rs, !migration_in_colo_state());
         /* no more blocks to sent */
         if (pages == 0) {
             break;


* Re: [Qemu-devel] Migration without memory page transfer
From: Peter Xu @ 2018-04-27  2:45 UTC
  To: Eric Wheeler; +Cc: qemu-devel

On Thu, Apr 26, 2018 at 11:33:53PM +0000, Eric Wheeler wrote:
> Hello all,

Hi, Eric,

> 
> This is my first time inside of the qemu code, so your help is greatly 
> appreciated!
> 
> I have been experimenting with stop/start of VMs to/from a migration 
> stream that excludes RAM pages and lets the RAM pages come from the 
> memory file provided by the memory-backend-file at '/dev/shm/mem'.
> 
> To disable writing of memory pages to the migration stream, I've disabled 
> calls to ram_find_and_save_block in ram_save_iterate() and 
> ram_save_complete() (see patch below).  Thus, the migration stream has the 
> "ram" SaveStateEntry section start/ends, but no pages:
> 
> qemu-system-x86_64 \
> 	-object memory-backend-file,prealloc=no,mem-path=/dev/shm/mem,id=ram-node0,host-nodes=0,policy=bind,size=64m,share=on \
> 	-numa node,nodeid=0,cpus=0,memdev=ram-node0\
> 	-m 64 -vnc 0:0
> 
> Once the VM is running, I press ctrl-B to get the IPXE prompt and then 
> run 'kernel http://192.168.0.1/foo' to start a network request and watch 
> it in tcpdump.
> 
> Once the download starts, I save the migration file:
> 	migrate "exec:cat > /dev/shm/t"
> 
> 	# ls -lh /dev/shm/t
> 	-rw-r--r-- 1 root root 321K Apr 26 16:06 /dev/shm/t
> 
> Now I can kill qemu and boot it again with -incoming:
> 
> qemu-system-x86_64 \
> 	-object memory-backend-file,prealloc=no,mem-path=/dev/shm/mem,id=ram-node0,host-nodes=0,policy=bind,size=64m,share=on \
> 	-numa node,nodeid=0,cpus=0,memdev=ram-node0\
> 	-m 64 -vnc 0:0 \
> 	-incoming 'exec:cat /dev/shm/t'
> 
> It seems to work.  That is, network traffic continues (http from IPXE) 
> which I can see from tcpdump.  I can type into the console and it moves 
> the cursor around---but there is nothing on the screen except the blinking 
> text-mode cursor!  I can even blindly start a new transfer in ipxe: kernel 
> http://192.168.0.222/foo2 and see it in tcpdump.
> 
> So what am I missing here?  Is the video memory not saved to /dev/shm/mem?
> 
> Or perhaps it is saved, but VGA isn't initialized to use what is 
> already in /dev/shm/mem?  I've tried the cirrus, std, and vmware drivers 
> to see if they behave differently, but they do not seem to.

My wild guess is that we might still need to migrate some RAM besides
the /dev/shm/mem file.  We have at least these ramblocks to migrate:

$ ./x86_64-softmmu/qemu-system-x86_64 -monitor stdio -m 2G                                                                           
QEMU 2.12.0 monitor - type 'help' for more information
(qemu) info ramblock
              Block Name    PSize              Offset               Used              Total
                  pc.ram    4 KiB  0x0000000000000000 0x0000000080000000 0x0000000080000000
                vga.vram    4 KiB  0x0000000080080000 0x0000000001000000 0x0000000001000000
    /rom@etc/acpi/tables    4 KiB  0x0000000081100000 0x0000000000020000 0x0000000000200000
                 pc.bios    4 KiB  0x0000000080000000 0x0000000000040000 0x0000000000040000
  0000:00:03.0/e1000.rom    4 KiB  0x00000000810c0000 0x0000000000040000 0x0000000000040000
                  pc.rom    4 KiB  0x0000000080040000 0x0000000000020000 0x0000000000020000
    0000:00:02.0/vga.rom    4 KiB  0x0000000081080000 0x0000000000010000 0x0000000000010000
   /rom@etc/table-loader    4 KiB  0x0000000081300000 0x0000000000001000 0x0000000000001000
      /rom@etc/acpi/rsdp    4 KiB  0x0000000081340000 0x0000000000001000 0x0000000000001000

And my understanding is that /dev/shm/mem only corresponds to the
"pc.ram" entry.  I suspect the rest of the RAMBlocks will still need to
be migrated, for example the VGA RAM (vga.vram).
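
If the goal is to keep skipping pc.ram but still send everything else,
then instead of disabling ram_find_and_save_block() completely, a more
selective skip in the RAM save path is probably closer to what you
want.  Just an untested sketch (assuming the 2.12 helpers
RAMBLOCK_FOREACH() and qemu_ram_is_shared() behave as their names
suggest):

    RAMBlock *block;

    RAMBLOCK_FOREACH(block) {
        /* Skip only the blocks that are backed by a shared mapping
         * (e.g. your memory-backend-file with share=on); vga.vram,
         * pc.bios, the ROMs and the rest still go into the stream. */
        if (qemu_ram_is_shared(block)) {
            continue;
        }
        /* ... save this block's dirty pages as usual ... */
    }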

Meanwhile, could I ask where this will be used?  Does it have
anything to do with something like a "distributed memory cache" that
provides memory services across multiple hosts?

Best Regards,

> 
> Thanks for your help!
> 
> --
> Eric Wheeler
> 
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 021d583..9f4bfff 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2267,9 +2267,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      i = 0;
>      while ((ret = qemu_file_rate_limit(f)) == 0) {
> -        int pages;
> +        int pages = 0;
>  
> -        pages = ram_find_and_save_block(rs, false);
> +        if (0) pages = ram_find_and_save_block(rs, false);
>          /* no more pages to sent */
>          if (pages == 0) {
>              done = 1;
> @@ -2338,9 +2338,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  
>      /* flush all remaining blocks regardless of rate limiting */
>      while (true) {
> -        int pages;
> +        int pages = 0;
>  
> -        pages = ram_find_and_save_block(rs, !migration_in_colo_state());
> +        if (0) pages = ram_find_and_save_block(rs, !migration_in_colo_state());
>          /* no more blocks to sent */
>          if (pages == 0) {
>              break;
> 
> 

-- 
Peter Xu


* Re: [Qemu-devel] Migration without memory page transfer
From: Dr. David Alan Gilbert @ 2018-04-27  9:24 UTC
  To: Eric Wheeler; +Cc: qemu-devel, peterx

* Eric Wheeler (qemu-devel@lists.ewheeler.net) wrote:
> Hello all,

Hi Eric,

> This is my first time inside of the qemu code, so your help is greatly 
> appreciated!
> 
> I have been experimenting with stop/start of VMs to/from a migration 
> stream that excludes RAM pages and lets the RAM pages come from the 
> memory file provided by the memory-backend-file at '/dev/shm/mem'.
> 
> To disable writing of memory pages to the migration stream, I've disabled 
> calls to ram_find_and_save_block in ram_save_iterate() and 
> ram_save_complete() (see patch below).  Thus, the migration stream has the 
> "ram" SaveStateEntry section start/ends, but no pages:

You're in luck, because someone else has just done something very
similar.

> qemu-system-x86_64 \
> 	-object memory-backend-file,prealloc=no,mem-path=/dev/shm/mem,id=ram-node0,host-nodes=0,policy=bind,size=64m,share=on \
> 	-numa node,nodeid=0,cpus=0,memdev=ram-node0\
> 	-m 64 -vnc 0:0
> 
> Once the VM is running, I press ctrl-B to get the IPXE prompt and then 
> run 'kernel http://192.168.0.1/foo' to start a network request and watch 
> it in tcpdump.
> 
> Once the download starts, I save the migration file:
> 	migrate "exec:cat > /dev/shm/t"
> 
> 	# ls -lh /dev/shm/t
> 	-rw-r--r-- 1 root root 321K Apr 26 16:06 /dev/shm/t
> 
> Now I can kill qemu and boot it again with -incoming:
> 
> qemu-system-x86_64 \
> 	-object memory-backend-file,prealloc=no,mem-path=/dev/shm/mem,id=ram-node0,host-nodes=0,policy=bind,size=64m,share=on \
> 	-numa node,nodeid=0,cpus=0,memdev=ram-node0\
> 	-m 64 -vnc 0:0 \
> 	-incoming 'exec:cat /dev/shm/t'
> 
> It seems to work.  That is, network traffic continues (http from IPXE) 
> which I can see from tcpdump.  I can type into the console and it moves 
> the cursor around---but there is nothing on the screen except the blinking 
> text-mode cursor!  I can even blindly start a new transfer in ipxe: kernel 
> http://192.168.0.222/foo2 and see it in tcpdump.
> 
> So what am I missing here?  Is the video memory not saved to /dev/shm/mem?
> 
> Or perhaps it is saved, but VGA isn't initialized to use what is 
> already in /dev/shm/mem?  I've tried the cirrus, std, and vmware drivers 
> to see if they behave differently, but they do not seem to.

The video devices have their own RAMBlocks, separate from the main RAM,
so those still need to be migrated.

The patch here:
   http://lists.gnu.org/archive/html/qemu-devel/2018-04/msg00003.html

should do what you want.
You need to turn on the bypass-shared-memory capability:
   migrate_set_capability bypass-shared-memory on
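
On the source side the whole flow would then look something like this
(just a sketch, assuming that patch is applied to your tree):

   migrate_set_capability bypass-shared-memory on
   migrate "exec:cat > /dev/shm/t"

and you restart with the same share=on memory-backend-file and
-incoming exactly as you already do.  With the capability set only the
shared pc.ram block is skipped, so vga.vram and the other RAMBlocks
still travel in the stream.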

Dave


> Thanks for your help!


> --
> Eric Wheeler
> 
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 021d583..9f4bfff 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2267,9 +2267,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      i = 0;
>      while ((ret = qemu_file_rate_limit(f)) == 0) {
> -        int pages;
> +        int pages = 0;
>  
> -        pages = ram_find_and_save_block(rs, false);
> +        if (0) pages = ram_find_and_save_block(rs, false);
>          /* no more pages to sent */
>          if (pages == 0) {
>              done = 1;
> @@ -2338,9 +2338,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  
>      /* flush all remaining blocks regardless of rate limiting */
>      while (true) {
> -        int pages;
> +        int pages = 0;
>  
> -        pages = ram_find_and_save_block(rs, !migration_in_colo_state());
> +        if (0) pages = ram_find_and_save_block(rs, !migration_in_colo_state());
>          /* no more blocks to sent */
>          if (pages == 0) {
>              break;
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] Migration without memory page transfer
From: Eric Wheeler @ 2018-05-05 20:30 UTC
  To: Peter Xu; +Cc: qemu-devel

On Fri, 27 Apr 2018, Peter Xu wrote:

> On Thu, Apr 26, 2018 at 11:33:53PM +0000, Eric Wheeler wrote:
> > Hello all,
> 
> Hi, Eric,
> 
> > 
> > This is my first time inside of the qemu code, so your help is greatly 
> > appreciated!
> > 
> > I have been experimenting with stop/start of VMs to/from a migration 
> > stream that excludes RAM pages and lets the RAM pages come from the 
> > memory file provided by the memory-backend-file at '/dev/shm/mem'.
> > 
> > To disable writing of memory pages to the migration stream, I've disabled 
> > calls to ram_find_and_save_block in ram_save_iterate() and 
> > ram_save_complete() (see patch below).  Thus, the migration stream has the 
> > "ram" SaveStateEntry section start/ends, but no pages:
> > 
> > qemu-system-x86_64 \
> > 	-object memory-backend-file,prealloc=no,mem-path=/dev/shm/mem,id=ram-node0,host-nodes=0,policy=bind,size=64m,share=on \
> > 	-numa node,nodeid=0,cpus=0,memdev=ram-node0\
> > 	-m 64 -vnc 0:0
> > 
> > Once the VM is running, I press ctrl-B to get the IPXE prompt and then 
> > run 'kernel http://192.168.0.1/foo' to start a network request and watch 
> > it in tcpdump.
> > 
> > Once the download starts, I save the migration file:
> > 	migrate "exec:cat > /dev/shm/t"
> > 
> > 	# ls -lh /dev/shm/t
> > 	-rw-r--r-- 1 root root 321K Apr 26 16:06 /dev/shm/t
> > 
> > Now I can kill qemu and boot it again with -incoming:
> > 
> > qemu-system-x86_64 \
> > 	-object memory-backend-file,prealloc=no,mem-path=/dev/shm/mem,id=ram-node0,host-nodes=0,policy=bind,size=64m,share=on \
> > 	-numa node,nodeid=0,cpus=0,memdev=ram-node0\
> > 	-m 64 -vnc 0:0 \
> > 	-incoming 'exec:cat /dev/shm/t'
> > 
> > It seems to work.  That is, network traffic continues (http from IPXE) 
> > which I can see from tcpdump.  I can type into the console and it moves 
> > the cursor around---but there is nothing on the screen except the blinking 
> > text-mode cursor!  I can even blindly start a new transfer in ipxe: kernel 
> > http://192.168.0.222/foo2 and see it in tcpdump.
> > 
> > So what am I missing here?  Is the video memory not saved to /dev/shm/mem?
> > 
> > Or perhaps it is saved, but VGA isn't initialized to use what is 
> > already in /dev/shm/mem?  I've tried the cirrus, std, and vmware drivers 
> > to see if they behave differently, but they do not seem to.
> 
> My wild guess is that we might still need to migrate some RAM besides
> the /dev/shm/mem file.  We have at least these ramblocks to migrate:
> 
> $ ./x86_64-softmmu/qemu-system-x86_64 -monitor stdio -m 2G                                                                           
> QEMU 2.12.0 monitor - type 'help' for more information
> (qemu) info ramblock
>               Block Name    PSize              Offset               Used              Total
>                   pc.ram    4 KiB  0x0000000000000000 0x0000000080000000 0x0000000080000000
>                 vga.vram    4 KiB  0x0000000080080000 0x0000000001000000 0x0000000001000000
>     /rom@etc/acpi/tables    4 KiB  0x0000000081100000 0x0000000000020000 0x0000000000200000
>                  pc.bios    4 KiB  0x0000000080000000 0x0000000000040000 0x0000000000040000
>   0000:00:03.0/e1000.rom    4 KiB  0x00000000810c0000 0x0000000000040000 0x0000000000040000
>                   pc.rom    4 KiB  0x0000000080040000 0x0000000000020000 0x0000000000020000
>     0000:00:02.0/vga.rom    4 KiB  0x0000000081080000 0x0000000000010000 0x0000000000010000
>    /rom@etc/table-loader    4 KiB  0x0000000081300000 0x0000000000001000 0x0000000000001000
>       /rom@etc/acpi/rsdp    4 KiB  0x0000000081340000 0x0000000000001000 0x0000000000001000

Yes, that makes sense!

> 
> And my understanding is that /dev/shm/mem only corresponds to the
> "pc.ram" entry.  I suspect the rest of the RAMBlocks will still need to
> be migrated, for example the VGA RAM (vga.vram).

The patch Dr. David Alan Gilbert mentioned is exactly what I was looking 
for:
  https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg02250.html

> Meanwhile, could I ask where this will be used?  Does it have
> anything to do with something like a "distributed memory cache" that
> provides memory services across multiple hosts?

I'm mostly interested in restoring quickly without a memory dump, and 
possibly in implementing a "fork()" to quickly clone VMs.  Remote memory 
would also be neat.
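
Roughly what I have in mind for the clone case, reusing the commands
from above (hypothetical file names, and ignoring details like the
clone's duplicate MAC/IP for now):

	# snapshot the shared memory plus the (RAM-less) device state
	cp /dev/shm/mem /dev/shm/mem-clone
	cp /dev/shm/t   /dev/shm/t-clone

	# resume a second instance from the copies
	qemu-system-x86_64 \
		-object memory-backend-file,prealloc=no,mem-path=/dev/shm/mem-clone,id=ram-node0,host-nodes=0,policy=bind,size=64m,share=on \
		-numa node,nodeid=0,cpus=0,memdev=ram-node0 \
		-m 64 -vnc 0:1 \
		-incoming 'exec:cat /dev/shm/t-clone'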

Do you remember OpenMOSIX?  I wonder if it would be possible to launch 
qemu instances on different nodes such that each instance is a remote NUMA 
node.  Cache coherency would need to be worked out, and it might require 
an OS port to handle new synchronization primitives---but if there were a 
way to do it without modifying the OS, then you could create really big 
single-system-image NUMA servers.


--
Eric Wheeler



> 
> Best Regards,
> 
> > 
> > Thanks for your help!
> > 
> > --
> > Eric Wheeler
> > 
> > 
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 021d583..9f4bfff 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -2267,9 +2267,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
> >      t0 = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> >      i = 0;
> >      while ((ret = qemu_file_rate_limit(f)) == 0) {
> > -        int pages;
> > +        int pages = 0;
> >  
> > -        pages = ram_find_and_save_block(rs, false);
> > +        if (0) pages = ram_find_and_save_block(rs, false);
> >          /* no more pages to sent */
> >          if (pages == 0) {
> >              done = 1;
> > @@ -2338,9 +2338,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> >  
> >      /* flush all remaining blocks regardless of rate limiting */
> >      while (true) {
> > -        int pages;
> > +        int pages = 0;
> >  
> > -        pages = ram_find_and_save_block(rs, !migration_in_colo_state());
> > +        if (0) pages = ram_find_and_save_block(rs, !migration_in_colo_state());
> >          /* no more blocks to sent */
> >          if (pages == 0) {
> >              break;
> > 
> > 
> 
> -- 
> Peter Xu
> 
