* [Qemu-devel] NVDIMM live migration broken?
@ 2017-06-22 14:08 Stefan Hajnoczi
  2017-06-23  0:13 ` haozhong.zhang
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2017-06-22 14:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: haozhong.zhang, Xiao Guangrong

I tried live migrating a guest with NVDIMM on qemu.git/master (edf8bc984):

  $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
         -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
	 -device nvdimm,id=nvdimm1,memdev=mem1 \
	 -drive if=virtio,file=test.img,format=raw

  $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
         -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
	 -device nvdimm,id=nvdimm1,memdev=mem1 \
	 -drive if=virtio,file=test.img,format=raw \
	 -incoming tcp::1234

  (qemu) migrate tcp:127.0.0.1:1234

The guest kernel panics or hangs every time on the destination.  It
happens as long as the nvdimm device is present - I didn't even mount it
inside the guest.

Is migration expected to work?

If not we need a migration blocker so that users get a graceful error
message.

Stefan


* Re: [Qemu-devel] NVDIMM live migration broken?
  2017-06-22 14:08 [Qemu-devel] NVDIMM live migration broken? Stefan Hajnoczi
@ 2017-06-23  0:13 ` haozhong.zhang
  2017-06-23  9:55   ` Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: haozhong.zhang @ 2017-06-23  0:13 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Xiao Guangrong

On 06/22/17 15:08 +0100, Stefan Hajnoczi wrote:
> I tried live migrating a guest with NVDIMM on qemu.git/master (edf8bc984):
> 
>   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
>          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> 	 -drive if=virtio,file=test.img,format=raw
> 
>   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
>          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> 	 -drive if=virtio,file=test.img,format=raw \
> 	 -incoming tcp::1234
> 
>   (qemu) migrate tcp:127.0.0.1:1234
> 
> The guest kernel panics or hangs every time on the destination.  It
> happens as long as the nvdimm device is present - I didn't even mount it
> inside the guest.
> 
> Is migration expected to work?

Yes, I tested on QEMU 2.8.0 several months ago and it worked. I'll
have a look at this issue.

Haozhong

> 
> If not we need a migration blocker so that users get a graceful error
> message.
> 
> Stefan


* Re: [Qemu-devel] NVDIMM live migration broken?
  2017-06-23  0:13 ` haozhong.zhang
@ 2017-06-23  9:55   ` Stefan Hajnoczi
  2017-06-26  2:05     ` Haozhong Zhang
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2017-06-23  9:55 UTC (permalink / raw)
  To: qemu-devel, Xiao Guangrong

On Fri, Jun 23, 2017 at 08:13:13AM +0800, haozhong.zhang@intel.com wrote:
> On 06/22/17 15:08 +0100, Stefan Hajnoczi wrote:
> > I tried live migrating a guest with NVDIMM on qemu.git/master (edf8bc984):
> > 
> >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > 	 -drive if=virtio,file=test.img,format=raw
> > 
> >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > 	 -drive if=virtio,file=test.img,format=raw \
> > 	 -incoming tcp::1234
> > 
> >   (qemu) migrate tcp:127.0.0.1:1234
> > 
> > The guest kernel panics or hangs every time on the destination.  It
> > happens as long as the nvdimm device is present - I didn't even mount it
> > inside the guest.
> > 
> > Is migration expected to work?
> 
> Yes, I tested on QEMU 2.8.0 several months ago and it worked. I'll
> have a look at this issue.

Great, thanks!

David Gilbert suggested the following on IRC, it sounds like a good
starting point for debugging:

Launch the destination QEMU with -S (vcpus will be paused) and after
migration has completed, compare the NVDIMM contents on source and
destination.
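
A minimal standalone sketch of that comparison step (the dump file names
and the 4 KiB page granularity below are assumptions, not anything QEMU
dictates): dump the guest-physical range on both sides, for example with
the HMP "pmemsave" command at the address "info memory-devices" reports
for the NVDIMM, then compare the dumps page by page so it is obvious
which pages were not migrated:

  /* diffpages.c - report which 4 KiB pages differ between two dumps. */
  #include <stdio.h>
  #include <string.h>

  #define PAGE_SIZE 4096

  int main(int argc, char **argv)
  {
      unsigned char pa[PAGE_SIZE], pb[PAGE_SIZE];
      unsigned long page = 0, bad = 0;
      size_t na, nb;
      FILE *a, *b;

      if (argc != 3) {
          fprintf(stderr, "usage: %s src.dump dst.dump\n", argv[0]);
          return 1;
      }
      a = fopen(argv[1], "rb");
      b = fopen(argv[2], "rb");
      if (!a || !b) {
          perror("fopen");
          return 1;
      }
      for (;;) {
          na = fread(pa, 1, PAGE_SIZE, a);
          nb = fread(pb, 1, PAGE_SIZE, b);
          if (na == 0 && nb == 0) {
              break;            /* both dumps exhausted */
          }
          if (na != nb || memcmp(pa, pb, na) != 0) {
              printf("page %lu (offset 0x%lx) differs\n",
                     page, page * (unsigned long)PAGE_SIZE);
              bad++;
          }
          page++;
      }
      printf("%lu of %lu pages differ\n", bad, page);
      fclose(a);
      fclose(b);
      return bad ? 2 : 0;
  }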

Stefan


* Re: [Qemu-devel] NVDIMM live migration broken?
  2017-06-23  9:55   ` Stefan Hajnoczi
@ 2017-06-26  2:05     ` Haozhong Zhang
  2017-06-26 12:56       ` Stefan Hajnoczi
  0 siblings, 1 reply; 9+ messages in thread
From: Haozhong Zhang @ 2017-06-26  2:05 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, Xiao Guangrong

On 06/23/17 10:55 +0100, Stefan Hajnoczi wrote:
> On Fri, Jun 23, 2017 at 08:13:13AM +0800, haozhong.zhang@intel.com wrote:
> > On 06/22/17 15:08 +0100, Stefan Hajnoczi wrote:
> > > I tried live migrating a guest with NVDIMM on qemu.git/master (edf8bc984):
> > > 
> > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > 	 -drive if=virtio,file=test.img,format=raw
> > > 
> > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > 	 -drive if=virtio,file=test.img,format=raw \
> > > 	 -incoming tcp::1234
> > > 
> > >   (qemu) migrate tcp:127.0.0.1:1234
> > > 
> > > The guest kernel panics or hangs every time on the destination.  It
> > > happens as long as the nvdimm device is present - I didn't even mount it
> > > inside the guest.
> > > 
> > > Is migration expected to work?
> > 
> > Yes, I tested on QEMU 2.8.0 several months ago and it worked. I'll
> > have a look at this issue.
> 
> Great, thanks!
> 
> David Gilbert suggested the following on IRC, it sounds like a good
> starting point for debugging:
> 
> Launch the destination QEMU with -S (vcpus will be paused) and after
> migration has completed, compare the NVDIMM contents on source and
> destination.
> 

Which host and guest kernels are you testing? Is any workload running
in the guest during migration?

I just tested QEMU commit edf8bc984 with host/guest kernel 4.8.0, and
could not reproduce the issue.

Haozhong


* Re: [Qemu-devel] NVDIMM live migration broken?
  2017-06-26  2:05     ` Haozhong Zhang
@ 2017-06-26 12:56       ` Stefan Hajnoczi
  2017-06-27 14:30         ` Haozhong Zhang
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Hajnoczi @ 2017-06-26 12:56 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel, Xiao Guangrong

On Mon, Jun 26, 2017 at 10:05:01AM +0800, Haozhong Zhang wrote:
> On 06/23/17 10:55 +0100, Stefan Hajnoczi wrote:
> > On Fri, Jun 23, 2017 at 08:13:13AM +0800, haozhong.zhang@intel.com wrote:
> > > On 06/22/17 15:08 +0100, Stefan Hajnoczi wrote:
> > > > I tried live migrating a guest with NVDIMM on qemu.git/master (edf8bc984):
> > > > 
> > > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > > 	 -drive if=virtio,file=test.img,format=raw
> > > > 
> > > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > > 	 -drive if=virtio,file=test.img,format=raw \
> > > > 	 -incoming tcp::1234
> > > > 
> > > >   (qemu) migrate tcp:127.0.0.1:1234
> > > > 
> > > > The guest kernel panics or hangs every time on the destination.  It
> > > > happens as long as the nvdimm device is present - I didn't even mount it
> > > > inside the guest.
> > > > 
> > > > Is migration expected to work?
> > > 
> > > Yes, I tested on QEMU 2.8.0 several months ago and it worked. I'll
> > > have a look at this issue.
> > 
> > Great, thanks!
> > 
> > David Gilbert suggested the following on IRC, it sounds like a good
> > starting point for debugging:
> > 
> > Launch the destination QEMU with -S (vcpus will be paused) and after
> > migration has completed, compare the NVDIMM contents on source and
> > destination.
> > 
> 
> Which host and guest kernels are you testing? Is any workload running
> in the guest during migration?
> 
> I just tested QEMU commit edf8bc984 with host/guest kernel 4.8.0, and
> could not reproduce the issue.

I can still reproduce the problem on qemu.git edf8bc984.

My guest kernel is fairly close to yours.  The host kernel is newer.

Host kernel: 4.11.6-201.fc25.x86_64
Guest kernel: 4.8.8-300.fc25.x86_64

Command-line:

  qemu-system-x86_64 \
      -enable-kvm \
      -cpu host \
      -machine pc,nvdimm \
      -m 1G,slots=4,maxmem=8G \
      -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
      -device nvdimm,id=nvdimm1,memdev=mem1 \
      -drive if=virtio,file=test.img,format=raw \
      -display none \
      -serial stdio \
      -monitor unix:/tmp/monitor.sock,server,nowait

Start migration at the guest login prompt.  You don't need to log in or
do anything inside the guest.

There seems to be a guest RAM corruption because I get different
backtraces inside the guest every time.

The problem goes away if I remove -device nvdimm.

Here is an example backtrace:

[   28.577138] BUG: Bad rss-counter state mm:ffff9a21fd38aec0 idx:0 val:2605
[   28.577954] BUG: Bad rss-counter state mm:ffff9a21fd38aec0 idx:1 val:503
[   28.578646] BUG: non-zero nr_ptes on freeing mm: 73
[   28.579133] BUG: non-zero nr_pmds on freeing mm: 4
[   28.579932] BUG: unable to handle kernel paging request at ffff9a2100000000
[   28.581174] IP: [<ffffffffbe227723>] __kmalloc+0xc3/0x1f0
[   28.582015] PGD 3327c067 PUD 0 
[   28.582549] Oops: 0000 [#1] SMP
[   28.583032] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_raw ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_security iptable_raw iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_security ebtable_filter ebtables ip6table_filter ip6_tables bochs_drm ttm drm_kms_helper snd_pcsp dax_pmem nd_pmem crct10dif_pclmul dax nd_btt crc32_pclmul ppdev snd_pcm ghash_clmulni_intel drm e1000 snd_timer snd soundcore acpi_cpufreq joydev i2c_piix4 tpm_tis parport_pc tpm_tis_core parport qemu_fw_cfg tpm nfit xfs libcrc32c virtio_blk crc32c_intel virtio_pci serio_raw virtio_ring virtio ata_generic pata_acpi
[   28.592394] CPU: 0 PID: 573 Comm: systemd-journal Not tainted 4.8.8-300.fc25.x86_64 #1
[   28.593124] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
[   28.594208] task: ffff9a21f67e5b80 task.stack: ffff9a21fd0c0000
[   28.594752] RIP: 0010:[<ffffffffbe227723>]  [<ffffffffbe227723>] __kmalloc+0xc3/0x1f0
[   28.595485] RSP: 0018:ffff9a21fd0c3740  EFLAGS: 00010046
[   28.595976] RAX: ffff9a2100000000 RBX: 0000000002080020 RCX: 000000000000007f
[   28.596644] RDX: 0000000000010bf2 RSI: 0000000000000000 RDI: 000000000001c980
[   28.597311] RBP: ffff9a21fd0c3770 R08: ffff9a21ffc1c980 R09: 0000000002080020
[   28.597971] R10: ffff9a2100000000 R11: 0000000000000008 R12: 0000000002080020
[   28.598637] R13: 0000000000000030 R14: ffff9a21fe0018c0 R15: ffff9a21fe0018c0
[   28.599301] FS:  00007fd95ae4c700(0000) GS:ffff9a21ffc00000(0000) knlGS:0000000000000000
[   28.600050] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.600587] CR2: ffff9a2100000000 CR3: 000000003715f000 CR4: 00000000003406f0
[   28.601250] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   28.601908] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   28.602574] Stack:
[   28.602754]  ffffffffc03dde4d 0000000000000003 ffff9a21fd0c38e0 000000000000001c
[   28.603493]  ffff9a21f6cfb000 ffff9a21fd0c38c8 ffff9a21fd0c3788 ffffffffc03dde4d
[   28.604217]  0000000000000003 ffff9a21fd0c3800 ffffffffc03de043 ffff9a21fd0c38c8
[   28.604942] Call Trace:
[   28.605185]  [<ffffffffc03dde4d>] ? alloc_indirect.isra.14+0x1d/0x50 [virtio_ring]
[   28.605890]  [<ffffffffc03dde4d>] alloc_indirect.isra.14+0x1d/0x50 [virtio_ring]
[   28.606561]  [<ffffffffc03de043>] virtqueue_add_sgs+0x1c3/0x4a0 [virtio_ring]
[   28.607086]  [<ffffffffc040165c>] __virtblk_add_req+0xbc/0x220 [virtio_blk]
[   28.607614]  [<ffffffffbe3fbb3d>] ? find_next_zero_bit+0x1d/0x20
[   28.608060]  [<ffffffffbe3c2e57>] ? __bt_get.isra.6+0xd7/0x1c0
[   28.608506]  [<ffffffffc040195d>] virtio_queue_rq+0x12d/0x290 [virtio_blk]
[   28.609013]  [<ffffffffbe3c06b3>] __blk_mq_run_hw_queue+0x233/0x380
[   28.609565]  [<ffffffffbe3b2101>] ? blk_run_queue+0x21/0x40
[   28.610087]  [<ffffffffbe3c045b>] blk_mq_run_hw_queue+0x8b/0xb0
[   28.610649]  [<ffffffffbe3c1926>] blk_sq_make_request+0x216/0x4d0
[   28.611225]  [<ffffffffbe3b5782>] generic_make_request+0xf2/0x1d0
[   28.611796]  [<ffffffffbe3b58dd>] submit_bio+0x7d/0x150
[   28.612297]  [<ffffffffbe1c6797>] ? __test_set_page_writeback+0x107/0x220
[   28.612952]  [<ffffffffc045b644>] xfs_submit_ioend.isra.14+0x84/0xd0 [xfs]
[   28.613617]  [<ffffffffc045bbfe>] xfs_do_writepage+0x26e/0x5f0 [xfs]
[   28.614219]  [<ffffffffbe1c8425>] write_cache_pages+0x205/0x530
[   28.614789]  [<ffffffffc045b990>] ? xfs_aops_discard_page+0x140/0x140 [xfs]
[   28.615460]  [<ffffffffc045b73b>] xfs_vm_writepages+0xab/0xd0 [xfs]
[   28.616052]  [<ffffffffbe1c940e>] do_writepages+0x1e/0x30
[   28.616569]  [<ffffffffbe1ba5c6>] __filemap_fdatawrite_range+0xc6/0x100
[   28.617192]  [<ffffffffbe1ba741>] filemap_write_and_wait_range+0x41/0x90
[   28.617832]  [<ffffffffc0465c23>] xfs_file_fsync+0x63/0x1d0 [xfs]
[   28.618415]  [<ffffffffbe285289>] vfs_fsync_range+0x49/0xa0
[   28.618940]  [<ffffffffbe28533d>] do_fsync+0x3d/0x70
[   28.619411]  [<ffffffffbe2855d0>] SyS_fsync+0x10/0x20
[   28.619887]  [<ffffffffbe003c57>] do_syscall_64+0x67/0x160
[   28.620410]  [<ffffffffbe802861>] entry_SYSCALL64_slow_path+0x25/0x25
[   28.621017] Code: 49 83 78 10 00 4d 8b 10 0f 84 ce 00 00 00 4d 85 d2 0f 84 c5 00 00 00 49 63 47 20 49 8b 3f 4c 01 d0 40 f6 c7 0f 0f 85 1a 01 00 00 <48> 8b 18 48 8d 4a 01 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 74 
[   28.623292] RIP  [<ffffffffbe227723>] __kmalloc+0xc3/0x1f0
[   28.623712]  RSP <ffff9a21fd0c3740>
[   28.623975] CR2: ffff9a2100000000
[   28.624275] ---[ end trace 60d3c1e57c22eb41 ]---


* Re: [Qemu-devel] NVDIMM live migration broken?
  2017-06-26 12:56       ` Stefan Hajnoczi
@ 2017-06-27 14:30         ` Haozhong Zhang
  2017-06-27 16:58           ` Juan Quintela
  2017-06-28 10:05           ` Stefan Hajnoczi
  0 siblings, 2 replies; 9+ messages in thread
From: Haozhong Zhang @ 2017-06-27 14:30 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Stefan Hajnoczi, qemu-devel, Xiao Guangrong

On 06/26/17 13:56 +0100, Stefan Hajnoczi wrote:
> On Mon, Jun 26, 2017 at 10:05:01AM +0800, Haozhong Zhang wrote:
> > On 06/23/17 10:55 +0100, Stefan Hajnoczi wrote:
> > > On Fri, Jun 23, 2017 at 08:13:13AM +0800, haozhong.zhang@intel.com wrote:
> > > > On 06/22/17 15:08 +0100, Stefan Hajnoczi wrote:
> > > > > I tried live migrating a guest with NVDIMM on qemu.git/master (edf8bc984):
> > > > > 
> > > > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > > > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > > > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > > > 	 -drive if=virtio,file=test.img,format=raw
> > > > > 
> > > > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > > > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > > > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > > > 	 -drive if=virtio,file=test.img,format=raw \
> > > > > 	 -incoming tcp::1234
> > > > > 
> > > > >   (qemu) migrate tcp:127.0.0.1:1234
> > > > > 
> > > > > The guest kernel panics or hangs every time on the destination.  It
> > > > > happens as long as the nvdimm device is present - I didn't even mount it
> > > > > inside the guest.
> > > > > 
> > > > > Is migration expected to work?
> > > > 
> > > > Yes, I tested on QEMU 2.8.0 several months ago and it worked. I'll
> > > > have a look at this issue.
> > > 
> > > Great, thanks!
> > > 
> > > David Gilbert suggested the following on IRC, it sounds like a good
> > > starting point for debugging:
> > > 
> > > Launch the destination QEMU with -S (vcpus will be paused) and after
> > > migration has completed, compare the NVDIMM contents on source and
> > > destination.
> > > 
> > 
> > Which host and guest kernels are you testing? Is any workload running
> > in the guest during migration?
> > 
> > I just tested QEMU commit edf8bc984 with host/guest kernel 4.8.0, and
> > could not reproduce the issue.
> 
> I can still reproduce the problem on qemu.git edf8bc984.
> 
> My guest kernel is fairly close to yours.  The host kernel is newer.
> 
> Host kernel: 4.11.6-201.fc25.x86_64
> Guest kernel: 4.8.8-300.fc25.x86_64
> 
> Command-line:
> 
>   qemu-system-x86_64 \
>       -enable-kvm \
>       -cpu host \
>       -machine pc,nvdimm \
>       -m 1G,slots=4,maxmem=8G \
>       -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
>       -device nvdimm,id=nvdimm1,memdev=mem1 \
>       -drive if=virtio,file=test.img,format=raw \
>       -display none \
>       -serial stdio \
>       -monitor unix:/tmp/monitor.sock,server,nowait
> 
> Start migration at the guest login prompt.  You don't need to log in or
> do anything inside the guest.
> 
> There seems to be a guest RAM corruption because I get different
> backtraces inside the guest every time.
> 
> The problem goes away if I remove -device nvdimm.
> 

I managed to reproduce this bug. After bisecting between good v2.8.0 and
bad edf8bc984, it looks like a regression introduced by
    6b6712efccd "ram: Split dirty bitmap by RAMBlock".
This commit may result in a guest crash after migration if any host
memory backend is used.

Could you test whether the attached draft patch fixes this bug? If yes,
I will make a formal patch later.

Thanks,
Haozhong

[-- Attachment #2: migration-fix.patch --]
[-- Type: text/plain, Size: 1654 bytes --]

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 73d1bea8b6..2ae4ff3965 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -377,7 +377,9 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
                                                uint64_t *real_dirty_pages)
 {
     ram_addr_t addr;
+    ram_addr_t offset = rb->offset;
     unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);
+    unsigned long dirty_page = BIT_WORD((start + offset) >> TARGET_PAGE_BITS);
     uint64_t num_dirty = 0;
     unsigned long *dest = rb->bmap;
 
@@ -386,8 +388,9 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
         int k;
         int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
         unsigned long * const *src;
-        unsigned long idx = (page * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
-        unsigned long offset = BIT_WORD((page * BITS_PER_LONG) %
+        unsigned long idx = (dirty_page * BITS_PER_LONG) /
+                            DIRTY_MEMORY_BLOCK_SIZE;
+        unsigned long offset = BIT_WORD((dirty_page * BITS_PER_LONG) %
                                         DIRTY_MEMORY_BLOCK_SIZE);
 
         rcu_read_lock();
@@ -416,7 +419,7 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
     } else {
         for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
             if (cpu_physical_memory_test_and_clear_dirty(
-                        start + addr,
+                        start + addr + offset,
                         TARGET_PAGE_SIZE,
                         DIRTY_MEMORY_MIGRATION)) {
                 *real_dirty_pages += 1;


* Re: [Qemu-devel] NVDIMM live migration broken?
  2017-06-27 14:30         ` Haozhong Zhang
@ 2017-06-27 16:58           ` Juan Quintela
  2017-06-27 18:12             ` Juan Quintela
  2017-06-28 10:05           ` Stefan Hajnoczi
  1 sibling, 1 reply; 9+ messages in thread
From: Juan Quintela @ 2017-06-27 16:58 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Stefan Hajnoczi, qemu-devel, Xiao Guangrong

Haozhong Zhang <haozhong.zhang@intel.com> wrote:

....

Hi

I am trying to see what is going on.

>> 
>
> I managed to reproduce this bug. After bisecting between good v2.8.0 and
> bad edf8bc984, it looks like a regression introduced by
>     6b6712efccd "ram: Split dirty bitmap by RAMBlock".
> This commit may result in a guest crash after migration if any host
> memory backend is used.
>
> Could you test whether the attached draft patch fixes this bug? If yes,
> I will make a formal patch later.
>
> Thanks,
> Haozhong
>
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 73d1bea8b6..2ae4ff3965 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -377,7 +377,9 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>                                                 uint64_t *real_dirty_pages)
>  {
>      ram_addr_t addr;
> +    ram_addr_t offset = rb->offset;
>      unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);
> +    unsigned long dirty_page = BIT_WORD((start + offset) >> TARGET_PAGE_BITS);
>      uint64_t num_dirty = 0;
>      unsigned long *dest = rb->bmap;
>  


If this is the case, I can't understand how it ever worked :-(

Investigating.

Later, Juan.

> @@ -386,8 +388,9 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>          int k;
>          int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
>          unsigned long * const *src;
> -        unsigned long idx = (page * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
> -        unsigned long offset = BIT_WORD((page * BITS_PER_LONG) %
> +        unsigned long idx = (dirty_page * BITS_PER_LONG) /
> +                            DIRTY_MEMORY_BLOCK_SIZE;
> +        unsigned long offset = BIT_WORD((dirty_page * BITS_PER_LONG) %
>                                          DIRTY_MEMORY_BLOCK_SIZE);
>  
>          rcu_read_lock();
> @@ -416,7 +419,7 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>      } else {
>          for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
>              if (cpu_physical_memory_test_and_clear_dirty(
> -                        start + addr,
> +                        start + addr + offset,
>                          TARGET_PAGE_SIZE,
>                          DIRTY_MEMORY_MIGRATION)) {
>                  *real_dirty_pages += 1;


* Re: [Qemu-devel] NVDIMM live migration broken?
  2017-06-27 16:58           ` Juan Quintela
@ 2017-06-27 18:12             ` Juan Quintela
  0 siblings, 0 replies; 9+ messages in thread
From: Juan Quintela @ 2017-06-27 18:12 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Stefan Hajnoczi, qemu-devel, Xiao Guangrong

Juan Quintela <quintela@redhat.com> wrote:
> Haozhong Zhang <haozhong.zhang@intel.com> wrote:
>
> ....
>
> Hi
>
> I am trying to see what is going on.
>
>>> 
>>
>> I managed to reproduce this bug. After bisecting between good v2.8.0 and
>> bad edf8bc984, it looks like a regression introduced by
>>     6b6712efccd "ram: Split dirty bitmap by RAMBlock".
>> This commit may result in a guest crash after migration if any host
>> memory backend is used.
>>
>> Could you test whether the attached draft patch fixes this bug? If yes,
>> I will make a formal patch later.
>>
>> Thanks,
>> Haozhong
>>
>> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
>> index 73d1bea8b6..2ae4ff3965 100644
>> --- a/include/exec/ram_addr.h
>> +++ b/include/exec/ram_addr.h
>> @@ -377,7 +377,9 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>>                                                 uint64_t *real_dirty_pages)
>>  {
>>      ram_addr_t addr;
>> +    ram_addr_t offset = rb->offset;
>>      unsigned long page = BIT_WORD(start >> TARGET_PAGE_BITS);
>> +    unsigned long dirty_page = BIT_WORD((start + offset) >> TARGET_PAGE_BITS);
>>      uint64_t num_dirty = 0;
>>      unsigned long *dest = rb->bmap;
>>  
>
>
> If this is the case, I can't understand how it ever worked :-(
>
> Investigating.

Further investigation shows:
- pc.ram is, by default, the RAMBlock at offset 0
- so offset == 0
- the remaining RAMBlocks belong to devices, not guest RAM

So it worked well: in practice only guest RAM matters to that function,
so the missing offset was never noticed.

When an nvdimm device is used (I don't know whether any other device does
the same), pc.ram is pushed out of offset 0, and then the offset becomes
important.

# No NVDIMM

(qemu) info ramblock 
              Block Name    PSize              Offset               Used              Total
                  pc.ram    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
                vga.vram    4 KiB  0x0000000040060000 0x0000000000400000 0x0000000000400000

# with NVDIMM

(qemu) info ramblock 
              Block Name    PSize              Offset               Used              Total
           /objects/mem1    4 KiB  0x0000000000000000 0x0000000040000000 0x0000000040000000
                  pc.ram    4 KiB  0x0000000040000000 0x0000000040000000 0x0000000040000000
                vga.vram    4 KiB  0x0000000080060000 0x0000000000400000 0x0000000000400000
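
As a standalone sketch of the index math (not QEMU code; the offsets are
the ones from the listing above and TARGET_PAGE_BITS is assumed to be 12,
i.e. 4 KiB pages), this is where the unfixed lookup for pc.ram lands:

  #include <stdio.h>
  #include <inttypes.h>

  #define TARGET_PAGE_BITS 12

  int main(void)
  {
      /* Offset as shown by "info ramblock" with the NVDIMM present. */
      uint64_t pc_ram_offset = 0x0000000040000000ULL;
      uint64_t start = 0;  /* first byte of pc.ram, block-relative */

      /* Bit the global dirty bitmap really uses for this page ...   */
      uint64_t correct = (start + pc_ram_offset) >> TARGET_PAGE_BITS;
      /* ... and the bit the pre-fix code looked at instead.         */
      uint64_t buggy = start >> TARGET_PAGE_BITS;

      printf("pc.ram page 0: dirty bit 0x%" PRIx64
             ", bit read before the fix 0x%" PRIx64 "\n", correct, buggy);
      /* Bit 0 belongs to /objects/mem1, so pc.ram's dirty pages are
       * never seen during sync and the destination gets stale RAM.  */
      return 0;
  }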


I am still amused/confused about how we never discovered the problem
before.

The patch fixes the problem described in the thread.


Later, Juan.

>
> Later, Juan.
>
>> @@ -386,8 +388,9 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>>          int k;
>>          int nr = BITS_TO_LONGS(length >> TARGET_PAGE_BITS);
>>          unsigned long * const *src;
>> -        unsigned long idx = (page * BITS_PER_LONG) / DIRTY_MEMORY_BLOCK_SIZE;
>> -        unsigned long offset = BIT_WORD((page * BITS_PER_LONG) %
>> +        unsigned long idx = (dirty_page * BITS_PER_LONG) /
>> +                            DIRTY_MEMORY_BLOCK_SIZE;
>> +        unsigned long offset = BIT_WORD((dirty_page * BITS_PER_LONG) %
>>                                          DIRTY_MEMORY_BLOCK_SIZE);
>>  
>>          rcu_read_lock();
>> @@ -416,7 +419,7 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
>>      } else {
>>          for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
>>              if (cpu_physical_memory_test_and_clear_dirty(
>> -                        start + addr,
>> +                        start + addr + offset,
>>                          TARGET_PAGE_SIZE,
>>                          DIRTY_MEMORY_MIGRATION)) {
>>                  *real_dirty_pages += 1;


* Re: [Qemu-devel] NVDIMM live migration broken?
  2017-06-27 14:30         ` Haozhong Zhang
  2017-06-27 16:58           ` Juan Quintela
@ 2017-06-28 10:05           ` Stefan Hajnoczi
  1 sibling, 0 replies; 9+ messages in thread
From: Stefan Hajnoczi @ 2017-06-28 10:05 UTC (permalink / raw)
  To: Stefan Hajnoczi, qemu-devel, Xiao Guangrong

[-- Attachment #1: Type: text/plain, Size: 3665 bytes --]

On Tue, Jun 27, 2017 at 10:30:01PM +0800, Haozhong Zhang wrote:
> On 06/26/17 13:56 +0100, Stefan Hajnoczi wrote:
> > On Mon, Jun 26, 2017 at 10:05:01AM +0800, Haozhong Zhang wrote:
> > > On 06/23/17 10:55 +0100, Stefan Hajnoczi wrote:
> > > > On Fri, Jun 23, 2017 at 08:13:13AM +0800, haozhong.zhang@intel.com wrote:
> > > > > On 06/22/17 15:08 +0100, Stefan Hajnoczi wrote:
> > > > > > I tried live migrating a guest with NVDIMM on qemu.git/master (edf8bc984):
> > > > > > 
> > > > > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > > > > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > > > > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > > > > 	 -drive if=virtio,file=test.img,format=raw
> > > > > > 
> > > > > >   $ qemu -M accel=kvm,nvdimm=on -m 1G,slots=4,maxmem=8G -cpu host \
> > > > > >          -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> > > > > > 	 -device nvdimm,id=nvdimm1,memdev=mem1 \
> > > > > > 	 -drive if=virtio,file=test.img,format=raw \
> > > > > > 	 -incoming tcp::1234
> > > > > > 
> > > > > >   (qemu) migrate tcp:127.0.0.1:1234
> > > > > > 
> > > > > > The guest kernel panics or hangs every time on the destination.  It
> > > > > > happens as long as the nvdimm device is present - I didn't even mount it
> > > > > > inside the guest.
> > > > > > 
> > > > > > Is migration expected to work?
> > > > > 
> > > > > Yes, I tested on QEMU 2.8.0 several months ago and it worked. I'll
> > > > > have a look at this issue.
> > > > 
> > > > Great, thanks!
> > > > 
> > > > David Gilbert suggested the following on IRC, it sounds like a good
> > > > starting point for debugging:
> > > > 
> > > > Launch the destination QEMU with -S (vcpus will be paused) and after
> > > > migration has completed, compare the NVDIMM contents on source and
> > > > destination.
> > > > 
> > > 
> > > Which host and guest kernels are you testing? Is any workload running
> > > in the guest during migration?
> > > 
> > > I just tested QEMU commit edf8bc984 with host/guest kernel 4.8.0, and
> > > could not reproduce the issue.
> > 
> > I can still reproduce the problem on qemu.git edf8bc984.
> > 
> > My guest kernel is fairly close to yours.  The host kernel is newer.
> > 
> > Host kernel: 4.11.6-201.fc25.x86_64
> > Guest kernel: 4.8.8-300.fc25.x86_64
> > 
> > Command-line:
> > 
> >   qemu-system-x86_64 \
> >       -enable-kvm \
> >       -cpu host \
> >       -machine pc,nvdimm \
> >       -m 1G,slots=4,maxmem=8G \
> >       -object memory-backend-file,id=mem1,share=on,mem-path=nvdimm.dat,size=1G \
> >       -device nvdimm,id=nvdimm1,memdev=mem1 \
> >       -drive if=virtio,file=test.img,format=raw \
> >       -display none \
> >       -serial stdio \
> >       -monitor unix:/tmp/monitor.sock,server,nowait
> > 
> > Start migration at the guest login prompt.  You don't need to log in or
> > do anything inside the guest.
> > 
> > There seems to be a guest RAM corruption because I get different
> > backtraces inside the guest every time.
> > 
> > The problem goes away if I remove -device nvdimm.
> > 
> 
> I managed to reproduce this bug. After bisecting between good v2.8.0 and
> bad edf8bc984, it looks like a regression introduced by
>     6b6712efccd "ram: Split dirty bitmap by RAMBlock".
> This commit may result in a guest crash after migration if any host
> memory backend is used.
> 
> Could you test whether the attached draft patch fixes this bug? If yes,
> I will make a formal patch later.

Thanks for the fix!  I tested and replied to your v2 patch.

Stefan

