All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Li, Liang Z" <liang.z.li@intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	Amit Shah <amit.shah@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: post-copy is broken?
Date: Mon, 18 Apr 2016 18:18:15 +0100	[thread overview]
Message-ID: <20160418171814.GH2222@work-vm> (raw)
In-Reply-To: <20160418132338.GG2222@work-vm>

* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> * Li, Liang Z (liang.z.li@intel.com) wrote:
> > > > > > > > > Interesting; it's failing reliably for me - but only with a
> > > > > > > > > reasonably freshly booted machine (so that the pages get THPd).
> > > > > > > >
> > > > > > > > The same here. Freshly booted machine with 64GiB ram. I've
> > > > > > > > checked
> > > > > > > > /proc/vmstat: huge pages were allocated
> > > > > > >
> > > > > > > Thanks for testing.
> > > > > > >
> > > > > > > Damn; this is confusing now.  I've got a RHEL7 box with
> > > > > > > 4.6.0-rc3 on where it works, and a fedora24 VM where it fails
> > > > > > > (the f24 VM is where I did the bisect so it works fine with the
> > > > > > > older kernel on the f24
> > > > > userspace in that VM).
> > > > > > >
> > > > > > > So lets see:
> > > > > > >    works: Kirill's (64GB machine)
> > > > > > >           Dave's RHEL7 host (24GB RAM, dual xeon, RHEL7
> > > > > > > userspace and kernel
> > > > > > > config)
> > > > > > >    fails: Dave's f24 VM (4GB RAM, 4 vcpus VM on my laptop24
> > > > > > > userspace and kernel config)
> > > > > > >
> > > > > > > So it's any of userspace, kernel config, machine hardware or hmm.
> > > > > > >
> > > > > > > My f24 box has transparent_hugepage_madvise, where my rhel7 has
> > > > > > > transparent_hugepage_always (but still works if I flip it to
> > > > > > > madvise at run time).  I'll try and get the configs closer together.
> > > > > > >
> > > > > > > Liang Li: Can you run my test on your setup which fails the
> > > > > > > migrate and tell me what your userspace is?
> > > > > > >
> > > > > > > (If you've not built my test yet, you might find you need to add a :
> > > > > > >    tests/postcopy-test$(EXESUF): tests/postcopy-test.o
> > > > > > >
> > > > > > >   to the tests/Makefile)
> > > > > > >
> > > > > >
> > > > > > Hi Dave,
> > > > > >
> > > > > >   How to build and run you test? I didn't do that before.
> > > > >
> > > > > Apply the code in:
> > > > > http://lists.gnu.org/archive/html/qemu-devel/2016-04/msg02138.html
> > > > >
> > > > > fix the:
> > > > > +            if ( ((b + 1) % 255) == last_byte && !hit_edge) {
> > > > > to:
> > > > > +            if ( ((b + 1) % 256) == last_byte && !hit_edge) {
> > > > >
> > > > > to tests/Makefile
> > > > >    tests/postcopy-test$(EXESUF): tests/postcopy-test.o
> > > > >
> > > > > and do a:
> > > > >     make check
> > > > >
> > > > > in qemu.
> > > > > Then you can rerun the test with:
> > > > >     QTEST_QEMU_BINARY=path/to/qemu-system-
> > > x86_64 ./tests/postcopy-
> > > > > test
> > > > >
> > > > > if it works, reboot and check it still works from a fresh boot.
> > > > >
> > > > > Can you describe the system which your full test failed on? What
> > > > > distro on the host? What type of host was it tested on?
> > > > >
> > > > > Dave
> > > > >
> > > >
> > > >
> > > > Thanks, Dave
> > > >
> > > > The host is CenOS7, its original kernel is 3.10.0-327.el7.x86_64
> > > > (CentOS 7.1?), The hardware platform is HSW-EP with 64GB RAM.
> > > 
> > > OK, so your test fails on real hardware; my guess is that my test will work on
> > > there.
> > > Can you try your test with THP disabled on the host:
> > > 
> > > echo never > /sys/kernel/mm/transparent_hugepage/enabled
> > > 
> > 
> > If the THP is disabled, no fails.
> > And your test was always passed, even when  real post-copy was failed. 
> > 
> > In my env, the output of 
> > 'cat /sys/kernel/mm/transparent_hugepage/enabled'  is:
> > 
> >  [always] ...
> 
> OK, I can't get my test to fail on real hardware - only in a VM; but my
> suspicion is we're looking at the same bug; both of them it goes away
> if we disable THP, both of them work on 4.4.x and fail on 4.5.x.
> I'd love to be able to find a nice easy test to be able to give to Andrea
> and Kirill
> 
> I've also just confirmed that running (in a VM) a fedora-24 4.5.0 kernel
> with a fedora-23 userspace (qemu built under f23) still fails with my test.
> So the problem there is definitely triggered by the newer kernel not
> the newer userspace.

OK, some more results - I *can* get it to fail on real hardware - it's just
really really rare, and the failure is slightly different than in the nest.

I'm using the following magic:
count=0; while true; do count=$(($count+1)); echo 3 >/proc/sys/vm/drop_caches; echo >/proc/sys/vm/compact_memory; echo "Iteration $count"; QTEST_QEMU_BINARY=./bin/qemu-system-x86_64 ./tests/postcopy-test || break; done

I've had about 4 failures out of about 5000 runs (ouch);

On the real hardware the failure addresses are always 2MB aligned, even though
other than the start address, everything in the test is 4K page based - so again
this is pointing the finger at THP:

/x86_64/postcopy: Memory content inconsistency at 4200000 first_byte = 48 last_byte = 47 current = 1 hit_edge = 1
postcopy-test: /root/git/qemu/tests/postcopy-test.c:274: check_guests_ram: Assertion `0' failed.
/x86_64/postcopy: Memory content inconsistency at 4200000 first_byte = e last_byte = d current = 9b hit_edge = 1
postcopy-test: /root/git/qemu/tests/postcopy-test.c:274: check_guests_ram: Assertion `0' failed.
/x86_64/postcopy: Memory content inconsistency at 4800000 first_byte = 19 last_byte = 18 current = 1 hit_edge = 1
postcopy-test: /root/git/qemu/tests/postcopy-test.c:274: check_guests_ram: Assertion `0' failed.
/x86_64/postcopy: Memory content inconsistency at 5e00000 first_byte = d6 last_byte = d5 current = 1 hit_edge = 1
postcopy-test: /root/git/qemu/tests/postcopy-test.c:274: check_guests_ram: Assertion `0' failed.

(My test host for the real hardware is 2x E5-2640 v3 running fedora 24)

where as in the VM I'm seeing immediate failures with addresses just on any 4k alignment.

You can run a couple in parallel; but if your load is too high the test will fail with an
assertion (postcopy-test.c:196 ...(qdict_haskey(rsp, "return")) - but that's
my test - so don't worry if you hit that; decreasing the migrate_speed_set value should
avoid that if you're hitting it repeatedly.

(Could this be something like a missing TLB flush?)

Dave


> 
> Dave
> 
> > 
> > Liang
> > 
> > > Dave
> > > 
> > > >
> > > >
> > > > > >
> > > > > > Thanks!
> > > > > > Liang
> > > > > >
> > > > > > >
> > > > > > > Dave
> > > > > > > >
> > > > > > > > --
> > > > > > > >  Kirill A. Shutemov
> > > > > > > --
> > > > > > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > > > > --
> > > > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: "Li, Liang Z" <liang.z.li@intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	Amit Shah <amit.shah@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"quintela@redhat.com" <quintela@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [Qemu-devel] post-copy is broken?
Date: Mon, 18 Apr 2016 18:18:15 +0100	[thread overview]
Message-ID: <20160418171814.GH2222@work-vm> (raw)
In-Reply-To: <20160418132338.GG2222@work-vm>

* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> * Li, Liang Z (liang.z.li@intel.com) wrote:
> > > > > > > > > Interesting; it's failing reliably for me - but only with a
> > > > > > > > > reasonably freshly booted machine (so that the pages get THPd).
> > > > > > > >
> > > > > > > > The same here. Freshly booted machine with 64GiB ram. I've
> > > > > > > > checked
> > > > > > > > /proc/vmstat: huge pages were allocated
> > > > > > >
> > > > > > > Thanks for testing.
> > > > > > >
> > > > > > > Damn; this is confusing now.  I've got a RHEL7 box with
> > > > > > > 4.6.0-rc3 on where it works, and a fedora24 VM where it fails
> > > > > > > (the f24 VM is where I did the bisect so it works fine with the
> > > > > > > older kernel on the f24
> > > > > userspace in that VM).
> > > > > > >
> > > > > > > So lets see:
> > > > > > >    works: Kirill's (64GB machine)
> > > > > > >           Dave's RHEL7 host (24GB RAM, dual xeon, RHEL7
> > > > > > > userspace and kernel
> > > > > > > config)
> > > > > > >    fails: Dave's f24 VM (4GB RAM, 4 vcpus VM on my laptop24
> > > > > > > userspace and kernel config)
> > > > > > >
> > > > > > > So it's any of userspace, kernel config, machine hardware or hmm.
> > > > > > >
> > > > > > > My f24 box has transparent_hugepage_madvise, where my rhel7 has
> > > > > > > transparent_hugepage_always (but still works if I flip it to
> > > > > > > madvise at run time).  I'll try and get the configs closer together.
> > > > > > >
> > > > > > > Liang Li: Can you run my test on your setup which fails the
> > > > > > > migrate and tell me what your userspace is?
> > > > > > >
> > > > > > > (If you've not built my test yet, you might find you need to add a :
> > > > > > >    tests/postcopy-test$(EXESUF): tests/postcopy-test.o
> > > > > > >
> > > > > > >   to the tests/Makefile)
> > > > > > >
> > > > > >
> > > > > > Hi Dave,
> > > > > >
> > > > > >   How to build and run you test? I didn't do that before.
> > > > >
> > > > > Apply the code in:
> > > > > http://lists.gnu.org/archive/html/qemu-devel/2016-04/msg02138.html
> > > > >
> > > > > fix the:
> > > > > +            if ( ((b + 1) % 255) == last_byte && !hit_edge) {
> > > > > to:
> > > > > +            if ( ((b + 1) % 256) == last_byte && !hit_edge) {
> > > > >
> > > > > to tests/Makefile
> > > > >    tests/postcopy-test$(EXESUF): tests/postcopy-test.o
> > > > >
> > > > > and do a:
> > > > >     make check
> > > > >
> > > > > in qemu.
> > > > > Then you can rerun the test with:
> > > > >     QTEST_QEMU_BINARY=path/to/qemu-system-
> > > x86_64 ./tests/postcopy-
> > > > > test
> > > > >
> > > > > if it works, reboot and check it still works from a fresh boot.
> > > > >
> > > > > Can you describe the system which your full test failed on? What
> > > > > distro on the host? What type of host was it tested on?
> > > > >
> > > > > Dave
> > > > >
> > > >
> > > >
> > > > Thanks, Dave
> > > >
> > > > The host is CenOS7, its original kernel is 3.10.0-327.el7.x86_64
> > > > (CentOS 7.1?), The hardware platform is HSW-EP with 64GB RAM.
> > > 
> > > OK, so your test fails on real hardware; my guess is that my test will work on
> > > there.
> > > Can you try your test with THP disabled on the host:
> > > 
> > > echo never > /sys/kernel/mm/transparent_hugepage/enabled
> > > 
> > 
> > If the THP is disabled, no fails.
> > And your test was always passed, even when  real post-copy was failed. 
> > 
> > In my env, the output of 
> > 'cat /sys/kernel/mm/transparent_hugepage/enabled'  is:
> > 
> >  [always] ...
> 
> OK, I can't get my test to fail on real hardware - only in a VM; but my
> suspicion is we're looking at the same bug; both of them it goes away
> if we disable THP, both of them work on 4.4.x and fail on 4.5.x.
> I'd love to be able to find a nice easy test to be able to give to Andrea
> and Kirill
> 
> I've also just confirmed that running (in a VM) a fedora-24 4.5.0 kernel
> with a fedora-23 userspace (qemu built under f23) still fails with my test.
> So the problem there is definitely triggered by the newer kernel not
> the newer userspace.

OK, some more results - I *can* get it to fail on real hardware - it's just
really really rare, and the failure is slightly different than in the nest.

I'm using the following magic:
count=0; while true; do count=$(($count+1)); echo 3 >/proc/sys/vm/drop_caches; echo >/proc/sys/vm/compact_memory; echo "Iteration $count"; QTEST_QEMU_BINARY=./bin/qemu-system-x86_64 ./tests/postcopy-test || break; done

I've had about 4 failures out of about 5000 runs (ouch);

On the real hardware the failure addresses are always 2MB aligned, even though
other than the start address, everything in the test is 4K page based - so again
this is pointing the finger at THP:

/x86_64/postcopy: Memory content inconsistency at 4200000 first_byte = 48 last_byte = 47 current = 1 hit_edge = 1
postcopy-test: /root/git/qemu/tests/postcopy-test.c:274: check_guests_ram: Assertion `0' failed.
/x86_64/postcopy: Memory content inconsistency at 4200000 first_byte = e last_byte = d current = 9b hit_edge = 1
postcopy-test: /root/git/qemu/tests/postcopy-test.c:274: check_guests_ram: Assertion `0' failed.
/x86_64/postcopy: Memory content inconsistency at 4800000 first_byte = 19 last_byte = 18 current = 1 hit_edge = 1
postcopy-test: /root/git/qemu/tests/postcopy-test.c:274: check_guests_ram: Assertion `0' failed.
/x86_64/postcopy: Memory content inconsistency at 5e00000 first_byte = d6 last_byte = d5 current = 1 hit_edge = 1
postcopy-test: /root/git/qemu/tests/postcopy-test.c:274: check_guests_ram: Assertion `0' failed.

(My test host for the real hardware is 2x E5-2640 v3 running fedora 24)

where as in the VM I'm seeing immediate failures with addresses just on any 4k alignment.

You can run a couple in parallel; but if your load is too high the test will fail with an
assertion (postcopy-test.c:196 ...(qdict_haskey(rsp, "return")) - but that's
my test - so don't worry if you hit that; decreasing the migrate_speed_set value should
avoid that if you're hitting it repeatedly.

(Could this be something like a missing TLB flush?)

Dave


> 
> Dave
> 
> > 
> > Liang
> > 
> > > Dave
> > > 
> > > >
> > > >
> > > > > >
> > > > > > Thanks!
> > > > > > Liang
> > > > > >
> > > > > > >
> > > > > > > Dave
> > > > > > > >
> > > > > > > > --
> > > > > > > >  Kirill A. Shutemov
> > > > > > > --
> > > > > > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > > > > --
> > > > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> > > --
> > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  reply	other threads:[~2016-04-18 17:18 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-12  9:23 [Qemu-devel] post-copy is broken? Li, Liang Z
2016-04-12  9:37 ` Dr. David Alan Gilbert
2016-04-12 17:55 ` Dr. David Alan Gilbert
2016-04-13  0:43   ` Li, Liang Z
2016-04-13  2:28     ` Li, Liang Z
2016-04-13  8:06       ` Dr. David Alan Gilbert
2016-04-13 11:41         ` Dr. David Alan Gilbert
2016-04-13 12:50           ` Dr. David Alan Gilbert
2016-04-13 20:51             ` Andrea Arcangeli
2016-04-14 10:13               ` Dr. David Alan Gilbert
2016-04-14 12:34               ` Dr. David Alan Gilbert
2016-04-14 16:22                 ` Andrea Arcangeli
2016-04-14 16:22                   ` [Qemu-devel] " Andrea Arcangeli
2016-04-15 12:52                   ` Kirill A. Shutemov
2016-04-15 12:52                     ` [Qemu-devel] " Kirill A. Shutemov
2016-04-15 13:42                     ` Dr. David Alan Gilbert
2016-04-15 13:42                       ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-15 15:23                       ` Kirill A. Shutemov
2016-04-15 15:23                         ` [Qemu-devel] " Kirill A. Shutemov
2016-04-15 16:34                         ` Dr. David Alan Gilbert
2016-04-15 16:34                           ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-18  9:50                           ` Li, Liang Z
2016-04-18  9:50                             ` [Qemu-devel] " Li, Liang Z
2016-04-18  9:55                             ` Dr. David Alan Gilbert
2016-04-18  9:55                               ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-18 10:06                               ` Li, Liang Z
2016-04-18 10:06                                 ` [Qemu-devel] " Li, Liang Z
2016-04-18 10:15                                 ` Dr. David Alan Gilbert
2016-04-18 10:15                                   ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-18 10:33                                   ` Li, Liang Z
2016-04-18 10:33                                     ` [Qemu-devel] " Li, Liang Z
2016-04-18 13:23                                     ` Dr. David Alan Gilbert
2016-04-18 13:23                                       ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-18 17:18                                       ` Dr. David Alan Gilbert [this message]
2016-04-18 17:18                                         ` Dr. David Alan Gilbert
2016-04-20 17:27                                     ` Dr. David Alan Gilbert
2016-04-20 17:27                                       ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-21 19:21                                       ` Dr. David Alan Gilbert
2016-04-21 19:21                                         ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-27 14:47                                     ` Andrea Arcangeli
2016-04-27 14:47                                       ` [Qemu-devel] " Andrea Arcangeli
2016-04-28  2:59                                       ` Li, Liang Z
2016-04-28  2:59                                         ` [Qemu-devel] " Li, Liang Z
2016-04-28  8:03                                         ` Dr. David Alan Gilbert
2016-04-28  8:03                                           ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-15 22:19                         ` Andrea Arcangeli
2016-04-15 22:19                           ` [Qemu-devel] " Andrea Arcangeli
2016-04-18  9:40                           ` Dr. David Alan Gilbert
2016-04-18  9:40                             ` [Qemu-devel] " Dr. David Alan Gilbert
2016-04-18  9:58                             ` Li, Liang Z
2016-04-18  9:58                               ` [Qemu-devel] " Li, Liang Z

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160418171814.GH2222@work-vm \
    --to=dgilbert@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=amit.shah@redhat.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=liang.z.li@intel.com \
    --cc=linux-mm@kvack.org \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.