From: Peter Lieven <pl@kamp.de>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	David Gibson <david@gibson.dropbear.id.au>,
	"qemu-ppc@nongnu.org" <qemu-ppc@nongnu.org>,
	Wenchao Xia <xiawenc@linux.vnet.ibm.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] broken incoming migration
Date: Mon, 10 Jun 2013 10:44:01 +0200
Message-ID: <51B591D1.5040705@kamp.de>
In-Reply-To: <51B5785B.6040704@ozlabs.ru>

On 10.06.2013 08:55, Alexey Kardashevskiy wrote:
> On 06/10/2013 04:50 PM, Peter Lieven wrote:
>> On 10.06.2013 08:39, Alexey Kardashevskiy wrote:
>>> On 06/09/2013 05:27 PM, Peter Lieven wrote:
>>>> On 09.06.2013 at 05:09, Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>>>>
>>>>> On 06/09/2013 01:01 PM, Wenchao Xia wrote:
>>>>>> On 2013-6-9 10:34, Alexey Kardashevskiy wrote:
>>>>>>> On 06/09/2013 12:16 PM, Wenchao Xia wrote:
>>>>>>>> On 2013-6-8 16:30, Alexey Kardashevskiy wrote:
>>>>>>>>> On 06/08/2013 06:27 PM, Wenchao Xia wrote:
>>>>>>>>>>> On 04.06.2013 16:40, Paolo Bonzini wrote:
>>>>>>>>>>>> On 04/06/2013 16:38, Peter Lieven wrote:
>>>>>>>>>>>>> On 04.06.2013 16:14, Paolo Bonzini wrote:
>>>>>>>>>>>>>> On 04/06/2013 15:52, Peter Lieven wrote:
>>>>>>>>>>>>>>> On 30.05.2013 16:41, Paolo Bonzini wrote:
>>>>>>>>>>>>>>>> On 30/05/2013 16:38, Peter Lieven wrote:
>>>>>>>>>>>>>>>>>>> You could also scan the page for nonzero
>>>>>>>>>>>>>>>>>>> values before writing it.
>>>>>>>>>>>>>>>>> i had this in mind, but then chose the other
>>>>>>>>>>>>>>>>> approach.... turned out to be a bad idea.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> alexey: i will prepare a patch later today,
>>>>>>>>>>>>>>>>> could you then please verify it fixes your
>>>>>>>>>>>>>>>>> problem.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> paolo: would we still need the madvise or is
>>>>>>>>>>>>>>>>> it enough to not write the zeroes?
>>>>>>>>>>>>>>>> It should be enough to not write them.
>>>>>>>>>>>>>>> Problem: checking the pages for zero allocates
>>>>>>>>>>>>>>> them. even at the source.
>>>>>>>>>>>>>> It doesn't look like it.  I tried this program and top
>>>>>>>>>>>>>> doesn't show an increasing amount of reserved
>>>>>>>>>>>>>> memory:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> #include <stdio.h>
>>>>>>>>>>>>>> #include <stdlib.h>
>>>>>>>>>>>>>> int main() {
>>>>>>>>>>>>>>     char *x = malloc(500 << 20);
>>>>>>>>>>>>>>     int i, j;
>>>>>>>>>>>>>>     for (i = 0; i < 500; i += 10) {
>>>>>>>>>>>>>>         for (j = 0; j < 10 << 20; j += 4096) {
>>>>>>>>>>>>>>             *(volatile char*) (x + (i << 20) + j);
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         getchar();
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>> }
>>>>>>>>>>>>> strange. we are talking about RSS size, right?
>>>>>>>>>>>> None of the three top values change, and only VIRT is
>>>>>>>>>>>> 500 MB.
>>>>>>>>>>>>> is the malloc above using mmapped memory?
>>>>>>>>>>>> Yes.
>>>>>>>>>>>>
>>>>>>>>>>>>> which kernel version do you use?
>>>>>>>>>>>> 3.9.
>>>>>>>>>>>>
>>>>>>>>>>>>> what avoids allocating the memory for me is the
>>>>>>>>>>>>> following (with whatever side effects it has ;-))
>>>>>>>>>>>> This would also fail to migrate any page that is swapped
>>>>>>>>>>>> out, breaking overcommit in a more subtle way. :)
>>>>>>>>>>>>
>>>>>>>>>>>> Paolo
>>>>>>>>>>> the following does also not allocate memory, but qemu
>>>>>>>>>>> does...
>>>>>>>>>> Hi Peter, as the patch says:
>>>>>>>>>>
>>>>>>>>>> "not sending zero pages breaks migration if a page is zero
>>>>>>>>>> at the source but not at the destination."
>>>>>>>>>>
>>>>>>>>>> I don't understand why that would be trouble. Shouldn't all
>>>>>>>>>> pages not received at the destination be treated as zero pages?
>>>>>>>>> How would the destination guest know if some page must be
>>>>>>>>> cleared? The previous patch (which Peter reverted) did not
>>>>>>>>> send anything for the pages which were zero on the source
>>>>>>>>> side.
>>>>>>>> If a page was not received and the destination knows that the page
>>>>>>>> should exist according to the total size, it could fill it with
>>>>>>>> zeroes at the destination. Would that solve the problem?
>>>>>>> It is _live_ migration: the source sends changes, and the same pages
>>>>>>> can change and be sent several times. So we would need to turn on
>>>>>>> tracking on the destination to know whether a page was received from
>>>>>>> the source or changed by the destination itself (by writing
>>>>>>> BIOS/firmware images there, etc.) and then clear pages which were
>>>>>>> touched by the destination and were not sent by the source.
>>>>>> OK, I understand. The problem is, for example: the destination boots
>>>>>> up with 0x0000-0xFFFF filled with a BIOS image, and the source forgot
>>>>>> to send the zero pages in 0x0000-0xFFFF.
>>>>> The source did not forget, instead it zeroed these pages during its
>>>>> life and thought that they must be zeroed at the destination already
>>>>> (as the destination did not start and did not have a chance to write
>>>>> something there).
>>>>>
>>>>>
>>>>>> After migration the destination has 0x0000-0xFFFF dirty (different
>>>>>> from the source)
>>>>> Yep. And those pages were empty on the source, which made debugging very
>>>>> easy :)
>>>>>
>>>>>
>>>>>> Thanks for explaining.
>>>>>>
>>>>>> This seems to concern the migration protocol: how should the guest
>>>>>> treat unsent pages? The patch causing the problem treats zero pages
>>>>>> as "not to be sent" at the source, but the other half is missing:
>>>>>> treating "not received" pages as zero pages at the destination. I
>>>>>> guess if the second half is added, the problem is gone: after page
>>>>>> transfer has completed and before the destination resumes, fill the
>>>>>> "not received" pages with zeroes.
>>>>>
>>>>> Make a working patch, we'll discuss it :) I do not see much
>>>>> acceleration coming from there.
>>>> I would also not spend much time on this. I would either look for an
>>>> easy way to fix the initialization code so it does not unnecessarily
>>>> load data into RAM, or I will send a v2 of my patch addressing Eric's
>>>> concerns.
>>> There is no easy way to implement the flag and keep your original patch as
>>> we have to implement this flag in all architectures which got broken by
>>> your patch and I personally can fix only PPC64-pseries but not the others.
>>>
>>> Furthermore your revert + new patches perfectly solve the problem, why
>>> would we want to bother now with this new flag which nobody really needs
>>> right now?
>>>
>>> Please, please, revert the original patch or I'll try to do it :)
>>>
>>>
>> I tried, but there were concerns from the community.
>
> Was there anybody who did not want to revert the patch (besides you)?
> I did not notice.
Eric said I should not drop the skipped_pages stuff in the monitor.
>
>
>> Alternatively, I found the following solution. Please drop the 2 patches
>> and try the following:
>
> How is it going to work if upstream QEMU doesn't send anything about empty
> pages at all (this is why I want to revert that patch)?
I do not understand your question. The patch below zeroes out the destination
memory if it is not zero (e.g. if there is a BIOS copied to memory already during
machine init).
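
To illustrate the idea (this is only a minimal sketch, not the actual patch): when the source reports a page as zero, the destination first reads the page and writes zeroes only if it finds stale data there. The names page_is_zero() and handle_zero_page() and the 4096-byte PAGE_SIZE are assumptions made up for this example, not QEMU functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096  /* assumed page size for this sketch */

/* Return true if every byte of the page is zero. Only reads the page. */
static bool page_is_zero(const unsigned char *page)
{
    for (size_t i = 0; i < PAGE_SIZE; i++) {
        if (page[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical destination-side handler for a page the source reported
 * as zero. It writes only when the page holds stale data (e.g. a BIOS
 * image copied in during machine init).
 * Returns true if the page had to be cleared.
 */
static bool handle_zero_page(unsigned char *page)
{
    if (page_is_zero(page)) {
        return false;   /* already zero: nothing is written */
    }
    memset(page, 0, PAGE_SIZE);
    return true;
}
```

The point of checking before writing is that a page which is already zero is never written to, so the destination should not have to back it with real memory just to store zeroes.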

I would prefer not to completely drop the patch since it saves bandwidth and
resources.
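
A rough way to see the bandwidth argument, as a sketch under stated assumptions: count the bytes the source would put on the wire with and without special handling of zero pages. The function wire_bytes(), the 8-byte HEADER_SIZE and the 4096-byte PAGE_SIZE are hypothetical and do not describe the real migration stream format. Whether a zero page costs a small marker or nothing at all depends on the patch; this example assumes a header-only marker.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE   4096  /* assumed page size */
#define HEADER_SIZE 8     /* hypothetical per-page header (address + flags) */

static const unsigned char zero_page[PAGE_SIZE];  /* all zeroes */

/*
 * Estimate the bytes sent for npages pages starting at ram.
 * With compact_zero set, a zero page costs only its header;
 * otherwise every page costs header + payload.
 */
static size_t wire_bytes(const unsigned char *ram, size_t npages,
                         bool compact_zero)
{
    size_t total = 0;

    for (size_t i = 0; i < npages; i++) {
        const unsigned char *page = ram + i * PAGE_SIZE;
        bool is_zero = memcmp(page, zero_page, PAGE_SIZE) == 0;

        total += HEADER_SIZE;
        if (!(compact_zero && is_zero)) {
            total += PAGE_SIZE;   /* full payload for this page */
        }
    }
    return total;
}
```

For a freshly booted guest where most RAM is still zero, the difference between the two modes is close to the full RAM size, which is the saving in question.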

Peter

Thread overview: 49+ messages
2013-05-30  7:44 [Qemu-devel] broken incoming migration Alexey Kardashevskiy
2013-05-30  7:49 ` Alexey Kardashevskiy
2013-05-30  7:49 ` Paolo Bonzini
2013-05-30  8:18   ` Alexey Kardashevskiy
2013-05-30  9:08     ` Peter Lieven
2013-05-30  9:31       ` Alexey Kardashevskiy
2013-05-30 13:00       ` Paolo Bonzini
2013-05-30 13:38         ` Alexey Kardashevskiy
2013-05-30 14:08           ` Paolo Bonzini
2013-05-30 14:38         ` Peter Lieven
2013-05-30 14:41           ` Paolo Bonzini
2013-06-04 13:52             ` Peter Lieven
2013-06-04 14:14               ` Paolo Bonzini
2013-06-04 14:38                 ` Peter Lieven
2013-06-04 14:40                   ` Paolo Bonzini
2013-06-04 14:48                     ` Peter Lieven
2013-06-04 15:17                       ` Paolo Bonzini
2013-06-04 19:15                         ` Peter Lieven
2013-06-05  3:37                           ` Alexey Kardashevskiy
2013-06-05  6:09                             ` Peter Lieven
2013-06-09  4:12                               ` liu ping fan
2013-06-09  7:22                                 ` Peter Lieven
2013-06-04 15:10                     ` Peter Lieven
2013-06-08  8:27                       ` Wenchao Xia
2013-06-08  8:30                         ` Alexey Kardashevskiy
2013-06-09  2:16                           ` Wenchao Xia
2013-06-09  2:34                             ` Alexey Kardashevskiy
2013-06-09  2:52                               ` [Qemu-devel] [Qemu-ppc] " Benjamin Herrenschmidt
2013-06-09  3:01                                 ` Alexey Kardashevskiy
2013-06-09  3:01                               ` [Qemu-devel] " Wenchao Xia
2013-06-09  3:09                                 ` Alexey Kardashevskiy
2013-06-09  3:31                                   ` Wenchao Xia
2013-06-09  7:27                                   ` Peter Lieven
2013-06-10  6:39                                     ` Alexey Kardashevskiy
2013-06-10  6:50                                       ` Peter Lieven
2013-06-10  6:55                                         ` Alexey Kardashevskiy
2013-06-10  8:44                                           ` Peter Lieven [this message]
2013-06-10  9:10                                             ` Alexey Kardashevskiy
2013-06-10  9:33                                               ` [Qemu-devel] [Qemu-ppc] " Benjamin Herrenschmidt
2013-06-10  9:42                                                 ` Peter Lieven
2013-06-09  2:53                             ` Benjamin Herrenschmidt
2013-06-12 14:00                               ` Paolo Bonzini
2013-06-12 14:11                                 ` Benjamin Herrenschmidt
2013-06-12 20:10                                   ` Paolo Bonzini
2013-06-13  2:41                                     ` Wenchao Xia
2013-06-03 10:04           ` [Qemu-devel] " Alexey Kardashevskiy
2013-06-04 10:56             ` Peter Lieven
2013-06-08  8:24         ` Wenchao Xia
2013-05-30 10:18 ` Peter Maydell
