From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:34112)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1UlW0z-0005lw-Kb for qemu-devel@nongnu.org;
 Sat, 08 Jun 2013 23:09:59 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1UlW0y-00079a-EW for qemu-devel@nongnu.org;
 Sat, 08 Jun 2013 23:09:57 -0400
Received: from mail-pb0-x22f.google.com ([2607:f8b0:400e:c01::22f]:51572)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1UlW0y-00079N-4I for qemu-devel@nongnu.org;
 Sat, 08 Jun 2013 23:09:56 -0400
Received: by mail-pb0-f47.google.com with SMTP id rr13so3402556pbb.6 for ;
 Sat, 08 Jun 2013 20:09:55 -0700 (PDT)
Message-ID: <51B3F1FD.1090401@ozlabs.ru>
Date: Sun, 09 Jun 2013 13:09:49 +1000
From: Alexey Kardashevskiy
MIME-Version: 1.0
References: <51A7036A.3050407@ozlabs.ru> <51A7049F.6040207@redhat.com>
 <51A70B3D.90609@ozlabs.ru> <51A71705.6060009@kamp.de>
 <51A74D79.7040204@redhat.com> <2765FDFA-8050-4AA3-8621-7E9EA2C89F9C@kamp.de>
 <51A764FC.7080705@redhat.com> <51ADF122.70307@kamp.de>
 <51ADF637.7060804@redhat.com> <51ADFBCE.3080200@kamp.de>
 <51ADFC7A.7030009@redhat.com> <51AE035A.5070301@kamp.de>
 <51B2EB0A.7000704@linux.vnet.ibm.com> <51B2EBA2.5060401@ozlabs.ru>
 <51B3E58C.50301@linux.vnet.ibm.com> <51B3E9A8.5010705@ozlabs.ru>
 <51B3EFFA.4040608@linux.vnet.ibm.com>
In-Reply-To: <51B3EFFA.4040608@linux.vnet.ibm.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] broken incoming migration
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Wenchao Xia
Cc: "qemu-ppc@nongnu.org" , Paolo Bonzini , Peter Lieven ,
 "qemu-devel@nongnu.org" , David Gibson

On 06/09/2013 01:01 PM, Wenchao Xia wrote:
> On 2013-6-9 10:34, Alexey Kardashevskiy wrote:
>> On 06/09/2013 12:16 PM, Wenchao Xia wrote:
>>> On 2013-6-8 16:30, Alexey Kardashevskiy wrote:
>>>> On 06/08/2013 06:27 PM, Wenchao Xia wrote:
>>>>>> On
04.06.2013 16:40, Paolo Bonzini wrote:
>>>>>>> On 04/06/2013 16:38, Peter Lieven wrote:
>>>>>>>> On 04.06.2013 16:14, Paolo Bonzini wrote:
>>>>>>>>> On 04/06/2013 15:52, Peter Lieven wrote:
>>>>>>>>>> On 30.05.2013 16:41, Paolo Bonzini wrote:
>>>>>>>>>>> On 30/05/2013 16:38, Peter Lieven wrote:
>>>>>>>>>>>>>> You could also scan the page for nonzero values before
>>>>>>>>>>>>>> writing it.
>>>>>>>>>>>> i had this in mind, but then chose the other approach.... turned
>>>>>>>>>>>> out to be a bad idea.
>>>>>>>>>>>>
>>>>>>>>>>>> alexey: i will prepare a patch later today, could you then please
>>>>>>>>>>>> verify it fixes your problem.
>>>>>>>>>>>>
>>>>>>>>>>>> paolo: would we still need the madvise or is it enough to not
>>>>>>>>>>>> write the zeroes?
>>>>>>>>>>> It should be enough to not write them.
>>>>>>>>>> Problem: checking the pages for zero allocates them. even at the
>>>>>>>>>> source.
>>>>>>>>> It doesn't look like it. I tried this program and top doesn't show an
>>>>>>>>> increasing amount of reserved memory:
>>>>>>>>>
>>>>>>>>> #include <stdio.h>
>>>>>>>>> #include <stdlib.h>
>>>>>>>>> int main()
>>>>>>>>> {
>>>>>>>>>     char *x = malloc(500 << 20);
>>>>>>>>>     int i, j;
>>>>>>>>>     for (i = 0; i < 500; i += 10) {
>>>>>>>>>         for (j = 0; j < 10 << 20; j += 4096) {
>>>>>>>>>             *(volatile char*) (x + (i << 20) + j);
>>>>>>>>>         }
>>>>>>>>>         getchar();
>>>>>>>>>     }
>>>>>>>>> }
>>>>>>>> strange. we are talking about RSS size, right?
>>>>>>> None of the three top values change, and only VIRT is >500 MB.
>>>>>>>
>>>>>>>> is the malloc above using mmapped memory?
>>>>>>> Yes.
>>>>>>>
>>>>>>>> which kernel version do you use?
>>>>>>> 3.9.
>>>>>>>
>>>>>>>> what avoids allocating the memory for me is the following (with
>>>>>>>> whatever side effects it has ;-))
>>>>>>> This would also fail to migrate any page that is swapped out, breaking
>>>>>>> overcommit in a more subtle way. :)
>>>>>>>
>>>>>>> Paolo
>>>>>> the following does also not allocate memory, but qemu does...
>>>>>>
>>>>> Hi, Peter
>>>>> As the patch writes
>>>>>
>>>>> "not sending zero pages breaks migration if a page is zero
>>>>> at the source but not at the destination."
>>>>>
>>>>> I don't understand why it would be trouble, shouldn't all pages
>>>>> not received in dest be treated as zero pages?
>>>>
>>>>
>>>> How would the destination guest know if some page must be cleared? The
>>>> previous patch (which Peter reverted) did not send anything for the pages
>>>> which were zero on the source side.
>>>>
>>>>
>>> If a page was not received and destination knows that page should
>>> exist according to total size, fill it with zero at destination, would
>>> it solve the problem?
>>
>> It is _live_ migration, the source sends changes, same pages can change and
>> be sent several times. So we would need to turn tracking on on the
>> destination to know if some page was received from the source or changed by
>> the destination itself (by writing there bios/firmware images, etc) and
>> then clear pages which were touched by the destination and were not sent by
>> the source.
> OK, I can understand the problem is, for example:
> Destination boots up with 0x0000-0xFFFF filled with bios image.
> Source forgot to send zero pages in 0x0000-0xFFFF.

The source did not forget; instead, it zeroed these pages during its life
and assumed that they must already be zero at the destination (as the
destination had not started and had no chance to write anything there).

> After migration destination got 0x0000-0xFFFF dirty(different with
> source)

Yep. And those pages were empty on the source, which made debugging very
easy :)

> Thanks for explaining.
>
> This seems to refer to the migration protocol: how should the guest treat
> unsent pages. The patch causing the problem actually treats zero pages
> as "not to send" at source, but another half is missing: treat "not
> received" as zero pages at destination.
> I guess if second half is added,
> problem is gone:
> after page transfer completed, before destination resume,
> fill zero in "not received" pages.

Make a working patch, we'll discuss it :)
I do not see much acceleration coming from there.


--
Alexey
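
For reference, the "second half" discussed above (treat pages that were never received as zero before the destination resumes) could look roughly like the sketch below. It is only an illustration under stated assumptions: `DestRam`, `dest_receive_page`, `dest_zero_unreceived` and `MIG_PAGE_SIZE` are hypothetical names, not QEMU code, and a flat byte-per-page flag array stands in for whatever tracking a real patch would use. It also follows the earlier suggestion to scan a page for nonzero bytes before writing, so pages that are already zero are not written to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed page size, matching the 4096-byte stride of the quoted test program. */
#define MIG_PAGE_SIZE 4096u

/* Hypothetical destination-side state; not QEMU's actual structures. */
typedef struct {
    uint8_t *ram;       /* guest RAM, npages * MIG_PAGE_SIZE bytes */
    uint8_t *received;  /* one flag per page, set once the source sent it */
    size_t   npages;
} DestRam;

/* Return true if the page contains only zero bytes (read-only scan). */
bool page_is_zero(const uint8_t *page)
{
    for (size_t i = 0; i < MIG_PAGE_SIZE; i++) {
        if (page[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Record one incoming page. data == NULL is this sketch's convention for
 * "the source reported a zero page"; zeroes are written only if the
 * destination copy is actually nonzero.
 */
void dest_receive_page(DestRam *d, size_t idx, const uint8_t *data)
{
    assert(idx < d->npages);
    uint8_t *page = d->ram + idx * MIG_PAGE_SIZE;

    if (data != NULL) {
        memcpy(page, data, MIG_PAGE_SIZE);
    } else if (!page_is_zero(page)) {
        memset(page, 0, MIG_PAGE_SIZE);
    }
    d->received[idx] = 1;
}

/*
 * After the transfer completes and before the guest resumes: clear every
 * page the source never sent but the destination itself dirtied (for
 * example with a BIOS/firmware image). Returns the number of pages cleared.
 */
size_t dest_zero_unreceived(DestRam *d)
{
    size_t cleared = 0;

    for (size_t idx = 0; idx < d->npages; idx++) {
        uint8_t *page = d->ram + idx * MIG_PAGE_SIZE;

        if (d->received[idx] || page_is_zero(page)) {
            continue;
        }
        memset(page, 0, MIG_PAGE_SIZE);
        cleared++;
    }
    return cleared;
}
```

Note that the final pass still has to scan every unreceived page, which is part of why the speedup from this approach is questionable.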