From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ian Pratt"
Subject: RE: Live migration fails under heavy network use
Date: Tue, 20 Feb 2007 23:48:14 -0000
Message-ID: <8A87A9A84C201449A0C56B728ACF491E0B9AD6@liverpoolst.ad.cl.cam.ac.uk>
References: <20070220215039.GA28903@totally.trollied.org.uk> <8A87A9A84C201449A0C56B728ACF491E0B9AD2@liverpoolst.ad.cl.cam.ac.uk> <20070220230447.GA31928@totally.trollied.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-class: urn:content-classes:message
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: John Levon
Cc: xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

> > The freeing in-use page messages may be unrelated to the actual problem
> > -- AFAIK that's a relatively new printk that could occur benignly during
> > a live migrate of an rx-flip guest.
>
> We're failing here:
>
> [2007-02-20 13:39:50 xend 100401] INFO (XendCheckpoint:247) Saving memory
> pages: iter 2 0%ERROR Internal error: Error when writing to state file
> (5) (errno 14)
> [2007-02-20 13:39:50 xend 100401] INFO (XendCheckpoint:247) Save exit rc=1
> [2007-02-20 13:39:50 xend 100401] ERROR (XendCheckpoint:111) Save failed on
> domain fedora64 (2).
>
> 1049 /* We have a normal page: just write it directly. */
> 1050 if (ratewrite(io_fd, spage, PAGE_SIZE) != PAGE_SIZE) {
> 1051 ERROR("Error when writing to state file (5)"
>
> IOW, we're faulting (EFAULT) on the domain's MFN due to the above error.

Urk. Check out line 204 of privcmd.c

That doesn't look too 64b clean to me....

The top nibble is supposed to be set if it's not possible to map the
frame correctly. This will be propagated through the
xc_get_pfn_type_batch call, hence skipping the frame.

Ian
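
To illustrate the kind of width mismatch suspected above, here is a minimal sketch in C. It is not the code at line 204 of privcmd.c, and every name in it is hypothetical. It assumes only that a producer marks an unmappable frame by setting the top nibble of a 32-bit word, while a consumer might look at the top nibble of a 64-bit word, or the other way round. Under that assumption the two sides disagree about which bits carry the error marker, so a bad frame would not be skipped.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical masks for illustration; not taken from privcmd.c or libxc. */
#define ERR_NIBBLE_32 UINT64_C(0xF0000000)   /* bits 28..31: top nibble of a 32-bit word */
#define ERR_NIBBLE_64 (UINT64_C(0xF) << 60)  /* bits 60..63: top nibble of a 64-bit word */

/* Mark a frame number as unmappable using the 32-bit convention. */
uint64_t mark_bad_32(uint64_t mfn)
{
    return mfn | ERR_NIBBLE_32;
}

/* Mark a frame number as unmappable using the 64-bit convention. */
uint64_t mark_bad_64(uint64_t mfn)
{
    return mfn | ERR_NIBBLE_64;
}

/* Check for the marker where 32-bit-style code would look for it. */
bool is_bad_32(uint64_t entry)
{
    return (entry & ERR_NIBBLE_32) == ERR_NIBBLE_32;
}

/* Check for the marker where 64-bit-style code would look for it. */
bool is_bad_64(uint64_t entry)
{
    return (entry & ERR_NIBBLE_64) == ERR_NIBBLE_64;
}
```

If both sides use the same convention, the marker round-trips and the frame can be skipped. If they differ, the consumer sees what looks like a valid frame number and tries to use it.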