From mboxrd@z Thu Jan  1 00:00:00 1970
From: Ian Campbell
Subject: Re: Fatal crash on xen4.2 HVM + qemu-xen dm + NFS
Date: Mon, 21 Jan 2013 17:29:09 +0000
Message-ID: <1358789349.3279.272.camel@zakaz.uk.xensource.com>
References: <5B4525F296F6ABEB38B0E614@nimrod.local>
 <50CEFDA602000078000B0B11@nat28.tlf.novell.com>
 <3B1D0701EAEA6532CEA91EA0@Ximines.local>
 <77822E2DDAEA8F94631B6A52@Ximines.local>
 <1358781790.3279.224.camel@zakaz.uk.xensource.com>
 <1358783420.3279.235.camel@zakaz.uk.xensource.com>
 <1358787073.3279.257.camel@zakaz.uk.xensource.com>
 <91736C8D6DB136290494B9F8@Ximines.local>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <91736C8D6DB136290494B9F8@Ximines.local>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Alex Bligh
Cc: Konrad Wilk, Xen Devel, Jan Beulich, Stefano Stabellini
List-Id: xen-devel@lists.xenproject.org

On Mon, 2013-01-21 at 17:06 +0000, Alex Bligh wrote:
> >> I'm wondering whether what's happening is that when the disk grows
> >> (or there's a backing file in place) some sort of different I/O is
> >> done by qemu. Perhaps irrespective of write cache setting, it does
> >> some form of zero copy I/O when there's a backing file in place.
> >
> > I doubt that, but I don't really know anything about qdisk.
> >
> > I'd be much more inclined to suspect a bug in the xen_qdisk
> > backend's handling of disk resizes, if that's what you are doing.
>
> We aren't resizing the qcow2 disk itself. What we're doing is
> creating a 20G (virtual size) qcow2 disk, containing a 3G (or so)
> Ubuntu image - i.e. the partition table says it's 3G. We then take a
> snapshot of it and use that as a backing file. The guest then writes
> to the partition table, enlarging it to the virtual size of the
> disk, then resizes the file system. This triggers it.
> Unless QEMU has some special reason to care about what is in the
> partition table (e.g. to support the old xen 'mount a file as a
> partition' stuff), it's just a pile of sectors being written.
>
> > tap == blktap2. I don't know if it supports qcow or not but I
> > don't think xl exposes it if it does.
>
> Well, in xl's conf file we are using
>
> disk = [ 'tap:qcow2:/my/nfs/directory/testdisk.qcow2,xvda,w' ]
>
> I think that's how you are meant to do qcow2 isn't it?

See docs/misc/xl-disk-configuration.txt; the "tap" prefix is
deprecated and ignored by xl. Sorry, I didn't think of this usage of
"tap" above.

With xend the tap: prefix did force blktap (1 or 2) to be used. xl
tries to pick the most suitable backend, and picks xen_qdisk for
qcow, I think always.

> > You could try with a test .vhd or .raw file though.
>
> We can do this but I'm betting it won't fail (at least with .raw)
> as it only breaks on qcow2 if there's a backing file associated
> with the qcow2 file (i.e. if we're writing to a snapshot).
>
> > Unfortunately it won't be zero. There will be at least one
> > reference from the page being part of the process, which won't be
> > dropped until the process dies.
>
> OK, well this is my ignorance of how the grant mechanism works.
> I had assumed the page from the relevant domU got mapped into the
> process in dom0, and that when it was unmapped it would be mapped
> back out of the process's memory. Otherwise would the process's
> memory map not fill up?

The page is mapped out of the user process like you expect. The
problem is that you cannot tell whether the network stack still holds
a reference to the page after the write() syscall has finished. If
you were to assume it did then you would indeed fill the process's
memory map.

Ian.
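[Editor's note: the backing-file layout Alex describes above (a qcow2
snapshot writing over a base image) can be set up with qemu-img along
the following lines. This is a hedged sketch only: the file names,
sizes, and the `backing_fmt` option are illustrative, not taken from
the original report, and creating the images alone does not reproduce
the crash, which also involved the guest repartitioning and resizing
over NFS.]

```shell
#!/bin/sh
# Sketch of the reported disk layout: a 20G virtual-size qcow2 base
# image, plus a qcow2 snapshot that uses it as a backing file so that
# guest writes land in the snapshot. Names/sizes are illustrative.
set -e

# Skip cleanly on hosts without qemu-img installed.
command -v qemu-img >/dev/null 2>&1 || { echo "qemu-img not found, skipping"; exit 0; }

dir=$(mktemp -d)
cd "$dir"

# 20G virtual-size base image (would hold the ~3G Ubuntu install).
qemu-img create -f qcow2 base.qcow2 20G >/dev/null

# Snapshot backed by the base image; guest writes go to this file.
qemu-img create -f qcow2 -b "$dir/base.qcow2" -o backing_fmt=qcow2 \
    snapshot.qcow2 >/dev/null

# The snapshot's metadata records the backing file.
if qemu-img info snapshot.qcow2 | grep -q "backing file"; then
    echo "snapshot backed by base.qcow2"
fi
```

A guest given `snapshot.qcow2` as its disk would then be in the
writing-to-a-snapshot situation described above.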