From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefano Stabellini
Subject: Re: Fatal crash on xen4.2 HVM + qemu-xen dm + NFS
Date: Wed, 16 Jan 2013 17:39:59 +0000
References: <5B4525F296F6ABEB38B0E614@nimrod.local> <50CEFDA602000078000B0B11@nat28.tlf.novell.com> <3B1D0701EAEA6532CEA91EA0@Ximines.local>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Stefano Stabellini
Cc: Konrad Rzeszutek Wilk, Xen Devel, Jan Beulich, Alex Bligh, Ian Campbell
List-Id: xen-devel@lists.xenproject.org

On Wed, 16 Jan 2013, Stefano Stabellini wrote:
> > >> Could the problem be "cache=writeback" on the QEMU command
> > >> line (evident from a 'ps'). If caching is writeback perhaps QEMU
> > >> needs to copy the data. Is there some setting to turn this off in
> > >> xl for test purposes?
> > >
> > > The command line cache options are ignored by xen_disk, so, assuming
> > > that the guest is using the PV disk interface, that can't be the issue.
> >
> > This appears not to be the case (at least in our environment).
> >
> > We use PV on HVM and:
> >   disk = [ 'tap:qcow2:/my/nfs/directory/testdisk.qcow2,xvda,w' ]
> > (remainder of config file in the original message)
> >
> > We tried modifying the cache= setting using the patch below (yes,
> > the mail client will probably have eaten it, but in essence change
> > the word 'writeback' to 'none'), and that stops it booting VMs
> > at all with
> >   hd0 write error
> >   error: couldn't read file
> > so it would appear not to be entirely correct that the cache=
> > settings are being ignored. I've not had time to find out why
> > (possibly it's trying and failing to use O_DIRECT on NFS) but
> > I'll try writethrough.
>
> The cache command line option is ignored by xen_disk, the PV disk
> backend.
> I was assuming that the guest is using blkfront to access the
> disk, but it looks like I am wrong. If the guest is using the IDE
> interface, then yes, the cache command line option makes a big
> difference.
>
> It is interesting that cache=none has that terrible effect on the disk
> reads, that means that O_DIRECT doesn't work properly either.

Let me elaborate on this: the guest is a PV on HVM guest, so at the very
least the bootloader is going to use the emulated IDE interface to grab
Xen and the kernel. After that the kernel should use the PV disk
interface straight away (I actually downloaded the image and tried it
myself: it is indeed using the PV disk interface).

The bug should occur after the guest has switched over to the PV disk
interface (state = 4 on xenstore for the vbd device), so it can't be the
IDE emulator triggering the issue.
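
For anyone who wants to confirm the switch-over on their own setup, here
is a minimal sketch of the xenstore check mentioned above. It assumes it
is run in dom0 with the xenstore-read tool on the PATH and that the
frontend state node lives at the usual
/local/domain/<domid>/device/vbd/<devid>/state path. The helper names
(vbd_state_path, describe_state, frontend_is_connected) are made up for
illustration and are not part of any Xen tool.

```python
import subprocess

# Xenbus state values as defined in Xen's public io/xenbus.h header.
XENBUS_STATES = {
    0: "Unknown",
    1: "Initialising",
    2: "InitWait",
    3: "Initialised",
    4: "Connected",
    5: "Closing",
    6: "Closed",
    7: "Reconfiguring",
    8: "Reconfigured",
}


def vbd_state_path(domid, devid):
    """Build the xenstore path of a guest's vbd frontend state node.

    Assumption: the conventional frontend layout under /local/domain.
    """
    return f"/local/domain/{domid}/device/vbd/{devid}/state"


def describe_state(raw):
    """Turn the raw text read from a state node into a readable name."""
    value = int(raw.strip())
    return XENBUS_STATES.get(value, "Unrecognised")


def frontend_is_connected(domid, devid):
    """Return True if the vbd frontend reports state 4 (Connected).

    Shells out to xenstore-read, so it only works in dom0.
    """
    raw = subprocess.check_output(
        ["xenstore-read", vbd_state_path(domid, devid)], text=True
    )
    return describe_state(raw) == "Connected"
```

As a usage example, frontend_is_connected(5, 51712) would check the
first PV disk of domain 5. Both numbers are placeholders: 5 is an
arbitrary domid, and 51712 is the device id conventionally assigned to
xvda.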