From mboxrd@z Thu Jan 1 00:00:00 1970
From: Stefano Stabellini
Subject: Re: Fatal crash on xen4.2 HVM + qemu-xen dm + NFS
Date: Tue, 22 Jan 2013 16:09:21 +0000
References: <5B4525F296F6ABEB38B0E614@nimrod.local>
 <50CEFDA602000078000B0B11@nat28.tlf.novell.com>
 <3B1D0701EAEA6532CEA91EA0@Ximines.local>
 <77822E2DDAEA8F94631B6A52@Ximines.local>
 <1358781790.3279.224.camel@zakaz.uk.xensource.com>
 <1358783420.3279.235.camel@zakaz.uk.xensource.com>
 <1358787073.3279.257.camel@zakaz.uk.xensource.com>
 <19EA31DDC3BEF4D66B42CBAC@Ximines.local>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Stefano Stabellini
Cc: Jan Beulich, Konrad Wilk, Ian Campbell, Alex Bligh, Xen Devel
List-Id: xen-devel@lists.xenproject.org

On Tue, 22 Jan 2013, Stefano Stabellini wrote:
> > But this would explain why I'm still seeing the crash with O_DIRECT
> > apparently off (cache=writeback), as the cache setting is being ignored.
> >
> > This would also explain why Ian might not have seen it (it went in
> > late and without O_DIRECT we think this crash can't happen).
> >
> > Is the BDRV_O_NOCACHE | BDRV_O_CACHE_WB combination intentional or
> > should BDRV_O_NOCACHE be removed? Why would the default be different
> > for emulated and PV disks?
>
> The setting is different from the one of emulated devices because after
> analyzing the IDE code, we thought that using BDRV_O_CACHE_WB would be
> safe enough because when the guest wants to make sure that the data hits
> the disk, it issues an IDE FLUSH_CACHE operation.
>
> In the xen_disk case instead, we weren't quite sure about the
> assumptions of all the possible different PV frontend drivers, so we
> went for the safe choice, that is O_DIRECT.
>
> In fact if we wanted to change the cache setting for xen_disk, we would
> probably have to go back to write-through (this setting is selected by
> passing neither BDRV_O_NOCACHE nor BDRV_O_CACHE_WB) that is quite slow.
>
> Recently, thanks to Konrad's work on blkfront cache flushes, a new flush
> operation has been implemented in the block protocol:
> BLKIF_OP_FLUSH_DISKCACHE. BLKIF_OP_FLUSH_DISKCACHE was introduced in
> xen_disk by 7e7b7cba16faa7b721b822fa9ed8bebafa35700f "xen_disk:
> implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER".
> Thanks to the new operation, maybe it is now safe to use write-back
> caching.
> Konrad, what do you think? Is blkback using the Linux disk cache by
> default? Or is it using O_DIRECT?

Looking more closely at xen_disk's flush operations: even if the
semantics of the old BLKIF_OP_WRITE_BARRIER are confused, its
implementation in xen_disk is strictly a superset of the flushes
performed by the new BLKIF_OP_FLUSH_DISKCACHE.

So unless blkfront issues fewer cache flushes when using
BLKIF_OP_WRITE_BARRIER, I am tempted to say that even with the old flush
operation, it might be safe to open the file in write-back mode.
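
For reference, the flag combinations discussed in this thread can be
summarised with a rough sketch like the one below. This is not QEMU code:
the SKETCH_O_* bits are stand-ins for BDRV_O_NOCACHE and BDRV_O_CACHE_WB
(their real values are not reproduced here), and sketch_cache_mode() is a
hypothetical helper. The "directsync" case (BDRV_O_NOCACHE alone) is not
mentioned above and is included only as my assumption about how QEMU names
that combination.

```c
#include <assert.h>
#include <string.h>

/* Stand-in flag bits, for illustration only. They mirror the roles of
 * BDRV_O_NOCACHE and BDRV_O_CACHE_WB but are NOT QEMU's actual values. */
#define SKETCH_O_NOCACHE  (1u << 0)  /* bypass the host page cache (O_DIRECT) */
#define SKETCH_O_CACHE_WB (1u << 1)  /* writes may complete before hitting disk */

/* Hypothetical helper: name the cache mode implied by a flag combination. */
static const char *sketch_cache_mode(unsigned flags)
{
    int nocache = (flags & SKETCH_O_NOCACHE) != 0;
    int wb      = (flags & SKETCH_O_CACHE_WB) != 0;

    if (nocache && wb) {
        return "none";          /* O_DIRECT: the combination xen_disk passes */
    }
    if (wb) {
        return "writeback";     /* host page cache; relies on guest flushes */
    }
    if (nocache) {
        return "directsync";    /* assumption: not discussed in this thread */
    }
    return "writethrough";      /* neither flag: safe but quite slow */
}

/* Small usage example covering the four combinations. */
static void sketch_self_check(void)
{
    assert(strcmp(sketch_cache_mode(SKETCH_O_NOCACHE | SKETCH_O_CACHE_WB),
                  "none") == 0);
    assert(strcmp(sketch_cache_mode(SKETCH_O_CACHE_WB), "writeback") == 0);
    assert(strcmp(sketch_cache_mode(SKETCH_O_NOCACHE), "directsync") == 0);
    assert(strcmp(sketch_cache_mode(0), "writethrough") == 0);
}
```

In these terms, the question is whether xen_disk could drop the first bit
and keep only the second, given that the frontend issues flushes.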