From: Alex Bligh
Subject: Re: [PATCH] QEMU(upstream): Disable xen's use of O_DIRECT by default as it results in crashes.
Date: Fri, 08 Mar 2013 10:45:58 +0000
Message-ID: <08A8C3360E6184928778A56B@nimrod.local>
References: <1362653247-32551-1-git-send-email-alex@alex.org.uk>
To: George Dunlap, Stefano Stabellini
Cc: Ian Campbell, Konrad Rzeszutek Wilk, Ian Jackson, xen-devel, Jan Beulich, Alex Bligh
List-Id: xen-devel@lists.xenproject.org

George,

--On 8 March 2013 10:28:32 +0000 George Dunlap wrote:

>> Wait, aren't O_DIRECT and BDRV_O_NOCACHE required for safety? That
>> is, without these flags isn't it possible that the guest OS thinks
>> that the data has made it onto stable storage, while in fact it's
>> still in dom0's memory? Or am I missing something?
>
> And in any case, if it's a kernel bug it should be fixed in the kernel.

Well, in theory yes. In practice, it seems to be a very difficult bug
to fix. You need to track skbs. Here's a set of patches which sort of
fix it.

> Alex, which dom0 kernel are you using? I actually seem to recall this
> being a long-known bug with some versions of the pvops kernels; but I
> thought it had long since been fixed. Konrad / IanC, can you
> comment?

I'm using Ubuntu Precise's kernel 3.2.0-32-generic on x86_64. However,
this bug is in every kernel back to (at least) 2007. This thread is
shorter than the one on xen-devel if you want to follow the history:
  http://comments.gmane.org/gmane.linux.nfs/54325

> Alex, if that's the case and if you're using a distro kernel maybe you
> should try to push for a backport?

That would require it being fixed first!

In our lab, without this patch, we literally cannot boot Ubuntu cloud
images (a standard OS) as a guest on an NFS backend: Xen crashes
horribly.

So, we have a choice. Either we work around the kernel bug in Xen, or
we wait until it's fixed in the kernel and a very invasive backport
is produced. Given the almost total lack of interest in fixing this
in the kernel (save for Ian Campbell's patches, which by his own
admission he hasn't had the time to finish), and given that it's a
one-line fix in qemu, I know which I prefer!

-- 
Alex Bligh