From: Kevin Wolf
Date: Mon, 13 Sep 2010 16:13:46 +0200
To: Anthony Liguori
Cc: Anthony Liguori, Juan Quintela, qemu-devel@nongnu.org, Stefan Hajnoczi
Subject: [Qemu-devel] Re: [PATCH 3/3] disk: don't read from disk until the guest starts
Message-ID: <4C8E319A.4090103@redhat.com>
In-Reply-To: <4C8E2A52.1000708@linux.vnet.ibm.com>
References: <1284213896-12705-1-git-send-email-aliguori@us.ibm.com> <1284213896-12705-4-git-send-email-aliguori@us.ibm.com> <4C8DE19B.9090309@redhat.com> <4C8E2747.9090806@linux.vnet.ibm.com> <4C8E2981.7000304@redhat.com> <4C8E2A52.1000708@linux.vnet.ibm.com>

On 13.09.2010 15:42, Anthony Liguori wrote:
> On 09/13/2010 08:39 AM, Kevin Wolf wrote:
>>> Yeah, one of the key design points of live migration is to minimize
>>> the number of failure scenarios where you lose a VM. If someone typed
>>> the wrong command line or shared storage hasn't been mounted yet and
>>> we delay failure until live migration is in the critical path, that
>>> would be terribly unfortunate.
>>>
>> We would catch most of them if we try to open the image when migration
>> starts and immediately close it again until migration is (almost)
>> completed, so that no other code can possibly use it before the source
>> has really closed it.
>
> I think the only real advantage is that we fix NFS migration, right?

That's the one that we know about, yes. The rest is not a specific
scenario, but a strong sense that having an image open twice at the same
time is dangerous. As soon as an open/close sequence writes to the image
for some format, we probably have a bug. For example, what about this
mounted flag that you were discussing for QED?

> But if we do invalidate_cache() as you suggested with a close/open of
> the qcow2 layer, and also acquire and release a lock in the file layer
> by propagating the invalidate_cache(), that should work robustly with
> NFS.
>
> I think that's a simpler change. Do you see additional advantages to
> delaying the open?

Just that it makes it very obvious if a device model is doing bad things
and accessing the image before it should. The difference is a failed
request vs. silently corrupted data.

Kevin
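
P.S.: To make the propagation idea a bit more concrete, here is a rough
sketch of how an invalidate_cache call could be handed from the format
layer down to the file layer. This is not actual QEMU code -- the
BlockDriverState/BlockDriver structs below are simplified stand-ins and
the name bdrv_invalidate_cache is only illustrative:

typedef struct BlockDriverState BlockDriverState;

typedef struct BlockDriver {
    /* Drop any cached data/metadata and reread it from the image.
     * A format driver (qcow2) could implement this as an internal
     * close/open of its metadata; the file/protocol layer could
     * release and reacquire a lock here instead. */
    void (*bdrv_invalidate_cache)(BlockDriverState *bs);
} BlockDriver;

struct BlockDriverState {
    BlockDriver *drv;
    BlockDriverState *file;  /* the layer below, e.g. qcow2 -> raw file */
};

/* Called on the migration destination once the source has really
 * stopped using the image, so that nothing cached before that point
 * is trusted any more. */
void bdrv_invalidate_cache(BlockDriverState *bs)
{
    if (bs->drv && bs->drv->bdrv_invalidate_cache) {
        bs->drv->bdrv_invalidate_cache(bs);
    }
    if (bs->file) {
        /* Propagate down the chain, as discussed above. */
        bdrv_invalidate_cache(bs->file);
    }
}

The destination would call this after migration completes, so qcow2
rereads its metadata and the file layer can reacquire whatever lock it
holds, instead of relying on anything read before the source closed the
image.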