From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=58499 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OvEin-0005cg-RU
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 15:29:47 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from ) id 1OvEim-0008Ae-Ox
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 15:29:45 -0400
Received: from mail-vw0-f45.google.com ([209.85.212.45]:44430)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from ) id 1OvEim-0008Aa-Km
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 15:29:44 -0400
Received: by vws19 with SMTP id 19so5321509vws.4
	for ; Mon, 13 Sep 2010 12:29:44 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <4C8E319A.4090103@redhat.com>
References: <1284213896-12705-1-git-send-email-aliguori@us.ibm.com>
	<1284213896-12705-4-git-send-email-aliguori@us.ibm.com>
	<4C8DE19B.9090309@redhat.com>
	<4C8E2747.9090806@linux.vnet.ibm.com>
	<4C8E2981.7000304@redhat.com>
	<4C8E2A52.1000708@linux.vnet.ibm.com>
	<4C8E319A.4090103@redhat.com>
Date: Mon, 13 Sep 2010 20:29:43 +0100
Subject: Re: [Qemu-devel] Re: [PATCH 3/3] disk: don't read from disk until the guest starts
From: Stefan Hajnoczi
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
List-Id: qemu-devel.nongnu.org
To: Kevin Wolf
Cc: Anthony Liguori, Anthony Liguori, qemu-devel@nongnu.org,
	Stefan Hajnoczi, Juan Quintela

On Mon, Sep 13, 2010 at 3:13 PM, Kevin Wolf wrote:
> On 13.09.2010 15:42, Anthony Liguori wrote:
>> On 09/13/2010 08:39 AM, Kevin Wolf wrote:
>>>> Yeah, one of the key design points of live migration is to minimize the
>>>> number of failure scenarios where you lose a VM.  If someone typed the
>>>> wrong command line or shared storage hasn't been mounted yet and we
>>>> delay failure until live migration is in the critical path, that would
>>>> be terribly unfortunate.
>>>>
>>> We would catch most of them if we try to open the image when migration
>>> starts and immediately close it again until migration is (almost)
>>> completed, so that no other code can possibly use it before the source
>>> has really closed it.
>>>
>>
>> I think the only real advantage is that we fix NFS migration, right?
>
> That's the one that we know about, yes.
>
> The rest is not a specific scenario, but a strong feeling that having an
> image opened twice at the same time feels dangerous. As soon as an
> open/close sequence writes to the image for some format, we probably
> have a bug. For example, what about this mounted flag that you were
> discussing for QED?

There is some room left to work in, even if we can't check in open().
One idea would be to do the check asynchronously once I/O begins. It
is actually easy to check L1/L2 tables as they are loaded.

The only barrier relationship between I/O and checking is that an
allocating write (which will need to update L1/L2 tables) is only
allowed after check completes. Otherwise reads and non-allocating
writes may proceed while the image is not yet fully checked. We can
detect when a table element is an invalid offset and discard it.

Stefan
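
To make the barrier rule above concrete, here is a minimal sketch in C.
It is not QED or QEMU code: ImageState, ReqType, sanitize_table() and
request_may_proceed() are hypothetical names invented for illustration,
and it assumes that a table entry of 0 means "unallocated" and that
valid offsets are cluster aligned and lie inside the image file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image state; none of these names come from QEMU or QED. */
typedef struct {
    uint64_t image_size;    /* size of the image file in bytes */
    uint64_t cluster_size;  /* assumed alignment for table offsets */
    bool check_complete;    /* set once the background check has finished */
} ImageState;

typedef enum {
    REQ_READ,
    REQ_WRITE,              /* write to an already allocated cluster */
    REQ_ALLOCATING_WRITE    /* write that must update L1/L2 tables */
} ReqType;

/* Assumption: 0 means "unallocated"; any other offset must be
 * cluster aligned and fall inside the image file. */
bool table_offset_valid(const ImageState *s, uint64_t offset)
{
    assert(s->cluster_size != 0);
    if (offset == 0) {
        return true;
    }
    if (offset % s->cluster_size != 0) {
        return false;
    }
    return offset < s->image_size;
}

/* Check a table as it is loaded and discard (zero) invalid entries.
 * Returns the number of entries that were discarded. */
size_t sanitize_table(const ImageState *s, uint64_t *table, size_t n)
{
    size_t discarded = 0;
    for (size_t i = 0; i < n; i++) {
        if (!table_offset_valid(s, table[i])) {
            table[i] = 0;
            discarded++;
        }
    }
    return discarded;
}

/* The barrier: only allocating writes wait for the check to complete.
 * Reads and non-allocating writes may proceed at any time. */
bool request_may_proceed(const ImageState *s, ReqType type)
{
    return type != REQ_ALLOCATING_WRITE || s->check_complete;
}
```

A real implementation would queue a blocked allocating write and
resubmit it when the check finishes, rather than just returning false.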