From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from [140.186.70.92] (port=58983 helo=eggs.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1OvFLC-0007o4-QK
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 16:09:28 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
	(envelope-from ) id 1OvFLB-0005OC-37
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 16:09:26 -0400
Received: from e38.co.us.ibm.com ([32.97.110.159]:46449)
	by eggs.gnu.org with esmtp (Exim 4.69)
	(envelope-from ) id 1OvFLA-0005Nb-Qp
	for qemu-devel@nongnu.org; Mon, 13 Sep 2010 16:09:25 -0400
Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106])
	by e38.co.us.ibm.com (8.14.4/8.13.1) with ESMTP id o8DK1lPG003851
	for ; Mon, 13 Sep 2010 14:01:47 -0600
Received: from d03av03.boulder.ibm.com (d03av03.boulder.ibm.com [9.17.195.169])
	by d03relay04.boulder.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id o8DK9Jbh109522
	for ; Mon, 13 Sep 2010 14:09:19 -0600
Received: from d03av03.boulder.ibm.com (loopback [127.0.0.1])
	by d03av03.boulder.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id o8DK9Inf019762
	for ; Mon, 13 Sep 2010 14:09:18 -0600
Message-ID: <4C8E84EE.9040702@linux.vnet.ibm.com>
Date: Mon, 13 Sep 2010 15:09:18 -0500
From: Anthony Liguori
MIME-Version: 1.0
Subject: Re: [Qemu-devel] Re: [PATCH 3/3] disk: don't read from disk until the guest starts
References: <1284213896-12705-1-git-send-email-aliguori@us.ibm.com>
	<1284213896-12705-4-git-send-email-aliguori@us.ibm.com>
	<4C8DE19B.9090309@redhat.com>
	<4C8E2747.9090806@linux.vnet.ibm.com>
	<4C8E2981.7000304@redhat.com>
	<4C8E2A52.1000708@linux.vnet.ibm.com>
	<4C8E319A.4090103@redhat.com>
	<4C8E8390.6050607@redhat.com>
In-Reply-To: <4C8E8390.6050607@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
List-Id: qemu-devel.nongnu.org
To: Kevin Wolf
Cc: Stefan Hajnoczi, qemu-devel@nongnu.org, Stefan Hajnoczi, Juan Quintela

On 09/13/2010 03:03 PM, Kevin Wolf wrote:
> On 13.09.2010 21:29, Stefan Hajnoczi wrote:
>
>> On Mon, Sep 13, 2010 at 3:13 PM, Kevin Wolf wrote:
>>
>>> On 13.09.2010 15:42, Anthony Liguori wrote:
>>>
>>>> On 09/13/2010 08:39 AM, Kevin Wolf wrote:
>>>>
>>>>>> Yeah, one of the key design points of live migration is to minimize the
>>>>>> number of failure scenarios where you lose a VM. If someone typed the
>>>>>> wrong command line or shared storage hasn't been mounted yet and we
>>>>>> delay failure until live migration is in the critical path, that would
>>>>>> be terribly unfortunate.
>>>>>>
>>>>> We would catch most of them if we try to open the image when migration
>>>>> starts and immediately close it again until migration is (almost)
>>>>> completed, so that no other code can possibly use it before the source
>>>>> has really closed it.
>>>>>
>>>> I think the only real advantage is that we fix NFS migration, right?
>>>>
>>> That's the one that we know about, yes.
>>>
>>> The rest is not a specific scenario, but a strong feeling that having an
>>> image opened twice at the same time feels dangerous. As soon as an
>>> open/close sequence writes to the image for some format, we probably
>>> have a bug. For example, what about this mounted flag that you were
>>> discussing for QED?
>>>
>> There is some room left to work in, even if we can't check in open().
>> One idea would be to do the check asynchronously once I/O begins. It
>> is actually easy to check L1/L2 tables as they are loaded.
>>
>> The only barrier relationship between I/O and checking is that an
>> allocating write (which will need to update L1/L2 tables) is only
>> allowed after check completes. Otherwise reads and non-allocating
>> writes may proceed while the image is not yet fully checked. We can
>> detect when a table element is an invalid offset and discard it.
>>
> I'm not even talking about such complicated things. You wanted to have a
> dirty flag in the header, right? So when we allow opening an image
> twice, you get this sequence with migration:
>
> Source: open
> Destination: open (with dirty image)
> Source: close
>
> The image is now marked as clean, even though the destination is still
> working on it.
>

The dirty flag should be read on demand (which is the first time we
fetch an L1/L2 table).

I agree that the life cycle of the block drivers is getting fuzzy. Need
to think quite a bit here.

Regards,

Anthony Liguori

> Kevin
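
For anyone following the thread, here is a rough toy model of the open/open/close sequence Kevin describes, next to one possible reading of "read the dirty flag on demand". It is only a sketch and not QED or QEMU code. The names Image, EagerDriver, LazyDriver, fetch_table and migrate are all hypothetical, and the choice to also set the flag at the first table fetch is my assumption, not something stated in the thread.

```python
class Image:
    """Toy stand-in for an image header holding one dirty flag."""

    def __init__(self):
        self.dirty = False


class EagerDriver:
    """Hypothetical driver: reads and sets the flag in open, clears it in close."""

    def __init__(self, image):
        self.image = image
        self.saw_dirty = image.dirty  # header is read at open time
        image.dirty = True

    def fetch_table(self):
        pass  # nothing extra to do, the header was handled at open

    def close(self):
        self.image.dirty = False


class LazyDriver:
    """Hypothetical driver: touches the flag only at the first table fetch."""

    def __init__(self, image):
        self.image = image
        self.saw_dirty = None  # header not read yet

    def fetch_table(self):
        if self.saw_dirty is None:
            self.saw_dirty = self.image.dirty  # read on demand
            self.image.dirty = True            # assumption: mark dirty here too

    def close(self):
        if self.saw_dirty is not None:  # only clear a flag we actually set
            self.image.dirty = False


def migrate(driver_cls):
    """Replay the sequence from the thread and report the final state."""
    image = Image()
    source = driver_cls(image)  # Source: open
    source.fetch_table()        # source guest is running and doing I/O
    dest = driver_cls(image)    # Destination: open
    source.close()              # Source: close
    dest.fetch_table()          # destination guest starts, first table fetch
    return image.dirty, dest.saw_dirty


print("eager:", migrate(EagerDriver))
print("lazy: ", migrate(LazyDriver))
```

With the eager driver the image ends up marked clean while the destination is still using it, and the destination also saw it as dirty at open. With the lazy driver the destination does not look at the header until after the source has closed, so in this toy model it sees a clean image and the flag stays set while it works. This says nothing about crash handling or about what happens if the destination does I/O before the source closes.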