From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <4C8E2747.9090806@linux.vnet.ibm.com>
Date: Mon, 13 Sep 2010 08:29:43 -0500
From: Anthony Liguori <aliguori@linux.vnet.ibm.com>
MIME-Version: 1.0
References: <1284213896-12705-1-git-send-email-aliguori@us.ibm.com>
	<1284213896-12705-4-git-send-email-aliguori@us.ibm.com>
	<4C8DE19B.9090309@redhat.com>
In-Reply-To: <4C8DE19B.9090309@redhat.com>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] Re: [PATCH 3/3] disk: don't read from disk until the
	guest starts
List-Id: qemu-devel.nongnu.org
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>, Juan Quintela <quintela@redhat.com>, qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>

On 09/13/2010 03:32 AM, Kevin Wolf wrote:
> On 11.09.2010 16:04, Anthony Liguori wrote:
>    
>> This fixes a couple nasty problems relating to live migration.
>>
>> 1) When dealing with shared storage with weak coherence (i.e. NFS), even if
>>     we re-read, we may end up with undesired caching.  By delaying any reads
>>     until we absolutely have to, we decrease the likelihood of any undesirable
>>     caching.
>>
>> 2) When dealing with copy-on-read, the local storage acts as a cache.  We need
>>     to make sure to avoid any reads to avoid polluting the local cache.
>>
>> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>      
> I think we should also delay even opening the image file at all to the
> latest possible point to avoid that new problems of this kind are
> introduced. Ideally, opening the image would be the very last thing we
> do before telling the migration source that we're set and it should
> close the images.
>    

There are a lot of possible failure scenarios that opening an image file
can introduce.  Fortunately, I don't think we have a strict requirement
to delay the open, provided that we make a couple of reasonable changes.

> Even better would be to only open the image when the source has already
> closed it (we could completely avoid the invalidation/reopen then), but
> I think you were afraid that we might lose the image on both ends.
>    

Yeah, one of the key design points of live migration is to minimize the
number of failure scenarios where you lose a VM.  If someone typed the
wrong command line, or shared storage hasn't been mounted yet, and we
delay the failure until live migration is in the critical path, that
would be terribly unfortunate.
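
To make the trade-off concrete, here is a minimal sketch of "open
early, read late".  It is Python rather than QEMU's C, and LazyDisk,
guest_started() and demo() are hypothetical names for illustration, not
QEMU APIs.  It also uses a local temporary file, so it only shows the
ordering, not real NFS caching behaviour.

```python
import os
import tempfile


class LazyDisk:
    """Hypothetical stand-in for a disk on the migration destination."""

    def __init__(self, path):
        # Open right away: a mistyped path or an unmounted share fails
        # here, before migration reaches its critical path.
        self.fd = os.open(path, os.O_RDONLY)
        self.first_sector = None  # nothing has been read yet

    def guest_started(self):
        # The first read happens only once the guest is about to run,
        # so nothing is cached while the source still owns the image.
        os.lseek(self.fd, 0, os.SEEK_SET)
        self.first_sector = os.read(self.fd, 512)
        return self.first_sector

    def close(self):
        os.close(self.fd)


def demo():
    # Early failure: the bad path is reported at setup time.
    try:
        LazyDisk("/nonexistent/image.raw")
        early_failure = False
    except OSError:
        early_failure = True

    # Create a 512-byte "image" that starts with b"old".
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"old" + b"\0" * 509)
        path = f.name
    try:
        disk = LazyDisk(path)            # opened, but not read
        with open(path, "r+b") as src:   # the "source" is still writing
            src.write(b"new")
        data = disk.guest_started()      # first read, after the write
        disk.close()
    finally:
        os.unlink(path)
    return early_failure, data[:3]
```

Calling demo() reports the bad path at construction time, and the
deferred read returns what the "source" wrote last instead of a copy
taken at open time.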

Regards,

Anthony Liguori

> Kevin
>