From: Anthony Liguori <anthony@codemonkey.ws>
To: Avi Kivity <avi@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] [RFC][PATCH 0/3] Fix caching issues with live migration
Date: Sun, 12 Sep 2010 08:12:15 -0500
Message-ID: <4C8CD1AF.3060904@codemonkey.ws>
In-Reply-To: <4C8CAF9C.8090903@redhat.com>

On 09/12/2010 05:46 AM, Avi Kivity wrote:
>  On 09/11/2010 05:04 PM, Anthony Liguori wrote:
>> Today, live migration only works when using shared storage that is fully
>> cache coherent using raw images.
>>
>> The failure case with weakly coherent storage (e.g. NFS) is subtle but
>> nonetheless still exists.  NFS only guarantees close-to-open coherence,
>> and when performing a live migration, we do an open on the source and an
>> open on the destination.  We fsync() on the source before launching the
>> destination but since we have two simultaneous opens, we're not
>> guaranteed coherence.
>>
>> This is not necessarily a problem except that we are a bit gratuitous in
>> reading from the disk before launching a guest.  This means that as
>> things stand today, we're guaranteed to read the first 64k of the disk
>> and as such, if a client writes to that region during live migration,
>> corruption will result.
>>
>> The second failure condition has to do with image files (such as
>> qcow2).  Today, we aggressively cache metadata in all image formats and
>> that cache is definitely not coherent even with fully coherent shared
>> storage.
>>
>> In all image formats, we prefetch at least the L1 table in open(), which
>> means that if there is a write operation that causes a modification to
>> an L1 table, corruption will ensue.
>>
>> This series attempts to address both of these issues.  Technically, if
>> an NFS client aggressively prefetches, this solution is not enough, but
>> in practice, Linux doesn't do that.
>
> I think it is unlikely that it will, but I prefer to be on the right 
> side of the standards.

I've been asking around about this and one thing that was suggested was 
acquiring a file lock as NFS requires that a lock acquisition drops any 
client cache for a file.  I need to understand this a bit more so it's 
step #2.
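
A minimal sketch of the lock idea, written in Python for brevity (QEMU
itself is C).  The helper name read_after_lock is hypothetical, and the
cache-dropping effect is the NFS behaviour described above, not something
this code can demonstrate on a local filesystem, where the lock is simply
harmless:

```python
import fcntl
import os


def read_after_lock(path, offset, length):
    """Take a shared POSIX lock on path, then read length bytes at offset.

    Assumption: on an NFS mount, acquiring the lock makes the client
    drop or revalidate its cached data for the file, so the read that
    follows sees what the server has.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # lockf() is implemented with fcntl() record locks; LOCK_SH
        # requests a blocking read lock on the whole file.
        fcntl.lockf(fd, fcntl.LOCK_SH)
        try:
            return os.pread(fd, length, offset)
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)
```
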

>   Why not delay image open until after migration completes?  I know 
> your concern about the image not being there, but we can verify that 
> with access().  If the image is deleted between access() and open() 
> then the user has much bigger problems.

3/3 would still be needed because if we delay the open we obviously can't 
do a read until an open.
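
To illustrate the delayed-open variant, here is a rough sketch in Python
(QEMU itself is C).  DeferredImage and its methods are hypothetical names,
not QEMU code; the point is only that access() happens up front while
every read has to wait for the later open():

```python
import os


class DeferredImage:
    """Hypothetical wrapper: verify the image at startup, open it later."""

    def __init__(self, path):
        # access() only checks that the file is readable right now; it
        # does not pin the file, so the later open() can still fail.
        if not os.access(path, os.R_OK):
            raise FileNotFoundError(path)
        self.path = path
        self.fd = None  # nothing is opened or read yet

    def open(self):
        """Meant to be called once incoming migration has completed."""
        if self.fd is None:
            self.fd = os.open(self.path, os.O_RDONLY)

    def read(self, offset, length):
        if self.fd is None:
            raise RuntimeError("image not opened yet; reads must wait")
        return os.pread(self.fd, length, offset)

    def close(self):
        if self.fd is not None:
            os.close(self.fd)
            self.fd = None
```

Any code that reads from the disk during device setup would hit the
RuntimeError path here, which is why the early reads have to be removed
either way.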

So it's only really a choice between invalidate_cache and delaying 
open.  It's a far less invasive change to just do invalidate_cache 
though and it has some nice properties.
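
A toy model of the invalidate_cache approach, again in Python and again
only a sketch: CachedImage, header() and HEADER_SIZE are made-up names,
and the real block layer caches far more than one header.  It shows the
image staying open while the stale prefetched data is thrown away on
demand:

```python
import os


class CachedImage:
    """Toy image reader that prefetches a header at open time."""

    HEADER_SIZE = 64 * 1024  # mirrors the "first 64k" mentioned above

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)
        # Eager prefetch: this copy goes stale if the file changes later.
        self._header = os.pread(self.fd, self.HEADER_SIZE, 0)

    def header(self):
        # Re-read lazily if the cached copy has been dropped.
        if self._header is None:
            self._header = os.pread(self.fd, self.HEADER_SIZE, 0)
        return self._header

    def invalidate_cache(self):
        """Forget cached data so the next access goes back to the file."""
        self._header = None

    def close(self):
        os.close(self.fd)
```

Calling invalidate_cache() on the destination when migration completes
would then play the role that re-opening the image plays in the
delayed-open scheme.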

Regards,

Anthony Liguori

> Note that on NFS, removing (and I think chmoding) a file after it is 
> opened will cause subsequent data access to fail, unlike posix.
>


Thread overview: 51+ messages
2010-09-11 14:04 [Qemu-devel] [RFC][PATCH 0/3] Fix caching issues with live migration Anthony Liguori
2010-09-11 14:04 ` [Qemu-devel] [PATCH 1/3] block: allow migration to work with image files Anthony Liguori
2010-09-12 10:37   ` Avi Kivity
2010-09-12 13:06     ` Anthony Liguori
2010-09-12 13:28       ` Avi Kivity
2010-09-12 15:26         ` Anthony Liguori
2010-09-12 16:06           ` Avi Kivity
2010-09-12 17:10             ` Anthony Liguori
2010-09-12 17:51               ` Avi Kivity
2010-09-15 16:00                 ` [Qemu-devel] " Juan Quintela
2010-09-15 15:57         ` Juan Quintela
2010-09-13  8:21   ` Kevin Wolf
2010-09-13 13:27     ` Anthony Liguori
2010-09-15 16:03     ` Juan Quintela
2010-09-16  7:54       ` Kevin Wolf
2010-09-15 15:53   ` Juan Quintela
2010-09-11 14:04 ` [Qemu-devel] [PATCH 2/3] block-nbd: fix use of protocols in backing files and nbd probing Anthony Liguori
2010-09-11 16:53   ` Stefan Hajnoczi
2010-09-11 17:27     ` Anthony Liguori
2010-09-11 17:45       ` Anthony Liguori
2010-09-15 16:06   ` [Qemu-devel] " Juan Quintela
2010-09-16 15:40     ` Anthony Liguori
2010-09-17  8:53       ` Kevin Wolf
2010-09-16  8:08   ` Kevin Wolf
2010-09-16 13:00     ` Anthony Liguori
2010-09-16 14:08       ` Kevin Wolf
2010-09-11 14:04 ` [Qemu-devel] [PATCH 3/3] disk: don't read from disk until the guest starts Anthony Liguori
2010-09-11 17:24   ` Stefan Hajnoczi
2010-09-11 17:34     ` Anthony Liguori
2010-09-12 10:42   ` Avi Kivity
2010-09-12 13:08     ` Anthony Liguori
2010-09-12 13:26       ` Avi Kivity
2010-09-12 15:29         ` Anthony Liguori
2010-09-12 16:04           ` Avi Kivity
2010-09-15 16:10       ` [Qemu-devel] " Juan Quintela
2010-09-13  8:32   ` Kevin Wolf
2010-09-13 13:29     ` Anthony Liguori
2010-09-13 13:39       ` Kevin Wolf
2010-09-13 13:42         ` Anthony Liguori
2010-09-13 14:13           ` Kevin Wolf
2010-09-13 14:34             ` Anthony Liguori
2010-09-14  9:47               ` Avi Kivity
2010-09-14 12:51                 ` Anthony Liguori
2010-09-14 13:16                   ` Avi Kivity
2010-09-13 19:29             ` Stefan Hajnoczi
2010-09-13 20:03               ` Kevin Wolf
2010-09-13 20:09                 ` Anthony Liguori
2010-09-14  8:28                   ` Kevin Wolf
2010-09-15 16:16     ` Juan Quintela
2010-09-12 10:46 ` [Qemu-devel] [RFC][PATCH 0/3] Fix caching issues with live migration Avi Kivity
2010-09-12 13:12   ` Anthony Liguori [this message]
