From: Anthony Liguori <anthony@codemonkey.ws>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] Re: [PATCH 3/3] disk: don't read from disk until the guest starts
Date: Mon, 13 Sep 2010 09:34:35 -0500
Message-ID: <4C8E367B.8070609@codemonkey.ws>
In-Reply-To: <4C8E319A.4090103@redhat.com>

On 09/13/2010 09:13 AM, Kevin Wolf wrote:
>> I think the only real advantage is that we fix NFS migration, right?
>>      
> That's the one that we know about, yes.
>
> The rest is not a specific scenario, but a strong feeling that having an
> image opened twice at the same time feels dangerous.

We've never really had clear semantics for live migration and block
driver life cycles.  At a high level, for live migration to work, we
need the following sequence:

1) src> flush all pending writes to disk
2) <barrier>
3) dst> invalidate any cached data
4) dst> start guest
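
As a rough illustration of why the ordering above matters, here is a toy
model in C.  It is not QEMU code: the structs and the host_* and
migrate_handoff names are all hypothetical, and "disk" is just a small
buffer shared by two hosts that each keep a local cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model of the handoff, not QEMU code; every name is hypothetical. */
enum { DISK_SIZE = 16 };

struct shared_disk {
    char data[DISK_SIZE];      /* storage visible to both hosts */
};

struct host {
    struct shared_disk *disk;
    char cache[DISK_SIZE];     /* host-local cached copy of the disk */
    bool cache_valid;
    char pending[DISK_SIZE];   /* buffered write not yet on disk */
    bool has_pending;
};

/* Buffer a write locally; it reaches the disk only on flush. */
void host_write(struct host *h, const char *s)
{
    strncpy(h->pending, s, DISK_SIZE - 1);
    h->pending[DISK_SIZE - 1] = '\0';
    h->has_pending = true;
}

/* Step 1: push any pending write out to the shared disk. */
void host_flush(struct host *h)
{
    if (h->has_pending) {
        memcpy(h->disk->data, h->pending, DISK_SIZE);
        h->has_pending = false;
    }
}

/* Step 3: forget anything cached before the barrier. */
void host_invalidate(struct host *h)
{
    h->cache_valid = false;
}

/* Reads are served from the local cache whenever it is valid. */
const char *host_read(struct host *h)
{
    if (!h->cache_valid) {
        memcpy(h->cache, h->disk->data, DISK_SIZE);
        h->cache_valid = true;
    }
    return h->cache;
}

/* Steps 1-3 in order; the caller starts the guest (step 4) afterwards. */
void migrate_handoff(struct host *src, struct host *dst)
{
    host_flush(src);              /* 1) src> flush pending writes */
    assert(!src->has_pending);    /* 2) barrier: nothing left in flight */
    host_invalidate(dst);         /* 3) dst> invalidate cached data */
}
```

In this model, a destination that read the disk before the handoff keeps
returning the old contents if step (3) is skipped, even though the source
flushed correctly.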

We've gotten away with ignoring (3) because raw disks never cache
anything.  But that assumes that we're on top of cache coherent storage.
If we don't have fully cache coherent storage, we need to do more.

We need to extend (3) to also flush the cache of the underlying
storage.  There are two ways we can solve this: we can either ensure
that (3) is a nop by not performing any operations that would cause
caching until after (3), or we can find a way to inject a flush into the
underlying cache.

Since the latter approach requires storage-specific knowledge, I went with
the former approach.  Of course, if you did a close-open sequence at 
(3), it may achieve the same goal but only really for NFS.  If you have 
something weaker than close-to-open coherency, you still need to do 
something unique in step (3).
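
A minimal sketch of the former approach, assuming a device that would
otherwise probe the disk at init time: the first read is deferred until
something actually asks for the data, so nothing is cached on the
destination before the guest starts and (3) has nothing to invalidate.
The lazy_disk and backing_store names are hypothetical and are not taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are QEMU APIs. */
struct backing_store {
    int value;   /* stand-in for on-disk contents */
    int reads;   /* how many times the storage was actually read */
};

struct lazy_disk {
    struct backing_store *store;
    bool probed;          /* has the first read happened yet? */
    int cached_geometry;  /* value derived from the first read */
};

/* Init only records where the storage is; it performs no read. */
void lazy_disk_init(struct lazy_disk *d, struct backing_store *s)
{
    assert(s != NULL);
    d->store = s;
    d->probed = false;
    d->cached_geometry = 0;
}

/* The first call reads the storage; later calls reuse the cached value. */
int lazy_disk_geometry(struct lazy_disk *d)
{
    if (!d->probed) {
        d->store->reads++;
        d->cached_geometry = d->store->value;
        d->probed = true;
    }
    return d->cached_geometry;
}
```

If the source changes the storage between lazy_disk_init() and the first
lazy_disk_geometry() call, the destination still sees the final contents,
because it never read the earlier ones.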

I don't know that I see a perfect model.  Pushing reads past point (3) 
is easy and fixes raw on top of NFS.  I think we want to do that because 
it's low hanging fruit.  A block driver hook for (3) also seems
appealing because we can make use of it easily in QED.
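
One possible shape for such a hook, sketched with hypothetical names
(toy_driver, toy_bs, toy_invalidate_cache) rather than the real block
layer structures: an optional per-driver callback that the generic code
invokes at (3) and then propagates down to the underlying file layer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch; these are not the real block layer types. */
struct toy_bs;

struct toy_driver {
    const char *name;
    /* Optional hook: drop whatever this driver has cached. May be NULL. */
    void (*invalidate_cache)(struct toy_bs *bs);
};

struct toy_bs {
    const struct toy_driver *drv;
    struct toy_bs *file;    /* underlying protocol layer, or NULL */
    int cached_entries;     /* stand-in for cached metadata */
};

/* Generic entry point for step (3): call the hook, then recurse down. */
void toy_invalidate_cache(struct toy_bs *bs)
{
    if (bs == NULL) {
        return;
    }
    assert(bs->drv != NULL);
    if (bs->drv->invalidate_cache != NULL) {
        bs->drv->invalidate_cache(bs);
    }
    toy_invalidate_cache(bs->file);
}

/* A format driver that caches metadata implements the hook. */
static void toy_format_drop_tables(struct toy_bs *bs)
{
    bs->cached_entries = 0;
}

const struct toy_driver toy_format_driver = {
    "toy-format", toy_format_drop_tables
};

/* A driver that caches nothing simply leaves the hook unset. */
const struct toy_driver toy_raw_driver = { "toy-raw", NULL };
```

A driver without the hook is skipped, which matches the observation that
raw has nothing of its own to invalidate; whether the protocol layer
underneath needs its own hook depends on the storage.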

That said, I'm open to suggestions of a better model.  Delaying open 
(especially if you open, then close, then open again) seems a bit hacky.

With respect to the devices, I think the question of when block devices 
can begin accessing drives is orthogonal to this discussion.  Even 
without delaying open, we could simply not give them their 
BlockDriverStates until realize() or something like that.
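
To make that last idea concrete, a small sketch under the assumption that
a device holds no usable drive pointer until its realize step runs.  The
toy_device and toy_drive names and the choice of -ENODEV are illustrative
only.  An early access then fails loudly instead of reading data that may
later turn out to be stale.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical sketch; not the real device or block layer structures. */
struct toy_drive {
    int sector0;   /* stand-in for disk contents */
};

struct toy_device {
    struct toy_drive *pending_drive;  /* configured but not handed over */
    struct toy_drive *drive;          /* NULL until realize */
};

/* Reads fail with -ENODEV until the device has been given its drive. */
int toy_device_read(struct toy_device *dev, int *out)
{
    if (dev->drive == NULL) {
        return -ENODEV;
    }
    *out = dev->drive->sector0;
    return 0;
}

/* Hand the drive over; only from here on can the device touch it. */
void toy_device_realize(struct toy_device *dev)
{
    assert(dev->pending_drive != NULL);
    dev->drive = dev->pending_drive;
}
```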

Regards,

Anthony Liguori

>   As soon as an
> open/close sequence writes to the image for some format, we probably
> have a bug. For example, what about this mounted flag that you were
> discussing for QED?
>
>    
>> But if we do invalidate_cache() as you suggested with a close/open of
>> the qcow2 layer, and also acquire and release a lock in the file layer
>> by propagating the invalidate_cache(), that should work robustly with NFS.
>>
>> I think that's a simpler change.  Do you see additional advantages to
>> delaying the open?
>>      
> Just that it makes it very obvious if a device model is doing bad things
> and accessing the image before it should. The difference is a failed
> request vs. silently corrupted data.
>
> Kevin
>
>    

