From: Anthony Liguori <anthony@codemonkey.ws>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org,
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] Re: [PATCH 3/3] disk: don't read from disk until the guest starts
Date: Mon, 13 Sep 2010 09:34:35 -0500
Message-ID: <4C8E367B.8070609@codemonkey.ws>
In-Reply-To: <4C8E319A.4090103@redhat.com>
On 09/13/2010 09:13 AM, Kevin Wolf wrote:
>> I think the only real advantage is that we fix NFS migration, right?
>>
> That's the one that we know about, yes.
>
> The rest is not a specific scenario, but a strong feeling that having an
> image opened twice at the same time feels dangerous.
We've never really had clear semantics about live migration and block
driver life cycles. At a high level, for live migration to work, we
need the following sequence:
1) src> flush all pending writes to disk
2) <barrier>
3) dst> invalidate any cached data
4) dst> start guest
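
To make the ordering concrete, here is a minimal sketch of the four steps as a toy model. It is not QEMU code: `SharedStorage`, `Host`, and `migrate` are hypothetical names invented for illustration, and the per-host `cache` dict merely stands in for whatever a block driver or the underlying storage might cache.

```python
class SharedStorage:
    """Backing store visible to both the source and the destination."""
    def __init__(self):
        self.blocks = {}


class Host:
    """One side of the migration, with its own non-coherent cache (assumed)."""
    def __init__(self, storage):
        self.storage = storage
        self.pending = {}  # writes that have not reached the disk yet
        self.cache = {}    # data cached by earlier reads

    def write(self, block, data):
        self.pending[block] = data

    def flush(self):
        self.storage.blocks.update(self.pending)
        self.pending.clear()

    def read(self, block):
        if block in self.pending:
            return self.pending[block]
        if block not in self.cache:
            self.cache[block] = self.storage.blocks.get(block)
        return self.cache[block]

    def invalidate_cache(self):
        self.cache.clear()


def migrate(src, dst):
    src.flush()             # 1) src> flush all pending writes to disk
    # 2) <barrier>: dst must not proceed until the flush has completed
    dst.invalidate_cache()  # 3) dst> invalidate any cached data
    # 4) dst> start guest: the caller may now issue reads on dst
```

In this model, a destination that read block 0 before the handoff keeps returning the old contents until `invalidate_cache()` runs, which is the failure that skipping step (3) would produce on storage that is not cache coherent.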
We've gotten away with ignoring (3) because raw disks never cache
anything. But that assumes that we're on top of cache-coherent storage.
If we don't have fully cache-coherent storage, we need to do more.
We need to extend (3) to also flush the cache of the underlying
storage. There are two ways we can solve this: we can either ensure
that (3) is a nop by not having any operations that would cause caching
until after (3), or we can find a way to inject a flush into the
underlying cache.
Since the latter approach requires storage-specific knowledge, I went
with the former approach. Of course, if you did a close-open sequence at
(3), it may achieve the same goal but only really for NFS. If you have
something weaker than close-to-open coherency, you still need to do
something unique in step (3).
I don't know that I see a perfect model. Pushing reads past point (3)
is easy and fixes raw on top of NFS. I think we want to do that because
it's low-hanging fruit. A block driver hook for (3) also seems
appealing because we can make use of it easily in QED.
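
As a rough illustration of what a block driver hook for (3) could look like, here is a small sketch. The class and method names are hypothetical and are not QEMU's actual block driver interface; the sketch only assumes a format layer stacked on a file layer, with the hook propagating downward.

```python
class BlockDriver:
    """Hypothetical base class for a layered block driver stack."""
    def __init__(self, child=None):
        self.child = child  # underlying layer, e.g. a format on top of a file

    def invalidate_cache(self):
        """Hook for step (3). Default: nothing cached here, just propagate."""
        if self.child is not None:
            self.child.invalidate_cache()


class RawFile(BlockDriver):
    """Bottom layer holding the actual data; caches nothing itself."""
    def __init__(self, data):
        super().__init__()
        self.data = data


class CachingFormat(BlockDriver):
    """Format layer that caches a metadata header read from its child."""
    def __init__(self, child):
        super().__init__(child)
        self._header = None

    def header(self):
        if self._header is None:
            self._header = self.child.data["header"]
        return self._header

    def invalidate_cache(self):
        self._header = None         # drop our own cached metadata
        super().invalidate_cache()  # then let the lower layers do the same
```

A driver that caches nothing inherits the default and the hook stays a nop for it, so only formats that keep metadata in memory need to override it.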
That said, I'm open to suggestions of a better model. Delaying open
(especially if you open, then close, then open again) seems a bit hacky.
With respect to the devices, I think the question of when block devices
can begin accessing drives is orthogonal to this discussion. Even
without delaying open, we could simply not give them their
BlockDriverStates until realize() or something like that.
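
A sketch of that last idea, with made-up names (`Drive`, `Device`, and this `realize()` signature are assumptions for illustration, not existing qdev API): the device simply has no drive to touch until it is realized, so an early access fails loudly instead of reading stale data.

```python
class Drive:
    """Stand-in for a block backend; purely illustrative."""
    def __init__(self, blocks):
        self.blocks = blocks

    def read(self, n):
        return self.blocks[n]


class Device:
    """Hypothetical device model that receives its drive only at realize()."""
    def __init__(self):
        self._drive = None

    def realize(self, drive):
        self._drive = drive

    def read(self, n):
        if self._drive is None:
            # Fail the request rather than touch the image too early.
            raise RuntimeError("drive accessed before realize()")
        return self._drive.read(n)
```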
Regards,
Anthony Liguori
> As soon as an
> open/close sequence writes to the image for some format, we probably
> have a bug. For example, what about this mounted flag that you were
> discussing for QED?
>
>
>> But if we do invalidate_cache() as you suggested with a close/open of
>> the qcow2 layer, and also acquire and release a lock in the file layer
>> by propagating the invalidate_cache(), that should work robustly with NFS.
>>
>> I think that's a simpler change. Do you see additional advantages to
>> delaying the open?
>>
> Just that it makes it very obvious if a device model is doing bad things
> and accessing the image before it should. The difference is a failed
> request vs. silently corrupted data.
>
> Kevin
>
>