From mboxrd@z Thu Jan  1 00:00:00 1970
From: George Dunlap <George.Dunlap@eu.citrix.com>
Subject: Re: [PATCHv3] QEMU(upstream): Disable xen's use of
 O_DIRECT by default as it results in crashes.
Date: Tue, 19 Mar 2013 10:06:53 +0000
Message-ID: <CAFLBxZbL4JC9SBgHzehtfpcjtR5OQN90SPFSTsXRLrpTWUSBsQ__37968.7388464392$1363687757$gmane$org@mail.gmail.com>
References: <1363609123-20748-1-git-send-email-alex@alex.org.uk>
	<51471767.8030604@redhat.com>
	<7AC8953FE45335FB794B6DFE@Ximines.local>
	<51471F14.7030209@redhat.com>
	<6D0F4ACDA3B7FCF1A50F8B52@Ximines.local>
	<5147298C.8080900@redhat.com>
	<A2FA46AE3DD746AD97DC3137@Ximines.local>
	<51473E82.1020806@redhat.com>
	<861AFE1A9C44444FD8BAEE16@Ximines.local>
	<5147512E.5050501@eu.citrix.com> <5147562E.1090203@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <5147562E.1090203@redhat.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ian Campbell <Ian.Campbell@citrix.com>, Stefano Stabellini <Stefano.Stabellini@eu.citrix.com>, Ian Jackson <Ian.Jackson@eu.citrix.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, xen-devel <xen-devel@lists.xen.org>, Alex Bligh <alex@alex.org.uk>, Anthony Liguori <anthony@codemonkey.ws>
List-Id: xen-devel@lists.xenproject.org

On Mon, Mar 18, 2013 at 6:00 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 18/03/2013 18:38, George Dunlap ha scritto:
>>>>
>>> This might be a difference between Xen and KVM. On Xen migration is
>>> made to a server in a paused state, and it's only unpaused when
>>> the migration to B is complete. There's a sort of extra handshake at
>>> the end.
>>
>> I think what you mean is that all the memory is handled by Xen and the
>> toolstack, not by qemu.  The qemu state is sent as the very last thing,
>> after all of the memory, and therefore (you are arguing) that qemu is
>> not started, and the files cannot be opened, until after the migration
>> is nearly complete, and certainly until after the file is closed on the
>> sending side.
>
> That would be quite dangerous.  Files aren't closed until after QEMU
> exits; at this point whatever problem you have launching QEMU on the
> destination would be unrecoverable.

But if I understand your concern correctly, you were concerned about
the following scenario:
R1. Receiver qemu opens file
R2. Something causes receiver kernel to cache parts of file (maybe
optimistic read-ahead)
S1. Sender qemu writes to file
S2. Sender qemu does final flush
S3. Sender qemu closes file
R3. Receiver reads stale blocks from cache

Even supposing that Xen doesn't actually shut down qemu until it is
started on the remote side, as long as the file isn't opened by qemu
until after S2, we should be safe, right?  It would look like this:

S1. Sender qemu writes to file
S2. Sender qemu does final flush
R1. Receiver qemu opens file
R2. Receiver kernel caches file
S3. Sender qemu closes file
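
To make the difference between the two orderings concrete, here is a toy
model. It is only a sketch, not real NFS or qemu behaviour: the Server
and Client classes and the run() helper are made up for illustration,
the "cache" is a plain dict, and the model ignores any cache
revalidation a real NFS client may do on open.

```python
class Server:
    """Toy stand-in for the NFS server: block number -> contents."""

    def __init__(self):
        self.blocks = {0: "old"}


class Client:
    """Toy client with a write-back buffer and a read cache (hypothetical)."""

    def __init__(self, server):
        self.server = server
        self.dirty = {}   # written locally, not yet on the server
        self.cache = {}   # blocks read from the server and kept locally

    def write(self, blk, data):
        self.dirty[blk] = data

    def flush(self):
        # Push all buffered writes to the server.
        self.server.blocks.update(self.dirty)
        self.dirty.clear()

    def readahead(self, blk):
        # The kernel caches whatever the server holds right now.
        self.cache[blk] = self.server.blocks[blk]

    def read(self, blk):
        # Serve from the cache if present, otherwise fetch and cache.
        if blk not in self.cache:
            self.cache[blk] = self.server.blocks[blk]
        return self.cache[blk]


def run(order):
    """Apply the named steps in the given order; return what R3 would read."""
    server = Server()
    sender, receiver = Client(server), Client(server)
    steps = {
        "S1": lambda: sender.write(0, "new"),
        "S2": sender.flush,
        "S3": lambda: None,                   # close: no effect in this model
        "R1": lambda: None,                   # open: no effect in this model
        "R2": lambda: receiver.readahead(0),
    }
    for name in order:
        steps[name]()
    return receiver.read(0)


stale = run(["R1", "R2", "S1", "S2", "S3"])   # receiver caches before the flush
fresh = run(["S1", "S2", "R1", "R2", "S3"])   # receiver opens after the flush
```

In this model the first ordering leaves the receiver reading the old
block, and the second one gives it the new data even though the sender
has not yet closed the file.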

This is all assuming that:
1. The barrier operations / write flush are effective at getting the
data back on to the NFS server
2. The receiver qemu doesn't open the file until after the last flush
by the sender.

Number 1 has been tested by Alex, I believe, and is mentioned in the
changeset log; so if #2 holds as well, then we should be safe.  I'll try
to verify #2 today.

> Even for successful migration, it would also be bad for downtime (QEMU
> isn't exactly lightning-fast to start).  And even if failure weren't
> catastrophic, it would be a pity to transfer a few gigs of memory and
> then find out that QEMU isn't present in the destination. :)

Well, if qemu isn't present at the destination, that's definitely user
error. :-)  In any case, I know that the migrate can resume if it
fails, so I suspect that the qemu is just paused on the sending side
until the migration is known to complete.  As long as the last write
was flushed to the NFS server before the receiver opens the file, we
should be safe.
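
In terms of plain file operations, the ordering I have in mind looks
roughly like the sketch below. This is not what qemu or the toolstack
actually do; the function names are invented, and it runs against a
local temporary file, so it only shows the order of the calls (write,
fsync, then open on the other side, with the sender's close coming
last), not NFS caching itself.

```python
import os
import tempfile


def sender_write_and_flush(path, data):
    """S1 + S2: write, then flush. The fd is returned still open (S3 is later)."""
    fd = os.open(path, os.O_WRONLY)
    view = memoryview(data)
    while view:
        # os.write may write fewer bytes than requested, so loop.
        written = os.write(fd, view)
        view = view[written:]
    os.fsync(fd)  # ask the OS to push the data to stable storage
    return fd


def receiver_open_and_read(path, length):
    """R1 onwards: open only after the sender's flush has completed."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, length)
    finally:
        os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "disk.img")  # hypothetical image name
        with open(path, "wb") as f:
            f.write(b"old")
        sender_fd = sender_write_and_flush(path, b"new")   # S1, S2
        seen = receiver_open_and_read(path, 3)             # R1, read
        os.close(sender_fd)                                # S3, after the receiver opened
        return seen
```

The point is only that the sender's close can come after the receiver's
open, provided the flush came before it.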

> Still, it's more than possible that I've forgotten something about Xen's
> management of QEMU.

And unfortunately I am not intimately familiar with that codepath; it
just happens that I'm the last person to have to dig into that code
and fix something. :-)

 -George