* [xen-unstable test] 13394: regressions - FAIL
@ 2012-06-29 10:41 xen.org
2012-06-29 11:20 ` Ian Jackson
0 siblings, 1 reply; 16+ messages in thread
From: xen.org @ 2012-06-29 10:41 UTC (permalink / raw)
To: xen-devel; +Cc: ian.jackson
flight 13394 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/13394/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379
test-i386-i386-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379
test-amd64-amd64-xl-qemuu-win7-amd64 9 guest-localmigrate fail REGR. vs. 13376
Regressions which are regarded as allowable (not blocking):
test-amd64-amd64-xl-sedf-pin 10 guest-saverestore fail REGR. vs. 13379
Tests which did not succeed, but are not blocking:
test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass
test-amd64-i386-qemuu-rhel6hvm-amd 11 leak-check/check fail never pass
test-amd64-i386-rhel6hvm-amd 11 leak-check/check fail never pass
test-amd64-i386-qemuu-rhel6hvm-intel 11 leak-check/check fail never pass
test-amd64-i386-rhel6hvm-intel 11 leak-check/check fail never pass
test-amd64-i386-xend-winxpsp3 16 leak-check/check fail never pass
test-amd64-amd64-win 16 leak-check/check fail never pass
test-amd64-i386-win-vcpus1 16 leak-check/check fail never pass
test-amd64-i386-win 16 leak-check/check fail never pass
test-i386-i386-win 16 leak-check/check fail never pass
test-amd64-amd64-xl-win7-amd64 13 guest-stop fail never pass
test-amd64-amd64-xl-win 13 guest-stop fail never pass
test-amd64-i386-xl-win7-amd64 13 guest-stop fail never pass
test-i386-i386-xl-win 13 guest-stop fail never pass
test-amd64-i386-xl-win-vcpus1 13 guest-stop fail never pass
test-amd64-amd64-xl-winxpsp3 13 guest-stop fail never pass
test-i386-i386-xl-winxpsp3 13 guest-stop fail never pass
test-amd64-i386-xl-winxpsp3-vcpus1 13 guest-stop fail never pass
version targeted for testing:
xen 0455d8317631
baseline version:
xen 4f92bdf3370c
------------------------------------------------------------
People who touched revisions under test:
Andres Lagar-Cavilla <andres@lagarcavilla.org>
Dario Faggioli <dario.faggioli@citrix.com>
Dario Faggioli <raistlin@linux.it>
George Dunlap <george.dunlap@eu.citrix.com>
Ian Campbell <ian.campbell@citrix.com>
Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich <jbeulich@suse.com>
Keir Fraser <keir@xen.org>
Roger Pau Monne <roger.pau@entel.upc.edu>
Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tim Deegan <tim@xen.org>
Yang Zhang <yang.z.zhang@Intel.com>
------------------------------------------------------------
jobs:
build-amd64 pass
build-i386 pass
build-amd64-oldkern pass
build-i386-oldkern pass
build-amd64-pvops pass
build-i386-pvops pass
test-amd64-amd64-xl pass
test-amd64-i386-xl pass
test-i386-i386-xl pass
test-amd64-i386-rhel6hvm-amd fail
test-amd64-i386-qemuu-rhel6hvm-amd fail
test-amd64-amd64-xl-qemuu-win7-amd64 fail
test-amd64-amd64-xl-win7-amd64 fail
test-amd64-i386-xl-win7-amd64 fail
test-amd64-i386-xl-credit2 pass
test-amd64-amd64-xl-pcipt-intel fail
test-amd64-i386-rhel6hvm-intel fail
test-amd64-i386-qemuu-rhel6hvm-intel fail
test-amd64-i386-xl-multivcpu pass
test-amd64-amd64-pair pass
test-amd64-i386-pair pass
test-i386-i386-pair pass
test-amd64-amd64-xl-sedf-pin fail
test-amd64-amd64-pv pass
test-amd64-i386-pv pass
test-i386-i386-pv pass
test-amd64-amd64-xl-sedf pass
test-amd64-i386-win-vcpus1 fail
test-amd64-i386-xl-win-vcpus1 fail
test-amd64-i386-xl-winxpsp3-vcpus1 fail
test-amd64-amd64-win fail
test-amd64-i386-win fail
test-i386-i386-win fail
test-amd64-amd64-xl-win fail
test-i386-i386-xl-win fail
test-amd64-amd64-xl-qemuu-winxpsp3 fail
test-i386-i386-xl-qemuu-winxpsp3 fail
test-amd64-i386-xend-winxpsp3 fail
test-amd64-amd64-xl-winxpsp3 fail
test-i386-i386-xl-winxpsp3 fail
------------------------------------------------------------
sg-report-flight on woking.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images
Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs
Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary
Not pushing.
(No revision log; it would be 863 lines long.)
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [xen-unstable test] 13394: regressions - FAIL
2012-06-29 10:41 [xen-unstable test] 13394: regressions - FAIL xen.org
@ 2012-06-29 11:20 ` Ian Jackson
2012-06-29 11:47 ` Ian Campbell
0 siblings, 1 reply; 16+ messages in thread
From: Ian Jackson @ 2012-06-29 11:20 UTC (permalink / raw)
To: xen-devel; +Cc: Anthony Perard, Ian Campbell, Stefano Stabellini
xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"):
> Tests which did not succeed and are blocking,
> including tests which could not be run:
> test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379
The logs show this:
libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out
And in xenstore:
/local/domain/0/device-model/5/logdirty/cmd = "enable" (n0)
And in the source code:
$ grep -R logdirty qemu-upstream-unstable.git/*
$
So the upstream qemu does not participate properly in the migration
protocol. And anyway this protocol seems to involve xenstore and I
would have expected it to do something with QMP. But there is no code
in libxl to do this (and never has been) and no code in upstream qemu
to do it either.
That means we'll get memory corruption in migrated guests with the new
qemu: any time qemu writes to guest memory, it needs to trigger a
logdirty update so that the write is properly transferred to the
migration target domain.
With the old libxl we didn't notice this apart from random failures.
With my new migration code, particularly
25542:1883e5c71a87
libxl: wait for qemu to acknowledge logdirty command
this turns into a hard failure.
I will add this as an allowable test failure pending a proper fix.
Ian.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [xen-unstable test] 13394: regressions - FAIL
2012-06-29 11:20 ` Ian Jackson
@ 2012-06-29 11:47 ` Ian Campbell
2012-06-29 12:29 ` Stefano Stabellini
0 siblings, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2012-06-29 11:47 UTC (permalink / raw)
To: Ian Jackson; +Cc: Anthony Perard, xen-devel, Stefano Stabellini
On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote:
> xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"):
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379
>
> The logs show this:
>
> libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out
>
> And in xenstore:
>
> /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0)
>
> And in the source code:
>
> $ grep -R logdirty qemu-upstream-unstable.git/*
> $
>
> So the upstream qemu does not participate properly in the migration
> protocol. And anyway this protocol seems to involve xenstore and I
> would have expected it to do something with QMP. But there is no code
> in libxl to do this (and never has been) and no code in upstream qemu
> to do it either.
>
> That means we'll get memory corruption in migrated guests with the new
> qemu: any time qemu writes to guest memory, it needs to trigger a
> logdirty update so that the write is properly transferred to the
> migration target domain.
>
> With the old libxl we didn't notice this apart from random failures.
> With my new migration code, particularly
> 25542:1883e5c71a87
> libxl: wait for qemu to acknowledge logdirty command
> this turns into a hard failure.
>
> I will add this as an allowable test failure pending a proper fix.
Thanks for investigating. It does appear that this has always been
broken.
Do we think this is a blocker for 4.2?
It certainly prevents us from suggesting that we support HVM migration
with the (non-default) upstream qemu.
If we can't fix this for 4.2 (e.g. because we need to get patches into
upstream qemu or because the libxl side is too involved) we should at a
minimum make libxl reject attempts to migrate such domains with an
appropriate error message.
How does this impact the use of upstream qemu for PV guest backends vs
migration? I *think* they don't require log-dirty support, but I'm not
sure.
Ian.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [xen-unstable test] 13394: regressions - FAIL
2012-06-29 11:47 ` Ian Campbell
@ 2012-06-29 12:29 ` Stefano Stabellini
2012-06-29 12:32 ` Ian Campbell
0 siblings, 1 reply; 16+ messages in thread
From: Stefano Stabellini @ 2012-06-29 12:29 UTC (permalink / raw)
To: Ian Campbell; +Cc: Anthony Perard, xen-devel, Ian Jackson, Stefano Stabellini
On Fri, 29 Jun 2012, Ian Campbell wrote:
> On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote:
> > xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"):
> > > Tests which did not succeed and are blocking,
> > > including tests which could not be run:
> > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379
> >
> > The logs show this:
> >
> > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out
> >
> > And in xenstore:
> >
> > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0)
> >
> > And in the source code:
> >
> > $ grep -R logdirty qemu-upstream-unstable.git/*
> > $
> >
> > So the upstream qemu does not participate properly in the migration
> > protocol. And anyway this protocol seems to involve xenstore and I
> > would have expected it to do something with QMP. But there is no code
> > in libxl to do this (and never has been) and no code in upstream qemu
> > to do it either.
> >
> > That means we'll get memory corruption in migrated guests with the new
> > qemu: any time qemu writes to guest memory, it needs to trigger a
> > logdirty update so that the write is properly transferred to the
> > migration target domain.
> >
> > With the old libxl we didn't notice this apart from random failures.
> > With my new migration code, particularly
> > 25542:1883e5c71a87
> > libxl: wait for qemu to acknowledge logdirty command
> > this turns into a hard failure.
> >
> > I will add this as an allowable test failure pending a proper fix.
>
> Thanks for investigating. It does appear that this has always been
> broken.
>
> Do we think this is a blocker for 4.2?
I wouldn't consider it a blocker, given that upstream QEMU is not the
default for HVM guests.
> It certainly prevents us from suggesting that we support HVM migration
> with the (non-default) upstream qemu.
>
> If we can't fix this for 4.2 (e.g. because we need to get patches into
> upstream qemu or because the libxl side is too involved) we should at a
> minimum make libxl reject attempts to migrate such domains with an
> appropriate error message.
We do need to get patches in QEMU to fix this but we could backport them in
qemu-upstream-unstable (and ask for backports to the stable trees).
> How does this impact the use of upstream qemu for PV guest backends vs
> migration? I *think* they don't require log-dirty support, but I'm not
> sure.
It does not affect qemu for PV guests.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [xen-unstable test] 13394: regressions - FAIL
2012-06-29 12:29 ` Stefano Stabellini
@ 2012-06-29 12:32 ` Ian Campbell
2012-06-29 12:36 ` Stefano Stabellini
0 siblings, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2012-06-29 12:32 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Anthony Perard, xen-devel, Ian Jackson
On Fri, 2012-06-29 at 13:29 +0100, Stefano Stabellini wrote:
> On Fri, 29 Jun 2012, Ian Campbell wrote:
> > On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote:
> > > xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"):
> > > > Tests which did not succeed and are blocking,
> > > > including tests which could not be run:
> > > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379
> > >
> > > The logs show this:
> > >
> > > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out
> > >
> > > And in xenstore:
> > >
> > > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0)
> > >
> > > And in the source code:
> > >
> > > $ grep -R logdirty qemu-upstream-unstable.git/*
> > > $
> > >
> > > So the upstream qemu does not participate properly in the migration
> > > protocol. And anyway this protocol seems to involve xenstore and I
> > > would have expected it to do something with QMP. But there is no code
> > > in libxl to do this (and never has been) and no code in upstream qemu
> > > to do it either.
> > >
> > > That means we'll get memory corruption in migrated guests with the new
> > > qemu: any time qemu writes to guest memory, it needs to trigger a
> > > logdirty update so that the write is properly transferred to the
> > > migration target domain.
> > >
> > > With the old libxl we didn't notice this apart from random failures.
> > > With my new migration code, particularly
> > > 25542:1883e5c71a87
> > > libxl: wait for qemu to acknowledge logdirty command
> > > this turns into a hard failure.
> > >
> > > I will add this as an allowable test failure pending a proper fix.
> >
> > Thanks for investigating. It does appear that this has always been
> > broken.
> >
> > Do we think this is a blocker for 4.2?
>
> I wouldn't consider it a blocker, given that upstream QEMU is not the
> default for HVM guests.
>
>
> > It certainly prevents us from suggesting that we support HVM migration
> > with the (non-default) upstream qemu.
> >
> > If we can't fix this for 4.2 (e.g. because we need to get patches into
> > upstream qemu or because the libxl side is too involved) we should at a
> > minimum make libxl reject attempts to migrate such domains with an
> > appropriate error message.
>
> We do need to get patches in QEMU to fix this but we could backport them in
> qemu-upstream-unstable (and ask for backports to the stable trees).
Can we do that in time for 4.2? It's pretty late in the day.
I think we need to consider either achieving this or adding the
appropriate error message as a blocker. Hopefully the former but falling
back to the latter if it comes to it.
> > How does this impact the use of upstream qemu for PV guest backends vs
> > migration? I *think* they don't require log-dirty support, but I'm not
> > sure.
>
> It does not affect qemu for PV guests.
Great.
Ian.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [xen-unstable test] 13394: regressions - FAIL
2012-06-29 12:32 ` Ian Campbell
@ 2012-06-29 12:36 ` Stefano Stabellini
2012-06-29 12:45 ` Ian Campbell
2012-06-29 16:01 ` [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen Ian Campbell
0 siblings, 2 replies; 16+ messages in thread
From: Stefano Stabellini @ 2012-06-29 12:36 UTC (permalink / raw)
To: Ian Campbell; +Cc: Anthony Perard, xen-devel, Ian Jackson, Stefano Stabellini
On Fri, 29 Jun 2012, Ian Campbell wrote:
> On Fri, 2012-06-29 at 13:29 +0100, Stefano Stabellini wrote:
> > On Fri, 29 Jun 2012, Ian Campbell wrote:
> > > On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote:
> > > > xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"):
> > > > > Tests which did not succeed and are blocking,
> > > > > including tests which could not be run:
> > > > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379
> > > >
> > > > The logs show this:
> > > >
> > > > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out
> > > >
> > > > And in xenstore:
> > > >
> > > > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0)
> > > >
> > > > And in the source code:
> > > >
> > > > $ grep -R logdirty qemu-upstream-unstable.git/*
> > > > $
> > > >
> > > > So the upstream qemu does not participate properly in the migration
> > > > protocol. And anyway this protocol seems to involve xenstore and I
> > > > would have expected it to do something with QMP. But there is no code
> > > > in libxl to do this (and never has been) and no code in upstream qemu
> > > > to do it either.
> > > >
> > > > That means we'll get memory corruption in migrated guests with the new
> > > > qemu: any time qemu writes to guest memory, it needs to trigger a
> > > > logdirty update so that the write is properly transferred to the
> > > > migration target domain.
> > > >
> > > > With the old libxl we didn't notice this apart from random failures.
> > > > With my new migration code, particularly
> > > > 25542:1883e5c71a87
> > > > libxl: wait for qemu to acknowledge logdirty command
> > > > this turns into a hard failure.
> > > >
> > > > I will add this as an allowable test failure pending a proper fix.
> > >
> > > Thanks for investigating. It does appear that this has always been
> > > broken.
> > >
> > > Do we think this is a blocker for 4.2?
> >
> > I wouldn't consider it a blocker, given that upstream QEMU is not the
> > default for HVM guests.
> >
> >
> > > It certainly prevents us from suggesting that we support HVM migration
> > > with the (non-default) upstream qemu.
> > >
> > > If we can't fix this for 4.2 (e.g. because we need to get patches into
> > > upstream qemu or because the libxl side is too involved) we should at a
> > > minimum make libxl reject attempts to migrate such domains with an
> > > appropriate error message.
> >
> > We do need to get patches in QEMU to fix this but we could backport them in
> > qemu-upstream-unstable (and ask for backports to the stable trees).
>
> Can we do that in time for 4.2? It's pretty late in the day.
>
> I think we need to consider either achieving this or adding the
> appropriate error message as a blocker. Hopefully the former but falling
> back to the latter if it comes to it.
I think we should add an appropriate error message as a blocker. We
should also try to fix this on the QEMU side, but given that the QEMU
1.0 stable tree is pretty much unmaintaned, we won't be able to backport
the fix there, so we cannot be sure that a distro will end up with a
QEMU with or without the fix.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [xen-unstable test] 13394: regressions - FAIL
2012-06-29 12:36 ` Stefano Stabellini
@ 2012-06-29 12:45 ` Ian Campbell
2012-06-29 16:01 ` [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen Ian Campbell
1 sibling, 0 replies; 16+ messages in thread
From: Ian Campbell @ 2012-06-29 12:45 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Anthony Perard, xen-devel, Ian Jackson
On Fri, 2012-06-29 at 13:36 +0100, Stefano Stabellini wrote:
> On Fri, 29 Jun 2012, Ian Campbell wrote:
> > On Fri, 2012-06-29 at 13:29 +0100, Stefano Stabellini wrote:
> > > On Fri, 29 Jun 2012, Ian Campbell wrote:
> > > > On Fri, 2012-06-29 at 12:20 +0100, Ian Jackson wrote:
> > > > > xen.org writes ("[xen-unstable test] 13394: regressions - FAIL"):
> > > > > > Tests which did not succeed and are blocking,
> > > > > > including tests which could not be run:
> > > > > > test-amd64-amd64-xl-qemuu-winxpsp3 9 guest-localmigrate fail REGR. vs. 13379
> > > > >
> > > > > The logs show this:
> > > > >
> > > > > libxl: error: libxl_dom.c:632:switch_logdirty_timeout: logdirty switch: wait for device model timed out
> > > > >
> > > > > And in xenstore:
> > > > >
> > > > > /local/domain/0/device-model/5/logdirty/cmd = "enable" (n0)
> > > > >
> > > > > And in the source code:
> > > > >
> > > > > $ grep -R logdirty qemu-upstream-unstable.git/*
> > > > > $
> > > > >
> > > > > So the upstream qemu does not participate properly in the migration
> > > > > protocol. And anyway this protocol seems to involve xenstore and I
> > > > > would have expected it to do something with QMP. But there is no code
> > > > > in libxl to do this (and never has been) and no code in upstream qemu
> > > > > to do it either.
> > > > >
> > > > > That means we'll get memory corruption in migrated guests with the new
> > > > > qemu: any time qemu writes to guest memory, it needs to trigger a
> > > > > logdirty update so that the write is properly transferred to the
> > > > > migration target domain.
> > > > >
> > > > > With the old libxl we didn't notice this apart from random failures.
> > > > > With my new migration code, particularly
> > > > > 25542:1883e5c71a87
> > > > > libxl: wait for qemu to acknowledge logdirty command
> > > > > this turns into a hard failure.
> > > > >
> > > > > I will add this as an allowable test failure pending a proper fix.
> > > >
> > > > Thanks for investigating. It does appear that this has always been
> > > > broken.
> > > >
> > > > Do we think this is a blocker for 4.2?
> > >
> > > I wouldn't consider it a blocker, given that upstream QEMU is not the
> > > default for HVM guests.
> > >
> > >
> > > > It certainly prevents us from suggesting that we support HVM migration
> > > > with the (non-default) upstream qemu.
> > > >
> > > > If we can't fix this for 4.2 (e.g. because we need to get patches into
> > > > upstream qemu or because the libxl side is too involved) we should at a
> > > > minimum make libxl reject attempts to migrate such domains with an
> > > > appropriate error message.
> > >
> > > We do need to get patches in QEMU to fix this but we could backport them in
> > > qemu-upstream-unstable (and ask for backports to the stable trees).
> >
> > Can we do that in time for 4.2? It's pretty late in the day.
> >
> > I think we need to consider either achieving this or adding the
> > appropriate error message as a blocker. Hopefully the former but falling
> > back to the latter if it comes to it.
>
> I think we should add an appropriate error message as a blocker. We
> should also try to fix this on the QEMU side, but given that the QEMU
> 1.0 stable tree is pretty much unmaintaned, we won't be able to backport
> the fix there, so we cannot be sure that a distro will end up with a
> QEMU with or without the fix.
Agreed. I'll add this to the 4.2 TODO list.
If we can get support into the mainline qemu then as a stretch goal we
can consider whether libxl can auto detect the availability of the
feature and react accordingly.
Ian.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
2012-06-29 12:36 ` Stefano Stabellini
2012-06-29 12:45 ` Ian Campbell
@ 2012-06-29 16:01 ` Ian Campbell
2012-06-29 16:03 ` Ian Jackson
1 sibling, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2012-06-29 16:01 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Anthony Perard, xen-devel, Ian Jackson
> I think we should add an appropriate error message as a blocker. We
> should also try to fix this on the QEMU side, but given that the QEMU
> 1.0 stable tree is pretty much unmaintaned, we won't be able to backport
> the fix there, so we cannot be sure that a distro will end up with a
> QEMU with or without the fix.
How about this for the time being, we can always revert or enhance as
necessary before 4.2.
8<----------------------
# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1340985595 -3600
# Node ID a0c1c8c585e03279f9ea149bc2951d0df900717e
# Parent d849ca2ef197dbf0e731aa4726da0eb3e2801280
libxl: refuse to try and migrate an HVM guest using qemu-xen
libxl/qemu-upstream currently do not collude together to enable log-dirty mode
and therefore migrations are unsafe. Refuse to even try for now.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
diff -r d849ca2ef197 -r a0c1c8c585e0 tools/libxl/libxl.c
--- a/tools/libxl/libxl.c Fri Jun 29 15:57:28 2012 +0100
+++ b/tools/libxl/libxl.c Fri Jun 29 16:59:55 2012 +0100
@@ -746,6 +746,17 @@ int libxl_domain_suspend(libxl_ctx *ctx,
goto out_err;
}
+ libxl_device_model_version dm =
+ libxl__device_model_version_running(gc, domid);
+ if (type == LIBXL_DOMAIN_TYPE_HVM &&
+ dm == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN &&
+ flags & LIBXL_SUSPEND_LIVE) {
+ LOG(ERROR, "cannot live migrate HVM domains with qemu-xen device-model");
+ rc = ERROR_FAIL;
+ goto out_err;
+
+ }
+
libxl__domain_suspend_state *dss;
GCNEW(dss);
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
2012-06-29 16:01 ` [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen Ian Campbell
@ 2012-06-29 16:03 ` Ian Jackson
2012-07-02 11:19 ` Ian Campbell
0 siblings, 1 reply; 16+ messages in thread
From: Ian Jackson @ 2012-06-29 16:03 UTC (permalink / raw)
To: Ian Campbell; +Cc: Anthony Perard, xen-devel, Stefano Stabellini
Ian Campbell writes ("[PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):
> > I think we should add an appropriate error message as a blocker. We
> > should also try to fix this on the QEMU side, but given that the QEMU
> > 1.0 stable tree is pretty much unmaintaned, we won't be able to backport
> > the fix there, so we cannot be sure that a distro will end up with a
> > QEMU with or without the fix.
>
> How about this for the time being, we can always revert or enhance as
> necessary before 4.2.
...
> + libxl_device_model_version dm =
> + libxl__device_model_version_running(gc, domid);
libxl__device_model_version_running can return -1. (Its error
handling is pretty bad but that's no excuse for making matters worse.)
I suggest we retcon it as returning a libxl error code.
Ian.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
2012-06-29 16:03 ` Ian Jackson
@ 2012-07-02 11:19 ` Ian Campbell
2012-07-23 10:52 ` Ian Jackson
0 siblings, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2012-07-02 11:19 UTC (permalink / raw)
To: Ian Jackson; +Cc: Anthony Perard, xen-devel, Stefano Stabellini
On Fri, 2012-06-29 at 17:03 +0100, Ian Jackson wrote:
> Ian Campbell writes ("[PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):
> > > I think we should add an appropriate error message as a blocker. We
> > > should also try to fix this on the QEMU side, but given that the QEMU
> > > 1.0 stable tree is pretty much unmaintaned, we won't be able to backport
> > > the fix there, so we cannot be sure that a distro will end up with a
> > > QEMU with or without the fix.
> >
> > How about this for the time being, we can always revert or enhance as
> > necessary before 4.2.
> ...
> > + libxl_device_model_version dm =
> > + libxl__device_model_version_running(gc, domid);
>
> libxl__device_model_version_running can return -1. (Its error
> handling is pretty bad but that's no excuse for making matters worse.)
> I suggest we retcon it as returning a libxl error code.
the current callers all do something like
switch (libxl__device_model_version_running(gc, domid)) {
case LIBXL_DEVICE_MODEL...
...
default: return ERROR_INVAL;
}
and would need some rejigging to support a switch to ERROR_*, and
associated chance of triggering gcc's switch(enum) warnings.
I think fixing this properly can be left to 4.3 and for now I'll just
follow this same pattern.
Ian.
# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1341227605 -3600
# Node ID c6a2abdee84d5ae0692728dc169327314991090a
# Parent d29a88850da26424b52520a617ca449d49124143
libxl: refuse to try and migrate an HVM guest using qemu-xen
libxl/qemu-upstream currently do not collude together to enable log-dirty mode
and therefore migrations are unsafe. Refuse to even try for now.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
diff -r d29a88850da2 -r c6a2abdee84d tools/libxl/libxl.c
--- a/tools/libxl/libxl.c Mon Jul 02 12:06:22 2012 +0100
+++ b/tools/libxl/libxl.c Mon Jul 02 12:13:25 2012 +0100
@@ -746,6 +746,22 @@ int libxl_domain_suspend(libxl_ctx *ctx,
goto out_err;
}
+ if (type == LIBXL_DOMAIN_TYPE_HVM && flags & LIBXL_SUSPEND_LIVE) {
+ switch (libxl__device_model_version_running(gc, domid)) {
+ case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+ LOG(ERROR,
+ "cannot live migrate HVM domains with qemu-xen device-model");
+ rc = ERROR_FAIL;
+ goto out_err;
+ case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+ /* No problem */
+ break;
+ default:
+ rc = ERROR_FAIL;
+ goto out_err;
+ }
+ }
+
libxl__domain_suspend_state *dss;
GCNEW(dss);
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
2012-07-02 11:19 ` Ian Campbell
@ 2012-07-23 10:52 ` Ian Jackson
2012-07-23 10:58 ` Ian Campbell
0 siblings, 1 reply; 16+ messages in thread
From: Ian Jackson @ 2012-07-23 10:52 UTC (permalink / raw)
To: Ian Campbell; +Cc: Anthony Perard, xen-devel, Stefano Stabellini
Ian Campbell writes ("Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):
> diff -r d29a88850da2 -r c6a2abdee84d tools/libxl/libxl.c
> --- a/tools/libxl/libxl.c Mon Jul 02 12:06:22 2012 +0100
> +++ b/tools/libxl/libxl.c Mon Jul 02 12:13:25 2012 +0100
> @@ -746,6 +746,22 @@ int libxl_domain_suspend(libxl_ctx *ctx,
...
> + switch (libxl__device_model_version_running(gc, domid)) {
> + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> + LOG(ERROR,
> + "cannot live migrate HVM domains with qemu-xen device-model");
> + rc = ERROR_FAIL;
> + goto out_err;
> + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> + /* No problem */
> + break;
> + default:
> + rc = ERROR_FAIL;
> + goto out_err;
> + }
This last case should abort() or log something or not exist.
Ian.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
2012-07-23 10:52 ` Ian Jackson
@ 2012-07-23 10:58 ` Ian Campbell
2012-07-23 11:18 ` Ian Campbell
0 siblings, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2012-07-23 10:58 UTC (permalink / raw)
To: Ian Jackson; +Cc: Anthony Perard, xen-devel, Stefano Stabellini
On Mon, 2012-07-23 at 11:52 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):
> > diff -r d29a88850da2 -r c6a2abdee84d tools/libxl/libxl.c
> > --- a/tools/libxl/libxl.c Mon Jul 02 12:06:22 2012 +0100
> > +++ b/tools/libxl/libxl.c Mon Jul 02 12:13:25 2012 +0100
> > @@ -746,6 +746,22 @@ int libxl_domain_suspend(libxl_ctx *ctx,
> ...
> > + switch (libxl__device_model_version_running(gc, domid)) {
> > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> > + LOG(ERROR,
> > + "cannot live migrate HVM domains with qemu-xen device-model");
> > + rc = ERROR_FAIL;
> > + goto out_err;
> > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> > + /* No problem */
> > + break;
> > + default:
> > + rc = ERROR_FAIL;
> > + goto out_err;
> > + }
>
> This last case should abort() or log something or not exist.
I think
case -1:
rc = ERROR_FAIL
goto out_err;
default: abort()
would be OK for 4.2. libxl__device_model_version_running either returns
-1 (and logs) or a valid LIBXL_DEVICE_MODEL_VERSION_* so the only thing
to catch is a new LIBXL_DEVICE_MODEL_VERSION_*, in which case an abort()
seems appropriate since it would be a bug to not handle it here.
Possibly -1 ought to become LIBXL_DEVICE_MODEL_VERSION_INVALID too
Ian.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
2012-07-23 10:58 ` Ian Campbell
@ 2012-07-23 11:18 ` Ian Campbell
2012-07-24 10:30 ` Ian Jackson
0 siblings, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2012-07-23 11:18 UTC (permalink / raw)
To: Ian Jackson; +Cc: Anthony Perard, xen-devel, Stefano Stabellini
On Mon, 2012-07-23 at 11:58 +0100, Ian Campbell wrote:
> On Mon, 2012-07-23 at 11:52 +0100, Ian Jackson wrote:
> > Ian Campbell writes ("Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):
> > > diff -r d29a88850da2 -r c6a2abdee84d tools/libxl/libxl.c
> > > --- a/tools/libxl/libxl.c Mon Jul 02 12:06:22 2012 +0100
> > > +++ b/tools/libxl/libxl.c Mon Jul 02 12:13:25 2012 +0100
> > > @@ -746,6 +746,22 @@ int libxl_domain_suspend(libxl_ctx *ctx,
> > ...
> > > + switch (libxl__device_model_version_running(gc, domid)) {
> > > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
> > > + LOG(ERROR,
> > > + "cannot live migrate HVM domains with qemu-xen device-model");
> > > + rc = ERROR_FAIL;
> > > + goto out_err;
> > > + case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
> > > + /* No problem */
> > > + break;
> > > + default:
> > > + rc = ERROR_FAIL;
> > > + goto out_err;
> > > + }
> >
> > This last case should abort() or log something or not exist.
>
> I think
> case -1:
> rc = ERROR_FAIL
> goto out_err;
> default: abort()
>
> would be OK for 4.2. libxl__device_model_version_running either returns
> -1 (and logs) or a valid LIBXL_DEVICE_MODEL_VERSION_* so the only thing
> to catch is a new LIBXL_DEVICE_MODEL_VERSION_*, in which case an abort()
> seems appropriate since it would be a bug to not handle it here.
>
> Possibly -1 ought to become LIBXL_DEVICE_MODEL_VERSION_INVALID too
Actually I think -1 == INVALID would be wrong, the -1 here really means
something went wrong internally to libxl. Anyway, I remember now that
the reason I did it this was was to avoid the need to rewrite the other
callsites for 4.2. In 4.3 we should make
libxl__device_model_version_running return ERROR_foo or something valid,
and perhaps add 0 == UNKNOWN as a member of the enum.
For now I think this will do:
8<------------------------------------------------
# HG changeset patch
# User Ian Campbell <ian.campbell@citrix.com>
# Date 1343042178 -3600
# Node ID 11b539fb8400f163a871145aa20d7b4b2967e5b4
# Parent 21b166ed1e8765b75a2e90caa9e0e156ea4010d9
libxl: refuse to try and migrate an HVM guest using qemu-xen
libxl/qemu-upstream currently do not collude together to enable log-dirty mode
and therefore migrations are unsafe. Refuse to even try for now.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
diff -r 21b166ed1e87 -r 11b539fb8400 tools/libxl/libxl.c
--- a/tools/libxl/libxl.c Mon Jul 23 12:09:53 2012 +0100
+++ b/tools/libxl/libxl.c Mon Jul 23 12:16:18 2012 +0100
@@ -746,6 +746,23 @@ int libxl_domain_suspend(libxl_ctx *ctx,
goto out_err;
}
+ if (type == LIBXL_DOMAIN_TYPE_HVM && flags & LIBXL_SUSPEND_LIVE) {
+ switch (libxl__device_model_version_running(gc, domid)) {
+ case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+ LOG(ERROR,
+ "cannot live migrate HVM domains with qemu-xen device-model");
+ rc = ERROR_FAIL;
+ goto out_err;
+ case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+ /* No problem */
+ break;
+ case -1:
+ rc = ERROR_FAIL;
+ goto out_err;
+ default: abort();
+ }
+ }
+
libxl__domain_suspend_state *dss;
GCNEW(dss);
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
2012-07-23 11:18 ` Ian Campbell
@ 2012-07-24 10:30 ` Ian Jackson
2012-07-25 16:45 ` Ian Campbell
0 siblings, 1 reply; 16+ messages in thread
From: Ian Jackson @ 2012-07-24 10:30 UTC (permalink / raw)
To: Ian Campbell; +Cc: Anthony Perard, xen-devel, Stefano Stabellini
Ian Campbell writes ("Re: [Xen-devel] [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):
> Actually I think -1 == INVALID would be wrong, the -1 here really means
> something went wrong internally to libxl. Anyway, I remember now that
> the reason I did it this was was to avoid the need to rewrite the other
> callsites for 4.2. In 4.3 we should make
> libxl__device_model_version_running return ERROR_foo or something valid,
> and perhaps add 0 == UNKNOWN as a member of the enum.
Yes.
> For now I think this will do:
..
> libxl: refuse to try and migrate an HVM guest using qemu-xen
>
> libxl/qemu-upstream currently do not collude together to enable log-dirty mode
> and therefore migrations are unsafe. Refuse to even try for now.
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Thanks,
Ian.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
2012-07-24 10:30 ` Ian Jackson
@ 2012-07-25 16:45 ` Ian Campbell
2012-07-25 16:56 ` Ian Campbell
0 siblings, 1 reply; 16+ messages in thread
From: Ian Campbell @ 2012-07-25 16:45 UTC (permalink / raw)
To: Ian Jackson; +Cc: Anthony Perard, xen-devel, Stefano Stabellini
On Tue, 2012-07-24 at 11:30 +0100, Ian Jackson wrote:
> Ian Campbell writes ("Re: [Xen-devel] [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):
> > Actually I think -1 == INVALID would be wrong, the -1 here really means
> > something went wrong internally to libxl. Anyway, I remember now that
> > the reason I did it this was was to avoid the need to rewrite the other
> > callsites for 4.2. In 4.3 we should make
> > libxl__device_model_version_running return ERROR_foo or something valid,
> > and perhaps add 0 == UNKNOWN as a member of the enum.
>
> Yes.
>
> > For now I think this will do:
> ..
> > libxl: refuse to try and migrate an HVM guest using qemu-xen
> >
> > libxl/qemu-upstream currently do not collude together to enable log-dirty mode
> > and therefore migrations are unsafe. Refuse to even try for now.
>
> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Applied, thanks.
>
> Thanks,
> Ian.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen
2012-07-25 16:45 ` Ian Campbell
@ 2012-07-25 16:56 ` Ian Campbell
0 siblings, 0 replies; 16+ messages in thread
From: Ian Campbell @ 2012-07-25 16:56 UTC (permalink / raw)
To: Ian Jackson; +Cc: Anthony Perard, xen-devel, Stefano Stabellini
On Wed, 2012-07-25 at 17:45 +0100, Ian Campbell wrote:
> On Tue, 2012-07-24 at 11:30 +0100, Ian Jackson wrote:
> > Ian Campbell writes ("Re: [Xen-devel] [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen"):
> > > Actually I think -1 == INVALID would be wrong, the -1 here really means
> > > something went wrong internally to libxl. Anyway, I remember now that
> > > the reason I did it this was was to avoid the need to rewrite the other
> > > callsites for 4.2. In 4.3 we should make
> > > libxl__device_model_version_running return ERROR_foo or something valid,
> > > and perhaps add 0 == UNKNOWN as a member of the enum.
> >
> > Yes.
> >
> > > For now I think this will do:
> > ..
> > > libxl: refuse to try and migrate an HVM guest using qemu-xen
> > >
> > > libxl/qemu-upstream currently do not collude together to enable log-dirty mode
> > > and therefore migrations are unsafe. Refuse to even try for now.
> >
> > Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
>
> Applied, thanks.
Right after I pushed I noticed that gcc 4.6.2 (which is not the one I
typically use for commit testing) fails with:
libxl.c: In function ‘libxl_domain_suspend’:
libxl.c:778:9: error: case value ‘4294967295’ not in enumerated type ‘libxl_device_model_version’ [-Werror=switch]
Whereas 4.4.5 (which is my typical commit test and day to day gcc) does
not.
I have pushed the following additional patch to fix this up, tested with
4.4.5 and 4.6.2.
8<---------------
From 365636cac708d4af0907503c07bec530b2dd3a03 Mon Sep 17 00:00:00 2001
From: Ian Campbell <ian.campbell@citrix.com>
Date: Wed, 25 Jul 2012 17:52:12 +0100
Subject: [PATCH] libxl: libxl__device_model_version_running should return an int
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
On error it returns -1 and therefore it needs to return int and not
libxl_device_model_enum. Otherwise gcc 4.6.2 complains:
libxl.c: In function ‘libxl_domain_suspend’:
libxl.c:778:9: error: case value ‘4294967295’ not in enumerated type ‘libxl_device_model_version’ [-Werror=switch]
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
tools/libxl/libxl_internal.c | 3 +--
tools/libxl/libxl_internal.h | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index fbff7d0..24099f5 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -330,8 +330,7 @@ out:
return rc;
}
-libxl_device_model_version libxl__device_model_version_running(libxl__gc *gc,
- uint32_t domid)
+int libxl__device_model_version_running(libxl__gc *gc, uint32_t domid)
{
char *path = NULL;
char *dm_version = NULL;
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 9ffb9d4..1b9b417 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1515,8 +1515,7 @@ _hidden libxl__json_object *libxl__json_parse(libxl__gc *gc, const char *s);
/* Based on /local/domain/$domid/dm-version xenstore key
* default is qemu xen traditional */
-_hidden libxl_device_model_version
-libxl__device_model_version_running(libxl__gc *gc, uint32_t domid);
+_hidden int libxl__device_model_version_running(libxl__gc *gc, uint32_t domid);
/*
--
1.7.2.5
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
^ permalink raw reply related [flat|nested] 16+ messages in thread
end of thread, other threads:[~2012-07-25 16:56 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-06-29 10:41 [xen-unstable test] 13394: regressions - FAIL xen.org
2012-06-29 11:20 ` Ian Jackson
2012-06-29 11:47 ` Ian Campbell
2012-06-29 12:29 ` Stefano Stabellini
2012-06-29 12:32 ` Ian Campbell
2012-06-29 12:36 ` Stefano Stabellini
2012-06-29 12:45 ` Ian Campbell
2012-06-29 16:01 ` [PATCH] Rlibxl: refuse to try and migrate an HVM guest using qemu-xen Ian Campbell
2012-06-29 16:03 ` Ian Jackson
2012-07-02 11:19 ` Ian Campbell
2012-07-23 10:52 ` Ian Jackson
2012-07-23 10:58 ` Ian Campbell
2012-07-23 11:18 ` Ian Campbell
2012-07-24 10:30 ` Ian Jackson
2012-07-25 16:45 ` Ian Campbell
2012-07-25 16:56 ` Ian Campbell
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.