* [xen-unstable test] 57852: regressions - FAIL
@ 2015-06-04 12:01 osstest service user
2015-06-05 8:45 ` Ian Campbell
0 siblings, 1 reply; 40+ messages in thread
From: osstest service user @ 2015-06-04 12:01 UTC (permalink / raw)
To: xen-devel; +Cc: ian.jackson
flight 57852 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/57852/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
test-amd64-amd64-xl-qemuu-win7-amd64 9 windows-install fail REGR. vs. 57419
Regressions which are regarded as allowable (not blocking):
test-amd64-amd64-libvirt-xsm 11 guest-start fail REGR. vs. 57419
test-amd64-i386-libvirt 11 guest-start fail like 57419
test-amd64-i386-libvirt-xsm 11 guest-start fail like 57419
test-amd64-amd64-libvirt 11 guest-start fail like 57419
test-amd64-amd64-rumpuserxen-amd64 15 rumpuserxen-demo-xenstorels/xenstorels.repeat fail like 57419
test-amd64-amd64-xl-qemut-win7-amd64 16 guest-stop fail like 57419
test-armhf-armhf-libvirt-xsm 11 guest-start fail like 57419
test-amd64-i386-xl-qemuu-win7-amd64 16 guest-stop fail like 57419
Tests which did not succeed, but are not blocking:
test-amd64-i386-xl-xsm 14 guest-localmigrate fail never pass
test-amd64-amd64-xl-pvh-amd 11 guest-start fail never pass
test-amd64-amd64-xl-xsm 14 guest-localmigrate fail never pass
test-amd64-amd64-xl-pvh-intel 11 guest-start fail never pass
test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass
test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass
test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass
test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 12 guest-localmigrate fail never pass
test-amd64-i386-xl-qemut-win7-amd64 16 guest-stop fail never pass
test-armhf-armhf-libvirt 12 migrate-support-check fail never pass
test-armhf-armhf-xl-xsm 12 migrate-support-check fail never pass
test-armhf-armhf-xl-arndale 12 migrate-support-check fail never pass
test-armhf-armhf-xl-cubietruck 12 migrate-support-check fail never pass
test-armhf-armhf-xl-multivcpu 12 migrate-support-check fail never pass
test-armhf-armhf-xl 12 migrate-support-check fail never pass
test-armhf-armhf-xl-sedf 12 migrate-support-check fail never pass
test-armhf-armhf-xl-sedf-pin 12 migrate-support-check fail never pass
test-armhf-armhf-xl-credit2 12 migrate-support-check fail never pass
version targeted for testing:
xen fed56ba0e69b251d0222ef0785cd1c1838f9e51d
baseline version:
xen d6b6bd8374ac30597495d457829ce7ad6e8b7016
------------------------------------------------------------
People who touched revisions under test:
Andrew Cooper <andrew.cooper3@citrix.com>
Dario Faggioli <dario.faggioli@citrix.com>
George Dunlap <george.dunlap@eu.citrix.com>
Ian Campbell <ian.campbell@citrix.com>
Jan Beulich <jbeulich@suse.com>
Kevin Tian <kevin.tian@intel.com>
Roger Pau Monné <roger.pau@citrix.com>
Ross Lagerwall <ross.lagerwall@citrix.com>
Tim Deegan <tim@xen.org>
Vitaly Kuznetsov <vkuznets@redhat.com>
Yang Hongyang <yanghy@cn.fujitsu.com>
------------------------------------------------------------
jobs:
build-amd64-xsm pass
build-armhf-xsm pass
build-i386-xsm pass
build-amd64 pass
build-armhf pass
build-i386 pass
build-amd64-libvirt pass
build-armhf-libvirt pass
build-i386-libvirt pass
build-amd64-oldkern pass
build-i386-oldkern pass
build-amd64-pvops pass
build-armhf-pvops pass
build-i386-pvops pass
build-amd64-rumpuserxen pass
build-i386-rumpuserxen pass
test-amd64-amd64-xl pass
test-armhf-armhf-xl pass
test-amd64-i386-xl pass
test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm fail
test-amd64-i386-xl-qemut-debianhvm-amd64-xsm fail
test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm fail
test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm fail
test-amd64-amd64-libvirt-xsm fail
test-armhf-armhf-libvirt-xsm fail
test-amd64-i386-libvirt-xsm fail
test-amd64-amd64-xl-xsm fail
test-armhf-armhf-xl-xsm pass
test-amd64-i386-xl-xsm fail
test-amd64-amd64-xl-pvh-amd fail
test-amd64-i386-qemut-rhel6hvm-amd pass
test-amd64-i386-qemuu-rhel6hvm-amd pass
test-amd64-amd64-xl-qemut-debianhvm-amd64 pass
test-amd64-i386-xl-qemut-debianhvm-amd64 pass
test-amd64-amd64-xl-qemuu-debianhvm-amd64 pass
test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
test-amd64-i386-freebsd10-amd64 pass
test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
test-amd64-i386-xl-qemuu-ovmf-amd64 pass
test-amd64-amd64-rumpuserxen-amd64 fail
test-amd64-amd64-xl-qemut-win7-amd64 fail
test-amd64-i386-xl-qemut-win7-amd64 fail
test-amd64-amd64-xl-qemuu-win7-amd64 fail
test-amd64-i386-xl-qemuu-win7-amd64 fail
test-armhf-armhf-xl-arndale pass
test-amd64-amd64-xl-credit2 pass
test-armhf-armhf-xl-credit2 pass
test-armhf-armhf-xl-cubietruck pass
test-amd64-i386-freebsd10-i386 pass
test-amd64-i386-rumpuserxen-i386 pass
test-amd64-amd64-xl-pvh-intel fail
test-amd64-i386-qemut-rhel6hvm-intel pass
test-amd64-i386-qemuu-rhel6hvm-intel pass
test-amd64-amd64-libvirt fail
test-armhf-armhf-libvirt pass
test-amd64-i386-libvirt fail
test-amd64-amd64-xl-multivcpu pass
test-armhf-armhf-xl-multivcpu pass
test-amd64-amd64-pair pass
test-amd64-i386-pair pass
test-amd64-amd64-xl-sedf-pin pass
test-armhf-armhf-xl-sedf-pin pass
test-amd64-amd64-xl-sedf pass
test-armhf-armhf-xl-sedf pass
test-amd64-i386-xl-qemut-winxpsp3-vcpus1 pass
test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 pass
test-amd64-amd64-xl-qemut-winxpsp3 pass
test-amd64-i386-xl-qemut-winxpsp3 pass
test-amd64-amd64-xl-qemuu-winxpsp3 pass
test-amd64-i386-xl-qemuu-winxpsp3 pass
------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images
Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs
Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary
Not pushing.
------------------------------------------------------------
commit fed56ba0e69b251d0222ef0785cd1c1838f9e51d
Author: Jan Beulich <jbeulich@suse.com>
Date: Tue Jun 2 13:45:03 2015 +0200
unmodified-drivers: tolerate IRQF_DISABLED being undefined
It's being removed in Linux 4.1.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
commit 8a753b3f1cf5e4714974196df9517849bf174324
Author: Ross Lagerwall <ross.lagerwall@citrix.com>
Date: Tue Jun 2 13:44:24 2015 +0200
efi: fix allocation problems if ExitBootServices() fails
If calling ExitBootServices() fails, the required memory map size may
have increased. When initially allocating the memory map, allocate a
slightly larger buffer (by an arbitrary 8 entries) to fix this.
The ARM code path was already allocating a larger buffer than required,
so this moves the code to be common for all architectures.
This was seen on the following machine when using the iscsidxe UEFI
driver. The machine would consistently fail the first call to
ExitBootServices().
System Information
Manufacturer: Supermicro
Product Name: X10SLE-F/HF
BIOS Information
Vendor: American Megatrends Inc.
Version: 2.00
Release Date: 04/24/2014
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roy Franz <roy.franz@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
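The allocation strategy described in this commit message can be sketched as
follows; this is a minimal illustration with invented names
(MAP_SLACK_ENTRIES, map_alloc_size), not the actual Xen EFI code:

```c
#include <stddef.h>

/* Sketch: when sizing the buffer for the firmware memory map, add slack
 * for a few extra descriptors, so that a retry after a failed
 * ExitBootServices() -- which may have grown the map -- still fits.
 * The "8 entries" figure is the arbitrary slack from the commit. */
#define MAP_SLACK_ENTRIES 8

size_t map_alloc_size(size_t reported_size, size_t desc_size)
{
    return reported_size + MAP_SLACK_ENTRIES * desc_size;
}
```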
commit 376bbbabbda607d2039b8f839f15ff02721597d2
Author: Dario Faggioli <dario.faggioli@citrix.com>
Date: Tue Jun 2 13:43:15 2015 +0200
sched_rt: print useful affinity info when dumping
In fact, printing the cpupool's CPU online mask
for each vCPU is just redundant, as that is the
same for all the vCPUs of all the domains in the
same cpupool, while hard affinity is already part
of the output of dumping domains info.
Instead, print the intersection between hard
affinity and online CPUs, which is --in case of this
scheduler-- the effective affinity always used for
the vCPUs.
This change also takes the chance to add a scratch
cpumask area, to avoid having to either put one
(more) cpumask_t on the stack, or dynamically
allocate it within the dumping routine. (The former
being bad because hypervisor stack size is limited,
the latter because dynamic allocations can fail, if
the hypervisor was built for a large enough number
of CPUs.) We allocate such a scratch area, for all
pCPUs, when the first instance of the RTDS scheduler
is activated and, in order not to lose track of it
(or leak it) if other instances are activated in new
cpupools, we (sort of) refcount it, freeing it when
the last instance is deactivated.
Such a scratch area can be used to kill most of the
cpumasks{_var}_t local variables in other functions
in the file, but that is *NOT* done in this change.
Finally, convert the file to use keyhandler scratch,
instead of open coded string buffers.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
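The allocate-on-first-instance / free-on-last scheme described above can be
sketched like this (hypothetical and simplified: a flat array stands in for
Xen's per-pCPU cpumask scratch space, and the names are illustrative):

```c
#include <stdlib.h>

unsigned long *scratch_mask;   /* stand-in for the per-pCPU scratch area */
unsigned int scratch_refs;     /* number of live RTDS scheduler instances */

/* Allocate the shared scratch area when the first scheduler instance
 * comes up; later instances (new cpupools) just bump the refcount. */
int rt_init(unsigned int nr_cpus)
{
    if (scratch_refs == 0) {
        scratch_mask = calloc(nr_cpus, sizeof(*scratch_mask));
        if (scratch_mask == NULL)
            return -1;
    }
    scratch_refs++;
    return 0;
}

/* Free it only when the last instance goes away. */
void rt_deinit(void)
{
    if (--scratch_refs == 0) {
        free(scratch_mask);
        scratch_mask = NULL;
    }
}
```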
commit e758ed14f390342513405dd766e874934573e6cb
Author: Andrew Cooper <andrew.cooper3@citrix.com>
Date: Mon Jun 1 12:00:18 2015 +0200
docs: clarification to terms used in hypervisor memory management
Memory management is hard[citation needed]. Furthermore, it isn't helped by
the inconsistent use of terms through the code, or that some terms have
changed meaning over time.
Describe the currently-used terms in a more practical fashion, so new code has
a concrete reference.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
commit 591e1e357c29589e9d6121d8faadc4f4d3b9013e
Author: Ross Lagerwall <ross.lagerwall@citrix.com>
Date: Mon Jun 1 11:59:14 2015 +0200
x86: don't crash when mapping a page using EFI runtime page tables
When an interrupt is received during an EFI runtime service call, Xen
may call map_domain_page() while using the EFI runtime page tables.
This fails because, although the EFI runtime page tables are a
copy of the idle domain's page tables, current points at a different
domain's vCPU.
To fix this, return NULL from mapcache_current_vcpu() when using the EFI
runtime page tables which is treated equivalently to running in an idle
vCPU.
This issue can be reproduced by repeatedly calling GetVariable() from
dom0 while using VT-d, since VT-d frequently maps a page from interrupt
context.
Example call trace:
[<ffff82d0801615dc>] __find_next_zero_bit+0x28/0x60
[<ffff82d08016a10e>] map_domain_page+0x4c6/0x4eb
[<ffff82d080156ae6>] map_vtd_domain_page+0xd/0xf
[<ffff82d08015533a>] msi_msg_read_remap_rte+0xe3/0x1d8
[<ffff82d08014e516>] iommu_read_msi_from_ire+0x31/0x34
[<ffff82d08016ff6c>] set_msi_affinity+0x134/0x17a
[<ffff82d0801737b5>] move_masked_irq+0x5c/0x98
[<ffff82d080173816>] move_native_irq+0x25/0x36
[<ffff82d08016ffcb>] ack_nonmaskable_msi_irq+0x19/0x20
[<ffff82d08016ffdb>] ack_maskable_msi_irq+0x9/0x37
[<ffff82d080173e8b>] do_IRQ+0x251/0x635
[<ffff82d080234502>] common_interrupt+0x62/0x70
[<00000000df7ed2be>] 00000000df7ed2be
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
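The shape of the fix can be sketched as below; struct vcpu, current_vcpu and
efi_rs_using_pgtables are stand-ins for the real Xen symbols, not the actual
implementation:

```c
#include <stdbool.h>
#include <stddef.h>

struct vcpu { int id; };

struct vcpu *current_vcpu;    /* stand-in for Xen's 'current' */
bool efi_rs_using_pgtables;   /* stand-in for "on EFI runtime page tables" */

/* While on the EFI runtime page tables, 'current' belongs to a domain
 * whose page tables are NOT the ones in use, so report no vCPU; callers
 * treat NULL the same as running on an idle vCPU. */
struct vcpu *mapcache_current_vcpu(void)
{
    if (efi_rs_using_pgtables)
        return NULL;
    return current_vcpu;
}
```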
commit 47ec25a3c8cdd7a057af0a05e8e00257ef950437
Merge: 088e9b2 818e376
Author: Ian Campbell <ian.campbell@citrix.com>
Date: Fri May 29 13:22:31 2015 +0100
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
commit 088e9b2796bd1f9ebe4fda800275cc689677b699
Author: Yang Hongyang <yanghy@cn.fujitsu.com>
Date: Mon May 18 15:03:56 2015 +0800
libxc/restore: implement Remus checkpointed restore
With Remus, the restore flow should be:
the initial full migration stream -> { periodic restore streams }
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
commit a25e4e96fc95150f5c58d069de1b204aa6487ed8
Author: Yang Hongyang <yanghy@cn.fujitsu.com>
Date: Mon May 18 15:03:55 2015 +0800
libxc/save: implement Remus checkpointed save
With Remus, the save flow should be:
live migration -> { periodic (checkpointed) saves }
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
commit cfa955591caea5d7ec505cdcbf4442f2d6e889e1
Author: Yang Hongyang <yanghy@cn.fujitsu.com>
Date: Mon May 18 15:03:54 2015 +0800
libxc/save: refactor of send_domain_memory_live()
Split send_domain_memory_live() into three helper functions:
- send_memory_live(): does the actual live send
- suspend_and_send_dirty(): suspends the guest and sends dirty pages
- send_memory_verify()
The motivation is that, when sending a checkpointed stream, we
skip the actual live part.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
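The resulting control flow might look roughly like this (a sketch with
stubbed-out helpers; the 'live' flag and return-code plumbing are
illustrative, not libxc's real signatures):

```c
#include <stdbool.h>

/* Stubs for the three helpers named in the commit message. */
int send_memory_live(void)       { return 0; } /* iterative live copy */
int suspend_and_send_dirty(void) { return 0; } /* pause guest, send dirty pages */
int send_memory_verify(void)     { return 0; } /* optional verification pass */

int send_domain_memory_live(bool live)
{
    int rc = 0;

    if (live)                    /* checkpointed (Remus) streams skip this */
        rc = send_memory_live();
    if (rc == 0)
        rc = suspend_and_send_dirty();
    if (rc == 0)
        rc = send_memory_verify();
    return rc;
}
```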
commit 818e376d3b17845d39735517650224c64c9e0078
Author: Jan Beulich <jbeulich@suse.com>
Date: Thu May 28 12:07:33 2015 +0200
Revert "use ticket locks for spin locks"
This reverts commit 45fcc4568c5162b00fb3907fb158af82dd484a3d as it
introduces yet-to-be-explained issues on ARM.
commit 02cdd81aa0a88007addc788c6cf93e2f1cb1a314
Author: Jan Beulich <jbeulich@suse.com>
Date: Thu May 28 12:06:47 2015 +0200
Revert "spinlock: fix build with older GCC"
This reverts commit 1037e33c88bb0e1fe530c164f242df17030102e1 as its
prereq commit 45fcc4568c is about to be reverted.
commit 814ca12647f06b023f4aac8eae837ba9b417acc7
Author: Jan Beulich <jbeulich@suse.com>
Date: Thu May 28 11:59:34 2015 +0200
Revert "x86,arm: remove asm/spinlock.h from all architectures"
This reverts commit e62e49e6d5d4e8d22f3df0b75443ede65a812435 as
its prerequisite 45fcc4568c is going to be reverted.
commit cf6b3ccf28faee01a078311fcfe671148c81ea75
Author: Roger Pau Monné <roger.pau@citrix.com>
Date: Thu May 28 10:56:08 2015 +0200
x86/pvh: disable posted interrupts
Enabling posted interrupts requires the virtual interrupt delivery feature,
which is disabled for PVH guests, so make sure posted interrupts are also
disabled or else vmlaunch will fail.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-and-Tested-by: Lars Eggert <lars@netapp.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
commit d4d39de054a6f6c5a474aee62999a8ea7c2fd180
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Thu May 28 10:55:43 2015 +0200
public: fix xen_domctl_monitor_op_t definition
It seems xen_domctl_monitor_op_t was supposed to be a typedef for
struct xen_domctl_monitor_op and not the non-existent xen_domctl__op.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
(qemu changes not included)
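In other words, the typedef now names the real struct; the sketch below shows
the pattern (fields elided, the 'op' member is an illustrative placeholder):

```c
/* The typedef must refer to struct xen_domctl_monitor_op, not the
 * non-existent xen_domctl__op. Real fields are omitted here. */
struct xen_domctl_monitor_op {
    unsigned int op;   /* illustrative placeholder field */
};
typedef struct xen_domctl_monitor_op xen_domctl_monitor_op_t;
```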
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-04 12:01 [xen-unstable test] 57852: regressions - FAIL osstest service user
@ 2015-06-05 8:45 ` Ian Campbell
2015-06-05 9:00 ` Jan Beulich
From: Ian Campbell @ 2015-06-05 8:45 UTC (permalink / raw)
To: Jan Beulich, Andrew Cooper; +Cc: xen-devel, ian.jackson
On Thu, 2015-06-04 at 12:01 +0000, osstest service user wrote:
> flight 57852 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/57852/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
> test-amd64-amd64-xl-qemuu-win7-amd64 9 windows-install fail REGR. vs. 57419
Is anyone looking into this?
It seems to have been intermittent for a long time but the probability
of failure seems to have increased significantly some time around flight
52633 (see [0]). Before that it failed <5% of the time and since then it
looks to be closer to 45-50%. 5% could be put down to infrastructure or
guest flakiness, 50% seems more like something on the Xen (or qemu etc)
side.
The bisector is taking a look[1] but TBH given a 50% pass rate I think
it is unlikely to get anywhere (I suspect this isn't its first attempt
at this either, pretty sure I saw a failed attempt on an earlier range).
Taking 50370 as a rough baseline (4 consecutive passes before the first
of the more frequent failures) gives a range of
b6e7fbadbda4..5c44b5cf352e which is quite a few. It's noteworthy though
that qemuu didn't change during the interval 50370..52633 (again, from
[0]).
None of the vnc snapshots look interesting, just the windows login
screen. Neither do any of the logs look interesting.
Ian.
[0] http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.xen-unstable.html
[1] http://logs.test-lab.xenproject.org/osstest/results/bisect.xen-unstable.test-amd64-amd64-xl-qemuu-win7-amd64.windows-install.html
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-05 8:45 ` Ian Campbell
@ 2015-06-05 9:00 ` Jan Beulich
2015-06-05 9:07 ` Ian Campbell
0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2015-06-05 9:00 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 05.06.15 at 10:45, <ian.campbell@citrix.com> wrote:
> On Thu, 2015-06-04 at 12:01 +0000, osstest service user wrote:
>> flight 57852 xen-unstable real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/57852/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>> test-amd64-amd64-xl-qemuu-win7-amd64 9 windows-install fail REGR. vs. 57419
>
> Is anyone looking into this?
Not actively, to be honest.
> It seems to have been intermittent for a long time but the probability
> of failure seems to have increased significantly some time around flight
> 52633 (see [0]). Before that it failed <5% of the time and since then it
> looks to be closer to 45-50%. 5% could be put down to infrastructure or
> guest flakiness, 50% seems more like something on the Xen (or qemu etc)
> side.
>
> The bisector is taking a look[1] but TBH given a 50% pass rate I think
> it is unlikely to get anywhere (I suspect this isn't its first attempt
> at this either, pretty sure I saw a failed attempt on an earlier range).
>
> Taking 50370 as a rough baseline (4 consecutive passes before the first
> of the more frequent failures) gives a range of
> b6e7fbadbda4..5c44b5cf352e which is quite a few. It's noteworthy though
> that qemuu didn't change during the interval 50370..52633 (again, from
> [0]).
>
> None of the vnc snapshots look interesting, just the windows login
> screen. Neither do any of the logs look interesting.
Which is the main reason for it being difficult to look into without
seeing it oneself. Two things are possibly noteworthy: This again
is an issue only ever seen with qemuu (just like the migration issue
on the stable branches), and the other day there was a report of
posted interrupts causing spurious hangs, which raises the question
whether the increased failure rate was perhaps due to the new
osstest host system pool having got extended at around that time.
(As noted in a reply to that report, this possible issue can't be an
explanation for the issue on the stable trees, as 4.3 doesn't support
posted interrupts yet.)
Jan
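[Editorial note: Ian's quoted point that the bisector is unlikely to get anywhere with a ~50% pass rate can be made concrete with some illustrative arithmetic (a sketch only; osstest's actual retry policy differs):]

```python
# Chance of misclassifying a commit during bisection, assuming a ~5%
# baseline (flaky-good) failure rate and a ~50% failure rate once the
# regression is present, if each commit is judged from n test runs.
for n in range(1, 9):
    false_bad = 1 - 0.95 ** n   # a good commit shows at least one failure
    false_good = 0.5 ** n       # a bad commit happens to pass every run
    print(f"n={n}: P(good looks bad)={false_bad:.3f}, "
          f"P(bad looks good)={false_good:.3f}")
```

Even at n=6 a bad commit still passes every run about 1.6% of the time, while a good commit already shows at least one failure about 26% of the time, which is why a 50% intermittent failure is so hostile to bisection.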
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-05 9:00 ` Jan Beulich
@ 2015-06-05 9:07 ` Ian Campbell
2015-06-05 9:18 ` Jan Beulich
0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-06-05 9:07 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Fri, 2015-06-05 at 10:00 +0100, Jan Beulich wrote:
> >>> On 05.06.15 at 10:45, <ian.campbell@citrix.com> wrote:
> > On Thu, 2015-06-04 at 12:01 +0000, osstest service user wrote:
> >> flight 57852 xen-unstable real [real]
> >> http://logs.test-lab.xenproject.org/osstest/logs/57852/
> >>
> >> Regressions :-(
> >>
> >> Tests which did not succeed and are blocking,
> >> including tests which could not be run:
> >> test-amd64-amd64-xl-qemuu-win7-amd64 9 windows-install fail REGR. vs. 57419
> >
> > Is anyone looking into this?
>
> Not actively, to be honest.
>
> > It seems to have been intermittent for a long time but the probability
> > of failure seems to have increased significantly some time around flight
> > 52633 (see [0]). Before that it failed <5% of the time and since then it
> > looks to be closer to 45-50%. 5% could be put down to infrastructure or
> > guest flakiness, 50% seems more like something on the Xen (or qemu etc)
> > side.
> >
> > The bisector is taking a look[1] but TBH given a 50% pass rate I think
> > it is unlikely to get anywhere (I suspect this isn't its first attempt
> > at this either, pretty sure I saw a failed attempt on an earlier range).
> >
> > Taking 50370 as a rough baseline (4 consecutive passes before the first
> > of the more frequent failures) gives a range of
> > b6e7fbadbda4..5c44b5cf352e which is quite a few. It's noteworthy though
> > that qemuu didn't change during the interval 50370..52633 (again, from
> > [0]).
> >
> > None of the vnc snapshots look interesting, just the windows login
> > screen. Neither do any of the logs look interesting.
>
> Which is the main reason for it being difficult to look into without
> seeing it oneself. Two things are possibly noteworthy: This again
> is an issue only ever seen with qemuu (just like the migration issue
> on the stable branches), and the other day there was a report of
> posted interrupts causing spurious hangs, which raises the question
> whether the increased failure rate was perhaps due to the new
> osstest host system pool having got extended at around that time.
> (As noted in a reply to that report, this possible issue can't be an
> explanation for the issue on the stable trees, as 4.3 doesn't support
> posted interrupts yet.)
From
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.xen-4.3-testing.html
it doesn't seem like 4.3-testing is suffering from the higher incidence
of windows-install failures, just the background noise which unstable
had prior to 52633.
From:
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.xen-4.4-testing.html
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.xen-4.5-testing.html
it looks like none of the stable branches suffer from the install issue.
I'd be inclined to discount any possible link with the migration issue
based on that.
WRT the move to the colo, flights in 5xxxx are in the new one, while
3xxxx are in the old one,
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.xen-unstable.html
shows that things seemed ok for 8 consecutive runs after the move
(ignoring blockages).
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-05 9:07 ` Ian Campbell
@ 2015-06-05 9:18 ` Jan Beulich
2015-06-05 10:48 ` Ian Campbell
0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2015-06-05 9:18 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 05.06.15 at 11:07, <ian.campbell@citrix.com> wrote:
> From:
> http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64
> -xl-qemuu-win7-amd64.xen-4.4-testing.html
> http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64
> -xl-qemuu-win7-amd64.xen-4.5-testing.html
> it looks like none of the stable branches suffer from the install issue.
> I'd be inclined to discount any possible link with the migration issue
> based on that.
Generally I would agree, but it strikes me as extremely odd that
(a) stable trees face only the migration issue, while unstable only
faces the install one,
(b) a tree as old as 4.3 (receiving only security updates) developed
this migration issue (I went into more detail on this in a reply to flight
57474's report).
> WRT the move to the colo, flights in 5xxxx are in the new one, while
> 3xxxx are in the old one,
> http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64
> -xl-qemuu-win7-amd64.xen-unstable.html
> shows that things seemed ok for 8 consecutive runs after the move
> (ignoring blockages).
And when it went live, all systems being in use now got immediately
deployed?
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-05 9:18 ` Jan Beulich
@ 2015-06-05 10:48 ` Ian Campbell
2015-06-05 16:46 ` Ian Campbell
2015-06-08 8:07 ` Jan Beulich
0 siblings, 2 replies; 40+ messages in thread
From: Ian Campbell @ 2015-06-05 10:48 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Fri, 2015-06-05 at 10:18 +0100, Jan Beulich wrote:
> >>> On 05.06.15 at 11:07, <ian.campbell@citrix.com> wrote:
> > From:
> > http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64
> > -xl-qemuu-win7-amd64.xen-4.4-testing.html
> > http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64
> > -xl-qemuu-win7-amd64.xen-4.5-testing.html
> > it looks like none of the stable branches suffer from the install issue.
> > I'd be inclined to discount any possible link with the migration issue
> > based on that.
>
> Generally I would agree, but it strikes me as extremely odd that
> (a) stable trees face only the migration issue, while unstable only
> faces the install one,
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.xen-unstable.html shows some migration failures too (in a batch though, not spread out).
Wouldn't the migration issue be potentially blocked by the install one?
> (b) a tree as old as 4.3 (receiving only security updates) developed
> this migration issue (I went into more detail on this in a reply to flight
> 57474's report).
>
> > WRT the move to the colo, flights in 5xxxx are in the new one, while
> > 3xxxx are in the old one,
> > http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64
> > -xl-qemuu-win7-amd64.xen-unstable.html
> > shows that things seemed ok for 8 consecutive runs after the move
> > (ignoring blockages).
>
> And when it went live, all systems being in use now got immediately
> deployed?
All the flights in the new colo seem to have been on fiano[01].
But having looked at the page again the early success was all on fiano0
while the later failures were all on fiano1.
fiano[01] are supposedly identical hardware.
This might be simply explained by osstest's stickiness for jobs on hosts
where they are failing. I'll run a few adhoc jobs on fiano0 using 57852
as a template so we can see if that's the case.
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-05 10:48 ` Ian Campbell
@ 2015-06-05 16:46 ` Ian Campbell
2015-06-08 8:07 ` Jan Beulich
1 sibling, 0 replies; 40+ messages in thread
From: Ian Campbell @ 2015-06-05 16:46 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Fri, 2015-06-05 at 11:48 +0100, Ian Campbell wrote:
> All the flights in the new colo seem to have been on fiano[01].
>
> But having looked at the page again the early success was all on fiano0
> while the later failures were all on fiano1.
>
> fiano[01] are supposedly identical hardware.
>
> This might be simply explained by osstest's stickiness for jobs on hosts
> where they are failing. I'll run a few adhoc jobs on fiano0 using 57852
> as a template so we can see if that's the case.
http://logs.test-lab.xenproject.org/osstest/logs/57940/
http://logs.test-lab.xenproject.org/osstest/logs/57945/
http://logs.test-lab.xenproject.org/osstest/logs/57953/
All ran in fiano0 and passed the install phase (they failed shutdown,
but that's a different story). They were using the exact same binaries
every time, the ones from flight 57852 which failed on fiano1.
So we may have a host specific issue on just 1 or a pair of hosts, which
is certainly annoying!
I'm going to run 3 on fiano1 to confirm that it still fails there.
Then I'm going to run 3 more on each to make extra sure...
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-05 10:48 ` Ian Campbell
2015-06-05 16:46 ` Ian Campbell
@ 2015-06-08 8:07 ` Jan Beulich
2015-06-08 8:53 ` Ian Campbell
1 sibling, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2015-06-08 8:07 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 05.06.15 at 12:48, <ian.campbell@citrix.com> wrote:
> On Fri, 2015-06-05 at 10:18 +0100, Jan Beulich wrote:
>> >>> On 05.06.15 at 11:07, <ian.campbell@citrix.com> wrote:
>> > WRT the move to the colo, flights in 5xxxx are in the new one, while
>> > 3xxxx are in the old one,
>> > http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64
>> > -xl-qemuu-win7-amd64.xen-unstable.html
>> > shows that things seemed ok for 8 consecutive runs after the move
>> > (ignoring blockages).
>>
>> And when it went live, all systems being in use now got immediately
>> deployed?
>
> All the flights in the new colo seem to have been on fiano[01].
So are there just two hosts to run all x86 tests on? I thought one
of the purposes of the switch was to have a wider pool of test
systems...
> But having looked at the page again the early success was all on fiano0
> while the later failures were all on fiano1.
But that's for the unstable install failures only as it looks. At the
example of flight 57955 (testing 4.2) a local migration failure was
observed on fiano0. Which would seem to support your earlier
assumption that the install and migration issues are likely unrelated
(yet their coincidence still strikes me as odd).
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 8:07 ` Jan Beulich
@ 2015-06-08 8:53 ` Ian Campbell
2015-06-08 9:15 ` Jan Beulich
0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-06-08 8:53 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Mon, 2015-06-08 at 09:07 +0100, Jan Beulich wrote:
> >>> On 05.06.15 at 12:48, <ian.campbell@citrix.com> wrote:
> > On Fri, 2015-06-05 at 10:18 +0100, Jan Beulich wrote:
> >> >>> On 05.06.15 at 11:07, <ian.campbell@citrix.com> wrote:
> >> > WRT the move to the colo, flights in 5xxxx are in the new one, while
> >> > 3xxxx are in the old one,
> >> > http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64
> >> > -xl-qemuu-win7-amd64.xen-unstable.html
> >> > shows that things seemed ok for 8 consecutive runs after the move
> >> > (ignoring blockages).
> >>
> >> And when it went live, all systems being in use now got immediately
> >> deployed?
> >
> > All the flights in the new colo seem to have been on fiano[01].
>
> So are there just two hosts to run all x86 tests on? I thought one
> of the purposes of the switch was to have a wider pool of test
> systems...
There are about a dozen, but when a test is failing osstest will have a
preference for the host on which it failed last time (i.e. failures
become sticky to the host), in order to catch host specific failures I
think.
I think it was just coincidence that the first group of runs which
passed were on fiano0, although perhaps the pool was smaller then since
the colo was in the process of being commissioned.
The stickiness does make it a bit harder to know if a failure is host
specific though, since you often don't get results for other systems.
> > But having looked at the page again the early success was all on fiano0
> > while the later failures were all on fiano1.
>
> But that's for the unstable install failures only as it looks. At the
> example of flight 57955 (testing 4.2) a local migration failure was
> observed on fiano0. Which would seem to support your earlier
> assumption that the install and migration issues are likely unrelated
> (yet their coincidence still strikes me as odd).
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.html has the cross branch history for this test case. With one exception (on chardonay0, in a linux-next test) all the fails were on fiano[01] and they were all on branches which would use xen-unstable as the Xen version (xen-unstable itself and linux-* + qemu-mainline which both use the current xen.git#master as their Xen).
I've got some adhoc results over the weekend, all can be found at
http://logs.test-lab.xenproject.org/osstest/logs/<NNNNN>/test-amd64-amd64-xl-qemuu-win7-amd64/info.html for flight <NNNNN>. All of them are using the binaries from 57852.
I messed up my first command line and ran them all on fiano0 by mistake,
so there are more results than I was planning for.
Flight  Host    Failed at        Install step duration (s)
57940   fiano0  ts-guest-stop    1483
57945   fiano0  ts-guest-stop    1640
57953   fiano0  ts-guest-stop    1473
57958   fiano0  ts-guest-stop    1472
57962   fiano0  windows-install  7512
57973   fiano0  windows-install  7693
57080   fiano0  ts-guest-stop    1534
57986   fiano0  windows-install  7203
57933   fiano0  ts-guest-stop    1529
57997   fiano0  ts-guest-stop    1494
58004   fiano0  ts-guest-stop    1492
58011   fiano1  ts-guest-stop    1408
58012   fiano1  ts-guest-stop    1529
58017   fiano1  ts-guest-stop    1466
58023   fiano1  ts-guest-stop    1624
58028   fiano1  windows-install  7208
58038   fiano1  ts-guest-stop    1479
58043   fiano1  ts-guest-stop    1493
58053   fiano0  windows-install  7439
58062   fiano0  windows-install  1916
58063   fiano0  windows-install  1477
58067   fiano1  ts-guest-stop    1453
58071   fiano1  ts-guest-stop    1550
58077   fiano1  windows-install  7156
That's 6/14 (43%) failure rate on fiano0 and 2/10 (20%) on fiano1. Which
differs from the apparent xen-unstable failure rate. But I wouldn't take
this as evidence that the two systems differ significantly, despite how
the unstable results looked at first glance.
On successful install the test step takes 1450-1650s, with one outlier
at 1916. The failures take 7000-7500s (test case timeout is 7000, so
with slop that fits). So on success it takes <30 mins and on fail it has
been given nearly 2 hours.
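[Editorial note: the intuition that 6/14 vs 2/10 is not a significant difference can be checked with a two-sided Fisher exact test; a self-contained sketch, stdlib only:]

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test on the 2x2 table [[a, b], [c, d]]:
    sums the probabilities of all tables with the same margins that are
    no more likely than the observed one."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    def prob(x):  # hypergeometric P(x of the col1 failures land in row 1)
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)
    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs + 1e-12)

# fiano0: 6 failures out of 14 runs; fiano1: 2 failures out of 10 runs
p = fisher_exact_two_sided(6, 8, 2, 8)
print(round(p, 3))  # → 0.388, far from significance at the usual 0.05
```

So with samples this small, a 43% vs 20% split is entirely compatible with the two hosts behaving identically.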
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 8:53 ` Ian Campbell
@ 2015-06-08 9:15 ` Jan Beulich
2015-06-08 9:27 ` Ian Campbell
2015-06-08 10:10 ` Ian Campbell
0 siblings, 2 replies; 40+ messages in thread
From: Jan Beulich @ 2015-06-08 9:15 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 08.06.15 at 10:53, <ian.campbell@citrix.com> wrote:
> That's 6/14 (43%) failure rate on fiano0 and 2/10 (20%) on fiano1. Which
> differs from the apparent xen-unstable failure rate. But I wouldn't take
> this as evidence that the two systems differ significantly, despite how
> the unstable results looked at first glance.
So we can basically rule out just one of the hosts being the culprit;
it's either both or our software. Considering that (again at the
example of the recent 4.2 flight) the guest is apparently waiting for
a timer (or other) interrupt (on a HLT instruction), this is very likely
interrupt delivery related, yet (as said before, albeit wrongly for
4.3) 4.2 doesn't have APICV support yet (4.3 only lacks the option
to disable it), so it can't be that (alone).
Looking at the hardware - are fiano[01], in terms of CPU and
chipset, perhaps the newest or oldest in the pool? (I'm trying to
make myself a picture of what debugging options we have.)
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 9:15 ` Jan Beulich
@ 2015-06-08 9:27 ` Ian Campbell
2015-06-08 10:17 ` Jan Beulich
` (2 more replies)
2015-06-08 10:10 ` Ian Campbell
1 sibling, 3 replies; 40+ messages in thread
From: Ian Campbell @ 2015-06-08 9:27 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Mon, 2015-06-08 at 10:15 +0100, Jan Beulich wrote:
> >>> On 08.06.15 at 10:53, <ian.campbell@citrix.com> wrote:
> > That's 6/14 (43%) failure rate on fiano0 and 2/10 (20%) on fiano1. Which
> > differs from the apparent xen-unstable failure rate. But I wouldn't take
> > this as evidence that the two systems differ significantly, despite how
> > the unstable results looked at first glance.
>
> So we can basically rule out just one of the hosts being the culprit;
> it's either both or our software. Considering that (again at the
> example of the recent 4.2 flight) the guest is apparently waiting for
> a timer (or other) interrupt (on a HLT instruction), this is very likely
> interrupt delivery related, yet (as said before, albeit wrongly for
> 4.3) 4.2 doesn't have APICV support yet (4.3 only lack the option
> to disable it), so it can't be that (alone).
>
> Looking at the hardware - are fiano[01], in terms of CPU and
> chipset, perhaps the newest or oldest in the pool? (I'm trying to
> make myself a picture of what debugging options we have.)
I don't know much about the hardware in the pool other than what can be
gathered from the serial and dmesg logs.
http://logs.test-lab.xenproject.org/osstest/logs/58028/test-amd64-amd64-xl-qemuu-win7-amd64/info.html
From the serial log and this:
Jun 6 12:09:27.089020 (XEN) VMX: Supported advanced features:
Jun 6 12:09:27.089052 (XEN) - APIC MMIO access virtualisation
Jun 6 12:09:27.097051 (XEN) - APIC TPR shadow
Jun 6 12:09:27.097088 (XEN) - Extended Page Tables (EPT)
Jun 6 12:09:27.097118 (XEN) - Virtual-Processor Identifiers (VPID)
Jun 6 12:09:27.105066 (XEN) - Virtual NMI
Jun 6 12:09:27.105100 (XEN) - MSR direct-access bitmap
Jun 6 12:09:27.105130 (XEN) - Unrestricted Guest
Jun 6 12:09:27.113269 (XEN) - APIC Register Virtualization
Jun 6 12:09:27.113290 (XEN) - Virtual Interrupt Delivery
Jun 6 12:09:27.113328 (XEN) - Posted Interrupt Processing
Jun 6 12:09:27.121180 (XEN) HVM: ASIDs enabled.
Jun 6 12:09:27.121235 (XEN) HVM: VMX enabled
Jun 6 12:09:27.121267 (XEN) HVM: Hardware Assisted Paging (HAP) detected
Jun 6 12:09:27.129069 (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
I guess they are pretty new?
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 9:15 ` Jan Beulich
2015-06-08 9:27 ` Ian Campbell
@ 2015-06-08 10:10 ` Ian Campbell
1 sibling, 0 replies; 40+ messages in thread
From: Ian Campbell @ 2015-06-08 10:10 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Mon, 2015-06-08 at 10:15 +0100, Jan Beulich wrote:
> (I'm trying to make myself a picture of what debugging options we
> have.)
In the meantime I've kicked off an adhoc job using no-apicv as suggested
by Andy (on IRC last week, IIRC). Assuming that my tweak takes effect in
practice, I'll run a bunch of those to hopefully come up with a
significant result.
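[Editorial note: a rough way to size "a bunch", if one assumes the ad-hoc results above (8 failures in 24 runs, about a 1/3 failure rate) as the baseline; illustrative arithmetic only:]

```python
# Smallest number of consecutive clean no-apicv runs for which
# "they all passed by luck" drops below 5%, given a ~1/3 baseline
# failure rate observed in the ad-hoc runs.
baseline_fail = 8 / 24
n = 1
while (1 - baseline_fail) ** n >= 0.05:
    n += 1
print(n)  # → 8: (2/3)**8 ≈ 0.039
```

That is, around eight clean runs before the no-apicv result starts to look meaningful.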
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 9:27 ` Ian Campbell
@ 2015-06-08 10:17 ` Jan Beulich
2015-06-08 14:43 ` Ian Jackson
2015-06-08 12:16 ` Ian Campbell
2015-06-08 13:50 ` Konrad Rzeszutek Wilk
2 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2015-06-08 10:17 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 08.06.15 at 11:27, <ian.campbell@citrix.com> wrote:
> I don't know much about the hardware in the pool other than what can be
> gathered from the serial and dmesg logs.
Right - this is useful for learning details of an individual system, but
isn't really helpful when wanting to compare all system kinds that are
in the pool.
> From the serial log and this:
>
> Jun 6 12:09:27.089020 (XEN) VMX: Supported advanced features:
> Jun 6 12:09:27.089052 (XEN) - APIC MMIO access virtualisation
> Jun 6 12:09:27.097051 (XEN) - APIC TPR shadow
> Jun 6 12:09:27.097088 (XEN) - Extended Page Tables (EPT)
> Jun 6 12:09:27.097118 (XEN) - Virtual-Processor Identifiers (VPID)
> Jun 6 12:09:27.105066 (XEN) - Virtual NMI
> Jun 6 12:09:27.105100 (XEN) - MSR direct-access bitmap
> Jun 6 12:09:27.105130 (XEN) - Unrestricted Guest
> Jun 6 12:09:27.113269 (XEN) - APIC Register Virtualization
> Jun 6 12:09:27.113290 (XEN) - Virtual Interrupt Delivery
> Jun 6 12:09:27.113328 (XEN) - Posted Interrupt Processing
> Jun 6 12:09:27.121180 (XEN) HVM: ASIDs enabled.
> Jun 6 12:09:27.121235 (XEN) HVM: VMX enabled
> Jun 6 12:09:27.121267 (XEN) HVM: Hardware Assisted Paging (HAP) detected
> Jun 6 12:09:27.129069 (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
>
> I guess they are pretty new?
Looks like so, yes.
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 9:27 ` Ian Campbell
2015-06-08 10:17 ` Jan Beulich
@ 2015-06-08 12:16 ` Ian Campbell
2015-06-08 12:19 ` Andrew Cooper
` (2 more replies)
2015-06-08 13:50 ` Konrad Rzeszutek Wilk
2 siblings, 3 replies; 40+ messages in thread
From: Ian Campbell @ 2015-06-08 12:16 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Mon, 2015-06-08 at 10:27 +0100, Ian Campbell wrote:
> On Mon, 2015-06-08 at 10:15 +0100, Jan Beulich wrote:
> > >>> On 08.06.15 at 10:53, <ian.campbell@citrix.com> wrote:
> > > That's 6/14 (43%) failure rate on fiano0 and 2/10 (20%) on fiano1. Which
> > > differs from the apparent xen-unstable failure rate. But I wouldn't take
> > > this as evidence that the two systems differ significantly, despite how
> > > the unstable results looked at first glance.
> >
> > So we can basically rule out just one of the hosts being the culprit;
> > it's either both or our software. Considering that (again at the
> > example of the recent 4.2 flight) the guest is apparently waiting for
> > a timer (or other) interrupt (on a HLT instruction), this is very likely
> > interrupt delivery related, yet (as said before, albeit wrongly for
> > 4.3) 4.2 doesn't have APICV support yet (4.3 only lack the option
> > to disable it), so it can't be that (alone).
> >
> > Looking at the hardware - are fiano[01], in terms of CPU and
> > chipset, perhaps the newest or oldest in the pool? (I'm trying to
> > make myself a picture of what debugging options we have.)
>
> I don't know much about the hardware in the pool other than what can be
> gathered from the serial and dmesg logs.
>
> http://logs.test-lab.xenproject.org/osstest/logs/58028/test-amd64-amd64-xl-qemuu-win7-amd64/info.html
>
> From the serial log and this:
>
> Jun 6 12:09:27.089020 (XEN) VMX: Supported advanced features:
> Jun 6 12:09:27.089052 (XEN) - APIC MMIO access virtualisation
> Jun 6 12:09:27.097051 (XEN) - APIC TPR shadow
> Jun 6 12:09:27.097088 (XEN) - Extended Page Tables (EPT)
> Jun 6 12:09:27.097118 (XEN) - Virtual-Processor Identifiers (VPID)
> Jun 6 12:09:27.105066 (XEN) - Virtual NMI
> Jun 6 12:09:27.105100 (XEN) - MSR direct-access bitmap
> Jun 6 12:09:27.105130 (XEN) - Unrestricted Guest
Running with no-apicv seems to have disabled these three:
> Jun 6 12:09:27.113269 (XEN) - APIC Register Virtualization
> Jun 6 12:09:27.113290 (XEN) - Virtual Interrupt Delivery
> Jun 6 12:09:27.113328 (XEN) - Posted Interrupt Processing
Is that expected?
The adhoc run passed, but that's not statistically significant.
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 12:16 ` Ian Campbell
@ 2015-06-08 12:19 ` Andrew Cooper
2015-06-08 12:24 ` Jan Beulich
2015-06-09 8:26 ` Ian Campbell
2 siblings, 0 replies; 40+ messages in thread
From: Andrew Cooper @ 2015-06-08 12:19 UTC (permalink / raw)
To: Ian Campbell, Jan Beulich; +Cc: xen-devel, ian.jackson
On 08/06/15 13:16, Ian Campbell wrote:
> On Mon, 2015-06-08 at 10:27 +0100, Ian Campbell wrote:
>> On Mon, 2015-06-08 at 10:15 +0100, Jan Beulich wrote:
>>>>>> On 08.06.15 at 10:53, <ian.campbell@citrix.com> wrote:
>>>> That's 6/14 (43%) failure rate on fiano0 and 2/10 (20%) on fiano1. Which
>>>> differs from the apparent xen-unstable failure rate. But I wouldn't take
>>>> this as evidence that the two systems differ significantly, despite how
>>>> the unstable results looked at first glance.
>>> So we can basically rule out just one of the hosts being the culprit;
>>> it's either both or our software. Considering that (again at the
>>> example of the recent 4.2 flight) the guest is apparently waiting for
>>> a timer (or other) interrupt (on a HLT instruction), this is very likely
>>> interrupt delivery related, yet (as said before, albeit wrongly for
>>> 4.3) 4.2 doesn't have APICV support yet (4.3 only lacks the option
>>> to disable it), so it can't be that (alone).
>>>
>>> Looking at the hardware - are fiano[01], in terms of CPU and
>>> chipset, perhaps the newest or oldest in the pool? (I'm trying to
>>> make myself a picture of what debugging options we have.)
>> I don't know much about the hardware in the pool other than what can be
>> gathered from the serial and dmesg logs.
>>
>> http://logs.test-lab.xenproject.org/osstest/logs/58028/test-amd64-amd64-xl-qemuu-win7-amd64/info.html
>>
>> From the serial log and this:
>>
>> Jun 6 12:09:27.089020 (XEN) VMX: Supported advanced features:
>> Jun 6 12:09:27.089052 (XEN) - APIC MMIO access virtualisation
>> Jun 6 12:09:27.097051 (XEN) - APIC TPR shadow
>> Jun 6 12:09:27.097088 (XEN) - Extended Page Tables (EPT)
>> Jun 6 12:09:27.097118 (XEN) - Virtual-Processor Identifiers (VPID)
>> Jun 6 12:09:27.105066 (XEN) - Virtual NMI
>> Jun 6 12:09:27.105100 (XEN) - MSR direct-access bitmap
>> Jun 6 12:09:27.105130 (XEN) - Unrestricted Guest
> Running with no-apicv seems to have disabled these three:
>
>> Jun 6 12:09:27.113269 (XEN) - APIC Register Virtualization
>> Jun 6 12:09:27.113290 (XEN) - Virtual Interrupt Delivery
>> Jun 6 12:09:27.113328 (XEN) - Posted Interrupt Processing
> Is that expected?
Yes - the first is APICV itself, and the other two are dependent features.
~Andrew
>
> The adhoc run passed, but that's not statistically significant.
>
> Ian.
>
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 12:16 ` Ian Campbell
2015-06-08 12:19 ` Andrew Cooper
@ 2015-06-08 12:24 ` Jan Beulich
2015-06-09 8:26 ` Ian Campbell
2 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2015-06-08 12:24 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 08.06.15 at 14:16, <ian.campbell@citrix.com> wrote:
> On Mon, 2015-06-08 at 10:27 +0100, Ian Campbell wrote:
>> On Mon, 2015-06-08 at 10:15 +0100, Jan Beulich wrote:
>> > >>> On 08.06.15 at 10:53, <ian.campbell@citrix.com> wrote:
>> > > That's 6/14 (43%) failure rate on fiano0 and 2/10 (20%) on fiano1. Which
>> > > differs from the apparent xen-unstable failure rate. But I wouldn't take
>> > > this as evidence that the two systems differ significantly, despite how
>> > > the unstable results looked at first glance.
>> >
>> > So we can basically rule out just one of the hosts being the culprit;
>> > it's either both or our software. Considering that (again at the
>> > example of the recent 4.2 flight) the guest is apparently waiting for
>> > a timer (or other) interrupt (on a HLT instruction), this is very likely
>> > interrupt delivery related, yet (as said before, albeit wrongly for
>> > 4.3) 4.2 doesn't have APICV support yet (4.3 only lacks the option
>> > to disable it), so it can't be that (alone).
>> >
>> > Looking at the hardware - are fiano[01], in terms of CPU and
>> > chipset, perhaps the newest or oldest in the pool? (I'm trying to
>> > make myself a picture of what debugging options we have.)
>>
>> I don't know much about the hardware in the pool other than what can be
>> gathered from the serial and dmesg logs.
>>
>>
>> http://logs.test-lab.xenproject.org/osstest/logs/58028/test-amd64-amd64-xl-qemuu-win7-amd64/info.html
>>
>> From the serial log and this:
>>
>> Jun 6 12:09:27.089020 (XEN) VMX: Supported advanced features:
>> Jun 6 12:09:27.089052 (XEN) - APIC MMIO access virtualisation
>> Jun 6 12:09:27.097051 (XEN) - APIC TPR shadow
>> Jun 6 12:09:27.097088 (XEN) - Extended Page Tables (EPT)
>> Jun 6 12:09:27.097118 (XEN) - Virtual-Processor Identifiers (VPID)
>> Jun 6 12:09:27.105066 (XEN) - Virtual NMI
>> Jun 6 12:09:27.105100 (XEN) - MSR direct-access bitmap
>> Jun 6 12:09:27.105130 (XEN) - Unrestricted Guest
>
> Running with no-apicv seems to have disabled these three:
>
>> Jun 6 12:09:27.113269 (XEN) - APIC Register Virtualization
>> Jun 6 12:09:27.113290 (XEN) - Virtual Interrupt Delivery
>> Jun 6 12:09:27.113328 (XEN) - Posted Interrupt Processing
>
> Is that expected?
I think so, based on
    if ( (_vmx_cpu_based_exec_control & CPU_BASED_TPR_SHADOW) &&
         opt_apicv_enabled )
        opt |= SECONDARY_EXEC_APIC_REGISTER_VIRT |
               SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
               SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
and
    if ( !(_vmx_secondary_exec_control & SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY)
         || !(_vmx_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT) )
        _vmx_pin_based_exec_control &= ~PIN_BASED_POSTED_INTERRUPT;
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 9:27 ` Ian Campbell
2015-06-08 10:17 ` Jan Beulich
2015-06-08 12:16 ` Ian Campbell
@ 2015-06-08 13:50 ` Konrad Rzeszutek Wilk
2015-06-08 14:02 ` Ian Campbell
2015-06-08 14:47 ` Ian Jackson
2 siblings, 2 replies; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-06-08 13:50 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, Jan Beulich, xen-devel
On Mon, Jun 08, 2015 at 10:27:32AM +0100, Ian Campbell wrote:
> On Mon, 2015-06-08 at 10:15 +0100, Jan Beulich wrote:
> > >>> On 08.06.15 at 10:53, <ian.campbell@citrix.com> wrote:
> > > That's 6/14 (43%) failure rate on fiano0 and 2/10 (20%) on fiano1. Which
> > > differs from the apparent xen-unstable failure rate. But I wouldn't take
> > > this as evidence that the two systems differ significantly, despite how
> > > the unstable results looked at first glance.
> >
> > So we can basically rule out just one of the hosts being the culprit;
> > it's either both or our software. Considering that (again at the
> > example of the recent 4.2 flight) the guest is apparently waiting for
> > a timer (or other) interrupt (on a HLT instruction), this is very likely
> > interrupt delivery related, yet (as said before, albeit wrongly for
> > 4.3) 4.2 doesn't have APICV support yet (4.3 only lacks the option
> > to disable it), so it can't be that (alone).
> >
> > Looking at the hardware - are fiano[01], in terms of CPU and
> > chipset, perhaps the newest or oldest in the pool? (I'm trying to
> > make myself a picture of what debugging options we have.)
>
> I don't know much about the hardware in the pool other than what can be
> gathered from the serial and dmesg logs.
>
> http://logs.test-lab.xenproject.org/osstest/logs/58028/test-amd64-amd64-xl-qemuu-win7-amd64/info.html
>
> From the serial log and this:
>
> Jun 6 12:09:27.089020 (XEN) VMX: Supported advanced features:
> Jun 6 12:09:27.089052 (XEN) - APIC MMIO access virtualisation
> Jun 6 12:09:27.097051 (XEN) - APIC TPR shadow
> Jun 6 12:09:27.097088 (XEN) - Extended Page Tables (EPT)
> Jun 6 12:09:27.097118 (XEN) - Virtual-Processor Identifiers (VPID)
> Jun 6 12:09:27.105066 (XEN) - Virtual NMI
> Jun 6 12:09:27.105100 (XEN) - MSR direct-access bitmap
> Jun 6 12:09:27.105130 (XEN) - Unrestricted Guest
> Jun 6 12:09:27.113269 (XEN) - APIC Register Virtualization
> Jun 6 12:09:27.113290 (XEN) - Virtual Interrupt Delivery
> Jun 6 12:09:27.113328 (XEN) - Posted Interrupt Processing
> Jun 6 12:09:27.121180 (XEN) HVM: ASIDs enabled.
> Jun 6 12:09:27.121235 (XEN) HVM: VMX enabled
> Jun 6 12:09:27.121267 (XEN) HVM: Hardware Assisted Paging (HAP) detected
> Jun 6 12:09:27.129069 (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
>
> I guess they are pretty new?
Could it be a missing microcode update? I don't know whether osstest does
the ucode=scan or updates the microcode later.
>
> Ian.
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 13:50 ` Konrad Rzeszutek Wilk
@ 2015-06-08 14:02 ` Ian Campbell
2015-06-08 14:47 ` Ian Jackson
1 sibling, 0 replies; 40+ messages in thread
From: Ian Campbell @ 2015-06-08 14:02 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, ian.jackson, Jan Beulich, xen-devel
On Mon, 2015-06-08 at 09:50 -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 08, 2015 at 10:27:32AM +0100, Ian Campbell wrote:
> > On Mon, 2015-06-08 at 10:15 +0100, Jan Beulich wrote:
> > > >>> On 08.06.15 at 10:53, <ian.campbell@citrix.com> wrote:
> > > > That's 6/14 (43%) failure rate on fiano0 and 2/10 (20%) on fiano1. Which
> > > > differs from the apparent xen-unstable failure rate. But I wouldn't take
> > > > this as evidence that the two systems differ significantly, despite how
> > > > the unstable results looked at first glance.
> > >
> > > So we can basically rule out just one of the hosts being the culprit;
> > > it's either both or our software. Considering that (again at the
> > > example of the recent 4.2 flight) the guest is apparently waiting for
> > > a timer (or other) interrupt (on a HLT instruction), this is very likely
> > > interrupt delivery related, yet (as said before, albeit wrongly for
> > > 4.3) 4.2 doesn't have APICV support yet (4.3 only lacks the option
> > > to disable it), so it can't be that (alone).
> > >
> > > Looking at the hardware - are fiano[01], in terms of CPU and
> > > chipset, perhaps the newest or oldest in the pool? (I'm trying to
> > > make myself a picture of what debugging options we have.)
> >
> > I don't know much about the hardware in the pool other than what can be
> > gathered from the serial and dmesg logs.
> >
> > http://logs.test-lab.xenproject.org/osstest/logs/58028/test-amd64-amd64-xl-qemuu-win7-amd64/info.html
> >
> > From the serial log and this:
> >
> > Jun 6 12:09:27.089020 (XEN) VMX: Supported advanced features:
> > Jun 6 12:09:27.089052 (XEN) - APIC MMIO access virtualisation
> > Jun 6 12:09:27.097051 (XEN) - APIC TPR shadow
> > Jun 6 12:09:27.097088 (XEN) - Extended Page Tables (EPT)
> > Jun 6 12:09:27.097118 (XEN) - Virtual-Processor Identifiers (VPID)
> > Jun 6 12:09:27.105066 (XEN) - Virtual NMI
> > Jun 6 12:09:27.105100 (XEN) - MSR direct-access bitmap
> > Jun 6 12:09:27.105130 (XEN) - Unrestricted Guest
> > Jun 6 12:09:27.113269 (XEN) - APIC Register Virtualization
> > Jun 6 12:09:27.113290 (XEN) - Virtual Interrupt Delivery
> > Jun 6 12:09:27.113328 (XEN) - Posted Interrupt Processing
> > Jun 6 12:09:27.121180 (XEN) HVM: ASIDs enabled.
> > Jun 6 12:09:27.121235 (XEN) HVM: VMX enabled
> > Jun 6 12:09:27.121267 (XEN) HVM: Hardware Assisted Paging (HAP) detected
> > Jun 6 12:09:27.129069 (XEN) HVM: HAP page sizes: 4kB, 2MB, 1GB
> >
> > I guess they are pretty new?
>
> Could it be a missing microcode update? I don't know if the OSSTest does
> the ucode=scan or updates the microcode later?
I rather suspect it doesn't do microcode updates at all. (It probably
should.)
Is there some reason to expect APICV (or something else) would cause
these failures if microcode wasn't up to date?
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 10:17 ` Jan Beulich
@ 2015-06-08 14:43 ` Ian Jackson
0 siblings, 0 replies; 40+ messages in thread
From: Ian Jackson @ 2015-06-08 14:43 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, Ian Campbell, xen-devel
Jan Beulich writes ("Re: [Xen-devel] [xen-unstable test] 57852: regressions - FAIL"):
> On 08.06.15 at 11:27, <ian.campbell@citrix.com> wrote:
> > I don't know much about the hardware in the pool other than what can be
> > gathered from the serial and dmesg logs.
>
> Right - this is useful for learning details of an individual system, but
> isn't really helpful when wanting to compare all system kinds that are
> in the pool.
The other information we have is from the procurement exercise.
Summary spreadsheet:
http://xenbits.xen.org/gitweb/?p=people/iwj/colo-for-testing.git;a=blob;f=selections.ods;h=82134d5bc2c441a0b23006edc33a8ad80aae71e3;hb=master
Contract:
http://xenbits.xen.org/gitweb/?p=people/iwj/colo-for-testing.git;a=blob;f=PURCHASE+AND+SALE+AGREEMENT.doc;h=a87814184c1a8fe45ec1992548cfac088113f3d7;hb=master
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 13:50 ` Konrad Rzeszutek Wilk
2015-06-08 14:02 ` Ian Campbell
@ 2015-06-08 14:47 ` Ian Jackson
2015-06-08 15:21 ` Konrad Rzeszutek Wilk
1 sibling, 1 reply; 40+ messages in thread
From: Ian Jackson @ 2015-06-08 14:47 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, Ian Campbell, Jan Beulich, xen-devel
Konrad Rzeszutek Wilk writes ("Re: [Xen-devel] [xen-unstable test] 57852: regressions - FAIL"):
> Could it be a missing microcode update? I don't know if the OSSTest does
> the ucode=scan or updates the microcode later?
I think osstest's machines don't get microcode updates. I'm no expert
on x86 microcode, but my understanding is:
Microcode updates (i) have to be loaded dynamically at boot time and
(ii) are regarded as a non-free package by Debian.
We could arrange to install the non-free microcode package. I haven't
looked into it but I would expect that to automatically arrange to
load the microcode for native boots. It IMO ought to do the same for
non-native boots but I wouldn't rely on that being the case.
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 14:47 ` Ian Jackson
@ 2015-06-08 15:21 ` Konrad Rzeszutek Wilk
2015-06-08 15:29 ` Ian Campbell
0 siblings, 1 reply; 40+ messages in thread
From: Konrad Rzeszutek Wilk @ 2015-06-08 15:21 UTC (permalink / raw)
To: Ian Jackson; +Cc: Andrew Cooper, Ian Campbell, Jan Beulich, xen-devel
On Mon, Jun 08, 2015 at 03:47:22PM +0100, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("Re: [Xen-devel] [xen-unstable test] 57852: regressions - FAIL"):
> > Could it be a missing microcode update? I don't know if the OSSTest does
> > the ucode=scan or updates the microcode later?
>
> I think osstest's machines don't get microcode updates. I'm no expert
> on x86 microcode, but my understanding is:
>
> Microcode updates (i) have to be loaded dynamically at boot time and
> (ii) are regarded as a non-free package by Debian.
>
> We could arrange to install the non-free microcode package. I haven't
> looked into it but I would expect that to automatically arrange to
> load the microcode for native boots. It IMO ought to do the same for
> non-native boots but I wouldn't rely on that being the case.
If Debian is using dracut, it just requires adding to /etc/dracut.conf:
early_microcode=yes
and from there on any regenerated initramfs will have the required
microcode. Then 'ucode=scan' needs to be added on the Xen command line.
But this is a shot in the dark - the microcode update might have nothing
to do with these failures.
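[Editor's note: a minimal sketch of the recipe above for a dracut-based initramfs. Nothing here is run against a real system; the GRUB entry and file path in the comments are hypothetical, and as noted in the follow-up, Debian actually uses initramfs-tools rather than dracut.]

```shell
# The two pieces of Konrad's recipe, held in variables for illustration.
dracut_conf='early_microcode=yes'   # line to append to /etc/dracut.conf
xen_cmdline='ucode=scan'            # option to add to the Xen command line
# After editing /etc/dracut.conf, regenerate the initramfs:
#   dracut --force
# and boot Xen with the option added, e.g. in a GRUB2 entry
# (hypothetical path):
#   multiboot2 /boot/xen.gz ucode=scan ...
echo "dracut: $dracut_conf / xen: $xen_cmdline"
```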
>
> Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 15:21 ` Konrad Rzeszutek Wilk
@ 2015-06-08 15:29 ` Ian Campbell
0 siblings, 0 replies; 40+ messages in thread
From: Ian Campbell @ 2015-06-08 15:29 UTC (permalink / raw)
To: Konrad Rzeszutek Wilk; +Cc: Andrew Cooper, Ian Jackson, Jan Beulich, xen-devel
On Mon, 2015-06-08 at 11:21 -0400, Konrad Rzeszutek Wilk wrote:
> On Mon, Jun 08, 2015 at 03:47:22PM +0100, Ian Jackson wrote:
> > Konrad Rzeszutek Wilk writes ("Re: [Xen-devel] [xen-unstable test] 57852: regressions - FAIL"):
> > > Could it be a missing microcode update? I don't know if the OSSTest does
> > > the ucode=scan or updates the microcode later?
> >
> > I think osstest's machines don't get microcode updates. I'm no expert
> > on x86 microcode, but my understanding is:
> >
> > Microcode updates (i) have to be loaded dynamically at boot time and
> > (ii) are regarded as a non-free package by Debian.
> >
> > We could arrange to install the non-free microcode package. I haven't
> > looked into it but I would expect that to automatically arrange to
> > load the microcode for native boots. It IMO ought to do the same for
> > non-native boots but I wouldn't rely on that being the case.
>
> If Debian is using dracut
It doesn't, it uses initramfs-tools.
I'm not sure about Wheezy, but from Jessie onwards installing the
microcode packages adds hooks which make initramfs-tools do the right
thing.
> it just requires adding in /etc/dracut.conf
> early_microcode=yes
>
> and from there on any regenerated initramfs will have the required
> microcode. Then 'ucode=scan' needs to be added on the Xen command line.
This is still needed with initramfs-tools.
There's also Debian bug #785187 (discussed on xen-devel) which stops
ucode=scan from working, for a reason I've not had a chance to dig into
yet.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-08 12:16 ` Ian Campbell
2015-06-08 12:19 ` Andrew Cooper
2015-06-08 12:24 ` Jan Beulich
@ 2015-06-09 8:26 ` Ian Campbell
2015-06-09 9:29 ` Jan Beulich
2 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-06-09 8:26 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Mon, 2015-06-08 at 13:16 +0100, Ian Campbell wrote:
> The adhoc run passed, but that's not statistically significant.
I ran a bunch more in this no-apicv configuration, the logs are at
http://logs.test-lab.xenproject.org/osstest/logs/<NNNN>:
Flight Host Failed at
58190 fiano0 ts-guest-stop
58198 fiano0 ts-guest-stop
58203 fiano0 ts-windows-install
58208 fiano0 ts-guest-stop
58210 fiano1 ts-guest-stop
58214 fiano1 ts-guest-stop
58217 fiano1 ts-guest-stop
I think that's not sufficient data to draw a conclusion, since there was
always a small background failure rate. I'm going to run another half
dozen (3 on each).
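[Editor's note: the "not statistically significant" point can be checked with a quick pooled two-proportion z test on the per-host counts quoted earlier in the thread (6/14 on fiano0 vs 2/10 on fiano1). The test choice and the script are illustrative additions, not anything osstest runs.]

```shell
# Pooled two-proportion z test on the quoted per-host failure counts.
z=$(awk 'BEGIN {
    f0 = 6;  n0 = 14                        # fiano0: failures / runs
    f1 = 2;  n1 = 10                        # fiano1: failures / runs
    p0 = f0 / n0; p1 = f1 / n1
    p  = (f0 + f1) / (n0 + n1)              # pooled failure rate
    se = sqrt(p * (1 - p) * (1/n0 + 1/n1))  # pooled standard error
    printf "%.2f", (p0 - p1) / se
}')
echo "z = $z (|z| < 1.96, so the two hosts are indistinguishable at 95%)"
```

With z around 1.17 the difference between the two hosts is well inside sampling noise, which matches the reading above.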
NB the build artefacts from 57852 got garbage collected, so 58190
rebuilt them all (from the same versions) and those were used for the
other flights. I think this is unlikely to have made any difference.
I'm also going to run some on another pair of hosts without no-apicv. I
chose elbling[01] since according to
http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.html it has been used a few times on various branches and hasn't so far failed a ts-windows-install. It looks to have the same set of advanced features as fiano*:
Jun 8 05:24:51.033042 (XEN) VMX: Supported advanced features:
Jun 8 05:24:51.041023 (XEN) - APIC MMIO access virtualisation
Jun 8 05:24:51.049018 (XEN) - APIC TPR shadow
Jun 8 05:24:51.049050 (XEN) - Extended Page Tables (EPT)
Jun 8 05:24:51.049079 (XEN) - Virtual-Processor Identifiers (VPID)
Jun 8 05:24:51.057032 (XEN) - Virtual NMI
Jun 8 05:24:51.057063 (XEN) - MSR direct-access bitmap
Jun 8 05:24:51.065034 (XEN) - Unrestricted Guest
Jun 8 05:24:51.065067 (XEN) - APIC Register Virtualization
Jun 8 05:24:51.073034 (XEN) - Virtual Interrupt Delivery
Jun 8 05:24:51.073073 (XEN) - Posted Interrupt Processing
Jun 8 05:24:51.073101 (XEN) HVM: ASIDs enabled.
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-09 8:26 ` Ian Campbell
@ 2015-06-09 9:29 ` Jan Beulich
2015-06-10 8:50 ` Ian Campbell
0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2015-06-09 9:29 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 09.06.15 at 10:26, <ian.campbell@citrix.com> wrote:
> On Mon, 2015-06-08 at 13:16 +0100, Ian Campbell wrote:
>
>> The adhoc run passed, but that's not statistically significant.
>
> I ran a bunch more in this no-apicv configuration, the logs are at
> http://logs.test-lab.xenproject.org/osstest/logs/<NNNN>:
>
> Flight Host Failed at
> 58190 fiano0 ts-guest-stop
> 58198 fiano0 ts-guest-stop
> 58203 fiano0 ts-windows-install
> 58208 fiano0 ts-guest-stop
> 58210 fiano1 ts-guest-stop
> 58214 fiano1 ts-guest-stop
> 58217 fiano1 ts-guest-stop
>
> I think that's not sufficient data to draw a conclusion, since there was
> always a small background failure rate. I'm going to run another half
> dozen (3 on each).
At least the one failure is following the patterns of previous ones
(ping timing out and guest sitting with both of its vCPU-s on HLT,
and the last VM entry having delivered a timer interrupt). Without
knowing _when_ that last timer interrupt got injected and whether
other interrupts are occurring for the guest as necessary, that
again doesn't mean much.
> I'm also going to run some on another pair of hosts without no-apicv. I
> chose elbling[01] since according to
> http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.html it has been used a few times on various branches
> and hasn't so far failed a ts-windows-install. It looks to have the same set
> of advanced features as fiano*:
That's a good idea, thanks.
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-09 9:29 ` Jan Beulich
@ 2015-06-10 8:50 ` Ian Campbell
2015-06-10 9:36 ` Jan Beulich
0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-06-10 8:50 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Tue, 2015-06-09 at 10:29 +0100, Jan Beulich wrote:
> >>> On 09.06.15 at 10:26, <ian.campbell@citrix.com> wrote:
> > On Mon, 2015-06-08 at 13:16 +0100, Ian Campbell wrote:
> >
> >> The adhoc run passed, but that's not statistically significant.
> >
> > I ran a bunch more in this no-apicv configuration, the logs are at
> > http://logs.test-lab.xenproject.org/osstest/logs/<NNNN>:
> >
> > Flight Host Failed at
> > 58190 fiano0 ts-guest-stop
> > 58198 fiano0 ts-guest-stop
> > 58203 fiano0 ts-windows-install
> > 58208 fiano0 ts-guest-stop
> > 58210 fiano1 ts-guest-stop
> > 58214 fiano1 ts-guest-stop
> > 58217 fiano1 ts-guest-stop
> >
> > I think that's not sufficient data to draw a conclusion, since there was
> > always a small background failure rate. I'm going to run another half
> > dozen (3 on each).
>
> At least the one failure is following the patterns of previous ones
> (ping timing out and guest sitting with both of its vCPU-s on HLT,
> and the last VM entry having delivered a timer interrupt). Without
> knowing _when_ that last timer interrupt got injected and whether
> other interrupts are occurring for the guest as necessary, that
> again doesn't mean much.
58243 fiano0 ts-guest-stop
58251 fiano0 ts-guest-stop
58258 fiano0 ts-guest-stop
58266 fiano1 ts-guest-stop
58279 fiano1 ts-guest-stop
58282 fiano1 ts-guest-stop
> > I'm also going to run some on another pair of hosts without no-apicv. I
> > chose elbling[01] since according to
> > http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.html it has been used a few times on various branches
> > and hasn't so far failed a ts-windows-install. It looks to have the same set
> > of advanced features as fiano*:
>
> That's a good idea, thanks.
58244 elbling0 ts-guest-stop
58250 elbling0 ts-guest-stop
58256 elbling0 ts-guest-stop
58261 elbling1 ts-guest-stop
58269 elbling1 ts-guest-stop
58274 elbling1 ts-guest-stop
So it is looking awfully like a host-specific issue with apicv on
fiano*.
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 8:50 ` Ian Campbell
@ 2015-06-10 9:36 ` Jan Beulich
2015-06-10 11:01 ` Ian Campbell
0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2015-06-10 9:36 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 10.06.15 at 10:50, <ian.campbell@citrix.com> wrote:
> On Tue, 2015-06-09 at 10:29 +0100, Jan Beulich wrote:
>> >>> On 09.06.15 at 10:26, <ian.campbell@citrix.com> wrote:
>> > On Mon, 2015-06-08 at 13:16 +0100, Ian Campbell wrote:
>> >
>> >> The adhoc run passed, but that's not statistically significant.
>> >
>> > I ran a bunch more in this no-apicv configuration, the logs are at
>> > http://logs.test-lab.xenproject.org/osstest/logs/<NNNN>:
>> >
>> > Flight Host Failed at
>> > 58190 fiano0 ts-guest-stop
>> > 58198 fiano0 ts-guest-stop
>> > 58203 fiano0 ts-windows-install
>> > 58208 fiano0 ts-guest-stop
>> > 58210 fiano1 ts-guest-stop
>> > 58214 fiano1 ts-guest-stop
>> > 58217 fiano1 ts-guest-stop
>> >
>> > I think that's not sufficient data to draw a conclusion, since there was
>> > always a small background failure rate. I'm going to run another half
>> > dozen (3 on each).
>>
>> At least the one failure is following the patterns of previous ones
>> (ping timing out and guest sitting with both of its vCPU-s on HLT,
>> and the last VM entry having delivered a timer interrupt). Without
>> knowing _when_ that last timer interrupt got injected and whether
>> other interrupts are occurring for the guest as necessary, that
>> again doesn't mean much.
>
> 58243 fiano0 ts-guest-stop
> 58251 fiano0 ts-guest-stop
> 58258 fiano0 ts-guest-stop
> 58266 fiano1 ts-guest-stop
> 58279 fiano1 ts-guest-stop
> 58282 fiano1 ts-guest-stop
>
>> > I'm also going to run some on another pair of hosts without no-apicv. I
>> > chose elbling[01] since according to
>> >
>> > http://logs.test-lab.xenproject.org/osstest/results/history.test-amd64-amd64-xl-qemuu-win7-amd64.html it has been used a few times on various branches
>> > and hasn't so far failed a ts-windows-install. It looks to have the same set
>> > of advanced features as fiano*:
>>
>> That's a good idea, thanks.
>
> 58244 elbling0 ts-guest-stop
> 58250 elbling0 ts-guest-stop
> 58256 elbling0 ts-guest-stop
> 58261 elbling1 ts-guest-stop
> 58269 elbling1 ts-guest-stop
> 58274 elbling1 ts-guest-stop
>
> So it is looking awfully like a host specific issue with apicv on
> fiano*.
Indeed. Leaving us with the slight hope that there is a microcode
update available that's newer than what the BIOS of those boxes
loads. Could we perhaps afford un-blessing the two systems for
the time being? And maybe get Intel involved if there's no ucode
update available that helps?
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 9:36 ` Jan Beulich
@ 2015-06-10 11:01 ` Ian Campbell
2015-06-10 11:48 ` Jan Beulich
0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-06-10 11:01 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Wed, 2015-06-10 at 10:36 +0100, Jan Beulich wrote:
> Indeed. Leaving us with the slight hope that there is a microcode
> update available that's newer than what the BIOS of those boxes
> loads. Could we perhaps afford un-blessing the two systems for
> the time being? And maybe get Intel involved if there's no ucode
> update available that helps?
Arranging to do microcode updates looks like it is going to be a bit
non-trivial from the osstest side. Is there any reason to think it would
help other than just hoping it will?
Can't we get Intel involved right away?
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 11:01 ` Ian Campbell
@ 2015-06-10 11:48 ` Jan Beulich
2015-06-10 12:56 ` Ian Campbell
0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2015-06-10 11:48 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 10.06.15 at 13:01, <ian.campbell@citrix.com> wrote:
> On Wed, 2015-06-10 at 10:36 +0100, Jan Beulich wrote:
>> Indeed. Leaving us with the slight hope that there is a microcode
>> update available that's newer than what the BIOS of those boxes
>> loads. Could we perhaps afford un-blessing the two systems for
>> the time being? And maybe get Intel involved if there's no ucode
>> update available that helps?
>
> Arranging to do microcode updates looks like it is going to be a bit
> non-trivial from the osstest side. Is there any reason to think it would
> help other than just hoping it will?
It's really hope, not much more. But I guess you could at least check
what microcode the box has in use - if there's nothing newer available,
then trying to get microcode updating working isn't of immediate
importance anymore (but of course it would still be nice to have in
place).
> Can't we get Intel involved right away?
Sure we can; I just generally prefer not to bother people with
problems they already solved, but maybe that's the wrong approach in
a case like this.
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 11:48 ` Jan Beulich
@ 2015-06-10 12:56 ` Ian Campbell
2015-06-10 13:23 ` Jan Beulich
` (2 more replies)
0 siblings, 3 replies; 40+ messages in thread
From: Ian Campbell @ 2015-06-10 12:56 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Wed, 2015-06-10 at 12:48 +0100, Jan Beulich wrote:
> >>> On 10.06.15 at 13:01, <ian.campbell@citrix.com> wrote:
> > On Wed, 2015-06-10 at 10:36 +0100, Jan Beulich wrote:
> >> Indeed. Leaving us with the slight hope that there is a microcode
> >> update available that's newer than what the BIOS of those boxes
> >> loads. Could we perhaps afford un-blessing the two systems for
> >> the time being? And maybe get Intel involved if there's no ucode
> >> update available that helps?
> >
> > Arranging to do microcode updates looks like it is going to be a bit
> > non-trivial from the osstest side. Is there any reason to think it would
> > help other than just hoping it will?
>
> It's really hope, not much more.
OK. I think this is something which is worth doing but I'm going to
treat it more like a feature request than a bug fix in terms of
prioritising it.
> But I guess you could at least check
> what microcode the box has in use - if there's nothing newer available,
> then trying to get the microcode updating working isn't of immediate
> importance anymore (but of course it would still be nice to have in
> place).
I logged into fiano1 while it was running under Xen:
cpuinfo contains (just the first processor for brevity):
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-2403 v2 @ 1.80GHz
stepping : 4
microcode : 0x416
cpu MHz : 1800.041
cache size : 10240 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu de tsc msr pae mce cx8 apic sep mca cmov pat clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc nonstop_tsc eagerfpu pni pclmulqdq monitor est ssse3 sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor arat epb xsaveopt pln pts dtherm fsgsbase erms
bogomips : 3600.08
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
I'll hold onto the machine for another hour (until 1500 BST) if you want
to know anything else (otherwise I'll have to relock it which will imply
waiting for a test to finish)
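[Editor's note: when comparing boxes like this it helps to confirm that every package reports the same microcode revision. A small sketch of that check follows; the here-doc stands in for the real /proc/cpuinfo, replaying the fields quoted above.]

```shell
# Count distinct microcode revisions across CPUs. On a real host, point
# the awk at /proc/cpuinfo instead of the embedded sample.
revs=$(awk -F': ' '/^microcode/ { seen[$2]++ }
                   END { for (r in seen) print r, "on", seen[r], "cpu(s)" }' <<'EOF'
processor	: 0
microcode	: 0x416
processor	: 1
microcode	: 0x416
EOF
)
echo "$revs"
```

A single output line means all sampled CPUs run the same revision; multiple lines would point at a partially applied update.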
> > Can't we get Intel involved right away?
>
> Sure we can; I just generally prefer not to bother people with
> problems they already solved, but maybe that's the wrong approach
> in a case like this.
Is the list of errata fixed by a given ucode update public? If not then
I think we've done sufficient due diligence that we should feel ok to
ask, even if the answer turns out to be fixed in microcode.
Ian.
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 12:56 ` Ian Campbell
@ 2015-06-10 13:23 ` Jan Beulich
2015-06-10 13:45 ` Jan Beulich
2015-06-10 14:34 ` Ian Campbell
2 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2015-06-10 13:23 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 10.06.15 at 14:56, <ian.campbell@citrix.com> wrote:
> On Wed, 2015-06-10 at 12:48 +0100, Jan Beulich wrote:
>> But I guess you could at least check
>> what microcode the box has in use - if there's nothing newer available,
>> then trying to get the microcode updating working isn't of immediate
>> importance anymore (but of course it would still be nice to have in
>> place).
>
> I logged into fiano1 while it was running under Xen:
>
> cpuinfo contains (just the first processor for brevity):
>
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 62
> model name : Intel(R) Xeon(R) CPU E5-2403 v2 @ 1.80GHz
> stepping : 4
> microcode : 0x416
Peeking into the microcode files I have lying around, 0x428 ought to
be available for that family+model+stepping.
>> > Can't we get Intel involved right away?
>>
>> Sure we can; I just generally prefer not to bother people with
>> problems they already solved, but maybe that's the wrong approach
>> a case like this.
>
> Is the list of errata fixed by a given ucode update public? If not then
> I think we've done sufficient due diligence that we should feel ok to
> ask, even if the answer turns out to be fixed in microcode.
No, Intel isn't doing as good a job as AMD in that regard.
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 12:56 ` Ian Campbell
2015-06-10 13:23 ` Jan Beulich
@ 2015-06-10 13:45 ` Jan Beulich
2015-06-10 14:08 ` Ian Campbell
2015-06-10 14:34 ` Ian Campbell
2 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2015-06-10 13:45 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 10.06.15 at 14:56, <ian.campbell@citrix.com> wrote:
> On Wed, 2015-06-10 at 12:48 +0100, Jan Beulich wrote:
>> Sure we can; I just generally prefer not to bother people with
>> problems they already solved, but maybe that's the wrong approach
>> in a case like this.
>
> Is the list of errata fixed by a given ucode update public? If not then
> I think we've done sufficient due diligence that we should feel ok to
> ask, even if the answer turns out to be fixed in microcode.
So I went through the errata list for that specific model; the only
one really concerning seems to be CA135 ("A MOV to CR3 When
EPT is Enabled May Lead to an Unexpected Page Fault or an
Incorrect Page Translation"). But while it would affect us, it would
quite likely make the guest crash instead of idling or being hung.
So if we're going to approach Intel with this - will you or should I?
Jan
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 13:45 ` Jan Beulich
@ 2015-06-10 14:08 ` Ian Campbell
2015-06-11 7:02 ` Jan Beulich
0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-06-10 14:08 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Wed, 2015-06-10 at 14:45 +0100, Jan Beulich wrote:
> >>> On 10.06.15 at 14:56, <ian.campbell@citrix.com> wrote:
> > On Wed, 2015-06-10 at 12:48 +0100, Jan Beulich wrote:
> >> Sure we can; I just generally prefer not to bother people with
> >> problems they already solved, but maybe that's the wrong approach
> >> a case like this.
> >
> > Is the list of errata fixed by a given ucode update public? If not then
> > I think we've done sufficient due diligence that we should feel ok to
> > ask, even if the answer turns out to be fixed in microcode.
>
> So I went through the errata list for that specific model; the only
> one really concerning seems to be CA135 ("A MOV to CR3 When
> EPT is Enabled May Lead to an Unexpected Page Fault or an
> Incorrect Page Translation"). But while it would affect us, it would
> quite likely make the guest crash instead of idling or being hung.
Yes, sounds like it.
> So if we're going to approach Intel with this - will you or should I?
I think it'd be best coming from you.
Ian.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 12:56 ` Ian Campbell
2015-06-10 13:23 ` Jan Beulich
2015-06-10 13:45 ` Jan Beulich
@ 2015-06-10 14:34 ` Ian Campbell
2015-06-10 15:59 ` Jan Beulich
2 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-06-10 14:34 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Wed, 2015-06-10 at 13:56 +0100, Ian Campbell wrote:
> > > Arranging to do microcode updates looks like it is going to be a bit
> > > non-trivial from the osstest side.
>
> OK. I think this is something which is worth doing
So for AMD I think things are pretty clear: cat
linux-firmware.git/amd-ucode/*.bin into
kernel/x86/microcode/AuthenticAMD.bin inside microcode.cpio.
For Intel I'm less sure, I've got microcode-20150121.tgz containing
microcode.dat. Is that just to be placed at
kernel/x86/microcode/GenuineIntel.bin and done, or is there some
processing needed?
I've got a thing called iucode-tool in my hand from a Debian package if
I need it.
Ian.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 14:34 ` Ian Campbell
@ 2015-06-10 15:59 ` Jan Beulich
2015-06-10 16:18 ` Don Slutz
2015-06-10 18:00 ` Ian Campbell
0 siblings, 2 replies; 40+ messages in thread
From: Jan Beulich @ 2015-06-10 15:59 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 10.06.15 at 16:34, <ian.campbell@citrix.com> wrote:
> For Intel I'm less sure, I've got microcode-20150121.tgz containing
> microcode.dat. Is that just to be placed at
> kernel/x86/microcode/GenuineIntel.bin and done, or is there some
> processing needed?
The full blob (albeit usually named microcode.bin; microcode.dat
ordinarily is a text file) can be used if so desired, but there's also a
tool to split it into more fine-grained chunks.
> I've got a thing called iucode-tool in my hand from a Debian package if
> I need it.
Or maybe there are multiple different tools - the one I know about
is commonly named intel-microcode2ucode taking microcode.dat as
input and producing microcode.bin as well as many individual
<family>-<model>-<stepping> blobs.
Jan
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 15:59 ` Jan Beulich
@ 2015-06-10 16:18 ` Don Slutz
2015-06-10 18:00 ` Ian Campbell
1 sibling, 0 replies; 40+ messages in thread
From: Don Slutz @ 2015-06-10 16:18 UTC (permalink / raw)
To: Jan Beulich, Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
On 06/10/15 11:59, Jan Beulich wrote:
>>>> On 10.06.15 at 16:34, <ian.campbell@citrix.com> wrote:
>> For Intel I'm less sure, I've got microcode-20150121.tgz containing
>> microcode.dat. Is that just to be placed at
>> kernel/x86/microcode/GenuineIntel.bin and done, or is there some
>> processing needed?
>
> The full blob (albeit usually named microcode.bin; microcode.dat
> ordinarily is a text file) can be used if so desired, but there's also a
> tool to split it into more fine-grained chunks.
>
>> I've got a thing called iucode-tool in my hand from a Debian package if
>> I need it.
>
> Or maybe there are multiple different tools - the one I know about
> is commonly named intel-microcode2ucode taking microcode.dat as
> input and producing microcode.bin as well as many individual
> <family>-<model>-<stepping> blobs.
>
Well, my version did not produce microcode.bin. Based on:
...
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Xeon(R) CPU E31265L @ 2.40GHz
stepping : 7
...
and grub2 (not sure about ucode=scan):
1) intel-microcode2ucode microcode.dat
2a) cp intel-ucode/06-2a-07 /boot/microcode.bin
or
2b) cat intel-ucode/* >/boot/microcode.bin
3) Make sure "ucode=-1" is in GRUB_CMDLINE_XEN
4) /sbin/grub2-mkconfig -o /boot/grub2/grub.cfg
And you see microcode loaded on the serial console.
-Don Slutz
> Jan
^ permalink raw reply [flat|nested] 40+ messages in thread
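[Editor's note: the only non-obvious step in Don's recipe is mapping the
decimal family/model/stepping values from /proc/cpuinfo to the hex file
name that intel-microcode2ucode emits. A minimal sketch of that mapping;
the helper name is ours, not from the thread:]

```shell
#!/bin/sh
# Map decimal family/model/stepping (as printed by /proc/cpuinfo) to the
# <ff>-<mm>-<ss> hex blob name under intel-ucode/, as used in step 2a.
sig_to_blob() {
    printf '%02x-%02x-%02x\n' "$1" "$2" "$3"
}
# Don's example CPU: family 6, model 42, stepping 7
sig_to_blob 6 42 7    # prints 06-2a-07
```

[With that name in hand, step 2a becomes
`cp intel-ucode/$(sig_to_blob 6 42 7) /boot/microcode.bin`.]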
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 15:59 ` Jan Beulich
2015-06-10 16:18 ` Don Slutz
@ 2015-06-10 18:00 ` Ian Campbell
1 sibling, 0 replies; 40+ messages in thread
From: Ian Campbell @ 2015-06-10 18:00 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Wed, 2015-06-10 at 16:59 +0100, Jan Beulich wrote:
> >>> On 10.06.15 at 16:34, <ian.campbell@citrix.com> wrote:
> > For Intel I'm less sure, I've got microcode-20150121.tgz containing
> > microcode.dat. Is that just to be placed at
> > kernel/x86/microcode/GenuineIntel.bin and done, or is there some
> > processing needed?
>
> The full blob (albeit usually named microcode.bin; microcode.dat
> ordinarily is a text file) can be used if so desired, but there's also a
> tool to split it into more fine-grained chunks.
>
> > I've got a thing called iucode-tool in my hand from a Debian package if
> > I need it.
>
> Or maybe there are multiple different tools - the one I know about
> is commonly named intel-microcode2ucode taking microcode.dat as
> input and producing microcode.bin as well as many individual
> <family>-<model>-<stepping> blobs.
Not sure if they are the same tool, but I seem to have managed to get
iucode-tool to take my microcode.dat and produce a suitable binary file.
I'm testing the integration with osstest now.
Thanks,
Ian.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-10 14:08 ` Ian Campbell
@ 2015-06-11 7:02 ` Jan Beulich
2015-06-11 8:45 ` Ian Campbell
0 siblings, 1 reply; 40+ messages in thread
From: Jan Beulich @ 2015-06-11 7:02 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 10.06.15 at 16:08, <ian.campbell@citrix.com> wrote:
> On Wed, 2015-06-10 at 14:45 +0100, Jan Beulich wrote:
>> So if we're going to approach Intel with this - will you or should I?
>
> I think it'd be best coming from you.
I've just sent it off; in putting together the technical details it
became clear that elbling* indeed are at a newer microcode level,
so I think this at least slightly raises the chances of an update to
help fiano* (if so I of course wonder why the vendor hasn't made
a suitable BIOS update available yet).
Jan
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-11 7:02 ` Jan Beulich
@ 2015-06-11 8:45 ` Ian Campbell
2015-06-15 8:57 ` Ian Campbell
0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-06-11 8:45 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Thu, 2015-06-11 at 08:02 +0100, Jan Beulich wrote:
> >>> On 10.06.15 at 16:08, <ian.campbell@citrix.com> wrote:
> > On Wed, 2015-06-10 at 14:45 +0100, Jan Beulich wrote:
> >> So if we're going to approach Intel with this - will you or should I?
> >
> > I think it'd be best coming from you.
>
> I've just sent it off; in putting together the technical details it
> became clear that elbling* indeed are at a newer microcode level,
> so I think this at least slightly raises the chances of an update to
> help fiano* (if so I of course wonder why the vendor hasn't made
> a suitable BIOS update available yet).
It's possible that there is one which we've not applied...
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-11 8:45 ` Ian Campbell
@ 2015-06-15 8:57 ` Ian Campbell
2015-06-15 9:03 ` Jan Beulich
0 siblings, 1 reply; 40+ messages in thread
From: Ian Campbell @ 2015-06-15 8:57 UTC (permalink / raw)
To: Jan Beulich; +Cc: Andrew Cooper, ian.jackson, xen-devel
On Thu, 2015-06-11 at 09:45 +0100, Ian Campbell wrote:
> On Thu, 2015-06-11 at 08:02 +0100, Jan Beulich wrote:
> > >>> On 10.06.15 at 16:08, <ian.campbell@citrix.com> wrote:
> > > On Wed, 2015-06-10 at 14:45 +0100, Jan Beulich wrote:
> > >> So if we're going to approach Intel with this - will you or should I?
> > >
> > > I think it'd be best coming from you.
> >
> > I've just sent it off; in putting together the technical details it
> > became clear that elbling* indeed are at a newer microcode level,
> > so I think this at least slightly raises the chances of an update to
> > help fiano* (if so I of course wonder why the vendor hasn't made
> > a suitable BIOS update available yet).
>
> It's possible that there is one which we've not applied...
I've now run a bunch of adhoc runs with the microcode update in place
(from 0x416 to 0x428 on these particular machines):
58468 fiano0 guest-stop
58479 fiano0 guest-stop
58485 fiano0 windows-install
58494 fiano0 guest-stop
58499 fiano1 guest-stop
58509 fiano1 windows-install
58516 fiano1 guest-stop
58527* fiano0 guest-stop
58531 fiano0 guest-stop
58534 fiano0 guest-stop
58537 fiano0 guest-stop
58538 fiano1 guest-stop
58544 fiano1 guest-stop
58547 fiano1 guest-stop
58550 fiano0 guest-stop
58555 fiano0 guest-stop
58557 fiano0 guest-stop
58560 fiano1 guest-stop
58563 fiano1 guest-stop
58565 fiano1 windows-install
(*) rebuilt binaries because previous build was gc'd, same versions as
before.
So 3/20 = 15% failure rate (fiano0: 1/11=9%; fiano1: 2/9=22%). Which is
better than the ~50% seen at the start of this thread, so it is worth
applying the ucode update I think (and it would have been the right
thing to do regardless).
I do think a 15-20% failure rate might be worthy of further
investigation by Intel too, since the failure rate with no-apicv was
1/13 = 7% (fiano0: 1/7=14%, fiano1: 0/6=0%), although those numbers are
less significant due to fewer runs.
Ian.
^ permalink raw reply [flat|nested] 40+ messages in thread
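[Editor's note: the percentages Ian quotes can be re-derived from the run
counts in his table; a quick awk sanity check, not part of the original
mail:]

```shell
#!/bin/sh
# Recompute the failure rates from the run counts quoted above:
# 3 failures in 20 runs overall, 1/11 on fiano0, 2/9 on fiano1.
awk 'BEGIN {
    printf "overall: %.0f%%\n", 100 * 3 / 20
    printf "fiano0:  %.0f%%\n", 100 * 1 / 11
    printf "fiano1:  %.0f%%\n", 100 * 2 / 9
}'
# prints: overall: 15%, fiano0: 9%, fiano1: 22%
```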
* Re: [xen-unstable test] 57852: regressions - FAIL
2015-06-15 8:57 ` Ian Campbell
@ 2015-06-15 9:03 ` Jan Beulich
0 siblings, 0 replies; 40+ messages in thread
From: Jan Beulich @ 2015-06-15 9:03 UTC (permalink / raw)
To: Ian Campbell; +Cc: Andrew Cooper, ian.jackson, xen-devel
>>> On 15.06.15 at 10:57, <ian.campbell@citrix.com> wrote:
> So 3/20 = 15% failure rate (fiano0: 1/11=9%; fiano1: 2/9=22%). Which is
> better than the ~50% seen at the start of this thread, so it is worth
> applying the ucode update I think (and it would have been regardless the
> right thing to do),
>
> I do think a 15-20% failure rate might be worthy of further
> investigation by Intel too, since the failure rate with no-apicv was
> 1/13 = 7% (fiano0: 1/7=14%, fiano1: 0/6=0%), although those numbers are
> less significant due to fewer runs.
I fully agree; even the remaining 7% should be looked into, provided
Intel can reproduce them on sufficiently similar hardware.
Jan
^ permalink raw reply [flat|nested] 40+ messages in thread
end of thread, other threads:[~2015-06-15 9:03 UTC | newest]
Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-04 12:01 [xen-unstable test] 57852: regressions - FAIL osstest service user
2015-06-05 8:45 ` Ian Campbell
2015-06-05 9:00 ` Jan Beulich
2015-06-05 9:07 ` Ian Campbell
2015-06-05 9:18 ` Jan Beulich
2015-06-05 10:48 ` Ian Campbell
2015-06-05 16:46 ` Ian Campbell
2015-06-08 8:07 ` Jan Beulich
2015-06-08 8:53 ` Ian Campbell
2015-06-08 9:15 ` Jan Beulich
2015-06-08 9:27 ` Ian Campbell
2015-06-08 10:17 ` Jan Beulich
2015-06-08 14:43 ` Ian Jackson
2015-06-08 12:16 ` Ian Campbell
2015-06-08 12:19 ` Andrew Cooper
2015-06-08 12:24 ` Jan Beulich
2015-06-09 8:26 ` Ian Campbell
2015-06-09 9:29 ` Jan Beulich
2015-06-10 8:50 ` Ian Campbell
2015-06-10 9:36 ` Jan Beulich
2015-06-10 11:01 ` Ian Campbell
2015-06-10 11:48 ` Jan Beulich
2015-06-10 12:56 ` Ian Campbell
2015-06-10 13:23 ` Jan Beulich
2015-06-10 13:45 ` Jan Beulich
2015-06-10 14:08 ` Ian Campbell
2015-06-11 7:02 ` Jan Beulich
2015-06-11 8:45 ` Ian Campbell
2015-06-15 8:57 ` Ian Campbell
2015-06-15 9:03 ` Jan Beulich
2015-06-10 14:34 ` Ian Campbell
2015-06-10 15:59 ` Jan Beulich
2015-06-10 16:18 ` Don Slutz
2015-06-10 18:00 ` Ian Campbell
2015-06-08 13:50 ` Konrad Rzeszutek Wilk
2015-06-08 14:02 ` Ian Campbell
2015-06-08 14:47 ` Ian Jackson
2015-06-08 15:21 ` Konrad Rzeszutek Wilk
2015-06-08 15:29 ` Ian Campbell
2015-06-08 10:10 ` Ian Campbell