* [xen-4.5-testing test] 34157: regressions - FAIL
@ 2015-02-05 9:53 xen.org
2015-02-05 12:53 ` Jan Beulich
0 siblings, 1 reply; 7+ messages in thread
From: xen.org @ 2015-02-05 9:53 UTC (permalink / raw)
To: xen-devel; +Cc: ian.jackson
flight 34157 xen-4.5-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088
test-amd64-i386-xl-qemuu-winxpsp3 5 xen-boot fail REGR. vs. 34088
test-amd64-i386-xl-qemut-win7-amd64 5 xen-boot fail REGR. vs. 34088
Tests which did not succeed, but are not blocking:
test-amd64-amd64-xl-pvh-intel 9 guest-start fail never pass
test-armhf-armhf-xl-multivcpu 10 migrate-support-check fail never pass
test-armhf-armhf-xl-sedf-pin 10 migrate-support-check fail never pass
test-armhf-armhf-xl 10 migrate-support-check fail never pass
test-armhf-armhf-xl-midway 10 migrate-support-check fail never pass
test-amd64-amd64-xl-pvh-amd 9 guest-start fail never pass
test-armhf-armhf-xl-sedf 10 migrate-support-check fail never pass
test-amd64-i386-libvirt 9 guest-start fail never pass
test-amd64-amd64-libvirt 9 guest-start fail never pass
test-armhf-armhf-libvirt 9 guest-start fail never pass
test-amd64-amd64-xl-pcipt-intel 9 guest-start fail never pass
test-armhf-armhf-xl-credit2 5 xen-boot fail never pass
test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
test-amd64-i386-xl-winxpsp3 14 guest-stop fail never pass
test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop fail never pass
test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop fail never pass
test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop fail never pass
test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail never pass
test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop fail never pass
test-amd64-amd64-xl-win7-amd64 14 guest-stop fail never pass
test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop fail never pass
test-amd64-i386-xl-win7-amd64 14 guest-stop fail never pass
test-amd64-amd64-xl-winxpsp3 14 guest-stop fail never pass
version targeted for testing:
xen d8e78d691d9b4bcc945d8f0b0ed2b48713931c4d
baseline version:
xen 896437d6305879fab0f8c4f1d7292d1db0de6d97
------------------------------------------------------------
People who touched revisions under test:
Andrew Cooper <andrew.cooper3@citrix.com>
Dan Carpenter <dan.carpenter@oracle.com>
Daniel De Graaf <dgdegra@tycho.nsa.gov>
Ian Campbell <ian.campbell@citrix.com>
Jan Beulich <jbeulich@suse.com>
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tim Deegan <tim@xen.org>
Wei Liu <wei.liu2@citrix.com>
------------------------------------------------------------
jobs:
build-amd64 pass
build-armhf pass
build-i386 pass
build-amd64-libvirt pass
build-armhf-libvirt pass
build-i386-libvirt pass
build-amd64-pvops pass
build-armhf-pvops pass
build-i386-pvops pass
build-amd64-rumpuserxen pass
build-i386-rumpuserxen pass
test-amd64-amd64-xl pass
test-armhf-armhf-xl pass
test-amd64-i386-xl pass
test-amd64-amd64-xl-pvh-amd fail
test-amd64-i386-rhel6hvm-amd pass
test-amd64-i386-qemut-rhel6hvm-amd pass
test-amd64-i386-qemuu-rhel6hvm-amd pass
test-amd64-amd64-xl-qemut-debianhvm-amd64 pass
test-amd64-i386-xl-qemut-debianhvm-amd64 pass
test-amd64-amd64-xl-qemuu-debianhvm-amd64 pass
test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
test-amd64-i386-freebsd10-amd64 pass
test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
test-amd64-i386-xl-qemuu-ovmf-amd64 pass
test-amd64-amd64-rumpuserxen-amd64 fail
test-amd64-amd64-xl-qemut-win7-amd64 fail
test-amd64-i386-xl-qemut-win7-amd64 fail
test-amd64-amd64-xl-qemuu-win7-amd64 fail
test-amd64-i386-xl-qemuu-win7-amd64 fail
test-amd64-amd64-xl-win7-amd64 fail
test-amd64-i386-xl-win7-amd64 fail
test-amd64-amd64-xl-credit2 pass
test-armhf-armhf-xl-credit2 fail
test-amd64-i386-freebsd10-i386 pass
test-amd64-i386-rumpuserxen-i386 pass
test-amd64-amd64-xl-pcipt-intel fail
test-amd64-amd64-xl-pvh-intel fail
test-amd64-i386-rhel6hvm-intel pass
test-amd64-i386-qemut-rhel6hvm-intel pass
test-amd64-i386-qemuu-rhel6hvm-intel pass
test-amd64-amd64-libvirt fail
test-armhf-armhf-libvirt fail
test-amd64-i386-libvirt fail
test-armhf-armhf-xl-midway pass
test-amd64-amd64-xl-multivcpu pass
test-armhf-armhf-xl-multivcpu pass
test-amd64-amd64-pair pass
test-amd64-i386-pair fail
test-amd64-amd64-xl-sedf-pin pass
test-armhf-armhf-xl-sedf-pin pass
test-amd64-amd64-xl-sedf pass
test-armhf-armhf-xl-sedf pass
test-amd64-i386-xl-qemut-winxpsp3-vcpus1 fail
test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 fail
test-amd64-i386-xl-winxpsp3-vcpus1 fail
test-amd64-amd64-xl-qemut-winxpsp3 fail
test-amd64-i386-xl-qemut-winxpsp3 fail
test-amd64-amd64-xl-qemuu-winxpsp3 fail
test-amd64-i386-xl-qemuu-winxpsp3 fail
test-amd64-amd64-xl-winxpsp3 fail
test-amd64-i386-xl-winxpsp3 fail
------------------------------------------------------------
sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images
Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs
Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary
Not pushing.
------------------------------------------------------------
commit d8e78d691d9b4bcc945d8f0b0ed2b48713931c4d
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date: Tue Feb 3 12:22:01 2015 +0100
bunzip2: off by one in get_next_block()
"origPtr" is used as an offset into the bd->dbuf[] array. That array is
allocated in start_bunzip() and has "bd->dbufSize" number of elements so
the test here should be >= instead of >.
Later we check "origPtr" again before using it as an offset so I don't
know if this bug can be triggered in real life.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Trivial adjustments to make the respective Linux commit
b5c8afe5be51078a979d86ae5ae78c4ac948063d apply to Xen.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
master commit: 39798e95a954eec660a3f5f21489c30ef78daf6d
master date: 2015-01-28 16:50:08 +0100
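The off-by-one described in the bunzip2 commit above can be illustrated with a minimal sketch. The helper names below are hypothetical and are not taken from the Xen or Linux source; they only show why an offset equal to the element count has to be rejected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: valid offsets into an array of n elements are
 * 0 .. n-1, so an offset equal to n is already out of bounds. */
static inline bool offset_in_bounds(size_t idx, size_t n)
{
    return idx < n;          /* equivalent to rejecting idx >= n */
}

/* The off-by-one form, for comparison: it rejects only idx > n and
 * therefore lets idx == n through. */
static inline bool offset_in_bounds_buggy(size_t idx, size_t n)
{
    return !(idx > n);
}
```

With n = 8, the first helper accepts offset 7 and rejects offset 8, while the second one still accepts 8.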
commit 8a855b35ddf9edb69afd23d02908bb1d4bdf9a14
Author: Andrew Cooper <andrew.cooper3@citrix.com>
Date: Tue Feb 3 12:21:38 2015 +0100
docs/commandline: correct information for 'x2apic_phys' parameter
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 89c381c30b46ec714f2d5bef4b0cb6d759abc7e4
master date: 2015-01-28 16:31:07 +0100
commit 3a777bedcbf4f273846ae33b01dd9c619e890f2d
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Tue Feb 3 12:20:44 2015 +0100
x86: vcpu_destroy_pagetables() must not return -EINTR
.. otherwise it has the side effect that domain_relinquish_resources
will stop and return to user-space with -EINTR, an error code which it
is not equipped to deal with; or vcpu_reset, which will ignore it and
convert the error to -ENOMEM.
The preemption mechanism we have for domain destruction is to return
-EAGAIN (and then user-space calls the hypercall again) and as such we need
to catch the case of:
domain_relinquish_resources
->vcpu_destroy_pagetables
-> put_page_and_type_preemptible
-> __put_page_type
returns -EINTR
and convert it to the proper type. For:
XEN_DOMCTL_setvcpucontext
-> vcpu_reset
-> vcpu_destroy_pagetables
we need to return -ERESTART otherwise we end up returning -ENOMEM.
There are also other callers of vcpu_destroy_pagetables: arch_vcpu_reset
(vcpu_reset) are:
- hvm_s3_suspend (asserts on any return code),
- vlapic_init_sipi_one (asserts on any return code),
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit: de4f284b3d7b47d3b9807f354552ecf3e0fff56b
master date: 2015-01-26 12:51:09 +0100
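The error-code conversion described in the commit above might look roughly like the following sketch. The constant names and numeric values are placeholders chosen for illustration (the real ones live in Xen's errno headers), and translate_preempt() is a hypothetical helper, not a function from the Xen tree.

```c
/* Placeholder error numbers, for illustration only. */
enum { SK_EINTR = 4, SK_EAGAIN = 11, SK_ERESTART = 85 };

/* Map an internal -EINTR ("preempted, try again") to the retry code the
 * caller's interface expects; pass every other result through unchanged. */
static inline int translate_preempt(int rc, int retry_code)
{
    return rc == -SK_EINTR ? -retry_code : rc;
}
```

A domain-destruction path would pass SK_EAGAIN as the retry code and a vcpu_reset path SK_ERESTART, so that -EINTR never leaks out to a caller that cannot handle it.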
commit 1acb3b6f12821597eac1aa8ce33578f3e26bc272
Author: Wei Liu <wei.liu2@citrix.com>
Date: Tue Feb 3 12:18:36 2015 +0100
handle XENMEM_get_vnumainfo in compat_memory_op
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 5f6ec28f2c3c3ba17a0b7f2a1d98324665420f46
master date: 2015-01-23 15:06:26 +0100
commit 4eec09f613778ef813bcc2d653b9930aeaf3755f
Author: Jan Beulich <jbeulich@suse.com>
Date: Tue Feb 3 12:17:59 2015 +0100
x86: correctly check for sub-leaf zero of leaf 7 in pv_cpuid()
Only the low 32 bits are relevant.
For consistency also change a cast on regs->eax to regs->_eax.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: ae1edef1ae33f3bcff2580116ae2b7c9ffef42f2
master date: 2015-01-22 12:48:40 +0100
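The "only the low 32 bits are relevant" point in the pv_cpuid() commit above can be sketched as follows. This is an illustration under the assumption that the sub-leaf arrives in a 64-bit register slot; the function names are hypothetical and do not appear in the Xen source.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID takes its sub-leaf from ECX, i.e. the low half of RCX, so the
 * upper 32 bits of the saved register must be ignored. */
static inline bool subleaf_is_zero(uint64_t rcx)
{
    return (uint32_t)rcx == 0;
}

/* Comparing the full 64-bit value is fooled by junk in bits 63:32. */
static inline bool subleaf_is_zero_naive(uint64_t rcx)
{
    return rcx == 0;
}
```

For a register value of 0xdeadbeef00000000 the first check reports sub-leaf zero and the second does not.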
commit 7788cbb0a59b932c2bc36823d23d52b65099c80a
Author: Jan Beulich <jbeulich@suse.com>
Date: Tue Feb 3 12:17:26 2015 +0100
x86: don't expose XSAVES capability to PV guests
As done by the recent Linux commit b65d6e17fe ("kvm: x86: mask out
XSAVES") for KVM, we should also mask out XSAVES from what PV guests
get to see as long as we don't emulate accesses to MSR_IA32_XSS.
Actually, go beyond that: Just like for leaf 7, switch from
blacklisting to whitelisting, i.e. only allow XSAVEOPT and XSAVEC for
the time being. And do these overrides consistently for both Dom0 and
DomU-s.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 8d050ed1097ce5f4bf6a1d6806fb1e3471976adb
master date: 2015-01-22 12:47:56 +0100
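The switch from blacklisting to whitelisting in the XSAVES commit above can be sketched as two masking functions. The bit positions are those the Intel SDM gives for CPUID leaf 0xD, sub-leaf 1, EAX; the macro and function names are made up for this example and are not the ones used in Xen.

```c
#include <stdint.h>

/* CPUID.(EAX=0xD, ECX=1):EAX feature bits; names are illustrative. */
#define F_XSAVEOPT (1u << 0)
#define F_XSAVEC   (1u << 1)
#define F_XGETBV1  (1u << 2)
#define F_XSAVES   (1u << 3)

/* Blacklist: hide only what is known to be a problem today.  Any bit
 * defined by future hardware is passed through to the guest. */
static inline uint32_t mask_blacklist(uint32_t eax)
{
    return eax & ~F_XSAVES;
}

/* Whitelist: expose only what is known to be safe.  Unknown and future
 * bits are hidden by default. */
static inline uint32_t mask_whitelist(uint32_t eax)
{
    return eax & (F_XSAVEOPT | F_XSAVEC);
}
```

With every bit set in the input, the whitelist yields 0x3 while the blacklist yields 0xfffffff7, which is the difference the commit message is pointing at.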
commit 4cfc54b1b81fb1a91080072b3250801c020a3134
Author: Andrew Cooper <andrew.cooper3@citrix.com>
Date: Tue Feb 3 12:16:30 2015 +0100
xsm/evtchn: never pretend to have successfully created a Xen event channel
Xen event channels are not internal resources. They still have one end in a
domain, and are created at the request of privileged domains. This logic
which "successfully" creates a Xen event channel opens up undesirable failure
cases with ill-specified XSM policies.
If a domain is permitted to create ioreq servers or memevent listeners, but
not to create event channels, the ioreq/memevent creation will succeed but
attempting to bind the returned event channel will fail without any indication
of a permission error.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
master commit: 09aa4759faa29c1fe735266de4c79f17329bd67b
master date: 2015-01-20 10:42:26 +0100
commit 2fdd521e4801bfb45ba0e88ca820a8606aa5e1b7
Author: Jan Beulich <jbeulich@suse.com>
Date: Tue Feb 3 12:15:58 2015 +0100
common/memory: fix an XSM error path
XENMEM_{in,de}crease_reservation as well as XENMEM_populate_physmap
return the extent at which failure was detected, not error indicators.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Tim Deegan <tim@xen.org>
master commit: 76d4ff26d9647088353acaf4a56388a354a5d6e9
master date: 2015-01-19 11:59:05 +0100
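The return convention mentioned in the common/memory commit above, where the result is the extent at which processing stopped rather than an error indicator, can be sketched like this. process_extents() and its ok[] argument are hypothetical stand-ins for the real per-extent work.

```c
/* Process nr extents in order and return the index of the first one
 * that failed, or nr if all succeeded.  ok[i] stands in for the outcome
 * of whatever work (including any permission check) extent i needs. */
static inline unsigned long process_extents(const int *ok, unsigned long nr)
{
    unsigned long i;

    for (i = 0; i < nr; i++)
        if (!ok[i])
            break;

    return i;
}
```

Under this convention a failure on the very first extent is reported as 0, so an error path that returned a negative errno value instead would be misread by the caller.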
commit ad83ad993d1a42c61f4edd97eb5d6396a589ad48
Author: Jan Beulich <jbeulich@suse.com>
Date: Tue Feb 3 12:15:03 2015 +0100
x86emul: tighten CLFLUSH emulation
While for us it's not as bad as it was for Linux, their commit
13e457e0ee ("KVM: x86: Emulator does not decode clflush well", by
Nadav Amit <namit@cs.technion.ac.il>) nevertheless points out two
shortcomings in our code: opcode 0F AE /7 is clflush only when it uses
a memory mode (otherwise it's SFENCE) and when there's no REP prefix
(an operand size prefix is fine, as that's CLFLUSHOPT).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 9d03db6b81d1880bf3aa4fc83a60346bf02be251
master date: 2015-01-12 15:41:12 +0100
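The decoding rules listed in the CLFLUSH commit above can be written down as a small classifier. This is a simplified sketch, not the x86emul code: the enum, the function name and the boolean prefix flags are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

enum grp_0fae { G_CLFLUSH, G_CLFLUSHOPT, G_SFENCE, G_OTHER };

/* Classify opcode 0F AE by its ModRM byte and prefixes. */
static inline enum grp_0fae classify_0fae(uint8_t modrm, bool rep, bool opsize)
{
    if (((modrm >> 3) & 7) != 7)   /* reg field selects /7 */
        return G_OTHER;
    if ((modrm >> 6) == 3)         /* mod == 3: register form, SFENCE */
        return G_SFENCE;
    if (rep)                       /* REP prefix: not a cache-line flush */
        return G_OTHER;
    /* Memory operand: operand-size prefix selects CLFLUSHOPT. */
    return opsize ? G_CLFLUSHOPT : G_CLFLUSH;
}
```

For example ModRM 0x38 (mod 0, reg 7) without prefixes classifies as CLFLUSH, and ModRM 0xF8 (mod 3, reg 7) as SFENCE.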
(qemu changes not included)
* Re: [xen-4.5-testing test] 34157: regressions - FAIL
2015-02-05 9:53 [xen-4.5-testing test] 34157: regressions - FAIL xen.org
@ 2015-02-05 12:53 ` Jan Beulich
2015-02-05 13:00 ` Ian Campbell
0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2015-02-05 12:53 UTC (permalink / raw)
To: xen-devel
>>> On 05.02.15 at 10:53, <Ian.Jackson@eu.citrix.com> wrote:
> flight 34157 xen-4.5-testing real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
> test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088
I have no clue what is going on here, ...
> test-amd64-i386-xl-qemuu-winxpsp3 5 xen-boot fail REGR. vs. 34088
> test-amd64-i386-xl-qemut-win7-amd64 5 xen-boot fail REGR. vs. 34088
... while these two again seem to suffer from the ntpd daemon
startup issue queried on before.
Jan
* Re: [xen-4.5-testing test] 34157: regressions - FAIL
2015-02-05 12:53 ` Jan Beulich
@ 2015-02-05 13:00 ` Ian Campbell
2015-02-05 14:44 ` Ian Campbell
0 siblings, 1 reply; 7+ messages in thread
From: Ian Campbell @ 2015-02-05 13:00 UTC (permalink / raw)
To: Jan Beulich, Ian Jackson; +Cc: xen-devel
On Thu, 2015-02-05 at 12:53 +0000, Jan Beulich wrote:
> >>> On 05.02.15 at 10:53, <Ian.Jackson@eu.citrix.com> wrote:
> > flight 34157 xen-4.5-testing real [real]
> > http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/
> >
> > Regressions :-(
> >
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> > test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088
>
> I have no clue what is going on here, ...
I saw something very similar in a recent flight on the osstest branch.
Ian -- YHM about that somewhere (list wasn't ccd though).
It looks to be some sort of Heisenbug in the rump kernel stuff.
> > test-amd64-i386-xl-qemuu-winxpsp3 5 xen-boot fail REGR. vs. 34088
> > test-amd64-i386-xl-qemut-win7-amd64 5 xen-boot fail REGR. vs. 34088
>
> ... while these two again seem to suffer from the ntpd daemon
> startup issue queried on before.
I didn't see that (or don't remember at least), but I suppose it is
infrastructure related?
Hopefully that sort of thing will diminish after the move to the new
colo...
Ian.
* Re: [xen-4.5-testing test] 34157: regressions - FAIL
2015-02-05 13:00 ` Ian Campbell
@ 2015-02-05 14:44 ` Ian Campbell
2015-02-05 15:51 ` GPF Heisenbug with rumprun-xen Ian Jackson
[not found] ` <21715.37244.683000.194074@mariner.uk.xensource.com>
0 siblings, 2 replies; 7+ messages in thread
From: Ian Campbell @ 2015-02-05 14:44 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel, Ian Jackson
On Thu, 2015-02-05 at 13:00 +0000, Ian Campbell wrote:
> On Thu, 2015-02-05 at 12:53 +0000, Jan Beulich wrote:
> > >>> On 05.02.15 at 10:53, <Ian.Jackson@eu.citrix.com> wrote:
> > > flight 34157 xen-4.5-testing real [real]
> > > http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/
> > >
> > > Regressions :-(
> > >
> > > Tests which did not succeed and are blocking,
> > > including tests which could not be run:
> > > test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088
> >
> > I have no clue what is going on here, ...
>
> I saw something very similar in a recent flight on the osstest branch.
> Ian -- YHM about that somewhere (list wasn't ccd though).
>
> It looks to be some sort of Heisenbug in the rump kernel stuff.
http://www.chiark.greenend.org.uk/~xensrcts/results/history.test-amd64-amd64-rumpuserxen-amd64.html
shows a history of random failures at the xenstorels step.
At least the ones as far back as 33830 (the last one with logs still
available) all show signs of what looks like memory corruption of some
sort.
http://www.chiark.greenend.org.uk/~xensrcts/logs/33830/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
GPF rip: 0x46ea1c, error_code=0
Page fault at linear address 0x0, rip 0x13563, regs 0x469ff8, sp 0x46a0a8, our_sp 0x469fe0, code 0
Page fault in pagetable walk (access to invalid memory?).
http://www.chiark.greenend.org.uk/~xensrcts/logs/33846/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
$VAR1 = {
'theirs' => 'STUB ``__sigaction14\'\' called
device/\x02G:
rumpxenstack:
could not access permissions for \x02G: Invalid argument
rumpxenstack: xs_directory (device/\x02G): Invalid argument
http://www.chiark.greenend.org.uk/~xensrcts/logs/33925/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
Similar to 33846
http://www.chiark.greenend.org.uk/~xensrcts/logs/34086/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
rumpxenstack:
could not access permissions for mac: Invalid argument
device/vif/0/mac = "5a:36:0e:26:00:05"
rumpxenstack: xs_directory (device/vif/0/mac): Bad file descriptor
http://www.chiark.greenend.org.uk/~xensrcts/logs/34127/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
Page fault at linear address 0x256, rip 0x33e1c, regs 0x53f458, sp 0x53f500, our_sp 0x53f440, code 0
Thread: main
RIP: e030:[<0000000000033e1c>]
RSP: e02b:000000000053f500 EFLAGS: 00010202
RAX: 000000000025a927 RBX: 0000000000000246 RCX: 000000000053fe88
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000025a8b0
RBP: 00000000001d55df R08: 0000000000454091 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 000000000053fe88
R13: 0000000000000000 R14: 000000000025a8b0 R15: 0000000000000000
base is 0x1d55df caller is 0x6e69616d20676e69
base is 0x6c6c6163203d3d3d GPF rip: 0x13e00, error_code=0
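The "caller" and "base" values in the 34127 trace above do not look like addresses. A small sketch such as the following, which prints the bytes of a 64-bit value in little-endian order, shows that they are printable ASCII; the function name is made up for this example.

```c
#include <stdint.h>

/* Write the 8 bytes of v, least significant first, into out as text.
 * Non-printable bytes are shown as '.'.  out must hold 9 chars. */
static inline void le64_to_ascii(uint64_t v, char out[9])
{
    for (int i = 0; i < 8; i++) {
        unsigned char c = (unsigned char)(v >> (8 * i));
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[8] = '\0';
}
```

Applied to 0x6e69616d20676e69 this gives "ing main", and applied to 0x6c6c6163203d3d3d it gives "=== call". So the frame walk appears to have been reading string data where saved frame pointers and return addresses were expected, which would fit the memory corruption theory.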
http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
Another bad fd, on device/vbd but otherwise similar to 34086.
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
* GPF Heisenbug with rumprun-xen
2015-02-05 14:44 ` Ian Campbell
@ 2015-02-05 15:51 ` Ian Jackson
[not found] ` <21715.37244.683000.194074@mariner.uk.xensource.com>
1 sibling, 0 replies; 7+ messages in thread
From: Ian Jackson @ 2015-02-05 15:51 UTC (permalink / raw)
To: Ian Campbell; +Cc: rumpkernel-users, xen-devel, Jan Beulich
Ian Campbell writes ("Re: [Xen-devel] [xen-4.5-testing test] 34157: regressions - FAIL"):
> > > > http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/
...
> > > > test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088
Guest console contains:
device/vbd/768/protocol = "x86_64-abi" (n2,r0)
device/vbd/832:
rumpxenstack:
could not access permissions for 832: Bad file descriptor
rumpxenstack: xs_directory (device/vbd/832): Bad file descriptor
=== ERROR: _exit(1) called ===
> It looks to be some sort of Heisenbug in the rump kernel stuff.
I agree. We had a failure on the 16th of January which looked like
some kind of race:
(Subject: Re: [Xen-devel] [rumpuserxen test] 33416: regressions - FAIL)
> This
> http://www.chiark.greenend.org.uk/~xensrcts/results/history.test-amd64-amd64-rumpuserxen-amd64.html
> show a history of random failures at the xenstorels step.
>
> At least the ones as far back as 33830 (the last one with logs still
> available) all show signs of what looks like memory corruption of some
> sort.
Thanks for the digging. (I have left the quoted text in for the
benefit of rumpkernel-users.)
The first failure in that history that looks like part of this is
flight 33690. We don't have logs for that any more but it used
rumpuserxen 598ceb54916b
xen 49de0b57b853
netbsdsrc 17a547ca2943
Failure probability after then seems about 20%. If I go back 10
passes from 33690 I get to 33611 which used
rumpuserxen ffcd777f8062
xen 0d2879062076
netbsdsrc a7c6b12e1752
It seems unlikely that the difference is going to be due to changes
in the versions of linux, linuxfirmware, ovmf, qemu[u] or seabios.
buildrump.sh has been 47b1a5eef43c throughout.
Ian.
> http://www.chiark.greenend.org.uk/~xensrcts/logs/33830/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
> GPF rip: 0x46ea1c, error_code=0
> Page fault at linear address 0x0, rip 0x13563, regs 0x469ff8, sp 0x46a0a8, our_sp 0x469fe0, code 0
> Page fault in pagetable walk (access to invalid memory?).
>
> http://www.chiark.greenend.org.uk/~xensrcts/logs/33846/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
> $VAR1 = {
> 'theirs' => 'STUB ``__sigaction14\'\' called
> device/\x02G:
> rumpxenstack:
> could not access permissions for \x02G: Invalid argument
>
> rumpxenstack: xs_directory (device/\x02G): Invalid argument
>
> http://www.chiark.greenend.org.uk/~xensrcts/logs/33925/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
> Similar to 33846
>
> http://www.chiark.greenend.org.uk/~xensrcts/logs/34086/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
> rumpxenstack:
> could not access permissions for mac: Invalid argument
> device/vif/0/mac = "5a:36:0e:26:00:05"
> rumpxenstack: xs_directory (device/vif/0/mac): Bad file descriptor
>
> http://www.chiark.greenend.org.uk/~xensrcts/logs/34127/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
> Page fault at linear address 0x256, rip 0x33e1c, regs 0x53f458, sp 0x53f500, our_sp 0x53f440, code 0
> Thread: main
> RIP: e030:[<0000000000033e1c>]
> RSP: e02b:000000000053f500 EFLAGS: 00010202
> RAX: 000000000025a927 RBX: 0000000000000246 RCX: 000000000053fe88
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000025a8b0
> RBP: 00000000001d55df R08: 0000000000454091 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 000000000053fe88
> R13: 0000000000000000 R14: 000000000025a8b0 R15: 0000000000000000
> base is 0x1d55df caller is 0x6e69616d20676e69
> base is 0x6c6c6163203d3d3d GPF rip: 0x13e00, error_code=0
>
> http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
> Another bad fd, on device/vbd but otherwise similar to 34086.
* Re: GPF Heisenbug with rumprun-xen
[not found] ` <21715.37244.683000.194074@mariner.uk.xensource.com>
@ 2015-02-05 20:25 ` Antti Kantee
[not found] ` <54D3D1D6.3010901@iki.fi>
1 sibling, 0 replies; 7+ messages in thread
From: Antti Kantee @ 2015-02-05 20:25 UTC (permalink / raw)
To: Ian Jackson, Ian Campbell; +Cc: rumpkernel-users, xen-devel, Jan Beulich
On 05/02/15 15:51, Ian Jackson wrote:
>> It looks to be some sort of Heisenbug in the rump kernel stuff.
>
> I agree. We had a failure on the 16th of January which looked like
> some kind of race:
> (Subject: Re: [Xen-devel] [rumpuserxen test] 33416: regressions - FAIL)
Aha! I told you I don't believe in cosmic rays ;)
>
>> This
>> http://www.chiark.greenend.org.uk/~xensrcts/results/history.test-amd64-amd64-rumpuserxen-amd64.html
>> show a history of random failures at the xenstorels step.
>>
>> At least the ones as far back as 33830 (the last one with logs still
>> available) all show signs of what looks like memory corruption of some
>> sort.
>
> Thanks for the digging. (I have left the quoted text in for the
> benefit of rumpkernel-users.)
>
> The first failure in that history that looks like part of this is
> flight 33690. We don't have logs for that any more but it used
> rumpuserxen 598ceb54916b
> xen 49de0b57b853
> netbsdsrc 17a547ca2943
> Failure probability after then seems about 20%. If I go back 10
> passes from 33690 I get to 33611 which used
> rumpuserxen ffcd777f8062
> xen 0d2879062076
> netbsdsrc a7c6b12e1752
>
> It seems unlikely that the difference is going to be due to changes
> in the versions of linux, linuxfirmware, ovmf, qemu[u] or seabios.
> buildrump.sh has been 47b1a5eef43c throughout.
The diffs for rumpuserxen and netbsdsrc between those revisions are
luckily small. I couldn't spot anything in there which would
immediately look suspicious. The most suspicious change is calling
sched_yield() as part of the bootstrap process, but that's not very
dramatic as far as suspicious goes. TLS support was added, but I'm not
sure how that would affect threads which do not use TLS. That said, TLS
did work right off the bat, so it is a bit suspicious ...
Is it possible that some change in xen is tickling the bug? That would
explain why attempts to reproduce the bug in other setups have failed.
Is it easy to fire off runs with arbitrary revisions of each repo?
* Re: GPF Heisenbug with rumprun-xen
[not found] ` <54D3D1D6.3010901@iki.fi>
@ 2015-02-06 10:42 ` Ian Jackson
0 siblings, 0 replies; 7+ messages in thread
From: Ian Jackson @ 2015-02-06 10:42 UTC (permalink / raw)
To: Antti Kantee; +Cc: rumpkernel-users, xen-devel, Ian Campbell, Jan Beulich
Antti Kantee writes ("Re: GPF Heisenbug with rumprun-xen"):
> On 05/02/15 15:51, Ian Jackson wrote:
> > (Subject: Re: [Xen-devel] [rumpuserxen test] 33416: regressions - FAIL)
>
> Aha! I told you I don't believe in cosmic rays ;)
:-).
> The diffs for rumpuserxen and netbsdsrc between those revisions are
> luckily small. I couldn't spot anything in there which would
> immediately look suspicious. The most suspicious change is calling
> sched_yield() as part of the bootstrap process, but that's not very
> dramatic as far as suspicious goes.
Yes - but it could expose an existing bug.
> TLS support was added, but I'm not sure how that would affect
> threads which do not use TLS. That said, TLS did work right off the
> bat, so it is a bit suspicious ...
Indeed.
> Is it possible that some change in xen is tickling the bug? That would
> explain why attempts to reproduce the bug in other setups have failed.
> Is it easy to fire off runs with arbitrary revisions of each repo?
It is possible that it's due to a change in Xen. We can fire off runs
with different versions, but given the low failure probability I have
been working on adding a "do the xenstorels test many times" step to
the test run, first.
If that does what I hope, I'll be able to point osstest's automatic
bisector at it.
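As a rough guide to how many repetitions such a step would need, the following sketch computes the smallest number of runs for which at least one failure is expected with a given confidence. It assumes the runs are independent and that each fails with the same probability, which may well not hold for a race; runs_needed() is a hypothetical helper and not part of osstest.

```c
/* Smallest n such that 1 - (1 - p_fail)^n >= confidence.
 * Returns -1 for inputs where no such n exists or that make no sense. */
static inline int runs_needed(double p_fail, double confidence)
{
    double all_pass = 1.0;   /* probability that every run so far passed */
    int n = 0;

    if (p_fail <= 0.0 || p_fail > 1.0 || confidence <= 0.0 || confidence >= 1.0)
        return -1;

    while (1.0 - all_pass < confidence) {
        all_pass *= 1.0 - p_fail;
        n++;
    }
    return n;
}
```

With the roughly 20% failure rate estimated earlier in the thread, this gives 14 runs for 95% confidence and 21 runs for 99%.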
Ian.