* [xen-4.5-testing test] 34157: regressions - FAIL
@ 2015-02-05  9:53 xen.org
  2015-02-05 12:53 ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: xen.org @ 2015-02-05  9:53 UTC
  To: xen-devel; +Cc: ian.jackson

flight 34157 xen-4.5-testing real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088
 test-amd64-i386-xl-qemuu-winxpsp3  5 xen-boot             fail REGR. vs. 34088
 test-amd64-i386-xl-qemut-win7-amd64  5 xen-boot           fail REGR. vs. 34088

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel  9 guest-start                  fail  never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-check        fail  never pass
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          10 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-midway   10 migrate-support-check        fail   never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start                  fail   never pass
 test-armhf-armhf-xl-sedf     10 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt       9 guest-start                  fail   never pass
 test-amd64-amd64-libvirt      9 guest-start                  fail   never pass
 test-armhf-armhf-libvirt      9 guest-start                  fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start                 fail never pass
 test-armhf-armhf-xl-credit2   5 xen-boot                     fail   never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop             fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop                   fail   never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop         fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop               fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop         fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop             fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop              fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop               fail never pass
 test-amd64-i386-pair        17 guest-migrate/src_host/dst_host fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop                fail never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop                   fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3 14 guest-stop               fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop                   fail  never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop                   fail   never pass

version targeted for testing:
 xen                  d8e78d691d9b4bcc945d8f0b0ed2b48713931c4d
baseline version:
 xen                  896437d6305879fab0f8c4f1d7292d1db0de6d97

------------------------------------------------------------
People who touched revisions under test:
  Andrew Cooper <andrew.cooper3@citrix.com>
  Dan Carpenter <dan.carpenter@oracle.com>
  Daniel De Graaf <dgdegra@tycho.nsa.gov>
  Ian Campbell <ian.campbell@citrix.com>
  Jan Beulich <jbeulich@suse.com>
  Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
  Tim Deegan <tim@xen.org>
  Wei Liu <wei.liu2@citrix.com>
------------------------------------------------------------

jobs:
 build-amd64                                                  pass    
 build-armhf                                                  pass    
 build-i386                                                   pass    
 build-amd64-libvirt                                          pass    
 build-armhf-libvirt                                          pass    
 build-i386-libvirt                                           pass    
 build-amd64-pvops                                            pass    
 build-armhf-pvops                                            pass    
 build-i386-pvops                                             pass    
 build-amd64-rumpuserxen                                      pass    
 build-i386-rumpuserxen                                       pass    
 test-amd64-amd64-xl                                          pass    
 test-armhf-armhf-xl                                          pass    
 test-amd64-i386-xl                                           pass    
 test-amd64-amd64-xl-pvh-amd                                  fail    
 test-amd64-i386-rhel6hvm-amd                                 pass    
 test-amd64-i386-qemut-rhel6hvm-amd                           pass    
 test-amd64-i386-qemuu-rhel6hvm-amd                           pass    
 test-amd64-amd64-xl-qemut-debianhvm-amd64                    pass    
 test-amd64-i386-xl-qemut-debianhvm-amd64                     pass    
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass    
 test-amd64-i386-xl-qemuu-debianhvm-amd64                     pass    
 test-amd64-i386-freebsd10-amd64                              pass    
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         pass    
 test-amd64-i386-xl-qemuu-ovmf-amd64                          pass    
 test-amd64-amd64-rumpuserxen-amd64                           fail    
 test-amd64-amd64-xl-qemut-win7-amd64                         fail    
 test-amd64-i386-xl-qemut-win7-amd64                          fail    
 test-amd64-amd64-xl-qemuu-win7-amd64                         fail    
 test-amd64-i386-xl-qemuu-win7-amd64                          fail    
 test-amd64-amd64-xl-win7-amd64                               fail    
 test-amd64-i386-xl-win7-amd64                                fail    
 test-amd64-amd64-xl-credit2                                  pass    
 test-armhf-armhf-xl-credit2                                  fail    
 test-amd64-i386-freebsd10-i386                               pass    
 test-amd64-i386-rumpuserxen-i386                             pass    
 test-amd64-amd64-xl-pcipt-intel                              fail    
 test-amd64-amd64-xl-pvh-intel                                fail    
 test-amd64-i386-rhel6hvm-intel                               pass    
 test-amd64-i386-qemut-rhel6hvm-intel                         pass    
 test-amd64-i386-qemuu-rhel6hvm-intel                         pass    
 test-amd64-amd64-libvirt                                     fail    
 test-armhf-armhf-libvirt                                     fail    
 test-amd64-i386-libvirt                                      fail    
 test-armhf-armhf-xl-midway                                   pass    
 test-amd64-amd64-xl-multivcpu                                pass    
 test-armhf-armhf-xl-multivcpu                                pass    
 test-amd64-amd64-pair                                        pass    
 test-amd64-i386-pair                                         fail    
 test-amd64-amd64-xl-sedf-pin                                 pass    
 test-armhf-armhf-xl-sedf-pin                                 pass    
 test-amd64-amd64-xl-sedf                                     pass    
 test-armhf-armhf-xl-sedf                                     pass    
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1                     fail    
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1                     fail    
 test-amd64-i386-xl-winxpsp3-vcpus1                           fail    
 test-amd64-amd64-xl-qemut-winxpsp3                           fail    
 test-amd64-i386-xl-qemut-winxpsp3                            fail    
 test-amd64-amd64-xl-qemuu-winxpsp3                           fail    
 test-amd64-i386-xl-qemuu-winxpsp3                            fail    
 test-amd64-amd64-xl-winxpsp3                                 fail    
 test-amd64-i386-xl-winxpsp3                                  fail    


------------------------------------------------------------
sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Not pushing.

------------------------------------------------------------
commit d8e78d691d9b4bcc945d8f0b0ed2b48713931c4d
Author: Dan Carpenter <dan.carpenter@oracle.com>
Date:   Tue Feb 3 12:22:01 2015 +0100

    bunzip2: off by one in get_next_block()
    
    "origPtr" is used as an offset into the bd->dbuf[] array.  That array is
    allocated in start_bunzip() and has "bd->dbufSize" number of elements so
    the test here should be >= instead of >.
    
    Later we check "origPtr" again before using it as an offset so I don't
    know if this bug can be triggered in real life.
    
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    
    Trivial adjustments to make the respective Linux commit
    b5c8afe5be51078a979d86ae5ae78c4ac948063d apply to Xen.
    
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Ian Campbell <ian.campbell@citrix.com>
    master commit: 39798e95a954eec660a3f5f21489c30ef78daf6d
    master date: 2015-01-28 16:50:08 +0100
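
The bounds check this commit describes can be illustrated with a minimal sketch. The function names and the buffer size are hypothetical, and the real code is C inside get_next_block(); the sketch only assumes what the message states, namely that the array has "dbufSize" elements and that "origPtr" is a non-negative offset into it.

```python
def offset_ok_before_fix(orig_ptr: int, dbuf_size: int) -> bool:
    # Old test: reject only when orig_ptr > dbuf_size,
    # so orig_ptr == dbuf_size is wrongly accepted.
    return not (orig_ptr > dbuf_size)


def offset_ok_after_fix(orig_ptr: int, dbuf_size: int) -> bool:
    # Corrected test: valid indices are 0 .. dbuf_size - 1.
    return not (orig_ptr >= dbuf_size)


dbuf_size = 4  # hypothetical size, for illustration only
print(offset_ok_before_fix(dbuf_size, dbuf_size))
print(offset_ok_after_fix(dbuf_size, dbuf_size))
```

The two checks differ only for the single value orig_ptr == dbuf_size, which is one element past the end of the array.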

commit 8a855b35ddf9edb69afd23d02908bb1d4bdf9a14
Author: Andrew Cooper <andrew.cooper3@citrix.com>
Date:   Tue Feb 3 12:21:38 2015 +0100

    docs/commandline: correct information for 'x2apic_phys' parameter
    
    Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
    master commit: 89c381c30b46ec714f2d5bef4b0cb6d759abc7e4
    master date: 2015-01-28 16:31:07 +0100

commit 3a777bedcbf4f273846ae33b01dd9c619e890f2d
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date:   Tue Feb 3 12:20:44 2015 +0100

    x86: vcpu_destroy_pagetables() must not return -EINTR
    
    .. otherwise it has the side effect that: domain_relinquish_resources
    will stop and will return to user-space with -EINTR which it is not
    equipped to deal with that error code; or vcpu_reset - which will
    ignore it and convert the error to -ENOMEM..
    
    The preemption mechanism we have for domain destruction is to return
    -EAGAIN (and then user-space calls the hypercall again) and as such we need
    to catch the case of:
    
    domain_relinquish_resources
      ->vcpu_destroy_pagetables
        -> put_page_and_type_preemptible
           -> __put_page_type
               returns -EINTR
    
    and convert it to the proper type. For:
    
    XEN_DOMCTL_setvcpucontext
     -> vcpu_reset
       -> vcpu_destroy_pagetables
    
    we need to return -ERESTART otherwise we end up returning -ENOMEM.
    
    There are also other callers of vcpu_destroy_pagetables: arch_vcpu_reset
    (vcpu_reset) are:
     - hvm_s3_suspend (asserts on any return code),
     - vlapic_init_sipi_one (asserts on any return code),
    
    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    master commit: de4f284b3d7b47d3b9807f354552ecf3e0fff56b
    master date: 2015-01-26 12:51:09 +0100

commit 1acb3b6f12821597eac1aa8ce33578f3e26bc272
Author: Wei Liu <wei.liu2@citrix.com>
Date:   Tue Feb 3 12:18:36 2015 +0100

    handle XENMEM_get_vnumainfo in compat_memory_op
    
    Signed-off-by: Wei Liu <wei.liu2@citrix.com>
    Reviewed-by: Jan Beulich <jbeulich@suse.com>
    master commit: 5f6ec28f2c3c3ba17a0b7f2a1d98324665420f46
    master date: 2015-01-23 15:06:26 +0100

commit 4eec09f613778ef813bcc2d653b9930aeaf3755f
Author: Jan Beulich <jbeulich@suse.com>
Date:   Tue Feb 3 12:17:59 2015 +0100

    x86: correctly check for sub-leaf zero of leaf 7 in pv_cpuid()
    
    Only the low 32 bits are relevant.
    
    For consistency also change a cast on regs->eax to regs->_eax.
    
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
    master commit: ae1edef1ae33f3bcff2580116ae2b7c9ffef42f2
    master date: 2015-01-22 12:48:40 +0100
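
A small sketch of why "only the low 32 bits are relevant" matters for the sub-leaf test. CPUID takes its sub-leaf from ECX, so a 64-bit register with stale upper bits still selects sub-leaf zero. The helper names are hypothetical and this is not the Xen code itself.

```python
MASK32 = 0xFFFFFFFF


def is_subleaf_zero_full(rcx: int) -> bool:
    # Compares the whole 64-bit register, so upper-half garbage matters.
    return rcx == 0


def is_subleaf_zero_low32(rcx: int) -> bool:
    # Compares only ECX, the low 32 bits, which is what CPUID consumes.
    return (rcx & MASK32) == 0


rcx = 0xDEADBEEF00000000  # hypothetical value: ECX is 0, upper half is not
print(is_subleaf_zero_full(rcx), is_subleaf_zero_low32(rcx))
```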

commit 7788cbb0a59b932c2bc36823d23d52b65099c80a
Author: Jan Beulich <jbeulich@suse.com>
Date:   Tue Feb 3 12:17:26 2015 +0100

    x86: don't expose XSAVES capability to PV guests
    
    As done by the recent Linux commit b65d6e17fe ("kvm: x86: mask out
    XSAVES") for KVM, we should also mask out XSAVES from what PV guests
    get to see as long as we don't emulate accesses to MSR_IA32_XSS.
    
    Actually, go beyond that: Just like for leaf 7, switch from
    blacklisting to whitelisting, i.e. only allow XSAVEOPT and XSAVEC for
    the time being. And do these overrides consistently for both Dom0 and
    DomU-s.
    
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
    master commit: 8d050ed1097ce5f4bf6a1d6806fb1e3471976adb
    master date: 2015-01-22 12:47:56 +0100

commit 4cfc54b1b81fb1a91080072b3250801c020a3134
Author: Andrew Cooper <andrew.cooper3@citrix.com>
Date:   Tue Feb 3 12:16:30 2015 +0100

    xsm/evtchn: never pretend to have successfully created a Xen event channel
    
    Xen event channels are not internal resources.  They still have one end in a
    domain, and are created at the request of privileged domains.  This logic
    which "successfully" creates a Xen event channel opens up undesirable failure
    cases with ill-specified XSM policies.
    
    If a domain is permitted to create ioreq servers or memevent listeners, but
    not to create event channels, the ioreq/memevent creation will succeed but
    attempting to bind the returned event channel will fail without any indication
    of a permission error.
    
    Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
    master commit: 09aa4759faa29c1fe735266de4c79f17329bd67b
    master date: 2015-01-20 10:42:26 +0100

commit 2fdd521e4801bfb45ba0e88ca820a8606aa5e1b7
Author: Jan Beulich <jbeulich@suse.com>
Date:   Tue Feb 3 12:15:58 2015 +0100

    common/memory: fix an XSM error path
    
    XENMEM_{in,de}crease_reservation as well as XENMEM_populate_physmap
    return the extent at which failure was detected, not error indicators.
    
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
    Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
    Acked-by: Tim Deegan <tim@xen.org>
    master commit: 76d4ff26d9647088353acaf4a56388a354a5d6e9
    master date: 2015-01-19 11:59:05 +0100

commit ad83ad993d1a42c61f4edd97eb5d6396a589ad48
Author: Jan Beulich <jbeulich@suse.com>
Date:   Tue Feb 3 12:15:03 2015 +0100

    x86emul: tighten CLFLUSH emulation
    
    While for us it's not as bad as it was for Linux, their commit
    13e457e0ee ("KVM: x86: Emulator does not decode clflush well", by
    Nadav Amit <namit@cs.technion.ac.il>) nevertheless points out two
    shortcomings in our code: opcode 0F AE /7 is clflush only when it uses
    a memory mode (otherwise it's SFENCE) and when there's no REP prefix
    (an operand size prefix is fine, as that's CLFLUSHOPT).
    
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
    master commit: 9d03db6b81d1880bf3aa4fc83a60346bf02be251
    master date: 2015-01-12 15:41:12 +0100
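
The decode rule stated in this commit message can be sketched as a small classifier. This is an illustration under stated assumptions, not the x86emul implementation: the function name and return strings are hypothetical, both 0xF2 and 0xF3 are treated as "REP" prefixes, and the register form is reported as SFENCE regardless of prefixes.

```python
REP_PREFIXES = {0xF2, 0xF3}  # REPNE and REP/REPE prefix bytes
OPSIZE_PREFIX = 0x66         # operand size override prefix


def classify_0f_ae_7(prefixes, modrm):
    """Classify an 0F AE encoding whose ModRM reg field is 7."""
    mod = (modrm >> 6) & 0x3
    reg = (modrm >> 3) & 0x7
    if reg != 7:
        raise ValueError("not an 0F AE /7 encoding")
    if mod == 3:
        # Register form: this is SFENCE, not a cache-line flush.
        return "SFENCE"
    if REP_PREFIXES & set(prefixes):
        # Memory form with a REP prefix must not be decoded as CLFLUSH.
        return "not CLFLUSH"
    if OPSIZE_PREFIX in prefixes:
        return "CLFLUSHOPT"
    return "CLFLUSH"


print(classify_0f_ae_7([], 0xF8))      # mod=3, reg=7
print(classify_0f_ae_7([], 0x38))      # mod=0, reg=7
print(classify_0f_ae_7([0x66], 0x38))
print(classify_0f_ae_7([0xF3], 0x38))
```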
(qemu changes not included)


* Re: [xen-4.5-testing test] 34157: regressions - FAIL
  2015-02-05  9:53 [xen-4.5-testing test] 34157: regressions - FAIL xen.org
@ 2015-02-05 12:53 ` Jan Beulich
  2015-02-05 13:00   ` Ian Campbell
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2015-02-05 12:53 UTC
  To: xen-devel

>>> On 05.02.15 at 10:53, <Ian.Jackson@eu.citrix.com> wrote:
> flight 34157 xen-4.5-testing real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/ 
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088

I have no clue what is going on here, ...

>  test-amd64-i386-xl-qemuu-winxpsp3  5 xen-boot             fail REGR. vs. 34088
>  test-amd64-i386-xl-qemut-win7-amd64  5 xen-boot           fail REGR. vs. 34088

... while these two again seem to suffer from the ntpd daemon
startup issue queried on before.

Jan


* Re: [xen-4.5-testing test] 34157: regressions - FAIL
  2015-02-05 12:53 ` Jan Beulich
@ 2015-02-05 13:00   ` Ian Campbell
  2015-02-05 14:44     ` Ian Campbell
  0 siblings, 1 reply; 7+ messages in thread
From: Ian Campbell @ 2015-02-05 13:00 UTC
  To: Jan Beulich, Ian Jackson; +Cc: xen-devel

On Thu, 2015-02-05 at 12:53 +0000, Jan Beulich wrote:
> >>> On 05.02.15 at 10:53, <Ian.Jackson@eu.citrix.com> wrote:
> > flight 34157 xen-4.5-testing real [real]
> > http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/ 
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088
> 
> I have no clue what is going on here, ...

I saw something very similar in a recent flight on the osstest branch.
Ian -- YHM about that somewhere (list wasn't ccd though).

It looks to be some sort of Heisenbug in the rump kernel stuff.

> >  test-amd64-i386-xl-qemuu-winxpsp3  5 xen-boot             fail REGR. vs. 34088
> >  test-amd64-i386-xl-qemut-win7-amd64  5 xen-boot           fail REGR. vs. 34088
> 
> ... while these two again seem to suffer from the ntpd daemon
> startup issue queried on before.

I didn't see that (or don't remember at least), but I suppose it is
infrastructure related?

Hopefully that sort of thing will diminish after the move to the new
colo...

Ian.


* Re: [xen-4.5-testing test] 34157: regressions - FAIL
  2015-02-05 13:00   ` Ian Campbell
@ 2015-02-05 14:44     ` Ian Campbell
  2015-02-05 15:51       ` GPF Heisenbug with rumprun-xen Ian Jackson
       [not found]       ` <21715.37244.683000.194074@mariner.uk.xensource.com>
  0 siblings, 2 replies; 7+ messages in thread
From: Ian Campbell @ 2015-02-05 14:44 UTC
  To: Jan Beulich; +Cc: xen-devel, Ian Jackson

On Thu, 2015-02-05 at 13:00 +0000, Ian Campbell wrote:
> On Thu, 2015-02-05 at 12:53 +0000, Jan Beulich wrote:
> > >>> On 05.02.15 at 10:53, <Ian.Jackson@eu.citrix.com> wrote:
> > > flight 34157 xen-4.5-testing real [real]
> > > http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/ 
> > > 
> > > Regressions :-(
> > > 
> > > Tests which did not succeed and are blocking,
> > > including tests which could not be run:
> > >  test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088
> > 
> > I have no clue what is going on here, ...
> 
> I saw something very similar in a recent flight on the osstest branch.
> Ian -- YHM about that somewhere (list wasn't ccd though).
> 
> It looks to be some sort of Heisenbug in the rump kernel stuff.

http://www.chiark.greenend.org.uk/~xensrcts/results/history.test-amd64-amd64-rumpuserxen-amd64.html

shows a history of random failures at the xenstorels step.

At least the ones as far back as 33830 (the last one with logs still
available) all show signs of what looks like memory corruption of some
sort.

http://www.chiark.greenend.org.uk/~xensrcts/logs/33830/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
        GPF rip: 0x46ea1c, error_code=0
        Page fault at linear address 0x0, rip 0x13563, regs 0x469ff8, sp 0x46a0a8, our_sp 0x469fe0, code 0
        Page fault in pagetable walk (access to invalid memory?).
        
http://www.chiark.greenend.org.uk/~xensrcts/logs/33846/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
        $VAR1 = {
                  'theirs' => 'STUB ``__sigaction14\'\' called
        device/\x02G:
        rumpxenstack: 
        could not access permissions for \x02G: Invalid argument
        
        rumpxenstack: xs_directory (device/\x02G): Invalid argument

http://www.chiark.greenend.org.uk/~xensrcts/logs/33925/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
        Similar to 33846

http://www.chiark.greenend.org.uk/~xensrcts/logs/34086/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
        rumpxenstack: 
        could not access permissions for mac: Invalid argument
        device/vif/0/mac = "5a:36:0e:26:00:05" 
        rumpxenstack: xs_directory (device/vif/0/mac): Bad file descriptor

http://www.chiark.greenend.org.uk/~xensrcts/logs/34127/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
        Page fault at linear address 0x256, rip 0x33e1c, regs 0x53f458, sp 0x53f500, our_sp 0x53f440, code 0
        Thread: main
        RIP: e030:[<0000000000033e1c>] 
        RSP: e02b:000000000053f500  EFLAGS: 00010202
        RAX: 000000000025a927 RBX: 0000000000000246 RCX: 000000000053fe88
        RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000025a8b0
        RBP: 00000000001d55df R08: 0000000000454091 R09: 0000000000000000
        R10: 0000000000000000 R11: 0000000000000000 R12: 000000000053fe88
        R13: 0000000000000000 R14: 000000000025a8b0 R15: 0000000000000000
        base is 0x1d55df caller is 0x6e69616d20676e69
        base is 0x6c6c6163203d3d3d GPF rip: 0x13e00, error_code=0
        
http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
        Another bad fd, on device/vbd but otherwise similar to 34086.
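
A reading aid for the 34127 log above, offered as an observation rather than a diagnosis: the "caller" value and the second "base" value decode as printable ASCII when their bytes are read in little-endian order, which would be consistent with the frame chain having been overwritten by string data. The helper name below is hypothetical.

```python
def as_ascii(value: int) -> str:
    """Interpret a 64-bit value as 8 little-endian bytes of ASCII text."""
    raw = value.to_bytes(8, "little")
    # Replace non-printable bytes with '.' so the result is always readable.
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


print(repr(as_ascii(0x6e69616d20676e69)))  # 'ing main'
print(repr(as_ascii(0x6c6c6163203d3d3d)))  # '=== call'
```

An ordinary code address such as the RIP value 0x33e1c decodes mostly to non-printable bytes, so the contrast is easy to see.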

Ian.



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


* GPF Heisenbug with rumprun-xen
  2015-02-05 14:44     ` Ian Campbell
@ 2015-02-05 15:51       ` Ian Jackson
       [not found]       ` <21715.37244.683000.194074@mariner.uk.xensource.com>
  1 sibling, 0 replies; 7+ messages in thread
From: Ian Jackson @ 2015-02-05 15:51 UTC
  To: Ian Campbell; +Cc: rumpkernel-users, xen-devel, Jan Beulich

Ian Campbell writes ("Re: [Xen-devel] [xen-4.5-testing test] 34157: regressions - FAIL"):
> > > > http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/ 
...
> > > >  test-amd64-amd64-rumpuserxen-amd64 11 rumpuserxen-demo-xenstorels/xenstorels fail REGR. vs. 34088

Guest console contains:

   device/vbd/768/protocol = "x86_64-abi"   (n2,r0)
   device/vbd/832:
   rumpxenstack:
   could not access permissions for 832: Bad file descriptor

   rumpxenstack: xs_directory (device/vbd/832): Bad file descriptor

   === ERROR: _exit(1) called ===

> It looks to be some sort of Heisenbug in the rump kernel stuff.

I agree.  We had a failure on the 16th of January which looked like
some kind of race:
 (Subject: Re: [Xen-devel] [rumpuserxen test] 33416: regressions - FAIL)


> This 
> http://www.chiark.greenend.org.uk/~xensrcts/results/history.test-amd64-amd64-rumpuserxen-amd64.html
> show a history of random failures at the xenstorels step.
>
> At least the ones as far back as 33830 (the last one with logs still
> available) all show signs of what looks like memory corruption of some
> sort.

Thanks for the digging.  (I have left the quoted text in for the
benefit of rumpkernel-users.)

The first failure in that history that looks like part of this is
flight 33690.  We don't have logs for that any more but it used
  rumpuserxen 598ceb54916b
  xen         49de0b57b853
  netbsdsrc   17a547ca2943
Failure probability after then seems about 20%.  If I go back 10
passes from 33690 I get to 33611 which used
  rumpuserxen ffcd777f8062
  xen         0d2879062076
  netbsdsrc   a7c6b12e1752

It seems unlikely that the difference is going to be due to changes
in the versions of linux, linuxfirmware, ovmf, qemu[u] or seabios.
buildrump.sh has been 47b1a5eef43c throughout.

Ian.


> http://www.chiark.greenend.org.uk/~xensrcts/logs/33830/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
>         GPF rip: 0x46ea1c, error_code=0
>         Page fault at linear address 0x0, rip 0x13563, regs 0x469ff8, sp 0x46a0a8, our_sp 0x469fe0, code 0
>         Page fault in pagetable walk (access to invalid memory?).
>         
> http://www.chiark.greenend.org.uk/~xensrcts/logs/33846/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
>         $VAR1 = {
>                   'theirs' => 'STUB ``__sigaction14\'\' called
>         device/\x02G:
>         rumpxenstack: 
>         could not access permissions for \x02G: Invalid argument
>         
>         rumpxenstack: xs_directory (device/\x02G): Invalid argument
> 
> http://www.chiark.greenend.org.uk/~xensrcts/logs/33925/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
>         Similar to 33846
> 
> http://www.chiark.greenend.org.uk/~xensrcts/logs/34086/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
>         rumpxenstack: 
>         could not access permissions for mac: Invalid argument
>         device/vif/0/mac = "5a:36:0e:26:00:05" 
>         rumpxenstack: xs_directory (device/vif/0/mac): Bad file descriptor
> 
> http://www.chiark.greenend.org.uk/~xensrcts/logs/34127/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
>         Page fault at linear address 0x256, rip 0x33e1c, regs 0x53f458, sp 0x53f500, our_sp 0x53f440, code 0
>         Thread: main
>         RIP: e030:[<0000000000033e1c>] 
>         RSP: e02b:000000000053f500  EFLAGS: 00010202
>         RAX: 000000000025a927 RBX: 0000000000000246 RCX: 000000000053fe88
>         RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000025a8b0
>         RBP: 00000000001d55df R08: 0000000000454091 R09: 0000000000000000
>         R10: 0000000000000000 R11: 0000000000000000 R12: 000000000053fe88
>         R13: 0000000000000000 R14: 000000000025a8b0 R15: 0000000000000000
>         base is 0x1d55df caller is 0x6e69616d20676e69
>         base is 0x6c6c6163203d3d3d GPF rip: 0x13e00, error_code=0
>         
> http://www.chiark.greenend.org.uk/~xensrcts/logs/34157/test-amd64-amd64-rumpuserxen-amd64/11.ts-rumpuserxen-demo-xenstorels.log
>         Another bad fd, on device/vbd but otherwise similar to 34086.


* Re: GPF Heisenbug with rumprun-xen
       [not found]       ` <21715.37244.683000.194074@mariner.uk.xensource.com>
@ 2015-02-05 20:25         ` Antti Kantee
       [not found]         ` <54D3D1D6.3010901@iki.fi>
  1 sibling, 0 replies; 7+ messages in thread
From: Antti Kantee @ 2015-02-05 20:25 UTC
  To: Ian Jackson, Ian Campbell; +Cc: rumpkernel-users, xen-devel, Jan Beulich

On 05/02/15 15:51, Ian Jackson wrote:
>> It looks to be some sort of Heisenbug in the rump kernel stuff.
>
> I agree.  We had a failure on the 16th of January which looked like
> some kind of race:
>   (Subject: Re: [Xen-devel] [rumpuserxen test] 33416: regressions - FAIL)

Aha!  I told you I don't believe in cosmic rays ;)

>
>> This
>> http://www.chiark.greenend.org.uk/~xensrcts/results/history.test-amd64-amd64-rumpuserxen-amd64.html
>> show a history of random failures at the xenstorels step.
>>
>> At least the ones as far back as 33830 (the last one with logs still
>> available) all show signs of what looks like memory corruption of some
>> sort.
>
> Thanks for the digging.  (I have left the quoted text in for the
> benefit of rumpkernel-users.)
>
> The first failure in that history that looks like part of this is
> flight 33690.  We don't have logs for that any more but it used
>    rumpuserxen 598ceb54916b
>    xen         49de0b57b853
>    netbsdsrc   17a547ca2943
> Failure probability after then seems about 20%.  If I go back 10
> passes from 33690 I get to 33611 which used
>    rumpuserxen ffcd777f8062
>    xen         0d2879062076
>    netbsdsrc   a7c6b12e1752
>
> It seems unlikely that the difference is going to be due to changes
> in the versions of linux, linuxfirmware, ovmf, qemu[u] or seabios.
> buildrump.sh has been 47b1a5eef43c throughout.

The diffs for rumpuserxen and netbsdsrc between those revisions are 
luckily small.  I couldn't spot anything in there which would 
immediately look suspicious.  The most suspicious change is calling 
sched_yield() as part of the bootstrap process, but that's not very 
dramatic as far as suspicious goes.  TLS support was added, but I'm not 
sure how that would affect threads which do not use TLS.  That said, TLS 
did work right off the bat, so it is a bit suspicious ...

Is it possible that some change in xen is tickling the bug?  That would 
explain why attempts to reproduce the bug in other setups have failed. 
Is it easy to fire off runs with arbitrary revisions of each repo?


* Re: GPF Heisenbug with rumprun-xen
       [not found]         ` <54D3D1D6.3010901@iki.fi>
@ 2015-02-06 10:42           ` Ian Jackson
  0 siblings, 0 replies; 7+ messages in thread
From: Ian Jackson @ 2015-02-06 10:42 UTC
  To: Antti Kantee; +Cc: rumpkernel-users, xen-devel, Ian Campbell, Jan Beulich

Antti Kantee writes ("Re: GPF Heisenbug with rumprun-xen"):
> On 05/02/15 15:51, Ian Jackson wrote:
> >   (Subject: Re: [Xen-devel] [rumpuserxen test] 33416: regressions - FAIL)
> 
> Aha!  I told you I don't believe in cosmic rays ;)

:-).

> The diffs for rumpuserxen and netbsdsrc between those revisions are 
> luckily small.  I couldn't spot anything in there which would 
> immediately look suspicious.  The most suspicious change is calling 
> sched_yield() as part of the bootstrap process, but that's not very 
> dramatic as far as suspicious goes.

Yes - but it could expose an existing bug.

>  TLS support was added, but I'm not sure how that would affect
> threads which do not use TLS.  That said, TLS did work right off the
> bat, so it is a bit suspicious ...

Indeed.

> Is it possible that some change in xen is tickling the bug?  That would 
> explain why attempts to reproduce the bug in other setups have failed. 
> Is it easy to fire off runs with arbitrary revisions of each repo?

It is possible that it's due to a change in Xen.  We can fire off runs
with different versions, but given the low failure probability I have
been working on adding a "do the xenstorels test many times" step to
the test run, first.

If that does what I hope, I'll be able to point osstest's automatic
bisector at it.
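
A rough way to size such a repeat step, assuming each run fails independently with the roughly 20% probability estimated earlier in the thread. Independence is an assumption, and runs_needed is a hypothetical helper, not part of osstest.

```python
import math


def runs_needed(p_fail: float, confidence: float) -> int:
    """Smallest n such that 1 - (1 - p_fail)**n >= confidence."""
    if not (0.0 < p_fail < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("probabilities must be strictly between 0 and 1")
    # Solve (1 - p_fail)**n <= 1 - confidence for n and round up.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


print(runs_needed(0.20, 0.99))  # 21
```

Under the same assumption, ten consecutive passes still occur with probability 0.8**10, about 0.107, so a run of ten passes alone does not rule out that the bug was already present.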

Ian.
