* [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-25 3:15 osstest service owner
From: osstest service owner @ 2017-08-25 3:15 UTC (permalink / raw)
To: xen-devel, osstest-admin
flight 112855 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/112855/
Regressions :-(
Tests which did not succeed and are blocking,
including tests which could not be run:
test-amd64-i386-examine 7 reboot fail REGR. vs. 112809
test-amd64-i386-freebsd10-amd64 7 xen-boot fail REGR. vs. 112809
build-amd64-xsm 6 xen-build fail REGR. vs. 112809
test-amd64-amd64-xl-qemuu-win7-amd64 16 guest-localmigrate/x10 fail REGR. vs. 112809
test-armhf-armhf-xl-credit2 16 guest-start/debian.repeat fail REGR. vs. 112809
test-amd64-amd64-xl-qemut-win7-amd64 10 windows-install fail REGR. vs. 112809
test-amd64-i386-xl-qemut-win7-amd64 10 windows-install fail REGR. vs. 112809
test-amd64-i386-xl-qemut-ws16-amd64 10 windows-install fail REGR. vs. 112809
Tests which did not succeed, but are not blocking:
test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
test-amd64-i386-xl-qemut-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
build-arm64-libvirt 1 build-check(1) blocked n/a
test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
test-arm64-arm64-examine 1 build-check(1) blocked n/a
test-amd64-i386-libvirt-xsm 1 build-check(1) blocked n/a
test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
test-arm64-arm64-libvirt-xsm 1 build-check(1) blocked n/a
test-amd64-i386-xl-xsm 1 build-check(1) blocked n/a
test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
test-arm64-arm64-xl 1 build-check(1) blocked n/a
test-amd64-amd64-libvirt-xsm 1 build-check(1) blocked n/a
test-amd64-amd64-xl-xsm 1 build-check(1) blocked n/a
test-arm64-arm64-xl-credit2 1 build-check(1) blocked n/a
test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm 1 build-check(1) blocked n/a
test-arm64-arm64-xl-xsm 1 build-check(1) blocked n/a
build-arm64-pvops 2 hosts-allocate broken like 112809
build-arm64 2 hosts-allocate broken like 112809
build-arm64-xsm 2 hosts-allocate broken like 112809
build-arm64-pvops 3 capture-logs broken like 112809
build-arm64 3 capture-logs broken like 112809
build-arm64-xsm 3 capture-logs broken like 112809
test-armhf-armhf-libvirt-xsm 14 saverestore-support-check fail like 112809
test-armhf-armhf-libvirt 14 saverestore-support-check fail like 112809
test-amd64-i386-xl-qemuu-win7-amd64 17 guest-stop fail like 112809
test-armhf-armhf-libvirt-raw 13 saverestore-support-check fail like 112809
test-amd64-amd64-xl-rtds 10 debian-install fail like 112809
test-armhf-armhf-xl-rtds 16 guest-start/debian.repeat fail like 112809
test-amd64-amd64-xl-qemuu-ws16-amd64 10 windows-install fail never pass
test-amd64-amd64-xl-qemut-ws16-amd64 10 windows-install fail never pass
test-amd64-i386-libvirt 13 migrate-support-check fail never pass
test-amd64-amd64-libvirt 13 migrate-support-check fail never pass
test-armhf-armhf-xl-arndale 13 migrate-support-check fail never pass
test-armhf-armhf-xl-arndale 14 saverestore-support-check fail never pass
test-amd64-amd64-qemuu-nested-amd 17 debian-hvm-install/l1/l2 fail never pass
test-amd64-i386-xl-qemuu-ws16-amd64 13 guest-saverestore fail never pass
test-armhf-armhf-xl-xsm 13 migrate-support-check fail never pass
test-armhf-armhf-xl-xsm 14 saverestore-support-check fail never pass
test-armhf-armhf-xl-multivcpu 13 migrate-support-check fail never pass
test-armhf-armhf-xl-multivcpu 14 saverestore-support-check fail never pass
test-armhf-armhf-xl-cubietruck 13 migrate-support-check fail never pass
test-armhf-armhf-xl-cubietruck 14 saverestore-support-check fail never pass
test-armhf-armhf-libvirt-xsm 13 migrate-support-check fail never pass
test-armhf-armhf-xl 13 migrate-support-check fail never pass
test-armhf-armhf-xl 14 saverestore-support-check fail never pass
test-armhf-armhf-xl-credit2 13 migrate-support-check fail never pass
test-armhf-armhf-xl-credit2 14 saverestore-support-check fail never pass
test-armhf-armhf-libvirt 13 migrate-support-check fail never pass
test-armhf-armhf-xl-rtds 13 migrate-support-check fail never pass
test-armhf-armhf-xl-rtds 14 saverestore-support-check fail never pass
test-armhf-armhf-libvirt-raw 12 migrate-support-check fail never pass
test-armhf-armhf-xl-vhd 12 migrate-support-check fail never pass
test-armhf-armhf-xl-vhd 13 saverestore-support-check fail never pass
test-amd64-amd64-libvirt-vhd 12 migrate-support-check fail never pass
test-amd64-i386-xl-qemut-win10-i386 10 windows-install fail never pass
test-amd64-i386-xl-qemuu-win10-i386 10 windows-install fail never pass
test-amd64-amd64-xl-qemuu-win10-i386 10 windows-install fail never pass
test-amd64-amd64-xl-qemut-win10-i386 10 windows-install fail never pass
version targeted for testing:
xen 98df75f2782e47c47002d57ca5c5832de4e903fc
baseline version:
xen 9053a74c08fd6abf43bb45ff932b4386de7e8510
Last test of basis 112809 2017-08-22 04:57:01 Z 2 days
Failing since 112841 2017-08-23 06:00:13 Z 1 days 2 attempts
Testing same since 112855 2017-08-24 02:34:07 Z 1 days 1 attempts
------------------------------------------------------------
People who touched revisions under test:
Andrew Cooper <andrew.cooper3@citrix.com>
Bernd Kuhls <bernd.kuhls@t-online.de>
Boris Ostrovsky <boris.ostrovsky@oracle.com>
Christopher Clark <christopher.clark6@baesystems.com>
Daniel De Graaf <dgdegra@tycho.nsa.gov>
Igor Druzhinin <igor.druzhinin@citrix.com>
Jan Beulich <jbeulich@suse.com>
Julien Grall <julien.grall@arm.com>
Roger Pau Monné <roger.pau@citrix.com>
Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Wei Liu <wei.liu2@citrix.com>
jobs:
build-amd64-xsm fail
build-arm64-xsm broken
build-armhf-xsm pass
build-i386-xsm pass
build-amd64-xtf pass
build-amd64 pass
build-arm64 broken
build-armhf pass
build-i386 pass
build-amd64-libvirt pass
build-arm64-libvirt blocked
build-armhf-libvirt pass
build-i386-libvirt pass
build-amd64-prev pass
build-i386-prev pass
build-amd64-pvops pass
build-arm64-pvops broken
build-armhf-pvops pass
build-i386-pvops pass
build-amd64-rumprun pass
build-i386-rumprun pass
test-xtf-amd64-amd64-1 pass
test-xtf-amd64-amd64-2 pass
test-xtf-amd64-amd64-3 pass
test-xtf-amd64-amd64-4 pass
test-xtf-amd64-amd64-5 pass
test-amd64-amd64-xl pass
test-arm64-arm64-xl blocked
test-armhf-armhf-xl pass
test-amd64-i386-xl pass
test-amd64-amd64-xl-qemut-debianhvm-amd64-xsm blocked
test-amd64-i386-xl-qemut-debianhvm-amd64-xsm blocked
test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm blocked
test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm blocked
test-amd64-amd64-xl-qemuu-debianhvm-amd64-xsm blocked
test-amd64-i386-xl-qemuu-debianhvm-amd64-xsm blocked
test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm blocked
test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm blocked
test-amd64-amd64-libvirt-xsm blocked
test-arm64-arm64-libvirt-xsm blocked
test-armhf-armhf-libvirt-xsm pass
test-amd64-i386-libvirt-xsm blocked
test-amd64-amd64-xl-xsm blocked
test-arm64-arm64-xl-xsm blocked
test-armhf-armhf-xl-xsm pass
test-amd64-i386-xl-xsm blocked
test-amd64-amd64-qemuu-nested-amd fail
test-amd64-amd64-xl-pvh-amd pass
test-amd64-i386-qemut-rhel6hvm-amd pass
test-amd64-i386-qemuu-rhel6hvm-amd pass
test-amd64-amd64-xl-qemut-debianhvm-amd64 pass
test-amd64-i386-xl-qemut-debianhvm-amd64 pass
test-amd64-amd64-xl-qemuu-debianhvm-amd64 pass
test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
test-amd64-i386-freebsd10-amd64 fail
test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
test-amd64-i386-xl-qemuu-ovmf-amd64 pass
test-amd64-amd64-rumprun-amd64 pass
test-amd64-amd64-xl-qemut-win7-amd64 fail
test-amd64-i386-xl-qemut-win7-amd64 fail
test-amd64-amd64-xl-qemuu-win7-amd64 fail
test-amd64-i386-xl-qemuu-win7-amd64 fail
test-amd64-amd64-xl-qemut-ws16-amd64 fail
test-amd64-i386-xl-qemut-ws16-amd64 fail
test-amd64-amd64-xl-qemuu-ws16-amd64 fail
test-amd64-i386-xl-qemuu-ws16-amd64 fail
test-armhf-armhf-xl-arndale pass
test-amd64-amd64-xl-credit2 pass
test-arm64-arm64-xl-credit2 blocked
test-armhf-armhf-xl-credit2 fail
test-armhf-armhf-xl-cubietruck pass
test-amd64-amd64-examine pass
test-arm64-arm64-examine blocked
test-armhf-armhf-examine pass
test-amd64-i386-examine fail
test-amd64-i386-freebsd10-i386 pass
test-amd64-i386-rumprun-i386 pass
test-amd64-amd64-xl-qemut-win10-i386 fail
test-amd64-i386-xl-qemut-win10-i386 fail
test-amd64-amd64-xl-qemuu-win10-i386 fail
test-amd64-i386-xl-qemuu-win10-i386 fail
test-amd64-amd64-qemuu-nested-intel pass
test-amd64-amd64-xl-pvh-intel pass
test-amd64-i386-qemut-rhel6hvm-intel pass
test-amd64-i386-qemuu-rhel6hvm-intel pass
test-amd64-amd64-libvirt pass
test-armhf-armhf-libvirt pass
test-amd64-i386-libvirt pass
test-amd64-amd64-livepatch pass
test-amd64-i386-livepatch pass
test-amd64-amd64-migrupgrade pass
test-amd64-i386-migrupgrade pass
test-amd64-amd64-xl-multivcpu pass
test-armhf-armhf-xl-multivcpu pass
test-amd64-amd64-pair pass
test-amd64-i386-pair pass
test-amd64-amd64-libvirt-pair pass
test-amd64-i386-libvirt-pair pass
test-amd64-amd64-amd64-pvgrub pass
test-amd64-amd64-i386-pvgrub pass
test-amd64-amd64-pygrub pass
test-amd64-amd64-xl-qcow2 pass
test-armhf-armhf-libvirt-raw pass
test-amd64-i386-xl-raw pass
test-amd64-amd64-xl-rtds fail
test-armhf-armhf-xl-rtds fail
test-amd64-amd64-libvirt-vhd pass
test-armhf-armhf-xl-vhd pass
------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images
Logs, config files, etc. are available at
http://logs.test-lab.xenproject.org/osstest/logs
Explanation of these reports, and of osstest in general, is at
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master
Test harness code can be found at
http://xenbits.xen.org/gitweb?p=osstest.git;a=summary
broken-step build-arm64-pvops hosts-allocate
broken-step build-arm64 hosts-allocate
broken-step build-arm64-xsm hosts-allocate
broken-step build-arm64-pvops capture-logs
broken-step build-arm64 capture-logs
broken-step build-arm64-xsm capture-logs
Not pushing.
------------------------------------------------------------
commit 98df75f2782e47c47002d57ca5c5832de4e903fc
Author: Roger Pau Monné <roger.pau@citrix.com>
Date: Wed Aug 23 17:47:38 2017 +0200
hvmloader: add fields for SMBIOS 2.4 compliance
The version of SMBIOS set in the entry point is 2.4; however, several
structures are missing fields required by 2.4. Fix this by adding the
missing fields, based on the documents found at the DMTF site [0].
Most fields are set to 0 (undefined/not specified), except for the
cache related handles, which need to be initialized to 0xffff in order
to signal that the information is not provided.
[0] https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.1.1.pdf
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Chris Gilbert <chris.gilbert@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
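As an illustration only, the handle initialization could look like the
sketch below; the structure and field names here are assumptions, not
the actual hvmloader definitions:

    /* Hypothetical type 4 (processor) structure fields. */
    p->l1_cache_handle = 0xffff;  /* 0xffff: no cache information provided */
    p->l2_cache_handle = 0xffff;
    p->l3_cache_handle = 0xffff;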
commit 2993eca8288f36fac12482ff370cd470ae9e7cbb
Author: Christopher Clark <christopher.clark6@baesystems.com>
Date: Wed Aug 23 17:47:04 2017 +0200
xsm: policy hooks to require an IOMMU and interrupt remapping
Isolation of devices passed through to domains usually requires an
active IOMMU. The existing method of requiring an IOMMU is via a Xen
boot parameter ("iommu=force") which will abort boot if an IOMMU is not
available.
More graceful degradation of behaviour when an IOMMU is absent can be
achieved by enabling XSM to perform enforcement of IOMMU requirement.
This patch enables an enforceable XSM policy to specify that an IOMMU is
required for particular domains to access devices and how capable that
IOMMU must be. This allows a Xen system to boot whilst still
ensuring that an IOMMU is active before permitting device use.
Using an XSM policy ensures that the isolation properties remain enforced
even when the large, complex toolstack software changes.
For some hardware platforms interrupt remapping is a strict requirement
for secure isolation. Not all IOMMUs provide interrupt remapping.
The XSM policy can now optionally require interrupt remapping.
The device use hooks now check whether an IOMMU is:
* Active and securely isolating:
-- current criteria for this is that interrupt remapping is ok
* Active but interrupt remapping is not available
* Not active
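Purely as an illustration, the three outcomes could be modelled as an
enumeration (the names below are made up, not the actual XSM code):

    /* Hypothetical encoding of the IOMMU states listed above. */
    enum iommu_isolation {
        IOMMU_SECURE,      /* active, interrupt remapping available */
        IOMMU_NO_INTREMAP, /* active, but no interrupt remapping */
        IOMMU_INACTIVE,    /* not active */
    };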
This patch also updates the reference XSM policy to use the new
primitives, with policy entries that do not require an active IOMMU.
Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Ross Philipson <ross.philipson@gmail.com>
commit 59546c1897a90fe9af5ebbbb05ead8d98b4d17b9
Author: Jan Beulich <jbeulich@suse.com>
Date: Wed Aug 23 17:45:45 2017 +0200
arm/mm: release grant lock on xenmem_add_to_physmap_one() error paths
Commit 55021ff9ab ("xen/arm: add_to_physmap_one: Avoid to map mfn 0 if
an error occurs") introduced error paths not releasing the grant table
lock. Replace them by a suitable check after the lock was dropped.
This is XSA-235.
Reported-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
commit 4a0485c3d343e1c582fa824e4896b9b613a14efe
Author: Wei Liu <wei.liu2@citrix.com>
Date: Mon Aug 21 15:09:13 2017 +0100
x86: switch to plain bool in passthrough code
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
commit 18f518eace0619d902ea1132226c7ebc64312f78
Author: Wei Liu <wei.liu2@citrix.com>
Date: Mon Aug 21 15:09:12 2017 +0100
xen: merge common hvm/irq.h into x86 hvm/irq.h
That header file is only used by x86. Merge it into the x86 header.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
commit 108896d73e5a95a99bbebb1d68e0a656cf93b703
Author: Wei Liu <wei.liu2@citrix.com>
Date: Mon Aug 21 15:09:11 2017 +0100
xen: move hvm save code under common to x86
The code is only used by x86 at this point. Merge common/hvm/save.c
into x86 hvm/save.c. Move the headers and fix up inclusions. Remove
the now empty common/hvm directory.
Also fix some issues while moving:
1. remove trailing spaces;
2. fix a multi-line comment;
3. make "i" in hvm_save unsigned int;
4. add some blank lines to separate sections of code;
5. change bool_t to bool.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
commit 149c6bbbf775b5e6dd6beae329fcdaab33a0f8cd
Author: Igor Druzhinin <igor.druzhinin@citrix.com>
Date: Thu Aug 17 15:57:13 2017 +0100
hvmloader, libxl: use the correct ACPI settings depending on device model
We need to choose ACPI tables and ACPI IO port location
properly depending on the device model version we are running.
Previously, this decision was made by BIOS type specific
code in hvmloader, e.g. always load QEMU traditional specific
tables if it's ROMBIOS and always load QEMU Xen specific
tables if it's SeaBIOS.
This change preserves this behavior (for compatibility) but adds
an additional way (a xenstore key) to specify the correct
device model if we happen to run a non-default one. The toolstack
part makes use of it.
The enforcement of BIOS type depending on QEMU version will
be lifted later when the rest of ROMBIOS compatibility fixes
are in place.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
commit 88bfbf90e35f1213f9967a97dee0b2039f9998a4
Author: Bernd Kuhls <bernd.kuhls@t-online.de>
Date: Sat Aug 19 16:21:42 2017 +0200
tools/libxc/xc_dom_arm: add missing variable initialization
The variable domctl.u.address_size.size may remain uninitialized if
guest_type is not one of xen-3.0-aarch64 or xen-3.0-armv7l. The code
then checks precisely whether this variable is still 0 to decide if the
guest type is supported or not.
This fixes the following build failure with gcc 7.x:
xc_dom_arm.c:229:31: error: 'domctl.u.address_size.size' may be used uninitialized in this function [-Werror=maybe-uninitialized]
if ( domctl.u.address_size.size == 0 )
Patch originally taken from
https://www.mail-archive.com/xen-devel@lists.xen.org/msg109313.html.
Signed-off-by: Bernd Kuhls <bernd.kuhls@t-online.de>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
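A sketch of the kind of fix described, with placement and surrounding
code assumed from the error message above:

    DECLARE_DOMCTL;

    /* Initialize explicitly so the "== 0" check below is well-defined
     * even when guest_type matches neither ARM variant. */
    domctl.u.address_size.size = 0;

    if ( !strcmp(dom->guest_type, "xen-3.0-aarch64") )
        domctl.u.address_size.size = 64;
    else if ( !strcmp(dom->guest_type, "xen-3.0-armv7l") )
        domctl.u.address_size.size = 32;

    if ( domctl.u.address_size.size == 0 )
        return -1; /* unsupported guest type */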
commit 0c5f2f9cefacd0881b86abbe36e231815cef7735
Author: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Date: Wed Aug 16 20:31:00 2017 +0200
mm: Make sure pages are scrubbed
Add a debug Kconfig option that makes the page allocator verify
that pages that were supposed to be scrubbed are, in fact, clean.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
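A minimal sketch of such a check; the pattern value and helper names
are assumptions, not necessarily what Xen ended up with:

    #define SCRUB_PATTERN 0xc2c2c2c2c2c2c2c2ULL

    static void check_one_page(struct page_info *pg)
    {
        const uint64_t *ptr = __map_domain_page(pg);
        unsigned int i;

        /* A page marked as scrubbed must still hold the poison pattern. */
        for ( i = 0; i < PAGE_SIZE / sizeof(*ptr); i++ )
            ASSERT(ptr[i] == SCRUB_PATTERN);

        unmap_domain_page(ptr);
    }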
commit d6bbb14cdc566745653df8b77dc103191efd1650
Author: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Date: Wed Aug 16 20:30:00 2017 +0200
mm: Print number of unscrubbed pages in 'H' debug handler
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
commit b43abf5ca3412554b97936e57714581c86ff440f
Author: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Date: Wed Aug 16 20:31:00 2017 +0200
mm: Keep heap accessible to others while scrubbing
Instead of scrubbing pages while holding the heap lock, we can mark
the buddy's head as being scrubbed and drop the lock temporarily.
If someone (most likely alloc_heap_pages()) tries to access
this chunk, it will signal the scrubber to abort the scrub by setting
the head's BUDDY_SCRUB_ABORT bit. The scrubber checks this bit after
processing each page and stops its work as soon as it sees it.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
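Schematically, the scrub loop might then look like this; the flag and
field names follow the commit message rather than the final code:

    /* Heap lock already dropped; head is marked as being scrubbed. */
    for ( i = 0; i < (1U << order); i++ )
    {
        /* The allocator set the abort bit while we were unlocked:
         * stop and let it take the buddy. */
        if ( head->u.free.scrub_state & BUDDY_SCRUB_ABORT )
            break;
        scrub_one_page(&head[i]);
    }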
commit 462090402a1485504c18d79f7a22b8ead03f1fdd
Author: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Date: Wed Aug 16 20:31:00 2017 +0200
spinlock: Introduce spin_lock_cb()
While waiting for a lock we may want to periodically run some
code. This code may, for example, allow the caller to release
resources held by it that are no longer needed in the critical
section protected by the lock.
Specifically, this feature will be needed by the scrubbing code, where
the scrubber, while waiting for the heap lock to merge back clean
pages, may be requested by the page allocator (which is currently
holding the lock) to abort merging and release the buddy page head
that the allocator wants.
We could use spin_trylock() but since it doesn't take a lock ticket
it may take a long time until the lock is taken. Instead we add
spin_lock_cb() that allows us to grab the ticket and execute a
callback while waiting. This callback is executed on every iteration
of the spinlock waiting loop.
Since we may be sleeping in the lock until it is released we need a
mechanism that will make sure that the callback has a chance to run.
We add spin_lock_kick() that will wake up the waiter.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Julien Grall <julien.grall@arm.com>
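A sketch of the intended usage from the scrubber's side; the callback
body and names are illustrative only:

    /* Runs on every iteration of the spinlock waiting loop. */
    static void scrub_abort_check(void *data)
    {
        struct page_info *head = data;

        if ( head->u.free.scrub_state & BUDDY_SCRUB_ABORT )
            /* ... release the buddy head the allocator asked for ... */;
    }

    spin_lock_cb(&heap_lock, scrub_abort_check, head);
    /* ... merge back clean pages ... */
    spin_unlock(&heap_lock);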
commit 55066985050f5366ed800dcd5ee9308d6ff943b1
Author: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Date: Wed Aug 16 20:30:00 2017 +0200
mm: Scrub memory from idle loop
Instead of scrubbing pages during guest destruction (from
free_heap_pages()), do this opportunistically, from the idle loop.
We might come to scrub_free_pages() from the idle loop while another CPU
uses the mapcache override, resulting in a fault while trying to do
__map_domain_page() in scrub_one_page(). To avoid this, make the mapcache
vcpu override a per-cpu variable.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
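Schematically, with the exact condition and placement in idle_loop()
assumed:

    for ( ; ; )
    {
        /* Scrub opportunistically while there is nothing else to do. */
        if ( !softirq_pending(smp_processor_id()) )
            scrub_free_pages();
        do_softirq();
    }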
(qemu changes not included)
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-25 13:40 ` Jan Beulich
From: Jan Beulich @ 2017-08-25 13:40 UTC (permalink / raw)
To: xen-devel; +Cc: Igor Druzhinin, Boris Ostrovsky, osstest-admin, Roger Pau Monne
>>> On 25.08.17 at 05:15, <osstest-admin@xenproject.org> wrote:
> flight 112855 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/112855/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
> test-amd64-i386-examine 7 reboot fail REGR. vs. 112809
> test-amd64-i386-freebsd10-amd64 7 xen-boot fail REGR. vs. 112809
These two are watchdog NMIs during the loading of Dom0. Most
likely candidate for introducing the issue is Boris' scrub series.
> build-amd64-xsm 6 xen-build fail REGR. vs. 112809
This looks like a network glitch.
> test-amd64-amd64-xl-qemut-win7-amd64 10 windows-install fail REGR. vs. 112809
> test-amd64-i386-xl-qemut-win7-amd64 10 windows-install fail REGR. vs. 112809
> test-amd64-i386-xl-qemut-ws16-amd64 10 windows-install fail REGR. vs. 112809
The guests here all look to be stuck on early first-time boot.
One of the two hvmloader changes would look to be the
primary suspect.
In both problem cases we may alternatively need to see whether
the bisector can narrow it down over the weekend.
Jan
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-25 17:14 ` Boris Ostrovsky
From: Boris Ostrovsky @ 2017-08-25 17:14 UTC (permalink / raw)
To: Jan Beulich, xen-devel; +Cc: Igor Druzhinin, osstest-admin, Roger Pau Monne
On 08/25/2017 09:40 AM, Jan Beulich wrote:
>>>> On 25.08.17 at 05:15, <osstest-admin@xenproject.org> wrote:
>> flight 112855 xen-unstable real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/112855/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>> test-amd64-i386-examine 7 reboot fail REGR. vs. 112809
>> test-amd64-i386-freebsd10-amd64 7 xen-boot fail REGR. vs. 112809
> These two are watchdog NMIs during the loading of Dom0. Most
> likely candidate for introducing the issue is Boris' scrub series.
I haven't been able to reproduce this but perhaps adding
process_pending_softirqs() in alloc_heap_pages() and free_heap_pages()
loops if CONFIG_SCRUB_DEBUG is set might help?
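Something along these lines, as a sketch only:

    /* In the CONFIG_SCRUB_DEBUG verification loop: */
    for ( i = 0; i < (1U << order); i++ )
    {
        check_one_page(&pg[i]);
        /* Keep the watchdog fed on large-order allocations. */
        if ( !(i & 0xff) )
            process_pending_softirqs();
    }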
One other thing that also comes to mind is that there is probably no
reason to scrub (and in some cases poison) pages during dom0 creation.
-boris
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-28 7:25 ` Jan Beulich
From: Jan Beulich @ 2017-08-28 7:25 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: Igor Druzhinin, osstest-admin, xen-devel, Roger Pau Monne
>>> On 25.08.17 at 19:14, <boris.ostrovsky@oracle.com> wrote:
> On 08/25/2017 09:40 AM, Jan Beulich wrote:
>>>>> On 25.08.17 at 05:15, <osstest-admin@xenproject.org> wrote:
>>> flight 112855 xen-unstable real [real]
>>> http://logs.test-lab.xenproject.org/osstest/logs/112855/
>>>
>>> Regressions :-(
>>>
>>> Tests which did not succeed and are blocking,
>>> including tests which could not be run:
>>> test-amd64-i386-examine 7 reboot fail REGR. vs. 112809
>>> test-amd64-i386-freebsd10-amd64 7 xen-boot fail REGR. vs. 112809
>> These two are watchdog NMIs during the loading of Dom0. Most
>> likely candidate for introducing the issue is Boris' scrub series.
>
>
> I haven't been able to reproduce this but perhaps adding
> process_pending_softirqs() in alloc_heap_pages() and free_heap_pages()
> loops if CONFIG_SCRUB_DEBUG is set might help?
That's possible, but might as well only be papering over a deeper
issue, e.g. ...
> One other thing that also comes to mind is that there is probably no
> reason to scrub (and in some cases poison) pages during dom0 creation.
... this one: Iirc before your series Dom0 pages weren't being
scrubbed, and imo this property ought to be retained (also if
any other boot time allocations now suddenly got scrubbed).
Jan
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-28 13:57 ` Boris Ostrovsky
From: Boris Ostrovsky @ 2017-08-28 13:57 UTC (permalink / raw)
To: Jan Beulich; +Cc: Igor Druzhinin, osstest-admin, xen-devel, Roger Pau Monne
On 08/28/2017 03:25 AM, Jan Beulich wrote:
>>>> On 25.08.17 at 19:14, <boris.ostrovsky@oracle.com> wrote:
>> On 08/25/2017 09:40 AM, Jan Beulich wrote:
>>>>>> On 25.08.17 at 05:15, <osstest-admin@xenproject.org> wrote:
>>>> flight 112855 xen-unstable real [real]
>>>> http://logs.test-lab.xenproject.org/osstest/logs/112855/
>>>>
>>>> Regressions :-(
>>>>
>>>> Tests which did not succeed and are blocking,
>>>> including tests which could not be run:
>>>> test-amd64-i386-examine 7 reboot fail REGR. vs. 112809
>>>> test-amd64-i386-freebsd10-amd64 7 xen-boot fail REGR. vs. 112809
>>> These two are watchdog NMIs during the loading of Dom0. Most
>>> likely candidate for introducing the issue is Boris' scrub series.
>>
>> I haven't been able to reproduce this but perhaps adding
>> process_pending_softirqs() in alloc_heap_pages() and free_heap_pages()
>> loops if CONFIG_SCRUB_DEBUG is set might help?
> That's possible, but might as well only be papering over a deeper
> issue, e.g. ...
>
>> One other thing that also comes to mind is that there is probably no
>> reason to scrub (and in some cases poison) pages during dom0 creation.
> ... this one: Iirc before your series Dom0 pages weren't being
> scrubbed, and imo this property ought to be retained (also if
> any other boot time allocations now suddenly got scrubbed).
It is scrubbed if CONFIG_SCRUB_DEBUG in free_domheap_pages:
#ifndef CONFIG_SCRUB_DEBUG
    /*
     * Normally we expect a domain to clear pages before freeing them,
     * if it cares about the secrecy of their contents. However, after
     * a domain has died we assume responsibility for erasure.
     */
    scrub = !!d->is_dying;
#else
    scrub = true;
#endif
so the question is whether we need to do this (at least for dom0).
As for periodically testing process_pending_softirqs() we may still want
to do this in alloc_heap_pages(), even without CONFIG_SCRUB_DEBUG. And
while at it, I also think we can execute the 'for' loop without holding the
heap lock, since the pages are now removed from the heap (or do we need
to modify count_info/type_info/owner under the lock?)
-boris
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-28 14:02 ` Jan Beulich
From: Jan Beulich @ 2017-08-28 14:02 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: Igor Druzhinin, osstest-admin, xen-devel, Roger Pau Monne
>>> On 28.08.17 at 15:57, <boris.ostrovsky@oracle.com> wrote:
> On 08/28/2017 03:25 AM, Jan Beulich wrote:
>>>>> On 25.08.17 at 19:14, <boris.ostrovsky@oracle.com> wrote:
>>> On 08/25/2017 09:40 AM, Jan Beulich wrote:
>>>>>>> On 25.08.17 at 05:15, <osstest-admin@xenproject.org> wrote:
>>>>> flight 112855 xen-unstable real [real]
>>>>> http://logs.test-lab.xenproject.org/osstest/logs/112855/
>>>>>
>>>>> Regressions :-(
>>>>>
>>>>> Tests which did not succeed and are blocking,
>>>>> including tests which could not be run:
>>>>> test-amd64-i386-examine 7 reboot fail REGR. vs. 112809
>>>>> test-amd64-i386-freebsd10-amd64 7 xen-boot fail REGR. vs. 112809
>>>> These two are watchdog NMIs during the loading of Dom0. Most
>>>> likely candidate for introducing the issue is Boris' scrub series.
>>>
>>> I haven't been able to reproduce this but perhaps adding
>>> process_pending_softirqs() in alloc_heap_pages() and free_heap_pages()
>>> loops if CONFIG_SCRUB_DEBUG is set might help?
>> That's possible, but might as well only be papering over a deeper
>> issue, e.g. ...
>>
>>> One other thing that also comes to mind is that there is probably no
>>> reason to scrub (and in some cases poison) pages during dom0 creation.
>> ... this one: Iirc before your series Dom0 pages weren't being
>> scrubbed, and imo this property ought to be retained (also if
>> any other boot time allocations now suddenly got scrubbed).
>
> It is scrubbed if CONFIG_SCRUB_DEBUG in free_domheap_pages:
>
> #ifndef CONFIG_SCRUB_DEBUG
>     /*
>      * Normally we expect a domain to clear pages before freeing them,
>      * if it cares about the secrecy of their contents. However, after
>      * a domain has died we assume responsibility for erasure.
>      */
>     scrub = !!d->is_dying;
> #else
>     scrub = true;
> #endif
>
> so the question is whether we need to do this (at least for dom0).
We should start doing this only once Dom0 has started running, imo.
> As for periodically testing process_pending_softirqs() we may still want
> to do this in alloc_heap_pages(), even without CONFIG_SCRUB_DEBUG.
For my taste, alloc_heap_pages() is the wrong place for such
calls.
> And
> while at it, also think we can execute the 'for' loop without holding
> heap lock since the pages are now removed from the heap (or do we need
> to modify count_info/type_info/owner under the lock?)
I don't think these fields need updating with the heap lock held.
Jan
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-28 14:24 ` Boris Ostrovsky
From: Boris Ostrovsky @ 2017-08-28 14:24 UTC (permalink / raw)
To: Jan Beulich; +Cc: Igor Druzhinin, osstest-admin, xen-devel, Roger Pau Monne
>> As for periodically testing process_pending_softirqs() we may still want
>> to do this in alloc_heap_pages(), even without CONFIG_SCRUB_DEBUG.
> For my taste, alloc_heap_pages() is the wrong place for such
> calls.
But the loop is in alloc_heap_pages() --- where else would you be testing?
-boris
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-28 14:52 ` Jan Beulich
From: Jan Beulich @ 2017-08-28 14:52 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: Igor Druzhinin, osstest-admin, xen-devel, Roger Pau Monne
>>> On 28.08.17 at 16:24, <boris.ostrovsky@oracle.com> wrote:
>>> As for periodically testing process_pending_softirqs() we may still want
>>> to do this in alloc_heap_pages(), even without CONFIG_SCRUB_DEBUG.
>> For my taste, alloc_heap_pages() is the wrong place for such
>> calls.
>
> But the loop is in alloc_heap_pages() --- where else would you be testing?
It can only reasonably be the callers of alloc_heap_pages() imo.
A single call to it should never trigger the watchdog, only calls
themselves sitting in a loop should be potential candidates for
causing such.
Jan
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-28 15:36 ` Boris Ostrovsky
From: Boris Ostrovsky @ 2017-08-28 15:36 UTC (permalink / raw)
To: Jan Beulich; +Cc: Igor Druzhinin, osstest-admin, xen-devel, Roger Pau Monne
On 08/28/2017 10:52 AM, Jan Beulich wrote:
>>>> On 28.08.17 at 16:24, <boris.ostrovsky@oracle.com> wrote:
>>>> As for periodically testing process_pending_softirqs() we may still want
>>>> to do this in alloc_heap_pages(), even without CONFIG_SCRUB_DEBUG.
>>> For my taste, alloc_heap_pages() is the wrong place for such
>>> calls.
>> But the loop is in alloc_heap_pages() --- where else would you be testing?
> It can only reasonably be the callers of alloc_heap_pages() imo.
> A single call to it should never trigger the watchdog,
check_one_page() is rather slow so for a large order allocation even
with clean heap the 'for' loop may take quite some time. Whether it
could trip the watchdog -- I don't know.
-boris
> only calls
> themselves sitting in a loop should be potential candidates for
> causing such.
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-29 8:07 ` Jan Beulich
From: Jan Beulich @ 2017-08-29 8:07 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: Igor Druzhinin, osstest-admin, xen-devel, Roger Pau Monne
>>> On 28.08.17 at 17:36, <boris.ostrovsky@oracle.com> wrote:
> On 08/28/2017 10:52 AM, Jan Beulich wrote:
>>>>> On 28.08.17 at 16:24, <boris.ostrovsky@oracle.com> wrote:
>>>>> As for periodically testing process_pending_softirqs() we may still want
>>>>> to do this in alloc_heap_pages(), even without CONFIG_SCRUB_DEBUG.
>>>> For my taste, alloc_heap_pages() is the wrong place for such
>>>> calls.
>>> But the loop is in alloc_heap_pages() --- where else would you be testing?
>> It can only reasonably be the callers of alloc_heap_pages() imo.
>> A single call to it should never trigger the watchdog,
>
> check_one_page() is rather slow so for a large order allocation even
> with clean heap the 'for' loop may take quite some time. Whether it
> could trip the watchdog -- I don't know.
If that was a problem, we'd have to think about shortening the
loop. I stand by my assertion that nowhere down from
alloc_heap_pages() should be any invocation of
process_pending_softirqs() - it is simply too risky, as we don't
know what state we're in. One thing I could imagine to do is not
check the entire page, but (randomly?) pick a couple of locations
to check. But first of all we really need to be clear about whether
it's really a single alloc_heap_pages() invocation that trips the
watchdog, or whether something can be done about it in the
caller(s).
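Illustrative only, the sampling variant could be as simple as:

    /* Check a handful of random words instead of the whole page;
     * get_random() stands for whatever entropy source is suitable. */
    static void check_one_page_sampled(const uint64_t *ptr)
    {
        unsigned int i;

        for ( i = 0; i < 8; i++ )
            ASSERT(ptr[get_random() % (PAGE_SIZE / sizeof(*ptr))] ==
                   SCRUB_PATTERN);
    }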
Jan
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-29 12:45 ` Boris Ostrovsky
From: Boris Ostrovsky @ 2017-08-29 12:45 UTC (permalink / raw)
To: Jan Beulich; +Cc: Igor Druzhinin, osstest-admin, xen-devel, Roger Pau Monne
On 08/29/2017 04:07 AM, Jan Beulich wrote:
>>>> On 28.08.17 at 17:36, <boris.ostrovsky@oracle.com> wrote:
>> On 08/28/2017 10:52 AM, Jan Beulich wrote:
>>>>>> On 28.08.17 at 16:24, <boris.ostrovsky@oracle.com> wrote:
>>>>>> As for periodically testing process_pending_softirqs() we may still want
>>>>>> to do this in alloc_heap_pages(), even without CONFIG_SCRUB_DEBUG.
>>>>> For my taste, alloc_heap_pages() is the wrong place for such
>>>>> calls.
>>>> But the loop is in alloc_heap_pages() --- where else would you be testing?
>>> It can only reasonably be the callers of alloc_heap_pages() imo.
>>> A single call to it should never trigger the watchdog,
>> check_one_page() is rather slow so for a large order allocation even
>> with clean heap the 'for' loop may take quite some time. Whether it
>> could trip the watchdog -- I don't know.
> If that was a problem, we'd have to think about shortening the
> loop. I stand by my assertion that nowhere down from
> alloc_heap_pages() should be any invocation of
> process_pending_softirqs() - it is simply too risky, as we don't
> know what state we're in. One thing I could imagine to do is not
> check the entire page, but (randomly?) pick a couple of locations
> to check. But first of all we really need to be clear about whether
> it's really a single alloc_heap_pages() invocation that trips the
> watchdog, or whether something can be done about it in the
> caller(s).
At least one of the crashes was from alloc_chunk()->free_heap_pages(),
i.e. not from inside alloc_heap_pages()' loop. My proposal was not
necessarily based on the specific crashes in this flight (that issue
will be addressed by the patches I sent yesterday) but was rather a
general suggestion. But I understand that calling process_pending_softirqs()
from alloc_heap_pages() may not be a great idea.
I am somewhat puzzled though by the fact that I haven't seen this in my
testing --- I was creating/destroying very large guests (> 1TB) in
parallel so there must have been loops over high orders and I never had
a watchdog go off. And my dom0s were quite large too while the one in
this flight is only 512M.
-boris
* Re: [xen-unstable test] 112855: regressions - trouble: blocked/broken/fail/pass
@ 2017-08-29 13:12 ` Jan Beulich
From: Jan Beulich @ 2017-08-29 13:12 UTC (permalink / raw)
To: Boris Ostrovsky; +Cc: Igor Druzhinin, osstest-admin, xen-devel, Roger Pau Monne
>>> On 29.08.17 at 14:45, <boris.ostrovsky@oracle.com> wrote:
> I am somewhat puzzled though by the fact that I haven't seen this in my
> testing --- I was creating/destroying very large guests (> 1TB) in
> parallel so there must have been loops over high orders and I never had
> a watchdog go off. And my dom0s were quite large too while the one in
> this flight is only 512M.
I guess much depends on memory access latencies on the particular
systems.
Jan