* [linux-linus test] 30356: regressions - FAIL
@ 2014-09-23 11:30 xen.org
  2014-09-23 11:45 ` Ian Campbell
  0 siblings, 1 reply; 10+ messages in thread
From: xen.org @ 2014-09-23 11:30 UTC (permalink / raw)
  To: xen-devel; +Cc: ian.jackson

flight 30356 linux-linus real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/30356/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-armhf-armhf-xl           9 guest-start               fail REGR. vs. 30019
 test-amd64-amd64-xl-qemuu-win7-amd64  7 windows-install   fail REGR. vs. 30019

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-freebsd10-i386  7 freebsd-install              fail like 30019
 test-amd64-i386-freebsd10-amd64  7 freebsd-install             fail like 30019
 test-amd64-i386-pair        17 guest-migrate/src_host/dst_host fail like 30019
 test-amd64-amd64-xl-qemuu-winxpsp3  7 windows-install          fail like 30019

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-libvirt      9 guest-start                  fail   never pass
 test-amd64-i386-libvirt       9 guest-start                  fail   never pass
 test-amd64-amd64-libvirt      9 guest-start                  fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start                 fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop         fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stop                fail never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop                   fail   never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop              fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop             fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop              fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop                   fail  never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop               fail never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop                   fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop               fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop         fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop                fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop                   fail   never pass

version targeted for testing:
 linux                b0e2a55c6536f255ebe80bc84c3f565c2a8f2a9d
baseline version:
 linux                f1bd473f95e02bc382d4dae94d7f82e2a455e05d

------------------------------------------------------------
567 people touched revisions under test,
not listing them all
------------------------------------------------------------

jobs:
 build-amd64                                                  pass    
 build-armhf                                                  pass    
 build-i386                                                   pass    
 build-amd64-libvirt                                          pass    
 build-armhf-libvirt                                          pass    
 build-i386-libvirt                                           pass    
 build-amd64-pvops                                            pass    
 build-armhf-pvops                                            pass    
 build-i386-pvops                                             pass    
 build-amd64-rumpuserxen                                      pass    
 build-i386-rumpuserxen                                       pass    
 test-amd64-amd64-xl                                          pass    
 test-armhf-armhf-xl                                          fail    
 test-amd64-i386-xl                                           pass    
 test-amd64-i386-rhel6hvm-amd                                 pass    
 test-amd64-i386-qemut-rhel6hvm-amd                           pass    
 test-amd64-i386-qemuu-rhel6hvm-amd                           pass    
 test-amd64-amd64-xl-qemut-debianhvm-amd64                    pass    
 test-amd64-i386-xl-qemut-debianhvm-amd64                     pass    
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass    
 test-amd64-i386-xl-qemuu-debianhvm-amd64                     pass    
 test-amd64-i386-freebsd10-amd64                              fail    
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         pass    
 test-amd64-i386-xl-qemuu-ovmf-amd64                          pass    
 test-amd64-amd64-rumpuserxen-amd64                           pass    
 test-amd64-amd64-xl-qemut-win7-amd64                         fail    
 test-amd64-i386-xl-qemut-win7-amd64                          fail    
 test-amd64-amd64-xl-qemuu-win7-amd64                         fail    
 test-amd64-i386-xl-qemuu-win7-amd64                          fail    
 test-amd64-amd64-xl-win7-amd64                               fail    
 test-amd64-i386-xl-win7-amd64                                fail    
 test-amd64-i386-xl-credit2                                   pass    
 test-amd64-i386-freebsd10-i386                               fail    
 test-amd64-i386-rumpuserxen-i386                             pass    
 test-amd64-amd64-xl-pcipt-intel                              fail    
 test-amd64-i386-rhel6hvm-intel                               pass    
 test-amd64-i386-qemut-rhel6hvm-intel                         pass    
 test-amd64-i386-qemuu-rhel6hvm-intel                         pass    
 test-amd64-amd64-libvirt                                     fail    
 test-armhf-armhf-libvirt                                     fail    
 test-amd64-i386-libvirt                                      fail    
 test-amd64-i386-xl-multivcpu                                 pass    
 test-amd64-amd64-pair                                        pass    
 test-amd64-i386-pair                                         fail    
 test-amd64-amd64-xl-sedf-pin                                 pass    
 test-amd64-amd64-xl-sedf                                     pass    
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1                     fail    
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1                     fail    
 test-amd64-i386-xl-winxpsp3-vcpus1                           fail    
 test-amd64-amd64-xl-qemut-winxpsp3                           fail    
 test-amd64-i386-xl-qemut-winxpsp3                            fail    
 test-amd64-amd64-xl-qemuu-winxpsp3                           fail    
 test-amd64-i386-xl-qemuu-winxpsp3                            fail    
 test-amd64-amd64-xl-winxpsp3                                 fail    
 test-amd64-i386-xl-winxpsp3                                  fail    


------------------------------------------------------------
sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
    http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
    http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Not pushing.

(No revision log; it would be 21553 lines long.)


* Re: [linux-linus test] 30356: regressions - FAIL
  2014-09-23 11:30 [linux-linus test] 30356: regressions - FAIL xen.org
@ 2014-09-23 11:45 ` Ian Campbell
  2014-09-23 15:20   ` Julien Grall
  2014-10-30 23:18   ` Julien Grall
  0 siblings, 2 replies; 10+ messages in thread
From: Ian Campbell @ 2014-09-23 11:45 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, xen.org

On Tue, 2014-09-23 at 12:30 +0100, xen.org wrote:
> flight 30356 linux-linus real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/30356/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-armhf-armhf-xl           9 guest-start               fail REGR. vs. 30019

http://www.chiark.greenend.org.uk/~xensrcts/logs/30356/test-armhf-armhf-xl/info.html

This has failed in every one of a couple of dozen runs since the end of
August.

30019 was OK, so was 30032 but from 30050 onwards it is consistently
failing.

Failure is:
2014-09-23 05:39:03 Z guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: waiting 40s...
2014-09-23 05:39:03 Z guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: no active lease (waiting) ...
...
2014-09-23 05:39:44 Z FAILURE: guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: wait timed out: no active lease.
failure: guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: wait timed out: no active lease.

Guest console logs are empty, host console log is uninteresting. The
xenstore dump shows that none of the devices are connected, all of which
suggests an early crash of some sort.

Baseline was f1bd473f aka v3.17-rc2-42-gf1bd473 and this is b0e2a55c6536
aka v3.17-rc6-7-gb0e2a55

Perhaps more interestingly the first failure in 30050 was 
fd5984d7c8e aka v3.17-rc2-227-gfd5984d, which is a much smaller range.

Nothing in the log (below) looks terribly exciting to me and there's not
a lot to go on. Anyone got any ideas?

Ian.

$ git log --no-merges --oneline f1bd473f..fd5984d7c8e
b0108f9 kexec: purgatory: add clean-up for purgatory directory
16b0371 Documentation/kdump/kdump.txt: add ARM description
e356030 flush_icache_range: export symbol to fix build errors
498b473 tools: selftests: fix build issue with make kselftests target
8c7b638 ocfs2: quorum: add a log for node not fenced
8e9801d ocfs2: o2net: set tcp user timeout to max value
c43c363 ocfs2: o2net: don't shutdown connection when idle timeout
2b46263 ocfs2: do not write error flag to user structure we cannot copy from/to
4df4185 x86/purgatory: use approprate -m64/-32 build flag for arch/x86/purgatory
b7d5b9a drivers/rtc/rtc-s5m.c: re-add support for devices without irq specified
bfcfd44 xattr: fix check for simultaneous glibc header inclusion
b41d34b kexec: remove CONFIG_KEXEC dependency on crypto
74ca317 kexec: create a new config option CONFIG_KEXEC_FILE for new syscall
b38af47 x86,mm: fix pte_special versus pte_numa
7ea8574 hugetlb_cgroup: use lockdep_assert_held rather than spin_is_locked
137f8cf mm/zpool: use prefixed module loading
0cf1e9d zram: fix incorrect stat with failed_reads
0c38e1f lib: turn CONFIG_STACKTRACE into an actual option.
ce8369b mm: actually clear pmd_numa before invalidating
0cfb8f0 memblock, memhotplug: fix wrong type in memblock_find_in_range_node().
800df62 resource: fix the case of null pointer access
3f6316b checkpatch: relax check for length of git commit IDs
9e36c63 alpha: io: implement relaxed accessor macros for writes
5691e44 alpha: Wire up sched_setattr, sched_getattr, and renameat2 syscalls.
9eabc99 x86, irq, PCI: Keep IRQ assignment for runtime power management
d80d448 ext4: fix same-dir rename when inline data directory overflows
db9ee22 jbd2: fix descriptor block size handling errors with journal_csum
022eaa7 jbd2: fix infinite loop when recovering corrupt journal blocks
6603120 ext4: update i_disksize coherently with block allocation on error path
d49ec52 dm crypt: fix access beyond the end of allocated space
daebabd mfd: twl4030-power: Fix PM idle pin configuration to not conflict with regulators
bc80436 mfd: tc3589x: Add device tree bindings
7b5af5c cfq-iosched: Add comments on update timing of weight
1f58d94 dma-buf/fence: Fix one more kerneldoc warning
e9f3b79 dma-buf/fence: Fix a kerneldoc warning
a07b3b4 Documentation/dma-buf-sharing.txt: update API descriptions
b8d758d drm/ast: Add missing entry to dclk_table[]
00e7208 drm: fix division-by-zero on dumb_create()
4d69237 ww-mutex: clarify help text for DEBUG_WW_MUTEX_SLOWPATH
a9ef803 USB: fix build error with CONFIG_PM_RUNTIME disabled
dc26874 cpufreq: s5pv210: Remove spurious __init annotation
16405f9 cpufreq: intel_pstate: Add CPU ID for Braswell processor
ce71761 intel_pstate: Turn per cpu printk into pr_debug
c174e6d ext4: fix transaction issues for ext4_fallocate and ext_zero_range
69dc953 ext4: fix incorect journal credits reservation in ext4_zero_range
7c38405 Revert "usb: ehci/ohci-exynos: Fix PHY getting sequence"
0bd252d radeon: Test for PCI root bus before assuming bus->self
e21eba0 xhci: Disable streams on Via XHCI with device-id 0x3432
5654699 USB: serial: fix potential heap buffer overflow
d979e9f USB: serial: fix potential stack buffer overflow
f395dca x86: irq: Fix bug in setting IOAPIC pin attributes
1a22e77 ALSA: hda - Set up initial pins for Acer Aspire V5
0393689 usb: ehci/ohci-exynos: Fix PHY getting sequence
bdd405d usb: hub: Prevent hub autosuspend if usbcore.autosuspend is -1
72ad366 thunderbolt: Clear hops before overwriting
f87d928f NFSv3: Fix another acl regression
412f6c4 NFSv4: Don't clear the open state when we just did an OPEN_DOWNGRADE
aee7af3 NFSv4: Fix problems with close in the presence of a delegation
5b6b80a USB: sisusb: add device id for Magic Control USB video
0a5f6e9 drm/radeon: handle broken disabled rb mask gracefully (6xx/7xx) (v2)
054e01d drm/radeon: save/restore the PD addr on suspend/resume
e15693e cfq-iosched: Fix wrong children_weight calculation
0d9509d drm/msm: Fix missing unlock on error in msm_fbdev_create()
12313c2 drm/msm: fix compile error for non-dt builds
119ecb7 drm/msm/mdp4: request vblank during modeset
6814dbf drm/msm: avoid flood of kernel logs on faults
d19d744 block: fix error handling in sg_io
4d4e2c0 video: da8xx-fb: preserve display width when changing HSYNC
f5ec6c4 drm: sti: Add missing dependency on RESET_CONTROLLER
8e932cf drm: sti: Make of_device_id array const
eacd9aa drm: sti: Fix return value check in sti_drm_platform_probe()
5024a2b drm: sti: hda: fix return value check in sti_hda_probe()
88cfc3f drm: sti: hdmi: fix return value check in sti_hdmi_probe()
31f32a2 drm: sti: tvout: fix return value check in sti_tvout_probe()
62795a0 video: of: display_timing: double free on error
6c13185 drivers: video: fbdev: atmel_lcdfb.c: fix error return code
2b6c53b video: ARM CLCD: Fix calculation of bits-per-pixel
754d561 fbdev: Remove __init from chips_hw_init() to fix build failure
1bfbd8e ACPI / LPSS: Add ACPI IDs for Intel Braswell
558e473 ACPI / EC: Add support to disallow QR_EC to be issued before completing previous QR_EC
3afcf2e ACPI / EC: Add support to disallow QR_EC to be issued when SCI_EVT isn't set
236105d ACPI: Run fixed event device notifications in process context
fc2e0a8 ACPI / scan: Allow ACPI drivers to bind to PNP device objects
a2fa672 staging: r8188eu: Add new USB ID
a90b858 x86: Fix non-PC platform kernel crash on boot due to NULL dereference
8626d52 staging/rtl8188eu: add 0df6:0076 Sitecom Europe B.V.
8e8248b mei: nfc: fix memory leak in error path
73ab423 mei: reset client state on queued connect request
9b2667f usb: dwc2: gadget: Set the default EP max packet value as 8 bytes
5cbcc35 usb: ehci: using wIndex + 1 for hub port
a7e69dd USB: storage: add quirk for Newer Technology uSCSI SCSI-USB converter
563da3a MAINTAINERS: Add an entry for USB/IP driver
3f653c5 usbip: remove struct usb_device_id table
96c2737 usbip: move usbip kernel code out of staging
588b48c usbip: move usbip userspace code out of staging
6817ae2 USB: whiteheat: Added bounds checking for bulk command response
4631dbf ext4: move i_size,i_disksize update routines to helper function
c99d1e6 ext4: fix BUG_ON in mb_free_blocks()
36de928 ext4: propagate errors up to ext4_find_entry()'s callers
2ba136d fix regression in SCSI_IOCTL_SEND_COMMAND
6f4a1626 scsi-mq: fix requests that use a separate CDB buffer
94a988a ALSA: pcm: Fix the silence data for DSD formats
ee3043b ALSA: ctxfi: ct20k1reg: Fix typo in include guard
3c25d04 ALSA: hda: ca0132_regs.h: Fix typo in include guard
ddc64b2 ALSA: core: fix buffer overflow in snd_info_get_line()
a57821c block: support > 16 byte CDBs for SG_IO
2cada58 block: cleanup error handling in sg_io
aeac318 brd: add ram disk visibility option
ffb5db7 block: systemace: Remove .owner field for driver
cddd5d1 blk-mq: blk_mq_freeze_queue() should allow nesting
a68aafa blk-mq: correct a few wrong/bad comments
16f408d block: Fix BUG_ON when pi errors occur
274a584 blk-mq: don't allow merges if turned off for the queue
0252d6a pinctrl: qcom: apq8064: Correct interrupts in example
f6a8249 pinctrl: exynos: Lock GPIOs as interrupts when used as EINTs
5d19703 usb: gadget: remove $(PWD) in ccflags-y
a68df70 usb: pch_udc: usb gadget device support for Intel Quark X1000
72ef8e4 mfd: ab8500-core: Use 'ifdef' for config options
6065c9a mfd: htc-i2cpld: Fix %d confusingly prefixed with 0x in format string
ddde06b mfd: omap-usb-host: Fix %d confusingly prefixed with 0x in format string
937222c pwm-backlight: Fix bogus request for GPIO#0 when instantiated from DT
bd52b81 usb: gadget: uvc: fix possible lockup in uvc gadget
6835a3a usb: wusbcore: fix below build warning
5b1dc20 usb: core: fix below build warning
194f74eb usb: dwc2: gadget: fix below build warning
365038d xhci: rework cycle bit checking for new dequeue pointers
2597fe9 usb: xhci: amd chipset also needs short TX quirk
9a54886 xhci: Treat not finding the event_seg on COMP_STOP the same as COMP_STOP_INVAL
dd5f500 usbcore: Fix wrong device in an error message in hub_port_connect()
7166c32 Revert "usb: gadget: u_ether: synchronize with transmit when stopping queue"
716d28e usb: phy: msm: Fix return value check in msm_otg_probe()
4b11f88 usb: gadget: Fix return value check in r8a66597_probe()
7042e8f usb: gadget: Fix return value check in ep_write()
788b0bc4 usb: dwc3: omap: signedness bug in dwc3_omap_set_utmi_mode()
50f9f79 usb: musb: ux500: fix decimal printf format specifiers prefixed with 0x
bcabdc2 usb: atmel_usba_udc: fix it to deal with final dma channel
20e7d46 usb: gadget: fix error return code
bbc66e1 usb: phy: samsung: Fix wrong bit mask for PHYPARAM1_PCS_TXDEEMPH
4958cf3 usb: dbgp gadget: fix use after free in dbgp_unbind()
0c58240 usb: phy: drop kfree of devm_kzalloc's data
2c4e3db usb: phy: return -ENODEV on failure of try_module_get
646907f USB: ftdi_sio: Added PID for new ekey device
91fcb1ce USB: serial: pl2303: add device id for ztek device
6552cc7 USB: ftdi_sio: add Basic Micro ATOM Nano USB2Serial PID
754eb21 USB: zte_ev: remove duplicate Qualcom PID
95be573 USB: zte_ev: remove duplicate Gobi PID
63a901c Revert "USB: option,zte_ev: move most ZTE CDMA devices to zte_ev"
d773027 USB: option: add VIA Telecom CDS7 chipset device id
f0e4cba USB: option: reduce interrupt-urb logging verbosity
4b6fe45 pinctrl: pinctrl-at91.c: fix decimal printf format specifiers prefixed with 0x
1d54f0f pinctrl: abx500: remove useless check
8a3cfb7 pinctrl: tegra-xusb: testing wrong variable in probe()
8e1594d pinctrl: tegra-xusb: fix an off by one test
99e872d pinctrl: rockchip: fix rk3288 gpio0 configuration
302fb17 sh-pfc: r8a7791: fix CAN pin groups
eb29835 staging: android: fix a possible memory leak
299ef8c staging: lustre: lustre: libcfs: workitem.c: Cleaning up missing null-terminate after strncpy call
ec0a38b staging: et131x: Fix errors caused by phydev->addr accesses before initialisation
e409842 staging: lustre: Remove circular dependency on header
8a58d1f blk-mq: get rid of unused BLK_MQ_F_SHOULD_SORT flag
dd84008 blk-mq: fix WARNING "percpu_ref_kill() called more than once!"

Ian.


* Re: [linux-linus test] 30356: regressions - FAIL
  2014-09-23 11:45 ` Ian Campbell
@ 2014-09-23 15:20   ` Julien Grall
  2014-09-23 17:31     ` Julien Grall
  2014-10-30 23:18   ` Julien Grall
  1 sibling, 1 reply; 10+ messages in thread
From: Julien Grall @ 2014-09-23 15:20 UTC (permalink / raw)
  To: Ian Campbell, Stefano Stabellini; +Cc: xen-devel, xen.org

On 09/23/2014 12:45 PM, Ian Campbell wrote:
> On Tue, 2014-09-23 at 12:30 +0100, xen.org wrote:
>> flight 30356 linux-linus real [real]
>> http://www.chiark.greenend.org.uk/~xensrcts/logs/30356/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  test-armhf-armhf-xl           9 guest-start               fail REGR. vs. 30019
> 
> http://www.chiark.greenend.org.uk/~xensrcts/logs/30356/test-armhf-armhf-xl/info.html
> 
> This has failed in every one of a couple of dozen runs since the end of
> August.
> 
> 30019 was OK, so was 30032 but from 30050 onwards it is consistently
> failing.
> 
> Failure is:
> 2014-09-23 05:39:03 Z guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: waiting 40s...
> 2014-09-23 05:39:03 Z guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: no active lease (waiting) ...
> ...
> 2014-09-23 05:39:44 Z FAILURE: guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: wait timed out: no active lease.
> failure: guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: wait timed out: no active lease.
> 
> Guest console logs are empty, host console log is uninteresting. The
> xenstore dump shows that none of the devices are connected, all of which
> suggests an early crash of some sort.
> 
> Baseline was f1bd473f aka v3.17-rc2-42-gf1bd473 and this is b0e2a55c6536
> aka v3.17-rc6-7-gb0e2a55
> 
> Perhaps more interestingly the first failure in 30050 was 
> fd5984d7c8e aka v3.17-rc2-227-gfd5984d, which is a much smaller range.
> 
> Nothing in the log (below) looks terribly exciting to me and there's not
> a lot to go on. Anyone got any ideas?

It looks to be an issue with the .config. I have a working Linux if
I disable most of the CONFIG_MACH_* and CONFIG_ARCH_* options.

Touching those CONFIGs also modifies CONFIG_DEBUG_LL_INCLUDE.

The current value of CONFIG_DEBUG_LL_INCLUDE looks wrong to me but I
still need to figure out if it's really an issue or not.

Regards,

-- 
Julien Grall


* Re: [linux-linus test] 30356: regressions - FAIL
  2014-09-23 15:20   ` Julien Grall
@ 2014-09-23 17:31     ` Julien Grall
  2014-09-23 18:40       ` Ian Campbell
  0 siblings, 1 reply; 10+ messages in thread
From: Julien Grall @ 2014-09-23 17:31 UTC (permalink / raw)
  To: Ian Campbell, Stefano Stabellini; +Cc: xen-devel, xen.org

Hi,

On 09/23/2014 04:20 PM, Julien Grall wrote:
> On 09/23/2014 12:45 PM, Ian Campbell wrote:
>> On Tue, 2014-09-23 at 12:30 +0100, xen.org wrote:
>>> flight 30356 linux-linus real [real]
>>> http://www.chiark.greenend.org.uk/~xensrcts/logs/30356/
>>>
>>> Regressions :-(
>>>
>>> Tests which did not succeed and are blocking,
>>> including tests which could not be run:
>>>  test-armhf-armhf-xl           9 guest-start               fail REGR. vs. 30019
>>
>> http://www.chiark.greenend.org.uk/~xensrcts/logs/30356/test-armhf-armhf-xl/info.html
>>
>> This has failed in every one of a couple of dozen runs since the end of
>> August.
>>
>> 30019 was OK, so was 30032 but from 30050 onwards it is consistently
>> failing.
>>
>> Failure is:
>> 2014-09-23 05:39:03 Z guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: waiting 40s...
>> 2014-09-23 05:39:03 Z guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: no active lease (waiting) ...
>> ...
>> 2014-09-23 05:39:44 Z FAILURE: guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: wait timed out: no active lease.
>> failure: guest debian.guest.osstest 5a:36:0e:94:00:07 22 link/ip/tcp: wait timed out: no active lease.
>>
>> Guest console logs are empty, host console log is uninteresting. The
>> xenstore dump shows that none of the devices are connected, all of which
>> suggests an early crash of some sort.
>>
>> Baseline was f1bd473f aka v3.17-rc2-42-gf1bd473 and this is b0e2a55c6536
>> aka v3.17-rc6-7-gb0e2a55
>>
>> Perhaps more interestingly the first failure in 30050 was 
>> fd5984d7c8e aka v3.17-rc2-227-gfd5984d, which is a much smaller range.
>>
>> Nothing in the log (below) looks terribly exciting to me and there's not
>> a lot to go on. Anyone got any ideas?
> 
> It looks to be an issue with the .config. I have a working Linux if
> I disable most of the CONFIG_MACH_* and CONFIG_ARCH_* options.
> 
> Touching those CONFIGs also modifies CONFIG_DEBUG_LL_INCLUDE.
> 
> The current value of CONFIG_DEBUG_LL_INCLUDE looks wrong to me but I
> still need to figure out if it's really an issue or not.

I've spent more time debugging this issue and found another one, which
turns out to be related.

When the multi_v7 config (+ Xen options) is used, DOM0 will crash [1] in
the swiotlb code.

The config uses short page tables, which makes Linux use 32 bits
for the physical address. If we choose to use 64 bits for the DMA
address (enabled when Xen is selected), BUG(dma != phys) will likely be
hit with the recent change in swiotlb (i.e. handling multiple grant
references on the same mapping).
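
To make the truncation concrete, here is a minimal standalone sketch in C.
It is not the swiotlb-xen code: the type and function names (phys32_t,
dma64_t, dma_to_phys32, narrowing_truncates) are hypothetical and only
model a 32-bit physical address type next to a 64-bit DMA address type.
The sample address is illustrative as well.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two address widths under discussion:
 * a short-page-table kernel with a 32-bit physical address type, and a
 * 64-bit DMA address type. */
typedef uint32_t phys32_t;
typedef uint64_t dma64_t;

/* Narrow a 64-bit DMA address to 32 bits, as a plain assignment would. */
static inline phys32_t dma_to_phys32(dma64_t dma)
{
    return (phys32_t)dma;
}

/* True when narrowing dropped high bits, i.e. the "dma != phys" case. */
static inline bool narrowing_truncates(dma64_t dma)
{
    return (dma64_t)dma_to_phys32(dma) != dma;
}
```

For example, assert(!narrowing_truncates(0xb679a000ULL)) holds, while
0x2b679a000ULL narrows to 0xb679a000 and narrowing_truncates() returns
true for it. Any address at or above 4GiB behaves the same way.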

We now need to use LPAE by default. Enabling CONFIG_LPAE=y also
fixes guest boot. I haven't yet figured out if it's related or not.

I guess we will have to select LPAE when XEN is enabled, right? If
it's the case that would mean the user won't be able to compile a
Linux guest with short page table and Xen.

Any thoughts?

Regards,


[1] DOM0 crash:

[  110.968052] kernel BUG at /local/home/julien/works/linux/drivers/xen/swiotlb-xen.c:101!
[  110.976124] Internal error: Oops - BUG: 0 [#1] SMP ARM
[  110.981331] Modules linked in:
[  110.984459] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0-rc6+ #84
[  110.990968] task: c0bbdaa0 ti: c0bb2000 task.ti: c0bb2000
[  110.996449] PC is at xen_unmap_single+0xc4/0xc8
[  111.001037] LR is at xen_unmap_single+0xc4/0xc8
[  111.005637] pc : [<c04c4670>]    lr : [<c04c4670>]    psr: 20010193
[  111.005637] sp : c0bb3d90  ip : 00000001  fp : b679a000
[  111.017274] r10: 00000200  r9 : 00000002  r8 : 00000002
[  111.022562] r7 : 00000000  r6 : 002b679a  r5 : 00000002  r4 : b679a000
[  111.029159] r3 : 0000075d  r2 : c0bba968  r1 : 20010193  r0 : 00000020
[  111.035757] Flags: nzCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[  111.043221] Control: 10c5387d  Table: 39fdc06a  DAC: 00000015
[  111.049036] Process swapper/0 (pid: 0, stack limit = 0xc0bb2250)
[  111.055112] Stack: (0xc0bb3d90 to 0xc0bb4000)
[  111.059540] 3d80:                                     c0bb3da4 c0268948 d9d8f980 dbbb2cc0
[  111.067786] 3da0: c0bb3dcc db0e6210 00208040 db717f00 00000001 00000001 00000002 db0e6210
[  111.076032] 3dc0: 00000000 db439c90 0000259c c04c48b0 00000200 00000002 00000000 d80b7f18
[  111.084278] 3de0: d8042058 db465088 db717f00 db464000 00000001 000000b0 00000001 c0575b1c
[  111.092533] 3e00: 00000000 c0cbfb40 c0cbfb40 c0295ca8 40010193 db465088 db464000 db4656c8
[  111.100770] 3e20: 00000001 c0575df0 00000000 00000017 db464000 c05761b8 db464000 db4656c8
[  111.109016] 3e40: 00000000 db464000 00000008 e0880100 00000001 db443810 db439b90 c058c2ac
[  111.117267] 3e60: 8007a120 00009896 00000000 00989680 00000000 00989680 0016e360 00000000
[  111.125517] 3e80: db439c90 e0880000 00000001 00000001 00000001 db439c90 0000259c c058ca40
[  111.133754] 3ea0: c0bba54c 80010193 d3d7ef03 c0bba44c 00000001 db447780 c0c065d0 00000000
[  111.142006] 3ec0: 00000000 00000073 db00b780 c0c9841c 00000001 c0282478 c0cbff98 d56e6afc
[  111.150246] 3ee0: 00000019 db00b780 c0c065d0 00000000 e0804000 c0bb0060 c0bb2000 00000000
[  111.158501] 3f00: c0842468 c02825b0 00000000 db00b780 c0c065d0 c02850d0 c0285028 00000073
[  111.166738] 3f20: 00000073 c0281cc8 c0bafc78 c020fb40 e080400c c0bba960 c0bb3f58 c0208910
[  111.174984] 3f40: c020fe44 c020fe48 60010013 ffffffff c0bb3f8c c02132c0 ffffffed 00000000
[  111.183230] 3f60: ffffffed c0220260 c0bba504 c0bba4a0 00000000 00000000 c0bb0060 c0bb2000
[  111.191485] 3f80: 00000000 c0842468 00000020 c0bb3fa0 c020fe44 c020fe48 60010013 ffffffff
[  111.199722] 3fa0: 00000000 c0277e5c c0bb3fb4 c0c9841a 00000000 c0c985c0 00000000 c0affba8
[  111.207973] 3fc0: ffffffff ffffffff c0aff5ec 00000000 00000000 c0b73eb8 c0c99214 c0bba484
[  111.216214] 3fe0: c0b73eb4 c0bbeb94 2020406a 413fc0f2 00000000 20208074 00000000 00000000
[  111.224472] [<c04c4670>] (xen_unmap_single) from [<c04c48b0>] (xen_swiotlb_unmap_sg_attrs+0x48/0x68)



-- 
Julien Grall


* Re: [linux-linus test] 30356: regressions - FAIL
  2014-09-23 17:31     ` Julien Grall
@ 2014-09-23 18:40       ` Ian Campbell
  2014-09-23 23:35         ` Julien Grall
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2014-09-23 18:40 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, xen.org, Stefano Stabellini

On Tue, 2014-09-23 at 18:31 +0100, Julien Grall wrote:
> I guess we will have to select LPAE when XEN is enabled, right? If
> it's the case that would mean the user won't be able to compile a
> Linux guest with short page table and Xen.
> 
> Any thoughts?

Things must work without in guest LPAE too, so something somewhere else
will need fixing.

Apart from restricting the user in an unwanted way requiring LPAE will
mean that practically no distro installer will work in a Xen guest.

Ian.


* Re: [linux-linus test] 30356: regressions - FAIL
  2014-09-23 18:40       ` Ian Campbell
@ 2014-09-23 23:35         ` Julien Grall
  2014-09-24  8:24           ` Ian Campbell
  0 siblings, 1 reply; 10+ messages in thread
From: Julien Grall @ 2014-09-23 23:35 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, xen.org, Stefano Stabellini



On 23/09/2014 19:40, Ian Campbell wrote:
> On Tue, 2014-09-23 at 18:31 +0100, Julien Grall wrote:
>> I guess we will have to select LPAE when XEN is enabled, right? If
>> it's the case that would mean the user won't be able to compile a
>> Linux guest with short page table and Xen.
>>
>> Any thoughts?
>
> Things must work without in guest LPAE too, so something somewhere else
> will need fixing.
>
> Apart from restricting the user in an unwanted way requiring LPAE will
> mean that practically no distro installer will work in a Xen guest.

Xen does an identity mapping for the host physical address into DOM0
for the grant mapping. DOM0 will use a scratch page (see commit 340720b
"xen/arm: reimplement xen_dma_unmap_page & friends") and map this
physical address.

That means on a platform with an address space larger than 32 bits, which
is the case on Midway, we have to handle 64-bit physical addresses in DOM0.
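
As a rough illustration of that constraint, the sketch below computes how
many bits an address needs and whether it fits a 32-bit physical address
type. The helper names (addr_bits_needed, fits_32bit_phys) are
hypothetical, and no actual Midway memory map is assumed here.

```c
#include <assert.h>   /* only needed for the checks in the usage note */
#include <stdint.h>

/* Number of bits needed to represent addr (0 when addr is 0). */
static inline unsigned int addr_bits_needed(uint64_t addr)
{
    unsigned int bits = 0;

    while (addr) {
        bits++;
        addr >>= 1;
    }
    return bits;
}

/* Non-zero when the highest address in use fits in 32 bits. */
static inline int fits_32bit_phys(uint64_t highest_addr)
{
    return addr_bits_needed(highest_addr) <= 32;
}
```

With these helpers, assert(fits_32bit_phys(0xffffffffULL)) holds, while
the first address past 4GiB, 0x100000000ULL, needs 33 bits and does not
fit.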

With the current implementation in Linux we can only use LPAE when a
guest is started. The distro installer will still be able to work with
short page tables.

The drawback is that we require LPAE in DOM0, and a different kernel
in the guest if the user doesn't want to use LPAE.

As the code is already pushed in Linux 3.17, I can't find a simpler
solution to fix Linux boot without requiring LPAE.

Regards,

-- 
Julien Grall


* Re: [linux-linus test] 30356: regressions - FAIL
  2014-09-23 23:35         ` Julien Grall
@ 2014-09-24  8:24           ` Ian Campbell
  2014-09-24 10:02             ` Stefano Stabellini
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2014-09-24  8:24 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, xen.org, Stefano Stabellini

On Wed, 2014-09-24 at 00:35 +0100, Julien Grall wrote:
> 
> On 23/09/2014 19:40, Ian Campbell wrote:
> > On Tue, 2014-09-23 at 18:31 +0100, Julien Grall wrote:
> >> I guess we will have to select LPAE when XEN is enabled, right? If
> >> it's the case that would mean the user won't be able to compile a
> >> Linux guest with short page table and Xen.
> >>
> >> Any thoughts?
> >
> > Things must work without in guest LPAE too, so something somewhere else
> > will need fixing.
> >
> > Apart from restricting the user in an unwanted way requiring LPAE will
> > mean that practically no distro installer will work in a Xen guest.
> 
> Xen does an identity mapping for the host physical address into DOM0
> for the grant mapping. DOM0 will use a scratch page (see commit 340720b
> "xen/arm: reimplement xen_dma_unmap_page & friends") and map this
> physical address.
> 
> That means on a platform with an address space larger than 32 bits, which
> is the case on Midway, we have to handle 64-bit physical addresses in DOM0.
> 
> With the current implementation in Linux we can only use LPAE when a 
> guest is started. The distro installer will still be able to work with
> short page table.
> 
> The drawback is we are requiring LPAE from DOM0 and a different kernel
> in the guest if the user doesn't want to use LPAE.
> 
> As the code is already pushed in Linux 3.17, I can't find a simpler
> solution to fix Linux boot without requiring LPAE.

We will have to try harder then, requiring LPAE simply isn't acceptable
IMHO.

Ian.

* Re: [linux-linus test] 30356: regressions - FAIL
  2014-09-24  8:24           ` Ian Campbell
@ 2014-09-24 10:02             ` Stefano Stabellini
  2014-09-24 10:15               ` Ian Campbell
  0 siblings, 1 reply; 10+ messages in thread
From: Stefano Stabellini @ 2014-09-24 10:02 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Julien Grall, xen.org, xen-devel, Stefano Stabellini

On Wed, 24 Sep 2014, Ian Campbell wrote:
> On Wed, 2014-09-24 at 00:35 +0100, Julien Grall wrote:
> > 
> > On 23/09/2014 19:40, Ian Campbell wrote:
> > > On Tue, 2014-09-23 at 18:31 +0100, Julien Grall wrote:
> > >> I guess we will have to select LPAE when XEN is enabled, right? If
> > >> it's the case that would mean the user won't be able to compile a
> > >> Linux guest with short page table and Xen.
> > >>
> > >> Any thoughts?
> > >
> > > Things must work without in guest LPAE too, so something somewhere else
> > > will need fixing.
> > >
> > > Apart from restricting the user in an unwanted way requiring LPAE will
> > > mean that practically no distro installer will work in a Xen guest.
> > 
> > Xen does an identity mapping for the host physical address into DOM0
> > for the grant mapping. DOM0 will use a scratch page (see commit 340720b
> > "xen/arm: reimplement xen_dma_unmap_page & friends") and map this
> > physical address.
> > 
> > That means on a platform with an address space larger than 32 bits, which
> > is the case on Midway, we have to handle 64-bit physical addresses in DOM0.
> > 
> > With the current implementation in Linux we can only use LPAE when a 
> > guest is started. The distro installer will still be able to work with
> > short page table.
> > 
> > The drawback is we are requiring LPAE from DOM0 and a different kernel
> > in the guest if the user doesn't want to use LPAE.
> > 
> > As the code is already pushed in Linux 3.17, I can't find a simpler
> > solution to fix Linux boot without requiring LPAE.
> 
> We will have to try harder then, requiring LPAE simply isn't acceptable
> IMHO.

I agree but the solution is not simple.

With the current scheme we would need to find a way to map pages at
64bit physical addresses in Dom0 without CONFIG_ARM_LPAE. Not sure if
that is possible.

Otherwise we would need to come up with an entirely new scheme.
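
For illustration, the obstacle can be modelled with a little arithmetic.
The sketch below is not kernel code. It assumes that phys_addr_t is 32
bits wide without CONFIG_ARM_LPAE and 64 bits wide with it; the helper
names and the example address are hypothetical.

```python
# Illustrative model only, not kernel code.
# Assumption: phys_addr_t is 32 bits without CONFIG_ARM_LPAE, 64 bits with it.
PHYS_ADDR_T_BITS_SHORT = 32
PHYS_ADDR_T_BITS_LPAE = 64


def to_phys_addr_t(addr, bits):
    """Model storing addr in an unsigned integer of the given width.

    Bits above the width are silently dropped, as with a C integer cast.
    """
    return addr & ((1 << bits) - 1)


def is_representable(addr, bits):
    """True if addr survives the conversion unchanged."""
    return to_phys_addr_t(addr, bits) == addr


# Hypothetical host physical address of a granted page, above 4 GiB.
granted_page = 0x2_0000_1000

for label, bits in (("short page table", PHYS_ADDR_T_BITS_SHORT),
                    ("LPAE", PHYS_ADDR_T_BITS_LPAE)):
    stored = to_phys_addr_t(granted_page, bits)
    print(f"{label}: stored={stored:#x} "
          f"representable={is_representable(granted_page, bits)}")
```

In this model the high bits are lost as soon as the address is stored in
the narrower type, which is why an identity-mapped grant above 4 GiB
cannot simply be handed to a Dom0 built with short page tables.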

* Re: [linux-linus test] 30356: regressions - FAIL
  2014-09-24 10:02             ` Stefano Stabellini
@ 2014-09-24 10:15               ` Ian Campbell
  0 siblings, 0 replies; 10+ messages in thread
From: Ian Campbell @ 2014-09-24 10:15 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Julien Grall, xen.org, xen-devel

On Wed, 2014-09-24 at 11:02 +0100, Stefano Stabellini wrote:
> On Wed, 24 Sep 2014, Ian Campbell wrote:
> > On Wed, 2014-09-24 at 00:35 +0100, Julien Grall wrote:
> > > 
> > > On 23/09/2014 19:40, Ian Campbell wrote:
> > > > On Tue, 2014-09-23 at 18:31 +0100, Julien Grall wrote:
> > > >> I guess we will have to select LPAE when XEN is enabled, right? If
> > > >> it's the case that would mean the user won't be able to compile a
> > > >> Linux guest with short page table and Xen.
> > > >>
> > > >> Any thoughts?
> > > >
> > > > Things must work without in guest LPAE too, so something somewhere else
> > > > will need fixing.
> > > >
> > > > Apart from restricting the user in an unwanted way requiring LPAE will
> > > > mean that practically no distro installer will work in a Xen guest.
> > > 
> > > Xen does an identity mapping for the host physical address into DOM0
> > > for the grant mapping. DOM0 will use a scratch page (see commit 340720b
> > > "xen/arm: reimplement xen_dma_unmap_page & friends") and map this
> > > physical address.
> > > 
> > > That means on a platform with an address space larger than 32 bits, which
> > > is the case on Midway, we have to handle 64-bit physical addresses in DOM0.
> > > 
> > > With the current implementation in Linux we can only use LPAE when a 
> > > guest is started. The distro installer will still be able to work with
> > > short page table.
> > > 
> > > The drawback is we are requiring LPAE from DOM0 and a different kernel
> > > in the guest if the user doesn't want to use LPAE.
> > > 
> > > As the code is already pushed in Linux 3.17, I can't find a simpler
> > > solution to fix Linux boot without requiring LPAE.
> > 
> > We will have to try harder then, requiring LPAE simply isn't acceptable
> > IMHO.
> 
> I agree but the solution is not simple.
> 
> With the current scheme we would need to find a way to map pages at
> 64bit physical addresses in Dom0 without CONFIG_ARM_LPAE. Not sure if
> that is possible.

I'm pretty certain it isn't...

> Otherwise we would need to come up with an entirely new scheme.

I fear this may end up being the case.

I've got a cold towel and a whiteboard waiting for you in the office...

Ian.

* Re: [linux-linus test] 30356: regressions - FAIL
  2014-09-23 11:45 ` Ian Campbell
  2014-09-23 15:20   ` Julien Grall
@ 2014-10-30 23:18   ` Julien Grall
  1 sibling, 0 replies; 10+ messages in thread
From: Julien Grall @ 2014-10-30 23:18 UTC (permalink / raw)
  To: Ian Campbell, Stefano Stabellini; +Cc: xen-devel, xen.org

Hi Ian,

On 23/09/2014 12:45, Ian Campbell wrote:
> On Tue, 2014-09-23 at 12:30 +0100, xen.org wrote:
>> flight 30356 linux-linus real [real]
>> http://www.chiark.greenend.org.uk/~xensrcts/logs/30356/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>   test-armhf-armhf-xl           9 guest-start               fail REGR. vs. 30019
>
> http://www.chiark.greenend.org.uk/~xensrcts/logs/30356/test-armhf-armhf-xl/info.html
>
> This has failed in every one of a couple of dozen runs since the end of
> August.
>
> 30019 was OK, so was 30032 but from 30050 onwards it is consistently
> failing.

Will sent a patch to LKML that fixes guest boot on Xen [1]. The failure
was caused by a missing cache flush in the Linux code under a specific
condition, which always occurs with multi_v7_defconfig.

I hope it will be merged into Linux soon so the ARM tests can pass again.

Regards,

[1] http://www.spinics.net/lists/arm-kernel/msg373680.html

-- 
Julien Grall
