xen-devel.lists.xenproject.org archive mirror
* [xen-unstable test] 161917: regressions - FAIL
@ 2021-05-13  3:56 osstest service owner
  2021-05-13 20:15 ` Regressed XSA-286, was " Andrew Cooper
  0 siblings, 1 reply; 13+ messages in thread
From: osstest service owner @ 2021-05-13  3:56 UTC
  To: xen-devel, osstest-admin

flight 161917 xen-unstable real [real]
http://logs.test-lab.xenproject.org/osstest/logs/161917/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-arm64-arm64-examine      8 reboot                   fail REGR. vs. 161898
 test-arm64-arm64-xl-thunderx  8 xen-boot                 fail REGR. vs. 161898
 test-arm64-arm64-xl-credit1   8 xen-boot                 fail REGR. vs. 161898
 test-arm64-arm64-xl-credit2   8 xen-boot                 fail REGR. vs. 161898
 test-arm64-arm64-xl           8 xen-boot                 fail REGR. vs. 161898

Tests which are failing intermittently (not blocking):
 test-xtf-amd64-amd64-3 92 xtf/test-pv32pae-xsa-286 fail in 161909 pass in 161917
 test-arm64-arm64-libvirt-xsm  8 xen-boot                   fail pass in 161909

Tests which did not succeed, but are not blocking:
 test-arm64-arm64-libvirt-xsm 15 migrate-support-check fail in 161909 never pass
 test-arm64-arm64-libvirt-xsm 16 saverestore-support-check fail in 161909 never pass
 test-amd64-amd64-xl-qemut-win7-amd64 19 guest-stop            fail like 161898
 test-armhf-armhf-libvirt     16 saverestore-support-check    fail  like 161898
 test-amd64-amd64-qemuu-nested-amd 20 debian-hvm-install/l1/l2 fail like 161898
 test-amd64-amd64-xl-qemuu-ws16-amd64 19 guest-stop            fail like 161898
 test-amd64-i386-xl-qemut-ws16-amd64 19 guest-stop             fail like 161898
 test-armhf-armhf-xl-rtds     18 guest-start/debian.repeat    fail  like 161898
 test-amd64-i386-xl-qemut-win7-amd64 19 guest-stop             fail like 161898
 test-armhf-armhf-libvirt-raw 15 saverestore-support-check    fail  like 161898
 test-amd64-amd64-xl-qemuu-win7-amd64 19 guest-stop            fail like 161898
 test-amd64-amd64-xl-qemut-ws16-amd64 19 guest-stop            fail like 161898
 test-amd64-i386-xl-qemuu-ws16-amd64 19 guest-stop             fail like 161898
 test-amd64-i386-xl-qemuu-win7-amd64 19 guest-stop             fail like 161898
 test-amd64-i386-xl-pvshim    14 guest-start                  fail   never pass
 test-amd64-i386-libvirt-xsm  15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-seattle  15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-seattle  16 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt     15 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt      15 migrate-support-check        fail   never pass
 test-amd64-amd64-libvirt-xsm 15 migrate-support-check        fail   never pass
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-xl-arndale  15 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-arndale  16 saverestore-support-check    fail   never pass
 test-arm64-arm64-xl-xsm      15 migrate-support-check        fail   never pass
 test-arm64-arm64-xl-xsm      16 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt-vhd 14 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-cubietruck 15 migrate-support-check        fail never pass
 test-armhf-armhf-xl-cubietruck 16 saverestore-support-check    fail never pass
 test-armhf-armhf-xl-credit2  15 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit2  16 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-multivcpu 15 migrate-support-check        fail  never pass
 test-armhf-armhf-xl-multivcpu 16 saverestore-support-check    fail  never pass
 test-armhf-armhf-xl-rtds     15 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-rtds     16 saverestore-support-check    fail   never pass
 test-armhf-armhf-libvirt     15 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          15 migrate-support-check        fail   never pass
 test-armhf-armhf-xl          16 saverestore-support-check    fail   never pass
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm 13 migrate-support-check fail never pass
 test-armhf-armhf-libvirt-raw 14 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      14 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-vhd      15 saverestore-support-check    fail   never pass
 test-armhf-armhf-xl-credit1  15 migrate-support-check        fail   never pass
 test-armhf-armhf-xl-credit1  16 saverestore-support-check    fail   never pass
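
[Editor's illustration] The result lines above follow a regular layout: job name, step number, test id, status, then a free-form qualifier such as "REGR. vs. 161898" or "never pass". A minimal sketch of how such lines could be split into fields follows. The names `ResultLine`, `parse_result_line` and `is_regression` are hypothetical and are not part of osstest; the layout is inferred only from the lines in this report.

```python
import re
from typing import NamedTuple, Optional


class ResultLine(NamedTuple):
    """One row of the report (hypothetical structure, not osstest's own)."""
    job: str
    step: int
    testid: str
    status: str
    detail: str


# job, step number, test id, status, then whatever qualifier remains.
_LINE_RE = re.compile(
    r"^\s*(?P<job>\S+)\s+(?P<step>\d+)\s+(?P<testid>\S+)"
    r"\s+(?P<status>\S+)\s*(?P<detail>.*)$"
)


def parse_result_line(line: str) -> Optional[ResultLine]:
    """Return the parsed fields, or None for headings and other text."""
    m = _LINE_RE.match(line)
    if m is None:
        return None
    return ResultLine(
        job=m["job"],
        step=int(m["step"]),
        testid=m["testid"],
        status=m["status"],
        detail=m["detail"].strip(),
    )


def is_regression(result: ResultLine) -> bool:
    """Blocking regressions carry a 'REGR. vs. <flight>' qualifier."""
    return result.detail.startswith("REGR.")
```

Section headings such as "Regressions :-(" have no numeric second field, so the sketch returns None for them rather than guessing.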

version targeted for testing:
 xen                  d4fb5f166c2bfbaf9ba0de69da0d411288f437a9
baseline version:
 xen                  982c89ed527bc5b0ffae5da9fd33f9d2d1528f06

Last test of basis   161898  2021-05-10 19:06:50 Z    2 days
Testing same since   161904  2021-05-11 10:00:22 Z    1 days    3 attempts

------------------------------------------------------------
People who touched revisions under test:
  Julien Grall <jgrall@amazon.com>
  Michal Orzel <michal.orzel@arm.com>
  Volodymyr Babchuk <volodymyr_babchuk@epam.com>

jobs:
 build-amd64-xsm                                              pass    
 build-arm64-xsm                                              pass    
 build-i386-xsm                                               pass    
 build-amd64-xtf                                              pass    
 build-amd64                                                  pass    
 build-arm64                                                  pass    
 build-armhf                                                  pass    
 build-i386                                                   pass    
 build-amd64-libvirt                                          pass    
 build-arm64-libvirt                                          pass    
 build-armhf-libvirt                                          pass    
 build-i386-libvirt                                           pass    
 build-amd64-prev                                             pass    
 build-i386-prev                                              pass    
 build-amd64-pvops                                            pass    
 build-arm64-pvops                                            pass    
 build-armhf-pvops                                            pass    
 build-i386-pvops                                             pass    
 test-xtf-amd64-amd64-1                                       pass    
 test-xtf-amd64-amd64-2                                       pass    
 test-xtf-amd64-amd64-3                                       pass    
 test-xtf-amd64-amd64-4                                       pass    
 test-xtf-amd64-amd64-5                                       pass    
 test-amd64-amd64-xl                                          pass    
 test-amd64-coresched-amd64-xl                                pass    
 test-arm64-arm64-xl                                          fail    
 test-armhf-armhf-xl                                          pass    
 test-amd64-i386-xl                                           pass    
 test-amd64-coresched-i386-xl                                 pass    
 test-amd64-amd64-libvirt-qemuu-debianhvm-amd64-xsm           pass    
 test-amd64-i386-libvirt-qemuu-debianhvm-amd64-xsm            pass    
 test-amd64-amd64-xl-qemut-stubdom-debianhvm-amd64-xsm        pass    
 test-amd64-i386-xl-qemut-stubdom-debianhvm-amd64-xsm         pass    
 test-amd64-amd64-xl-qemut-debianhvm-i386-xsm                 pass    
 test-amd64-i386-xl-qemut-debianhvm-i386-xsm                  pass    
 test-amd64-amd64-xl-qemuu-debianhvm-i386-xsm                 pass    
 test-amd64-i386-xl-qemuu-debianhvm-i386-xsm                  pass    
 test-amd64-amd64-libvirt-xsm                                 pass    
 test-arm64-arm64-libvirt-xsm                                 fail    
 test-amd64-i386-libvirt-xsm                                  pass    
 test-amd64-amd64-xl-xsm                                      pass    
 test-arm64-arm64-xl-xsm                                      pass    
 test-amd64-i386-xl-xsm                                       pass    
 test-amd64-amd64-qemuu-nested-amd                            fail    
 test-amd64-amd64-xl-pvhv2-amd                                pass    
 test-amd64-i386-qemut-rhel6hvm-amd                           pass    
 test-amd64-i386-qemuu-rhel6hvm-amd                           pass    
 test-amd64-amd64-dom0pvh-xl-amd                              pass    
 test-amd64-amd64-xl-qemut-debianhvm-amd64                    pass    
 test-amd64-i386-xl-qemut-debianhvm-amd64                     pass    
 test-amd64-amd64-xl-qemuu-debianhvm-amd64                    pass    
 test-amd64-i386-xl-qemuu-debianhvm-amd64                     pass    
 test-amd64-i386-freebsd10-amd64                              pass    
 test-amd64-amd64-qemuu-freebsd11-amd64                       pass    
 test-amd64-amd64-qemuu-freebsd12-amd64                       pass    
 test-amd64-amd64-xl-qemuu-ovmf-amd64                         pass    
 test-amd64-i386-xl-qemuu-ovmf-amd64                          pass    
 test-amd64-amd64-xl-qemut-win7-amd64                         fail    
 test-amd64-i386-xl-qemut-win7-amd64                          fail    
 test-amd64-amd64-xl-qemuu-win7-amd64                         fail    
 test-amd64-i386-xl-qemuu-win7-amd64                          fail    
 test-amd64-amd64-xl-qemut-ws16-amd64                         fail    
 test-amd64-i386-xl-qemut-ws16-amd64                          fail    
 test-amd64-amd64-xl-qemuu-ws16-amd64                         fail    
 test-amd64-i386-xl-qemuu-ws16-amd64                          fail    
 test-armhf-armhf-xl-arndale                                  pass    
 test-amd64-amd64-xl-credit1                                  pass    
 test-arm64-arm64-xl-credit1                                  fail    
 test-armhf-armhf-xl-credit1                                  pass    
 test-amd64-amd64-xl-credit2                                  pass    
 test-arm64-arm64-xl-credit2                                  fail    
 test-armhf-armhf-xl-credit2                                  pass    
 test-armhf-armhf-xl-cubietruck                               pass    
 test-amd64-amd64-xl-qemuu-dmrestrict-amd64-dmrestrict        pass    
 test-amd64-i386-xl-qemuu-dmrestrict-amd64-dmrestrict         pass    
 test-amd64-amd64-examine                                     pass    
 test-arm64-arm64-examine                                     fail    
 test-armhf-armhf-examine                                     pass    
 test-amd64-i386-examine                                      pass    
 test-amd64-i386-freebsd10-i386                               pass    
 test-amd64-amd64-qemuu-nested-intel                          pass    
 test-amd64-amd64-xl-pvhv2-intel                              pass    
 test-amd64-i386-qemut-rhel6hvm-intel                         pass    
 test-amd64-i386-qemuu-rhel6hvm-intel                         pass    
 test-amd64-amd64-dom0pvh-xl-intel                            pass    
 test-amd64-amd64-libvirt                                     pass    
 test-armhf-armhf-libvirt                                     pass    
 test-amd64-i386-libvirt                                      pass    
 test-amd64-amd64-livepatch                                   pass    
 test-amd64-i386-livepatch                                    pass    
 test-amd64-amd64-migrupgrade                                 pass    
 test-amd64-i386-migrupgrade                                  pass    
 test-amd64-amd64-xl-multivcpu                                pass    
 test-armhf-armhf-xl-multivcpu                                pass    
 test-amd64-amd64-pair                                        pass    
 test-amd64-i386-pair                                         pass    
 test-amd64-amd64-libvirt-pair                                pass    
 test-amd64-i386-libvirt-pair                                 pass    
 test-amd64-amd64-amd64-pvgrub                                pass    
 test-amd64-amd64-i386-pvgrub                                 pass    
 test-amd64-amd64-xl-pvshim                                   pass    
 test-amd64-i386-xl-pvshim                                    fail    
 test-amd64-amd64-pygrub                                      pass    
 test-amd64-amd64-xl-qcow2                                    pass    
 test-armhf-armhf-libvirt-raw                                 pass    
 test-amd64-i386-xl-raw                                       pass    
 test-amd64-amd64-xl-rtds                                     pass    
 test-armhf-armhf-xl-rtds                                     fail    
 test-arm64-arm64-xl-seattle                                  pass    
 test-amd64-amd64-xl-qemuu-debianhvm-amd64-shadow             pass    
 test-amd64-i386-xl-qemuu-debianhvm-amd64-shadow              pass    
 test-amd64-amd64-xl-shadow                                   pass    
 test-amd64-i386-xl-shadow                                    pass    
 test-arm64-arm64-xl-thunderx                                 fail    
 test-amd64-amd64-libvirt-vhd                                 pass    
 test-armhf-armhf-xl-vhd                                      pass    


------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


Not pushing.

------------------------------------------------------------
commit d4fb5f166c2bfbaf9ba0de69da0d411288f437a9
Author: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Date:   Fri May 7 01:39:47 2021 +0000

    optee: enable OPTEE_SMC_SEC_CAP_MEMREF_NULL capability
    
    OP-TEE mediator already has support for NULL memory references. It
    was added in patch 0dbed3ad336 ("optee: allow plain TMEM buffers with
    NULL address"). But it does not propagate
    OPTEE_SMC_SEC_CAP_MEMREF_NULL capability flag to a guest, so a
    well-behaved guest can't use this feature.
    
    Note: linux optee driver honors this capability flag when handling
    buffers from userspace clients, but ignores it when working with
    internal calls. For instance, __optee_enumerate_devices() function
    uses NULL argument to get buffer size hint from OP-TEE. This was the
    reason, why "optee: allow plain TMEM buffers with NULL address" was
    introduced in the first place.
    
    This patch adds the mentioned capability to list of known
    capabilities. From Linux point of view it means that userspace clients
    can use this feature, which is confirmed by OP-TEE test suite:
    
    * regression_1025 Test memref NULL and/or 0 bytes size
    o regression_1025.1 Invalid NULL buffer memref registration
      regression_1025.1 OK
    o regression_1025.2 Input/Output MEMREF Buffer NULL - Size 0 bytes
      regression_1025.2 OK
    o regression_1025.3 Input MEMREF Buffer NULL - Size non 0 bytes
      regression_1025.3 OK
    o regression_1025.4 Input MEMREF Buffer NULL over PTA invocation
      regression_1025.4 OK
      regression_1025 OK
    
    Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
    Acked-by: Julien Grall <jgrall@amazon.com>

commit 30f34457b20c78b2862b2b16cb26cb4f10a667ad
Author: Julien Grall <jgrall@amazon.com>
Date:   Mon May 10 18:28:16 2021 +0100

    tools/xenstore: Fix indentation in the header of xenstored_control.c
    
    Commit e867af081d94 "tools/xenstore: save new binary for live update"
    seemed to have spuriously changed the indentation of the first line of
    the copyright header.
    
    The previous indentation is re-instated so all the lines are indented
    the same.
    
    Reported-by: Bjoern Doebel <doebel@amazon.com>
    Signed-off-by: Julien Grall <jgrall@amazon.com>
    Reviewed-by: Juergen Gross <jgross@suse.com>

commit 7e71b1e0affa83c0976c832f254276eeb6e6575c
Author: Julien Grall <jgrall@amazon.com>
Date:   Thu May 6 17:12:23 2021 +0100

    tools/xenstored: Prevent a buffer overflow in dump_state_node_perms()
    
    ASAN reported one issue when Live Updating Xenstored:
    
    =================================================================
    ==873==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffc194f53e0 at pc 0x555c6b323292 bp 0x7ffc194f5340 sp 0x7ffc194f5338
    WRITE of size 1 at 0x7ffc194f53e0 thread T0
        #0 0x555c6b323291 in dump_state_node_perms xen/tools/xenstore/xenstored_core.c:2468
        #1 0x555c6b32746e in dump_state_special_node xen/tools/xenstore/xenstored_domain.c:1257
        #2 0x555c6b32a702 in dump_state_special_nodes xen/tools/xenstore/xenstored_domain.c:1273
        #3 0x555c6b32ddb3 in lu_dump_state xen/tools/xenstore/xenstored_control.c:521
        #4 0x555c6b32e380 in do_lu_start xen/tools/xenstore/xenstored_control.c:660
        #5 0x555c6b31b461 in call_delayed xen/tools/xenstore/xenstored_core.c:278
        #6 0x555c6b32275e in main xen/tools/xenstore/xenstored_core.c:2357
        #7 0x7f95eecf3d09 in __libc_start_main ../csu/libc-start.c:308
        #8 0x555c6b3197e9 in _start (/usr/local/sbin/xenstored+0xc7e9)
    
    Address 0x7ffc194f53e0 is located in stack of thread T0 at offset 80 in frame
        #0 0x555c6b32713e in dump_state_special_node xen/tools/xenstore/xenstored_domain.c:1232
    
      This frame has 2 object(s):
        [32, 40) 'head' (line 1233)
        [64, 80) 'sn' (line 1234) <== Memory access at offset 80 overflows this variable
    
    This is happening because the callers are passing a pointer to a variable
    allocated on the stack. However, the field perms is a dynamic array, so
    Xenstored will end up reading outside of the variable.
    
    Rework the code so the permissions are written one by one in the fd.
    
    Fixes: ed6eebf17d2c ("tools/xenstore: dump the xenstore state for live update")
    Signed-off-by: Julien Grall <jgrall@amazon.com>
    Reviewed-by: Juergen Gross <jgross@suse.com>
    Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>

commit 3f568354a95ee2f0c9c553efb94c734fa6848af0
Author: Michal Orzel <michal.orzel@arm.com>
Date:   Wed May 5 09:43:07 2021 +0200

    arm/time,vtimer: Get rid of READ/WRITE_SYSREG32
    
    AArch64 registers are 64bit whereas AArch32 registers
    are 32bit or 64bit. MSR/MRS are expecting 64bit values thus
    we should get rid of helpers READ/WRITE_SYSREG32
    in favour of using READ/WRITE_SYSREG.
    We should also use register_t type when reading sysregs
    which can correspond to uint64_t or uint32_t.
    Even though many AArch64 registers have upper 32bit reserved
    it does not mean that they can't be widened in the future.
    
    Modify type of vtimer structure's member: ctl to register_t.
    
    Add macro CNTFRQ_MASK containing mask for timer clock frequency
    field of CNTFRQ_EL0 register.
    
    Modify CNTx_CTL_* macros to return unsigned long instead of
    unsigned int as ctl is now of type register_t.
    
    Signed-off-by: Michal Orzel <michal.orzel@arm.com>
    Acked-by: Julien Grall <jgrall@amazon.com>

commit 86faae561cd8eee819e0f42ba7a18dd180aa49d1
Author: Michal Orzel <michal.orzel@arm.com>
Date:   Wed May 5 09:43:06 2021 +0200

    arm/page: Get rid of READ/WRITE_SYSREG32
    
    AArch64 registers are 64bit whereas AArch32 registers
    are 32bit or 64bit. MSR/MRS are expecting 64bit values thus
    we should get rid of helpers READ/WRITE_SYSREG32
    in favour of using READ/WRITE_SYSREG.
    We should also use register_t type when reading sysregs
    which can correspond to uint64_t or uint32_t.
    Even though many AArch64 registers have upper 32bit reserved
    it does not mean that they can't be widened in the future.
    
    Modify accesses to CTR_EL0 to use READ/WRITE_SYSREG.
    
    Signed-off-by: Michal Orzel <michal.orzel@arm.com>
    Reviewed-by: Julien Grall <jgrall@amazon.com>

commit 25e5d0c412e0d7420f2aa7fdd71cc39d8ed6c528
Author: Michal Orzel <michal.orzel@arm.com>
Date:   Wed May 5 09:43:05 2021 +0200

    xen/arm: Always access SCTLR_EL2 using READ/WRITE_SYSREG()
    
    The Armv8 specification describes the system register as a 64-bit value
    on AArch64 and 32-bit value on AArch32 (same as ARMv7).
    
    Unfortunately, Xen is accessing the system registers using
    READ/WRITE_SYSREG32() which means the top 32-bit are clobbered.
    
    This is only a latent bug so far because Xen will not yet use the top
    32-bit.
    
    There is also no change in behavior because arch/arm/arm64/head.S will
    initialize SCTLR_EL2 to a sane value with the top 32-bit zeroed.
    
    Signed-off-by: Michal Orzel <michal.orzel@arm.com>
    Acked-by: Julien Grall <jgrall@amazon.com>
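
[Editor's illustration] The "top 32-bit are clobbered" remark in the commit above can be pictured with a small model. This is only a sketch in Python, not the Xen macros: the function names are hypothetical, and it assumes a 32-bit write helper zero-extends its argument before the 64-bit register is written.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def read_sysreg32(reg_value: int) -> int:
    """Model of a 32-bit read helper: only the low 32 bits are returned."""
    return reg_value & MASK32


def write_sysreg32(value: int) -> int:
    """Model of a 32-bit write helper (assumed to zero-extend).

    Returns the resulting 64-bit register contents.
    """
    return value & MASK32


def read_modify_write32(reg_value: int, set_bits: int) -> int:
    """Read-modify-write through the 32-bit helpers; the top half is lost."""
    return write_sysreg32(read_sysreg32(reg_value) | set_bits)


def read_modify_write64(reg_value: int, set_bits: int) -> int:
    """Read-modify-write at full register width; the top half is preserved."""
    return (reg_value | set_bits) & MASK64
```

With a register value that has bit 40 set, the 32-bit path returns a value with bit 40 cleared while the 64-bit path keeps it. That is the latent bug the commit describes: harmless while the upper half is zero, wrong as soon as it is not.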

commit 8eb7cc0465fa228064e807aad51eb7428d6d3199
Author: Michal Orzel <michal.orzel@arm.com>
Date:   Wed May 5 09:43:04 2021 +0200

    arm/p2m: Get rid of READ/WRITE_SYSREG32
    
    AArch64 registers are 64bit whereas AArch32 registers
    are 32bit or 64bit. MSR/MRS are expecting 64bit values thus
    we should get rid of helpers READ/WRITE_SYSREG32
    in favour of using READ/WRITE_SYSREG.
    We should also use register_t type when reading sysregs
    which can correspond to uint64_t or uint32_t.
    Even though many AArch64 registers have upper 32bit reserved
    it does not mean that they can't be widened in the future.
    
    Modify type of vtcr to register_t.
    
    Signed-off-by: Michal Orzel <michal.orzel@arm.com>
    Reviewed-by: Julien Grall <jgrall@amazon.com>

commit 78e67c99eb3f90c22c8c6ee282ec3a43d2ddccb5
Author: Michal Orzel <michal.orzel@arm.com>
Date:   Wed May 5 09:43:03 2021 +0200

    arm/gic: Get rid of READ/WRITE_SYSREG32
    
    AArch64 registers are 64bit whereas AArch32 registers
    are 32bit or 64bit. MSR/MRS are expecting 64bit values thus
    we should get rid of helpers READ/WRITE_SYSREG32
    in favour of using READ/WRITE_SYSREG.
    We should also use register_t type when reading sysregs
    which can correspond to uint64_t or uint32_t.
    Even though many AArch64 registers have upper 32bit reserved
    it does not mean that they can't be widened in the future.
    
    Modify types of following members of struct gic_v3 to register_t:
    -vmcr
    -sre_el1
    -apr0
    -apr1
    
    Add new macro GICC_IAR_INTID_MASK containing the mask
    for INTID field of ICC_IAR0/1_EL1 register as only the first 23-bits
    of IAR contains the interrupt number. The rest are RES0.
    Therefore, take the opportunity to mask the bits [23:31] as
    they should be used for an IRQ number (we don't know how the top bits
    will be used).
    
    Signed-off-by: Michal Orzel <michal.orzel@arm.com>
    Acked-by: Julien Grall <jgrall@amazon.com>

commit d55afb1acaffc6047af3cabc3ef4442f313bee2c
Author: Michal Orzel <michal.orzel@arm.com>
Date:   Wed May 5 09:43:02 2021 +0200

    arm/gic: Remove member hcr of structure gic_v3
    
    ... as it is never used even in the patch introducing it.
    
    Signed-off-by: Michal Orzel <michal.orzel@arm.com>
    Acked-by: Julien Grall <jgrall@amazon.com>

commit b80470c84553808fef3a6803000ceee8a100e63c
Author: Michal Orzel <michal.orzel@arm.com>
Date:   Wed May 5 09:43:01 2021 +0200

    arm: Modify type of actlr to register_t
    
    AArch64 registers are 64bit whereas AArch32 registers
    are 32bit or 64bit. MSR/MRS are expecting 64bit values thus
    we should get rid of helpers READ/WRITE_SYSREG32
    in favour of using READ/WRITE_SYSREG.
    We should also use register_t type when reading sysregs
    which can correspond to uint64_t or uint32_t.
    Even though many AArch64 registers have upper 32bit reserved
    it does not mean that they can't be widened in the future.
    
    ACTLR_EL1 system register bits are implementation defined
    which means it is possibly a latent bug on current HW as the CPU
    implementer may already have decided to use the top 32bit.
    
    Signed-off-by: Michal Orzel <michal.orzel@arm.com>
    Reviewed-by: Julien Grall <jgrall@amazon.com>

commit 3fd8336bc599788e5a52a7e63e833b6f03d79fd5
Author: Michal Orzel <michal.orzel@arm.com>
Date:   Wed May 5 09:43:00 2021 +0200

    arm/domain: Get rid of READ/WRITE_SYSREG32
    
    AArch64 registers are 64bit whereas AArch32 registers
    are 32bit or 64bit. MSR/MRS are expecting 64bit values thus
    we should get rid of helpers READ/WRITE_SYSREG32
    in favour of using READ/WRITE_SYSREG.
    We should also use register_t type when reading sysregs
    which can correspond to uint64_t or uint32_t.
    Even though many AArch64 registers have upper 32bit reserved
    it does not mean that they can't be widened in the future.
    
    Modify type of register cntkctl to register_t.
    
    Thumbee registers are only usable by a 32-bit domain and therefore
    we can just store the bottom 32-bit (IOW there is no type change).
    In fact, this could technically be restricted to Armv7 HW (the
    support was dropped retrospectively in Armv8) but leave it as-is
    for now.
    
    Signed-off-by: Michal Orzel <michal.orzel@arm.com>
    Reviewed-by: Julien Grall <jgrall@amazon.com>

commit 8990f0eaca139364091109389416455f4f78cd65
Author: Michal Orzel <michal.orzel@arm.com>
Date:   Wed May 5 09:42:59 2021 +0200

    arm64/vfp: Get rid of READ/WRITE_SYSREG32
    
    AArch64 registers are 64bit whereas AArch32 registers
    are 32bit or 64bit. MSR/MRS are expecting 64bit values thus
    we should get rid of helpers READ/WRITE_SYSREG32
    in favour of using READ/WRITE_SYSREG.
    We should also use register_t type when reading sysregs
    which can correspond to uint64_t or uint32_t.
    Even though many AArch64 registers have upper 32bit reserved
    it does not mean that they can't be widened in the future.
    
    Modify type of FPCR, FPSR, FPEXC32_EL2 to register_t.
    
    Signed-off-by: Michal Orzel <michal.orzel@arm.com>
    Reviewed-by: Julien Grall <jgrall@amazon.com>
(qemu changes not included)



* Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-05-13  3:56 [xen-unstable test] 161917: regressions - FAIL osstest service owner
@ 2021-05-13 20:15 ` Andrew Cooper
  2021-05-17  8:43   ` Jan Beulich
  2021-06-16  8:48   ` Jan Beulich
  0 siblings, 2 replies; 13+ messages in thread
From: Andrew Cooper @ 2021-05-13 20:15 UTC
  To: osstest service owner, xen-devel, Ian Jackson, Jan Beulich,
	Roger Pau Monné

On 13/05/2021 04:56, osstest service owner wrote:
> flight 161917 xen-unstable real [real]
> http://logs.test-lab.xenproject.org/osstest/logs/161917/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-arm64-arm64-examine      8 reboot                   fail REGR. vs. 161898
>  test-arm64-arm64-xl-thunderx  8 xen-boot                 fail REGR. vs. 161898
>  test-arm64-arm64-xl-credit1   8 xen-boot                 fail REGR. vs. 161898
>  test-arm64-arm64-xl-credit2   8 xen-boot                 fail REGR. vs. 161898
>  test-arm64-arm64-xl           8 xen-boot                 fail REGR. vs. 161898

I reported these on IRC, and Julien/Stefano have already committed a fix.

> Tests which are failing intermittently (not blocking):
>  test-xtf-amd64-amd64-3 92 xtf/test-pv32pae-xsa-286 fail in 161909 pass in 161917

While noticing the ARM issue above, I also spotted this one by chance. 
There are two issues.

First, I have reverted bed7e6cad30 and edcfce55917.  The XTF test is
correct, and they really do reintroduce XSA-286.  It is a miracle of
timing that we don't need an XSA/CVE against Xen 4.15.

Given that I was unhappy with the changes in the first place, I don't
particularly want to see an attempt to resurrect them.  I did not find
the claim that they were a perf improvement in the first place very
convincing, and the XTF test demonstrates that the reasoning about their
safety was incorrect.


Second, the unexplained OSSTest behaviour.

When I repro'd this on pinot1, test-pv32pae-xsa-286 failing was totally
deterministic and repeatable (I tried 100 times because the test is a
fraction of a second).

From the log trawling which Ian already did, the first recorded failure
was flight 160912 on April 11th.  All failures (12, but this number is a
few flights old now) were on pinot*.

What would be interesting to see is whether there have been any passes
on pinot since 160912.

I can't see any reason why the test would be reliable for me, but
unreliable for OSSTest, so I'm wondering whether it is actually
reliable, and something is wrong with the stickiness heuristic.

~Andrew
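
[Editor's illustration] The question raised above (have there been any passes on pinot since flight 160912?) amounts to a filter over per-flight step results. The sketch below assumes such results are available as simple records. `StepResult` and `passes_on_host_family` are hypothetical names, and nothing here reflects how osstest actually stores or queries its history.

```python
from typing import Iterable, List, NamedTuple


class StepResult(NamedTuple):
    """One outcome of a test step (hypothetical record layout)."""
    flight: int
    host: str
    status: str  # "pass" or "fail"


def passes_on_host_family(
    results: Iterable[StepResult],
    host_prefix: str,
    since_flight: int,
) -> List[StepResult]:
    """Return the passing results on hosts matching host_prefix,
    restricted to flights numbered since_flight or later."""
    return [
        r
        for r in results
        if r.flight >= since_flight
        and r.host.startswith(host_prefix)
        and r.status == "pass"
    ]
```

Called with `host_prefix="pinot"` and `since_flight=160912`, an empty list would suggest that the test fails reliably on those hosts and that the intermittent classification comes from host selection rather than from the test itself.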




* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-05-13 20:15 ` Regressed XSA-286, was " Andrew Cooper
@ 2021-05-17  8:43   ` Jan Beulich
  2021-05-17 10:59     ` Jan Beulich
  2021-06-16  8:48   ` Jan Beulich
  1 sibling, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2021-05-17  8:43 UTC
  To: Andrew Cooper
  Cc: osstest service owner, Roger Pau Monné, xen-devel, Ian Jackson

On 13.05.2021 22:15, Andrew Cooper wrote:
> On 13/05/2021 04:56, osstest service owner wrote:
>> Tests which are failing intermittently (not blocking):
>>  test-xtf-amd64-amd64-3 92 xtf/test-pv32pae-xsa-286 fail in 161909 pass in 161917
> 
> While noticing the ARM issue above, I also spotted this one by chance. 
> There are two issues.
> 
> First, I have reverted bed7e6cad30 and edcfce55917.  The XTF test is
> correct, and they really do reintroduce XSA-286.  It is a miracle of
> timing that we don't need an XSA/CVE against Xen 4.15.

I have to admit that from the description in the revert (on top of
what you say here) it does not really become clear to me what is
wrong with _either_ of these changes:

"The TLB flushing is for Xen's correctness, not the guest's."

XSA-286 was solely about guest correctness, which was broken by Xen's
behavior. Hence we're still only talking about guest observable
behavior here.

"The text in c/s bed7e6cad30 is technically correct, from the guests
 point of view, but clearly false as far as XSA-286 is concerned."

As a result I also don't understand this, nor the actual reason why
you did revert both, rather than just ...

"That said, it is edcfce55917 which introduced the regression, which
 demonstrates that the reasoning is flawed."

... this. Furthermore you merely state an observation here, without
going into any detail as to what's wrong with the reasoning, and
hence why it is the change that's wrong and the test that's correct
(and no issue elsewhere). Don't get me wrong - I'm not ruling out that
you're right, but you fail to explain things properly. I can't see
how avoiding a flush for a page table which isn't hooked up anywhere
(and which hence isn't accessible via lookups through the linear
page tables) can have caused a problem (except perhaps uncover an
issue, e.g. a missing flush, elsewhere). Nor can I see how the XTF
test would trigger the flush avoidance, as it doesn't play with
free floating page tables. Plus this change affects 64-bit guests
as much as 32-bit ones, yet no (apparent) regression could be seen
there.

Similarly for the other change: Since only guest perspective matters,
the flush ought to be fine to defer until the guest actually reloads
CR3; until then using either the stale or updated linear page tables
is acceptable, and guests need to not rely on either, just like would
be the case on bare metal (and there it's even stronger: an OS can
rely upon the prior page tables to continue to be used, as the PDPTEs
get reloaded _only_ during CR3 loads; mimicking this for PV would be
not exactly trivial, I think). And I notice that the XTF test
exercises an L3 entry update without a subsequent CR3 write, which
is wrong for PAE. (I therefore suspect it is bed7e6cad30 which has
caused the test failure, not edcfce55917 as you have said in the
description of the revert.)
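
For illustration only, the difference between the two behaviours discussed above can be sketched as a toy Python model. This is not Xen or XTF code: the class, method names and table labels are hypothetical, and the model assumes only what is stated in this thread, namely that legacy PAE latches the four L3 entries (PDPTEs) on a CR3 load, whereas other paging modes read them on demand during a walk.

```python
class ToyPaeCpu:
    """Toy model of the top paging level in 32-bit PAE mode (hypothetical)."""

    def __init__(self, l3_table):
        self.l3_table = l3_table      # the four L3 entries (PDPTEs) in memory
        self.pdpte_regs = [None] * 4  # copies latched by the CPU
        self.load_cr3()

    def load_cr3(self):
        # Legacy PAE: all four PDPTEs are latched when CR3 is written.
        self.pdpte_regs = list(self.l3_table)

    def walk_legacy_pae(self, va):
        # Uses the latched copy, so in-memory L3 updates are not seen.
        return self.pdpte_regs[(va >> 30) & 3]

    def walk_on_demand(self, va):
        # Reads the L3 entry from memory on every walk.
        return self.l3_table[(va >> 30) & 3]


l3 = ["L2_A", "L2_B", "L2_C", "L2_D"]
cpu = ToyPaeCpu(l3)
va = 0x40000000           # bits 31:30 select L3 slot 1

l3[1] = "L2_NEW"          # L3 entry update without a subsequent CR3 write
before = (cpu.walk_legacy_pae(va), cpu.walk_on_demand(va))

cpu.load_cr3()            # the CR3 write makes the update visible everywhere
after = (cpu.walk_legacy_pae(va), cpu.walk_on_demand(va))

print(before)  # ('L2_B', 'L2_NEW')
print(after)   # ('L2_NEW', 'L2_NEW')
```

In this sketch a guest that updates an L3 entry and skips the CR3 write sees the old mapping under the latched model and the new one under the on-demand model, which is why it can rely on neither.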

> Given that I was unhappy with the changes in the first place, I don't
> particularly want to see an attempt to resurrect them.  I did not find
> the claim that they were a perf improvement in the first place very
> convincing, and the XTF test demonstrates that the reasoning about their
> safety was incorrect.

Interesting: Where did you voice your unhappiness? All I can find on
that entire series' thread is a reply of yours on a post-commit-
message remark regarding a comment you had introduced with the 286
fix. All other discussion there was between Roger and me.

Additionally I don't see why you treated this as an emergency and
reverted without posting a patch and getting an ack.

> Second, the unexplained OSSTest behaviour.
> 
> When I repro'd this on pinot1, test-pv32pae-xsa-286 failing was totally
> deterministic and repeatable (I tried 100 times because the test is a
> fraction of a second).
> 
> From the log trawling which Ian already did, the first recorded failure
> was flight 160912 on April 11th.  All failures (12, but this number is a
> few flights old now) were on pinot*.
> 
> What would be interesting to see is whether there have been any passes
> on pinot since 160912.
> 
> I can't see any reason why the test would be reliable for me, but
> unreliable for OSSTest, so I'm wondering whether it is actually
> reliable, and something is wrong with the stickiness heuristic.

Isn't (un)reliability of this test, besides the sensitivity to IRQs
and context switches, tied to hardware behavior, in particular TLB
capacity and replacement policy? Aiui the test has

    xtf_success("Success: Probably not vulnerable to XSA-286\n");

for the combination of all of these reasons.

Jan



* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-05-17  8:43   ` Jan Beulich
@ 2021-05-17 10:59     ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2021-05-17 10:59 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: osstest service owner, Roger Pau Monné, xen-devel, Ian Jackson

On 17.05.2021 10:43, Jan Beulich wrote:
> On 13.05.2021 22:15, Andrew Cooper wrote:
>> Second, the unexplained OSSTest behaviour.
>>
>> When I repro'd this on pinot1, test-pv32pae-xsa-286 failing was totally
>> deterministic and repeatable (I tried 100 times because the test is a
>> fraction of a second).
>>
>> From the log trawling which Ian already did, the first recorded failure
>> was flight 160912 on April 11th.  All failures (12, but this number is a
>> few flights old now) were on pinot*.
>>
>> What would be interesting to see is whether there have been any passes
>> on pinot since 160912.
>>
>> I can't see any reason why the test would be reliable for me, but
>> unreliable for OSSTest, so I'm wondering whether it is actually
>> reliable, and something is wrong with the stickiness heuristic.
> 
> Isn't (un)reliability of this test, besides the sensitivity to IRQs
> and context switches, tied to hardware behavior, in particular TLB
> capacity and replacement policy? Aiui the test has
> 
>     xtf_success("Success: Probably not vulnerable to XSA-286\n");
> 
> for the combination of all of these reasons.

I've just done a dozen runs on my Skylake - all reported SUCCESS.

Jan



* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-05-13 20:15 ` Regressed XSA-286, was " Andrew Cooper
  2021-05-17  8:43   ` Jan Beulich
@ 2021-06-16  8:48   ` Jan Beulich
  2021-06-16 15:43     ` Andrew Cooper
  1 sibling, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2021-06-16  8:48 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Roger Pau Monné, committers

On 13.05.2021 22:15, Andrew Cooper wrote:
> On 13/05/2021 04:56, osstest service owner wrote:
>> flight 161917 xen-unstable real [real]
>> http://logs.test-lab.xenproject.org/osstest/logs/161917/
>>
>> Regressions :-(
>>
>> Tests which did not succeed and are blocking,
>> including tests which could not be run:
>>  test-arm64-arm64-examine      8 reboot                   fail REGR. vs. 161898
>>  test-arm64-arm64-xl-thunderx  8 xen-boot                 fail REGR. vs. 161898
>>  test-arm64-arm64-xl-credit1   8 xen-boot                 fail REGR. vs. 161898
>>  test-arm64-arm64-xl-credit2   8 xen-boot                 fail REGR. vs. 161898
>>  test-arm64-arm64-xl           8 xen-boot                 fail REGR. vs. 161898
> 
> I reported these on IRC, and Julien/Stefano have already committed a fix.
> 
>> Tests which are failing intermittently (not blocking):
>>  test-xtf-amd64-amd64-3 92 xtf/test-pv32pae-xsa-286 fail in 161909 pass in 161917
> 
> While noticing the ARM issue above, I also spotted this one by chance. 
> There are two issues.
> 
> First, I have reverted bed7e6cad30 and edcfce55917.  The XTF test is
> correct, and they really do reintroduce XSA-286.  It is a miracle of
> timing that we don't need an XSA/CVE against Xen 4.15.

As expressed at the time already, I view this reverting you did, without
there being any emergency and without you having gathered any acks or
allowed for objections, as overstepping your competencies. I did post a
patch to the XTF test, which I believe is wrong, without having had any
feedback there either. Unless I hear back by the end of this week with
substantial arguments of why I am wrong (which would need to also cover
the fact that an issue was found with 32-bit PAE only, in turn supporting
my view on the overall state), I intend to revert your revert early next
week.

Jan




* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-06-16  8:48   ` Jan Beulich
@ 2021-06-16 15:43     ` Andrew Cooper
  2021-06-17 11:56       ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Cooper @ 2021-06-16 15:43 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Roger Pau Monné, committers

On 16/06/2021 09:48, Jan Beulich wrote:
> On 13.05.2021 22:15, Andrew Cooper wrote:
>> On 13/05/2021 04:56, osstest service owner wrote:
>>> flight 161917 xen-unstable real [real]
>>> http://logs.test-lab.xenproject.org/osstest/logs/161917/
>>>
>>> Regressions :-(
>>>
>>> Tests which did not succeed and are blocking,
>>> including tests which could not be run:
>>>  test-arm64-arm64-examine      8 reboot                   fail REGR. vs. 161898
>>>  test-arm64-arm64-xl-thunderx  8 xen-boot                 fail REGR. vs. 161898
>>>  test-arm64-arm64-xl-credit1   8 xen-boot                 fail REGR. vs. 161898
>>>  test-arm64-arm64-xl-credit2   8 xen-boot                 fail REGR. vs. 161898
>>>  test-arm64-arm64-xl           8 xen-boot                 fail REGR. vs. 161898
>> I reported these on IRC, and Julien/Stefano have already committed a fix.
>>
>>> Tests which are failing intermittently (not blocking):
>>>  test-xtf-amd64-amd64-3 92 xtf/test-pv32pae-xsa-286 fail in 161909 pass in 161917
>> While noticing the ARM issue above, I also spotted this one by chance. 
>> There are two issues.
>>
>> First, I have reverted bed7e6cad30 and edcfce55917.  The XTF test is
>> correct, and they really do reintroduce XSA-286.  It is a miracle of
>> timing that we don't need an XSA/CVE against Xen 4.15.
> As expressed at the time already, I view this reverting you did, without
> there being any emergency and without you having gathered any acks or
> allowed for objections, as overstepping your competencies. I did post a
> patch to the XTF test, which I believe is wrong, without having had any
> feedback there either. Unless I hear back by the end of this week with
> substantial arguments of why I am wrong (which would need to also cover
> the fact that an issue was found with 32-bit PAE only, in turn supporting
> my view on the overall state), I intend to revert your revert early next
> week.

It has frankly taken a while to formulate a civil reply.

I am very irritated that you have *twice* recently introduced security
vulnerabilities by bypassing my reviews/objections on patches.

At the time, I had to drop work on an in-progress security issue to
urgently investigate why we'd regressed upstream, and why OSSTest hadn't
blocked it.

I am more generally irritated that you are constantly breaking things
which GitlabCI can tell you is broken, and that I'm having to drop work
I'm supposed to be doing to unbreak them.

In the case of this revert specifically, I did get agreement on IRC
before reverting.


In your proposed edit to the XTF test, you say

  L3 entry updates aren't specified to take immediate effect in PAE mode:

but this is not accurate.  It's what the Intel SDM says, but is
contradicted by the AMD APM which states that this behaviour is not true
under NPT under any circumstance, nor is it true on native.

Furthermore, any 32bit PV guest knowing it is running on a 64bit Xen
(even from simply checking Xen >= 4.3) can rely on the relaxed
behaviour, irrespective of what the unwritten PV ABI might want to say
on the matter, due to knowing that it is running on Long mode paging as
opposed to legacy PAE paging.

If these two technical reasons aren't good enough, then consider the
manifestation of the issue itself.  XSA-286 is specifically about Xen
editing the wrong PTE, because of the use of linear pagetables, in light
of the guest not flushing the TLB.

If you were to remove linear pagetables from Xen, the issue
(do_mmu_update() edits the wrong PTE) would cease to manifest even on
legacy PAE paging, demonstrating that the problem is with Xen's actions,
not with the guests.
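
For illustration only, the failure mode described above (a write that goes through a stale linear-pagetable translation and lands in the wrong L1 table) can be sketched as a toy Python model. This is not Xen code: every name is hypothetical, the "TLB" is a single cached translation, and the model makes no claim about which party ought to issue the flush.

```python
class ToyMmu:
    """Toy model of a PTE write done via a linear-pagetable mapping."""

    def __init__(self):
        self.l1_tables = {"old": {}, "new": {}}  # two L1 tables: slot -> PTE
        self.l2_entry = "old"   # which L1 table the L2 entry points at
        self.tlb = {}           # cached linear-mapping translation

    def linear_lookup(self):
        # Resolve the linear-pagetable address of the L1 table via the TLB.
        if "l1" not in self.tlb:
            self.tlb["l1"] = self.l2_entry   # miss: walk, then cache
        return self.l1_tables[self.tlb["l1"]]

    def switch_l1(self, name, flush):
        # The L2 entry is repointed; the flush may or may not happen.
        self.l2_entry = name
        if flush:
            self.tlb.clear()

    def write_pte(self, slot, value):
        # The write goes wherever the (possibly stale) translation points.
        self.linear_lookup()[slot] = value


def run(flush):
    mmu = ToyMmu()
    mmu.linear_lookup()                 # warm the TLB with the old translation
    mmu.switch_l1("new", flush=flush)
    mmu.write_pte(0, "PTE")
    # Report which L1 table actually received the write.
    return [name for name, table in mmu.l1_tables.items() if 0 in table]


print(run(flush=False))  # ['old']
print(run(flush=True))   # ['new']
```

Without the flush the write ends up in the table the L2 entry no longer references; with it, the write reaches the intended table.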

~Andrew




* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-06-16 15:43     ` Andrew Cooper
@ 2021-06-17 11:56       ` Jan Beulich
  2021-06-17 13:05         ` Ian Jackson
  2021-06-17 21:26         ` Stefano Stabellini
  0 siblings, 2 replies; 13+ messages in thread
From: Jan Beulich @ 2021-06-17 11:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Roger Pau Monné, committers

On 16.06.2021 17:43, Andrew Cooper wrote:
> On 16/06/2021 09:48, Jan Beulich wrote:
>> On 13.05.2021 22:15, Andrew Cooper wrote:
>>> On 13/05/2021 04:56, osstest service owner wrote:
>>>> flight 161917 xen-unstable real [real]
>>>> http://logs.test-lab.xenproject.org/osstest/logs/161917/
>>>>
>>>> Regressions :-(
>>>>
>>>> Tests which did not succeed and are blocking,
>>>> including tests which could not be run:
>>>>  test-arm64-arm64-examine      8 reboot                   fail REGR. vs. 161898
>>>>  test-arm64-arm64-xl-thunderx  8 xen-boot                 fail REGR. vs. 161898
>>>>  test-arm64-arm64-xl-credit1   8 xen-boot                 fail REGR. vs. 161898
>>>>  test-arm64-arm64-xl-credit2   8 xen-boot                 fail REGR. vs. 161898
>>>>  test-arm64-arm64-xl           8 xen-boot                 fail REGR. vs. 161898
>>> I reported these on IRC, and Julien/Stefano have already committed a fix.
>>>
>>>> Tests which are failing intermittently (not blocking):
>>>>  test-xtf-amd64-amd64-3 92 xtf/test-pv32pae-xsa-286 fail in 161909 pass in 161917
>>> While noticing the ARM issue above, I also spotted this one by chance. 
>>> There are two issues.
>>>
>>> First, I have reverted bed7e6cad30 and edcfce55917.  The XTF test is
>>> correct, and they really do reintroduce XSA-286.  It is a miracle of
>>> timing that we don't need an XSA/CVE against Xen 4.15.
>> As expressed at the time already, I view this reverting you did, without
>> there being any emergency and without you having gathered any acks or
>> allowed for objections, as overstepping your competencies. I did post a
>> patch to the XTF test, which I believe is wrong, without having had any
>> feedback there either. Unless I hear back by the end of this week with
>> substantial arguments of why I am wrong (which would need to also cover
>> the fact that an issue was found with 32-bit PAE only, in turn supporting
>> my view on the overall state), I intend to revert your revert early next
>> week.
> 
> It has frankly taken a while to formulate a civil reply.
> 
> I am very irritated that you have *twice* recently introduced security
> vulnerabilities by bypassing my reviews/objections on patches.

I'm sorry, Andrew, but already in my original reply a month ago I did
express that I couldn't find any record of you having objected to the
changes. It doesn't help that you claim you've objected when you
really didn't (which is the impression I get from not finding anything,
and which also matches my recollection of what was discussed).

I don't think I know which 2nd instance you're referring to, and hence
I can't respond to that aspect.

> At the time, I had to drop work on an in-progress security issue to
> urgently investigate why we'd regressed upstream, and why OSSTest hadn't
> blocked it.
> 
> I am more generally irritated that you are constantly breaking things
> which GitlabCI can tell you is broken, and that I'm having to drop work
> I'm supposed to be doing to unbreak them.

GitlabCI doesn't tell me anything just yet, unless I go actively poll
it. And as mentioned just yesterday on irc, I don't think I can easily
navigate my way through those web pages, to find breakage I may have
introduced and hence would better go fix. Unlike osstest, where I am
told what failed, and I know where to find the corresponding logs.

It's also not clear to me at all in how far GitlabCI would have
spotted the issue here, no matter whether it's caused by a hypervisor
change or the XTF test being wrong. So far I've seen GitlabCI only
spot build issues.

I'm also puzzled, to put it mildly, by your use of "constantly" here.

> In the case of this revert specifically, I did get agreement on IRC
> before reverting.

How can I know you did? You didn't even care to reply to my mail from
a month ago. And there was no reason to make an emergency out of this
and ask on irc. You could have sent mail just like is done for all
other normal bug fixes etc. Iirc I was on PTO at that time; it would
hence only have been fair to wait until my return.

> In your proposed edit to the XTF test, you say
> 
>   L3 entry updates aren't specified to take immediate effect in PAE mode:
> 
> but this is not accurate.  It's what the Intel SDM says, but is
> contradicted by the AMD APM which states that this behaviour is not true
> under NPT under any circumstance, nor is it true on native.
> 
> Furthermore, any 32bit PV guest knowing it is running on a 64bit Xen
> (even from simply checking Xen >= 4.3) can rely on the relaxed
> behaviour, irrespective of what the unwritten PV ABI might want to say
> on the matter, due to knowing that it is running on Long mode paging as
> opposed to legacy PAE paging.

Neither of these are reasons for a 32-bit guest to _rely_ on such
behavior. Hence the change to the XTF test, which so far you also
didn't care to reply to.

I'm aware of NPT having different behavior, but can you point me to
the place in AMD doc saying so also for native? In fact I can find a
statement to the contrary:

"The behavior of PAE mode in a nested-paging guest differs slightly
 from the behavior of (host-only) legacy PAE mode, in that the
 guest’s four PDPEs are not loaded into the processor at the time
 CR3 is written. Instead, the PDPEs are accessed on demand as part
 of a table walk."

This to me implies that in the native case the behavior matches
Intel's.

> If these two technical reasons aren't good enough, then consider the
> manifestation of the issue itself.  XSA-286 is specifically about Xen
> editing the wrong PTE, because of the use of linear pagetables, in light
> of the guest not flushing the TLB.

The PTE edited is, as said, only perceived wrong by the XTF test.
Hence the patch to correct it.

> If you were to remove linear pagetables from Xen, the issue
> (do_mmu_update() edits the wrong PTE) would cease to manifest even on
> legacy PAE paging, demonstrating that the problem is with Xen's actions,
> not with the guests.

And if I introduced shadowing of the L3E writes, pushing the new ones
into the live page tables only upon CR3 writes, the issue would
reappear. It ought to be permissible to make such a change to Xen,
even if we may have no specific reason to do so at this point (albeit
I think we really better would, to match bare metal behavior).

Also, in my request to you (still in context above) I did specifically
ask about the aspect of the observed issue only manifesting on 32-bit,
yet you are claiming a general problem, i.e. one also affecting 64-bit.
You didn't comment on this at all.

Jan




* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-06-17 11:56       ` Jan Beulich
@ 2021-06-17 13:05         ` Ian Jackson
  2021-06-17 14:40           ` Jan Beulich
  2021-06-28 12:35           ` Ping: " Jan Beulich
  2021-06-17 21:26         ` Stefano Stabellini
  1 sibling, 2 replies; 13+ messages in thread
From: Ian Jackson @ 2021-06-17 13:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, xen-devel, Roger Pau Monné, committers

Firstly, let me try to deal with substance and/or technical merit.

Jan, I am finding it difficult to follow in your message whether you
are asserting that your disputed change (to Xen) did not introduce a
vulnerability.

I think you are saying that there is no vulnerability, because in any
overall configuration where this is a vulnerability, the guest would
have to be making an unjustified assumption.

If this is your reasoning, I don't think it is sound.  The question is
not whether the assumption is justified or not (answering which
question seems to require nigh-incomprehensible exegesis of processor
documentation).

The question is whether any guest does in fact make that assumption.
If any do, then there is a vulnerability.  Whether that's a
vulnerability "in" Xen or "in" the guest is just a question of
finger-pointing.

If none do then there is no vulnerability.


On to process:

Jan Beulich writes ("Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL"):
> On 16.06.2021 17:43, Andrew Cooper wrote:
> > I am very irritated that you have *twice* recently introduced security
> > vulnerabilities by bypassing my reviews/objections on patches.
> 
> I'm sorry, Andrew, but already in my original reply a month ago I did
> express that I couldn't find any record of you having objected to the
> changes. It doesn't help that you claim you've objected when you
> really didn't (which is the impression I get from not finding anything,
> and which also matches my recollection of what was discussed).

Andrew, can you provide references to your objections ?

> I don't think I know which 2nd instance you're referring to, and hence
> I can't respond to that aspect.

And, likewise, references for this.

> > In the case of this revert specifically, I did get agreement on IRC
> > before reverting.
> 
> How can I know you did? You didn't even care to reply to my mail from
> a month ago. And there was no reason to make an emergency out of this
> and ask on irc. You could have sent mail just like is done for all
> other normal bug fixes etc. Iirc I was on PTO at that time; it would
> hence only have been fair to wait until my return.

I think it would be good practice to copy and paste relevant IRC
discussions into email in this kind of situation.  That email also
makes space to properly write down what you are doing, that you
realise it is controversial, who you have consulted, and why you are
going ahead.

I looked at one of the two disputed reverts in Xen,
cb199cc7de987cfda4659fccf51059f210f6ad34, and it does not have any
tags indicating approval by anyone else.

Andy, if you got agreement on IRC, who from ? [1]

Ian.

[1] This may well have included me.  I do not reliably record this
kind of information in my wetware.  That is what we have computers
for.



* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-06-17 13:05         ` Ian Jackson
@ 2021-06-17 14:40           ` Jan Beulich
  2021-06-17 14:49             ` Ian Jackson
  2021-06-28 12:35           ` Ping: " Jan Beulich
  1 sibling, 1 reply; 13+ messages in thread
From: Jan Beulich @ 2021-06-17 14:40 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Andrew Cooper, xen-devel, Roger Pau Monné, committers

On 17.06.2021 15:05, Ian Jackson wrote:
> Firstly, let me try to deal with substance and/or technical merit.
> 
> Jan, I am finding it difficult to follow in your message whether you
> are asserting that your disputed change (to Xen) did not introduce a
> vulnerability.
> 
> I think you are saying that there is no vulnerability, because in any
> overall configuration where this is a vulnerability, the guest would
> have to be making an unjustified assumption.
> 
> If this is your reasoning, I don't think it is sound.  The question is
> not whether the assumption is justified or not (answering which
> question seems to require nigh-incomprehensible exegesis of processor
> documentation).
> 
> The question is whether any guest does in fact make that assumption.
> If any do, then there is a vulnerability.  Whether that's a
> vulnerability "in" Xen or "in" the guest is just a question of
> finger-pointing.
> 
> If none do then there is no vulnerability.

I don't think any OS does, simply because they can't rely on such
behavior when on bare metal. The only such assumption was baked
into the respective XTF test.

If any OS made such an assumption, then I don't think it would be
a vulnerability either. It would simply be a guest kernel bug then.

Jan




* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-06-17 14:40           ` Jan Beulich
@ 2021-06-17 14:49             ` Ian Jackson
  2021-06-17 14:55               ` Jan Beulich
  0 siblings, 1 reply; 13+ messages in thread
From: Ian Jackson @ 2021-06-17 14:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, xen-devel, Roger Pau Monné, committers

Jan Beulich writes ("Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL"):
> If any OS made such an assumption, then I don't think it would be
> a vulnerability either. It would simply be a guest kernel bug then.

For the avoidance of doubt:

I think you are saying that if any OS did make the assumption, the
resulting bug *would not be exploitable* (by an unprivileged guest
process, or by a PV backend it was speaking to, or, somehow, by
another guest).

Ian.



* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-06-17 14:49             ` Ian Jackson
@ 2021-06-17 14:55               ` Jan Beulich
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2021-06-17 14:55 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Andrew Cooper, xen-devel, Roger Pau Monné, committers

On 17.06.2021 16:49, Ian Jackson wrote:
> Jan Beulich writes ("Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL"):
>> If any OS made such an assumption, then I don't think it would be
>> a vulnerability either. It would simply be a guest kernel bug then.
> 
> For the avoidance of doubt:
> 
> I think you are saying that if any OS did make the assumption, the
> resulting bug *would not be exploitable* (by an unprivileged guest
> process, or by a PV backend it was speaking to, or, somehow, by
> another guest).

Not exactly: Whether such a kernel bug would also be a vulnerability
cannot be told without knowing how exactly the kernel screwed up.
But it's definitely not for Xen to compensate for this, imo. But anyway,
this is largely moot, as there isn't - afaict - any OS making any
such assumption.

Jan




* Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-06-17 11:56       ` Jan Beulich
  2021-06-17 13:05         ` Ian Jackson
@ 2021-06-17 21:26         ` Stefano Stabellini
  1 sibling, 0 replies; 13+ messages in thread
From: Stefano Stabellini @ 2021-06-17 21:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, xen-devel, Roger Pau Monné, committers

On Thu, 17 Jun 2021, Jan Beulich wrote:
> GitlabCI doesn't tell me anything just yet, unless I go actively poll
> it. And as mentioned just yesterday on irc, I don't think I can easily
> navigate my way through those web pages, to find breakage I may have
> introduced and hence would better go fix. Unlike osstest, where I am
> told what failed, and I know where to find the corresponding logs.
> 
> It's also not clear to me at all in how far GitlabCI would have
> spotted the issue here, no matter whether it's caused by a hypervisor
> change or the XTF test being wrong. So far I've seen GitlabCI only
> spot build issues.

Without getting into the specifics of this problem, I just want to let you
know that Doug and I gave a little "tour" of GitlabCI at Xen Summit. I
recommend watching the video when it becomes available. I find it very
easy to use and generally easier than other CIs. The very short version
is the following:


# find the pipeline running for the commits / patch series you care about

Pipelines for staging are here:
https://gitlab.com/xen-project/xen/-/pipelines

Pipelines for outstanding patch series on xen-devel are here:
https://gitlab.com/xen-project/patchew/xen/-/pipelines

I'll pick one of the recent runs for an outstanding series:
https://gitlab.com/xen-project/patchew/xen/-/pipelines/322112514

You can see what was committed by patchew by clicking on the link "84
jobs for patchew/20210616144324.31652-1-julien@xen.org in 87 minutes and
32 seconds (queued for 15 seconds)". The link brings you to the branch
with the commits:
https://gitlab.com/xen-project/patchew/xen/-/commits/patchew/20210616144324.31652-1-julien@xen.org


# find the failed jobs and logs

Look for the red "x" corresponding to individual jobs that failed in the
pipeline. In this case we have 2 red "x" on the right side which
correspond to these 2 jobs:

https://gitlab.com/xen-project/patchew/xen/-/jobs/1352370918
https://gitlab.com/xen-project/patchew/xen/-/jobs/1352370916

To get the full logs in text format simply click on the "document" icon
just above the black square with the logs. Other binary artifacts are
available if you click on "Download" on the right side of the screen.
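
For those who prefer not to click through the web pages, the same lookup can probably be scripted. The following is a minimal sketch, assuming the public GitLab REST API (v4) pipeline-jobs and job-trace endpoints are reachable without a token for this project and that the pipeline has not expired; the helper names are made up for this example and are not part of any Xen tooling.

```python
import json
import urllib.parse
import urllib.request

API = "https://gitlab.com/api/v4"
PROJECT = "xen-project/patchew/xen"   # project path taken from the URLs above


def failed_jobs_url(pipeline_id, project=PROJECT):
    # The project path must be URL-encoded when used in place of a numeric ID.
    proj = urllib.parse.quote(project, safe="")
    return f"{API}/projects/{proj}/pipelines/{pipeline_id}/jobs?scope[]=failed"


def job_trace_url(job_id, project=PROJECT):
    # The trace endpoint returns the job log as plain text.
    proj = urllib.parse.quote(project, safe="")
    return f"{API}/projects/{proj}/jobs/{job_id}/trace"


def fetch_failed_jobs(pipeline_id):
    # Network access happens only here; returns (job id, job name) pairs.
    with urllib.request.urlopen(failed_jobs_url(pipeline_id)) as resp:
        return [(job["id"], job["name"]) for job in json.load(resp)]


def print_failed_jobs(pipeline_id):
    # Print one line per failed job, with the URL of its plain-text log.
    for job_id, name in fetch_failed_jobs(pipeline_id):
        print(job_id, name, job_trace_url(job_id))
```

Calling print_failed_jobs(322112514) would then be expected to list the failed jobs of the pipeline picked above, provided the assumptions about API access hold.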



# find details on the failed job

The jobs are divided into two groups: build jobs and test jobs. The
build jobs simply build Xen and tools with various compilers and
options. They are all in the left column in the pipeline page. They are
straightforward.

The test jobs are actually trying to run something inside QEMU (full
emulation). The scripts that run things are:

automation/scripts/qemu-smoke-x86-64.sh
automation/scripts/qemu-smoke-arm64.sh
automation/scripts/qemu-alpine-arm64.sh

and their names correspond to the job names. In our example
qemu-smoke-x86-64.sh is the one that failed and it is running XTF inside
QEMU.


I hope this helps! I'd be happy to jump on a call to give you a short
intro on how to use gitlab-ci, just let me know.



* Ping: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL
  2021-06-17 13:05         ` Ian Jackson
  2021-06-17 14:40           ` Jan Beulich
@ 2021-06-28 12:35           ` Jan Beulich
  1 sibling, 0 replies; 13+ messages in thread
From: Jan Beulich @ 2021-06-28 12:35 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Roger Pau Monné, committers, Ian Jackson

On 17.06.2021 15:05, Ian Jackson wrote:
> On to process:
> 
> Jan Beulich writes ("Re: Regressed XSA-286, was [xen-unstable test] 161917: regressions - FAIL"):
>> On 16.06.2021 17:43, Andrew Cooper wrote:
>>> I am very irritated that you have *twice* recently introduced security
>>> vulnerabilities by bypassing my reviews/objections on patches.
>>
>> I'm sorry, Andrew, but already in my original reply a month ago I did
>> express that I couldn't find any record of you having objected to the
>> changes. It doesn't help that you claim you've objected when you
>> really didn't (which is the impression I get from not finding anything,
>> and which also matches my recollection of what was discussed).
> 
> Andrew, can you provide references to your objections ?
> 
>> I don't think I know which 2nd instance you're referring to, and hence
>> I can't respond to that aspect.
> 
> And, likewise, references for this.
> 
>>> In the case of this revert specifically, I did get agreement on IRC
>>> before reverting.
>>
>> How can I know you did? You didn't even care to reply to my mail from
>> a month ago. And there was no reason to make an emergency out of this
>> and ask on irc. You could have sent mail just like is done for all
>> other normal bug fixes etc. Iirc I was on PTO at that time; it would
>> hence only have been fair to wait until my return.
> 
> I think it would be good practice to copy and paste relevant IRC
> discussions into email in this kind of situation.  That email also
> makes space to properly write down what you are doing, that you
> realise it is controversial, who you have consulted, and why you are
> going ahead.
> 
> I looked at one of the two disputed reverts in Xen,
> cb199cc7de987cfda4659fccf51059f210f6ad34, and it does not have any
> tags indicating approval by anyone else.
> 
> Andy, if you got agreement on IRC, who from ? [1]
> 
> Ian.
> 
> [1] This may well have included me.  I do not reliably record this
> kind of information in my wetware.  That is what we have computers
> for.

Another 11 days have passed without a reply to any of the questions
above. I find it generally inappropriate to try to have controversies
die out by simply not replying, but in a case like this one it is imo
extra bad to do so. In case it hasn't come through clearly before: My
primary goal is not to revert your revert. Instead I'd like to be
given proper reasons, so I can fully understand parts I may have been
missing so far. But of course I also expect you to correct your views
in case the technical details speak against your original reasoning
(at which point undoing your change may indeed be the necessary
consequence).

And of course all technical aspects aside there remains the process
aspect of this whole situation.

Jan




Thread overview: 13+ messages
2021-05-13  3:56 [xen-unstable test] 161917: regressions - FAIL osstest service owner
2021-05-13 20:15 ` Regressed XSA-286, was " Andrew Cooper
2021-05-17  8:43   ` Jan Beulich
2021-05-17 10:59     ` Jan Beulich
2021-06-16  8:48   ` Jan Beulich
2021-06-16 15:43     ` Andrew Cooper
2021-06-17 11:56       ` Jan Beulich
2021-06-17 13:05         ` Ian Jackson
2021-06-17 14:40           ` Jan Beulich
2021-06-17 14:49             ` Ian Jackson
2021-06-17 14:55               ` Jan Beulich
2021-06-28 12:35           ` Ping: " Jan Beulich
2021-06-17 21:26         ` Stefano Stabellini
