All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/19] MCE code cleanup and add LMCE support
@ 2017-02-17  6:39 Haozhong Zhang
  2017-02-17  6:39 ` [PATCH 01/19] x86/mce: fix indentation style in xen-mca.h and mce.h Haozhong Zhang
                   ` (18 more replies)
  0 siblings, 19 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Wei Liu, Jun Nakajima, Liu Jinsong,
	Christoph Egger, Ian Jackson, Jan Beulich, Andrew Cooper

[-- Attachment #1: Type: text/plain, Size: 9332 bytes --]

This patch series adds LMCE support to Xen, although more than half
patches are for code cleanup and bug fix.

LMCE
--------------
Intel Local MCE (LMCE) is a feature on Intel Skylake Server CPU that                                        
can deliver MCE to a single processor thread instead of broadcasting                                        
to all threads, which can reduce software's load when processing MCE                                        
on machines with a large number of processor threads.                                        
                                       
The technical details of LMCE can be found in Intel SDM Vol 3, Chapter                                        
"Machine-Check Architecture" (search for 'LMCE'). Basically,                                        
 * The capability of LMCE is indicated by bit 27 (MCG_LMCE_P) of                                        
   MSR_IA32_MCG_CAP.                                        
 * LMCE is enabled by setting bit 20 (MSR_IA32_FEATURE_CONTROL_LMCE)                                        
   of MSR_IA32_FEATURE_CONTROL and bit 0 (MCG_EXT_CTL_LMCE_EN) of                                        
   MSR_IA32_MCG_EXT_CTL.                                        
 * Software can determine if a MCE is local to the current processor                                        
   thread by checking bit 2 (MCG_STATUS_LMCE) of MSR_IA32_MCG_STATUS.

Patch Overview
--------------
In this patch series,
 * Xen enables LMCE by default if it's supported by host CPU unless Xen
   boot parameter "mce_fb=1" is present.
 * Xen handles LMCE only on the affected CPU and does not need all CPUs
   to enter MCE handler.
 * A new xl config "lmce=BOOLEAN" is added to control whether LMCE is
   supported for the HVM domain. It's disabled by default. If the host
   CPU does not support LMCE, this config will be ignored.
 * For HVM domain with LMCE support, if the vcpu affected by a host
   LMCE is known, Xen will inject a vLMCE to that vcpu. If the affected
   vcpu is unknown or LMCE support is disabled for a HVM domain, a MCE
   will be broadcast to all vcpus of that domain as before.  

This patch series is organized as below:
 * Patch 1 - 8 clean up existing MCE code and make one improvement to
   debugging messages. No functional change is introduced.
 * Patch 9 - 11 fix two bugs in vMCE injection and MCE handling.
 * Patch 12 & 13 add host-side LMCE support, including detecting,
   enabling LMCE feature and handling LMCE.
 * Patch 14 - 17 add guest-side LMCE support (only HVM domain so far),
   including emulating LMCE feature and injecting LMCE to HVM domain.
 * Patch 18 & 19 add xen-mceinj support to inject LMCE for test purpose.

How to Test
--------------
0. This patch series can be tested either on Intel CPU w/ LMCE support
   (Skylake-EX), or in the nested virtualization environment on
   KVM/QEMU (i.e. Xen as L1 hypervisor).

   QEMU 2.7.0 and later with KVM in Linux kernel 4.8 and later can
   emulate LMCE and do not require the host hardware support LMCE. You
   can start a nested virtualization environment with LMCE support by
   the following command:
        qemu-system-x86_64 -enable-kvm \
                           -smp 32 -cpu kvm64,lmce=on,+vmx \
                           -hda PATH-TO-DISK-IMG -m 2048
    
1. Build, install and boot Xen with this patch series. You can include
   "mce_verbosity=verbose" in Xen boot parameters to get more detailed
   debugging messages about MCE.

2. At boot time, if the Xen boot parameter 'mce_fb=1' is not
   present, Xen hypervisor should be able to detect and enable LMCE,
   and print the following message:

        (XEN) mce_intel.c:737: MCA Capability: BCAST 1 SER 1 CMCI 1 firstbank 0 extended MCE MSR LMCE 1

   If 'mce_fb=1' is specified, the last segment of above message will
   be "LMCE 0" which indicates Xen does not enable LMCE support.

3. Start a HVM domain with the attached config file xl.cfg. In the
   config,
    * "lmce = 1" enables LMCE for the domaim.
    * "cpus = [ ... ]" is helpful for the following steps to figure
      out which CPU should we inject to, and may be not a necessity.

   Run Linux kernel 4.2 or later (which has LMCE support) in the
   domain.

   Run the latest mcelog (https://www.mcelog.org/) in the domain as
   well to log MCEs injected in latter steps. Depending on the guest
   Linux distro, the log can be in /var/log/mcelog, syslog or systemd
   journal.

   Compile and run the attached claim_page.c in the domain. claim_page.c
   allocates a page of memory, prints its base (guest) physical address
   and enters an infinite loop. For example, it may print a message like

        Physical address of mmaped page = 0x36d4d000
   
4. Use "xl vcpu-list" to figure out the cpu number on which
   claim_page on is running. For example, xl vcpu-list may output

        Name     ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
        lmce-l2   1     0    4   r--     546.5  4 / all
        lmce-l2   1     1    5   -b-       8.4  5 / all
        lmce-l2   1     2    6   -b-       6.4  6 / all 
        lmce-l2   1     3    7   -b-       6.4  7 / all

    As claim_page is the only workload that is actively running in
    the domain, CPU 4 (VCPU 0) is very likely the one it's running on.
    (You may even want to pin claim_page to a vcpu in guest Linux ... )

5. Use xen-mceinj to inject LMCE:
        ./xen-mceinj -c 4 -d 1 -p 0x36d4d000 -t 0 -l
                                                  ^^
                                                    inject LMCE

   If the injection succeeds, mcelog in the domain should generate the
   log like

        Hardware event. This is not a software error.
        MCE 0                                        
        CPU 0 BANK 1 TSC 2218fdf1380
        ^^^^^
             vcpu0 receives MCE
        RIP !INEXACT! 10:ffffffff810591e7            
        MISC 86 ADDR 36d4d000
                ^^^^^^^^^^^^^
                             error address
        TIME 1487302866 Fri Feb 17 11:41:06 2017     
        MCG status:RIPV MCIP LMCE
                             ^^^^
                                 LMCE is injected
        MCi status:                                  
        Uncorrected error                            
        Error enabled                                
        MCi_MISC register valid                      
        MCi_ADDR register valid                      
        SRAO                                         
        MCA: Generic CACHE Level-2 Eviction Error    
        STATUS bd2000000000017a MCGSTATUS d          
        MCGCAP 9000c02 APICID 1 SOCKETID 0           
        CPUID Vendor Intel Family 6 Model 79  


Haozhong Zhang (19):
  01/19 x86/mce: fix indentation style in xen-mca.h and mce.h
  02/19 x86/mce: remove declarations of non-existing functions in mce.h
  03/19 x86/mce: remove unnecessary braces around intel_get_extended_msrs()
  04/19 xen/mce: remove unused x86_mcinfo_add()
  05/19 x86/mce: merge loops to get Intel extended MC MSR
  06/19 x86/mce: merge intel_default_mce_dhandler/uhandler()
  07/19 x86/vmce: include domain/vcpu id in debug messages
  08/19 x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve()
  09/19 x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case
  10/19 x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU
  11/19 tools/xen-mceinj: fix the type of cpu number
  12/19 x86/mce: handle LMCE locally
  13/19 x86/mce_intel: detect and enable LMCE on Intel host
  14/19 x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL
  15/19 x86/vmce: emulate MSR_IA32_MCG_EXT_CTL
  16/19 x86/vmce: enable injecting LMCE to guest on Intel host
  17/19 x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP
  18/19 xen/mce: add support of vLMCE injection to XEN_MC_inject_v2
  19/19 tools/xen-mceinj: support injecting LMCE

 docs/man/xl.cfg.pod.5.in                |  18 ++++
 tools/libxc/include/xenctrl.h           |   1 +
 tools/libxc/xc_misc.c                   |  25 ++++++
 tools/libxl/libxl_create.c              |   1 +
 tools/libxl/libxl_dom.c                 |   2 +
 tools/libxl/libxl_types.idl             |   1 +
 tools/libxl/xl_cmdimpl.c                |   3 +
 tools/tests/mce-test/tools/xen-mceinj.c |  70 +++++++++++++--
 xen/arch/x86/cpu/mcheck/barrier.c       |   4 +-
 xen/arch/x86/cpu/mcheck/mcaction.c      |  20 +++--
 xen/arch/x86/cpu/mcheck/mce.c           |  87 +++++++++++-------
 xen/arch/x86/cpu/mcheck/mce.h           |  51 ++++++-----
 xen/arch/x86/cpu/mcheck/mce_amd.c       |   4 +-
 xen/arch/x86/cpu/mcheck/mce_intel.c     |  86 ++++++++++--------
 xen/arch/x86/cpu/mcheck/vmce.c          | 153 ++++++++++++++++++++++++--------
 xen/arch/x86/cpu/mcheck/vmce.h          |   2 +-
 xen/arch/x86/cpu/mcheck/x86_mca.h       |   9 +-
 xen/arch/x86/hvm/hvm.c                  |   7 ++
 xen/arch/x86/hvm/vmx/vmx.c              |  10 +++
 xen/arch/x86/hvm/vmx/vvmx.c             |   4 -
 xen/include/asm-x86/mce.h               |   3 +
 xen/include/asm-x86/msr-index.h         |   2 +
 xen/include/public/arch-x86/hvm/save.h  |   2 +
 xen/include/public/arch-x86/xen-mca.h   |  25 +++---
 xen/include/public/hvm/params.h         |   5 +-
 25 files changed, 420 insertions(+), 175 deletions(-)

--
2.10.1

[-- Attachment #2: claim_page.c --]
[-- Type: text/x-csrc, Size: 1998 bytes --]

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>

struct pagemaps {
    unsigned long long pfn:55;
    unsigned long long shift:6;
    unsigned long long rsvd:1;
    unsigned long long swapped:1;
    unsigned long long present:1;
};

static int translate_va2pa(uint64_t va, uint64_t pagesize, uint64_t *pa)
{
    int rc = 0;
    static const char *pagemap_file = "/proc/self/pagemap";
    struct pagemaps pinfo;
    uint64_t pinfo_size = sizeof(pinfo);
    uint64_t offset = va / pagesize * pinfo_size;
    int fd = open(pagemap_file, O_RDONLY);

    if (fd == -1) {
        rc = errno;
        fprintf(stderr, "Failed to open %s: %s\n", pagemap_file, strerror(rc));
        goto ret;
    }

    if (pread(fd, (void *) &pinfo, pinfo_size, offset) != pinfo_size) {
        rc = errno;
        fprintf(stderr, "Failed to read offset 0x%"PRIx64": %s\n",
                offset, strerror(rc));
        goto ret_close;
    }

    *pa = (pinfo.pfn * pagesize) | (va & (pagesize - 1));

 ret_close:
    close(fd);
 ret:
    return rc;
}

int main(int argc, char **argv)
{
    void *buf;
    uint64_t buf_pa;
    int pagesize = getpagesize();
    int rc = 0;

    buf = mmap(NULL, pagesize,
               PROT_READ | PROT_WRITE | PROT_EXEC, MAP_PRIVATE | MAP_ANONYMOUS,
               -1, 0);
    if (buf == MAP_FAILED) {
        rc = errno;
        fprintf(stderr, "Failed to mmap a page: %s\n", strerror(rc));
        goto ret;
    }
    memset(buf, 0xcc, pagesize);

    rc = translate_va2pa((uint64_t) buf, pagesize, &buf_pa);
    if (rc || !buf_pa) {
        fprintf(stderr, "Failed to get physical address of mmaped page\n");
        goto ret_unmap;
    }

    fprintf(stderr, "Physical address of mmaped page = 0x%"PRIx64"\n", buf_pa);

    volatile int i = 1;
    while (i++);

 ret_unmap:
    munmap(buf, pagesize);
 ret:
    return rc;
}

[-- Attachment #3: xl.cfg --]
[-- Type: text/plain, Size: 322 bytes --]

builder = "hvm"
name    = "lmce-l2"

vcpus   = 4
memory  = 1024
disk    = [ '/dev/vdb,raw,xvda,rw' ]
cpus    = [ "4", "5", "6", "7" ]

lmce    = 1

device_model_override = '/usr/local/lib/xen/bin/qemu-system-i386'
device_model_version  = 'qemu-xen'

sdl     = 0
vnc     = 1
vnclisten='0.0.0.0'
stdvga  = 1
serial  = 'pty'

[-- Attachment #4: Type: text/plain, Size: 127 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* [PATCH 01/19] x86/mce: fix indentation style in xen-mca.h and mce.h
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17  9:49   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 02/19] x86/mce: remove declarations of non-existing functions in mce.h Haozhong Zhang
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

Replace tab indentation by whitespace.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mce.h         | 40 +++++++++++++++++------------------
 xen/include/public/arch-x86/xen-mca.h | 24 ++++++++++-----------
 2 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index e83d431..2f0fb07 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -30,11 +30,11 @@ extern int mce_verbosity;
         } while (0)
 
 enum mcheck_type {
-	mcheck_unset = -1,
-	mcheck_none,
-	mcheck_amd_famXX,
-	mcheck_amd_k8,
-	mcheck_intel
+    mcheck_unset = -1,
+    mcheck_none,
+    mcheck_amd_famXX,
+    mcheck_amd_k8,
+    mcheck_intel
 };
 
 extern uint8_t cmci_apic_vector;
@@ -59,7 +59,7 @@ unsigned int mce_firstbank(struct cpuinfo_x86 *c);
 struct mc_info *x86_mcinfo_getptr(void);
 void noreturn mc_panic(char *s);
 void x86_mc_get_cpu_info(unsigned, uint32_t *, uint16_t *, uint16_t *,
-			 uint32_t *, uint32_t *, uint32_t *, uint32_t *);
+                         uint32_t *, uint32_t *, uint32_t *, uint32_t *);
 
 /* Register a handler for machine check exceptions. */
 typedef void (*x86_mce_vector_t)(const struct cpu_user_regs *regs);
@@ -80,10 +80,10 @@ extern bool_t intpose_inval(unsigned int, uint64_t);
 
 static inline uint64_t mca_rdmsr(unsigned int msr)
 {
-	uint64_t val;
-	if (intpose_lookup(smp_processor_id(), msr, &val) == NULL)
-		rdmsrl(msr, val);
-	return val;
+    uint64_t val;
+    if (intpose_lookup(smp_processor_id(), msr, &val) == NULL)
+        rdmsrl(msr, val);
+    return val;
 }
 
 /* Write an MSR, invalidating any interposed value */
@@ -101,19 +101,19 @@ static inline uint64_t mca_rdmsr(unsigned int msr)
  * of the MCA data observed in the logout operation. */
 
 enum mca_source {
-	MCA_POLLER,
-	MCA_CMCI_HANDLER,
-	MCA_RESET,
-	MCA_MCE_SCAN
+    MCA_POLLER,
+    MCA_CMCI_HANDLER,
+    MCA_RESET,
+    MCA_MCE_SCAN
 };
 
 struct mca_summary {
-	uint32_t	errcnt;	/* number of banks with valid errors */
-	int		ripv;	/* meaningful on #MC */
-	int		eipv;	/* meaningful on #MC */
-	bool_t		uc;	/* UC flag */
-	bool_t		pcc;	/* PCC flag */
-	bool_t		recoverable; /* software error recoverable flag */
+    uint32_t    errcnt; /* number of banks with valid errors */
+    int         ripv;   /* meaningful on #MC */
+    int         eipv;   /* meaningful on #MC */
+    bool_t      uc;     /* UC flag */
+    bool_t      pcc;    /* PCC flag */
+    bool_t      recoverable; /* software error recoverable flag */
 };
 
 DECLARE_PER_CPU(struct mca_banks *, poll_bankmask);
diff --git a/xen/include/public/arch-x86/xen-mca.h b/xen/include/public/arch-x86/xen-mca.h
index a97e821..9566a33 100644
--- a/xen/include/public/arch-x86/xen-mca.h
+++ b/xen/include/public/arch-x86/xen-mca.h
@@ -312,8 +312,8 @@ DEFINE_XEN_GUEST_HANDLE(xen_mc_logical_cpu_t);
         struct mcinfo_common *_mic;                             \
                                                                 \
         found = 0;                                              \
-	(_ret) = NULL;						\
-	if (_mi == NULL) break;					\
+        (_ret) = NULL;                                          \
+        if (_mi == NULL) break;                                 \
         _mic = x86_mcinfo_first(_mi);                           \
         for (i = 0; i < x86_mcinfo_nentries(_mi); i++) {        \
             if (_mic->type == (_type)) {                        \
@@ -345,8 +345,8 @@ struct xen_mc_fetch {
     /* IN/OUT variables. */
     uint32_t flags;	/* IN: XEN_MC_NONURGENT, XEN_MC_URGENT,
                            XEN_MC_ACK if ack'ing an earlier fetch */
-			/* OUT: XEN_MC_OK, XEN_MC_FETCHFAILED,
-			   XEN_MC_NODATA, XEN_MC_NOMATCH */
+    /* OUT: XEN_MC_OK, XEN_MC_FETCHFAILED,
+            XEN_MC_NODATA, XEN_MC_NOMATCH */
     uint32_t _pad0;
     uint64_t fetch_id;	/* OUT: id for ack, IN: id we are ack'ing */
 
@@ -378,11 +378,11 @@ DEFINE_XEN_GUEST_HANDLE(xen_mc_notifydomain_t);
 
 #define XEN_MC_physcpuinfo 3
 struct xen_mc_physcpuinfo {
-	/* IN/OUT */
-	uint32_t ncpus;
-	uint32_t _pad0;
-	/* OUT */
-	XEN_GUEST_HANDLE(xen_mc_logical_cpu_t) info;
+    /* IN/OUT */
+    uint32_t ncpus;
+    uint32_t _pad0;
+    /* OUT */
+    XEN_GUEST_HANDLE(xen_mc_logical_cpu_t) info;
 };
 
 #define XEN_MC_msrinject    4
@@ -404,7 +404,7 @@ struct xen_mc_msrinject {
 
 #define XEN_MC_mceinject    5
 struct xen_mc_mceinject {
-	unsigned int mceinj_cpunr;      /* target processor id */
+    unsigned int mceinj_cpunr;      /* target processor id */
 };
 
 #if defined(__XEN__) || defined(__XEN_TOOLS__)
@@ -416,8 +416,8 @@ struct xen_mc_mceinject {
 #define XEN_MC_INJECT_CPU_BROADCAST 0x8
 
 struct xen_mc_inject_v2 {
-	uint32_t flags;
-	struct xenctl_bitmap cpumap;
+    uint32_t flags;
+    struct xenctl_bitmap cpumap;
 };
 #endif
 
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 02/19] x86/mce: remove declarations of non-existing functions in mce.h
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
  2017-02-17  6:39 ` [PATCH 01/19] x86/mce: fix indentation style in xen-mca.h and mce.h Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17  9:50   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 03/19] x86/mce: remove unnecessary braces around intel_get_extended_msrs() Haozhong Zhang
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

Remove declarations of functions
    intel_mcheck_timer()
    mce_intel_feature_init()
    mce_cap_init()
    x86_mcinfo_getptr()
whose definitions had been removed long time ago.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mce.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index 2f0fb07..e697780 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -43,11 +43,8 @@ extern uint8_t cmci_apic_vector;
 enum mcheck_type amd_mcheck_init(struct cpuinfo_x86 *c);
 enum mcheck_type intel_mcheck_init(struct cpuinfo_x86 *c, bool_t bsp);
 
-void intel_mcheck_timer(struct cpuinfo_x86 *c);
-void mce_intel_feature_init(struct cpuinfo_x86 *c);
 void amd_nonfatal_mcheck_init(struct cpuinfo_x86 *c);
 
-uint64_t mce_cap_init(void);
 extern unsigned int firstbank;
 
 struct mcinfo_extended *intel_get_extended_msrs(
@@ -56,7 +53,6 @@ struct mcinfo_extended *intel_get_extended_msrs(
 int mce_available(struct cpuinfo_x86 *c);
 unsigned int mce_firstbank(struct cpuinfo_x86 *c);
 /* Helper functions used for collecting error telemetry */
-struct mc_info *x86_mcinfo_getptr(void);
 void noreturn mc_panic(char *s);
 void x86_mc_get_cpu_info(unsigned, uint32_t *, uint16_t *, uint16_t *,
                          uint32_t *, uint32_t *, uint32_t *, uint32_t *);
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 03/19] x86/mce: remove unnecessary braces around intel_get_extended_msrs()
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
  2017-02-17  6:39 ` [PATCH 01/19] x86/mce: fix indentation style in xen-mca.h and mce.h Haozhong Zhang
  2017-02-17  6:39 ` [PATCH 02/19] x86/mce: remove declarations of non-existing functions in mce.h Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17  9:51   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add() Haozhong Zhang
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mce.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 2695b0c..e327eab 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -356,11 +356,8 @@ mcheck_mca_logout(enum mca_source who, struct mca_banks *bankmask,
                 ASSERT(mig);
                 mca_init_global(mc_flags, mig);
                 /* A hook here to get global extended msrs */
-                {
-                    if (boot_cpu_data.x86_vendor ==
-                        X86_VENDOR_INTEL)
-                        intel_get_extended_msrs(mig, mci);
-                }
+                if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+                    intel_get_extended_msrs(mig, mci);
             }
         }
 
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add()
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (2 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 03/19] x86/mce: remove unnecessary braces around intel_get_extended_msrs() Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17  9:55   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 05/19] x86/mce: merge loops to get Intel extended MC MSR Haozhong Zhang
                   ` (14 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mce.c | 16 ----------------
 xen/arch/x86/cpu/mcheck/mce.h |  1 -
 2 files changed, 17 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index e327eab..f682520 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -834,22 +834,6 @@ void *x86_mcinfo_reserve(struct mc_info *mi, int size)
     return memset(mic_index, 0, size);
 }
 
-void *x86_mcinfo_add(struct mc_info *mi, void *mcinfo)
-{
-    struct mcinfo_common *mic, *buf;
-
-    mic = (struct mcinfo_common *)mcinfo;
-    buf = x86_mcinfo_reserve(mi, mic->size);
-
-    if ( !buf )
-        mce_printk(MCE_CRITICAL,
-                   "mcinfo_add: No space left in mc_info\n");
-    else
-        memcpy(buf, mic, mic->size);
-
-    return buf;
-}
-
 static void x86_mcinfo_apei_save(
     struct mcinfo_global *mc_global, struct mcinfo_bank *mc_bank)
 {
diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index e697780..56877c1 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -146,7 +146,6 @@ typedef struct mcinfo_extended *(*x86_mce_callback_t)
     (struct mc_info *, uint16_t, uint64_t);
 extern void x86_mce_callback_register(x86_mce_callback_t);
 
-void *x86_mcinfo_add(struct mc_info *mi, void *mcinfo);
 void *x86_mcinfo_reserve(struct mc_info *mi, int size);
 void x86_mcinfo_dump(struct mc_info *mi);
 
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 05/19] x86/mce: merge loops to get Intel extended MC MSR
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (3 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add() Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17  9:58   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 06/19] x86/mce: merge intel_default_mce_dhandler/uhandler() Haozhong Zhang
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

The second loop that gets MSR_IA32_MCG_R8 to MSR_IA32_MCG_R15 was
surrounded by '#ifdef __X86_64__ ... #endif' and had to be seperated
from the first loop that gets MSR_IA32_MCG_EAX to MSR_IA32_MCG_MISC.
Because Xen had dropped support for 32-bit x86 host, these two loops
can be merged now.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mce_intel.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
index 005e41d..498e8e4 100644
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
@@ -211,10 +211,7 @@ intel_get_extended_msrs(struct mcinfo_global *mig, struct mc_info *mi)
     mc_ext->common.type = MC_TYPE_EXTENDED;
     mc_ext->common.size = sizeof(struct mcinfo_extended);
 
-    for (i = MSR_IA32_MCG_EAX; i <= MSR_IA32_MCG_MISC; i++)
-        intel_get_extended_msr(mc_ext, i);
-
-    for (i = MSR_IA32_MCG_R8; i <= MSR_IA32_MCG_R15; i++)
+    for (i = MSR_IA32_MCG_EAX; i <= MSR_IA32_MCG_R15; i++)
         intel_get_extended_msr(mc_ext, i);
 
     return mc_ext;
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 06/19] x86/mce: merge intel_default_mce_dhandler/uhandler()
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (4 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 05/19] x86/mce: merge loops to get Intel extended MC MSR Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17 10:01   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 07/19] x86/vmce: include domain/vcpu id in debug messages Haozhong Zhang
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

Implementations of these two functions are effectively the same, so
unify them by a common intel_default_mce_handler().

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mce_intel.c | 27 +++------------------------
 1 file changed, 3 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
index 498e8e4..b5ee8b8 100644
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
@@ -342,7 +342,7 @@ static int intel_default_check(uint64_t status)
     return 1;
 }
 
-static void intel_default_mce_dhandler(
+static void intel_default_mce_handler(
              struct mca_binfo *binfo,
              enum mce_result *result,
              const struct cpu_user_regs * regs)
@@ -361,32 +361,11 @@ static void intel_default_mce_dhandler(
 static const struct mca_error_handler intel_mce_dhandlers[] = {
     {intel_srao_check, intel_srao_dhandler},
     {intel_srar_check, intel_srar_dhandler},
-    {intel_default_check, intel_default_mce_dhandler}
+    {intel_default_check, intel_default_mce_handler}
 };
 
-static void intel_default_mce_uhandler(
-             struct mca_binfo *binfo,
-             enum mce_result *result,
-             const struct cpu_user_regs *regs)
-{
-    uint64_t status = binfo->mib->mc_status;
-    enum intel_mce_type type;
-
-    type = intel_check_mce_type(status);
-
-    switch (type)
-    {
-    case intel_mce_fatal:
-        *result = MCER_RESET;
-        break;
-    default:
-        *result = MCER_CONTINUE;
-        break;
-    }
-}
-
 static const struct mca_error_handler intel_mce_uhandlers[] = {
-    {intel_default_check, intel_default_mce_uhandler}
+    {intel_default_check, intel_default_mce_handler}
 };
 
 /* According to MCA OS writer guide, CMCI handler need to clear bank when
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 07/19] x86/vmce: include domain/vcpu id in debug messages
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (5 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 06/19] x86/mce: merge intel_default_mce_dhandler/uhandler() Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17 10:03   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve() Haozhong Zhang
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/vmce.c | 35 ++++++++++++++++++++---------------
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index 5f002e3..d83a3f2 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -110,15 +110,16 @@ static int bank_mce_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_IA32_MC0_CTL:
         /* stick all 1's to MCi_CTL */
         *val = ~0UL;
-        mce_printk(MCE_VERBOSE, "MCE: rd MC%u_CTL %#"PRIx64"\n", bank, *val);
+        mce_printk(MCE_VERBOSE, "MCE: %pv: rd MC%u_CTL %#"PRIx64"\n",
+                   v, bank, *val);
         break;
     case MSR_IA32_MC0_STATUS:
         if ( bank < GUEST_MC_BANK_NUM )
         {
             *val = v->arch.vmce.bank[bank].mci_status;
             if ( *val )
-                mce_printk(MCE_VERBOSE, "MCE: rd MC%u_STATUS %#"PRIx64"\n",
-                           bank, *val);
+                mce_printk(MCE_VERBOSE, "MCE: %pv: rd MC%u_STATUS %#"PRIx64"\n",
+                           v, bank, *val);
         }
         break;
     case MSR_IA32_MC0_ADDR:
@@ -126,8 +127,8 @@ static int bank_mce_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
         {
             *val = v->arch.vmce.bank[bank].mci_addr;
             if ( *val )
-                mce_printk(MCE_VERBOSE, "MCE: rd MC%u_ADDR %#"PRIx64"\n",
-                           bank, *val);
+                mce_printk(MCE_VERBOSE, "MCE: %pv: rd MC%u_ADDR %#"PRIx64"\n",
+                           v, bank, *val);
         }
         break;
     case MSR_IA32_MC0_MISC:
@@ -135,8 +136,8 @@ static int bank_mce_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
         {
             *val = v->arch.vmce.bank[bank].mci_misc;
             if ( *val )
-                mce_printk(MCE_VERBOSE, "MCE: rd MC%u_MISC %#"PRIx64"\n",
-                           bank, *val);
+                mce_printk(MCE_VERBOSE, "MCE: %pv: rd MC%u_MISC %#"PRIx64"\n",
+                           v, bank, *val);
         }
         break;
     default:
@@ -178,16 +179,16 @@ int vmce_rdmsr(uint32_t msr, uint64_t *val)
         *val = cur->arch.vmce.mcg_status;
         if (*val)
             mce_printk(MCE_VERBOSE,
-                       "MCE: rd MCG_STATUS %#"PRIx64"\n", *val);
+                       "MCE: %pv: rd MCG_STATUS %#"PRIx64"\n", cur, *val);
         break;
     case MSR_IA32_MCG_CAP:
         *val = cur->arch.vmce.mcg_cap;
-        mce_printk(MCE_VERBOSE, "MCE: rd MCG_CAP %#"PRIx64"\n", *val);
+        mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_CAP %#"PRIx64"\n", cur, *val);
         break;
     case MSR_IA32_MCG_CTL:
         if ( cur->arch.vmce.mcg_cap & MCG_CTL_P )
             *val = ~0ULL;
-        mce_printk(MCE_VERBOSE, "MCE: rd MCG_CTL %#"PRIx64"\n", *val);
+        mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_CTL %#"PRIx64"\n", cur, *val);
         break;
     default:
         ret = mce_bank_msr(cur, msr) ? bank_mce_rdmsr(cur, msr, val) : 0;
@@ -217,21 +218,24 @@ static int bank_mce_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
          */
         break;
     case MSR_IA32_MC0_STATUS:
-        mce_printk(MCE_VERBOSE, "MCE: wr MC%u_STATUS %#"PRIx64"\n", bank, val);
+        mce_printk(MCE_VERBOSE, "MCE: %pv: wr MC%u_STATUS %#"PRIx64"\n",
+                   v, bank, val);
         if ( val )
             ret = -1;
         else if ( bank < GUEST_MC_BANK_NUM )
             v->arch.vmce.bank[bank].mci_status = val;
         break;
     case MSR_IA32_MC0_ADDR:
-        mce_printk(MCE_VERBOSE, "MCE: wr MC%u_ADDR %#"PRIx64"\n", bank, val);
+        mce_printk(MCE_VERBOSE, "MCE: %pv: wr MC%u_ADDR %#"PRIx64"\n",
+                   v, bank, val);
         if ( val )
             ret = -1;
         else if ( bank < GUEST_MC_BANK_NUM )
             v->arch.vmce.bank[bank].mci_addr = val;
         break;
     case MSR_IA32_MC0_MISC:
-        mce_printk(MCE_VERBOSE, "MCE: wr MC%u_MISC %#"PRIx64"\n", bank, val);
+        mce_printk(MCE_VERBOSE, "MCE: %pv: wr MC%u_MISC %#"PRIx64"\n",
+                   v, bank, val);
         if ( val )
             ret = -1;
         else if ( bank < GUEST_MC_BANK_NUM )
@@ -275,7 +279,8 @@ int vmce_wrmsr(uint32_t msr, uint64_t val)
         break;
     case MSR_IA32_MCG_STATUS:
         cur->arch.vmce.mcg_status = val;
-        mce_printk(MCE_VERBOSE, "MCE: wr MCG_STATUS %"PRIx64"\n", val);
+        mce_printk(MCE_VERBOSE, "MCE: %pv: wr MCG_STATUS %"PRIx64"\n",
+                   cur, val);
         break;
     case MSR_IA32_MCG_CAP:
         /*
@@ -283,7 +288,7 @@ int vmce_wrmsr(uint32_t msr, uint64_t val)
          * the effect of writing to the IA32_MCG_CAP is undefined. Here we
          * treat writing as 'write not change'. Guest would not surprise.
          */
-        mce_printk(MCE_VERBOSE, "MCE: MCG_CAP is r/o\n");
+        mce_printk(MCE_VERBOSE, "MCE: %pv: MCG_CAP is r/o\n", cur);
         break;
     default:
         ret = mce_bank_msr(cur, msr) ? bank_mce_wrmsr(cur, msr, val) : 0;
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve()
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (6 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 07/19] x86/vmce: include domain/vcpu id in debug messages Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17 10:07   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case Haozhong Zhang
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

All existing calls to x86_mcinfo_reserve() are followed by statements
that set the size and the type of the reserved space, so move them into
x86_mcinfo_reserve() to simplify the code.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mcaction.c  |  4 +---
 xen/arch/x86/cpu/mcheck/mce.c       | 16 ++++++++--------
 xen/arch/x86/cpu/mcheck/mce.h       |  2 +-
 xen/arch/x86/cpu/mcheck/mce_amd.c   |  4 +---
 xen/arch/x86/cpu/mcheck/mce_intel.c |  6 +-----
 5 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index 9cf2499..cc90e7c 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -13,14 +13,12 @@ mci_action_add_pageoffline(int bank, struct mc_info *mi,
     if (!mi)
         return NULL;
 
-    rec = x86_mcinfo_reserve(mi, sizeof(*rec));
+    rec = x86_mcinfo_reserve(mi, sizeof(*rec), MC_TYPE_RECOVERY);
     if (!rec) {
         mi->flags |= MCINFO_FLAGS_UNCOMPLETE;
         return NULL;
     }
 
-    rec->common.type = MC_TYPE_RECOVERY;
-    rec->common.size = sizeof(*rec);
     rec->mc_bank = bank;
     rec->action_types = MC_ACTION_PAGE_OFFLINE;
     rec->action_info.page_retire.mfn = mfn;
diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index f682520..28bf579 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -204,7 +204,7 @@ static void mca_init_bank(enum mca_source who,
     if (!mi)
         return;
 
-    mib = x86_mcinfo_reserve(mi, sizeof(*mib));
+    mib = x86_mcinfo_reserve(mi, sizeof(*mib), MC_TYPE_BANK);
     if (!mib)
     {
         mi->flags |= MCINFO_FLAGS_UNCOMPLETE;
@@ -213,8 +213,6 @@ static void mca_init_bank(enum mca_source who,
 
     mib->mc_status = mca_rdmsr(MSR_IA32_MCx_STATUS(bank));
 
-    mib->common.type = MC_TYPE_BANK;
-    mib->common.size = sizeof (struct mcinfo_bank);
     mib->mc_bank = bank;
 
     if (mib->mc_status & MCi_STATUS_MISCV)
@@ -250,8 +248,6 @@ static int mca_init_global(uint32_t flags, struct mcinfo_global *mig)
     struct domain *d;
 
     /* Set global information */
-    mig->common.type = MC_TYPE_GLOBAL;
-    mig->common.size = sizeof (struct mcinfo_global);
     status = mca_rdmsr(MSR_IA32_MCG_STATUS);
     mig->mc_gstatus = status;
     mig->mc_domid = mig->mc_vcpuid = -1;
@@ -351,7 +347,7 @@ mcheck_mca_logout(enum mca_source who, struct mca_banks *bankmask,
             if ( (mctc = mctelem_reserve(which)) != NULL ) {
                 mci = mctelem_dataptr(mctc);
                 mcinfo_clear(mci);
-                mig = x86_mcinfo_reserve(mci, sizeof(*mig));
+                mig = x86_mcinfo_reserve(mci, sizeof(*mig), MC_TYPE_GLOBAL);
                 /* mc_info should at least hold up the global information */
                 ASSERT(mig);
                 mca_init_global(mc_flags, mig);
@@ -804,7 +800,7 @@ static void mcinfo_clear(struct mc_info *mi)
     x86_mcinfo_nentries(mi) = 0;
 }
 
-void *x86_mcinfo_reserve(struct mc_info *mi, int size)
+void *x86_mcinfo_reserve(struct mc_info *mi, uint16_t size, uint16_t type)
 {
     int i;
     unsigned long end1, end2;
@@ -831,7 +827,11 @@ void *x86_mcinfo_reserve(struct mc_info *mi, int size)
     /* there's enough space. add entry. */
     x86_mcinfo_nentries(mi)++;
 
-    return memset(mic_index, 0, size);
+    memset(mic_index, 0, size);
+    mic_index->size = size;
+    mic_index->type = type;
+
+    return mic_index;
 }
 
 static void x86_mcinfo_apei_save(
diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index 56877c1..2f4e7a4 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -146,7 +146,7 @@ typedef struct mcinfo_extended *(*x86_mce_callback_t)
     (struct mc_info *, uint16_t, uint64_t);
 extern void x86_mce_callback_register(x86_mce_callback_t);
 
-void *x86_mcinfo_reserve(struct mc_info *mi, int size);
+void *x86_mcinfo_reserve(struct mc_info *mi, uint16_t size, uint16_t type);
 void x86_mcinfo_dump(struct mc_info *mi);
 
 static inline int mce_vendor_bank_msr(const struct vcpu *v, uint32_t msr)
diff --git a/xen/arch/x86/cpu/mcheck/mce_amd.c b/xen/arch/x86/cpu/mcheck/mce_amd.c
index 599e465..fe51be9 100644
--- a/xen/arch/x86/cpu/mcheck/mce_amd.c
+++ b/xen/arch/x86/cpu/mcheck/mce_amd.c
@@ -218,15 +218,13 @@ amd_f10_handler(struct mc_info *mi, uint16_t bank, uint64_t status)
     if ( !(status & MCi_STATUS_MISCV) )
         return NULL;
 
-    mc_ext = x86_mcinfo_reserve(mi, sizeof(*mc_ext));
+    mc_ext = x86_mcinfo_reserve(mi, sizeof(*mc_ext), MC_TYPE_EXTENDED);
     if ( !mc_ext )
     {
         mi->flags |= MCINFO_FLAGS_UNCOMPLETE;
         return NULL;
     }
 
-    mc_ext->common.type = MC_TYPE_EXTENDED;
-    mc_ext->common.size = sizeof(*mc_ext);
     mc_ext->mc_msrs = 3;
 
     mc_ext->mc_msr[0].reg = MSR_F10_MC4_MISC1;
diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
index b5ee8b8..9e5ee3d 100644
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
@@ -200,17 +200,13 @@ intel_get_extended_msrs(struct mcinfo_global *mig, struct mc_info *mi)
             !(mig->mc_gstatus & MCG_STATUS_EIPV))
         return NULL;
 
-    mc_ext = x86_mcinfo_reserve(mi, sizeof(*mc_ext));
+    mc_ext = x86_mcinfo_reserve(mi, sizeof(*mc_ext), MC_TYPE_EXTENDED);
     if (!mc_ext)
     {
         mi->flags |= MCINFO_FLAGS_UNCOMPLETE;
         return NULL;
     }
 
-    /* this function will called when CAP(9).MCG_EXT_P = 1 */
-    mc_ext->common.type = MC_TYPE_EXTENDED;
-    mc_ext->common.size = sizeof(struct mcinfo_extended);
-
     for (i = MSR_IA32_MCG_EAX; i <= MSR_IA32_MCG_R15; i++)
         intel_get_extended_msr(mc_ext, i);
 
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (7 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve() Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17 10:21   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU Haozhong Zhang
                   ` (9 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

The current implementation only fills MC MSRs on vcpu0 and leaves MC
MSRs on other vcpus empty in the broadcast case. When guest reads 0
from MSR_IA32_MCG_STATUS on vcpuN (N > 0), it may think it's not
possible to recover the execution on that vcpu and then get panic,
although MSR_IA32_MCG_STATUS filled on vcpu0 may imply the injected
vMCE is actually recoverable. To avoid such unnecessary guest panic,
set MSR_IA32_MCG_STATUS on vcpuN (N > 0) to MCG_STATUS_MCIP |
MCG_STATUS_RIPV.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mcaction.c | 14 ++++----
 xen/arch/x86/cpu/mcheck/vmce.c     | 67 +++++++++++++++++++++++++-------------
 xen/arch/x86/cpu/mcheck/vmce.h     |  2 +-
 3 files changed, 53 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index cc90e7c..8b2b834 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -88,21 +88,21 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
                     goto vmce_failed;
                 }
 
+                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+                    vmce_vcpuid = VMCE_INJECT_BROADCAST;
+                else
+                    vmce_vcpuid = global->mc_vcpuid;
+
                 bank->mc_addr = gfn << PAGE_SHIFT |
                   (bank->mc_addr & (PAGE_SIZE -1 ));
-                if ( fill_vmsr_data(bank, d,
-                      global->mc_gstatus) == -1 )
+                if ( fill_vmsr_data(bank, d, global->mc_gstatus,
+                                    vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 )
                 {
                     mce_printk(MCE_QUIET, "Fill vMCE# data for DOM%d "
                       "failed\n", bank->mc_domid);
                     goto vmce_failed;
                 }
 
-                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-                    vmce_vcpuid = VMCE_INJECT_BROADCAST;
-                else
-                    vmce_vcpuid = global->mc_vcpuid;
-
                 /* We will inject vMCE to DOMU*/
                 if ( inject_vmce(d, vmce_vcpuid) < 0 )
                 {
diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index d83a3f2..456d6f3 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -386,36 +386,59 @@ int inject_vmce(struct domain *d, int vcpu)
     return ret;
 }
 
+static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status,
+                             uint64_t mci_status, uint64_t mci_addr,
+                             uint64_t mci_misc)
+{
+    if ( v->arch.vmce.mcg_status & MCG_STATUS_MCIP )
+    {
+        mce_printk(MCE_QUIET, "MCE: %pv: guest has not handled previous"
+                   " vMCE yet!\n", v);
+        return -EBUSY;
+    }
+
+    spin_lock(&v->arch.vmce.lock);
+
+    v->arch.vmce.mcg_status = mcg_status;
+    /*
+     * 1. Skip bank 0 to avoid 'bank 0 quirk' of old processors
+     * 2. Filter MCi_STATUS MSCOD model specific error code to guest
+     */
+    v->arch.vmce.bank[1].mci_status = mci_status & MCi_STATUS_MSCOD_MASK;
+    v->arch.vmce.bank[1].mci_addr = mci_addr;
+    v->arch.vmce.bank[1].mci_misc = mci_misc;
+
+    spin_unlock(&v->arch.vmce.lock);
+
+    return 0;
+}
+
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
-                   uint64_t gstatus)
+                   uint64_t gstatus, bool broadcast)
 {
     struct vcpu *v = d->vcpu[0];
+    int ret;
 
-    if ( mc_bank->mc_domid != (uint16_t)~0 )
-    {
-        if ( v->arch.vmce.mcg_status & MCG_STATUS_MCIP )
-        {
-            mce_printk(MCE_QUIET, "MCE: guest has not handled previous"
-                       " vMCE yet!\n");
-            return -1;
-        }
-
-        spin_lock(&v->arch.vmce.lock);
+    if ( mc_bank->mc_domid == (uint16_t)~0 )
+        return -EINVAL;
 
-        v->arch.vmce.mcg_status = gstatus;
-        /*
-         * 1. Skip bank 0 to avoid 'bank 0 quirk' of old processors
-         * 2. Filter MCi_STATUS MSCOD model specific error code to guest
-         */
-        v->arch.vmce.bank[1].mci_status = mc_bank->mc_status &
-                                              MCi_STATUS_MSCOD_MASK;
-        v->arch.vmce.bank[1].mci_addr = mc_bank->mc_addr;
-        v->arch.vmce.bank[1].mci_misc = mc_bank->mc_misc;
+    ret = vcpu_fill_mc_msrs(v, gstatus, mc_bank->mc_status,
+                            mc_bank->mc_addr, mc_bank->mc_misc);
+    if ( ret || !broadcast )
+        goto out;
 
-        spin_unlock(&v->arch.vmce.lock);
+    for_each_vcpu ( d, v )
+    {
+        if ( v == d->vcpu[0] )
+            continue;
+        ret = vcpu_fill_mc_msrs(v, MCG_STATUS_MCIP | MCG_STATUS_RIPV,
+                                0, 0, 0);
+        if ( ret )
+            break;
     }
 
-    return 0;
+ out:
+    return ret;
 }
 
 /* It's said some ram is setup as mmio_direct for UC cache attribute */
diff --git a/xen/arch/x86/cpu/mcheck/vmce.h b/xen/arch/x86/cpu/mcheck/vmce.h
index 163ce3c..74f6381 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.h
+++ b/xen/arch/x86/cpu/mcheck/vmce.h
@@ -17,7 +17,7 @@ int vmce_amd_rdmsr(const struct vcpu *, uint32_t msr, uint64_t *val);
 int vmce_amd_wrmsr(struct vcpu *, uint32_t msr, uint64_t val);
 
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
-    uint64_t gstatus);
+                   uint64_t gstatus, bool broadcast);
 
 #define VMCE_INJECT_BROADCAST (-1)
 int inject_vmce(struct domain *d, int vcpu);
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (8 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17 10:26   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 11/19] tools/xen-mceinj: fix the type of cpu number Haozhong Zhang
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

An attemp to write to MSR_IA32_MCG_STATUS with any value other than 0
would result in #GP on Intel CPU.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mce.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 28bf579..95a9da3 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -538,7 +538,14 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
     gstatus = mca_rdmsr(MSR_IA32_MCG_STATUS);
     if ((gstatus & MCG_STATUS_MCIP) != 0) {
         mce_printk(MCE_CRITICAL, "MCE: Clear MCIP@ last step");
-        mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
+        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+            /*
+             * Intel SDM 3: An attempt to write to IA32_MCG_STATUS
+             * with any value other than 0 would result in #GP.
+             */
+            mca_wrmsr(MSR_IA32_MCG_STATUS, 0);
+        else
+            mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
     }
     mce_barrier_exit(&mce_trap_bar);
 
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 11/19] tools/xen-mceinj: fix the type of cpu number
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (9 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-17 10:08   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 12/19] x86/mce: handle LMCE locally Haozhong Zhang
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Haozhong Zhang, Ian Jackson, Wei Liu

Use uint32_t rather than int to align to the type of
xen_mc_physcpuinfo.ncpus.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 tools/tests/mce-test/tools/xen-mceinj.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tools/tests/mce-test/tools/xen-mceinj.c b/tools/tests/mce-test/tools/xen-mceinj.c
index 51abc8a..5f70a61 100644
--- a/tools/tests/mce-test/tools/xen-mceinj.c
+++ b/tools/tests/mce-test/tools/xen-mceinj.c
@@ -161,7 +161,7 @@ static int flush_msr_inj(xc_interface *xc_handle)
     return xc_mca_op(xc_handle, &mc);
 }
 
-static int mca_cpuinfo(xc_interface *xc_handle)
+static uint32_t mca_cpuinfo(xc_interface *xc_handle)
 {
     struct xen_mc mc;
 
@@ -176,16 +176,19 @@ static int mca_cpuinfo(xc_interface *xc_handle)
         return 0;
 }
 
-static int inject_cmci(xc_interface *xc_handle, int cpu_nr)
+static int inject_cmci(xc_interface *xc_handle, uint32_t cpu_nr)
 {
     struct xen_mc mc;
-    int nr_cpus;
+    uint32_t nr_cpus;
 
     memset(&mc, 0, sizeof(struct xen_mc));
 
     nr_cpus = mca_cpuinfo(xc_handle);
     if (!nr_cpus)
         err(xc_handle, "Failed to get mca_cpuinfo");
+    if (cpu_nr >= nr_cpus)
+        err(xc_handle, "-c %"PRIu32" is larger than %"PRIu32,
+            cpu_nr, nr_cpus - 1);
 
     mc.cmd = XEN_MC_inject_v2;
     mc.interface_version = XEN_MCA_INTERFACE_VERSION;
@@ -420,7 +423,7 @@ int main(int argc, char *argv[])
     int c, opt_index;
     uint32_t domid;
     xc_interface *xc_handle;
-    int cpu_nr;
+    uint32_t cpu_nr;
     uint64_t gaddr, max_gpa;
 
     /* Default Value */
@@ -444,7 +447,7 @@ int main(int argc, char *argv[])
             dump=1;
             break;
         case 'c':
-            cpu_nr = strtol(optarg, &optarg, 10);
+            cpu_nr = strtoul(optarg, &optarg, 10);
             if ( strlen(optarg) != 0 )
                 err(xc_handle, "Please input a digit parameter for CPU");
             break;
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 12/19] x86/mce: handle LMCE locally
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (10 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 11/19] tools/xen-mceinj: fix the type of cpu number Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-22 13:53   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 13/19] x86/mce_intel: detect and enable LMCE on Intel host Haozhong Zhang
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

LMCE is sent to only one CPU thread, so MCE handler, barriers and
softirq handler should go without waiting for other CPUs, when
handling LMCE. Note LMCE is still broadcast to all vcpus as regular
MCE on Intel CPU right now.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/barrier.c  |  4 ++--
 xen/arch/x86/cpu/mcheck/mcaction.c |  4 +++-
 xen/arch/x86/cpu/mcheck/mce.c      | 25 ++++++++++++++++++++++---
 xen/arch/x86/cpu/mcheck/mce.h      |  3 +++
 xen/arch/x86/cpu/mcheck/x86_mca.h  |  4 +++-
 5 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/barrier.c b/xen/arch/x86/cpu/mcheck/barrier.c
index 5dce1fb..869fd20 100644
--- a/xen/arch/x86/cpu/mcheck/barrier.c
+++ b/xen/arch/x86/cpu/mcheck/barrier.c
@@ -20,7 +20,7 @@ void mce_barrier_enter(struct mce_softirq_barrier *bar)
 {
     int gen;
 
-    if (!mce_broadcast)
+    if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )
         return;
     atomic_inc(&bar->ingen);
     gen = atomic_read(&bar->outgen);
@@ -38,7 +38,7 @@ void mce_barrier_exit(struct mce_softirq_barrier *bar)
 {
     int gen;
 
-    if ( !mce_broadcast )
+    if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )
         return;
     atomic_inc(&bar->outgen);
     gen = atomic_read(&bar->ingen);
diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index 8b2b834..90c68ff 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -95,7 +95,9 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
 
                 bank->mc_addr = gfn << PAGE_SHIFT |
                   (bank->mc_addr & (PAGE_SIZE -1 ));
-                if ( fill_vmsr_data(bank, d, global->mc_gstatus,
+                /* TODO: support injecting LMCE */
+                if ( fill_vmsr_data(bank, d,
+                                    global->mc_gstatus & ~MCG_STATUS_LMCE,
                                     vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 )
                 {
                     mce_printk(MCE_QUIET, "Fill vMCE# data for DOM%d "
diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 95a9da3..2d69222 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -42,6 +42,17 @@ DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, poll_bankmask);
 DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, no_cmci_banks);
 DEFINE_PER_CPU_READ_MOSTLY(struct mca_banks *, mce_clear_banks);
 
+/*
+ * Flag to indicate whether the current MCE on this CPU is a LMCE.
+ *
+ * The MCE handler should set/clear this flag before entering any MCE
+ * barriers and raising MCE softirq. MCE barriers rely on this flag to
+ * decide whether they need to wait for other CPUs. MCE softirq handler
+ * relies on this flag to decide whether it needs to handle pending
+ * MCEs on other CPUs.
+ */
+DEFINE_PER_CPU(bool, lmce_in_process);
+
 static void intpose_init(void);
 static void mcinfo_clear(struct mc_info *);
 struct mca_banks *mca_allbanks;
@@ -399,6 +410,7 @@ mcheck_mca_logout(enum mca_source who, struct mca_banks *bankmask,
         sp->errcnt = errcnt;
         sp->ripv = (gstatus & MCG_STATUS_RIPV) != 0;
         sp->eipv = (gstatus & MCG_STATUS_EIPV) != 0;
+        sp->lmce = (gstatus & MCG_STATUS_LMCE) != 0;
         sp->uc = uc;
         sp->pcc = pcc;
         sp->recoverable = recover;
@@ -462,6 +474,7 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
     uint64_t gstatus;
     mctelem_cookie_t mctc = NULL;
     struct mca_summary bs;
+    bool *lmce_in_process = &__get_cpu_var(lmce_in_process);
 
     mce_spin_lock(&mce_logout_lock);
 
@@ -505,6 +518,8 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
     }
     mce_spin_unlock(&mce_logout_lock);
 
+    *lmce_in_process = bs.lmce;
+
     mce_barrier_enter(&mce_trap_bar);
     if ( mctc != NULL && mce_urgent_action(regs, mctc))
         cpumask_set_cpu(smp_processor_id(), &mce_fatal_cpus);
@@ -1709,6 +1724,7 @@ static void mce_softirq(void)
 {
     int cpu = smp_processor_id();
     unsigned int workcpu;
+    bool lmce = per_cpu(lmce_in_process, cpu);
 
     mce_printk(MCE_VERBOSE, "CPU%d enter softirq\n", cpu);
 
@@ -1738,9 +1754,12 @@ static void mce_softirq(void)
         /* Step1: Fill DOM0 LOG buffer, vMCE injection buffer and
          * vMCE MSRs virtualization buffer
          */
-        for_each_online_cpu(workcpu) {
-            mctelem_process_deferred(workcpu, mce_delayed_action);
-        }
+        if ( lmce )
+            mctelem_process_deferred(cpu, mce_delayed_action);
+        else
+            for_each_online_cpu(workcpu) {
+                mctelem_process_deferred(workcpu, mce_delayed_action);
+            }
 
         /* Step2: Send Log to DOM0 through vIRQ */
         if (dom0_vmce_enabled()) {
diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index 2f4e7a4..2c033af 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -110,12 +110,15 @@ struct mca_summary {
     bool_t      uc;     /* UC flag */
     bool_t      pcc;    /* PCC flag */
     bool_t      recoverable; /* software error recoverable flag */
+    bool_t      lmce;   /* LMCE flag (Intel specific) */
 };
 
 DECLARE_PER_CPU(struct mca_banks *, poll_bankmask);
 DECLARE_PER_CPU(struct mca_banks *, no_cmci_banks);
 DECLARE_PER_CPU(struct mca_banks *, mce_clear_banks);
 
+DECLARE_PER_CPU(bool, lmce_in_process);
+
 extern bool_t cmci_support;
 extern bool_t is_mc_panic;
 extern bool_t mce_broadcast;
diff --git a/xen/arch/x86/cpu/mcheck/x86_mca.h b/xen/arch/x86/cpu/mcheck/x86_mca.h
index e25d619..322b7d4 100644
--- a/xen/arch/x86/cpu/mcheck/x86_mca.h
+++ b/xen/arch/x86/cpu/mcheck/x86_mca.h
@@ -42,7 +42,9 @@
 #define MCG_STATUS_RIPV         0x0000000000000001ULL
 #define MCG_STATUS_EIPV         0x0000000000000002ULL
 #define MCG_STATUS_MCIP         0x0000000000000004ULL
-/* Bits 3-63 are reserved */
+#define MCG_STATUS_LMCE         0x0000000000000008ULL  /* Intel specific */
+/* Bits 3-63 are reserved on CPU not supporting LMCE */
+/* Bits 4-63 are reserved on CPU supporting LMCE */
 
 /* Bitfield of MSR_K8_MCi_STATUS registers */
 /* MCA error code */
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 13/19] x86/mce_intel: detect and enable LMCE on Intel host
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (11 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 12/19] x86/mce: handle LMCE locally Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-22 15:10   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 14/19] x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL Haozhong Zhang
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

Enable LMCE if it's supported by the host CPU. If Xen boot parameter
"mce_fb = 1" is present, LMCE will be disabled forcibly.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mce.h       |  1 +
 xen/arch/x86/cpu/mcheck/mce_intel.c | 44 ++++++++++++++++++++++++++++++++-----
 xen/arch/x86/cpu/mcheck/x86_mca.h   |  5 +++++
 xen/include/asm-x86/msr-index.h     |  2 ++
 4 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce.h b/xen/arch/x86/cpu/mcheck/mce.h
index 2c033af..461141a 100644
--- a/xen/arch/x86/cpu/mcheck/mce.h
+++ b/xen/arch/x86/cpu/mcheck/mce.h
@@ -38,6 +38,7 @@ enum mcheck_type {
 };
 
 extern uint8_t cmci_apic_vector;
+extern bool lmce_support;
 
 /* Init functions */
 enum mcheck_type amd_mcheck_init(struct cpuinfo_x86 *c);
diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
index 9e5ee3d..b4cc41a 100644
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
@@ -29,6 +29,9 @@ boolean_param("mce_fb", mce_force_broadcast);
 
 static int __read_mostly nr_intel_ext_msrs;
 
+/* If mce_force_broadcast == 1, lmce_support will be disabled forcibly. */
+bool __read_mostly lmce_support = 0;
+
 /* Intel SDM define bit15~bit0 of IA32_MCi_STATUS as the MC error code */
 #define INTEL_MCCOD_MASK 0xFFFF
 
@@ -677,10 +680,34 @@ static int mce_is_broadcast(struct cpuinfo_x86 *c)
     return 0;
 }
 
+static bool intel_enable_lmce(void)
+{
+    uint64_t msr_content;
+
+    /*
+     * Section "Enabling Local Machine Check" in Intel SDM Vol 3
+     * requires software must ensure the LOCK bit and LMCE_ON bit
+     * of MSR_IA32_FEATURE_CONTROL are set before setting
+     * MSR_IA32_MCG_EXT_CTL.LMCE_EN.
+     */
+
+    if ( rdmsr_safe(MSR_IA32_FEATURE_CONTROL, msr_content) )
+        return 0;
+
+    if ( msr_content &
+         (IA32_FEATURE_CONTROL_LOCK | IA32_FEATURE_CONTROL_LMCE_ON) )
+    {
+        wrmsrl(MSR_IA32_MCG_EXT_CTL, MCG_EXT_CTL_LMCE_EN);
+        return 1;
+    }
+
+    return 0;
+}
+
 /* Check and init MCA */
 static void intel_init_mca(struct cpuinfo_x86 *c)
 {
-    bool_t broadcast, cmci = 0, ser = 0;
+    bool_t broadcast, cmci = 0, ser = 0, lmce = 0;
     int ext_num = 0, first;
     uint64_t msr_content;
 
@@ -700,26 +727,31 @@ static void intel_init_mca(struct cpuinfo_x86 *c)
 
     first = mce_firstbank(c);
 
+    if ( !mce_force_broadcast && (msr_content & MCG_LMCE_P) )
+        lmce = intel_enable_lmce();
+
     if (smp_processor_id() == 0)
     {
         dprintk(XENLOG_INFO, "MCA Capability: BCAST %x SER %x"
-                " CMCI %x firstbank %x extended MCE MSR %x\n",
-                broadcast, ser, cmci, first, ext_num);
+                " CMCI %x firstbank %x extended MCE MSR %x LMCE %x\n",
+                broadcast, ser, cmci, first, ext_num, lmce);
 
         mce_broadcast = broadcast;
         cmci_support = cmci;
         ser_support = ser;
         nr_intel_ext_msrs = ext_num;
         firstbank = first;
+        lmce_support = lmce;
     }
     else if (cmci != cmci_support || ser != ser_support ||
              broadcast != mce_broadcast ||
-             first != firstbank || ext_num != nr_intel_ext_msrs)
+             first != firstbank || ext_num != nr_intel_ext_msrs ||
+             lmce != lmce_support)
     {
         dprintk(XENLOG_WARNING,
-                "CPU %u has different MCA capability (%x,%x,%x,%x,%x)"
+                "CPU %u has different MCA capability (%x,%x,%x,%x,%x,%x)"
                 " than BSP, may cause undetermined result!!!\n",
-                smp_processor_id(), broadcast, ser, cmci, first, ext_num);
+                smp_processor_id(), broadcast, ser, cmci, first, ext_num, lmce);
     }
 }
 
diff --git a/xen/arch/x86/cpu/mcheck/x86_mca.h b/xen/arch/x86/cpu/mcheck/x86_mca.h
index 322b7d4..3b5060e 100644
--- a/xen/arch/x86/cpu/mcheck/x86_mca.h
+++ b/xen/arch/x86/cpu/mcheck/x86_mca.h
@@ -36,6 +36,7 @@
 #define MCG_TES_P               (1ULL<<11) /* Intel specific */
 #define MCG_EXT_CNT             16         /* Intel specific */
 #define MCG_SER_P               (1ULL<<24) /* Intel specific */
+#define MCG_LMCE_P              (1ULL<<27) /* Intel specific */
 /* Other bits are reserved */
 
 /* Bitfield of the MSR_IA32_MCG_STATUS register */
@@ -46,6 +47,10 @@
 /* Bits 3-63 are reserved on CPU not supporting LMCE */
 /* Bits 4-63 are reserved on CPU supporting LMCE */
 
+/* Bitfield of MSR_IA32_MCG_EXT_CTL register (Intel Specific) */
+#define MCG_EXT_CTL_LMCE_EN     (1ULL<<0)
+/* Other bits are reserved */
+
 /* Bitfield of MSR_K8_MCi_STATUS registers */
 /* MCA error code */
 #define MCi_STATUS_MCA          0x000000000000ffffULL
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 98dbff1..f0bc574 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -51,6 +51,7 @@
 #define MSR_IA32_MCG_CAP		0x00000179
 #define MSR_IA32_MCG_STATUS		0x0000017a
 #define MSR_IA32_MCG_CTL		0x0000017b
+#define MSR_IA32_MCG_EXT_CTL	0x000004d0
 
 #define MSR_IA32_PEBS_ENABLE		0x000003f1
 #define MSR_IA32_DS_AREA		0x00000600
@@ -294,6 +295,7 @@
 #define IA32_FEATURE_CONTROL_SENTER_PARAM_CTL         0x7f00
 #define IA32_FEATURE_CONTROL_ENABLE_SENTER            0x8000
 #define IA32_FEATURE_CONTROL_SGX_ENABLE               0x40000
+#define IA32_FEATURE_CONTROL_LMCE_ON                  0x100000
 
 #define MSR_IA32_TSC_ADJUST		0x0000003b
 
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 14/19] x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (12 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 13/19] x86/mce_intel: detect and enable LMCE on Intel host Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-22 15:20   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL Haozhong Zhang
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Jan Beulich, Andrew Cooper,
	Christoph Egger, Jun Nakajima, Liu Jinsong

If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, then set LMCE and
LOCK bits in guest MSR_IA32_FEATURE_CONTROL. Intel SDM requires those
bits are set before SW can enable LMCE.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
---
 xen/arch/x86/cpu/mcheck/mce_intel.c |  4 ++++
 xen/arch/x86/hvm/vmx/vmx.c          | 10 ++++++++++
 xen/arch/x86/hvm/vmx/vvmx.c         |  4 ----
 xen/include/asm-x86/mce.h           |  1 +
 4 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c b/xen/arch/x86/cpu/mcheck/mce_intel.c
index b4cc41a..81507d3 100644
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
@@ -916,3 +916,7 @@ int vmce_intel_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
     return 1;
 }
 
+bool vmce_support_lmce(const struct vcpu *v)
+{
+    return !!(v->arch.vmce.mcg_cap & MCG_LMCE_P);
+}
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 42f4fbd..6947af0 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -56,6 +56,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <asm/altp2m.h>
 #include <asm/event.h>
+#include <asm/mce.h>
 #include <asm/monitor.h>
 #include <public/arch-x86/cpuid.h>
 
@@ -2634,6 +2635,9 @@ static int is_last_branch_msr(u32 ecx)
 
 static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
 {
+    struct vcpu *v = current;
+    struct domain *d = v->domain;
+
     HVM_DBG_LOG(DBG_LEVEL_MSR, "ecx=%#x", msr);
 
     switch ( msr )
@@ -2651,6 +2655,12 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         __vmread(GUEST_IA32_DEBUGCTL, msr_content);
         break;
     case MSR_IA32_FEATURE_CONTROL:
+        *msr_content = IA32_FEATURE_CONTROL_LOCK;
+        if ( vmce_support_lmce(v) )
+            *msr_content |= IA32_FEATURE_CONTROL_LMCE_ON;
+        if ( nestedhvm_enabled(d) && d->arch.cpuid->basic.vmx )
+            *msr_content |= IA32_FEATURE_CONTROL_ENABLE_VMXON_OUTSIDE_SMX;
+        break;
     case MSR_IA32_VMX_BASIC...MSR_IA32_VMX_VMFUNC:
         if ( !nvmx_msr_read_intercept(msr, msr_content) )
             goto gp_fault;
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index ec3b946..0060723 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -2020,10 +2020,6 @@ int nvmx_msr_read_intercept(unsigned int msr, u64 *msr_content)
         data = gen_vmx_msr(data, VMX_ENTRY_CTLS_DEFAULT1, host_data);
         break;
 
-    case MSR_IA32_FEATURE_CONTROL:
-        data = IA32_FEATURE_CONTROL_LOCK |
-               IA32_FEATURE_CONTROL_ENABLE_VMXON_OUTSIDE_SMX;
-        break;
     case MSR_IA32_VMX_VMCS_ENUM:
         /* The max index of VVMCS encoding is 0x1f. */
         data = 0x1f << 1;
diff --git a/xen/include/asm-x86/mce.h b/xen/include/asm-x86/mce.h
index 549bef3..6b827ef 100644
--- a/xen/include/asm-x86/mce.h
+++ b/xen/include/asm-x86/mce.h
@@ -36,6 +36,7 @@ extern void vmce_init_vcpu(struct vcpu *);
 extern int vmce_restore_vcpu(struct vcpu *, const struct hvm_vmce_vcpu *);
 extern int vmce_wrmsr(uint32_t msr, uint64_t val);
 extern int vmce_rdmsr(uint32_t msr, uint64_t *val);
+extern bool vmce_support_lmce(const struct vcpu *v);
 
 extern unsigned int nr_mce_banks;
 
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (13 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 14/19] x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-22 15:36   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 16/19] x86/vmce: enable injecting LMCE to guest on Intel host Haozhong Zhang
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, then allow guest
to read/write MSR_IA32_MCG_EXT_CTL.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/vmce.c         | 32 +++++++++++++++++++++++++++++++-
 xen/include/asm-x86/mce.h              |  1 +
 xen/include/public/arch-x86/hvm/save.h |  2 ++
 3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index 456d6f3..1278839 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -90,6 +90,7 @@ int vmce_restore_vcpu(struct vcpu *v, const struct hvm_vmce_vcpu *ctxt)
     v->arch.vmce.mcg_cap = ctxt->caps;
     v->arch.vmce.bank[0].mci_ctl2 = ctxt->mci_ctl2_bank0;
     v->arch.vmce.bank[1].mci_ctl2 = ctxt->mci_ctl2_bank1;
+    v->arch.vmce.lmce_enabled = ctxt->lmce_enabled;
 
     return 0;
 }
@@ -190,6 +191,25 @@ int vmce_rdmsr(uint32_t msr, uint64_t *val)
             *val = ~0ULL;
         mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_CTL %#"PRIx64"\n", cur, *val);
         break;
+    case MSR_IA32_MCG_EXT_CTL:
+        /*
+         * If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, the LMCE and LOCK
+         * bits are always set in guest MSR_IA32_FEATURE_CONTROL by Xen , so it
+         * does not need to check them here.
+         */
+        if ( !(cur->arch.vmce.mcg_cap & MCG_LMCE_P) )
+        {
+            ret = -1;
+            mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_EXT_CTL, not supported\n",
+                       cur);
+        }
+        else
+        {
+            *val = cur->arch.vmce.lmce_enabled ? MCG_EXT_CTL_LMCE_EN : 0;
+            mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_EXT_CTL %#"PRIx64"\n",
+                       cur, *val);
+        }
+        break;
     default:
         ret = mce_bank_msr(cur, msr) ? bank_mce_rdmsr(cur, msr, val) : 0;
         break;
@@ -290,6 +310,15 @@ int vmce_wrmsr(uint32_t msr, uint64_t val)
          */
         mce_printk(MCE_VERBOSE, "MCE: %pv: MCG_CAP is r/o\n", cur);
         break;
+    case MSR_IA32_MCG_EXT_CTL:
+        if ( !(cur->arch.vmce.mcg_cap & MCG_LMCE_P) ||
+             (val & ~MCG_EXT_CTL_LMCE_EN) )
+            ret = -1;
+        else
+            cur->arch.vmce.lmce_enabled = !!(val & MCG_EXT_CTL_LMCE_EN);
+        mce_printk(MCE_VERBOSE, "MCE: %pv: wr MCG_EXT_CTL %"PRIx64"%s\n",
+                   cur, val, (ret == -1) ? ", not supported" : "");
+        break;
     default:
         ret = mce_bank_msr(cur, msr) ? bank_mce_wrmsr(cur, msr, val) : 0;
         break;
@@ -308,7 +337,8 @@ static int vmce_save_vcpu_ctxt(struct domain *d, hvm_domain_context_t *h)
         struct hvm_vmce_vcpu ctxt = {
             .caps = v->arch.vmce.mcg_cap,
             .mci_ctl2_bank0 = v->arch.vmce.bank[0].mci_ctl2,
-            .mci_ctl2_bank1 = v->arch.vmce.bank[1].mci_ctl2
+            .mci_ctl2_bank1 = v->arch.vmce.bank[1].mci_ctl2,
+            .lmce_enabled = v->arch.vmce.lmce_enabled,
         };
 
         err = hvm_save_entry(VMCE_VCPU, v->vcpu_id, h, &ctxt);
diff --git a/xen/include/asm-x86/mce.h b/xen/include/asm-x86/mce.h
index 6b827ef..525a9e8 100644
--- a/xen/include/asm-x86/mce.h
+++ b/xen/include/asm-x86/mce.h
@@ -29,6 +29,7 @@ struct vmce {
     uint64_t mcg_status;
     spinlock_t lock;
     struct vmce_bank bank[GUEST_MC_BANK_NUM];
+    bool lmce_enabled;
 };
 
 /* Guest vMCE MSRs virtualization */
diff --git a/xen/include/public/arch-x86/hvm/save.h b/xen/include/public/arch-x86/hvm/save.h
index 8d73b51..2d62ec3 100644
--- a/xen/include/public/arch-x86/hvm/save.h
+++ b/xen/include/public/arch-x86/hvm/save.h
@@ -599,6 +599,8 @@ struct hvm_vmce_vcpu {
     uint64_t caps;
     uint64_t mci_ctl2_bank0;
     uint64_t mci_ctl2_bank1;
+    uint8_t  lmce_enabled;
+    uint8_t  _pad[7];
 };
 
 DECLARE_HVM_SAVE_TYPE(VMCE_VCPU, 18, struct hvm_vmce_vcpu);
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 16/19] x86/vmce: enable injecting LMCE to guest on Intel host
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (14 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-22 15:48   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP Haozhong Zhang
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

Inject LMCE to guest if the host MCE is LMCE and the affected vcpu is
known. Otherwise, broadcast MCE to all vcpus on Intel host.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mcaction.c | 14 ++++++++------
 xen/arch/x86/cpu/mcheck/vmce.c     |  9 ++++++++-
 xen/arch/x86/cpu/mcheck/vmce.h     |  2 +-
 3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/cpu/mcheck/mcaction.c b/xen/arch/x86/cpu/mcheck/mcaction.c
index 90c68ff..3410bfd 100644
--- a/xen/arch/x86/cpu/mcheck/mcaction.c
+++ b/xen/arch/x86/cpu/mcheck/mcaction.c
@@ -88,17 +88,19 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
                     goto vmce_failed;
                 }
 
-                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
+                vmce_vcpuid = global->mc_vcpuid;
+                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+                     (vmce_vcpuid == -1 ||
+                      global->mc_domid != bank->mc_domid ||
+                      !(global->mc_gstatus & MCG_STATUS_LMCE) ||
+                      !d->vcpu[vmce_vcpuid]->arch.vmce.lmce_enabled) )
                     vmce_vcpuid = VMCE_INJECT_BROADCAST;
-                else
-                    vmce_vcpuid = global->mc_vcpuid;
 
                 bank->mc_addr = gfn << PAGE_SHIFT |
                   (bank->mc_addr & (PAGE_SIZE -1 ));
-                /* TODO: support injecting LMCE */
                 if ( fill_vmsr_data(bank, d,
-                                    global->mc_gstatus & ~MCG_STATUS_LMCE,
-                                    vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 )
+                                    global->mc_gstatus,
+                                    vmce_vcpuid) == -1 )
                 {
                     mce_printk(MCE_QUIET, "Fill vMCE# data for DOM%d "
                       "failed\n", bank->mc_domid);
diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index 1278839..2a4d3f0 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -444,14 +444,21 @@ static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status,
 }
 
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
-                   uint64_t gstatus, bool broadcast)
+                   uint64_t gstatus, int vmce_vcpuid)
 {
     struct vcpu *v = d->vcpu[0];
+    bool broadcast = (vmce_vcpuid == VMCE_INJECT_BROADCAST);
     int ret;
 
     if ( mc_bank->mc_domid == (uint16_t)~0 )
         return -EINVAL;
 
+    if ( (gstatus & MCG_STATUS_LMCE) && !broadcast )
+        v = d->vcpu[vmce_vcpuid];
+
+    if ( broadcast )
+        gstatus &= ~MCG_STATUS_LMCE;
+
     ret = vcpu_fill_mc_msrs(v, gstatus, mc_bank->mc_status,
                             mc_bank->mc_addr, mc_bank->mc_misc);
     if ( ret || !broadcast )
diff --git a/xen/arch/x86/cpu/mcheck/vmce.h b/xen/arch/x86/cpu/mcheck/vmce.h
index 74f6381..2797e00 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.h
+++ b/xen/arch/x86/cpu/mcheck/vmce.h
@@ -17,7 +17,7 @@ int vmce_amd_rdmsr(const struct vcpu *, uint32_t msr, uint64_t *val);
 int vmce_amd_wrmsr(struct vcpu *, uint32_t msr, uint64_t val);
 
 int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
-                   uint64_t gstatus, bool broadcast);
+                   uint64_t gstatus, int vmce_vcpuid);
 
 #define VMCE_INJECT_BROADCAST (-1)
 int inject_vmce(struct domain *d, int vcpu);
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (15 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 16/19] x86/vmce: enable injecting LMCE to guest on Intel host Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-20 12:32   ` Wei Liu
  2017-02-22 15:55   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2 Haozhong Zhang
  2017-02-17  6:39 ` [PATCH 19/19] tools/xen-mceinj: support injecting LMCE Haozhong Zhang
  18 siblings, 2 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Wei Liu, Liu Jinsong, Christoph Egger,
	Ian Jackson, Jan Beulich, Andrew Cooper

If LMCE is supported by host and "lmce = 1" is present in xl config, the
LMCE capability will be exposed in guest MSR_IA32_MCG_CAP. By default,
LMCE is not exposed to guest so as to keep the backwards migration
compatibility.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 docs/man/xl.cfg.pod.5.in        | 18 ++++++++++++++++++
 tools/libxl/libxl_create.c      |  1 +
 tools/libxl/libxl_dom.c         |  2 ++
 tools/libxl/libxl_types.idl     |  1 +
 tools/libxl/xl_cmdimpl.c        |  3 +++
 xen/arch/x86/cpu/mcheck/vmce.c  | 14 +++++++++++++-
 xen/arch/x86/hvm/hvm.c          |  7 +++++++
 xen/include/asm-x86/mce.h       |  1 +
 xen/include/public/hvm/params.h |  5 ++++-
 9 files changed, 50 insertions(+), 2 deletions(-)

diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
index 46f9caf..1cdf372 100644
--- a/docs/man/xl.cfg.pod.5.in
+++ b/docs/man/xl.cfg.pod.5.in
@@ -2021,6 +2021,24 @@ natively or via hardware backwards compatibility support.
 
 =back
 
+=head3 Intel
+
+=over 4
+
+=item B<lmce=BOOLEAN>
+
+(HVM only) Enable/disable LMCE support for a HVM domain.
+
+=over 4
+
+=item B<default>
+
+Disabled.
+
+=back
+
+=back
+
 =head1 SEE ALSO
 
 =over 4
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e3bc257..381e5dc 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -324,6 +324,7 @@ int libxl__domain_build_info_setdefault(libxl__gc *gc,
         libxl_defbool_setdefault(&b_info->u.hvm.altp2m,             false);
         libxl_defbool_setdefault(&b_info->u.hvm.usb,                false);
         libxl_defbool_setdefault(&b_info->u.hvm.xen_platform_pci,   true);
+        libxl_defbool_setdefault(&b_info->u.hvm.lmce,               false);
 
         libxl_defbool_setdefault(&b_info->u.hvm.spice.enable, false);
         if (!libxl_defbool_val(b_info->u.hvm.spice.enable) &&
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index d519c8d..f04adf4 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -293,6 +293,8 @@ static void hvm_set_conf_params(xc_interface *handle, uint32_t domid,
                     libxl_defbool_val(info->u.hvm.nested_hvm));
     xc_hvm_param_set(handle, domid, HVM_PARAM_ALTP2M,
                     libxl_defbool_val(info->u.hvm.altp2m));
+    xc_hvm_param_set(handle, domid, HVM_PARAM_LMCE,
+                     libxl_defbool_val(info->u.hvm.lmce));
 }
 
 int libxl__build_pre(libxl__gc *gc, uint32_t domid,
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index a612d1f..3cb0d9a 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -550,6 +550,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
                                        ("serial_list",      libxl_string_list),
                                        ("rdm", libxl_rdm_reserve),
                                        ("rdm_mem_boundary_memkb", MemKB),
+                                       ("lmce",             libxl_defbool),
                                        ])),
                  ("pv", Struct(None, [("kernel", string),
                                       ("slack_memkb", MemKB),
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 37ebdce..4ed8e3e 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1698,6 +1698,9 @@ static void parse_config_data(const char *config_source,
 
         if (!xlu_cfg_get_long (config, "rdm_mem_boundary", &l, 0))
             b_info->u.hvm.rdm_mem_boundary_memkb = l * 1024;
+
+        xlu_cfg_get_defbool(config, "lmce", &b_info->u.hvm.lmce, 0);
+
         break;
     case LIBXL_DOMAIN_TYPE_PV:
     {
diff --git a/xen/arch/x86/cpu/mcheck/vmce.c b/xen/arch/x86/cpu/mcheck/vmce.c
index 2a4d3f0..fa9b499 100644
--- a/xen/arch/x86/cpu/mcheck/vmce.c
+++ b/xen/arch/x86/cpu/mcheck/vmce.c
@@ -74,7 +74,7 @@ int vmce_restore_vcpu(struct vcpu *v, const struct hvm_vmce_vcpu *ctxt)
     unsigned long guest_mcg_cap;
 
     if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
-        guest_mcg_cap = INTEL_GUEST_MCG_CAP;
+        guest_mcg_cap = INTEL_GUEST_MCG_CAP | (lmce_support ? MCG_LMCE_P : 0);
     else
         guest_mcg_cap = AMD_GUEST_MCG_CAP;
 
@@ -519,3 +519,15 @@ int unmmap_broken_page(struct domain *d, mfn_t mfn, unsigned long gfn)
     return rc;
 }
 
+int vmce_enable_lmce(struct domain *d)
+{
+    struct vcpu *v;
+
+    if ( !lmce_support )
+        return -EINVAL;
+
+    for_each_vcpu(d, v)
+        v->arch.vmce.mcg_cap |= MCG_LMCE_P;
+
+    return 0;
+}
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 266f708..19389c0 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4007,6 +4007,7 @@ static int hvm_allow_set_param(struct domain *d,
     case HVM_PARAM_IOREQ_SERVER_PFN:
     case HVM_PARAM_NR_IOREQ_SERVER_PAGES:
     case HVM_PARAM_ALTP2M:
+    case HVM_PARAM_LMCE:
         if ( value != 0 && a->value != value )
             rc = -EEXIST;
         break;
@@ -4185,6 +4186,12 @@ static int hvmop_set_param(
         }
         d->arch.x87_fip_width = a.value;
         break;
+    case HVM_PARAM_LMCE:
+        if ( a.value > 1 )
+            rc = -EINVAL;
+        else if ( a.value == 1 )
+            rc = vmce_enable_lmce(d);
+        break;
     }
 
     if ( rc != 0 )
diff --git a/xen/include/asm-x86/mce.h b/xen/include/asm-x86/mce.h
index 525a9e8..f5a9ff9 100644
--- a/xen/include/asm-x86/mce.h
+++ b/xen/include/asm-x86/mce.h
@@ -38,6 +38,7 @@ extern int vmce_restore_vcpu(struct vcpu *, const struct hvm_vmce_vcpu *);
 extern int vmce_wrmsr(uint32_t msr, uint64_t val);
 extern int vmce_rdmsr(uint32_t msr, uint64_t *val);
 extern bool vmce_support_lmce(const struct vcpu *v);
+extern int vmce_enable_lmce(struct domain *d);
 
 extern unsigned int nr_mce_banks;
 
diff --git a/xen/include/public/hvm/params.h b/xen/include/public/hvm/params.h
index 3f54a49..6b6ecbe 100644
--- a/xen/include/public/hvm/params.h
+++ b/xen/include/public/hvm/params.h
@@ -253,6 +253,9 @@
  */
 #define HVM_PARAM_X87_FIP_WIDTH 36
 
-#define HVM_NR_PARAMS 37
+/* Boolean: Enable LMCE */
+#define HVM_PARAM_LMCE 37
+
+#define HVM_NR_PARAMS 38
 
 #endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (16 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-22 15:59   ` Jan Beulich
  2017-02-17  6:39 ` [PATCH 19/19] tools/xen-mceinj: support injecting LMCE Haozhong Zhang
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Christoph Egger, Andrew Cooper, Jan Beulich, Liu Jinsong

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Christoph Egger <chegger@amazon.de>
Cc: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
 xen/arch/x86/cpu/mcheck/mce.c         | 16 ++++++++++++++++
 xen/include/public/arch-x86/xen-mca.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/xen/arch/x86/cpu/mcheck/mce.c b/xen/arch/x86/cpu/mcheck/mce.c
index 2d69222..56c1f5e 100644
--- a/xen/arch/x86/cpu/mcheck/mce.c
+++ b/xen/arch/x86/cpu/mcheck/mce.c
@@ -1510,6 +1510,7 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
     {
         const cpumask_t *cpumap;
         cpumask_var_t cmv;
+        int cpu_nr;
 
         if (nr_mce_banks == 0)
             return x86_mcerr("do_mca #MC", -ENODEV);
@@ -1552,6 +1553,21 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
                 send_IPI_mask(cpumap, cmci_apic_vector);
             }
             break;
+        case XEN_MC_INJECT_TYPE_LMCE:
+            if ( !lmce_support )
+            {
+                ret = x86_mcerr("No LMCE support in platform", -EINVAL);
+                break;
+            }
+            /* ensure at most one CPU is specified */
+            cpu_nr = cpumask_next(cpumask_first(cpumap), cpumap);
+            if ( cpu_nr < nr_cpu_ids )
+            {
+                ret = x86_mcerr("More than one CPU specified", -EINVAL);
+                break;
+            }
+            on_selected_cpus(cpumap, x86_mc_mceinject, NULL, 1);
+            break;
         default:
             ret = x86_mcerr("Wrong mca type\n", -EINVAL);
             break;
diff --git a/xen/include/public/arch-x86/xen-mca.h b/xen/include/public/arch-x86/xen-mca.h
index 9566a33..037a174 100644
--- a/xen/include/public/arch-x86/xen-mca.h
+++ b/xen/include/public/arch-x86/xen-mca.h
@@ -412,6 +412,7 @@ struct xen_mc_mceinject {
 #define XEN_MC_INJECT_TYPE_MASK     0x7
 #define XEN_MC_INJECT_TYPE_MCE      0x0
 #define XEN_MC_INJECT_TYPE_CMCI     0x1
+#define XEN_MC_INJECT_TYPE_LMCE     0x2
 
 #define XEN_MC_INJECT_CPU_BROADCAST 0x8
 
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* [PATCH 19/19] tools/xen-mceinj: support injecting LMCE
  2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
                   ` (17 preceding siblings ...)
  2017-02-17  6:39 ` [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2 Haozhong Zhang
@ 2017-02-17  6:39 ` Haozhong Zhang
  2017-02-20 12:53   ` Wei Liu
  18 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-17  6:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Haozhong Zhang, Ian Jackson, Wei Liu

If option '-l' or '--lmce' is specified and the host supports LMCE,
xen-mceinj will inject LMCE to CPU specified by '-c' (or CPU0 if '-c'
is not present).

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 tools/libxc/include/xenctrl.h           |  1 +
 tools/libxc/xc_misc.c                   | 25 +++++++++++++++
 tools/tests/mce-test/tools/xen-mceinj.c | 57 +++++++++++++++++++++++++++++++--
 3 files changed, 81 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 85d7fe5..2598952 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1968,6 +1968,7 @@ int xc_cpuid_apply_policy(xc_interface *xch,
 void xc_cpuid_to_str(const unsigned int *regs,
                      char **strs); /* some strs[] may be NULL if ENOMEM */
 int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
+int xc_mca_op_cpumap(xc_interface *xch, struct xen_mc *mc, xc_cpumap_t cpumap);
 #endif
 
 struct xc_px_val {
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 0fc6c22..24f7fdf 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -341,6 +341,31 @@ int xc_mca_op(xc_interface *xch, struct xen_mc *mc)
     xc_hypercall_bounce_post(xch, mc);
     return ret;
 }
+
+int xc_mca_op_cpumap(xc_interface *xch, struct xen_mc *mc, xc_cpumap_t cpumap)
+{
+    int ret;
+    DECLARE_HYPERCALL_BOUNCE(cpumap, 0, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+    if ( cpumap )
+    {
+        HYPERCALL_BOUNCE_SET_SIZE(cpumap,
+                                  (mc->u.mc_inject_v2.cpumap.nr_bits + 7) / 8);
+        if ( xc_hypercall_bounce_pre(xch, cpumap) )
+        {
+            PERROR("Could not bouce cpumap memory buffer");
+            return -1;
+        }
+        set_xen_guest_handle(mc->u.mc_inject_v2.cpumap.bitmap, cpumap);
+    }
+
+    ret = xc_mca_op(xch, mc);
+
+    if ( cpumap )
+        xc_hypercall_bounce_post(xch, cpumap);
+
+    return ret;
+}
 #endif
 
 int xc_perfc_reset(xc_interface *xch)
diff --git a/tools/tests/mce-test/tools/xen-mceinj.c b/tools/tests/mce-test/tools/xen-mceinj.c
index 5f70a61..b2eb7d3 100644
--- a/tools/tests/mce-test/tools/xen-mceinj.c
+++ b/tools/tests/mce-test/tools/xen-mceinj.c
@@ -56,6 +56,8 @@
 #define MSR_IA32_MC0_MISC        0x00000403
 #define MSR_IA32_MC0_CTL2        0x00000280
 
+#define MCG_STATUS_LMCE          0x8
+
 struct mce_info {
     const char *description;
     uint8_t mcg_stat;
@@ -113,6 +115,7 @@ static struct mce_info mce_table[] = {
 #define LOGFILE stdout
 
 int dump;
+int lmce;
 struct xen_mc_msrinject msr_inj;
 
 static void Lprintf(const char *fmt, ...)
@@ -213,6 +216,42 @@ static int inject_mce(xc_interface *xc_handle, int cpu_nr)
     return xc_mca_op(xc_handle, &mc);
 }
 
+static int inject_lmce(xc_interface *xc_handle, uint32_t cpu_nr)
+{
+    struct xen_mc mc;
+    uint8_t *cpumap = NULL;
+    size_t cpumap_size, line, shift;
+    uint32_t nr_cpus;
+    int ret;
+
+    nr_cpus = mca_cpuinfo(xc_handle);
+    if ( !nr_cpus )
+        err(xc_handle, "Failed to get mca_cpuinfo");
+    if ( cpu_nr >= nr_cpus )
+        err(xc_handle, "-c %"PRIu32" is larger than %"PRIu32,
+            cpu_nr, nr_cpus - 1);
+
+    memset(&mc, 0, sizeof(struct xen_mc));
+    mc.cmd = XEN_MC_inject_v2;
+    mc.interface_version = XEN_MCA_INTERFACE_VERSION;
+    mc.u.mc_inject_v2.flags |= XEN_MC_INJECT_TYPE_LMCE;
+
+    cpumap_size = (nr_cpus + 7) / 8;
+    cpumap = malloc(cpumap_size);
+    if ( !cpumap )
+        err(xc_handle, "Failed to allocate cpumap\n");
+    memset(cpumap, 0, cpumap_size);
+    line = cpu_nr / 8;
+    shift = cpu_nr % 8;
+    memset(cpumap + line, 1 << shift, 1);
+
+    mc.u.mc_inject_v2.cpumap.nr_bits = cpumap_size * 8;
+    ret = xc_mca_op_cpumap(xc_handle, &mc, cpumap);
+
+    free(cpumap);
+    return ret;
+}
+
 static uint64_t bank_addr(int bank, int type)
 {
     uint64_t addr;
@@ -331,8 +370,15 @@ static int inject(xc_interface *xc_handle, struct mce_info *mce,
                   uint32_t cpu_nr, uint32_t domain, uint64_t gaddr)
 {
     int ret = 0;
+    uint8_t mcg_status = mce->mcg_stat;
 
-    ret = inject_mcg_status(xc_handle, cpu_nr, mce->mcg_stat, domain);
+    if ( lmce )
+    {
+        if ( mce->cmci )
+            err(xc_handle, "No support to inject CMCI as LMCE");
+        mcg_status |= MCG_STATUS_LMCE;
+    }
+    ret = inject_mcg_status(xc_handle, cpu_nr, mcg_status, domain);
     if ( ret )
         err(xc_handle, "Failed to inject MCG_STATUS MSR");
 
@@ -355,6 +401,8 @@ static int inject(xc_interface *xc_handle, struct mce_info *mce,
         err(xc_handle, "Failed to inject MSR");
     if ( mce->cmci )
         ret = inject_cmci(xc_handle, cpu_nr);
+    else if ( lmce )
+        ret = inject_lmce(xc_handle, cpu_nr);
     else
         ret = inject_mce(xc_handle, cpu_nr);
     if ( ret )
@@ -394,6 +442,7 @@ static struct option opts[] = {
     {"dump", 0, 0, 'D'},
     {"help", 0, 0, 'h'},
     {"page", 0, 0, 'p'},
+    {"lmce", 0, 0, 'l'},
     {"", 0, 0, '\0'}
 };
 
@@ -410,6 +459,7 @@ static void help(void)
            "  -d, --domain=DOMID   target domain, the default is Xen itself\n"
            "  -h, --help           print this page\n"
            "  -p, --page=ADDR      physical address to report\n"
+           "  -l, --lmce           inject as LMCE (Intel only)\n"
            "  -t, --type=ERROR     error type\n");
 
     for ( i = 0; i < MCE_TABLE_SIZE; i++ )
@@ -439,7 +489,7 @@ int main(int argc, char *argv[])
     }
 
     while ( 1 ) {
-        c = getopt_long(argc, argv, "c:Dd:t:hp:", opts, &opt_index);
+        c = getopt_long(argc, argv, "c:Dd:t:hp:l", opts, &opt_index);
         if ( c == -1 )
             break;
         switch ( c ) {
@@ -464,6 +514,9 @@ int main(int argc, char *argv[])
         case 't':
             type = strtol(optarg, NULL, 0);
             break;
+        case 'l':
+            lmce = 1;
+            break;
         case 'h':
         default:
             help();
-- 
2.10.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 78+ messages in thread

* Re: [PATCH 01/19] x86/mce: fix indentation style in xen-mca.h and mce.h
  2017-02-17  6:39 ` [PATCH 01/19] x86/mce: fix indentation style in xen-mca.h and mce.h Haozhong Zhang
@ 2017-02-17  9:49   ` Jan Beulich
  0 siblings, 0 replies; 78+ messages in thread
From: Jan Beulich @ 2017-02-17  9:49 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> Replace tab indentation by whitespace.
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Acked-by: Jan Beulich <jbeulich@suse.com>
with ...

> @@ -345,8 +345,8 @@ struct xen_mc_fetch {
>      /* IN/OUT variables. */
>      uint32_t flags;	/* IN: XEN_MC_NONURGENT, XEN_MC_URGENT,
>                             XEN_MC_ACK if ack'ing an earlier fetch */
> -			/* OUT: XEN_MC_OK, XEN_MC_FETCHFAILED,
> -			   XEN_MC_NODATA, XEN_MC_NOMATCH */
> +    /* OUT: XEN_MC_OK, XEN_MC_FETCHFAILED,
> +            XEN_MC_NODATA, XEN_MC_NOMATCH */

... indentation properly restored here (the two comments should
align with one another).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 02/19] x86/mce: remove declarations of non-existing functions in mce.h
  2017-02-17  6:39 ` [PATCH 02/19] x86/mce: remove declarations of non-existing functions in mce.h Haozhong Zhang
@ 2017-02-17  9:50   ` Jan Beulich
  0 siblings, 0 replies; 78+ messages in thread
From: Jan Beulich @ 2017-02-17  9:50 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> Remove declarations of functions
>     intel_mcheck_timer()
>     mce_intel_feature_init()
>     mce_cap_init()
>     x86_mcinfo_getptr()
> whose definitions had been removed long time ago.
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 03/19] x86/mce: remove unnecessary braces around intel_get_extended_msrs()
  2017-02-17  6:39 ` [PATCH 03/19] x86/mce: remove unnecessary braces around intel_get_extended_msrs() Haozhong Zhang
@ 2017-02-17  9:51   ` Jan Beulich
  0 siblings, 0 replies; 78+ messages in thread
From: Jan Beulich @ 2017-02-17  9:51 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Acked-by: Jan Beulich <jbeulich@suse.com>



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add()
  2017-02-17  6:39 ` [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add() Haozhong Zhang
@ 2017-02-17  9:55   ` Jan Beulich
  2017-02-20  1:52     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-17  9:55 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Not sure here, I can't imagine the function being here without
reason, even if the plans perhaps never got carried out. Could
you please identify in the commit message here whether there
ever were callers (and if so, when the last one went away) or
what commit introduced the function without any users?

> --- a/xen/arch/x86/cpu/mcheck/mce.h
> +++ b/xen/arch/x86/cpu/mcheck/mce.h
> @@ -146,7 +146,6 @@ typedef struct mcinfo_extended *(*x86_mce_callback_t)
>      (struct mc_info *, uint16_t, uint64_t);
>  extern void x86_mce_callback_register(x86_mce_callback_t);
>  
> -void *x86_mcinfo_add(struct mc_info *mi, void *mcinfo);
>  void *x86_mcinfo_reserve(struct mc_info *mi, int size);
>  void x86_mcinfo_dump(struct mc_info *mi);

There's a comment earlier in this file which would need adjustment
too. Maybe that also helps understand if the function would better
be left in place ...

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 05/19] x86/mce: merge loops to get Intel extended MC MSR
  2017-02-17  6:39 ` [PATCH 05/19] x86/mce: merge loops to get Intel extended MC MSR Haozhong Zhang
@ 2017-02-17  9:58   ` Jan Beulich
  2017-02-20  1:11     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-17  9:58 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> The second loop that gets MSR_IA32_MCG_R8 to MSR_IA32_MCG_R15 was
> surrounded by '#ifdef __X86_64__ ... #endif' and had to be seperated
> from the first loop that gets MSR_IA32_MCG_EAX to MSR_IA32_MCG_MISC.
> Because Xen had dropped support for 32-bit x86 host, these two loops
> can be merged now.

No, they can't - the number spaces aren't contiguous.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 06/19] x86/mce: merge intel_default_mce_dhandler/uhandler()
  2017-02-17  6:39 ` [PATCH 06/19] x86/mce: merge intel_default_mce_dhandler/uhandler() Haozhong Zhang
@ 2017-02-17 10:01   ` Jan Beulich
  2017-02-20  2:40     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-17 10:01 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> Implementations of these two functions are effectively the same, so
> unify them by a common intel_default_mce_handler().

Them being the same right now may also be an issue with the
earlier authors never having completed their job. I'd like to see
justification here that the two handlers also are conceptionally
(mostly) identical. Mechanically the patch is fine.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 07/19] x86/vmce: include domain/vcpu id in debug messages
  2017-02-17  6:39 ` [PATCH 07/19] x86/vmce: include domain/vcpu id in debug messages Haozhong Zhang
@ 2017-02-17 10:03   ` Jan Beulich
  0 siblings, 0 replies; 78+ messages in thread
From: Jan Beulich @ 2017-02-17 10:03 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve()
  2017-02-17  6:39 ` [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve() Haozhong Zhang
@ 2017-02-17 10:07   ` Jan Beulich
  2017-02-20  2:48     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-17 10:07 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> @@ -804,7 +800,7 @@ static void mcinfo_clear(struct mc_info *mi)
>      x86_mcinfo_nentries(mi) = 0;
>  }
>  
> -void *x86_mcinfo_reserve(struct mc_info *mi, int size)
> +void *x86_mcinfo_reserve(struct mc_info *mi, uint16_t size, uint16_t type)

There's no need for fixed width types here afaics. With them
replaced by "unsigned int"
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 11/19] tools/xen-mceinj: fix the type of cpu number
  2017-02-17  6:39 ` [PATCH 11/19] tools/xen-mceinj: fix the type of cpu number Haozhong Zhang
@ 2017-02-17 10:08   ` Jan Beulich
  2017-02-20  2:49     ` Haozhong Zhang
  2017-02-20 12:29     ` Wei Liu
  0 siblings, 2 replies; 78+ messages in thread
From: Jan Beulich @ 2017-02-17 10:08 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Ian Jackson, Wei Liu, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> Use uint32_t rather than int to align to the type of
> xen_mc_physcpuinfo.ncpus.

I'm not sure about policies in the tool stack, but in the hypervisor I
would request to use unsigned int instead of fixed width types
where the fixed width isn't really relevant.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case
  2017-02-17  6:39 ` [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case Haozhong Zhang
@ 2017-02-17 10:21   ` Jan Beulich
  2017-02-20  4:36     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-17 10:21 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/mcheck/mcaction.c
> +++ b/xen/arch/x86/cpu/mcheck/mcaction.c
> @@ -88,21 +88,21 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
>                      goto vmce_failed;
>                  }
>  
> +                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
> +                    vmce_vcpuid = VMCE_INJECT_BROADCAST;
> +                else
> +                    vmce_vcpuid = global->mc_vcpuid;
> +
>                  bank->mc_addr = gfn << PAGE_SHIFT |
>                    (bank->mc_addr & (PAGE_SIZE -1 ));
> -                if ( fill_vmsr_data(bank, d,
> -                      global->mc_gstatus) == -1 )
> +                if ( fill_vmsr_data(bank, d, global->mc_gstatus,
> +                                    vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 )

You're changing the return values of function to be proper -E...
values, so you can't validly compare against -1 here anymore.

>  int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
> -                   uint64_t gstatus)
> +                   uint64_t gstatus, bool broadcast)
>  {
>      struct vcpu *v = d->vcpu[0];
> +    int ret;
>  
> -    if ( mc_bank->mc_domid != (uint16_t)~0 )
> -    {
> -        if ( v->arch.vmce.mcg_status & MCG_STATUS_MCIP )
> -        {
> -            mce_printk(MCE_QUIET, "MCE: guest has not handled previous"
> -                       " vMCE yet!\n");
> -            return -1;
> -        }
> -
> -        spin_lock(&v->arch.vmce.lock);
> +    if ( mc_bank->mc_domid == (uint16_t)~0 )
> +        return -EINVAL;

Returning -EINVAL here is a behavioral change which, if intended,
needs justification in the commit message.

> -        v->arch.vmce.mcg_status = gstatus;
> -        /*
> -         * 1. Skip bank 0 to avoid 'bank 0 quirk' of old processors
> -         * 2. Filter MCi_STATUS MSCOD model specific error code to guest
> -         */
> -        v->arch.vmce.bank[1].mci_status = mc_bank->mc_status &
> -                                              MCi_STATUS_MSCOD_MASK;
> -        v->arch.vmce.bank[1].mci_addr = mc_bank->mc_addr;
> -        v->arch.vmce.bank[1].mci_misc = mc_bank->mc_misc;
> +    ret = vcpu_fill_mc_msrs(v, gstatus, mc_bank->mc_status,
> +                            mc_bank->mc_addr, mc_bank->mc_misc);
> +    if ( ret || !broadcast )
> +        goto out;
>  
> -        spin_unlock(&v->arch.vmce.lock);
> +    for_each_vcpu ( d, v )
> +    {
> +        if ( v == d->vcpu[0] )

Comparing v->vcpu_id with zero would be a cheaper check as it
seems.

> +            continue;
> +        ret = vcpu_fill_mc_msrs(v, MCG_STATUS_MCIP | MCG_STATUS_RIPV,
> +                                0, 0, 0);

What guarantees RIP to be valid, and why all the zeros? Perhaps
I'm lacking some understanding of how real hardware handles the
broadcasting, but it would help if you left an explaining comment
here.

> +        if ( ret )
> +            break;

Wouldn't you better, on a best effort basis, try to update the
remaining vCPU-s anyway (making sure you don't lose the error
indicator to be handed back to your caller).

>      }
>  
> -    return 0;
> + out:
> +    return ret;

If at the destination of a goto all you do is return, please avoid
using goto.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU
  2017-02-17  6:39 ` [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU Haozhong Zhang
@ 2017-02-17 10:26   ` Jan Beulich
  2017-02-17 15:01     ` Boris Ostrovsky
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-17 10:26 UTC (permalink / raw)
  To: Suravee Suthikulpanit, Haozhong Zhang, Boris Ostrovsky
  Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/mcheck/mce.c
> +++ b/xen/arch/x86/cpu/mcheck/mce.c
> @@ -538,7 +538,14 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
>      gstatus = mca_rdmsr(MSR_IA32_MCG_STATUS);
>      if ((gstatus & MCG_STATUS_MCIP) != 0) {
>          mce_printk(MCE_CRITICAL, "MCE: Clear MCIP@ last step");
> -        mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
> +        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
> +            /*
> +             * Intel SDM 3: An attempt to write to IA32_MCG_STATUS
> +             * with any value other than 0 would result in #GP.
> +             */
> +            mca_wrmsr(MSR_IA32_MCG_STATUS, 0);
> +        else
> +            mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);

I think this wants to be the other way around: Write zero in the
common case, and make an exception for AMD (and that only if
on AMD it's really useful to write other than plain zero). Boris,
Suravee, can you provide some input here please?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU
  2017-02-17 10:26   ` Jan Beulich
@ 2017-02-17 15:01     ` Boris Ostrovsky
  2017-02-17 15:13       ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Boris Ostrovsky @ 2017-02-17 15:01 UTC (permalink / raw)
  To: Jan Beulich, Suravee Suthikulpanit, Haozhong Zhang
  Cc: Andrew Cooper, Christoph Egger, xen-devel



On 02/17/2017 05:26 AM, Jan Beulich wrote:
>>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> --- a/xen/arch/x86/cpu/mcheck/mce.c
>> +++ b/xen/arch/x86/cpu/mcheck/mce.c
>> @@ -538,7 +538,14 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
>>      gstatus = mca_rdmsr(MSR_IA32_MCG_STATUS);
>>      if ((gstatus & MCG_STATUS_MCIP) != 0) {
>>          mce_printk(MCE_CRITICAL, "MCE: Clear MCIP@ last step");
>> -        mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
>> +        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
>> +            /*
>> +             * Intel SDM 3: An attempt to write to IA32_MCG_STATUS
>> +             * with any value other than 0 would result in #GP.
>> +             */
>> +            mca_wrmsr(MSR_IA32_MCG_STATUS, 0);
>> +        else
>> +            mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
>
> I think this wants to be the other way around: Write zero in the
> common case, and make an exception for AMD (and that only if
> on AMD it's really useful to write other than plain zero). Boris,
> Suravee, can you provide some input here please?

I don't see anything stating that writing non-zero values can result in 
an exception of any sort. But I am only going by what I see in the 
documentation.

-boris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU
  2017-02-17 15:01     ` Boris Ostrovsky
@ 2017-02-17 15:13       ` Jan Beulich
  2017-02-17 15:38         ` Boris Ostrovsky
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-17 15:13 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Andrew Cooper, Christoph Egger, xen-devel, Suravee Suthikulpanit,
	Haozhong Zhang

>>> On 17.02.17 at 16:01, <boris.ostrovsky@oracle.com> wrote:
> On 02/17/2017 05:26 AM, Jan Beulich wrote:
>>>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>>> --- a/xen/arch/x86/cpu/mcheck/mce.c
>>> +++ b/xen/arch/x86/cpu/mcheck/mce.c
>>> @@ -538,7 +538,14 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
>>>      gstatus = mca_rdmsr(MSR_IA32_MCG_STATUS);
>>>      if ((gstatus & MCG_STATUS_MCIP) != 0) {
>>>          mce_printk(MCE_CRITICAL, "MCE: Clear MCIP@ last step");
>>> -        mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
>>> +        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
>>> +            /*
>>> +             * Intel SDM 3: An attempt to write to IA32_MCG_STATUS
>>> +             * with any value other than 0 would result in #GP.
>>> +             */
>>> +            mca_wrmsr(MSR_IA32_MCG_STATUS, 0);
>>> +        else
>>> +            mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
>>
>> I think this wants to be the other way around: Write zero in the
>> common case, and make an exception for AMD (and that only if
>> on AMD it's really useful to write other than plain zero). Boris,
>> Suravee, can you provide some input here please?
> 
> I don't see anything stating that writing non-zero values can result in 
> an exception of any sort. But I am only going by what I see in the 
> documentation.

But the question was the other way around: Is there a need to
clear only the MCIP bit, or can we as well write plain zero to the
MSR (allowing the conditional to be dropped).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU
  2017-02-17 15:13       ` Jan Beulich
@ 2017-02-17 15:38         ` Boris Ostrovsky
  0 siblings, 0 replies; 78+ messages in thread
From: Boris Ostrovsky @ 2017-02-17 15:38 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Christoph Egger, xen-devel, Suravee Suthikulpanit,
	Haozhong Zhang



On 02/17/2017 10:13 AM, Jan Beulich wrote:
>>>> On 17.02.17 at 16:01, <boris.ostrovsky@oracle.com> wrote:
>> On 02/17/2017 05:26 AM, Jan Beulich wrote:
>>>>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>>>> --- a/xen/arch/x86/cpu/mcheck/mce.c
>>>> +++ b/xen/arch/x86/cpu/mcheck/mce.c
>>>> @@ -538,7 +538,14 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
>>>>      gstatus = mca_rdmsr(MSR_IA32_MCG_STATUS);
>>>>      if ((gstatus & MCG_STATUS_MCIP) != 0) {
>>>>          mce_printk(MCE_CRITICAL, "MCE: Clear MCIP@ last step");
>>>> -        mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
>>>> +        if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
>>>> +            /*
>>>> +             * Intel SDM 3: An attempt to write to IA32_MCG_STATUS
>>>> +             * with any value other than 0 would result in #GP.
>>>> +             */
>>>> +            mca_wrmsr(MSR_IA32_MCG_STATUS, 0);
>>>> +        else
>>>> +            mca_wrmsr(MSR_IA32_MCG_STATUS, gstatus & ~MCG_STATUS_MCIP);
>>>
>>> I think this wants to be the other way around: Write zero in the
>>> common case, and make an exception for AMD (and that only if
>>> on AMD it's really useful to write other than plain zero). Boris,
>>> Suravee, can you provide some input here please?
>>
>> I don't see anything stating that writing non-zero values can result in
>> an exception of any sort. But I am only going by what I see in the
>> documentation.
>
> But the question was the other way around: Is there a need to
> clear only the MCIP bit, or can we as well write plain zero to the
> MSR (allowing the conditional to be dropped).

I don't see any indication of problems writing any of the three bits 
with any value so assuming the other two bits have been handled by now 
(as they were before this series) I don't see any issues.

But I think Suravee would be a better person to answer this, perhaps 
after checking with HW people.

-boris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 05/19] x86/mce: merge loops to get Intel extended MC MSR
  2017-02-17  9:58   ` Jan Beulich
@ 2017-02-20  1:11     ` Haozhong Zhang
  0 siblings, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20  1:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/17/17 02:58 -0700, Jan Beulich wrote:
>>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> The second loop that gets MSR_IA32_MCG_R8 to MSR_IA32_MCG_R15 was
>> surrounded by '#ifdef __X86_64__ ... #endif' and had to be seperated
>> from the first loop that gets MSR_IA32_MCG_EAX to MSR_IA32_MCG_MISC.
>> Because Xen had dropped support for 32-bit x86 host, these two loops
>> can be merged now.
>
>No, they can't - the number spaces aren't contiguous.
>
Sorry, my mistake. I'll drop this one.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add()
  2017-02-17  9:55   ` Jan Beulich
@ 2017-02-20  1:52     ` Haozhong Zhang
  2017-02-20  9:00       ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20  1:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

On 02/17/17 02:55 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> 
> Not sure here, I can't imagine the function being here without
> reason, even if the plans perhaps never got carried out. Could
> you please identify in the commit message here whether there
> ever were callers (and if so, when the last one went away) or
> what commit introduced the function without any users?
>

The commit that removed the last uses of x86_mcinfo_add() is

commit 9d13fd9fd320a7740c6446c048ff6a2990095966
Author: Keir Fraser <keir.fraser@citrix.com>
Date:   Mon Jun 7 15:46:48 2010 +0100

    x86 mce: Change the method to get the extended MCA information.
    
    Several changes to get the extended MCA information:
    a) Use the x86_mcinfo_reserve in mcinfo_extended to reserve buffer
    from
       mc_info, instead of using the stack
    b) For intel's extended MSR, we don't need write them one
       by one as the MSR are continous
    c) We don't need enum mca_extinfo, since we can consider
       the extended MSR as either per bank, or global. Currently
       we add a hook in global data collection, and didn't call
       register intel_get_extended_msrs as callback. Later that
       hook can be replaced by cleaner way
    
    Signed-off-by: Jiang, Yunhong <yunhong.jiang@inte.com>

Specially, it's a) in that commit that intended to directly update the
buffer reserved by x86_mcinfo_reserve(), instead of updating another
buffer and using x86_mcinfo_add() to copy it to the buffer reserved by
x86_mcinfo_reserve(). It looks that x86_mcinfo_add() was deprecated by
that commit, so it's reasonable to remove it.

> > --- a/xen/arch/x86/cpu/mcheck/mce.h
> > +++ b/xen/arch/x86/cpu/mcheck/mce.h
> > @@ -146,7 +146,6 @@ typedef struct mcinfo_extended *(*x86_mce_callback_t)
> >      (struct mc_info *, uint16_t, uint64_t);
> >  extern void x86_mce_callback_register(x86_mce_callback_t);
> >  
> > -void *x86_mcinfo_add(struct mc_info *mi, void *mcinfo);
> >  void *x86_mcinfo_reserve(struct mc_info *mi, int size);
> >  void x86_mcinfo_dump(struct mc_info *mi);
> 
> There's a comment earlier in this file which would need adjustment
> too. Maybe that also helps understand if the function would better
> be left in place ...
> 
I'll update the commit if we decide to remove this function.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 06/19] x86/mce: merge intel_default_mce_dhandler/uhandler()
  2017-02-17 10:01   ` Jan Beulich
@ 2017-02-20  2:40     ` Haozhong Zhang
  0 siblings, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20  2:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/17/17 03:01 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > Implementations of these two functions are effectively the same, so
> > unify them by a common intel_default_mce_handler().
> 
> Them being the same right now may also be an issue with the
> earlier authors never having completed their job. I'd like to see
> justification here that the two handlers also are conceptionally
> (mostly) identical. Mechanically the patch is fine.
> 

Maybe conceptually not, because
1) uhandler is called in MCE context and, IIUC, should be more careful
   than dhandler;
2) the 3rd argument "regs" only makes sense in MCE context, so
   dhandler should never use them.

I think a better alternative is to move the same code to a common
function and let uhandler and dhandler call that common function. In
addition, I'll leave a comment to explain why not merge them.

Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve()
  2017-02-17 10:07   ` Jan Beulich
@ 2017-02-20  2:48     ` Haozhong Zhang
  2017-02-20  9:02       ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20  2:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

On 02/17/17 03:07 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > @@ -804,7 +800,7 @@ static void mcinfo_clear(struct mc_info *mi)
> >      x86_mcinfo_nentries(mi) = 0;
> >  }
> >  
> > -void *x86_mcinfo_reserve(struct mc_info *mi, int size)
> > +void *x86_mcinfo_reserve(struct mc_info *mi, uint16_t size, uint16_t type)
> 
> There's no need for fixed width types here afaics. With them
> replaced by "unsigned int"
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 

The reason I use uint16_t is because the type of mcinfo_common.type
and .size is uint16_t.

Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 11/19] tools/xen-mceinj: fix the type of cpu number
  2017-02-17 10:08   ` Jan Beulich
@ 2017-02-20  2:49     ` Haozhong Zhang
  2017-02-20 12:29     ` Wei Liu
  1 sibling, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20  2:49 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Ian Jackson, Wei Liu, xen-devel

On 02/17/17 03:08 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > Use uint32_t rather than int to align to the type of
> > xen_mc_physcpuinfo.ncpus.
> 
> I'm not sure about policies in the tool stack, but in the hypervisor I
> would request to use unsigned int instead of fixed width types
> where the fixed width isn't really relevant.
> 
Will replace them by unsigned int.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case
  2017-02-17 10:21   ` Jan Beulich
@ 2017-02-20  4:36     ` Haozhong Zhang
  2017-02-20  9:04       ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20  4:36 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/17/17 03:21 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > --- a/xen/arch/x86/cpu/mcheck/mcaction.c
> > +++ b/xen/arch/x86/cpu/mcheck/mcaction.c
> > @@ -88,21 +88,21 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
> >                      goto vmce_failed;
> >                  }
> >  
> > +                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
> > +                    vmce_vcpuid = VMCE_INJECT_BROADCAST;
> > +                else
> > +                    vmce_vcpuid = global->mc_vcpuid;
> > +
> >                  bank->mc_addr = gfn << PAGE_SHIFT |
> >                    (bank->mc_addr & (PAGE_SIZE -1 ));
> > -                if ( fill_vmsr_data(bank, d,
> > -                      global->mc_gstatus) == -1 )
> > +                if ( fill_vmsr_data(bank, d, global->mc_gstatus,
> > +                                    vmce_vcpuid == VMCE_INJECT_BROADCAST) == -1 )
> 
> You're changing the return values of function to be proper -E...
> values, so you can't validly compare against -1 here anymore.
>

will fix

> >  int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
> > -                   uint64_t gstatus)
> > +                   uint64_t gstatus, bool broadcast)
> >  {
> >      struct vcpu *v = d->vcpu[0];
> > +    int ret;
> >  
> > -    if ( mc_bank->mc_domid != (uint16_t)~0 )
> > -    {
> > -        if ( v->arch.vmce.mcg_status & MCG_STATUS_MCIP )
> > -        {
> > -            mce_printk(MCE_QUIET, "MCE: guest has not handled previous"
> > -                       " vMCE yet!\n");
> > -            return -1;
> > -        }
> > -
> > -        spin_lock(&v->arch.vmce.lock);
> > +    if ( mc_bank->mc_domid == (uint16_t)~0 )
> > +        return -EINVAL;
> 
> Returning -EINVAL here is a behavioral change which, if intended,
> needs justification in the commit message.
>

It's intended. "ASSERT(d)" in its only caller mc_memerr_dhandler()
implies an invalid domain id should be treated as an error. I'll
mention this in the commit message.

> > -        v->arch.vmce.mcg_status = gstatus;
> > -        /*
> > -         * 1. Skip bank 0 to avoid 'bank 0 quirk' of old processors
> > -         * 2. Filter MCi_STATUS MSCOD model specific error code to guest
> > -         */
> > -        v->arch.vmce.bank[1].mci_status = mc_bank->mc_status &
> > -                                              MCi_STATUS_MSCOD_MASK;
> > -        v->arch.vmce.bank[1].mci_addr = mc_bank->mc_addr;
> > -        v->arch.vmce.bank[1].mci_misc = mc_bank->mc_misc;
> > +    ret = vcpu_fill_mc_msrs(v, gstatus, mc_bank->mc_status,
> > +                            mc_bank->mc_addr, mc_bank->mc_misc);
> > +    if ( ret || !broadcast )
> > +        goto out;
> >  
> > -        spin_unlock(&v->arch.vmce.lock);
> > +    for_each_vcpu ( d, v )
> > +    {
> > +        if ( v == d->vcpu[0] )
> 
> Comparing v->vcpu_id with zero would be a cheaper check as it
> seems.
>

will change

> > +            continue;
> > +        ret = vcpu_fill_mc_msrs(v, MCG_STATUS_MCIP | MCG_STATUS_RIPV,
> > +                                0, 0, 0);
> 
> What guarantees RIP to be valid, and why all the zeros? Perhaps
> I'm lacking some understanding of how real hardware handles the
> broadcasting, but it would help if you left an explaining comment
> here.
>

I just inject the less severity error on vcpus other than vcpu0. As
long as vMCE with the actual error information is injected to vcpu0,
the guest MCE handler should be able to decide whether it can recover
or not. If it can, vMCE injected on other vcpus should not offer
conflict information. If it cannot, MC MSRs on other vcpus will not
matter in fact as they represent a less severe error.

MSR_IA32_MCG_STATUS = MCG_STATUS_MCIP | MCG_STATUS_RIPV and all banks
being zero (invalid) is one of the combinations satisfying above
requirement.

> > +        if ( ret )
> > +            break;
> 
> Wouldn't you better, on a best effort basis, try to update the
> remaining vCPU-s anyway (making sure you don't lose the error
> indicator to be handed back to your caller).
>

sure, will change to update all vcpus

> >      }
> >  
> > -    return 0;
> > + out:
> > +    return ret;
> 
> If at the destination of a goto all you do is return, please avoid
> using goto.
>
will change

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add()
  2017-02-20  1:52     ` Haozhong Zhang
@ 2017-02-20  9:00       ` Jan Beulich
  2017-02-20  9:10         ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-20  9:00 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 20.02.17 at 02:52, <haozhong.zhang@intel.com> wrote:
> On 02/17/17 02:55 -0700, Jan Beulich wrote:
>> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
>> 
>> Not sure here, I can't imagine the function being here without
>> reason, even if the plans perhaps never got carried out. Could
>> you please identify in the commit message here whether there
>> ever were callers (and if so, when the last one went away) or
>> what commit introduced the function without any users?
>>
> 
> The commit that removed the last uses of x86_mcinfo_add() is
> 
> commit 9d13fd9fd320a7740c6446c048ff6a2990095966
> Author: Keir Fraser <keir.fraser@citrix.com>
> Date:   Mon Jun 7 15:46:48 2010 +0100
> 
>     x86 mce: Change the method to get the extended MCA information.
>     
>     Several changes to get the extended MCA information:
>     a) Use the x86_mcinfo_reserve in mcinfo_extended to reserve buffer
>     from
>        mc_info, instead of using the stack
>     b) For intel's extended MSR, we don't need write them one
>        by one as the MSR are continous
>     c) We don't need enum mca_extinfo, since we can consider
>        the extended MSR as either per bank, or global. Currently
>        we add a hook in global data collection, and didn't call
>        register intel_get_extended_msrs as callback. Later that
>        hook can be replaced by cleaner way
>     
>     Signed-off-by: Jiang, Yunhong <yunhong.jiang@inte.com>
> 
> Specially, it's a) in that commit that intended to directly update the
> buffer reserved by x86_mcinfo_reserve(), instead of updating another
> buffer and using x86_mcinfo_add() to copy it to the buffer reserved by
> x86_mcinfo_reserve(). It looks that x86_mcinfo_add() was deprecated by
> that commit, so it's reasonable to remove it.

Okay, so please add a reference to that commit to your patch
description.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve()
  2017-02-20  2:48     ` Haozhong Zhang
@ 2017-02-20  9:02       ` Jan Beulich
  2017-02-20  9:11         ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-20  9:02 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 20.02.17 at 03:48, <haozhong.zhang@intel.com> wrote:
> On 02/17/17 03:07 -0700, Jan Beulich wrote:
>> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> > @@ -804,7 +800,7 @@ static void mcinfo_clear(struct mc_info *mi)
>> >      x86_mcinfo_nentries(mi) = 0;
>> >  }
>> >  
>> > -void *x86_mcinfo_reserve(struct mc_info *mi, int size)
>> > +void *x86_mcinfo_reserve(struct mc_info *mi, uint16_t size, uint16_t type)
>> 
>> There's no need for fixed width types here afaics. With them
>> replaced by "unsigned int"
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> The reason I use uint16_t is because the type of mcinfo_common.type
> and .size is uint16_t.

I guessed that, but the parameters have no need to be fixed
width just because they get stored into some fixed width field.
Please remember that dealing with plain (unsigned) int is slightly
more efficient than dealing with uint16_t (or uint64_t, btw).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case
  2017-02-20  4:36     ` Haozhong Zhang
@ 2017-02-20  9:04       ` Jan Beulich
  2017-02-20  9:12         ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-20  9:04 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 20.02.17 at 05:36, <haozhong.zhang@intel.com> wrote:
> On 02/17/17 03:21 -0700, Jan Beulich wrote:
>> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> > +            continue;
>> > +        ret = vcpu_fill_mc_msrs(v, MCG_STATUS_MCIP | MCG_STATUS_RIPV,
>> > +                                0, 0, 0);
>> 
>> What guarantees RIP to be valid, and why all the zeros? Perhaps
>> I'm lacking some understanding of how real hardware handles the
>> broadcasting, but it would help if you left an explaining comment
>> here.
>>
> 
> I just inject the less severity error on vcpus other than vcpu0. As
> long as vMCE with the actual error information is injected to vcpu0,
> the guest MCE handler should be able to decide whether it can recover
> or not. If it can, vMCE injected on other vcpus should not offer
> conflict information. If it cannot, MC MSRs on other vcpus will not
> matter in fact as they represent a less severe error.
> 
> MSR_IA32_MCG_STATUS = MCG_STATUS_MCIP | MCG_STATUS_RIPV and all banks
> being zero (invalid) is one of the combinations satisfying above
> requirement.

Well, okay, but as said - this needs saying in a code comment.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add()
  2017-02-20  9:00       ` Jan Beulich
@ 2017-02-20  9:10         ` Haozhong Zhang
  0 siblings, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20  9:10 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

On 02/20/17 02:00 -0700, Jan Beulich wrote:
> >>> On 20.02.17 at 02:52, <haozhong.zhang@intel.com> wrote:
> > On 02/17/17 02:55 -0700, Jan Beulich wrote:
> >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> >> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >> 
> >> Not sure here, I can't imagine the function being here without
> >> reason, even if the plans perhaps never got carried out. Could
> >> you please identify in the commit message here whether there
> >> ever were callers (and if so, when the last one went away) or
> >> what commit introduced the function without any users?
> >>
> > 
> > The commit that removed the last uses of x86_mcinfo_add() is
> > 
> > commit 9d13fd9fd320a7740c6446c048ff6a2990095966
> > Author: Keir Fraser <keir.fraser@citrix.com>
> > Date:   Mon Jun 7 15:46:48 2010 +0100
> > 
> >     x86 mce: Change the method to get the extended MCA information.
> >     
> >     Several changes to get the extended MCA information:
> >     a) Use the x86_mcinfo_reserve in mcinfo_extended to reserve buffer
> >     from
> >        mc_info, instead of using the stack
> >     b) For intel's extended MSR, we don't need write them one
> >        by one as the MSR are continous
> >     c) We don't need enum mca_extinfo, since we can consider
> >        the extended MSR as either per bank, or global. Currently
> >        we add a hook in global data collection, and didn't call
> >        register intel_get_extended_msrs as callback. Later that
> >        hook can be replaced by cleaner way
> >     
> >     Signed-off-by: Jiang, Yunhong <yunhong.jiang@inte.com>
> > 
> > Specially, it's a) in that commit that intended to directly update the
> > buffer reserved by x86_mcinfo_reserve(), instead of updating another
> > buffer and using x86_mcinfo_add() to copy it to the buffer reserved by
> > x86_mcinfo_reserve(). It looks that x86_mcinfo_add() was deprecated by
> > that commit, so it's reasonable to remove it.
> 
> Okay, so please add a reference to that commit to your patch
> description.
> 
Sure, will add.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve()
  2017-02-20  9:02       ` Jan Beulich
@ 2017-02-20  9:11         ` Haozhong Zhang
  0 siblings, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20  9:11 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

On 02/20/17 02:02 -0700, Jan Beulich wrote:
> >>> On 20.02.17 at 03:48, <haozhong.zhang@intel.com> wrote:
> > On 02/17/17 03:07 -0700, Jan Beulich wrote:
> >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> >> > @@ -804,7 +800,7 @@ static void mcinfo_clear(struct mc_info *mi)
> >> >      x86_mcinfo_nentries(mi) = 0;
> >> >  }
> >> >  
> >> > -void *x86_mcinfo_reserve(struct mc_info *mi, int size)
> >> > +void *x86_mcinfo_reserve(struct mc_info *mi, uint16_t size, uint16_t type)
> >> 
> >> There's no need for fixed width types here afaics. With them
> >> replaced by "unsigned int"
> >> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> > 
> > The reason I use uint16_t is because the type of mcinfo_common.type
> > and .size is uint16_t.
> 
> I guessed that, but the parameters have no need to be fixed
> width just because they get stored into some fixed width field.
> Please remember that dealing with plain (unsigned) int is slightly
> more efficient than dealing with uint16_t (or uint64_t, btw).
> 
OK, I'll change to the unsigned type.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case
  2017-02-20  9:04       ` Jan Beulich
@ 2017-02-20  9:12         ` Haozhong Zhang
  0 siblings, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20  9:12 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/20/17 02:04 -0700, Jan Beulich wrote:
> >>> On 20.02.17 at 05:36, <haozhong.zhang@intel.com> wrote:
> > On 02/17/17 03:21 -0700, Jan Beulich wrote:
> >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> >> > +            continue;
> >> > +        ret = vcpu_fill_mc_msrs(v, MCG_STATUS_MCIP | MCG_STATUS_RIPV,
> >> > +                                0, 0, 0);
> >> 
> >> What guarantees RIP to be valid, and why all the zeros? Perhaps
> >> I'm lacking some understanding of how real hardware handles the
> >> broadcasting, but it would help if you left an explaining comment
> >> here.
> >>
> > 
> > I just inject the less severity error on vcpus other than vcpu0. As
> > long as vMCE with the actual error information is injected to vcpu0,
> > the guest MCE handler should be able to decide whether it can recover
> > or not. If it can, vMCE injected on other vcpus should not offer
> > conflict information. If it cannot, MC MSRs on other vcpus will not
> > matter in fact as they represent a less severe error.
> > 
> > MSR_IA32_MCG_STATUS = MCG_STATUS_MCIP | MCG_STATUS_RIPV and all banks
> > being zero (invalid) is one of the combinations satisfying above
> > requirement.
> 
> Well, okay, but as said - this needs saying in a code comment.
> 
I'll explain in the code comment.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 11/19] tools/xen-mceinj: fix the type of cpu number
  2017-02-17 10:08   ` Jan Beulich
  2017-02-20  2:49     ` Haozhong Zhang
@ 2017-02-20 12:29     ` Wei Liu
  1 sibling, 0 replies; 78+ messages in thread
From: Wei Liu @ 2017-02-20 12:29 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Haozhong Zhang, Wei Liu, Ian Jackson, xen-devel

On Fri, Feb 17, 2017 at 03:08:55AM -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > Use uint32_t rather than int to align to the type of
> > xen_mc_physcpuinfo.ncpus.
> 
> I'm not sure about policies in the tool stack, but in the hypervisor I
> would request to use unsigned int instead of fixed width types
> where the fixed width isn't really relevant.
> 

There is no written policy AFAICT.

I agree with you that using unsigned int is better.

Wei.

> Jan
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP
  2017-02-17  6:39 ` [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP Haozhong Zhang
@ 2017-02-20 12:32   ` Wei Liu
  2017-02-20 12:38     ` Jan Beulich
  2017-02-20 23:55     ` Haozhong Zhang
  2017-02-22 15:55   ` Jan Beulich
  1 sibling, 2 replies; 78+ messages in thread
From: Wei Liu @ 2017-02-20 12:32 UTC (permalink / raw)
  To: Haozhong Zhang
  Cc: Wei Liu, Liu Jinsong, Christoph Egger, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper

On Fri, Feb 17, 2017 at 02:39:34PM +0800, Haozhong Zhang wrote:
> If LMCE is supported by host and "lmce = 1" is present in xl config, the
> LMCE capability will be exposed in guest MSR_IA32_MCG_CAP. By default,
> LMCE is not exposed to guest so as to keep the backwards migration
> compatibility.
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Sorry for this stupid question: what does L in LMCE stand for? Googling
LMCE doesn't return useful result.

>  
>  int libxl__build_pre(libxl__gc *gc, uint32_t domid,
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index a612d1f..3cb0d9a 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -550,6 +550,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
>                                         ("serial_list",      libxl_string_list),
>                                         ("rdm", libxl_rdm_reserve),
>                                         ("rdm_mem_boundary_memkb", MemKB),
> +                                       ("lmce",             libxl_defbool),

You also need to add LIBXL_HAVE_LMCE to libxl.h.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP
  2017-02-20 12:32   ` Wei Liu
@ 2017-02-20 12:38     ` Jan Beulich
  2017-02-20 14:12       ` Wei Liu
  2017-02-20 23:55     ` Haozhong Zhang
  1 sibling, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-20 12:38 UTC (permalink / raw)
  To: WeiLiu
  Cc: Haozhong Zhang, Liu Jinsong, Christoph Egger, Ian Jackson,
	xen-devel, Andrew Cooper

>>> On 20.02.17 at 13:32, <wei.liu2@citrix.com> wrote:
> On Fri, Feb 17, 2017 at 02:39:34PM +0800, Haozhong Zhang wrote:
>> If LMCE is supported by host and "lmce = 1" is present in xl config, the
>> LMCE capability will be exposed in guest MSR_IA32_MCG_CAP. By default,
>> LMCE is not exposed to guest so as to keep the backwards migration
>> compatibility.
>> 
>> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> 
> Sorry for this stupid question: what does L in LMCE stand for? Googling
> LMCE doesn't return useful result.

I think it's "local" (as opposed to the broadcasting to all (v)CPUs).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 19/19] tools/xen-mceinj: support injecting LMCE
  2017-02-17  6:39 ` [PATCH 19/19] tools/xen-mceinj: support injecting LMCE Haozhong Zhang
@ 2017-02-20 12:53   ` Wei Liu
  2017-02-20 23:50     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Wei Liu @ 2017-02-20 12:53 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Wei Liu, Ian Jackson, xen-devel

On Fri, Feb 17, 2017 at 02:39:36PM +0800, Haozhong Zhang wrote:
> If option '-l' or '--lmce' is specified and the host supports LMCE,
> xen-mceinj will inject LMCE to CPU specified by '-c' (or CPU0 if '-c'
> is not present).
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> ---
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  tools/libxc/include/xenctrl.h           |  1 +
>  tools/libxc/xc_misc.c                   | 25 +++++++++++++++

I suggest you split out changes to libxc to a separate patch.

>  tools/tests/mce-test/tools/xen-mceinj.c | 57 +++++++++++++++++++++++++++++++--
>  3 files changed, 81 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 85d7fe5..2598952 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -1968,6 +1968,7 @@ int xc_cpuid_apply_policy(xc_interface *xch,
>  void xc_cpuid_to_str(const unsigned int *regs,
>                       char **strs); /* some strs[] may be NULL if ENOMEM */
>  int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
> +int xc_mca_op_cpumap(xc_interface *xch, struct xen_mc *mc, xc_cpumap_t cpumap);
>  #endif
>  
>  struct xc_px_val {
> diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
> index 0fc6c22..24f7fdf 100644
> --- a/tools/libxc/xc_misc.c
> +++ b/tools/libxc/xc_misc.c
> @@ -341,6 +341,31 @@ int xc_mca_op(xc_interface *xch, struct xen_mc *mc)
>      xc_hypercall_bounce_post(xch, mc);
>      return ret;
>  }
> +
> +int xc_mca_op_cpumap(xc_interface *xch, struct xen_mc *mc, xc_cpumap_t cpumap)
> +{
> +    int ret;
> +    DECLARE_HYPERCALL_BOUNCE(cpumap, 0, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +
> +    if ( cpumap )
> +    {
> +        HYPERCALL_BOUNCE_SET_SIZE(cpumap,
> +                                  (mc->u.mc_inject_v2.cpumap.nr_bits + 7) / 8);
> +        if ( xc_hypercall_bounce_pre(xch, cpumap) )
> +        {
> +            PERROR("Could not bouce cpumap memory buffer");
> +            return -1;
> +        }
> +        set_xen_guest_handle(mc->u.mc_inject_v2.cpumap.bitmap, cpumap);
> +    }
> +
> +    ret = xc_mca_op(xch, mc);
> +
> +    if ( cpumap )
> +        xc_hypercall_bounce_post(xch, cpumap);
> +
> +    return ret;
> +}

I kinda see why you did this: the bounce buffer infrastructure isn't
available to userspace program (by design). But this API isn't nice. 

This function replaces part of the struct. I suggest you construct a
xen_mc struct solely within this function, not doing part of it here and
the other part in another place (inject_lmce below).  Then you also need
to name it properly.

>  #endif
>  
>  int xc_perfc_reset(xc_interface *xch)
> diff --git a/tools/tests/mce-test/tools/xen-mceinj.c b/tools/tests/mce-test/tools/xen-mceinj.c
> index 5f70a61..b2eb7d3 100644
> --- a/tools/tests/mce-test/tools/xen-mceinj.c
> +++ b/tools/tests/mce-test/tools/xen-mceinj.c
> @@ -56,6 +56,8 @@
>  #define MSR_IA32_MC0_MISC        0x00000403
>  #define MSR_IA32_MC0_CTL2        0x00000280
>  
> +#define MCG_STATUS_LMCE          0x8
> +
>  struct mce_info {
>      const char *description;
>      uint8_t mcg_stat;
> @@ -113,6 +115,7 @@ static struct mce_info mce_table[] = {
>  #define LOGFILE stdout
>  
>  int dump;
> +int lmce;
>  struct xen_mc_msrinject msr_inj;
>  
>  static void Lprintf(const char *fmt, ...)
> @@ -213,6 +216,42 @@ static int inject_mce(xc_interface *xc_handle, int cpu_nr)
>      return xc_mca_op(xc_handle, &mc);
>  }
>  
> +static int inject_lmce(xc_interface *xc_handle, uint32_t cpu_nr)
> +{
> +    struct xen_mc mc;
> +    uint8_t *cpumap = NULL;
> +    size_t cpumap_size, line, shift;
> +    uint32_t nr_cpus;
> +    int ret;
> +
> +    nr_cpus = mca_cpuinfo(xc_handle);
> +    if ( !nr_cpus )
> +        err(xc_handle, "Failed to get mca_cpuinfo");

Why xc_handle in err()?

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP
  2017-02-20 12:38     ` Jan Beulich
@ 2017-02-20 14:12       ` Wei Liu
  0 siblings, 0 replies; 78+ messages in thread
From: Wei Liu @ 2017-02-20 14:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Haozhong Zhang, WeiLiu, Liu Jinsong, Christoph Egger,
	Ian Jackson, xen-devel, Andrew Cooper

On Mon, Feb 20, 2017 at 05:38:53AM -0700, Jan Beulich wrote:
> >>> On 20.02.17 at 13:32, <wei.liu2@citrix.com> wrote:
> > On Fri, Feb 17, 2017 at 02:39:34PM +0800, Haozhong Zhang wrote:
> >> If LMCE is supported by host and "lmce = 1" is present in xl config, the
> >> LMCE capability will be exposed in guest MSR_IA32_MCG_CAP. By default,
> >> LMCE is not exposed to guest so as to keep the backwards migration
> >> compatibility.
> >> 
> >> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > 
> > Sorry for this stupid question: what does L in LMCE stand for? Googling
> > LMCE doesn't return useful result.
> 
> I think it's "local" (as opposed to the broadcasting to all (v)CPUs).
> 

Actually Haozhong already mentioned that in the cover letter. My bad.

> Jan
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 19/19] tools/xen-mceinj: support injecting LMCE
  2017-02-20 12:53   ` Wei Liu
@ 2017-02-20 23:50     ` Haozhong Zhang
  2017-02-21  9:18       ` Wei Liu
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20 23:50 UTC (permalink / raw)
  To: Wei Liu; +Cc: Ian Jackson, xen-devel

On 02/20/17 12:53 +0000, Wei Liu wrote:
> On Fri, Feb 17, 2017 at 02:39:36PM +0800, Haozhong Zhang wrote:
> > If option '-l' or '--lmce' is specified and the host supports LMCE,
> > xen-mceinj will inject LMCE to CPU specified by '-c' (or CPU0 if '-c'
> > is not present).
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > ---
> > Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  tools/libxc/include/xenctrl.h           |  1 +
> >  tools/libxc/xc_misc.c                   | 25 +++++++++++++++
> 
> I suggest you split out changes to libxc to a separate patch.
>

will do in the next version

> >  tools/tests/mce-test/tools/xen-mceinj.c | 57 +++++++++++++++++++++++++++++++--
> >  3 files changed, 81 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> > index 85d7fe5..2598952 100644
> > --- a/tools/libxc/include/xenctrl.h
> > +++ b/tools/libxc/include/xenctrl.h
> > @@ -1968,6 +1968,7 @@ int xc_cpuid_apply_policy(xc_interface *xch,
> >  void xc_cpuid_to_str(const unsigned int *regs,
> >                       char **strs); /* some strs[] may be NULL if ENOMEM */
> >  int xc_mca_op(xc_interface *xch, struct xen_mc *mc);
> > +int xc_mca_op_cpumap(xc_interface *xch, struct xen_mc *mc, xc_cpumap_t cpumap);
> >  #endif
> >  
> >  struct xc_px_val {
> > diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
> > index 0fc6c22..24f7fdf 100644
> > --- a/tools/libxc/xc_misc.c
> > +++ b/tools/libxc/xc_misc.c
> > @@ -341,6 +341,31 @@ int xc_mca_op(xc_interface *xch, struct xen_mc *mc)
> >      xc_hypercall_bounce_post(xch, mc);
> >      return ret;
> >  }
> > +
> > +int xc_mca_op_cpumap(xc_interface *xch, struct xen_mc *mc, xc_cpumap_t cpumap)
> > +{
> > +    int ret;
> > +    DECLARE_HYPERCALL_BOUNCE(cpumap, 0, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> > +
> > +    if ( cpumap )
> > +    {
> > +        HYPERCALL_BOUNCE_SET_SIZE(cpumap,
> > +                                  (mc->u.mc_inject_v2.cpumap.nr_bits + 7) / 8);
> > +        if ( xc_hypercall_bounce_pre(xch, cpumap) )
> > +        {
> > +            PERROR("Could not bouce cpumap memory buffer");
> > +            return -1;
> > +        }
> > +        set_xen_guest_handle(mc->u.mc_inject_v2.cpumap.bitmap, cpumap);
> > +    }
> > +
> > +    ret = xc_mca_op(xch, mc);
> > +
> > +    if ( cpumap )
> > +        xc_hypercall_bounce_post(xch, cpumap);
> > +
> > +    return ret;
> > +}
> 
> I kinda see why you did this: the bounce buffer infrastructure isn't
> available to userspace program (by design). But this API isn't nice. 
> 
> This function replaces part of the struct. I suggest you construct a
> xen_mc struct solely within this function, not doing part of it here and
> the other part in another place (inject_lmce below).  Then you also need
> to name it properly.
>

ditto

> >  #endif
> >  
> >  int xc_perfc_reset(xc_interface *xch)
> > diff --git a/tools/tests/mce-test/tools/xen-mceinj.c b/tools/tests/mce-test/tools/xen-mceinj.c
> > index 5f70a61..b2eb7d3 100644
> > --- a/tools/tests/mce-test/tools/xen-mceinj.c
> > +++ b/tools/tests/mce-test/tools/xen-mceinj.c
> > @@ -56,6 +56,8 @@
> >  #define MSR_IA32_MC0_MISC        0x00000403
> >  #define MSR_IA32_MC0_CTL2        0x00000280
> >  
> > +#define MCG_STATUS_LMCE          0x8
> > +
> >  struct mce_info {
> >      const char *description;
> >      uint8_t mcg_stat;
> > @@ -113,6 +115,7 @@ static struct mce_info mce_table[] = {
> >  #define LOGFILE stdout
> >  
> >  int dump;
> > +int lmce;
> >  struct xen_mc_msrinject msr_inj;
> >  
> >  static void Lprintf(const char *fmt, ...)
> > @@ -213,6 +216,42 @@ static int inject_mce(xc_interface *xc_handle, int cpu_nr)
> >      return xc_mca_op(xc_handle, &mc);
> >  }
> >  
> > +static int inject_lmce(xc_interface *xc_handle, uint32_t cpu_nr)
> > +{
> > +    struct xen_mc mc;
> > +    uint8_t *cpumap = NULL;
> > +    size_t cpumap_size, line, shift;
> > +    uint32_t nr_cpus;
> > +    int ret;
> > +
> > +    nr_cpus = mca_cpuinfo(xc_handle);
> > +    if ( !nr_cpus )
> > +        err(xc_handle, "Failed to get mca_cpuinfo");
> 
> Why xc_handle in err()?
>

err() needs to close xc_handle.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP
  2017-02-20 12:32   ` Wei Liu
  2017-02-20 12:38     ` Jan Beulich
@ 2017-02-20 23:55     ` Haozhong Zhang
  1 sibling, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-20 23:55 UTC (permalink / raw)
  To: Wei Liu
  Cc: Liu Jinsong, Christoph Egger, Ian Jackson, xen-devel,
	Jan Beulich, Andrew Cooper

On 02/20/17 12:32 +0000, Wei Liu wrote:
> On Fri, Feb 17, 2017 at 02:39:34PM +0800, Haozhong Zhang wrote:
> > If LMCE is supported by host and "lmce = 1" is present in xl config, the
> > LMCE capability will be exposed in guest MSR_IA32_MCG_CAP. By default,
> > LMCE is not exposed to guest so as to keep the backwards migration
> > compatibility.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> 
> Sorry for this stupid question: what does L in LMCE stand for? Googling
> LMCE doesn't return useful result.
>

Local

> >  
> >  int libxl__build_pre(libxl__gc *gc, uint32_t domid,
> > diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> > index a612d1f..3cb0d9a 100644
> > --- a/tools/libxl/libxl_types.idl
> > +++ b/tools/libxl/libxl_types.idl
> > @@ -550,6 +550,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
> >                                         ("serial_list",      libxl_string_list),
> >                                         ("rdm", libxl_rdm_reserve),
> >                                         ("rdm_mem_boundary_memkb", MemKB),
> > +                                       ("lmce",             libxl_defbool),
> 
> You also need to add LIBXL_HAVE_LMCE to libxl.h.
> 

will do

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 19/19] tools/xen-mceinj: support injecting LMCE
  2017-02-20 23:50     ` Haozhong Zhang
@ 2017-02-21  9:18       ` Wei Liu
  0 siblings, 0 replies; 78+ messages in thread
From: Wei Liu @ 2017-02-21  9:18 UTC (permalink / raw)
  To: Wei Liu, xen-devel, Ian Jackson

On Tue, Feb 21, 2017 at 07:50:38AM +0800, Haozhong Zhang wrote:
> On 02/20/17 12:53 +0000, Wei Liu wrote:
> > >  }
> > >  
> > > +static int inject_lmce(xc_interface *xc_handle, uint32_t cpu_nr)
> > > +{
> > > +    struct xen_mc mc;
> > > +    uint8_t *cpumap = NULL;
> > > +    size_t cpumap_size, line, shift;
> > > +    uint32_t nr_cpus;
> > > +    int ret;
> > > +
> > > +    nr_cpus = mca_cpuinfo(xc_handle);
> > > +    if ( !nr_cpus )
> > > +        err(xc_handle, "Failed to get mca_cpuinfo");
> > 
> > Why xc_handle in err()?
> >
> 
> err() needs to close xc_handle.
> 

Gosh, this is err in xen-mceinj.c, not err(3).

Sorry for the noise.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 12/19] x86/mce: handle LMCE locally
  2017-02-17  6:39 ` [PATCH 12/19] x86/mce: handle LMCE locally Haozhong Zhang
@ 2017-02-22 13:53   ` Jan Beulich
  2017-02-23  3:06     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-22 13:53 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/mcheck/barrier.c
> +++ b/xen/arch/x86/cpu/mcheck/barrier.c
> @@ -20,7 +20,7 @@ void mce_barrier_enter(struct mce_softirq_barrier *bar)
>  {
>      int gen;
>  
> -    if (!mce_broadcast)
> +    if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )

this_cpu() please instead of __get_cpu_var() (which we should get
rid of rather sooner than later).

> @@ -462,6 +474,7 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
>      uint64_t gstatus;
>      mctelem_cookie_t mctc = NULL;
>      struct mca_summary bs;
> +    bool *lmce_in_process = &__get_cpu_var(lmce_in_process);
>  
>      mce_spin_lock(&mce_logout_lock);
>  
> @@ -505,6 +518,8 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
>      }
>      mce_spin_unlock(&mce_logout_lock);
>  
> +    *lmce_in_process = bs.lmce;

You don't need a new local variable for this.

> @@ -1709,6 +1724,7 @@ static void mce_softirq(void)
>  {
>      int cpu = smp_processor_id();
>      unsigned int workcpu;
> +    bool lmce = per_cpu(lmce_in_process, cpu);

Is this flag valid to be looked at anymore at this point in time? MCIP
was cleared a lot earlier, so there may well have been a 2nd #MC
in between. In any event you again don#t need the local variable
here afaict.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 13/19] x86/mce_intel: detect and enable LMCE on Intel host
  2017-02-17  6:39 ` [PATCH 13/19] x86/mce_intel: detect and enable LMCE on Intel host Haozhong Zhang
@ 2017-02-22 15:10   ` Jan Beulich
  2017-02-23  3:16     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-22 15:10 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/mcheck/mce.h
> +++ b/xen/arch/x86/cpu/mcheck/mce.h
> @@ -38,6 +38,7 @@ enum mcheck_type {
>  };
>  
>  extern uint8_t cmci_apic_vector;
> +extern bool lmce_support;
>  
>  /* Init functions */
>  enum mcheck_type amd_mcheck_init(struct cpuinfo_x86 *c);
> diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c 
> b/xen/arch/x86/cpu/mcheck/mce_intel.c
> index 9e5ee3d..b4cc41a 100644
> --- a/xen/arch/x86/cpu/mcheck/mce_intel.c
> +++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
> @@ -29,6 +29,9 @@ boolean_param("mce_fb", mce_force_broadcast);
>  
>  static int __read_mostly nr_intel_ext_msrs;
>  
> +/* If mce_force_broadcast == 1, lmce_support will be disabled forcibly. */
> +bool __read_mostly lmce_support = 0;

false (but really there's no need for an initializer here)

> @@ -677,10 +680,34 @@ static int mce_is_broadcast(struct cpuinfo_x86 *c)
>      return 0;
>  }
>  
> +static bool intel_enable_lmce(void)
> +{
> +    uint64_t msr_content;
> +
> +    /*
> +     * Section "Enabling Local Machine Check" in Intel SDM Vol 3
> +     * requires software must ensure the LOCK bit and LMCE_ON bit
> +     * of MSR_IA32_FEATURE_CONTROL are set before setting
> +     * MSR_IA32_MCG_EXT_CTL.LMCE_EN.
> +     */
> +
> +    if ( rdmsr_safe(MSR_IA32_FEATURE_CONTROL, msr_content) )
> +        return 0;

false (and so on further down)

> +    if ( msr_content &
> +         (IA32_FEATURE_CONTROL_LOCK | IA32_FEATURE_CONTROL_LMCE_ON) )

This checks whether at least one of the bits is on, which isn't in
line with the comment.

>  static void intel_init_mca(struct cpuinfo_x86 *c)
>  {
> -    bool_t broadcast, cmci = 0, ser = 0;
> +    bool_t broadcast, cmci = 0, ser = 0, lmce = 0;

Please use the opportunity to change to plain bool (and false).

> @@ -700,26 +727,31 @@ static void intel_init_mca(struct cpuinfo_x86 *c)
>  
>      first = mce_firstbank(c);
>  
> +    if ( !mce_force_broadcast && (msr_content & MCG_LMCE_P) )

Please make all your additions match the prevailing coding style in
this file (which admittedly is neither ours nor Linux'es, but a mix).

> +        lmce = intel_enable_lmce();
> +
>      if (smp_processor_id() == 0)
>      {
>          dprintk(XENLOG_INFO, "MCA Capability: BCAST %x SER %x"
> -                " CMCI %x firstbank %x extended MCE MSR %x\n",
> -                broadcast, ser, cmci, first, ext_num);
> +                " CMCI %x firstbank %x extended MCE MSR %x LMCE %x\n",
> +                broadcast, ser, cmci, first, ext_num, lmce);

Please can you switch over to not printing booleans as numbers
here, but simply omitting the respective string from the output if
a feature is not there? Only actual numbers should be printed as
such.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 14/19] x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL
  2017-02-17  6:39 ` [PATCH 14/19] x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL Haozhong Zhang
@ 2017-02-22 15:20   ` Jan Beulich
  2017-02-23  4:10     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-22 15:20 UTC (permalink / raw)
  To: Haozhong Zhang
  Cc: Kevin Tian, Liu Jinsong, Christoph Egger, xen-devel,
	Jun Nakajima, Andrew Cooper

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/mcheck/mce_intel.c
> +++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
> @@ -916,3 +916,7 @@ int vmce_intel_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
>      return 1;
>  }
>  
> +bool vmce_support_lmce(const struct vcpu *v)
> +{
> +    return !!(v->arch.vmce.mcg_cap & MCG_LMCE_P);

No need for !! here.

> @@ -2634,6 +2635,9 @@ static int is_last_branch_msr(u32 ecx)
>  
>  static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
>  {
> +    struct vcpu *v = current;

curr please.

> +    struct domain *d = v->domain;

No need for this local variable afaics.

> @@ -2651,6 +2655,12 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
>          __vmread(GUEST_IA32_DEBUGCTL, msr_content);
>          break;
>      case MSR_IA32_FEATURE_CONTROL:
> +        *msr_content = IA32_FEATURE_CONTROL_LOCK;
> +        if ( vmce_support_lmce(v) )
> +            *msr_content |= IA32_FEATURE_CONTROL_LMCE_ON;
> +        if ( nestedhvm_enabled(d) && d->arch.cpuid->basic.vmx )

Doesn't the right side false imply the left side to be false?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL
  2017-02-17  6:39 ` [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL Haozhong Zhang
@ 2017-02-22 15:36   ` Jan Beulich
  2017-02-23  4:26     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-22 15:36 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> @@ -190,6 +191,25 @@ int vmce_rdmsr(uint32_t msr, uint64_t *val)
>              *val = ~0ULL;
>          mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_CTL %#"PRIx64"\n", cur, *val);
>          break;
> +    case MSR_IA32_MCG_EXT_CTL:
> +        /*
> +         * If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, the LMCE and LOCK
> +         * bits are always set in guest MSR_IA32_FEATURE_CONTROL by Xen , so it

Stray blank before comma.

> +         * does not need to check them here.
> +         */
> +        if ( !(cur->arch.vmce.mcg_cap & MCG_LMCE_P) )

Please invert the condition (swapping if and else blocks): This is not
only easier to read, but also gives the compiler correct information
on which case we expect to be the preferred (normal) one (at least
in the long run).

> +        {
> +            ret = -1;
> +            mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_EXT_CTL, not supported\n",
> +                       cur);
> +        }
> +        else
> +        {
> +            *val = cur->arch.vmce.lmce_enabled ? MCG_EXT_CTL_LMCE_EN : 0;
> +            mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_EXT_CTL %#"PRIx64"\n",
> +                       cur, *val);
> +        }
> +        break;
>      default:

Even if this isn't currently the case for the rest of this switch
statement, please add blank lines between non-fall-through case
blocks.

> --- a/xen/include/asm-x86/mce.h
> +++ b/xen/include/asm-x86/mce.h
> @@ -29,6 +29,7 @@ struct vmce {
>      uint64_t mcg_status;
>      spinlock_t lock;
>      struct vmce_bank bank[GUEST_MC_BANK_NUM];
> +    bool lmce_enabled;

I think this better goes ahead of bank[].

> --- a/xen/include/public/arch-x86/hvm/save.h
> +++ b/xen/include/public/arch-x86/hvm/save.h
> @@ -599,6 +599,8 @@ struct hvm_vmce_vcpu {
>      uint64_t caps;
>      uint64_t mci_ctl2_bank0;
>      uint64_t mci_ctl2_bank1;
> +    uint8_t  lmce_enabled;
> +    uint8_t  _pad[7];

This implicitly is a domctl interface change, so you need to bump the
interface version there (this hasn't happened yet afaics after 4.8
was branched off).

Plus - no leading underscore please.

All of this said - is this really a per-vCPU property, instead of a
per-domain one?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 16/19] x86/vmce: enable injecting LMCE to guest on Intel host
  2017-02-17  6:39 ` [PATCH 16/19] x86/vmce: enable injecting LMCE to guest on Intel host Haozhong Zhang
@ 2017-02-22 15:48   ` Jan Beulich
  2017-02-23  4:48     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-22 15:48 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/mcheck/mcaction.c
> +++ b/xen/arch/x86/cpu/mcheck/mcaction.c
> @@ -88,17 +88,19 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
>                      goto vmce_failed;
>                  }
>  
> -                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
> +                vmce_vcpuid = global->mc_vcpuid;
> +                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> +                     (vmce_vcpuid == -1 ||

What is this -1 matching up with? Or to say it differently, this needs a
#define used at both producing and consuming sides.

> +                      global->mc_domid != bank->mc_domid ||

I don't understand this check.

> --- a/xen/arch/x86/cpu/mcheck/vmce.c
> +++ b/xen/arch/x86/cpu/mcheck/vmce.c
> @@ -444,14 +444,21 @@ static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status,
>  }
>  
>  int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
> -                   uint64_t gstatus, bool broadcast)
> +                   uint64_t gstatus, int vmce_vcpuid)
>  {
>      struct vcpu *v = d->vcpu[0];
> +    bool broadcast = (vmce_vcpuid == VMCE_INJECT_BROADCAST);
>      int ret;
>  
>      if ( mc_bank->mc_domid == (uint16_t)~0 )
>          return -EINVAL;
>  
> +    if ( (gstatus & MCG_STATUS_LMCE) && !broadcast )
> +        v = d->vcpu[vmce_vcpuid];

While the ID looks to be coming from a trustworthy source, I'd
still prefer if there was at least an ASSERT() for it to be in range.

> +    if ( broadcast )
> +        gstatus &= ~MCG_STATUS_LMCE;

Please combine the two if()s:

    if ( broadcast )
        ...
    else if ( gstatus & MCG_STATUS_LMCE )
        ...

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP
  2017-02-17  6:39 ` [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP Haozhong Zhang
  2017-02-20 12:32   ` Wei Liu
@ 2017-02-22 15:55   ` Jan Beulich
  2017-02-23  5:07     ` Haozhong Zhang
  1 sibling, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-22 15:55 UTC (permalink / raw)
  To: Haozhong Zhang
  Cc: Andrew Cooper, Christoph Egger, Wei Liu, Ian Jackson, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/mcheck/vmce.c
> +++ b/xen/arch/x86/cpu/mcheck/vmce.c
> @@ -74,7 +74,7 @@ int vmce_restore_vcpu(struct vcpu *v, const struct 
> hvm_vmce_vcpu *ctxt)
>      unsigned long guest_mcg_cap;
>  
>      if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
> -        guest_mcg_cap = INTEL_GUEST_MCG_CAP;
> +        guest_mcg_cap = INTEL_GUEST_MCG_CAP | (lmce_support ? MCG_LMCE_P : 0);
>      else
>          guest_mcg_cap = AMD_GUEST_MCG_CAP;
>  

Is it really a problem to restore a guest that has LMCE enabled on
an LMCE-incapable host? I.e. doesn't the guest need to be aware
of MCEs with the local bit clear anyway?

> @@ -4185,6 +4186,12 @@ static int hvmop_set_param(
>          }
>          d->arch.x87_fip_width = a.value;
>          break;
> +    case HVM_PARAM_LMCE:
> +        if ( a.value > 1 )
> +            rc = -EINVAL;
> +        else if ( a.value == 1 )
> +            rc = vmce_enable_lmce(d);
> +        break;

If you named the param slightly differently, and if you introduced
a #define for the one bit you care about, it could be ready for use
for further future flags right away.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2
  2017-02-17  6:39 ` [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2 Haozhong Zhang
@ 2017-02-22 15:59   ` Jan Beulich
  2017-02-23  5:14     ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-22 15:59 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> --- a/xen/arch/x86/cpu/mcheck/mce.c
> +++ b/xen/arch/x86/cpu/mcheck/mce.c
> @@ -1510,6 +1510,7 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
>      {
>          const cpumask_t *cpumap;
>          cpumask_var_t cmv;
> +        int cpu_nr;

unsigned int (and likely no need for the _nr suffix)

Or perhaps the local variable isn't needed at all.

> @@ -1552,6 +1553,21 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
>                  send_IPI_mask(cpumap, cmci_apic_vector);
>              }
>              break;
> +        case XEN_MC_INJECT_TYPE_LMCE:
> +            if ( !lmce_support )
> +            {
> +                ret = x86_mcerr("No LMCE support in platform", -EINVAL);
> +                break;
> +            }
> +            /* ensure at most one CPU is specified */

And what use is none at all? Also - comment style (should start with
a capital).

> +            cpu_nr = cpumask_next(cpumask_first(cpumap), cpumap);
> +            if ( cpu_nr < nr_cpu_ids )
> +            {
> +                ret = x86_mcerr("More than one CPU specified", -EINVAL);
> +                break;
> +            }
> +            on_selected_cpus(cpumap, x86_mc_mceinject, NULL, 1);
> +            break;
>          default:

See earlier patch comments regarding blank lines missing here.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 12/19] x86/mce: handle LMCE locally
  2017-02-22 13:53   ` Jan Beulich
@ 2017-02-23  3:06     ` Haozhong Zhang
  2017-02-23  7:42       ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  3:06 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

On 02/22/17 06:53 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > --- a/xen/arch/x86/cpu/mcheck/barrier.c
> > +++ b/xen/arch/x86/cpu/mcheck/barrier.c
> > @@ -20,7 +20,7 @@ void mce_barrier_enter(struct mce_softirq_barrier *bar)
> >  {
> >      int gen;
> >  
> > -    if (!mce_broadcast)
> > +    if ( !mce_broadcast || __get_cpu_var(lmce_in_process) )
> 
> this_cpu() please instead of __get_cpu_var() (which we should get
> rid of rather sooner than later).

will use this_cpu()

>
> > @@ -462,6 +474,7 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
> >      uint64_t gstatus;
> >      mctelem_cookie_t mctc = NULL;
> >      struct mca_summary bs;
> > +    bool *lmce_in_process = &__get_cpu_var(lmce_in_process);
> >  
> >      mce_spin_lock(&mce_logout_lock);
> >  
> > @@ -505,6 +518,8 @@ void mcheck_cmn_handler(const struct cpu_user_regs *regs)
> >      }
> >      mce_spin_unlock(&mce_logout_lock);
> >  
> > +    *lmce_in_process = bs.lmce;
> 
> You don't need a new local variable for this.
> 
> > @@ -1709,6 +1724,7 @@ static void mce_softirq(void)
> >  {
> >      int cpu = smp_processor_id();
> >      unsigned int workcpu;
> > +    bool lmce = per_cpu(lmce_in_process, cpu);
> 
> Is this flag valid to be looked at anymore at this point in time? MCIP
> was cleared a lot earlier, so there may well have been a 2nd #MC
> in between. In any event you again don#t need the local variable
> here afaict.

A non-LMCE MC# coming in between does not cause problem. As
lmce_in_process on all CPUs are cleared, mce_softirq() on all CPUs
will sync with each other as before and finally one of them will
handle the pending LMCE.

I think the problem is one flag is not enough rather than non
needed. One lmce_in_process flag misses the following case:
1) mcheck_cmn_handler() first handles a non-LMCE MC on CPU#n and raises
   MACHINE_CHECK_SOFTIRQ.
2) Before mce_softirq() gets chance to run on CPU#n, another LMCE
   comes to CPU#n. Then mcheck_cmn_handler() sets lmce_in_process on
   CPU#n and raises MACHINE_CHECK_SOFTIRQ again.
3) mce_softirq() finally gets change to run on CPU#n. It sees
   lmce_in_process is set and consequently handles pending MCEs on
   CPU#n w/o waiting for other CPUs. However, one of the pending MCEs
   is not LMCE.

So I'm considering to introduce another local flag "mce_in_process" to
indicate whether there is a non-LMCE MC is pending for softirq.
1) When a non-LMCE MC# comes to CPU#n, mcheck_cmn_handler() sets
   mce_in_process flag on CPU#n.
2) When a LMCE comes to CPU#n, mcheck_cmn_handler() sets
   lmce_in_process flag on CPU#n.
3) When mce_softirq() starts, it clears lmce_in_process flag if
   mce_in_process is set, so it will not handle non-LMCE MC w/o
   waiting for other CPUs.
4) mce_softirq() clears both flags after exiting all MC barriers.

Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 13/19] x86/mce_intel: detect and enable LMCE on Intel host
  2017-02-22 15:10   ` Jan Beulich
@ 2017-02-23  3:16     ` Haozhong Zhang
  2017-02-23  7:45       ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  3:16 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/22/17 08:10 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > --- a/xen/arch/x86/cpu/mcheck/mce.h
> > +++ b/xen/arch/x86/cpu/mcheck/mce.h
> > @@ -38,6 +38,7 @@ enum mcheck_type {
> >  };
> >  
> >  extern uint8_t cmci_apic_vector;
> > +extern bool lmce_support;
> >  
> >  /* Init functions */
> >  enum mcheck_type amd_mcheck_init(struct cpuinfo_x86 *c);
> > diff --git a/xen/arch/x86/cpu/mcheck/mce_intel.c 
> > b/xen/arch/x86/cpu/mcheck/mce_intel.c
> > index 9e5ee3d..b4cc41a 100644
> > --- a/xen/arch/x86/cpu/mcheck/mce_intel.c
> > +++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
> > @@ -29,6 +29,9 @@ boolean_param("mce_fb", mce_force_broadcast);
> >  
> >  static int __read_mostly nr_intel_ext_msrs;
> >  
> > +/* If mce_force_broadcast == 1, lmce_support will be disabled forcibly. */
> > +bool __read_mostly lmce_support = 0;
> 
> false (but really there's no need for an initializer here)
> 
> > @@ -677,10 +680,34 @@ static int mce_is_broadcast(struct cpuinfo_x86 *c)
> >      return 0;
> >  }
> >  
> > +static bool intel_enable_lmce(void)
> > +{
> > +    uint64_t msr_content;
> > +
> > +    /*
> > +     * Section "Enabling Local Machine Check" in Intel SDM Vol 3
> > +     * requires software must ensure the LOCK bit and LMCE_ON bit
> > +     * of MSR_IA32_FEATURE_CONTROL are set before setting
> > +     * MSR_IA32_MCG_EXT_CTL.LMCE_EN.
> > +     */
> > +
> > +    if ( rdmsr_safe(MSR_IA32_FEATURE_CONTROL, msr_content) )
> > +        return 0;
> 
> false (and so on further down)

I'll fix these boolean stuffs here and in other patches.

> 
> > +    if ( msr_content &
> > +         (IA32_FEATURE_CONTROL_LOCK | IA32_FEATURE_CONTROL_LMCE_ON) )
> 
> This checks whether at least one of the bits is on, which isn't in
> line with the comment.
>

I'll fix this check.

> >  static void intel_init_mca(struct cpuinfo_x86 *c)
> >  {
> > -    bool_t broadcast, cmci = 0, ser = 0;
> > +    bool_t broadcast, cmci = 0, ser = 0, lmce = 0;
> 
> Please use the opportunity to change to plain bool (and false).

sure

> 
> > @@ -700,26 +727,31 @@ static void intel_init_mca(struct cpuinfo_x86 *c)
> >  
> >      first = mce_firstbank(c);
> >  
> > +    if ( !mce_force_broadcast && (msr_content & MCG_LMCE_P) )
> 
> Please make all your additions match the prevailing coding style in
> this file (which admittedly is neither ours nor Linux'es, but a mix).

The problem is the existing style in this file is not consistent. Both
if ( cond ) and if (cond) are being used in this file. I chose to use
Xen style in the new code.

> 
> > +        lmce = intel_enable_lmce();
> > +
> >      if (smp_processor_id() == 0)
> >      {
> >          dprintk(XENLOG_INFO, "MCA Capability: BCAST %x SER %x"
> > -                " CMCI %x firstbank %x extended MCE MSR %x\n",
> > -                broadcast, ser, cmci, first, ext_num);
> > +                " CMCI %x firstbank %x extended MCE MSR %x LMCE %x\n",
> > +                broadcast, ser, cmci, first, ext_num, lmce);
> 
> Please can you switch over to not printing booleans as numbers
> here, but simply omitting the respective string from the output if
> a feature is not there? Only actual numbers should be printed as
> such.

sure

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 14/19] x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL
  2017-02-22 15:20   ` Jan Beulich
@ 2017-02-23  4:10     ` Haozhong Zhang
  0 siblings, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  4:10 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Liu Jinsong, Christoph Egger, xen-devel,
	Jun Nakajima, Andrew Cooper

On 02/22/17 08:20 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > --- a/xen/arch/x86/cpu/mcheck/mce_intel.c
> > +++ b/xen/arch/x86/cpu/mcheck/mce_intel.c
> > @@ -916,3 +916,7 @@ int vmce_intel_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
> >      return 1;
> >  }
> >  
> > +bool vmce_support_lmce(const struct vcpu *v)
> > +{
> > +    return !!(v->arch.vmce.mcg_cap & MCG_LMCE_P);
> 
> No need for !! here.

will remove

> 
> > @@ -2634,6 +2635,9 @@ static int is_last_branch_msr(u32 ecx)
> >  
> >  static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
> >  {
> > +    struct vcpu *v = current;
> 
> curr please.

will change

> 
> > +    struct domain *d = v->domain;
> 
> No need for this local variable afaics.

will remove

> 
> > @@ -2651,6 +2655,12 @@ static int vmx_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
> >          __vmread(GUEST_IA32_DEBUGCTL, msr_content);
> >          break;
> >      case MSR_IA32_FEATURE_CONTROL:
> > +        *msr_content = IA32_FEATURE_CONTROL_LOCK;
> > +        if ( vmce_support_lmce(v) )
> > +            *msr_content |= IA32_FEATURE_CONTROL_LMCE_ON;
> > +        if ( nestedhvm_enabled(d) && d->arch.cpuid->basic.vmx )
> 
> Doesn't the right side false imply the left side to be false?

Yes, I'll remove the right side.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL
  2017-02-22 15:36   ` Jan Beulich
@ 2017-02-23  4:26     ` Haozhong Zhang
  2017-02-23  7:53       ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  4:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/22/17 08:36 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > @@ -190,6 +191,25 @@ int vmce_rdmsr(uint32_t msr, uint64_t *val)
> >              *val = ~0ULL;
> >          mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_CTL %#"PRIx64"\n", cur, *val);
> >          break;
> > +    case MSR_IA32_MCG_EXT_CTL:
> > +        /*
> > +         * If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, the LMCE and LOCK
> > +         * bits are always set in guest MSR_IA32_FEATURE_CONTROL by Xen , so it
> 
> Stray blank before comma.
> 
> > +         * does not need to check them here.
> > +         */
> > +        if ( !(cur->arch.vmce.mcg_cap & MCG_LMCE_P) )
> 
> Please invert the condition (swapping if and else blocks): This is not
> only easier to read, but also gives the compiler correct information
> on which case we expect to be the preferred (normal) one (at least
> in the long run).
>

will do

> > +        {
> > +            ret = -1;
> > +            mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_EXT_CTL, not supported\n",
> > +                       cur);
> > +        }
> > +        else
> > +        {
> > +            *val = cur->arch.vmce.lmce_enabled ? MCG_EXT_CTL_LMCE_EN : 0;
> > +            mce_printk(MCE_VERBOSE, "MCE: %pv: rd MCG_EXT_CTL %#"PRIx64"\n",
> > +                       cur, *val);
> > +        }
> > +        break;
> >      default:
> 
> Even if this isn't currently the case for the rest of this switch
> statement, please add blank lines between non-fall-through case
> blocks.
>

ditto

> > --- a/xen/include/asm-x86/mce.h
> > +++ b/xen/include/asm-x86/mce.h
> > @@ -29,6 +29,7 @@ struct vmce {
> >      uint64_t mcg_status;
> >      spinlock_t lock;
> >      struct vmce_bank bank[GUEST_MC_BANK_NUM];
> > +    bool lmce_enabled;
> 
> I think this better goes ahead of bank[].
>

ditto

> > --- a/xen/include/public/arch-x86/hvm/save.h
> > +++ b/xen/include/public/arch-x86/hvm/save.h
> > @@ -599,6 +599,8 @@ struct hvm_vmce_vcpu {
> >      uint64_t caps;
> >      uint64_t mci_ctl2_bank0;
> >      uint64_t mci_ctl2_bank1;
> > +    uint8_t  lmce_enabled;
> > +    uint8_t  _pad[7];
> 
> This implicitly is a domctl interface change, so you need to bump the
> interface version there (this hasn't happened yet afaics after 4.8
> was branched off).
>

ditto

> Plus - no leading underscore please.

ditto

> 
> All of this said - is this really a per-vCPU property, instead of a
> per-domain one?

Per-vCPU. At least it can be set in the per-CPU way. Patch 16, which
implements the vLMCE injection, checks this per-vcpu flag when
injecting vMCE. If the flag is cleared, vMCE (w/ MCG_STATUS_LMCE
removed) will be broadcast to all vcpus.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 16/19] x86/vmce: enable injecting LMCE to guest on Intel host
  2017-02-22 15:48   ` Jan Beulich
@ 2017-02-23  4:48     ` Haozhong Zhang
  2017-02-23  8:21       ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  4:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/22/17 08:48 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > --- a/xen/arch/x86/cpu/mcheck/mcaction.c
> > +++ b/xen/arch/x86/cpu/mcheck/mcaction.c
> > @@ -88,17 +88,19 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
> >                      goto vmce_failed;
> >                  }
> >  
> > -                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
> > +                vmce_vcpuid = global->mc_vcpuid;
> > +                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> > +                     (vmce_vcpuid == -1 ||
> 
> What is this -1 matching up with? Or to say it differently, this needs a
> #define used at both producing and consuming sides.

It comes from mca_init_global

> 
> > +                      global->mc_domid != bank->mc_domid ||
> 
> I don't understand this check.

It serves for MCE injection test (i.e. xen-mceinj). The check is
always false for LMCE that comes from the real hardware error. But for
xen-mceinj, which may specify both the injected domain and the
injected physical CPU, it's hard for xen-mceinj to ensure, when the
injection really happens, the injected CPU is still the one on which a
vCPU of the injected domain was running. So I add this check to avoid
injecting to the wrong domain.

> 
> > --- a/xen/arch/x86/cpu/mcheck/vmce.c
> > +++ b/xen/arch/x86/cpu/mcheck/vmce.c
> > @@ -444,14 +444,21 @@ static int vcpu_fill_mc_msrs(struct vcpu *v, uint64_t mcg_status,
> >  }
> >  
> >  int fill_vmsr_data(struct mcinfo_bank *mc_bank, struct domain *d,
> > -                   uint64_t gstatus, bool broadcast)
> > +                   uint64_t gstatus, int vmce_vcpuid)
> >  {
> >      struct vcpu *v = d->vcpu[0];
> > +    bool broadcast = (vmce_vcpuid == VMCE_INJECT_BROADCAST);
> >      int ret;
> >  
> >      if ( mc_bank->mc_domid == (uint16_t)~0 )
> >          return -EINVAL;
> >  
> > +    if ( (gstatus & MCG_STATUS_LMCE) && !broadcast )
> > +        v = d->vcpu[vmce_vcpuid];
> 
> While the ID looks to be coming from a trustworthy source, I'd
> still prefer if there was at least an ASSERT() for it to be in range.
>

will do

> > +    if ( broadcast )
> > +        gstatus &= ~MCG_STATUS_LMCE;
> 
> Please combine the two if()s:
> 
>     if ( broadcast )
>         ...
>     else if ( gstatus & MCG_STATUS_LMCE )
>         ...
>
ditto

Thanks,
Haozhong


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP
  2017-02-22 15:55   ` Jan Beulich
@ 2017-02-23  5:07     ` Haozhong Zhang
  0 siblings, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  5:07 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Christoph Egger, Wei Liu, Ian Jackson, xen-devel

On 02/22/17 08:55 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > --- a/xen/arch/x86/cpu/mcheck/vmce.c
> > +++ b/xen/arch/x86/cpu/mcheck/vmce.c
> > @@ -74,7 +74,7 @@ int vmce_restore_vcpu(struct vcpu *v, const struct 
> > hvm_vmce_vcpu *ctxt)
> >      unsigned long guest_mcg_cap;
> >  
> >      if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
> > -        guest_mcg_cap = INTEL_GUEST_MCG_CAP;
> > +        guest_mcg_cap = INTEL_GUEST_MCG_CAP | (lmce_support ? MCG_LMCE_P : 0);
> >      else
> >          guest_mcg_cap = AMD_GUEST_MCG_CAP;
> >  
> 
> Is it really a problem to restore a guest that has LMCE enabled on
> an LMCE-incapable host? I.e. doesn't the guest need to be aware
> of MCEs with the local bit clear anyway?

There should be no problem. I'll change it to
-        guest_mcg_cap = INTEL_GUEST_MCG_CAP;
+        guest_mcg_cap = INTEL_GUEST_MCG_CAP | MCG_LMCE_P;

> 
> > @@ -4185,6 +4186,12 @@ static int hvmop_set_param(
> >          }
> >          d->arch.x87_fip_width = a.value;
> >          break;
> > +    case HVM_PARAM_LMCE:
> > +        if ( a.value > 1 )
> > +            rc = -EINVAL;
> > +        else if ( a.value == 1 )
> > +            rc = vmce_enable_lmce(d);
> > +        break;
> 
> If you named the param slightly differently, and if you introduced
> a #define for the one bit you care about, it could be ready for use
> for further future flags right away.
>

Yes, I'll rename it to HVM_PARAM_MCE_CAP and reserve bit 0 for LMCE capability.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2
  2017-02-22 15:59   ` Jan Beulich
@ 2017-02-23  5:14     ` Haozhong Zhang
  2017-02-23  8:26       ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  5:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/22/17 08:59 -0700, Jan Beulich wrote:
> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> > --- a/xen/arch/x86/cpu/mcheck/mce.c
> > +++ b/xen/arch/x86/cpu/mcheck/mce.c
> > @@ -1510,6 +1510,7 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
> >      {
> >          const cpumask_t *cpumap;
> >          cpumask_var_t cmv;
> > +        int cpu_nr;
> 
> unsigned int (and likely no need for the _nr suffix)
> 
> Or perhaps the local variable isn't needed at all.
>

I'll inline this variable.

> > @@ -1552,6 +1553,21 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
> >                  send_IPI_mask(cpumap, cmci_apic_vector);
> >              }
> >              break;
> > +        case XEN_MC_INJECT_TYPE_LMCE:
> > +            if ( !lmce_support )
> > +            {
> > +                ret = x86_mcerr("No LMCE support in platform", -EINVAL);
> > +                break;
> > +            }
> > +            /* ensure at most one CPU is specified */
> 
> And what use is none at all? Also - comment style (should start with
> a capital).
>

Do you mean the check of empty cpumap? It's checked at the beginning of case XEN_MC_inject_v2.

> > +            cpu_nr = cpumask_next(cpumask_first(cpumap), cpumap);
> > +            if ( cpu_nr < nr_cpu_ids )
> > +            {
> > +                ret = x86_mcerr("More than one CPU specified", -EINVAL);
> > +                break;
> > +            }
> > +            on_selected_cpus(cpumap, x86_mc_mceinject, NULL, 1);
> > +            break;
> >          default:
> 
> See earlier patch comments regarding blank lines missing here.
>

I'll add a blank line before the default case.

Thanks,
Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 12/19] x86/mce: handle LMCE locally
  2017-02-23  3:06     ` Haozhong Zhang
@ 2017-02-23  7:42       ` Jan Beulich
  2017-02-23  8:38         ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-23  7:42 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

>>> On 23.02.17 at 04:06, <haozhong.zhang@intel.com> wrote:
> On 02/22/17 06:53 -0700, Jan Beulich wrote:
>> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> > @@ -1709,6 +1724,7 @@ static void mce_softirq(void)
>> >  {
>> >      int cpu = smp_processor_id();
>> >      unsigned int workcpu;
>> > +    bool lmce = per_cpu(lmce_in_process, cpu);
>> 
>> Is this flag valid to be looked at anymore at this point in time? MCIP
>> was cleared a lot earlier, so there may well have been a 2nd #MC
>> in between. In any event you again don#t need the local variable
>> here afaict.
> 
> A non-LMCE MC# coming in between does not cause problem. As
> lmce_in_process on all CPUs are cleared, mce_softirq() on all CPUs
> will sync with each other as before and finally one of them will
> handle the pending LMCE.
> 
> I think the problem is one flag is not enough rather than non
> needed. One lmce_in_process flag misses the following case:
> 1) mcheck_cmn_handler() first handles a non-LMCE MC on CPU#n and raises
>    MACHINE_CHECK_SOFTIRQ.
> 2) Before mce_softirq() gets chance to run on CPU#n, another LMCE
>    comes to CPU#n. Then mcheck_cmn_handler() sets lmce_in_process on
>    CPU#n and raises MACHINE_CHECK_SOFTIRQ again.
> 3) mce_softirq() finally gets change to run on CPU#n. It sees
>    lmce_in_process is set and consequently handles pending MCEs on
>    CPU#n w/o waiting for other CPUs. However, one of the pending MCEs
>    is not LMCE.
> 
> So I'm considering to introduce another local flag "mce_in_process" to
> indicate whether there is a non-LMCE MC is pending for softirq.
> 1) When a non-LMCE MC# comes to CPU#n, mcheck_cmn_handler() sets
>    mce_in_process flag on CPU#n.
> 2) When a LMCE comes to CPU#n, mcheck_cmn_handler() sets
>    lmce_in_process flag on CPU#n.
> 3) When mce_softirq() starts, it clears lmce_in_process flag if
>    mce_in_process is set, so it will not handle non-LMCE MC w/o
>    waiting for other CPUs.
> 4) mce_softirq() clears both flags after exiting all MC barriers.

I'm afraid I still don't see how a static flag can deal with an
arbitrary number of #MC-s arriving prior to the softirq handler
getting a chance to run after the very first one came in.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 13/19] x86/mce_intel: detect and enable LMCE on Intel host
  2017-02-23  3:16     ` Haozhong Zhang
@ 2017-02-23  7:45       ` Jan Beulich
  0 siblings, 0 replies; 78+ messages in thread
From: Jan Beulich @ 2017-02-23  7:45 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 23.02.17 at 04:16, <haozhong.zhang@intel.com> wrote:
> On 02/22/17 08:10 -0700, Jan Beulich wrote:
>> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> > @@ -700,26 +727,31 @@ static void intel_init_mca(struct cpuinfo_x86 *c)
>> >  
>> >      first = mce_firstbank(c);
>> >  
>> > +    if ( !mce_force_broadcast && (msr_content & MCG_LMCE_P) )
>> 
>> Please make all your additions match the prevailing coding style in
>> this file (which admittedly is neither ours nor Linux'es, but a mix).
> 
> The problem is the existing style in this file is not consistent. Both
> if ( cond ) and if (cond) are being used in this file. I chose to use
> Xen style in the new code.

Well, as said - the file isn't cleanly using one style. In such a case,
rather than making a function mixing styles, you should try to
match surrounding code's style (unless you feel up to making a
patch to convert the entire file to uniform style).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL
  2017-02-23  4:26     ` Haozhong Zhang
@ 2017-02-23  7:53       ` Jan Beulich
  2017-02-23  8:54         ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-23  7:53 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 23.02.17 at 05:26, <haozhong.zhang@intel.com> wrote:
> On 02/22/17 08:36 -0700, Jan Beulich wrote:
>> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> All of this said - is this really a per-vCPU property, instead of a
>> per-domain one?
> 
> Per-vCPU. At least it can be set in the per-CPU way. Patch 16, which
> implements the vLMCE injection, checks this per-vcpu flag when
> injecting vMCE. If the flag is cleared, vMCE (w/ MCG_STATUS_LMCE
> removed) will be broadcast to all vcpus.

You answer my question based on the mechanics of your patches,
but the question was from a conceptual / architectural perspective.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 16/19] x86/vmce: enable injecting LMCE to guest on Intel host
  2017-02-23  4:48     ` Haozhong Zhang
@ 2017-02-23  8:21       ` Jan Beulich
  0 siblings, 0 replies; 78+ messages in thread
From: Jan Beulich @ 2017-02-23  8:21 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 23.02.17 at 05:48, <haozhong.zhang@intel.com> wrote:
> On 02/22/17 08:48 -0700, Jan Beulich wrote:
>> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> > --- a/xen/arch/x86/cpu/mcheck/mcaction.c
>> > +++ b/xen/arch/x86/cpu/mcheck/mcaction.c
>> > @@ -88,17 +88,19 @@ mc_memerr_dhandler(struct mca_binfo *binfo,
>> >                      goto vmce_failed;
>> >                  }
>> >  
>> > -                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL )
>> > +                vmce_vcpuid = global->mc_vcpuid;
>> > +                if ( boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
>> > +                     (vmce_vcpuid == -1 ||
>> 
>> What is this -1 matching up with? Or to say it differently, this needs a
>> #define used at both producing and consuming sides.
> 
> It comes from mca_init_global

Now that's why I did add the second sentence: I understand that,
but prior to this patch there was (in the hypervisor) only a
producing side. That's bad enough, because it leaves the consumer
(outside of the hypervisor) at the mercy of this value remaining at
that hard coded -1, and doesn't allow for making the connection.
The latest with you adding a consumer in the hypervisor, the
values should become connectable to one another by using a
suitable #define on both sides. Quite likely that #define should
actually become part of the public interface, so that users of the
value have a proper item to compare against. Having to explain
this made me look at the type of the field, btw: It's uint16_t, so
the comparison above is always false.

The situation looks to be even worse for mc_domid, so I think I'll
prepare a patch (where I then may include the mc_vcpuid aspect
right away).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2
  2017-02-23  5:14     ` Haozhong Zhang
@ 2017-02-23  8:26       ` Jan Beulich
  2017-02-23  9:14         ` Haozhong Zhang
  0 siblings, 1 reply; 78+ messages in thread
From: Jan Beulich @ 2017-02-23  8:26 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 23.02.17 at 06:14, <haozhong.zhang@intel.com> wrote:
> On 02/22/17 08:59 -0700, Jan Beulich wrote:
>> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> > @@ -1552,6 +1553,21 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
>> >                  send_IPI_mask(cpumap, cmci_apic_vector);
>> >              }
>> >              break;
>> > +        case XEN_MC_INJECT_TYPE_LMCE:
>> > +            if ( !lmce_support )
>> > +            {
>> > +                ret = x86_mcerr("No LMCE support in platform", -EINVAL);
>> > +                break;
>> > +            }
>> > +            /* ensure at most one CPU is specified */
>> 
>> And what use is none at all? Also - comment style (should start with
>> a capital).
>>
> 
> Do you mean the check of empty cpumap? It's checked at the beginning of case 
> XEN_MC_inject_v2.

To be honest, I don't see any such check. But looking at that code
makes me notice you should also forbid the combination of
XEN_MC_INJECT_CPU_BROADCAST and XEN_MC_INJECT_TYPE_LMCE.

>> > +            cpu_nr = cpumask_next(cpumask_first(cpumap), cpumap);
>> > +            if ( cpu_nr < nr_cpu_ids )
>> > +            {
>> > +                ret = x86_mcerr("More than one CPU specified", -EINVAL);
>> > +                break;
>> > +            }
>> > +            on_selected_cpus(cpumap, x86_mc_mceinject, NULL, 1);
>> > +            break;
>> >          default:
>> 
>> See earlier patch comments regarding blank lines missing here.
> 
> I'll add a blank line before the default case.

And another one before the new case label you add.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 12/19] x86/mce: handle LMCE locally
  2017-02-23  7:42       ` Jan Beulich
@ 2017-02-23  8:38         ` Haozhong Zhang
  0 siblings, 0 replies; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  8:38 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Liu Jinsong, Christoph Egger, xen-devel, Andrew Cooper

On 02/23/17 00:42 -0700, Jan Beulich wrote:
> >>> On 23.02.17 at 04:06, <haozhong.zhang@intel.com> wrote:
> > On 02/22/17 06:53 -0700, Jan Beulich wrote:
> >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> >> > @@ -1709,6 +1724,7 @@ static void mce_softirq(void)
> >> >  {
> >> >      int cpu = smp_processor_id();
> >> >      unsigned int workcpu;
> >> > +    bool lmce = per_cpu(lmce_in_process, cpu);
> >> 
> >> Is this flag valid to be looked at anymore at this point in time? MCIP
> >> was cleared a lot earlier, so there may well have been a 2nd #MC
> >> in between. In any event you again don#t need the local variable
> >> here afaict.
> > 
> > A non-LMCE MC# coming in between does not cause problem. As
> > lmce_in_process on all CPUs are cleared, mce_softirq() on all CPUs
> > will sync with each other as before and finally one of them will
> > handle the pending LMCE.
> > 
> > I think the problem is one flag is not enough rather than non
> > needed. One lmce_in_process flag misses the following case:
> > 1) mcheck_cmn_handler() first handles a non-LMCE MC on CPU#n and raises
> >    MACHINE_CHECK_SOFTIRQ.
> > 2) Before mce_softirq() gets chance to run on CPU#n, another LMCE
> >    comes to CPU#n. Then mcheck_cmn_handler() sets lmce_in_process on
> >    CPU#n and raises MACHINE_CHECK_SOFTIRQ again.
> > 3) mce_softirq() finally gets change to run on CPU#n. It sees
> >    lmce_in_process is set and consequently handles pending MCEs on
> >    CPU#n w/o waiting for other CPUs. However, one of the pending MCEs
> >    is not LMCE.
> > 
> > So I'm considering to introduce another local flag "mce_in_process" to
> > indicate whether there is a non-LMCE MC is pending for softirq.
> > 1) When a non-LMCE MC# comes to CPU#n, mcheck_cmn_handler() sets
> >    mce_in_process flag on CPU#n.
> > 2) When a LMCE comes to CPU#n, mcheck_cmn_handler() sets
> >    lmce_in_process flag on CPU#n.
> > 3) When mce_softirq() starts, it clears lmce_in_process flag if
> >    mce_in_process is set, so it will not handle non-LMCE MC w/o
> >    waiting for other CPUs.
> > 4) mce_softirq() clears both flags after exiting all MC barriers.
> 
> I'm afraid I still don't see how a static flag can deal with an
> arbitrary number of #MC-s arriving prior to the softirq handler
> getting a chance to run after the very first one came in.
>

mce_barrier_enter/exit need to be adapted to check both flags to
decide whether it needs to wait for others.
    void mce_barrier_enter/exit(struct mce_softirq_barrier *bar)
    {
        int gen;

    -   if (!mce_broadcast)
    +   if ( !mce_broadcast ||
    +        (!this_cpu(mce_in_process) && this_cpu(lmce_in_process)) )
            return;


When mce softirq handler finally starts to run, regardless how many
MCs have happened before,

1) if it sees only lmce_in_process flag is set, all pending MCs in
   this_cpu(mctctl.pending) are LMCE and the handler does not need to
   wait for others and mce_barrier_enter/exit allows it to do so.

2) if it sees mce_in_process flag is set, all or some pending MCs in
   this_cpu(mctctl.pending) are non-LMCE, i.e. other CPUs have
   received them as well, so mce softirq handler should wait for
   others and mce_barrier_enter/exit does wait for others.


Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL
  2017-02-23  7:53       ` Jan Beulich
@ 2017-02-23  8:54         ` Haozhong Zhang
  2017-02-23  9:04           ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  8:54 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/23/17 00:53 -0700, Jan Beulich wrote:
> >>> On 23.02.17 at 05:26, <haozhong.zhang@intel.com> wrote:
> > On 02/22/17 08:36 -0700, Jan Beulich wrote:
> >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> >> All of this said - is this really a per-vCPU property, instead of a
> >> per-domain one?
> > 
> > Per-vCPU. At least it can be set in the per-CPU way. Patch 16, which
> > implements the vLMCE injection, checks this per-vcpu flag when
> > injecting vMCE. If the flag is cleared, vMCE (w/ MCG_STATUS_LMCE
> > removed) will be broadcast to all vcpus.
> 
> You answer my question based on the mechanics of your patches,
> but the question was from a conceptual / architectural perspective.
> 

LMCE can be enabled by SW in the per-CPU way on the real hardware,
i.e. it can be enabled on some CPUs while disabled on others.
lmce_enabled is to track whether guest SW enables LMCE on a vcpu, so
it should be a per-vcpu property.

Haozhong


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL
  2017-02-23  8:54         ` Haozhong Zhang
@ 2017-02-23  9:04           ` Jan Beulich
  0 siblings, 0 replies; 78+ messages in thread
From: Jan Beulich @ 2017-02-23  9:04 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 23.02.17 at 09:54, <haozhong.zhang@intel.com> wrote:
> On 02/23/17 00:53 -0700, Jan Beulich wrote:
>> >>> On 23.02.17 at 05:26, <haozhong.zhang@intel.com> wrote:
>> > On 02/22/17 08:36 -0700, Jan Beulich wrote:
>> >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> >> All of this said - is this really a per-vCPU property, instead of a
>> >> per-domain one?
>> > 
>> > Per-vCPU. At least it can be set in the per-CPU way. Patch 16, which
>> > implements the vLMCE injection, checks this per-vcpu flag when
>> > injecting vMCE. If the flag is cleared, vMCE (w/ MCG_STATUS_LMCE
>> > removed) will be broadcast to all vcpus.
>> 
>> You answer my question based on the mechanics of your patches,
>> but the question was from a conceptual / architectural perspective.
>> 
> 
> LMCE can be enabled by SW in the per-CPU way on the real hardware,
> i.e. it can be enabled on some CPUs while disabled on others.
> lmce_enabled is to track whether guest SW enables LMCE on a vcpu, so
> it should be a per-vcpu property.

Good point.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2
  2017-02-23  8:26       ` Jan Beulich
@ 2017-02-23  9:14         ` Haozhong Zhang
  2017-02-23  9:22           ` Jan Beulich
  0 siblings, 1 reply; 78+ messages in thread
From: Haozhong Zhang @ 2017-02-23  9:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Christoph Egger, xen-devel

On 02/23/17 01:26 -0700, Jan Beulich wrote:
> >>> On 23.02.17 at 06:14, <haozhong.zhang@intel.com> wrote:
> > On 02/22/17 08:59 -0700, Jan Beulich wrote:
> >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
> >> > @@ -1552,6 +1553,21 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) u_xen_mc)
> >> >                  send_IPI_mask(cpumap, cmci_apic_vector);
> >> >              }
> >> >              break;
> >> > +        case XEN_MC_INJECT_TYPE_LMCE:
> >> > +            if ( !lmce_support )
> >> > +            {
> >> > +                ret = x86_mcerr("No LMCE support in platform", -EINVAL);
> >> > +                break;
> >> > +            }
> >> > +            /* ensure at most one CPU is specified */
> >> 
> >> And what use is none at all? Also - comment style (should start with
> >> a capital).
> >>
> > 
> > Do you mean the check of empty cpumap? It's checked at the beginning of case 
> > XEN_MC_inject_v2.
> 
> To be honest, I don't see any such check. But looking at that code
> makes me notice you should also forbid the combination of
> XEN_MC_INJECT_CPU_BROADCAST and XEN_MC_INJECT_TYPE_LMCE.
>

I mean the following check. Of course, the additional check you
suggested must go before it.
        if ( op->u.mc_inject_v2.flags & XEN_MC_INJECT_CPU_BROADCAST )
            cpumap = &cpu_online_map;
        else
        {
            ret = xenctl_bitmap_to_cpumask(&cmv, &op->u.mc_inject_v2.cpumap);
            if ( ret )
                break;
            cpumap = cmv;
            if ( !cpumask_intersects(cpumap, &cpu_online_map) )  <=== I mean this one exactly
            {
                free_cpumask_var(cmv);
                ret = x86_mcerr("No online CPU passed\n", -EINVAL);
                break;
            }
            if ( !cpumask_subset(cpumap, &cpu_online_map) )
                dprintk(XENLOG_INFO,
                        "Not all required CPUs are online\n");
        }


Haozhong

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

* Re: [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2
  2017-02-23  9:14         ` Haozhong Zhang
@ 2017-02-23  9:22           ` Jan Beulich
  0 siblings, 0 replies; 78+ messages in thread
From: Jan Beulich @ 2017-02-23  9:22 UTC (permalink / raw)
  To: Haozhong Zhang; +Cc: Andrew Cooper, Christoph Egger, xen-devel

>>> On 23.02.17 at 10:14, <haozhong.zhang@intel.com> wrote:
> On 02/23/17 01:26 -0700, Jan Beulich wrote:
>> >>> On 23.02.17 at 06:14, <haozhong.zhang@intel.com> wrote:
>> > On 02/22/17 08:59 -0700, Jan Beulich wrote:
>> >> >>> On 17.02.17 at 07:39, <haozhong.zhang@intel.com> wrote:
>> >> > @@ -1552,6 +1553,21 @@ long do_mca(XEN_GUEST_HANDLE_PARAM(xen_mc_t) 
> u_xen_mc)
>> >> >                  send_IPI_mask(cpumap, cmci_apic_vector);
>> >> >              }
>> >> >              break;
>> >> > +        case XEN_MC_INJECT_TYPE_LMCE:
>> >> > +            if ( !lmce_support )
>> >> > +            {
>> >> > +                ret = x86_mcerr("No LMCE support in platform", -EINVAL);
>> >> > +                break;
>> >> > +            }
>> >> > +            /* ensure at most one CPU is specified */
>> >> 
>> >> And what use is none at all? Also - comment style (should start with
>> >> a capital).
>> >>
>> > 
>> > Do you mean the check of empty cpumap? It's checked at the beginning of case 
>> > XEN_MC_inject_v2.
>> 
>> To be honest, I don't see any such check. But looking at that code
>> makes me notice you should also forbid the combination of
>> XEN_MC_INJECT_CPU_BROADCAST and XEN_MC_INJECT_TYPE_LMCE.
>>
> 
> I mean the following check. Of course, the additional check you
> suggested must go before it.
>         if ( op->u.mc_inject_v2.flags & XEN_MC_INJECT_CPU_BROADCAST )
>             cpumap = &cpu_online_map;
>         else
>         {
>             ret = xenctl_bitmap_to_cpumask(&cmv, &op->u.mc_inject_v2.cpumap);
>             if ( ret )
>                 break;
>             cpumap = cmv;
>             if ( !cpumask_intersects(cpumap, &cpu_online_map) )  <=== I mean this one exactly

I've been blind, so I'm sorry for the noise.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 78+ messages in thread

end of thread, other threads:[~2017-02-23  9:22 UTC | newest]

Thread overview: 78+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-17  6:39 [PATCH 00/19] MCE code cleanup and add LMCE support Haozhong Zhang
2017-02-17  6:39 ` [PATCH 01/19] x86/mce: fix indentation style in xen-mca.h and mce.h Haozhong Zhang
2017-02-17  9:49   ` Jan Beulich
2017-02-17  6:39 ` [PATCH 02/19] x86/mce: remove declarations of non-existing functions in mce.h Haozhong Zhang
2017-02-17  9:50   ` Jan Beulich
2017-02-17  6:39 ` [PATCH 03/19] x86/mce: remove unnecessary braces around intel_get_extended_msrs() Haozhong Zhang
2017-02-17  9:51   ` Jan Beulich
2017-02-17  6:39 ` [PATCH 04/19] xen/mce: remove unused x86_mcinfo_add() Haozhong Zhang
2017-02-17  9:55   ` Jan Beulich
2017-02-20  1:52     ` Haozhong Zhang
2017-02-20  9:00       ` Jan Beulich
2017-02-20  9:10         ` Haozhong Zhang
2017-02-17  6:39 ` [PATCH 05/19] x86/mce: merge loops to get Intel extended MC MSR Haozhong Zhang
2017-02-17  9:58   ` Jan Beulich
2017-02-20  1:11     ` Haozhong Zhang
2017-02-17  6:39 ` [PATCH 06/19] x86/mce: merge intel_default_mce_dhandler/uhandler() Haozhong Zhang
2017-02-17 10:01   ` Jan Beulich
2017-02-20  2:40     ` Haozhong Zhang
2017-02-17  6:39 ` [PATCH 07/19] x86/vmce: include domain/vcpu id in debug messages Haozhong Zhang
2017-02-17 10:03   ` Jan Beulich
2017-02-17  6:39 ` [PATCH 08/19] x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve() Haozhong Zhang
2017-02-17 10:07   ` Jan Beulich
2017-02-20  2:48     ` Haozhong Zhang
2017-02-20  9:02       ` Jan Beulich
2017-02-20  9:11         ` Haozhong Zhang
2017-02-17  6:39 ` [PATCH 09/19] x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case Haozhong Zhang
2017-02-17 10:21   ` Jan Beulich
2017-02-20  4:36     ` Haozhong Zhang
2017-02-20  9:04       ` Jan Beulich
2017-02-20  9:12         ` Haozhong Zhang
2017-02-17  6:39 ` [PATCH 10/19] x86/mce: always write 0 to MSR_IA32_MCG_STATUS on Intel CPU Haozhong Zhang
2017-02-17 10:26   ` Jan Beulich
2017-02-17 15:01     ` Boris Ostrovsky
2017-02-17 15:13       ` Jan Beulich
2017-02-17 15:38         ` Boris Ostrovsky
2017-02-17  6:39 ` [PATCH 11/19] tools/xen-mceinj: fix the type of cpu number Haozhong Zhang
2017-02-17 10:08   ` Jan Beulich
2017-02-20  2:49     ` Haozhong Zhang
2017-02-20 12:29     ` Wei Liu
2017-02-17  6:39 ` [PATCH 12/19] x86/mce: handle LMCE locally Haozhong Zhang
2017-02-22 13:53   ` Jan Beulich
2017-02-23  3:06     ` Haozhong Zhang
2017-02-23  7:42       ` Jan Beulich
2017-02-23  8:38         ` Haozhong Zhang
2017-02-17  6:39 ` [PATCH 13/19] x86/mce_intel: detect and enable LMCE on Intel host Haozhong Zhang
2017-02-22 15:10   ` Jan Beulich
2017-02-23  3:16     ` Haozhong Zhang
2017-02-23  7:45       ` Jan Beulich
2017-02-17  6:39 ` [PATCH 14/19] x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL Haozhong Zhang
2017-02-22 15:20   ` Jan Beulich
2017-02-23  4:10     ` Haozhong Zhang
2017-02-17  6:39 ` [PATCH 15/19] x86/vmce: emulate MSR_IA32_MCG_EXT_CTL Haozhong Zhang
2017-02-22 15:36   ` Jan Beulich
2017-02-23  4:26     ` Haozhong Zhang
2017-02-23  7:53       ` Jan Beulich
2017-02-23  8:54         ` Haozhong Zhang
2017-02-23  9:04           ` Jan Beulich
2017-02-17  6:39 ` [PATCH 16/19] x86/vmce: enable injecting LMCE to guest on Intel host Haozhong Zhang
2017-02-22 15:48   ` Jan Beulich
2017-02-23  4:48     ` Haozhong Zhang
2017-02-23  8:21       ` Jan Beulich
2017-02-17  6:39 ` [PATCH 17/19] x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP Haozhong Zhang
2017-02-20 12:32   ` Wei Liu
2017-02-20 12:38     ` Jan Beulich
2017-02-20 14:12       ` Wei Liu
2017-02-20 23:55     ` Haozhong Zhang
2017-02-22 15:55   ` Jan Beulich
2017-02-23  5:07     ` Haozhong Zhang
2017-02-17  6:39 ` [PATCH 18/19] xen/mce: add support of vLMCE injection to XEN_MC_inject_v2 Haozhong Zhang
2017-02-22 15:59   ` Jan Beulich
2017-02-23  5:14     ` Haozhong Zhang
2017-02-23  8:26       ` Jan Beulich
2017-02-23  9:14         ` Haozhong Zhang
2017-02-23  9:22           ` Jan Beulich
2017-02-17  6:39 ` [PATCH 19/19] tools/xen-mceinj: support injecting LMCE Haozhong Zhang
2017-02-20 12:53   ` Wei Liu
2017-02-20 23:50     ` Haozhong Zhang
2017-02-21  9:18       ` Wei Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.