[RFC] RAS(Part II)--MCA enabling in XEN

* [RFC] RAS(Part II)--MCA enabling in XEN
@ 2009-02-16  5:35 Ke, Liping
  2009-02-16 13:34 ` Christoph Egger
  0 siblings, 1 reply; 45+ messages in thread
From: Ke, Liping @ 2009-02-16  5:35 UTC (permalink / raw)
  To: Keir Fraser, Christoph Egger, Frank.Vanderlinden, Gavin Maltby, Jia
  Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 2891 bytes --]

Hi, all
These patches enable MCA in XEN. We are sending them as an RFC first to collect feedback for refinement
before the final patch. A description text document is also attached for your reference.
 
Some implementation notes:
1) When an error happens, if it is fatal (pcc = 1) or cannot be recovered (pcc = 0, but no good recovery method exists),
    we reset the machine immediately to avoid losing logs in DOM0. Most MCA MSRs are sticky, so after reboot the
    MCA polling mechanism will send a vIRQ to DOM0 for logging.
2) When MCE# happens, all CPUs enter MCA context. The first CPU that reads and clears an error MSR bank becomes the
    owner of that MCE#. Locks/synchronization are used to determine the owner and to select the most severe error.
3) For convenience, we select the most offending CPU to do most of the processing and recovery work.
4) When MCE# happens, we do three jobs:
    a. Send vIRQ to DOM0 for logging
    b. Send vMCE# to the impacted guest (currently only injected into an impacted DOM0)
    c. Guest vMCE MSR virtualization
5) Some further improvements/additions might be done if needed:
    a) Impacted-DOM judgement algorithm.
    b) vMCE# injection is currently controlled by centralized data (vmce_data), and the injection algorithm is a bit complex.
        We could change it to an algorithm based on PER_DOM data if you prefer.
        Notes for understanding:
        1) If several banks impact one domain and those banks belong to the same pCPU, the vMCE# is injected only once.
        2) If more than one bank impacts one domain and the error banks belong to different pCPUs, the vMCE# is injected once per such pCPU.
        3) We use centralized data [two arrays, impact_domid and impact_cpus, in vmce_data] to represent the injection
            algorithm. The two array items (idx, impact_domid) and (idx, impact_cpus) combine into one item
            (idx, impact_domid, impact_cpus). This item records the impacted domain id and the map of pCPUs
            on which UC errors impacting this domain were found. From it we can decide how to inject the vMCE
            (domid, impact_times[nr_pCPUs]).
        4) Although the data structure is ready, we currently only inject vMCE# into DOM0.
    c) Connection with recovery actions (cpu/memory online/offline)
    d) More refinement and testing for HVM might be done when needed.
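
To illustrate notes 1) to 3) under item 5b, here is a minimal standalone sketch in C of the injection bookkeeping. It is
not the code from the patch: a 64-bit mask stands in for cpumask_t, and the names vmce_table, vmce_table_record and
vmce_injections are made up for illustration. The only point is that the number of vMCE# injections owed to a domain
equals the number of distinct pCPUs that reported an error impacting it.

```c
#include <stdint.h>

#define MAX_IMPACT_DOMAIN 4      /* assumed table size for this sketch */
#define NO_DOMAIN        (-1)    /* marks an unused slot */

/* Hypothetical stand-in for vmce_data: slot idx pairs a domain id with
 * the set of pCPUs on which errors impacting that domain were found. */
struct vmce_table {
    int      impact_domid[MAX_IMPACT_DOMAIN];
    uint64_t impact_cpus[MAX_IMPACT_DOMAIN];  /* bit n set: pCPU n reported */
};

static void vmce_table_init(struct vmce_table *t)
{
    for (int i = 0; i < MAX_IMPACT_DOMAIN; i++) {
        t->impact_domid[i] = NO_DOMAIN;
        t->impact_cpus[i] = 0;
    }
}

/* Record one error bank found on pCPU `cpu` that impacts `domid`.
 * Returns 0 on success, -1 on bad input or when the table is full. */
static int vmce_table_record(struct vmce_table *t, int domid, unsigned int cpu)
{
    if (domid < 0 || cpu >= 64)
        return -1;
    for (int i = 0; i < MAX_IMPACT_DOMAIN; i++) {
        if (t->impact_domid[i] == NO_DOMAIN)
            t->impact_domid[i] = domid;        /* first error for this domain */
        if (t->impact_domid[i] == domid) {
            /* A second bank on the same pCPU leaves the mask unchanged. */
            t->impact_cpus[i] |= UINT64_C(1) << cpu;
            return 0;
        }
    }
    return -1;
}

/* Number of vMCE# injections owed to `domid`: one per distinct pCPU. */
static int vmce_injections(const struct vmce_table *t, int domid)
{
    for (int i = 0; i < MAX_IMPACT_DOMAIN; i++) {
        if (t->impact_domid[i] == domid) {
            int n = 0;
            for (uint64_t m = t->impact_cpus[i]; m != 0; m &= m - 1)
                n++;                           /* clear the lowest set bit */
            return n;
        }
    }
    return 0;
}
```

For example, two error banks on pCPU1 plus one on pCPU3, all impacting DOM0, give two injections for DOM0 in this sketch.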
 
Patch Description:
1. basic_mca_support: Enable MCA support in XEN.
2. vmsr_virtualization: Guest MCE# MSR read/write virtualization support in XEN.
3. mce_dom0: DOM0 side, cooperating with XEN. Adds vIRQ and vMCE# handlers, translates the XEN log for DOM0, and
    re-uses the Linux kernel MCE handler and the MCELOG mechanism. This is mainly a demonstration patch.
 
About Test:
We did some internal testing and the results were fine.
 
Any feedback is welcome and thanks a lot for your help! :-)
Regards,
Criping

[-- Attachment #2: MCA_desc.txt --]
[-- Type: text/plain, Size: 3269 bytes --]

This document is a brief description of the patch series for MCA enabling in XEN.

With hardware MCA support now available on newer x86 platforms, and with increasing
software demand, we are working on MCA enhancements for XEN upstream. The corrected
error handling (CMCI) part is already upstream. This document focuses on uncorrected error handling.
The current XEN upstream MCA support was checked in by Christoph and already brought many
improvements. Most of our MCA work is based on it.

Our Target:
1) Narrow the impact of MCE#. Try to keep the system/guest running as much as possible.
2) Log as much information in DOM0 as possible.

Differences from the current implementation:
When enabling MCA on Intel platforms, we also made some changes, including:
1) Xen handles the MCA, i.e. Xen decides which components are impacted, takes recovery action,
   injects a virtual MCA to the guest, etc. In particular, Xen injects vMCE# directly into the impacted DOM,
   avoiding the current notification path from DOM0 to DOMU.
2) Xen provides MCA MSR virtualization so that the guest's native #MC handler can run without changes.
   With this method we benefit from guest #MC handler enhancements and do not need to maintain a PV MCA
   handler. See http://lists.xensource.com/archives/html/xen-devel/2008-12/msg00643.html for more
   information on how to support guest MCA.
3) We add an MCE owner judgement algorithm to the XEN MCE# handler, since some MCA banks are shared among CPUs.
4) We adopt two-round bank scanning: on fatal/non-recoverable errors the system is reset without clearing
   the MCA MSRs, so that the MCA information can be logged after reboot.
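
The two-round scanning in item 4 can be sketched roughly as below. This is a simplified model, not the patch code:
plain arrays stand in for the per-bank MCi_STATUS MSRs (the real handler uses rdmsrl/wrmsrl), and the function names
are invented for this example. The MCi_STATUS bit positions follow the Intel SDM. Round 1 only reads, so that a fatal
error leaves the sticky registers intact across the reset. Round 2 logs and clears.

```c
#include <stddef.h>
#include <stdint.h>

/* MCi_STATUS bits (Intel SDM) */
#define MCI_STATUS_VAL (UINT64_C(1) << 63)  /* register holds a valid error */
#define MCI_STATUS_UC  (UINT64_C(1) << 61)  /* uncorrected error */
#define MCI_STATUS_EN  (UINT64_C(1) << 60)  /* error reporting was enabled */
#define MCI_STATUS_PCC (UINT64_C(1) << 57)  /* processor context corrupt */

/* True if the bank holds an uncorrected error that was enabled for MCE#. */
static int bank_has_mce(uint64_t status)
{
    const uint64_t need = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN;
    return (status & need) == need;
}

/* Round 1: read-only scan. Returns 1 if a fatal (PCC) error is present,
 * in which case the caller resets WITHOUT clearing any bank. */
static int scan_round1_fatal(const uint64_t *banks, size_t nr_banks)
{
    for (size_t i = 0; i < nr_banks; i++)
        if (bank_has_mce(banks[i]) && (banks[i] & MCI_STATUS_PCC))
            return 1;
    return 0;
}

/* Round 2: copy each MCE bank status to `out` and clear the bank.
 * Returns the number of banks logged (at most `out_cap`). */
static size_t scan_round2_collect(uint64_t *banks, size_t nr_banks,
                                  uint64_t *out, size_t out_cap)
{
    size_t n = 0;

    for (size_t i = 0; i < nr_banks && n < out_cap; i++) {
        if (!bank_has_mce(banks[i]))
            continue;              /* corrected or disabled: leave for polling */
        out[n++] = banks[i];
        banks[i] = 0;              /* clearing marks this scanner as owner */
    }
    return n;
}
```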

Some detailed notes for MCA handling:
1)  When MCE# happens, if the error is fatal (pcc=1) or cannot be recovered, we reset the machine to avoid
     losing logs in DOM0 as far as possible. Since the MSR banks are sticky, the MCA polling
     mechanism is responsible for logging after reboot. This is why we adopt two-round bank scanning.
2)  When MCE# happens, all CPUs enter MCA context. The first CPU that reads and clears an error MSR bank becomes the
     owner of that MCE#. Locks/synchronization are used to determine the owner and to select the most severe error.
3)  XEN provides MCA MSR virtualization to the guest so that the guest's native handler can be reused.
    Currently, we only virtualize MSR read/write.
4)  When MCE# happens, we do the following jobs:
    a. Send vIRQ to DOM0 for logging. Log as completely as possible.
    b. Send vMCE# to the impacted guest (currently we inject only if the impacted guest is Dom0)
    c. Guest vMCE MSR virtualization
    d. Recovery action in XEN (offline the offending page).
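
The owner judgement in note 2 can be modelled as follows. This is only a sketch under simplifying assumptions: one
array represents a set of banks shared by all CPUs, the serialization lock is assumed to be held by the caller and is
not shown, and struct mce_round and mce_claim_banks are hypothetical names. The severity values (3 for pcc, 2 for
other uncorrected errors) mirror the ones used in the attached patch.

```c
#include <stddef.h>
#include <stdint.h>

/* MCi_STATUS bits (Intel SDM) */
#define MCI_STATUS_VAL (UINT64_C(1) << 63)  /* register holds a valid error */
#define MCI_STATUS_UC  (UINT64_C(1) << 61)  /* uncorrected error */
#define MCI_STATUS_PCC (UINT64_C(1) << 57)  /* processor context corrupt */

/* Shared result of one MCE# round (hypothetical structure). */
struct mce_round {
    int worst;         /* highest severity seen so far, 0 = none */
    int severity_cpu;  /* CPU that found it, -1 = none */
};

/* Called by each CPU in turn, inside the critical section.
 * The first CPU to see a valid UC bank clears it and thereby owns it,
 * so later CPUs find nothing there. Returns the number of banks claimed. */
static int mce_claim_banks(struct mce_round *r, int cpu,
                           uint64_t *shared_banks, size_t nr_banks)
{
    int claimed = 0, severity = 0;

    for (size_t i = 0; i < nr_banks; i++) {
        uint64_t status = shared_banks[i];
        int s;

        if (!(status & MCI_STATUS_VAL) || !(status & MCI_STATUS_UC))
            continue;
        s = (status & MCI_STATUS_PCC) ? 3 : 2;
        if (s > severity)
            severity = s;
        shared_banks[i] = 0;       /* clear: this CPU is the owner */
        claimed++;
    }
    if (severity > r->worst) {     /* remember the most severe finder */
        r->worst = severity;
        r->severity_cpu = cpu;
    }
    return claimed;
}
```

If CPU0 and CPU1 both run this over the same banks, only the one that runs first claims anything, and it becomes the
CPU chosen for further processing.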

MCE# processing sequence flow for your reference:
1)  MCE# happens and invokes the XEN MCE# handler.
2)  The XEN MCE# handler judges the severity and the impacted domain, and decides whether to reset the whole system
    or to attempt recovery.
3)  If the error can be recovered, it sends a vIRQ to DOM0 for logging, sends vMCE# to the impacted guest
    (currently we only inject into DOM0), and continues with the recovery action.
4)  The guest MCA handler is invoked after receiving the injected vMCE#. Guest MCA MSR bank reads/writes
    are trapped by the HV (vMCE# MSR virtualization).
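
As a rough picture of step 4, a trapped guest MSR read could be served from per-domain virtual state along these
lines. This is an illustrative sketch and not the vmsr_virtualization patch: struct vmce_msrs, vmce_rdmsr and the
bank count are assumptions made for this example. The MSR numbers and the stride of 4 registers per bank
(CTL, STATUS, ADDR, MISC) are the architectural IA32 ones.

```c
#include <stdint.h>

#define MSR_IA32_MCG_CAP     0x179
#define MSR_IA32_MCG_STATUS  0x17a
#define MSR_IA32_MC0_CTL     0x400
#define MSR_IA32_MC0_STATUS  0x401
#define MSR_IA32_MC0_ADDR    0x402
#define MSR_IA32_MC0_MISC    0x403

#define VMCE_NR_BANKS 4      /* assumed number of virtual banks */

/* Hypothetical per-domain virtual MCA state. */
struct vmce_msrs {
    uint64_t mcg_cap;
    uint64_t mcg_status;
    uint64_t mci_ctl[VMCE_NR_BANKS];
    uint64_t mci_status[VMCE_NR_BANKS];
    uint64_t mci_addr[VMCE_NR_BANKS];
    uint64_t mci_misc[VMCE_NR_BANKS];
};

/* Serve a trapped guest RDMSR from virtual state.
 * Returns 1 and fills *val if `msr` is an MCA MSR known to this sketch,
 * 0 otherwise (the caller would fall back to its normal MSR handling). */
static int vmce_rdmsr(const struct vmce_msrs *v, uint32_t msr, uint64_t *val)
{
    uint32_t off, bank;

    switch (msr) {
    case MSR_IA32_MCG_CAP:    *val = v->mcg_cap;    return 1;
    case MSR_IA32_MCG_STATUS: *val = v->mcg_status; return 1;
    default: break;
    }

    if (msr < MSR_IA32_MC0_CTL)
        return 0;
    off = msr - MSR_IA32_MC0_CTL;
    bank = off / 4;                 /* four MSRs per bank */
    if (bank >= VMCE_NR_BANKS)
        return 0;

    switch (off % 4) {
    case 0:  *val = v->mci_ctl[bank];    break;
    case 1:  *val = v->mci_status[bank]; break;
    case 2:  *val = v->mci_addr[bank];   break;
    default: *val = v->mci_misc[bank];   break;
    }
    return 1;
}
```

The guest's native #MC handler then sees the error it was sent through ordinary MSR reads, without any PV-specific code.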



[-- Attachment #3: basic_mca_support.patch --]
[-- Type: application/octet-stream, Size: 33217 bytes --]

diff -r 2fe33f3403f5 xen/arch/x86/cpu/mcheck/mce_intel.c

--- a/xen/arch/x86/cpu/mcheck/mce_intel.c	Fri Feb 13 18:00:22 2009 +0800
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c	Mon Feb 16 19:02:16 2009 +0800
@@ -4,9 +4,11 @@
 #include <xen/event.h>
 #include <xen/kernel.h>
 #include <xen/smp.h>
+#include <xen/delay.h>
 #include <asm/processor.h> 
 #include <asm/system.h>
 #include <asm/msr.h>
+#include <xen/softirq.h>
 #include "mce.h"
 #include "x86_mca.h"
 
@@ -162,7 +164,7 @@
     struct mc_info *mi = NULL;
     int exceptions = (read_cr4() & X86_CR4_MCE);
     int i, nr_unit = 0, uc = 0, pcc = 0;
-    uint64_t status, addr;
+    uint64_t status;
     struct mcinfo_global mcg;
     struct mcinfo_extended mce;
     unsigned int cpu;
@@ -226,8 +228,8 @@
         if (status & MCi_STATUS_MISCV)
             rdmsrl(MSR_IA32_MC0_MISC + 4 * i, mcb.mc_misc);
         if (status & MCi_STATUS_ADDRV) {
-            rdmsrl(MSR_IA32_MC0_ADDR + 4 * i, addr);
-            d = maddr_get_owner(addr);
+            rdmsrl(MSR_IA32_MC0_ADDR + 4 * i, mcb.mc_addr);
+            d = maddr_get_owner(mcb.mc_addr);
             if ( d && (calltype == MC_FLAG_CMCI || calltype == MC_FLAG_POLLED) )
                 mcb.mc_domid = d->domain_id;
         }
@@ -252,7 +254,7 @@
         mcg.mc_flags |= MC_FLAG_UNCORRECTABLE;
     else if (uc)
         mcg.mc_flags |= MC_FLAG_RECOVERABLE;
-    else /* correctable */
+    else if (nr_unit) /* correctable */
         mcg.mc_flags |= MC_FLAG_CORRECTABLE;
 
     if (nr_unit && nr_intel_ext_msrs && 
@@ -266,73 +268,561 @@
     return mi;
 }
 
+/* Below are for MCE handling */
+
+/* Log worst error severity and offending CPU.
+ * Pick this CPU for further processing in softirq */
+static int severity_cpu = -1;
+static int worst = 0;
+
+/* Lock of enter point@second round scanning in MCE# handler */
+static cpumask_t scanned_cpus;
+/* Lock for enter point@Critical Section in MCE# handler */
+static bool_t mce_enter_lock = 0;
+/* Record how many CPUs impacted in this MCE# */
+static cpumask_t impact_map;
+
+/* Lock of softirq rendezvous entering point */
+static cpumask_t mced_cpus;
+/*Lock of softirq rendezvous leaving point */
+static cpumask_t finished_cpus;
+/* Lock for picking one processing CPU */
+static bool_t mce_process_lock = 0;
+
+/* Spinlock for vMCE# MSR virtualization data */
+static DEFINE_SPINLOCK(mce_locks);
+/* Param for vMCE# injection */
+DEFINE_PER_CPU(struct softirq_trap, mce_softirq_trap);
+
+
+/* Local buffer for holding MCE# data temporarily, sharing between mce
+ * handler and softirq handler. Local buffer will be finally copied to
+ * global buffer for DOM0 LOG and per_dom related data for guest vMCE#
+ * MSR virtualization.
+ * Note: When local buffer is still in processing in softirq, another
+ * MCA comes, simply panic.
+ * TODO: We might improve this further with a lockless ring if
+ * necessary
+ */
+struct mc_local_t
+{
+    bool_t in_use;
+    struct mc_info mc[NR_CPUS];
+};
+static struct mc_local_t mc_local;
+
+/* For vMCE injection reference. It holds impacted domains and
+ * injection times for each impacted domain.
+ */
+struct intel_vmce_inject vmce_data;
+
+/* When a new MCE# comes, XEN handler will clear the old vMCE
+ * injection reference data. */
+static void init_vmce_data(void) {
+
+    for (int i = 0; i < MAX_IMPACT_DOMAIN; i++) {
+        vmce_data.impact_domid[i] = -1;
+        cpus_clear(vmce_data.impact_cpus[i]);
+    }
+}
+
+/* This node list records errors impacting a domain. when one
+ * MCE# happens, one error bank impact a domain. This error node
+ * will be inserted to the tail of the per_dom data for vMCE# MSR
+ * virtualization. When one vMCE# injection is finished, the corresponding
+ * node will be deleted. This node list is for GUEST vMCE# MSRS 
+ * virtualization.
+ */
+static struct bank_entry* alloc_bank_entry(void) {
+    struct bank_entry *entry;
+
+    entry = xmalloc(struct bank_entry);
+    if (!entry) {
+        printk(KERN_ERR "MCE: malloc bank_entry failed\n");
+        return NULL;
+    }
+    memset(entry, 0x0, sizeof(*entry));
+    INIT_LIST_HEAD(&entry->list);
+    entry->cpu = -1;
+    return entry;
+}
+
+/* Fill error bank info to #vMCE injection ref data and GUEST vMCE#
+ * MSR virtualization data
+*/
+static int fill_vmsr_data(int cpu, struct mcinfo_bank *mc_bank, 
+        uint64_t gstatus) {
+    int32_t idx, flag_new = 0;
+    struct domain *d;
+    struct bank_entry *entry;
+
+    /* This error bank impacts some DOMs, we need to fill domain related
+     * data for vMCE MSRs virtualization and vMCE# injection */
+    if (mc_bank->mc_domid != (uint16_t)~0) {
+        d = get_domain_by_id(mc_bank->mc_domid);
+
+        /* Not impact a valid domain, skip this error of the bank */
+        if (!d) {
+            printk(KERN_DEBUG "MCE: Not found valid impacted DOM\n");
+            return 0;
+        }
+
+        for (idx = 0; idx < MAX_IMPACT_DOMAIN; idx++) {
+            if (vmce_data.impact_domid[idx] == mc_bank->mc_domid) {
+                /* Note: only when the error on DIFF pCPUs,
+                 * will it be injected nr_pCPUs times. Several errors
+                 * offending one CPU which impact one domain will be
+                 * put into the one node in the impact_header list.
+                 * Correspondingly, this error is injected only once.
+                 */
+
+                if (cpu_isset(cpu, vmce_data.impact_cpus[idx])) {
+                    /* Same CPU diff bank, no need to alloc new node */
+                    printk(KERN_DEBUG "MCE: No Need to alloc node!\n");
+                    if (!list_empty(&d->arch.vmca_msrs.impact_header)) {
+                        entry = list_entry(
+                            d->arch.vmca_msrs.impact_header.prev, 
+                            struct bank_entry, list);
+                    }
+                    else {
+                        printk(KERN_ERR "MCE: impact list should"
+                            " not be empty !\n");
+                        return -1;
+                    }
+                 }
+               /* Diff CPU, need to alloc new node */
+                else {
+                    printk(KERN_DEBUG "MCE: alloc Node for DOM%d\n",
+                        mc_bank->mc_domid);
+                    entry = alloc_bank_entry();
+                    flag_new = 1;
+                }
+                cpu_set(cpu, vmce_data.impact_cpus[idx]);
+                break;
+            }
+            /* First node of the impact DOM */
+            else if (vmce_data.impact_domid[idx] == -1) {
+                printk(KERN_DEBUG "MCE: fill new record"
+                    "(IDX%d, DOM%d, CPU%d)\n", 
+                    idx, mc_bank->mc_domid, cpu);
+                vmce_data.impact_domid[idx] = 
+                                mc_bank->mc_domid;
+                printk(KERN_DEBUG "MCE: alloc Node for DOM%d\n",
+                    mc_bank->mc_domid);
+                entry = alloc_bank_entry();
+                flag_new = 1;
+                cpu_set(cpu, vmce_data.impact_cpus[idx]);
+                /* Fill MSR global status */
+                d->arch.vmca_msrs.mcg_status = gstatus;
+                break;
+            }
+        }
+        if (idx >= MAX_IMPACT_DOMAIN) {
+            printk(KERN_ERR "MCE: Errors impacts too many domains\n");
+            return -1;
+        }
+        entry->mci_status[mc_bank->mc_bank] = mc_bank->mc_status;
+        entry->mci_addr[mc_bank->mc_bank] = mc_bank->mc_addr;
+        entry->mci_misc[mc_bank->mc_bank] = mc_bank->mc_misc;
+
+        /* Something Wrong */
+        if (entry->cpu != -1 && entry->cpu != cpu) {
+            printk(KERN_ERR "MCE: vMSR Virtualization "
+                    "Data Filling Error\n");
+            return -1;
+        }
+        entry->cpu = cpu;
+
+        /* This is a new Node, insert to the tail of the per_dom data */
+        if (flag_new) {
+            printk(KERN_DEBUG "MCE: add new node for DOM%d\n", 
+                mc_bank->mc_domid);
+            list_add_tail(&entry->list, &d->arch.vmca_msrs.impact_header);
+        }
+
+        printk(KERN_DEBUG "MCE: Found error @[CPU%d BANK%d "
+                "status %lx addr %lx domid %d]\n ", entry->cpu, mc_bank->mc_bank,
+                mc_bank->mc_status, mc_bank->mc_addr, mc_bank->mc_domid);
+    }
+    return 0;
+}
+
+/* Filling vmce_data for:
+ * 1) Log down (array_idx, domain_id, impact_cpu_map) map for vMCE injection.
+ *    cpu_weight(impact_cpu_map) decides how many injections to the impacted
+ *    DOM are needed.
+ * 2) Copy MCE# info to global buffer, for DOM0 logging.
+ * 3) Copy MCE# info to impacted DOM, for vMCE# MSRs virtualization
+ */
+static int mce_actions(void) {
+    int32_t cpu, ret;
+    struct mc_info *local_mi, *global_mi;
+    struct mcinfo_common *mic = NULL;
+    struct mcinfo_global *mc_global;
+    struct mcinfo_bank *mc_bank;
+
+    /* Spinlock is used for exclusive read/write of vMSR virtualization
+     * (per_dom vMCE# data)
+     */
+    spin_lock(&mce_locks);
+
+    /* local buffer is shared between MCE handler and softirq.
+     * If softirq is filling this buffer while another MCE# comes,
+     * simply panic
+     */
+    test_and_set_bool(mc_local.in_use);
+
+    init_vmce_data();
+
+    for_each_cpu_mask(cpu, impact_map) {
+
+        local_mi = &mc_local.mc[cpu];
+        x86_mcinfo_lookup(mic, local_mi, MC_TYPE_GLOBAL);
+        if (mic == NULL) {
+            printk(KERN_ERR "MCE: get local buffer entry failed\n ");
+            ret = -1;
+            goto end;
+        }
+
+        /* Copy local data to Global buffer for DOM0 LOG */
+        mc_global = (struct mcinfo_global *)mic;
+        global_mi = x86_mcinfo_getptr();
+        if (!global_mi) {
+            printk(KERN_ERR "MCE: Get global buffer entry failed\n");
+            ret = -1;
+            goto end;
+        }
+        x86_mcinfo_clear(global_mi);
+        x86_mcinfo_add(global_mi, mc_global);
+
+        /* Processing bank information */
+        x86_mcinfo_lookup(mic, local_mi, MC_TYPE_BANK);
+
+        for ( ; mic && mic->size; mic = x86_mcinfo_next(mic) ) {
+            if (mic->type != MC_TYPE_BANK) {
+                continue;
+            }
+            mc_bank = (struct mcinfo_bank*)mic;
+            /* Copy bank info to global buffer */
+            x86_mcinfo_add(global_mi, mc_bank);
+
+            /* Fill vMCE# injection and vMCE# MSR virtualization related data */
+            if (fill_vmsr_data(cpu, mc_bank, mc_global->mc_gstatus) == -1) {
+                ret = -1;
+                goto end;
+            }
+
+            /* TODO: Add recovery actions here, such as page-offline, etc */
+
+        }
+    } /* end of impact_map loop */
+
+    /* Successfully filled all local/global buffer */
+    ret = 0;
+
+end:
+    test_and_clear_bool(mc_local.in_use);
+    spin_unlock(&mce_locks);
+    return ret;
+}
+
+/* Softirq Handler for this MCE# processing */
+static void mce_softirq(void)
+{
+    int cpu = smp_processor_id(), idx;
+    cpumask_t affinity;
+    struct softirq_trap *st = NULL;
+
+    /* Wait until all cpus entered softirq */
+    while ( cpus_weight(mced_cpus) != num_online_cpus() ) {
+        cpu_relax();
+    }
+    /* Not Found worst error on severity_cpu, it's weird */
+    if (severity_cpu == -1) {
+        printk(KERN_WARNING "MCE: not found severity_cpu!\n");
+        mc_panic("MCE: not found severity_cpu!");
+        return;
+    }
+    /* We choose severity_cpu for further processing */
+    if (severity_cpu == cpu) {
+
+        /* Step1: Fill DOM0 LOG buffer, vMCE injection buffer and
+         * vMCE MSRs virtualization buffer
+         */
+        if (mce_actions())
+            mc_panic("MCE recovery actions or Filling vMCE MSRS "
+			    "virtualization data failed!\n");
+
+        /* Step2: Send Log to DOM0 through vIRQ */
+        if (dom0 && guest_enabled_event(dom0->vcpu[0], VIRQ_MCA)) {
+            printk(KERN_DEBUG "MCE: send MCE# to DOM0 through virq\n");
+            send_guest_global_virq(dom0, VIRQ_MCA);
+        }
+
+        /* Step3: Inject vMCE to impacted DOM. Currently we care about DOM0 only */
+        for (idx = 0; idx < MAX_IMPACT_DOMAIN; idx++) {
+
+         /* Found errors impacting DOM0, bind this DOM0.vCPU0 to this pCPU */
+            if ( vmce_data.impact_domid[idx] == 0 )
+            {
+                st = &per_cpu(mce_softirq_trap, cpu);
+                st->domain = dom0;
+                st->vcpu = dom0->vcpu[0];
+                st->processor = st->vcpu->processor;
+                break;
+            }
+        }
+        if (idx < MAX_IMPACT_DOMAIN &&
+            guest_has_trap_callback
+                (st->domain, st->vcpu->vcpu_id, TRAP_machine_check)) {
+            /* inject vMCE into DOM0 cpu_weight(impact_map) times */
+            if (st && st->vcpu && !test_and_set_bool(st->vcpu->mce_pending)) {
+
+                st->vcpu->cpu_affinity_tmp = st->vcpu->cpu_affinity;
+                if (cpu != st->processor
+                    || (st->processor != st->vcpu->processor)){
+                    /* We're on the different physical cpu. Make
+                     * sure to wakeup the vcpu on the specified
+                     * processor */
+                     cpus_clear(affinity);
+                     cpu_set(cpu, affinity);
+                     printk(KERN_DEBUG "MCE: CPU%d set affinity\n", cpu);
+                     vcpu_set_affinity(st->vcpu, &affinity);
+                     /* Affinity is restored in the iRET hypercall */
+                }
+               vcpu_kick(st->vcpu);
+            }
+        }
+
+
+        /* Clean Data */
+        test_and_clear_bool(mce_process_lock);
+        cpus_clear(impact_map);
+        cpus_clear(scanned_cpus);
+        worst = 0;
+        cpus_clear(mced_cpus);
+        memset(&mc_local, 0x0, sizeof(mc_local));
+    }
+
+    cpu_set(cpu, finished_cpus);
+    wmb();
+   /* Leave until all cpus finished recovery actions in softirq */
+    while ( cpus_weight(finished_cpus) != num_online_cpus() ) {
+        cpu_relax();
+    }
+
+    cpus_clear(finished_cpus);
+    severity_cpu = -1;
+    printk(KERN_DEBUG "CPU%d exit softirq \n", cpu);
+}
+
+/* Machine Check owner judge algorithm:
+ * When an error happens, all CPUs serially read their MSR banks.
+ * The first CPU that fetches an error bank's info will clear
+ * this bank. Later readers can't get the info again.
+ * The first CPU is the actual mce_owner
+ *
+ * For Fatal (pcc=1) error, it might cause machine crash
+ * before we're able to log. For avoiding log missing, we adopt two
+ * round scanning:
+ * Round1: simply scan. If found pcc = 1 or ripv = 0, simply reset.
+ * All MCE banks are sticky, when boot up, MCE polling mechanism
+ * will help to collect and log those MCE errors.
+ * Round2: Do all MCE processing logic as normal.
+ */
+
+/* Simple Scan. Panic when found non-recovery errors. Doing this for
+ * avoiding LOG missing
+ */
+static void severity_scan(void)
+{
+    uint64_t status;
+    int32_t i;
+
+    /* TODO: For PCC = 0, we need further judgement. If the error can't be
+     * recovered, we need to RESET for avoiding DOM0 LOG missing
+     */
+    for ( i = 0; i < nr_mce_banks; i++) {
+        rdmsrl(MSR_IA32_MC0_STATUS + 4 * i , status);
+        if ( !(status & MCi_STATUS_VAL) )
+            continue;
+        /* MCE handler only handles UC error */
+        if ( !(status & MCi_STATUS_UC) )
+            continue;
+        if ( !(status & MCi_STATUS_EN) )
+            continue;
+        if (status & MCi_STATUS_PCC)
+            mc_panic("pcc = 1, cpu unable to continue\n");
+    }
+
+    /* TODO: Further judgement here, maybe we need MCACOD assistance  */
+    /* EIPV and RIPV is not a reliable way to judge the error severity */
+
+}
 static fastcall void intel_machine_check(struct cpu_user_regs * regs, long error_code)
 {
-    /* MACHINE CHECK Error handler will be sent in another patch,
-     * simply copy old solutions here. This code will be replaced
-     * by upcoming machine check patches
-     */
+    unsigned int cpu = smp_processor_id();
+    struct mc_info *mi;
+    struct mcinfo_global mcg;
+    struct mcinfo_extended mce;
+    uint64_t status;
+    int32_t uc = 0, pcc = 0, nr_unit = 0, severity = 0, i;
+    struct domain *d;
 
-    int recover=1;
-    u32 alow, ahigh, high, low;
-    u32 mcgstl, mcgsth;
-    int i;
-   
-    rdmsr(MSR_IA32_MCG_STATUS, mcgstl, mcgsth);
-    if (mcgstl & (1<<0))       /* Recoverable ? */
-        recover=0;
-    
-    printk(KERN_EMERG "CPU %d: Machine Check Exception: %08x%08x\n",
-           smp_processor_id(), mcgsth, mcgstl);
-    
-    for (i=0; i<nr_mce_banks; i++) {
-        rdmsr (MSR_IA32_MC0_STATUS+i*4,low, high);
-        if (high & (1<<31)) {
-            if (high & (1<<29))
-                recover |= 1;
-            if (high & (1<<25))
-                recover |= 2;
-            printk (KERN_EMERG "Bank %d: %08x%08x", i, high, low);
-            high &= ~(1<<31);
-            if (high & (1<<27)) {
-                rdmsr (MSR_IA32_MC0_MISC+i*4, alow, ahigh);
-                printk ("[%08x%08x]", ahigh, alow);
-            }
-            if (high & (1<<26)) {
-                rdmsr (MSR_IA32_MC0_ADDR+i*4, alow, ahigh);
-                printk (" at %08x%08x", ahigh, alow);
-            }
-            printk ("\n");
+    /* First round scanning */
+    severity_scan();
+    cpu_set(cpu, scanned_cpus);
+    while (cpus_weight(scanned_cpus) < num_online_cpus())
+        cpu_relax();
+
+    wmb();
+    /* All CPUs Finished first round scanning */
+    if (mc_local.in_use != 0) {
+        mc_panic("MCE: Local buffer is being processed, can't handle new MCE!\n");
+        return;
+    }
+
+     /* Fill local data, let the softirq process the local data */
+    mi = &mc_local.mc[cpu];
+    if (!mi) {
+        printk(KERN_ERR "MCE: Get mc_info entry failed\n");
+        mc_panic("MCE: Failed to get local buffer entry\n");
+        return;
+    }
+
+    x86_mcinfo_clear(mi);
+    memset(&mcg, 0, sizeof(mcg));
+    mcg.common.type = MC_TYPE_GLOBAL;
+    mcg.common.size = sizeof(mcg);
+
+   /* domid should be per_bank data */
+    mcg.mc_domid = -1;
+    mcg.mc_vcpuid = -1;
+    mcg.mc_flags = MC_FLAG_MCE;
+    mcg.mc_socketid = phys_proc_id[cpu];
+    mcg.mc_coreid = cpu_core_id[cpu];
+    mcg.mc_apicid = cpu_physical_id(cpu);
+    mcg.mc_core_threadid =
+        mcg.mc_apicid & ( 1 << (cpu_data[cpu].x86_num_siblings - 1));
+    rdmsrl(MSR_IA32_MCG_STATUS, mcg.mc_gstatus);
+
+    /* Enter Critical Section */
+    while (test_and_set_bool(mce_enter_lock)) {
+        udelay (1);
+    }
+
+    for ( i = 0; i < nr_mce_banks; i++) {
+        struct mcinfo_bank mcb;
+
+        memset(&mcb, 0, sizeof(mcb));
+        rdmsrl(MSR_IA32_MC0_STATUS + 4 * i , status);
+        if ( !(status & MCi_STATUS_VAL) )
+            continue;
+        /*MCE handler only deals with UC error*/
+        if ( !(status & MCi_STATUS_UC) )
+            continue;
+        uc = 1;
+        add_taint(TAINT_MACHINE_CHECK);
+        /* The bank found UC, but machine check event is
+         * not enabled. Skip and let polling deal with it.
+        */
+        if ( !(status & MCi_STATUS_EN) )
+            continue;
+        if (status & MCi_STATUS_PCC)
+            pcc = 1;
+        memset(&mcb, 0, sizeof(mcb));
+        mcb.common.type = MC_TYPE_BANK;
+        mcb.common.size = sizeof(mcb);
+        mcb.mc_bank = i;
+        mcb.mc_status = status;
+
+        if (status & MCi_STATUS_MISCV)
+            rdmsrl(MSR_IA32_MC0_MISC + 4 * i, mcb.mc_misc);
+        if (status & MCi_STATUS_ADDRV) {
+            rdmsrl(MSR_IA32_MC0_ADDR + 4 * i, mcb.mc_addr);
+
+            /* TODO: This is not the correct way. We temporarily keep it.
+             * We'll do further improvement later
+             */
+            d = maddr_get_owner(mcb.mc_addr);
+            if (d)
+                mcb.mc_domid = d->domain_id;
         }
+        rdtscll(mcb.mc_tsc);
+        x86_mcinfo_add(mi, &mcb);
+        nr_unit++;
+        /* Clear state for this bank, this CPU will be this MCE error owner */
+        wrmsrl(MSR_IA32_MC0_STATUS + 4 * i, 0);
+        printk(KERN_DEBUG "MCE: bank%i CPU%d status[%"PRIx64"]\n", 
+                i, cpu, status);
+        printk(KERN_DEBUG "MCE: SOCKET%d, CORE%d, APICID[%d], "
+                "thread[%d]\n", mcg.mc_socketid, 
+                mcg.mc_coreid, mcg.mc_apicid, mcg.mc_core_threadid);
     }
-    
-    if (recover & 2)
-        mc_panic ("CPU context corrupt");
-    if (recover & 1)
-        mc_panic ("Unable to continue");
-    
-    printk(KERN_EMERG "Attempting to continue.\n");
-    /* 
-     * Do not clear the MSR_IA32_MCi_STATUS if the error is not 
-     * recoverable/continuable.This will allow BIOS to look at the MSRs
-     * for errors if the OS could not log the error.
-     */
-    for (i=0; i<nr_mce_banks; i++) {
-        u32 msr;
-        msr = MSR_IA32_MC0_STATUS+i*4;
-        rdmsr (msr, low, high);
-        if (high&(1<<31)) {
-            /* Clear it */
-            wrmsr(msr, 0UL, 0UL);
-            /* Serialize */
-            wmb();
-            add_taint(TAINT_MACHINE_CHECK);
+
+    if (nr_unit && nr_intel_ext_msrs && 
+                    (mcg.mc_gstatus & MCG_STATUS_EIPV)) {
+        printk(KERN_DEBUG "MCE: found extension MCE MSRs\n");
+        intel_get_extended_msrs(&mce);
+        x86_mcinfo_add(mi, &mce);
+    }
+    if (!nr_unit) {
+        /* Not offending CPU, goto softirq directly */
+        cpu_set(cpu, mced_cpus);
+        test_and_clear_bool(mce_enter_lock);
+        raise_softirq(MACHINE_CHECK_SOFTIRQ);
+        return;
+    }
+
+    if (pcc) {
+        printk(KERN_WARNING "PCC=1 should have caused reset\n");
+        mcg.mc_flags |= MC_FLAG_UNCORRECTABLE;
+        severity = 3;
+    }
+    else if (uc) {
+        mcg.mc_flags |= MC_FLAG_RECOVERABLE;
+        severity = 2;
+    }
+    else {
+        printk(KERN_WARNING "We should skip Correctable Error\n");
+        severity = 1; 
+    }
+    /* This is the offending cpu! */
+    cpu_set(cpu, impact_map);
+
+    x86_mcinfo_add(mi, &mcg);
+    if ( severity > worst) {
+        worst = severity;
+        /* This CPU found more severe error! */
+        severity_cpu = cpu;
+    }
+    cpu_set(cpu, mced_cpus);
+    test_and_clear_bool(mce_enter_lock);
+    wmb();
+
+    /* Wait for all CPUs to leave the critical section */
+    while (cpus_weight(mced_cpus) < num_online_cpus())
+        cpu_relax();
+    /* Print MCE error */
+    x86_mcinfo_dump(mi);
+
+    /* Pick one CPU to clear MCIP */
+    if (!test_and_set_bool(mce_process_lock)) {
+        wrmsrl(MSR_IA32_MCG_STATUS, mcg.mc_gstatus & ~MCG_STATUS_MCIP);
+
+        if (worst >= 3) {
+            printk(KERN_WARNING "worst=3 should have caused RESET\n");
+            mc_panic("worst=3 should have caused RESET");
         }
+        else {
+            printk(KERN_DEBUG "MCE: trying to recover\n");
+        }
+
     }
-    mcgstl &= ~(1<<2);
-    wrmsr (MSR_IA32_MCG_STATUS,mcgstl, mcgsth);
+    raise_softirq(MACHINE_CHECK_SOFTIRQ);
 }
 
+
 static DEFINE_SPINLOCK(cmci_discover_lock);
 static DEFINE_PER_CPU(cpu_banks_t, no_cmci_banks);
 
@@ -488,8 +978,10 @@
     mi = machine_check_poll(MC_FLAG_CMCI);
     if (mi) {
         x86_mcinfo_dump(mi);
-        if (dom0 && guest_enabled_event(dom0->vcpu[0], VIRQ_MCA))
+        if (dom0 && guest_enabled_event(dom0->vcpu[0], VIRQ_MCA)) {
+            printk(KERN_DEBUG "MCE: send CMCI info to DOM0 through virq\n");
             send_guest_global_virq(dom0, VIRQ_MCA);
+        }
     }
     irq_exit();
 }
@@ -501,13 +993,18 @@
     intel_init_thermal(c);
 #endif
     intel_init_cmci(c);
+    init_vmce_data();
 }
 
+uint64_t g_mcg_cap;
 static void mce_cap_init(struct cpuinfo_x86 *c)
 {
     u32 l, h;
 
     rdmsr (MSR_IA32_MCG_CAP, l, h);
+    /* For Guest vMCE usage */
+    g_mcg_cap = ((u64)h << 32 | l) & (~MCG_CMCI_P);
+
     if ((l & MCG_CMCI_P) && cpu_has_apic)
         cmci_support = 1;
 
@@ -576,6 +1073,7 @@
     mce_init();
     mce_intel_feature_init(c);
     mce_set_owner();
+    open_softirq(MACHINE_CHECK_SOFTIRQ, mce_softirq);
 }
 
 /*
diff -r 2fe33f3403f5 xen/arch/x86/cpu/mcheck/x86_mca.h
--- a/xen/arch/x86/cpu/mcheck/x86_mca.h	Fri Feb 13 18:00:22 2009 +0800
+++ b/xen/arch/x86/cpu/mcheck/x86_mca.h	Mon Feb 16 19:02:16 2009 +0800
@@ -79,7 +79,6 @@
 #define CMCI_THRESHOLD			0x2
 
 
-#define MAX_NR_BANKS 128
 
 typedef DECLARE_BITMAP(cpu_banks_t, MAX_NR_BANKS);
 DECLARE_PER_CPU(cpu_banks_t, mce_banks_owned);
diff -r 2fe33f3403f5 xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c	Fri Feb 13 18:00:22 2009 +0800
+++ b/xen/arch/x86/domain.c	Mon Feb 16 19:02:16 2009 +0800
@@ -366,6 +366,7 @@
         hvm_vcpu_destroy(v);
 }
 
+extern uint64_t g_mcg_cap;
 int arch_domain_create(struct domain *d, unsigned int domcr_flags)
 {
 #ifdef __x86_64__
@@ -446,6 +447,15 @@
 
         if ( (rc = iommu_domain_init(d)) != 0 )
             goto fail;
+
+        /* For Guest vMCE MSRs virtualization */
+        d->arch.vmca_msrs.mcg_status = 0x0;
+        d->arch.vmca_msrs.mcg_cap = g_mcg_cap;
+        d->arch.vmca_msrs.mcg_ctl = (uint64_t)~0x0;
+        memset(d->arch.vmca_msrs.mci_ctl, ~0,
+            sizeof(d->arch.vmca_msrs.mci_ctl));
+        INIT_LIST_HEAD(&d->arch.vmca_msrs.impact_header);
+
     }
 
     if ( is_hvm_domain(d) )
diff -r 2fe33f3403f5 xen/arch/x86/traps.c
--- a/xen/arch/x86/traps.c	Fri Feb 13 18:00:22 2009 +0800
+++ b/xen/arch/x86/traps.c	Mon Feb 16 19:02:16 2009 +0800
@@ -728,8 +728,6 @@
         if ( !opt_allow_hugepage )
             __clear_bit(X86_FEATURE_PSE, &d);
         __clear_bit(X86_FEATURE_PGE, &d);
-        __clear_bit(X86_FEATURE_MCE, &d);
-        __clear_bit(X86_FEATURE_MCA, &d);
         __clear_bit(X86_FEATURE_PSE36, &d);
     }
     switch ( (uint32_t)regs->eax )
diff -r 2fe33f3403f5 xen/arch/x86/x86_64/traps.c
--- a/xen/arch/x86/x86_64/traps.c	Fri Feb 13 18:00:22 2009 +0800
+++ b/xen/arch/x86/x86_64/traps.c	Mon Feb 16 19:02:16 2009 +0800
@@ -14,6 +14,8 @@
 #include <xen/nmi.h>
 #include <asm/current.h>
 #include <asm/flushtlb.h>
+#include <asm/traps.h>
+#include <asm/event.h>
 #include <asm/msr.h>
 #include <asm/page.h>
 #include <asm/shared.h>
@@ -260,11 +262,16 @@
 #endif
 }
 
+extern struct intel_vmce_inject vmce_data;
+DECLARE_PER_CPU(struct softirq_trap, mce_softirq_trap);
 unsigned long do_iret(void)
 {
     struct cpu_user_regs *regs = guest_cpu_user_regs();
     struct iret_context iret_saved;
     struct vcpu *v = current;
+    struct domain *d = v->domain;
+    struct bank_entry *entry;
+    int idx, cpu = smp_processor_id(), impact_cpu;
 
     if ( unlikely(copy_from_user(&iret_saved, (void *)regs->rsp,
                                  sizeof(iret_saved))) )
@@ -304,6 +311,64 @@
        && !cpus_equal(v->cpu_affinity_tmp, v->cpu_affinity))
         vcpu_set_affinity(v, &v->cpu_affinity_tmp);
 
+    /* Currently, vMCE is only injected into DOM0. */
+
+    if (v->trap_priority >= VCPU_TRAP_NMI) {
+        struct softirq_trap *st = &per_cpu(mce_softirq_trap, cpu);
+        for (idx = 0; idx < MAX_IMPACT_DOMAIN; idx++) {
+            if (vmce_data.impact_domid[idx] == 0) {
+                impact_cpu = first_cpu(vmce_data.impact_cpus[idx]);
+                if (impact_cpu < NR_CPUS) {
+                    cpu_clear(impact_cpu, vmce_data.impact_cpus[idx]);
+                    if (!list_empty(&d->arch.vmca_msrs.impact_header)) {
+                        entry = list_entry(d->arch.vmca_msrs.impact_header.next,
+                            struct bank_entry, list);
+                        printk(KERN_DEBUG "MCE: Delete last injection Node\n");
+                        list_del(&entry->list);
+                    }
+                    else {
+                        printk(KERN_DEBUG "MCE: last injection node not "
+                            "found, something is wrong!\n");
+                    }
+                }
+               if (cpus_weight(vmce_data.impact_cpus[idx]) <=0) {
+                   printk(KERN_DEBUG "MCE: All vMCEs are injected to DOM0\n");
+                   goto end;
+               }
+            }
+            break;
+        }
+
+        /* Inject another vMCE into DOM0.
+         * The first injection is done in the MCE# softirq handler;
+         * vMCEs are injected serially.
+         */
+        if (idx < MAX_IMPACT_DOMAIN && st->vcpu &&
+            guest_has_trap_callback(st->domain,
+                st->vcpu->vcpu_id, TRAP_machine_check)) {
+            if (!test_and_set_bool(st->vcpu->mce_pending)) {
+                st->vcpu->cpu_affinity_tmp = st->vcpu->cpu_affinity;
+                if (cpu != st->processor 
+                    || (st->processor != st->vcpu->processor)){
+                    cpumask_t affinity;
+                    /* We're on a different physical CPU. Make
+                     * sure to wake up the vcpu on the specified
+                     * processor. */
+                    cpus_clear(affinity);
+                    cpu_set(cpu, affinity);
+                    printk(KERN_DEBUG "MCE: CPU%d set affinity\n", cpu);
+                    vcpu_set_affinity(st->vcpu, &affinity);
+                    /* Affinity is restored in the iret hypercall */
+                }
+                /* The vMCE data is still needed while vMCEs are being
+                 * injected; it is cleared after the last injection.
+                 */
+                vcpu_kick(st->vcpu);
+            }
+        }
+    } /* end of outer-if */
+
+end:
     /* Restore previous trap priority */
     v->trap_priority = v->old_trap_priority;
 
diff -r 2fe33f3403f5 xen/include/asm-x86/domain.h
--- a/xen/include/asm-x86/domain.h	Fri Feb 13 18:00:22 2009 +0800
+++ b/xen/include/asm-x86/domain.h	Mon Feb 16 19:02:16 2009 +0800
@@ -204,6 +204,29 @@
 
 struct p2m_domain;
 
+/* Define for GUEST MCA handling */
+#define MAX_NR_BANKS 128
+
+/* This entry is for recording bank nodes for the impacted domain,
+ * put into impact_header list. */
+struct bank_entry {
+    struct list_head list;
+    int32_t cpu;
+    uint64_t mci_status[MAX_NR_BANKS];
+    uint64_t mci_addr[MAX_NR_BANKS];
+    uint64_t mci_misc[MAX_NR_BANKS];
+};
+
+struct domain_mca_msrs
+{
+    /* Guest should not change the values below after domain boot */
+    uint64_t mcg_cap;
+    uint64_t mcg_ctl;
+    uint64_t mcg_status;
+    uint64_t mci_ctl[MAX_NR_BANKS];
+    struct list_head impact_header;
+};
+
 struct arch_domain
 {
     l1_pgentry_t *mm_perdomain_pt;
@@ -268,6 +291,9 @@
     struct page_list_head relmem_list;
 
     cpuid_input_t cpuids[MAX_CPUID_INPUT];
+
+    /* For Guest vMCA handling */
+    struct domain_mca_msrs vmca_msrs;
 } __cacheline_aligned;
 
 #define has_arch_pdevs(d)    (!list_empty(&(d)->arch.pdev_list))
diff -r 2fe33f3403f5 xen/include/asm-x86/softirq.h
--- a/xen/include/asm-x86/softirq.h	Fri Feb 13 18:00:22 2009 +0800
+++ b/xen/include/asm-x86/softirq.h	Mon Feb 16 19:02:16 2009 +0800
@@ -5,6 +5,7 @@
 #define TIME_CALIBRATE_SOFTIRQ (NR_COMMON_SOFTIRQS + 1)
 #define VCPU_KICK_SOFTIRQ      (NR_COMMON_SOFTIRQS + 2)
 
-#define NR_ARCH_SOFTIRQS       3
+#define MACHINE_CHECK_SOFTIRQ  (NR_COMMON_SOFTIRQS + 3)
+#define NR_ARCH_SOFTIRQS       4
 
 #endif /* __ASM_SOFTIRQ_H__ */
diff -r 2fe33f3403f5 xen/include/asm-x86/traps.h
--- a/xen/include/asm-x86/traps.h	Fri Feb 13 18:00:22 2009 +0800
+++ b/xen/include/asm-x86/traps.h	Mon Feb 16 19:02:16 2009 +0800
@@ -20,6 +20,32 @@
 #ifndef ASM_TRAP_H
 #define ASM_TRAP_H
 
+/* No need to emulate CMCI-related MSRs. */
+#define MAX_IMPACT_DOMAIN 5
+#define MAX_NR_BANKS 128
+
+/* Data structure for vMCE MSR virtualization. When an MCA event occurs
+ * on physical CPUs, all machine MCA MSR info is copied to this
+ * data structure.
+ */
+
+
+struct intel_vmce_inject {
+
+    /* Map: (Index, impact_domid, impact_cpumap). The map records
+     * how many vMCE# we need to inject into the impacted domain.
+     * If MCE# happened on more than one pCPU (nr_CPUs) and impacted
+     * the same domain, vMCEs are injected into that domain
+     * nr_CPUs times. If MCE# errors happened on the same CPU
+     * but in different banks, the vMCE is injected only once.
+     */
+
+    int32_t impact_domid[MAX_IMPACT_DOMAIN];
+    cpumask_t impact_cpus[MAX_IMPACT_DOMAIN];
+
+};
+
+
 struct softirq_trap {
 	struct domain *domain;  /* domain to inject trap */
 	struct vcpu *vcpu;	/* vcpu to inject trap */
diff -r 2fe33f3403f5 xen/include/public/arch-x86/xen-mca.h
--- a/xen/include/public/arch-x86/xen-mca.h	Fri Feb 13 18:00:22 2009 +0800
+++ b/xen/include/public/arch-x86/xen-mca.h	Mon Feb 16 19:02:16 2009 +0800
@@ -106,10 +106,11 @@
 
 #define MC_FLAG_CORRECTABLE     (1 << 0)
 #define MC_FLAG_UNCORRECTABLE   (1 << 1)
-#define MC_FLAG_RECOVERABLE	(1 << 2)
-#define MC_FLAG_POLLED		(1 << 3)
-#define MC_FLAG_RESET		(1 << 4)
-#define MC_FLAG_CMCI		(1 << 5)
+#define MC_FLAG_RECOVERABLE     (1 << 2)
+#define MC_FLAG_POLLED          (1 << 3)
+#define MC_FLAG_RESET           (1 << 4)
+#define MC_FLAG_CMCI            (1 << 5)
+#define MC_FLAG_MCE             (1 << 6)
 /* contains global x86 mc information */
 struct mcinfo_global {
     struct mcinfo_common common;
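
The injection rule described in the `struct intel_vmce_inject` comment in traps.h (one vMCE per impacted pCPU, however many banks on that pCPU reported an error) can be sketched as follows. This is only a minimal standalone illustration, not code from the patch: `impact_map`, `impact_map_init`, `record_error`, `pending_injections` and the `NO_DOMAIN` sentinel are hypothetical names, and a 64-bit mask stands in for Xen's `cpumask_t`.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_IMPACT_DOMAIN 5
#define NO_DOMAIN (-1)          /* hypothetical "slot unused" marker */

/* Hypothetical stand-in for struct intel_vmce_inject: one slot per
 * impacted domain, with a bitmask of the pCPUs that reported errors. */
struct impact_map {
    int32_t  domid[MAX_IMPACT_DOMAIN];
    uint64_t cpus[MAX_IMPACT_DOMAIN];   /* bit n set => pCPU n reported */
};

static void impact_map_init(struct impact_map *m)
{
    for (int i = 0; i < MAX_IMPACT_DOMAIN; i++) {
        m->domid[i] = NO_DOMAIN;
        m->cpus[i] = 0;
    }
}

/* Record that an error bank on pCPU 'cpu' impacts domain 'domid'.
 * Several banks on the same pCPU set the same bit, so they collapse
 * into a single pending injection. Returns 0, or -1 if the map is full. */
static int record_error(struct impact_map *m, int32_t domid, unsigned cpu)
{
    assert(cpu < 64);
    for (int i = 0; i < MAX_IMPACT_DOMAIN; i++) {
        if (m->domid[i] == domid || m->domid[i] == NO_DOMAIN) {
            m->domid[i] = domid;
            m->cpus[i] |= (uint64_t)1 << cpu;
            return 0;
        }
    }
    return -1;
}

/* Number of vMCEs still to inject into 'domid': one per set pCPU bit. */
static unsigned pending_injections(const struct impact_map *m, int32_t domid)
{
    for (int i = 0; i < MAX_IMPACT_DOMAIN; i++) {
        if (m->domid[i] == domid) {
            unsigned n = 0;
            for (uint64_t mask = m->cpus[i]; mask; mask &= mask - 1)
                n++;
            return n;
        }
    }
    return 0;
}
```

Recording two banks on pCPU 2 and one bank on pCPU 5 for domain 0 leaves two pending injections, which matches the "nr_CPUs times" wording of the comment.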

[-- Attachment #4: vmsr_virtualization.patch --]
[-- Type: application/octet-stream, Size: 13340 bytes --]

diff -r 179b7b3d7f84 xen/arch/x86/cpu/mcheck/mce_intel.c
--- a/xen/arch/x86/cpu/mcheck/mce_intel.c	Mon Feb 16 19:04:25 2009 +0800
+++ b/xen/arch/x86/cpu/mcheck/mce_intel.c	Mon Feb 16 19:12:20 2009 +0800
@@ -1132,3 +1132,254 @@
     set_timer(&mce_timer, NOW() + MILLISECS(MCE_PERIOD));
 }
 
+/* Guest vMCE# MSRs virtualization ops (rdmsr/wrmsr) */
+int intel_mce_wrmsr(u32 msr, u32 lo, u32 hi)
+{
+    struct domain *d = current->domain;
+    struct bank_entry *entry = NULL;
+    uint64_t value = (u64)hi << 32 | lo;
+    int ret = 0;
+
+    spin_lock(&mce_locks);
+    switch(msr)
+    {
+        case MSR_IA32_MCG_CTL:
+            if (value != (u64)~0x0 && value != 0x0) {
+                printk(KERN_ERR "MCE: value written to MCG_CTL "
+                    "should be all 0s or 1s\n");
+                ret = -1;
+                break;
+            }
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "MCE: wrmsr not in DOM context, skip\n");
+                break;
+            }
+            d->arch.vmca_msrs.mcg_ctl = value;
+            break;
+        case MSR_IA32_MCG_STATUS:
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "MCE: wrmsr not in DOM context, skip\n");
+                break;
+            }
+            d->arch.vmca_msrs.mcg_status = value;
+            printk(KERN_DEBUG "MCE: wrmsr MCG_STATUS %lx\n", value);
+            break;
+        case MSR_IA32_MC0_CTL2:
+        case MSR_IA32_MC1_CTL2:
+        case MSR_IA32_MC2_CTL2:
+        case MSR_IA32_MC3_CTL2:
+        case MSR_IA32_MC4_CTL2:
+        case MSR_IA32_MC5_CTL2:
+        case MSR_IA32_MC6_CTL2:
+        case MSR_IA32_MC7_CTL2:
+        case MSR_IA32_MC8_CTL2:
+            printk(KERN_ERR "We have disabled CMCI capability, "
+                    "Guest should not write this MSR!\n");
+            break;
+        case MSR_IA32_MC0_CTL:
+        case MSR_IA32_MC1_CTL:
+        case MSR_IA32_MC2_CTL:
+        case MSR_IA32_MC3_CTL:
+        case MSR_IA32_MC4_CTL:
+        case MSR_IA32_MC5_CTL:
+        case MSR_IA32_MC6_CTL:
+        case MSR_IA32_MC7_CTL:
+        case MSR_IA32_MC8_CTL:
+            if (value != (u64)~0x0 && value != 0x0) {
+                printk(KERN_ERR "MCE: value written to MCi_CTL "
+                    "should be all 0s or 1s\n");
+                ret = -1;
+                break;
+            }
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "MCE: wrmsr not in DOM context, skip\n");
+                break;
+            }
+            d->arch.vmca_msrs.mci_ctl[(msr - MSR_IA32_MC0_CTL)/4] = value;
+            break;
+        case MSR_IA32_MC0_STATUS:
+        case MSR_IA32_MC1_STATUS:
+        case MSR_IA32_MC2_STATUS:
+        case MSR_IA32_MC3_STATUS:
+        case MSR_IA32_MC4_STATUS:
+        case MSR_IA32_MC5_STATUS:
+        case MSR_IA32_MC6_STATUS:
+        case MSR_IA32_MC7_STATUS:
+        case MSR_IA32_MC8_STATUS:
+            if (!d || is_idle_domain(d)) {
+                /* Just skip */
+                printk(KERN_ERR "mce wrmsr: not in domain context!\n");
+                break;
+            }
+            /* Use the first entry of the list; it corresponds to the
+             * current vMCE# injection. When the guest has finished
+             * processing the vMCE#, this node is deleted.
+             */
+            if (!list_empty(&d->arch.vmca_msrs.impact_header)) {
+                entry = list_entry(d->arch.vmca_msrs.impact_header.next,
+                    struct bank_entry, list);
+                entry->mci_status[(msr - MSR_IA32_MC0_STATUS)/4] = value;
+                printk(KERN_DEBUG "MCE: wrmsr mci_status in vMCE# context\n");
+            }
+
+            printk(KERN_DEBUG "MCE: wrmsr mci_status val:%lx\n", value);
+            break;
+    }
+    spin_unlock(&mce_locks);
+    return ret;
+}
+
+int intel_mce_rdmsr(u32 msr, u32 *lo, u32 *hi)
+{
+    struct domain *d = current->domain;
+    int ret = 0;
+    struct bank_entry *entry = NULL;
+
+    spin_lock(&mce_locks);
+    switch(msr) 
+    {
+        case MSR_IA32_MCG_STATUS:
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "MCE: rdmsr not in domain context!\n");
+                *lo = *hi = 0x0;
+                break;
+            }
+            *lo = (u32)d->arch.vmca_msrs.mcg_status;
+            *hi = (u32)(d->arch.vmca_msrs.mcg_status >> 32);
+            printk(KERN_DEBUG "MCE: rd MCG_STATUS lo %x hi %x\n", *lo, *hi);
+            break;
+        case MSR_IA32_MCG_CAP:
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "MCE: rdmsr not in domain context!\n");
+                *lo = *hi = 0x0;
+                break;
+            }
+            *lo = (u32)d->arch.vmca_msrs.mcg_cap;
+            *hi = (u32)(d->arch.vmca_msrs.mcg_cap >> 32);
+            printk(KERN_DEBUG "MCE: rdmsr MCG_CAP lo %x hi %x\n", *lo, *hi);
+            break;
+        case MSR_IA32_MCG_CTL:
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "MCE: rdmsr not in domain context!\n");
+                *lo = *hi = 0x0;
+                break;
+            }
+            *lo = (u32)d->arch.vmca_msrs.mcg_ctl;
+            *hi = (u32)(d->arch.vmca_msrs.mcg_ctl >> 32);
+            printk(KERN_DEBUG "MCE: rdmsr MCG_CTL lo %x hi %x\n", *lo, *hi);
+            break;
+        case MSR_IA32_MC0_CTL2:
+        case MSR_IA32_MC1_CTL2:
+        case MSR_IA32_MC2_CTL2:
+        case MSR_IA32_MC3_CTL2:
+        case MSR_IA32_MC4_CTL2:
+        case MSR_IA32_MC5_CTL2:
+        case MSR_IA32_MC6_CTL2:
+        case MSR_IA32_MC7_CTL2:
+        case MSR_IA32_MC8_CTL2:
+            printk(KERN_WARNING "We have disabled CMCI capability, "
+                    "Guest should not read this MSR!\n");
+            *lo = *hi = 0x0;
+            break;
+        case MSR_IA32_MC0_CTL:
+        case MSR_IA32_MC1_CTL:
+        case MSR_IA32_MC2_CTL:
+        case MSR_IA32_MC3_CTL:
+        case MSR_IA32_MC4_CTL:
+        case MSR_IA32_MC5_CTL:
+        case MSR_IA32_MC6_CTL:
+        case MSR_IA32_MC7_CTL:
+        case MSR_IA32_MC8_CTL:
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "MCE: rdmsr not in domain context!\n");
+                *lo = *hi = 0x0;
+                break;
+            }
+            *lo = (u32)d->arch.vmca_msrs.mci_ctl[(msr - MSR_IA32_MC0_CTL)/4];
+            *hi =
+                (u32)(d->arch.vmca_msrs.mci_ctl[(msr - MSR_IA32_MC0_CTL)/4]
+                    >> 32);
+            printk(KERN_DEBUG "MCE: rdmsr MCi_CTL lo %x hi %x\n", *lo, *hi);
+            break;
+        case MSR_IA32_MC0_STATUS:
+        case MSR_IA32_MC1_STATUS:
+        case MSR_IA32_MC2_STATUS:
+        case MSR_IA32_MC3_STATUS:
+        case MSR_IA32_MC4_STATUS:
+        case MSR_IA32_MC5_STATUS:
+        case MSR_IA32_MC6_STATUS:
+        case MSR_IA32_MC7_STATUS:
+        case MSR_IA32_MC8_STATUS:
+            *lo = *hi = 0x0;
+            printk(KERN_DEBUG "MCE: rdmsr mci_status\n");
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "mce_rdmsr: not in domain context!\n");
+                break;
+            }
+            if (!list_empty(&d->arch.vmca_msrs.impact_header)) {
+                entry = list_entry(d->arch.vmca_msrs.impact_header.next,
+                    struct bank_entry, list);
+                *lo = entry->mci_status[(msr - MSR_IA32_MC0_STATUS)/4];
+                *hi = entry->mci_status[(msr - MSR_IA32_MC0_STATUS)/4] >> 32;
+
+                printk(KERN_DEBUG "MCE: rdmsr MCi_STATUS in vMCE# context "
+                    "lo %x hi %x\n", *lo, *hi);
+            }
+            break;
+        case MSR_IA32_MC0_ADDR:
+        case MSR_IA32_MC1_ADDR:
+        case MSR_IA32_MC2_ADDR:
+        case MSR_IA32_MC3_ADDR:
+        case MSR_IA32_MC4_ADDR:
+        case MSR_IA32_MC5_ADDR:
+        case MSR_IA32_MC6_ADDR:
+        case MSR_IA32_MC7_ADDR:
+        case MSR_IA32_MC8_ADDR:
+            *lo = *hi = 0x0;
+
+            printk(KERN_DEBUG "MCE: rdmsr mci_addr\n");
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "mce_rdmsr: not in domain context!\n");
+                break;
+            }
+            if (!list_empty(&d->arch.vmca_msrs.impact_header)) {
+                entry = list_entry(d->arch.vmca_msrs.impact_header.next,
+                    struct bank_entry, list);
+                *lo = entry->mci_addr[(msr - MSR_IA32_MC0_ADDR)/4];
+                *hi = entry->mci_addr[(msr - MSR_IA32_MC0_ADDR)/4] >> 32;
+                printk(KERN_DEBUG "MCE: rdmsr MCi_ADDR in vMCE# context "
+                    "lo %x hi %x\n", *lo, *hi);
+            }
+            break;
+        case MSR_IA32_MC0_MISC:
+        case MSR_IA32_MC1_MISC:
+        case MSR_IA32_MC2_MISC:
+        case MSR_IA32_MC3_MISC:
+        case MSR_IA32_MC4_MISC:
+        case MSR_IA32_MC5_MISC:
+        case MSR_IA32_MC6_MISC:
+        case MSR_IA32_MC7_MISC:
+        case MSR_IA32_MC8_MISC:
+            *lo = *hi = 0x0;
+            printk(KERN_DEBUG "MCE: rdmsr mci_misc\n");
+            if (!d || is_idle_domain(d)) {
+                printk(KERN_ERR "MCE: rdmsr not in domain context!\n");
+                break;
+            }
+            if (!list_empty(&d->arch.vmca_msrs.impact_header)) {
+                entry = list_entry(d->arch.vmca_msrs.impact_header.next,
+                    struct bank_entry, list);
+                *lo = entry->mci_misc[(msr - MSR_IA32_MC0_MISC)/4];
+                *hi = entry->mci_misc[(msr - MSR_IA32_MC0_MISC)/4] >> 32;
+
+                printk(KERN_DEBUG "MCE: rdmsr MCi_MISC in vMCE# context "
+                    "lo %x hi %x\n", *lo, *hi);
+            }
+            break;
+        default:
+            break;
+    }
+    spin_unlock(&mce_locks);
+    return ret;
+}
+
diff -r 179b7b3d7f84 xen/arch/x86/traps.c
--- a/xen/arch/x86/traps.c	Mon Feb 16 19:04:25 2009 +0800
+++ b/xen/arch/x86/traps.c	Mon Feb 16 19:12:20 2009 +0800
@@ -1636,6 +1636,10 @@
             (d->domain_id == 0));
 }
 
+/*Intel vMCE MSRs virtualization*/
+extern int intel_mce_wrmsr(u32 msr, u32 lo,  u32 hi);
+extern int intel_mce_rdmsr(u32 msr, u32 *lo,  u32 *hi);
+
 static int emulate_privileged_op(struct cpu_user_regs *regs)
 {
     struct vcpu *v = current;
@@ -2196,6 +2200,15 @@
         default:
             if ( wrmsr_hypervisor_regs(regs->ecx, eax, edx) )
                 break;
+            if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
+                if ( intel_mce_wrmsr(regs->ecx, eax, edx) != 0) {
+                    gdprintk(XENLOG_ERR, "MCE: vMCE MSR (%lx) write"
+                        " (%x:%x) failed\n", regs->ecx, edx, eax);
+                    goto fail;
+                }
+                break;
+            }
+ 
             if ( (rdmsr_safe(regs->ecx, l, h) != 0) ||
                  (eax != l) || (edx != h) )
         invalid:
@@ -2279,6 +2292,12 @@
                         _p(regs->ecx));*/
             if ( rdmsr_safe(regs->ecx, regs->eax, regs->edx) )
                 goto fail;
+
+            if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
+                if ( intel_mce_rdmsr(regs->ecx, &eax, &edx) != 0)
+                    printk(KERN_ERR "MCE: Not MCE MSRs %lx\n", regs->ecx);
+            }
+
             break;
         }
         break;
diff -r 179b7b3d7f84 xen/include/asm-x86/msr-index.h
--- a/xen/include/asm-x86/msr-index.h	Mon Feb 16 19:04:25 2009 +0800
+++ b/xen/include/asm-x86/msr-index.h	Mon Feb 16 19:12:20 2009 +0800
@@ -96,30 +96,54 @@
 #define CMCI_EN 			(1UL<<30)
 #define CMCI_THRESHOLD_MASK		0x7FFF
 
+#define MSR_IA32_MC1_CTL		0x00000404
+#define MSR_IA32_MC1_CTL2		0x00000281
 #define MSR_IA32_MC1_STATUS		0x00000405
 #define MSR_IA32_MC1_ADDR		0x00000406
 #define MSR_IA32_MC1_MISC		0x00000407
 
 #define MSR_IA32_MC2_CTL		0x00000408
+#define MSR_IA32_MC2_CTL2		0x00000282
 #define MSR_IA32_MC2_STATUS		0x00000409
 #define MSR_IA32_MC2_ADDR		0x0000040A
 #define MSR_IA32_MC2_MISC		0x0000040B
 
+#define MSR_IA32_MC3_CTL2		0x00000283
 #define MSR_IA32_MC3_CTL		0x0000040C
 #define MSR_IA32_MC3_STATUS		0x0000040D
 #define MSR_IA32_MC3_ADDR		0x0000040E
 #define MSR_IA32_MC3_MISC		0x0000040F
 
+#define MSR_IA32_MC4_CTL2		0x00000284
 #define MSR_IA32_MC4_CTL		0x00000410
 #define MSR_IA32_MC4_STATUS		0x00000411
 #define MSR_IA32_MC4_ADDR		0x00000412
 #define MSR_IA32_MC4_MISC		0x00000413
 
+#define MSR_IA32_MC5_CTL2		0x00000285
 #define MSR_IA32_MC5_CTL		0x00000414
 #define MSR_IA32_MC5_STATUS		0x00000415
 #define MSR_IA32_MC5_ADDR		0x00000416
 #define MSR_IA32_MC5_MISC		0x00000417
 
+#define MSR_IA32_MC6_CTL2		0x00000286
+#define MSR_IA32_MC6_CTL		0x00000418
+#define MSR_IA32_MC6_STATUS		0x00000419
+#define MSR_IA32_MC6_ADDR		0x0000041A
+#define MSR_IA32_MC6_MISC		0x0000041B
+
+#define MSR_IA32_MC7_CTL2		0x00000287
+#define MSR_IA32_MC7_CTL		0x0000041C
+#define MSR_IA32_MC7_STATUS		0x0000041D
+#define MSR_IA32_MC7_ADDR		0x0000041E
+#define MSR_IA32_MC7_MISC		0x0000041F
+
+#define MSR_IA32_MC8_CTL2		0x00000288
+#define MSR_IA32_MC8_CTL		0x00000420
+#define MSR_IA32_MC8_STATUS		0x00000421
+#define MSR_IA32_MC8_ADDR		0x00000422
+#define MSR_IA32_MC8_MISC		0x00000423
+
 #define MSR_P6_PERFCTR0			0x000000c1
 #define MSR_P6_PERFCTR1			0x000000c2
 #define MSR_P6_EVNTSEL0			0x00000186
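
The rdmsr/wrmsr handlers above derive the bank number with expressions such as `(msr - MSR_IA32_MC0_STATUS)/4`, relying on the MSR layout in msr-index.h: each bank's CTL, STATUS, ADDR and MISC registers occupy four consecutive addresses starting at 0x400, and the CTL2 registers are consecutive from 0x280. A minimal sketch of that decoding, assuming the same nine banks (0 to 8) that the switch statements enumerate; `decode_mc_msr`, `struct mc_msr` and `enum mc_reg` are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MC0_CTL        0x00000400u  /* MSR_IA32_MC0_CTL */
#define MC0_CTL2       0x00000280u  /* MSR_IA32_MC0_CTL2 */
#define NR_VIRT_BANKS  9            /* banks 0..8, as in the switch cases */

/* The first four values match the register's offset within a bank. */
enum mc_reg {
    MC_REG_CTL, MC_REG_STATUS, MC_REG_ADDR, MC_REG_MISC,
    MC_REG_CTL2, MC_REG_NONE
};

struct mc_msr {
    enum mc_reg reg;   /* which register of the bank */
    unsigned bank;     /* bank index, valid unless reg == MC_REG_NONE */
};

/* Map an MSR address to (register kind, bank index). */
static struct mc_msr decode_mc_msr(uint32_t msr)
{
    struct mc_msr r = { MC_REG_NONE, 0 };

    if (msr >= MC0_CTL && msr < MC0_CTL + 4 * NR_VIRT_BANKS) {
        uint32_t off = msr - MC0_CTL;
        r.bank = off / 4;                 /* four MSRs per bank */
        r.reg = (enum mc_reg)(off % 4);   /* CTL, STATUS, ADDR, MISC */
    } else if (msr >= MC0_CTL2 && msr < MC0_CTL2 + NR_VIRT_BANKS) {
        r.bank = msr - MC0_CTL2;          /* one CTL2 MSR per bank */
        r.reg = MC_REG_CTL2;
    }
    assert(r.bank < NR_VIRT_BANKS);
    return r;
}
```

For example, 0x409 decodes to the STATUS register of bank 2, and 0x424 (one past MSR_IA32_MC8_MISC) is rejected. A range check of this kind would also cover banks beyond 8 if the limit were raised, which the explicit case lists do not.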

[-- Attachment #5: mce_dom0.patch --]
[-- Type: application/octet-stream, Size: 7354 bytes --]

diff -r ca8ac5fc168c arch/x86_64/Kconfig
--- a/arch/x86_64/Kconfig	Fri Feb 13 18:08:59 2009 +0800
+++ b/arch/x86_64/Kconfig	Mon Feb 16 21:30:34 2009 +0800
@@ -472,7 +472,6 @@
 
 config X86_MCE
 	bool "Machine check support" if EMBEDDED
-	depends on !X86_64_XEN
 	default y
 	help
 	   Include a machine check error handler to report hardware errors.
@@ -483,7 +482,7 @@
 config X86_MCE_INTEL
 	bool "Intel MCE features"
 	depends on X86_MCE && X86_LOCAL_APIC
-	default y
+	default n
 	help
 	   Additional support for intel specific MCE features such as
 	   the thermal monitor.
@@ -491,7 +490,7 @@
 config X86_MCE_AMD
 	bool "AMD MCE features"
 	depends on X86_MCE && X86_LOCAL_APIC
-	default y
+	default n
 	help
 	   Additional support for AMD specific MCE features such as
 	   the DRAM Error Threshold.
diff -r ca8ac5fc168c arch/x86_64/kernel/Makefile
--- a/arch/x86_64/kernel/Makefile	Fri Feb 13 18:08:59 2009 +0800
+++ b/arch/x86_64/kernel/Makefile	Mon Feb 16 21:30:34 2009 +0800
@@ -13,6 +13,7 @@
 obj-$(CONFIG_STACKTRACE)	+= stacktrace.o
 obj-$(CONFIG_X86_MCE)         += mce.o
 obj-$(CONFIG_X86_MCE_INTEL)	+= mce_intel.o
+obj-$(CONFIG_X86_MCE_INTEL)	+= mce_dom0.o
 obj-$(CONFIG_X86_MCE_AMD)	+= mce_amd.o
 obj-$(CONFIG_MTRR)		+= ../../i386/kernel/cpu/mtrr/
 obj-$(CONFIG_ACPI)		+= acpi/
diff -r ca8ac5fc168c arch/x86_64/kernel/entry-xen.S
--- a/arch/x86_64/kernel/entry-xen.S	Fri Feb 13 18:08:59 2009 +0800
+++ b/arch/x86_64/kernel/entry-xen.S	Mon Feb 16 21:30:34 2009 +0800
@@ -1258,13 +1258,8 @@
 
 #ifdef CONFIG_X86_MCE
 	/* runs on exception stack */
-ENTRY(machine_check)
-	INTR_FRAME
-	pushq $0
-	CFI_ADJUST_CFA_OFFSET 8	
-	paranoidentry do_machine_check
-	jmp paranoid_exit1
-	CFI_ENDPROC
+KPROBE_ENTRY(machine_check)
+	zeroentry do_machine_check
 END(machine_check)
 #endif
 
diff -r ca8ac5fc168c arch/x86_64/kernel/mce.c
--- a/arch/x86_64/kernel/mce.c	Fri Feb 13 18:08:59 2009 +0800
+++ b/arch/x86_64/kernel/mce.c	Mon Feb 16 21:30:34 2009 +0800
@@ -165,7 +165,7 @@
  * The actual machine check handler
  */
 
-void do_machine_check(struct pt_regs * regs, long error_code)
+asmlinkage void do_machine_check(struct pt_regs * regs, long error_code)
 {
 	struct mce m, panicm;
 	int nowayout = (tolerant < 1); 
@@ -276,9 +276,16 @@
 
 /*
  * Periodic polling timer for "silent" machine check errors.
- */
+ * Polling is disabled in DOM0 because all CMCI/polling
+ * handling is done in Xen for Intel CPUs.
+ */
 
+#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL)
+static int check_interval = 0; /* disable polling */
+#else
 static int check_interval = 5 * 60; /* 5 minutes */
+#endif
+
 static void mcheck_timer(void *data);
 static DECLARE_WORK(mcheck_work, mcheck_timer, NULL);
 
@@ -649,6 +656,7 @@
 };
 #endif
 
+extern void bind_virq_for_mce(void);
 static __init int mce_init_device(void)
 {
 	int err;
@@ -664,6 +672,13 @@
 
 	register_hotcpu_notifier(&mce_cpu_notifier);
 	misc_register(&mce_log_device);
+
+    /*Register vIRQ handler for MCE LOG processing*/
+    printk(KERN_DEBUG "MCE: bind virq for DOM0 Logging\n");
+#if defined (CONFIG_XEN) && defined(CONFIG_X86_MCE_INTEL)
+    bind_virq_for_mce();
+#endif
+
 	return err;
 }
 
diff -r ca8ac5fc168c arch/x86_64/kernel/mce_dom0.c
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/arch/x86_64/kernel/mce_dom0.c	Mon Feb 16 21:30:34 2009 +0800
@@ -0,0 +1,90 @@
+#include <linux/init.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <xen/interface/xen.h>
+#include <xen/evtchn.h>
+#include <xen/interface/vcpu.h>
+#include <asm/hypercall.h>
+#include <asm/mce.h>
+
+/* dom0 MCE vIRQ handling; triggered by polling or CMCI in Xen */
+static int convert_log(struct mc_info *mi)
+{
+	struct mcinfo_common *mic = NULL;
+	struct mcinfo_global *mc_global;
+	struct mcinfo_bank *mc_bank;
+	struct mce m;
+
+	x86_mcinfo_lookup(mic, mi, MC_TYPE_GLOBAL);
+	if (mic == NULL)
+	{
+		printk(KERN_ERR "DOM0_MCE_LOG: global data is NULL\n");
+		return -1;
+	}
+
+	mc_global = (struct mcinfo_global*)mic;
+	m.mcgstatus = mc_global->mc_gstatus;
+	m.cpu = mc_global->mc_coreid;/*for test*/
+	x86_mcinfo_lookup(mic, mi, MC_TYPE_BANK);
+	do
+	{
+		if (mic == NULL || mic->size == 0)
+			break;
+		if (mic->type == MC_TYPE_BANK)
+		{
+			mc_bank = (struct mcinfo_bank*)mic;
+			m.misc = mc_bank->mc_misc;
+			m.status = mc_bank->mc_status;
+			m.addr = mc_bank->mc_addr;
+			m.tsc = mc_bank->mc_tsc;
+			m.res1 = mc_bank->mc_ctl2;
+			m.bank = mc_bank->mc_bank;
+			printk(KERN_DEBUG "[CPU%d, BANK%d, addr %llx, state %llx]\n",
+                m.cpu, m.bank, m.addr, m.status);
+			/*log this record*/
+			mce_log(&m);
+		}
+		mic = x86_mcinfo_next(mic);
+	}while (1);
+
+	return 0;
+}
+
+static irqreturn_t mce_dom0_interrupt(int irq, void *dev_id,
+									struct pt_regs *regs)
+{
+	xen_mc_t mc_op;
+	int result = 0;
+
+	printk(KERN_DEBUG "MCE_DOM0_LOG: enter dom0 mce vIRQ\n");
+	mc_op.cmd = XEN_MC_fetch;
+	mc_op.interface_version = XEN_MCA_INTERFACE_VERSION;
+	mc_op.u.mc_fetch.flags = XEN_MC_CORRECTABLE;
+	mc_op.u.mc_fetch.fetch_idx = 0;
+	memset(&mc_op.u.mc_fetch.mc_info, 0, sizeof(mc_op.u.mc_fetch.mc_info));
+	result = HYPERVISOR_mca(&mc_op);
+	if (result)
+		printk(KERN_WARNING "MCE_DOM0_LOG: fetch mce global data failed\n");
+	else
+	{
+		result = convert_log(&mc_op.u.mc_fetch.mc_info);
+		if (result)
+			printk(KERN_WARNING "MCE_DOM0_LOG: convert log failed\n");
+	}
+	return IRQ_HANDLED;
+}
+
+
+void bind_virq_for_mce(void)
+{
+	int ret;
+
+	ret  = bind_virq_to_irqhandler(VIRQ_ARCH_0, 0, 
+		mce_dom0_interrupt, 0, "mce", NULL);
+
+	if ( ret<0 )
+	{
+		printk(KERN_ERR "MCE_DOM0_LOG: bind_virq for DOM0 failed\n");
+	}
+}
+
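
`convert_log()` above walks the fetched `mc_info` buffer as a sequence of variable-size records, each starting with a common header carrying a type and a size, and stops when it meets a missing or zero-size record. A minimal standalone sketch of that kind of walk follows. The record layout here is a simplified, hypothetical one (`rec_hdr`, `rec_bank`, `put_record` and `collect_bank_status` are not the Xen definitions), and the explicit bounds check is an addition that the patch's loop does not perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { REC_GLOBAL = 0, REC_BANK = 1 };   /* hypothetical type codes */

struct rec_hdr  { uint16_t type; uint16_t size; };   /* size in bytes */
struct rec_bank { struct rec_hdr hdr; uint16_t bank; uint64_t status; };

/* Append one record at offset *len. Returns 0, or -1 if it does not fit. */
static int put_record(uint8_t *buf, size_t cap, size_t *len,
                      const void *rec, uint16_t size)
{
    if (size < sizeof(struct rec_hdr) || *len + size > cap)
        return -1;
    memcpy(buf + *len, rec, size);
    *len += size;
    return 0;
}

/* Walk the buffer record by record and copy the status of every bank
 * record into out[] (at most max entries). Returns the number copied.
 * The walk ends at a zero-size or truncated record. */
static size_t collect_bank_status(const uint8_t *buf, size_t len,
                                  uint64_t *out, size_t max)
{
    size_t off = 0, n = 0;

    while (off + sizeof(struct rec_hdr) <= len) {
        struct rec_hdr h;
        memcpy(&h, buf + off, sizeof(h));   /* avoid unaligned access */
        if (h.size < sizeof(h) || off + h.size > len)
            break;                          /* terminator or malformed */
        if (h.type == REC_BANK && h.size >= sizeof(struct rec_bank) &&
            n < max) {
            struct rec_bank b;
            memcpy(&b, buf + off, sizeof(b));
            out[n++] = b.status;
        }
        off += h.size;                      /* step to the next record */
    }
    return n;
}
```

Stepping by the size stored in each header is what lets a consumer skip record types it does not understand, which is presumably why the patch can add fields to the bank record while bumping only the interface version.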
diff -r ca8ac5fc168c include/asm-x86_64/mach-xen/asm/hypercall.h
--- a/include/asm-x86_64/mach-xen/asm/hypercall.h	Fri Feb 13 18:08:59 2009 +0800
+++ b/include/asm-x86_64/mach-xen/asm/hypercall.h	Mon Feb 16 21:30:34 2009 +0800
@@ -215,7 +215,13 @@
 	platform_op->interface_version = XENPF_INTERFACE_VERSION;
 	return _hypercall1(int, platform_op, platform_op);
 }
-
+static inline int __must_check
+HYPERVISOR_mca(
+	struct xen_mc *mc_op)
+{
+	mc_op->interface_version = XEN_MCA_INTERFACE_VERSION;
+	return _hypercall1(int, mca, mc_op);
+}
 static inline int __must_check
 HYPERVISOR_set_debugreg(
 	unsigned int reg, unsigned long value)
diff -r ca8ac5fc168c include/asm-x86_64/mach-xen/irq_vectors.h
--- a/include/asm-x86_64/mach-xen/irq_vectors.h	Fri Feb 13 18:08:59 2009 +0800
+++ b/include/asm-x86_64/mach-xen/irq_vectors.h	Mon Feb 16 21:30:34 2009 +0800
@@ -57,6 +57,7 @@
 #define LOCAL_TIMER_VECTOR	0xef
 #endif
 
+#define THERMAL_APIC_VECTOR	0xfa
 #define SPURIOUS_APIC_VECTOR	0xff
 #define ERROR_APIC_VECTOR	0xfe
 
diff -r ca8ac5fc168c include/xen/interface/arch-x86/xen-mca.h
--- a/include/xen/interface/arch-x86/xen-mca.h	Fri Feb 13 18:08:59 2009 +0800
+++ b/include/xen/interface/arch-x86/xen-mca.h	Mon Feb 16 21:30:34 2009 +0800
@@ -56,7 +56,7 @@
 /* Hypercall */
 #define __HYPERVISOR_mca __HYPERVISOR_arch_0
 
-#define XEN_MCA_INTERFACE_VERSION 0x03000001
+#define XEN_MCA_INTERFACE_VERSION 0x03000002
 
 /* IN: Dom0 calls hypercall from MC event handler. */
 #define XEN_MC_CORRECTABLE  0x0
@@ -132,6 +132,8 @@
     uint64_t mc_addr;   /* bank address, only valid
                          * if addr bit is set in mc_status */
     uint64_t mc_misc;
+    uint64_t mc_ctl2;
+    uint64_t mc_tsc;
 };
 
 

[-- Attachment #6: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
