xen-devel.lists.xenproject.org archive mirror
* Xen Security Advisory 351 v1 - Information leak via power sidechannel
@ 2020-11-10 19:00 Xen.org security team
From: Xen.org security team @ 2020-11-10 19:00 UTC
  To: xen-devel; +Cc: Xen.org security team

[-- Attachment #1: Type: text/plain, Size: 4204 bytes --]

(Copy of advisory)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

                    Xen Security Advisory XSA-351

                 Information leak via power sidechannel

ISSUE DESCRIPTION
=================

Researchers have demonstrated using software power/energy monitoring
interfaces to create covert channels, and infer the operations/data used
by other contexts within the system.

Access to these interfaces should be restricted to privileged software,
but it was found that Xen doesn't restrict access suitably, and the
interfaces are accessible to all guests.

For more information, see:
  https://platypusattack.com
  https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00389.html

IMPACT
======

An unprivileged guest administrator can sample platform power/energy
data.  This may be used to infer the operations/data used by other
contexts within the system.

The research demonstrates using this sidechannel to leak the AES keys
used elsewhere in the system.

VULNERABLE SYSTEMS
==================

Power/energy monitoring interfaces are platform and architecture
specific.  Consult your hardware vendor to ascertain what power feedback
interfaces are available.

For ARM systems, all versions of Xen are vulnerable.  The fix restricts
access to the AMU (Activity Monitors Unit) interface, introduced in
Armv8.4.

For x86 systems, Xen 4.14 and earlier are vulnerable - master is not
vulnerable, as these issues have been addressed in a more general
fashion.

The x86 fixes restrict access to:
 * Intel RAPL interface, introduced in SandyBridge CPUs.
 * Intel platform energy interface.
 * Intel perf_ctl interface, introduced in Pentium 4 CPUs and also
   implemented by other vendors.
 * AMD RAPL interface, introduced in Ryzen/EPYC CPUs.
 * AMD compute unit energy interface, present in Fam15/16 CPUs.

MITIGATION
==========

There are no mitigations available.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

Note that patches for released versions are generally prepared to
apply to the stable branches, and may not apply cleanly to the most
recent release tarball.  Downstreams are encouraged to update to the
tip of the stable branch before applying these patches.

xsa351-arm.patch             Xen unstable - 4.10.x [ARM]
xsa351-x86-4.14-?.patch      Xen 4.14.x            [x86]
xsa351-x86-4.13-?.patch      Xen 4.13.x            [x86]
xsa351-x86-4.12-?.patch      Xen 4.12.x            [x86]
xsa351-x86-4.11-?.patch      Xen 4.11.x - 4.10.x   [x86]

$ sha256sum xsa351*
cad287981a870f13484834fa2364ffee68178517e906f55d2889304a4a9eae06  xsa351.meta
70ebd0e93af240af2680374dcfd8ff4a5dd3eefccf670f1cb9b546d763d6a554  xsa351-arm.patch
49b52a1366912a29e184e3014a9f1f579e8a0dd8a36f01d38d995d2c8ed81928  xsa351-arm-4.11.patch
2e7b7c2b98625d70c8b10047a9f668372f3ccede167344dedb712312606acbca  xsa351-x86-4.11-1.patch
ab9e2cb7d5e3e0c3a916f006c697495f4f01146e09df60ece59ce0a8f7aa5ed0  xsa351-x86-4.11-2.patch
bb68f6e6905bc1566156cafab058cbaf02a17c197385c33a83b7f73885913c1c  xsa351-x86-4.12-1.patch
53f464269f59498f8a9a614f10a47cfb1d81c666f0d684346e28005015de962c  xsa351-x86-4.12-2.patch
67a29d66230faafd9a8047ac80ec18130b5659e80a38c3a412cb2be6d3288a8f  xsa351-x86-4.13-1.patch
f7d8717dec33ee7484b36490402d113f1e7e168e7541bcf193fef620df299f08  xsa351-x86-4.13-2.patch
7d4fbe11a766226d7f1b93c5bf34664d8855deee09d1feebc76f11e49f2aa9c9  xsa351-x86-4.14-1.patch
41df825deafe3ef28e8594ec956033689af69f84a4a6dd92f97d1071e925203d  xsa351-x86-4.14-2.patch
$

NOTE REGARDING LACK OF EMBARGO
==============================

Despite an attempt to organise predisclosure, the discoverers ultimately
did not authorise a predisclosure.
-----BEGIN PGP SIGNATURE-----

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAl+q1WwMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZANkH+wf8pft4t9KoC9HFxd96DfCjZ+FQnD0hMp+890cY
ztNJM4+o+SBP2ytEMZLIoN1oJeTSQqyNgQh2sXNm7/WpseklOTR6s8zw4LWATEfz
rqF8G2xIN8ka7AAqAwOzkzj6qlxuWbiXKm4ENd5ocRxVvF1A2PYyEX88uCPgmupg
dqfufhYQF7hrz8VKDRDYtLsMrRaIFCWqGdOdQfVF64pHGHLvGZkANGN8yva8mBfC
uavwvX+O3CdVMENS4AA3TNo6p2nnWp1iQJCiBwLGCRbTQaRtRucV4Q/eSLC3pHLp
NO26OxieT4tLJN7Ox4ex43KZIsyweZSaUl18rfg0J8MB3FM=
=/6Fo
-----END PGP SIGNATURE-----

[-- Attachment #2: xsa351.meta --]
[-- Type: application/octet-stream, Size: 1772 bytes --]

{
  "XSA": 351,
  "SupportedVersions": [
    "master",
    "4.14",
    "4.13",
    "4.12",
    "4.11",
    "4.10"
  ],
  "Trees": [
    "xen"
  ],
  "Recipes": {
    "4.10": {
      "Recipes": {
        "xen": {
          "StableRef": "78d903e95efc5b0166b393d289a687c64016e8ef",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.11-?.patch",
            "xsa351-arm-4.11.patch"
          ]
        }
      }
    },
    "4.11": {
      "Recipes": {
        "xen": {
          "StableRef": "e274c8bdc12eb596e55233040e8b49da27150f31",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.11-?.patch",
            "xsa351-arm-4.11.patch"
          ]
        }
      }
    },
    "4.12": {
      "Recipes": {
        "xen": {
          "StableRef": "97b7b5567fba6918a656ad349051b5343b5dea2e",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.12-?.patch",
            "xsa351-arm.patch"
          ]
        }
      }
    },
    "4.13": {
      "Recipes": {
        "xen": {
          "StableRef": "0060ac29bcbdb76d49d2e248ddfcb7afa2345440",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.13-?.patch",
            "xsa351-arm.patch"
          ]
        }
      }
    },
    "4.14": {
      "Recipes": {
        "xen": {
          "StableRef": "10bb63c203f42d931fa1fa7dbbae7ce1765cecf2",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.14-?.patch",
            "xsa351-arm.patch"
          ]
        }
      }
    },
    "master": {
      "Recipes": {
        "xen": {
          "StableRef": "7056f2f89f03f2f804ac7e776c7b2b000cd716cd",
          "Prereqs": [],
          "Patches": [
            "xsa351-arm.patch"
          ]
        }
      }
    }
  }
}
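
The recipe entries above name the x86 patches with a "?" placeholder
(for example "xsa351-x86-4.11-?.patch"), while the checksum list names
concrete files ending in -1 and -2. A minimal sketch of how such a
pattern could be expanded, assuming the "?" is a shell-style
single-character wildcard (the metadata itself does not say so).
count_matches() is a hypothetical helper; fnmatch() is the POSIX call.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/*
 * Count how many candidate file names a recipe pattern selects.
 * Assumption: '?' in xsa351.meta behaves like a shell wildcard that
 * matches exactly one character.
 */
static size_t count_matches(const char *pattern,
                            const char *const *names, size_t n)
{
    size_t hits = 0;

    assert(pattern != NULL && names != NULL);

    for (size_t i = 0; i < n; i++)
        if (fnmatch(pattern, names[i], 0) == 0)  /* 0 means "matched" */
            hits++;

    return hits;
}
```

Under that assumption, "xsa351-x86-4.11-?.patch" selects both
xsa351-x86-4.11-1.patch and xsa351-x86-4.11-2.patch, while
"xsa351-arm-4.11.patch" selects only itself.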

[-- Attachment #3: xsa351-arm.patch --]
[-- Type: application/octet-stream, Size: 2418 bytes --]

From: Julien Grall <jgrall@amazon.com>
Subject: xen/arm: Always trap AMU system registers

The Activity Monitors Unit (AMU) was introduced in Armv8.4. It is
considered unsafe to expose to guests, as it might reveal information
about code executed by other guests or the host.

Arm provided a way to trap all the AMU system registers by setting
CPTR_EL2.TAM to 1.

Unfortunately, in older revisions of the specification, bit 30 (now
CPTR_EL2.TAM) was RES0. Because of that, Xen sets it to 0, and
therefore the system registers would be exposed to the guest when it
runs on processors with AMU.

As the bit is marked as UNKNOWN at boot in Armv8.4, the only safe
solution for us is to always set CPTR_EL2.TAM to 1.

Guests trying to access the AMU system registers will now receive an
undefined instruction exception. Unfortunately, this means that even a
well-behaved guest may fail to boot because we don't sanitize the ID
registers.

This is a known issue with other Armv8.0+ features (e.g. SVE, Pointer
Auth). It will be taken care of separately.

This is part of XSA-351 (or XSA-93 re-born).

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index a36f145e67..22bd1bd4c6 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -151,7 +151,8 @@ void init_traps(void)
      * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
      * RES1, i.e. they would trap whether we did this write or not.
      */
-    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) | HCPTR_TTA,
+    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
+                 HCPTR_TTA | HCPTR_TAM,
                  CPTR_EL2);
 
     /*
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 3ca67f8157..d3d12a9d19 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -351,6 +351,7 @@
 #define VTCR_RES1       (_AC(1,UL)<<31)
 
 /* HCPTR Hyp. Coprocessor Trap Register */
+#define HCPTR_TAM       ((_AC(1,U)<<30))
 #define HCPTR_TTA       ((_AC(1,U)<<20))        /* Trap trace registers */
 #define HCPTR_CP(x)     ((_AC(1,U)<<(x)))       /* Trap Coprocessor x */
 #define HCPTR_CP_MASK   ((_AC(1,U)<<14)-1)
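
The effect of the hunk above on the value written to CPTR_EL2 can be
worked out by hand. The sketch below is a stand-alone model, not Xen
code: the bit definitions are copied from the patch, with UINT32_C()
substituted for Xen's _AC() macro, and cptr_el2_value() is a
hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as defined in xen/include/asm-arm/processor.h by the patch. */
#define HCPTR_TAM      (UINT32_C(1) << 30)         /* Trap AMU registers */
#define HCPTR_TTA      (UINT32_C(1) << 20)         /* Trap trace registers */
#define HCPTR_CP(x)    (UINT32_C(1) << (x))        /* Trap Coprocessor x */
#define HCPTR_CP_MASK  ((UINT32_C(1) << 14) - 1)

/*
 * Hypothetical helper: the value init_traps() passes to WRITE_SYSREG(),
 * before the patch (trap_amu == 0) and after it (trap_amu != 0).
 */
static uint32_t cptr_el2_value(int trap_amu)
{
    uint32_t v = (HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) | HCPTR_TTA;

    if (trap_amu)
        v |= HCPTR_TAM;

    /* CP10/CP11 (FP/SIMD) stay untrapped in both variants. */
    assert((v & (HCPTR_CP(10) | HCPTR_CP(11))) == 0);

    return v;
}
```

With these definitions the old value is 0x001033ff and the new value
is 0x401033ff; the only difference is bit 30.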

[-- Attachment #4: xsa351-arm-4.11.patch --]
[-- Type: application/octet-stream, Size: 2714 bytes --]

From bdbd66cb9ba17dd1a7221f2a561f45a836f12f64 Mon Sep 17 00:00:00 2001
From: Julien Grall <jgrall@amazon.com>
Date: Tue, 10 Nov 2020 17:08:32 +0000
Subject: [PATCH] xen/arm: Always trap AMU system registers

The Activity Monitors Unit (AMU) was introduced in Armv8.4. It is
considered unsafe to expose to guests, as it might reveal information
about code executed by other guests or the host.

Arm provided a way to trap all the AMU system registers by setting
CPTR_EL2.TAM to 1.

Unfortunately, in older revisions of the specification, bit 30 (now
CPTR_EL2.TAM) was RES0. Because of that, Xen sets it to 0, and
therefore the system registers would be exposed to the guest when it
runs on processors with AMU.

As the bit is marked as UNKNOWN at boot in Armv8.4, the only safe
solution for us is to always set CPTR_EL2.TAM to 1.

Guests trying to access the AMU system registers will now receive an
undefined instruction exception. Unfortunately, this means that even a
well-behaved guest may fail to boot because we don't sanitize the ID
registers.

This is a known issue with other Armv8.0+ features (e.g. SVE, Pointer
Auth). It will be taken care of separately.

This is part of XSA-351 (or XSA-93 re-born).

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
---
 xen/arch/arm/traps.c            | 3 ++-
 xen/include/asm-arm/processor.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index e930585ad6d4..c12010a722b5 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -179,7 +179,8 @@ void init_traps(void)
      * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
      * RES1, i.e. they would trap whether we did this write or not.
      */
-    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) | HCPTR_TTA,
+    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
+                 HCPTR_TTA | HCPTR_TAM,
                  CPTR_EL2);
 
     /* Setup hypervisor traps */
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 222a02dd9935..5755cc64344a 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -291,6 +291,7 @@
 #define VTCR_RES1       (_AC(1,UL)<<31)
 
 /* HCPTR Hyp. Coprocessor Trap Register */
+#define HCPTR_TAM       ((_AC(1,U)<<30))
 #define HCPTR_TTA       ((_AC(1,U)<<20))        /* Trap trace registers */
 #define HCPTR_CP(x)     ((_AC(1,U)<<(x)))       /* Trap Coprocessor x */
 #define HCPTR_CP_MASK   ((_AC(1,U)<<14)-1)
-- 
2.17.1


[-- Attachment #5: xsa351-x86-4.11-1.patch --]
[-- Type: application/octet-stream, Size: 6244 bytes --]

From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com>
Subject: x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently a PV hardware domain can also be given control over the CPU
frequency, and such a guest is allowed to write to MSR_IA32_PERF_CTL.
However, since commit 322ec7c89f6 the default behavior has been changed
to reject accesses to MSRs that are not explicitly handled, preventing
PV guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.

Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:

  vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
  d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0

Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.

Restore the previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non-privileged
guests will get 0 when trying to read those registers, since writes to
MSR_IA32_PERF_CTL by such guests are already silently dropped.

Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3059178798a23ba870ff86ff54d442a07e6651fc)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 256e58d82b..3495ac9f4a 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -141,6 +141,7 @@ int init_vcpu_msr_policy(struct vcpu *v)
 
 int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 {
+    const struct domain *d = v->domain;
     const struct cpuid_policy *cp = v->domain->arch.cpuid;
     const struct msr_domain_policy *dp = v->domain->arch.msr;
     const struct msr_vcpu_policy *vp = v->arch.msr;
@@ -212,6 +213,25 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
         break;
 
         /*
+         * These MSRs are not enumerated in CPUID.  They have been around
+         * since the Pentium 4, and implemented by other vendors.
+         *
+         * Some versions of Windows try reading these before setting up a #GP
+         * handler, and Linux has several unguarded reads as well.  Provide
+         * RAZ semantics, in general, but permit a cpufreq controller dom0 to
+         * have full access.
+         */
+    case MSR_IA32_PERF_STATUS:
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        *val = 0;
+        if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
+            break;
+        goto gp_fault;
+
+        /*
          * TODO: Implement when we have better topology representation.
     case MSR_INTEL_CORE_THREAD_COUNT:
          */
@@ -241,6 +261,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_INTEL_PLATFORM_INFO:
     case MSR_ARCH_CAPABILITIES:
+    case MSR_IA32_PERF_STATUS:
         /* Read-only */
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
@@ -345,6 +366,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         break;
     }
 
+        /*
+         * This MSR is not enumerated in CPUID.  It has been around since the
+         * Pentium 4, and implemented by other vendors.
+         *
+         * To match the RAZ semantics, implement as write-discard, except for
+         * a cpufreq controller dom0 which has full access.
+         */
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
+            break;
+        goto gp_fault;
+
     default:
         return X86EMUL_UNHANDLEABLE;
     }
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 8120ded330..755f00db33 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -816,12 +816,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
     return val;
 }
 
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
-    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            is_hardware_domain(d));
-}
-
 static int read_msr(unsigned int reg, uint64_t *val,
                     struct x86_emulate_ctxt *ctxt)
 {
@@ -1096,14 +1090,6 @@ static int write_msr(unsigned int reg, uint64_t val,
             return X86EMUL_OKAY;
         break;
 
-    case MSR_IA32_PERF_CTL:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
     case MSR_IA32_THERM_CONTROL:
     case MSR_IA32_ENERGY_PERF_BIAS:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index c0cc5d9336..7e4ad5d51b 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -920,6 +920,22 @@ extern enum cpufreq_controller {
     FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
 } cpufreq_controller;
 
+static always_inline bool is_cpufreq_controller(const struct domain *d)
+{
+    /*
+     * A PV dom0 can be nominated as the cpufreq controller, instead of using
+     * Xen's cpufreq driver, at which point dom0 gets direct access to certain
+     * MSRs.
+     *
+     * This interface only works when dom0 is identity pinned and has the same
+     * number of vCPUs as pCPUs on the system.
+     *
+     * It would be far better to paravirtualise the interface.
+     */
+    return (is_pv_domain(d) && is_hardware_domain(d) &&
+            cpufreq_controller == FREQCTL_dom0_kernel);
+}
+
 #define CPUPOOLID_NONE    -1
 
 struct cpupool *cpupool_get_by_id(int poolid);

[-- Attachment #6: xsa351-x86-4.11-2.patch --]
[-- Type: application/octet-stream, Size: 4448 bytes --]

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: x86/msr: Disallow guest access to the RAPL MSRs

Researchers have demonstrated using the RAPL interface to perform a
differential power analysis attack to recover AES keys used by other cores in
the system.

Furthermore, even privileged guests cannot use this interface correctly, due
to MSR scope and vcpu scheduling issues.  The interface would want to be
paravirtualised to be used sensibly.

Disallow access to the RAPL MSRs completely, as well as other MSRs which
potentially expose fine-grained power information.

This is part of XSA-351.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 3495ac9f4a..99c848ff41 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -156,6 +156,15 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
@@ -266,6 +275,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 480d1d8102..a685dcdcca 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -96,6 +96,38 @@
 /* Lower 6 bits define the format of the address in the LBR stack */
 #define MSR_IA32_PERF_CAP_LBR_FORMAT	0x3f
 
+/*
+ * Intel Runtime Average Power Limiting (RAPL) interface.  Power plane base
+ * addresses (MSR_*_POWER_LIMIT) are model specific, but have so-far been
+ * consistent since their introduction in SandyBridge.
+ *
+ * Offsets of functionality from the power plane base is architectural, but
+ * not all power planes support all functionality.
+ */
+#define MSR_RAPL_POWER_UNIT		0x00000606
+
+#define MSR_PKG_POWER_LIMIT		0x00000610
+#define MSR_PKG_ENERGY_STATUS		0x00000611
+#define MSR_PKG_PERF_STATUS		0x00000613
+#define MSR_PKG_POWER_INFO		0x00000614
+
+#define MSR_DRAM_POWER_LIMIT		0x00000618
+#define MSR_DRAM_ENERGY_STATUS		0x00000619
+#define MSR_DRAM_PERF_STATUS		0x0000061b
+#define MSR_DRAM_POWER_INFO		0x0000061c
+
+#define MSR_PP0_POWER_LIMIT		0x00000638
+#define MSR_PP0_ENERGY_STATUS		0x00000639
+#define MSR_PP0_POLICY			0x0000063a
+
+#define MSR_PP1_POWER_LIMIT		0x00000640
+#define MSR_PP1_ENERGY_STATUS		0x00000641
+#define MSR_PP1_POLICY			0x00000642
+
+/* Intel Platform-wide power interface. */
+#define MSR_PLATFORM_ENERGY_COUNTER	0x0000064d
+#define MSR_PLATFORM_POWER_LIMIT	0x0000065c
+
 #define MSR_IA32_BNDCFGS		0x00000d90
 #define IA32_BNDCFGS_ENABLE		0x00000001
 #define IA32_BNDCFGS_PRESERVE		0x00000002
@@ -218,6 +250,8 @@
 #define MSR_K8_VM_CR			0xc0010114
 #define MSR_K8_VM_HSAVE_PA		0xc0010117
 
+#define MSR_F15H_CU_POWER		0xc001007a
+#define MSR_F15H_CU_MAX_POWER		0xc001007b
 #define MSR_AMD_FAM15H_EVNTSEL0		0xc0010200
 #define MSR_AMD_FAM15H_PERFCTR0		0xc0010201
 #define MSR_AMD_FAM15H_EVNTSEL1		0xc0010202
@@ -231,6 +265,10 @@
 #define MSR_AMD_FAM15H_EVNTSEL5		0xc001020a
 #define MSR_AMD_FAM15H_PERFCTR5		0xc001020b
 
+#define MSR_AMD_RAPL_POWER_UNIT		0xc0010299
+#define MSR_AMD_CORE_ENERGY_STATUS	0xc001029a
+#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b
+
 #define MSR_AMD_L7S0_FEATURE_MASK	0xc0011002
 #define MSR_AMD_THRM_FEATURE_MASK	0xc0011003
 #define MSR_K8_FEATURE_MASK		0xc0011004
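
The set of MSR indices that the patch above routes to gp_fault can be
restated as a range table. The sketch below is a stand-alone model
using the numeric values from the msr-index.h hunk; it replaces the
GNU "case A ... B:" ranges with an explicit table so it builds as ISO
C. struct msr_range and is_blocked_power_msr() are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct msr_range { uint32_t first, last; };  /* inclusive bounds */

/* Ranges taken from the case labels added to guest_rdmsr()/guest_wrmsr(). */
static const struct msr_range power_msrs[] = {
    { 0x00000606, 0x00000606 },  /* MSR_RAPL_POWER_UNIT */
    { 0x00000610, 0x00000614 },  /* MSR_PKG_POWER_LIMIT ... MSR_PKG_POWER_INFO */
    { 0x00000618, 0x0000061c },  /* MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO */
    { 0x00000638, 0x0000063a },  /* MSR_PP0_POWER_LIMIT ... MSR_PP0_POLICY */
    { 0x00000640, 0x00000642 },  /* MSR_PP1_POWER_LIMIT ... MSR_PP1_POLICY */
    { 0x0000064d, 0x0000064d },  /* MSR_PLATFORM_ENERGY_COUNTER */
    { 0x0000065c, 0x0000065c },  /* MSR_PLATFORM_POWER_LIMIT */
    { 0xc001007a, 0xc001007b },  /* MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER */
    { 0xc0010299, 0xc001029b },  /* MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS */
};

/* True if a guest access to this MSR index would take the #GP path. */
static bool is_blocked_power_msr(uint32_t msr)
{
    for (size_t i = 0; i < sizeof(power_msrs) / sizeof(power_msrs[0]); i++)
    {
        assert(power_msrs[i].first <= power_msrs[i].last);
        if (msr >= power_msrs[i].first && msr <= power_msrs[i].last)
            return true;
    }

    return false;
}
```

Because the patch uses ranges, indices that fall between the named
registers (for example 0x612, between MSR_PKG_ENERGY_STATUS and
MSR_PKG_PERF_STATUS) are blocked as well.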

[-- Attachment #7: xsa351-x86-4.12-1.patch --]
[-- Type: application/octet-stream, Size: 5992 bytes --]

From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com>
Subject: x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently a PV hardware domain can also be given control over the CPU
frequency, and such a guest is allowed to write to MSR_IA32_PERF_CTL.
However, since commit 322ec7c89f6 the default behavior has been changed
to reject accesses to MSRs that are not explicitly handled, preventing
PV guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.

Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:

  vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
  d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0

Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.

Restore the previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non-privileged
guests will get 0 when trying to read those registers, since writes to
MSR_IA32_PERF_CTL by such guests are already silently dropped.

Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3059178798a23ba870ff86ff54d442a07e6651fc)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 4677222c40..a427826ba0 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -206,6 +206,25 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = msrs->misc_features_enables.raw;
         break;
 
+        /*
+         * These MSRs are not enumerated in CPUID.  They have been around
+         * since the Pentium 4, and implemented by other vendors.
+         *
+         * Some versions of Windows try reading these before setting up a #GP
+         * handler, and Linux has several unguarded reads as well.  Provide
+         * RAZ semantics, in general, but permit a cpufreq controller dom0 to
+         * have full access.
+         */
+    case MSR_IA32_PERF_STATUS:
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        *val = 0;
+        if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
@@ -290,6 +309,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_INTEL_PLATFORM_INFO:
     case MSR_ARCH_CAPABILITIES:
+    case MSR_IA32_PERF_STATUS:
         /* Read-only */
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
@@ -394,6 +414,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         break;
     }
 
+        /*
+         * This MSR is not enumerated in CPUID.  It has been around since the
+         * Pentium 4, and implemented by other vendors.
+         *
+         * To match the RAZ semantics, implement as write-discard, except for
+         * a cpufreq controller dom0 which has full access.
+         */
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 324a2334a2..933036ea34 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -799,12 +799,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
     return val;
 }
 
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
-    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            is_hardware_domain(d));
-}
-
 static int read_msr(unsigned int reg, uint64_t *val,
                     struct x86_emulate_ctxt *ctxt)
 {
@@ -1047,14 +1041,6 @@ static int write_msr(unsigned int reg, uint64_t val,
             return X86EMUL_OKAY;
         break;
 
-    case MSR_IA32_PERF_CTL:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
     case MSR_IA32_THERM_CONTROL:
     case MSR_IA32_ENERGY_PERF_BIAS:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 819f6ede2b..b918624327 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -993,6 +993,22 @@ extern enum cpufreq_controller {
     FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
 } cpufreq_controller;
 
+static always_inline bool is_cpufreq_controller(const struct domain *d)
+{
+    /*
+     * A PV dom0 can be nominated as the cpufreq controller, instead of using
+     * Xen's cpufreq driver, at which point dom0 gets direct access to certain
+     * MSRs.
+     *
+     * This interface only works when dom0 is identity pinned and has the same
+     * number of vCPUs as pCPUs on the system.
+     *
+     * It would be far better to paravirtualise the interface.
+     */
+    return (is_pv_domain(d) && is_hardware_domain(d) &&
+            cpufreq_controller == FREQCTL_dom0_kernel);
+}
+
 #define CPUPOOLID_NONE    -1
 
 struct cpupool *cpupool_get_by_id(int poolid);

[-- Attachment #8: xsa351-x86-4.12-2.patch --]
[-- Type: application/octet-stream, Size: 4682 bytes --]

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: x86/msr: Disallow guest access to the RAPL MSRs

Researchers have demonstrated using the RAPL interface to perform a
differential power analysis attack to recover AES keys used by other cores in
the system.

Furthermore, even privileged guests cannot use this interface correctly, due
to MSR scope and vcpu scheduling issues.  The interface would want to be
paravirtualised to be used sensibly.

Disallow access to the RAPL MSRs completely, as well as other MSRs which
potentially expose fine-grained power information.

This is part of XSA-351.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index a427826ba0..927ed625df 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -151,9 +151,18 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
@@ -315,9 +324,18 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 0eb6855614..ba9e90af21 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -96,6 +96,38 @@
 /* Lower 6 bits define the format of the address in the LBR stack */
 #define MSR_IA32_PERF_CAP_LBR_FORMAT	0x3f
 
+/*
+ * Intel Runtime Average Power Limiting (RAPL) interface.  Power plane base
+ * addresses (MSR_*_POWER_LIMIT) are model specific, but have so-far been
+ * consistent since their introduction in SandyBridge.
+ *
+ * Offsets of functionality from the power plane base is architectural, but
+ * not all power planes support all functionality.
+ */
+#define MSR_RAPL_POWER_UNIT		0x00000606
+
+#define MSR_PKG_POWER_LIMIT		0x00000610
+#define MSR_PKG_ENERGY_STATUS		0x00000611
+#define MSR_PKG_PERF_STATUS		0x00000613
+#define MSR_PKG_POWER_INFO		0x00000614
+
+#define MSR_DRAM_POWER_LIMIT		0x00000618
+#define MSR_DRAM_ENERGY_STATUS		0x00000619
+#define MSR_DRAM_PERF_STATUS		0x0000061b
+#define MSR_DRAM_POWER_INFO		0x0000061c
+
+#define MSR_PP0_POWER_LIMIT		0x00000638
+#define MSR_PP0_ENERGY_STATUS		0x00000639
+#define MSR_PP0_POLICY			0x0000063a
+
+#define MSR_PP1_POWER_LIMIT		0x00000640
+#define MSR_PP1_ENERGY_STATUS		0x00000641
+#define MSR_PP1_POLICY			0x00000642
+
+/* Intel Platform-wide power interface. */
+#define MSR_PLATFORM_ENERGY_COUNTER	0x0000064d
+#define MSR_PLATFORM_POWER_LIMIT	0x0000065c
+
 #define MSR_IA32_BNDCFGS		0x00000d90
 #define IA32_BNDCFGS_ENABLE		0x00000001
 #define IA32_BNDCFGS_PRESERVE		0x00000002
@@ -236,6 +268,8 @@
 #define MSR_K8_VM_CR			0xc0010114
 #define MSR_K8_VM_HSAVE_PA		0xc0010117
 
+#define MSR_F15H_CU_POWER		0xc001007a
+#define MSR_F15H_CU_MAX_POWER		0xc001007b
 #define MSR_AMD_FAM15H_EVNTSEL0		0xc0010200
 #define MSR_AMD_FAM15H_PERFCTR0		0xc0010201
 #define MSR_AMD_FAM15H_EVNTSEL1		0xc0010202
@@ -249,6 +283,10 @@
 #define MSR_AMD_FAM15H_EVNTSEL5		0xc001020a
 #define MSR_AMD_FAM15H_PERFCTR5		0xc001020b
 
+#define MSR_AMD_RAPL_POWER_UNIT		0xc0010299
+#define MSR_AMD_CORE_ENERGY_STATUS	0xc001029a
+#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b
+
 #define MSR_AMD_L7S0_FEATURE_MASK	0xc0011002
 #define MSR_AMD_THRM_FEATURE_MASK	0xc0011003
 #define MSR_K8_FEATURE_MASK		0xc0011004
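
The "Not offered to guests" case lists above amount to a membership test
over a handful of MSR index ranges.  A minimal, table-driven sketch of that
test follows.  It is not Xen code: is_power_msr() and struct msr_range are
hypothetical names, and a portable loop stands in for the GNU case-range
syntax used by the patch.  The MSR indices are the ones defined in the
msr-index.h hunk.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* MSR indices copied from the msr-index.h hunk of the patch. */
#define MSR_RAPL_POWER_UNIT          0x00000606u
#define MSR_PKG_POWER_LIMIT          0x00000610u
#define MSR_PKG_POWER_INFO           0x00000614u
#define MSR_DRAM_POWER_LIMIT         0x00000618u
#define MSR_DRAM_POWER_INFO          0x0000061cu
#define MSR_PP0_POWER_LIMIT          0x00000638u
#define MSR_PP0_POLICY               0x0000063au
#define MSR_PP1_POWER_LIMIT          0x00000640u
#define MSR_PP1_POLICY               0x00000642u
#define MSR_PLATFORM_ENERGY_COUNTER  0x0000064du
#define MSR_PLATFORM_POWER_LIMIT     0x0000065cu
#define MSR_F15H_CU_POWER            0xc001007au
#define MSR_F15H_CU_MAX_POWER        0xc001007bu
#define MSR_AMD_RAPL_POWER_UNIT      0xc0010299u
#define MSR_AMD_PKG_ENERGY_STATUS    0xc001029bu

/* Hypothetical helper type: an inclusive range of MSR indices. */
struct msr_range {
    uint32_t first;
    uint32_t last;
};

/* One entry per "case A ... B:" or "case A:" label added by the patch. */
static const struct msr_range power_msr_ranges[] = {
    { MSR_RAPL_POWER_UNIT,         MSR_RAPL_POWER_UNIT },
    { MSR_PKG_POWER_LIMIT,         MSR_PKG_POWER_INFO },
    { MSR_DRAM_POWER_LIMIT,        MSR_DRAM_POWER_INFO },
    { MSR_PP0_POWER_LIMIT,         MSR_PP0_POLICY },
    { MSR_PP1_POWER_LIMIT,         MSR_PP1_POLICY },
    { MSR_PLATFORM_ENERGY_COUNTER, MSR_PLATFORM_ENERGY_COUNTER },
    { MSR_PLATFORM_POWER_LIMIT,    MSR_PLATFORM_POWER_LIMIT },
    { MSR_F15H_CU_POWER,           MSR_F15H_CU_MAX_POWER },
    { MSR_AMD_RAPL_POWER_UNIT,     MSR_AMD_PKG_ENERGY_STATUS },
};

/* Returns true if a guest access to this MSR would take the gp_fault path. */
bool is_power_msr(uint32_t msr)
{
    size_t i;

    for ( i = 0; i < sizeof(power_msr_ranges) / sizeof(power_msr_ranges[0]); i++ )
        if ( msr >= power_msr_ranges[i].first && msr <= power_msr_ranges[i].last )
            return true;

    return false;
}
```

Because the ranges are inclusive, indices without a name of their own in
the header (0x612, for example) are refused as well, which matches the
behaviour of the case ranges in the patch.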

[-- Attachment #9: xsa351-x86-4.13-1.patch --]
[-- Type: application/octet-stream, Size: 5988 bytes --]

From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com>
Subject: x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently a PV hardware domain can also be given control over the CPU
frequency, and such a guest is allowed to write to MSR_IA32_PERF_CTL.
However, since commit 322ec7c89f6 the default behavior has been to
reject accesses to MSRs that are not explicitly handled, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.

Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:

  vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
  d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0

Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.

Restore the previous behavior and allow PV guests with the required
permissions to read the contents of these MSRs. Unprivileged guests
will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guests are already silently dropped.

Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3059178798a23ba870ff86ff54d442a07e6651fc)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 875ac39d30..8c969197aa 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -208,6 +208,25 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = msrs->misc_features_enables.raw;
         break;
 
+        /*
+         * These MSRs are not enumerated in CPUID.  They have been around
+         * since the Pentium 4, and implemented by other vendors.
+         *
+         * Some versions of Windows try reading these before setting up a #GP
+         * handler, and Linux has several unguarded reads as well.  Provide
+         * RAZ semantics, in general, but permit a cpufreq controller dom0 to
+         * have full access.
+         */
+    case MSR_IA32_PERF_STATUS:
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        *val = 0;
+        if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
@@ -305,6 +324,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_INTEL_PLATFORM_INFO:
     case MSR_ARCH_CAPABILITIES:
+    case MSR_IA32_PERF_STATUS:
         /* Read-only */
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
@@ -411,6 +431,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         break;
     }
 
+        /*
+         * This MSR is not enumerated in CPUID.  It has been around since the
+         * Pentium 4, and implemented by other vendors.
+         *
+         * To match the RAZ semantics, implement as write-discard, except for
+         * a cpufreq controller dom0 which has full access.
+         */
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 42258c6bf1..6dc4f92a84 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -776,12 +776,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
     return val;
 }
 
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
-    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            is_hardware_domain(d));
-}
-
 static int read_msr(unsigned int reg, uint64_t *val,
                     struct x86_emulate_ctxt *ctxt)
 {
@@ -1026,14 +1020,6 @@ static int write_msr(unsigned int reg, uint64_t val,
             return X86EMUL_OKAY;
         break;
 
-    case MSR_IA32_PERF_CTL:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
     case MSR_IA32_THERM_CONTROL:
     case MSR_IA32_ENERGY_PERF_BIAS:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index d6e27fc4b8..8bb5bd7b38 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1057,6 +1057,22 @@ extern enum cpufreq_controller {
     FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
 } cpufreq_controller;
 
+static always_inline bool is_cpufreq_controller(const struct domain *d)
+{
+    /*
+     * A PV dom0 can be nominated as the cpufreq controller, instead of using
+     * Xen's cpufreq driver, at which point dom0 gets direct access to certain
+     * MSRs.
+     *
+     * This interface only works when dom0 is identity pinned and has the same
+     * number of vCPUs as pCPUs on the system.
+     *
+     * It would be far better to paravirtualise the interface.
+     */
+    return (is_pv_domain(d) && is_hardware_domain(d) &&
+            cpufreq_controller == FREQCTL_dom0_kernel);
+}
+
 #define CPUPOOLID_NONE    -1
 
 struct cpupool *cpupool_get_by_id(int poolid);
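
The read-as-zero and write-discard semantics introduced by this patch can
be modelled outside the hypervisor.  The sketch below is an illustration
only: struct model_domain, struct model_hw, the VENDOR_* bit values and
both functions are hypothetical stand-ins for Xen's cp->x86_vendor,
is_cpufreq_controller(), rdmsr_safe() and wrmsr_safe(), and a single
64-bit value stands in for the hardware register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical vendor bits; Xen's real X86_VENDOR_* values may differ. */
#define VENDOR_INTEL   (1u << 0)
#define VENDOR_AMD     (1u << 1)
#define VENDOR_CENTAUR (1u << 2)

enum msr_result { MSR_OK, MSR_GP_FAULT };

/* Hypothetical view of the guest: its CPUID vendor and its privilege. */
struct model_domain {
    unsigned int vendor;
    bool cpufreq_controller;
};

/* Hypothetical stand-in for the real register behind rdmsr/wrmsr_safe(). */
struct model_hw {
    uint64_t value;
    bool access_faults;
};

/* Mirrors the MSR_IA32_PERF_{STATUS/CTL} case added to guest_rdmsr(). */
enum msr_result model_rdmsr_perf(const struct model_domain *d,
                                 const struct model_hw *hw, uint64_t *val)
{
    if ( !(d->vendor & (VENDOR_INTEL | VENDOR_CENTAUR)) )
        return MSR_GP_FAULT;

    *val = 0;                      /* Read-as-zero for ordinary guests. */
    if ( !d->cpufreq_controller )
        return MSR_OK;

    if ( hw->access_faults )       /* The hardware read itself failed. */
        return MSR_GP_FAULT;

    *val = hw->value;              /* The controller sees the real value. */
    return MSR_OK;
}

/* Mirrors the MSR_IA32_PERF_CTL case added to guest_wrmsr(). */
enum msr_result model_wrmsr_perf_ctl(const struct model_domain *d,
                                     struct model_hw *hw, uint64_t val)
{
    if ( !(d->vendor & (VENDOR_INTEL | VENDOR_CENTAUR)) )
        return MSR_GP_FAULT;

    if ( !d->cpufreq_controller )
        return MSR_OK;             /* Write silently discarded. */

    if ( hw->access_faults )
        return MSR_GP_FAULT;

    hw->value = val;
    return MSR_OK;
}
```

In this model an ordinary Intel guest reads 0 and its writes have no
effect, a guest presenting another vendor takes a fault, and only the
cpufreq controller reaches the modelled hardware value.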

[-- Attachment #10: xsa351-x86-4.13-2.patch --]
[-- Type: application/octet-stream, Size: 4800 bytes --]

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: x86/msr: Disallow guest access to the RAPL MSRs

Researchers have demonstrated using the RAPL interface to perform a
differential power analysis attack to recover AES keys used by other cores in
the system.

Furthermore, even privileged guests cannot use this interface correctly, due
to MSR scope and vcpu scheduling issues.  The interface would want to be
paravirtualised to be used sensibly.

Disallow access to the RAPL MSRs completely, as well as other MSRs which
potentially expose fine-grained power information.

This is part of XSA-351.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 8c969197aa..8ab6949a8e 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -152,11 +152,20 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
     case MSR_AMD64_LWP_CFG:
     case MSR_AMD64_LWP_CBADDR:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
@@ -330,11 +339,20 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
     case MSR_AMD64_LWP_CFG:
     case MSR_AMD64_LWP_CBADDR:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 0eb6855614..ba9e90af21 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -96,6 +96,38 @@
 /* Lower 6 bits define the format of the address in the LBR stack */
 #define MSR_IA32_PERF_CAP_LBR_FORMAT	0x3f
 
+/*
+ * Intel Runtime Average Power Limiting (RAPL) interface.  Power plane base
+ * addresses (MSR_*_POWER_LIMIT) are model specific, but have so far been
+ * consistent since their introduction in SandyBridge.
+ *
+ * Offsets of functionality from the power plane base are architectural, but
+ * not all power planes support all functionality.
+ */
+#define MSR_RAPL_POWER_UNIT		0x00000606
+
+#define MSR_PKG_POWER_LIMIT		0x00000610
+#define MSR_PKG_ENERGY_STATUS		0x00000611
+#define MSR_PKG_PERF_STATUS		0x00000613
+#define MSR_PKG_POWER_INFO		0x00000614
+
+#define MSR_DRAM_POWER_LIMIT		0x00000618
+#define MSR_DRAM_ENERGY_STATUS		0x00000619
+#define MSR_DRAM_PERF_STATUS		0x0000061b
+#define MSR_DRAM_POWER_INFO		0x0000061c
+
+#define MSR_PP0_POWER_LIMIT		0x00000638
+#define MSR_PP0_ENERGY_STATUS		0x00000639
+#define MSR_PP0_POLICY			0x0000063a
+
+#define MSR_PP1_POWER_LIMIT		0x00000640
+#define MSR_PP1_ENERGY_STATUS		0x00000641
+#define MSR_PP1_POLICY			0x00000642
+
+/* Intel Platform-wide power interface. */
+#define MSR_PLATFORM_ENERGY_COUNTER	0x0000064d
+#define MSR_PLATFORM_POWER_LIMIT	0x0000065c
+
 #define MSR_IA32_BNDCFGS		0x00000d90
 #define IA32_BNDCFGS_ENABLE		0x00000001
 #define IA32_BNDCFGS_PRESERVE		0x00000002
@@ -236,6 +268,8 @@
 #define MSR_K8_VM_CR			0xc0010114
 #define MSR_K8_VM_HSAVE_PA		0xc0010117
 
+#define MSR_F15H_CU_POWER		0xc001007a
+#define MSR_F15H_CU_MAX_POWER		0xc001007b
 #define MSR_AMD_FAM15H_EVNTSEL0		0xc0010200
 #define MSR_AMD_FAM15H_PERFCTR0		0xc0010201
 #define MSR_AMD_FAM15H_EVNTSEL1		0xc0010202
@@ -249,6 +283,10 @@
 #define MSR_AMD_FAM15H_EVNTSEL5		0xc001020a
 #define MSR_AMD_FAM15H_PERFCTR5		0xc001020b
 
+#define MSR_AMD_RAPL_POWER_UNIT		0xc0010299
+#define MSR_AMD_CORE_ENERGY_STATUS	0xc001029a
+#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b
+
 #define MSR_AMD_L7S0_FEATURE_MASK	0xc0011002
 #define MSR_AMD_THRM_FEATURE_MASK	0xc0011003
 #define MSR_K8_FEATURE_MASK		0xc0011004
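
The header comment says that each function sits at a fixed offset from its
power plane base.  Reading the offsets off the constants above gives
POWER_LIMIT at +0, ENERGY_STATUS at +1, POLICY at +2, PERF_STATUS at +3 and
POWER_INFO at +4.  The sketch below restates that layout; enum rapl_offset,
the RAPL_*_BASE names and rapl_msr() are hypothetical and do not appear in
the patch.

```c
#include <stdint.h>

/* Offsets inferred from the MSR_*_{POWER_LIMIT,...} constants above. */
enum rapl_offset {
    RAPL_POWER_LIMIT   = 0,
    RAPL_ENERGY_STATUS = 1,
    RAPL_POLICY        = 2,
    RAPL_PERF_STATUS   = 3,
    RAPL_POWER_INFO    = 4,
};

/* Power plane bases, i.e. the MSR_*_POWER_LIMIT values from the header. */
#define RAPL_PKG_BASE   0x00000610u
#define RAPL_DRAM_BASE  0x00000618u
#define RAPL_PP0_BASE   0x00000638u
#define RAPL_PP1_BASE   0x00000640u

/*
 * Compute the MSR index of a function within a power plane.  Not every
 * plane implements every function, so a computed index is not guaranteed
 * to be a real register on a given CPU.
 */
uint32_t rapl_msr(uint32_t plane_base, enum rapl_offset off)
{
    return plane_base + (uint32_t)off;
}
```

This layout is why a range such as MSR_PKG_POWER_LIMIT ...
MSR_PKG_POWER_INFO in msr.c covers a whole plane, including offsets the
header does not name for that plane.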

[-- Attachment #11: xsa351-x86-4.14-1.patch --]
[-- Type: application/octet-stream, Size: 6075 bytes --]

From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com>
Subject: x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently a PV hardware domain can also be given control over the CPU
frequency, and such a guest is allowed to write to MSR_IA32_PERF_CTL.
However, since commit 322ec7c89f6 the default behavior has been to
reject accesses to MSRs that are not explicitly handled, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.

Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:

  vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
  d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0

Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.

Restore the previous behavior and allow PV guests with the required
permissions to read the contents of these MSRs. Unprivileged guests
will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guests are already silently dropped.

Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3059178798a23ba870ff86ff54d442a07e6651fc)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index d72ab0fa1f..3db26faf08 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -245,6 +245,25 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = msrs->misc_features_enables.raw;
         break;
 
+        /*
+         * These MSRs are not enumerated in CPUID.  They have been around
+         * since the Pentium 4, and implemented by other vendors.
+         *
+         * Some versions of Windows try reading these before setting up a #GP
+         * handler, and Linux has several unguarded reads as well.  Provide
+         * RAZ semantics, in general, but permit a cpufreq controller dom0 to
+         * have full access.
+         */
+    case MSR_IA32_PERF_STATUS:
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        *val = 0;
+        if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
@@ -343,6 +362,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_INTEL_PLATFORM_INFO:
     case MSR_ARCH_CAPABILITIES:
+    case MSR_IA32_PERF_STATUS:
         /* Read-only */
     case MSR_TEST_CTRL:
     case MSR_TSX_FORCE_ABORT:
@@ -454,6 +474,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         break;
     }
 
+        /*
+         * This MSR is not enumerated in CPUID.  It has been around since the
+         * Pentium 4, and implemented by other vendors.
+         *
+         * To match the RAZ semantics, implement as write-discard, except for
+         * a cpufreq controller dom0 which has full access.
+         */
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 85a9fd4767..5c7b9117ae 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -820,12 +820,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
     return val;
 }
 
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
-    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            is_hardware_domain(d));
-}
-
 static int read_msr(unsigned int reg, uint64_t *val,
                     struct x86_emulate_ctxt *ctxt)
 {
@@ -1070,14 +1064,6 @@ static int write_msr(unsigned int reg, uint64_t val,
             return X86EMUL_OKAY;
         break;
 
-    case MSR_IA32_PERF_CTL:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
     case MSR_IA32_THERM_CONTROL:
     case MSR_IA32_ENERGY_PERF_BIAS:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index a0d87ef9d0..97ba8e0795 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1071,6 +1071,22 @@ extern enum cpufreq_controller {
     FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
 } cpufreq_controller;
 
+static always_inline bool is_cpufreq_controller(const struct domain *d)
+{
+    /*
+     * A PV dom0 can be nominated as the cpufreq controller, instead of using
+     * Xen's cpufreq driver, at which point dom0 gets direct access to certain
+     * MSRs.
+     *
+     * This interface only works when dom0 is identity pinned and has the same
+     * number of vCPUs as pCPUs on the system.
+     *
+     * It would be far better to paravirtualise the interface.
+     */
+    return (is_pv_domain(d) && is_hardware_domain(d) &&
+            cpufreq_controller == FREQCTL_dom0_kernel);
+}
+
 int cpupool_move_domain(struct domain *d, struct cpupool *c);
 int cpupool_do_sysctl(struct xen_sysctl_cpupool_op *op);
 int cpupool_get_id(const struct domain *d);
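
Besides moving is_cpufreq_controller() into sched.h, the patch adds an
is_pv_domain() check, so a hardware domain that is not PV no longer
counts as the cpufreq controller.  The sketch below compares the two
predicates.  struct model_domain and the explicit mode parameter are
hypothetical simplifications of Xen's struct domain and the global
cpufreq_controller variable.

```c
#include <stdbool.h>

enum cpufreq_controller { FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen };

/* Hypothetical reduction of struct domain to the two properties tested. */
struct model_domain {
    bool is_pv;
    bool is_hardware_domain;
};

/* The predicate removed from pv/emul-priv-op.c. */
bool old_is_cpufreq_controller(enum cpufreq_controller mode,
                               const struct model_domain *d)
{
    return (mode == FREQCTL_dom0_kernel) && d->is_hardware_domain;
}

/* The predicate added to sched.h: additionally requires a PV domain. */
bool new_is_cpufreq_controller(enum cpufreq_controller mode,
                               const struct model_domain *d)
{
    return d->is_pv && d->is_hardware_domain &&
           mode == FREQCTL_dom0_kernel;
}
```

The old helper was only reachable from the PV emulation path, so the
missing check was harmless there.  Once the helper is called from the
common guest_rdmsr() and guest_wrmsr() code, which also serves HVM guests,
the PV test has to be explicit.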

[-- Attachment #12: xsa351-x86-4.14-2.patch --]
[-- Type: application/octet-stream, Size: 5173 bytes --]

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: x86/msr: Disallow guest access to the RAPL MSRs

Researchers have demonstrated using the RAPL interface to perform a
differential power analysis attack to recover AES keys used by other cores in
the system.

Furthermore, even privileged guests cannot use this interface correctly, due
to MSR scope and vcpu scheduling issues.  The interface would want to be
paravirtualised to be used sensibly.

Disallow access to the RAPL MSRs completely, as well as other MSRs which
potentially expose fine-grained power information.

This is part of XSA-351.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 3db26faf08..aa107823ac 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -185,6 +185,13 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
@@ -192,6 +199,8 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_AMD64_LWP_CBADDR:
     case MSR_PPIN_CTL:
     case MSR_PPIN:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
     case MSR_AMD_PPIN_CTL:
     case MSR_AMD_PPIN:
         /* Not offered to guests. */
@@ -369,6 +378,13 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
@@ -376,6 +392,8 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_AMD64_LWP_CBADDR:
     case MSR_PPIN_CTL:
     case MSR_PPIN:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
     case MSR_AMD_PPIN_CTL:
     case MSR_AMD_PPIN:
         /* Not offered to guests. */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 0fe98af923..5e64ecff91 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -77,6 +77,38 @@
 #define MSR_RTIT_ADDR_A(n)                 (0x00000580 + (n) * 2)
 #define MSR_RTIT_ADDR_B(n)                 (0x00000581 + (n) * 2)
 
+/*
+ * Intel Runtime Average Power Limiting (RAPL) interface.  Power plane base
+ * addresses (MSR_*_POWER_LIMIT) are model specific, but have so far been
+ * consistent since their introduction in SandyBridge.
+ *
+ * Offsets of functionality from the power plane base are architectural, but
+ * not all power planes support all functionality.
+ */
+#define MSR_RAPL_POWER_UNIT                 0x00000606
+
+#define MSR_PKG_POWER_LIMIT                 0x00000610
+#define MSR_PKG_ENERGY_STATUS               0x00000611
+#define MSR_PKG_PERF_STATUS                 0x00000613
+#define MSR_PKG_POWER_INFO                  0x00000614
+
+#define MSR_DRAM_POWER_LIMIT                0x00000618
+#define MSR_DRAM_ENERGY_STATUS              0x00000619
+#define MSR_DRAM_PERF_STATUS                0x0000061b
+#define MSR_DRAM_POWER_INFO                 0x0000061c
+
+#define MSR_PP0_POWER_LIMIT                 0x00000638
+#define MSR_PP0_ENERGY_STATUS               0x00000639
+#define MSR_PP0_POLICY                      0x0000063a
+
+#define MSR_PP1_POWER_LIMIT                 0x00000640
+#define MSR_PP1_ENERGY_STATUS               0x00000641
+#define MSR_PP1_POLICY                      0x00000642
+
+/* Intel Platform-wide power interface. */
+#define MSR_PLATFORM_ENERGY_COUNTER         0x0000064d
+#define MSR_PLATFORM_POWER_LIMIT            0x0000065c
+
 #define MSR_U_CET                           0x000006a0
 #define MSR_S_CET                           0x000006a2
 #define  CET_SHSTK_EN                       (_AC(1, ULL) <<  0)
@@ -92,6 +124,13 @@
 #define  PASID_PASID_MASK                   0x000fffff
 #define  PASID_VALID                        (_AC(1, ULL) << 31)
 
+#define MSR_F15H_CU_POWER                   0xc001007a
+#define MSR_F15H_CU_MAX_POWER               0xc001007b
+
+#define MSR_AMD_RAPL_POWER_UNIT             0xc0010299
+#define MSR_AMD_CORE_ENERGY_STATUS          0xc001029a
+#define MSR_AMD_PKG_ENERGY_STATUS           0xc001029b
+
 /*
  * Legacy MSR constants in need of cleanup.  No new MSRs below this comment.
  */


* Xen Security Advisory 351 v1 - Information leak via power sidechannel
@ 2020-11-10 18:01 Xen.org security team
  0 siblings, 0 replies; 2+ messages in thread
From: Xen.org security team @ 2020-11-10 18:01 UTC (permalink / raw)
  To: xen-announce, xen-devel, xen-users, oss-security; +Cc: Xen.org security team

[-- Attachment #1: Type: text/plain, Size: 4184 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

                    Xen Security Advisory XSA-351

                 Information leak via power sidechannel

ISSUE DESCRIPTION
=================

Researchers have demonstrated using software power/energy monitoring
interfaces to create covert channels, and infer the operations/data used
by other contexts within the system.

Access to these interfaces should be restricted to privileged software,
but it was found that Xen doesn't restrict access suitably, and the
interfaces are accessible to all guests.

For more information, see:
  https://platypusattack.com
  https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00389.html

IMPACT
======

An unprivileged guest administrator can sample platform power/energy
data.  This may be used to infer the operations/data used by other
contexts within the system.

The research demonstrates using this sidechannel to leak the AES keys
used elsewhere in the system.

VULNERABLE SYSTEMS
==================

Power/energy monitoring interfaces are platform and architecture
specific.  Consult your hardware vendor to ascertain what power feedback
interfaces are available.

For ARM systems, all versions of Xen are vulnerable.  The fix restricts
access to the AMU (Activity Monitors Unit) interface, introduced in
Armv8.4.

For x86 systems, Xen 4.14 and earlier are vulnerable - master is not
vulnerable, as these issues have been addressed in a more general
fashion.

The x86 fixes restrict access to:
 * Intel RAPL interface, introduced in SandyBridge CPUs.
 * Intel platform energy interface.
 * Intel perf_ctl interface, introduced in Pentium 4 CPUs and also
   implemented by other vendors.
 * AMD RAPL interface, introduced in Ryzen/EPYC CPUs.
 * AMD compute unit energy interface, present in Fam15/16 CPUs.

MITIGATION
==========

There are no mitigations available.

RESOLUTION
==========

Applying the appropriate attached patch resolves this issue.

Note that patches for released versions are generally prepared to
apply to the stable branches, and may not apply cleanly to the most
recent release tarball.  Downstreams are encouraged to update to the
tip of the stable branch before applying these patches.

xsa351-arm.patch             Xen unstable - 4.12.x [ARM]
xsa351-arm-4.11.patch        Xen 4.11.x - 4.10.x   [ARM]
xsa351-x86-4.14-?.patch      Xen 4.14.x            [x86]
xsa351-x86-4.13-?.patch      Xen 4.13.x            [x86]
xsa351-x86-4.12-?.patch      Xen 4.12.x            [x86]
xsa351-x86-4.11-?.patch      Xen 4.11.x - 4.10.x   [x86]

$ sha256sum xsa351*
cad287981a870f13484834fa2364ffee68178517e906f55d2889304a4a9eae06  xsa351.meta
70ebd0e93af240af2680374dcfd8ff4a5dd3eefccf670f1cb9b546d763d6a554  xsa351-arm.patch
49b52a1366912a29e184e3014a9f1f579e8a0dd8a36f01d38d995d2c8ed81928  xsa351-arm-4.11.patch
2e7b7c2b98625d70c8b10047a9f668372f3ccede167344dedb712312606acbca  xsa351-x86-4.11-1.patch
ab9e2cb7d5e3e0c3a916f006c697495f4f01146e09df60ece59ce0a8f7aa5ed0  xsa351-x86-4.11-2.patch
bb68f6e6905bc1566156cafab058cbaf02a17c197385c33a83b7f73885913c1c  xsa351-x86-4.12-1.patch
53f464269f59498f8a9a614f10a47cfb1d81c666f0d684346e28005015de962c  xsa351-x86-4.12-2.patch
67a29d66230faafd9a8047ac80ec18130b5659e80a38c3a412cb2be6d3288a8f  xsa351-x86-4.13-1.patch
f7d8717dec33ee7484b36490402d113f1e7e168e7541bcf193fef620df299f08  xsa351-x86-4.13-2.patch
7d4fbe11a766226d7f1b93c5bf34664d8855deee09d1feebc76f11e49f2aa9c9  xsa351-x86-4.14-1.patch
41df825deafe3ef28e8594ec956033689af69f84a4a6dd92f97d1071e925203d  xsa351-x86-4.14-2.patch
$

NOTE REGARDING LACK OF EMBARGO
==============================

Despite an attempt to organise predisclosure, the discoverers ultimately
did not authorise a predisclosure.
-----BEGIN PGP SIGNATURE-----

iQFABAEBCAAqFiEEI+MiLBRfRHX6gGCng/4UyVfoK9kFAl+q1WwMHHBncEB4ZW4u
b3JnAAoJEIP+FMlX6CvZANkH+wf8pft4t9KoC9HFxd96DfCjZ+FQnD0hMp+890cY
ztNJM4+o+SBP2ytEMZLIoN1oJeTSQqyNgQh2sXNm7/WpseklOTR6s8zw4LWATEfz
rqF8G2xIN8ka7AAqAwOzkzj6qlxuWbiXKm4ENd5ocRxVvF1A2PYyEX88uCPgmupg
dqfufhYQF7hrz8VKDRDYtLsMrRaIFCWqGdOdQfVF64pHGHLvGZkANGN8yva8mBfC
uavwvX+O3CdVMENS4AA3TNo6p2nnWp1iQJCiBwLGCRbTQaRtRucV4Q/eSLC3pHLp
NO26OxieT4tLJN7Ox4ex43KZIsyweZSaUl18rfg0J8MB3FM=
=/6Fo
-----END PGP SIGNATURE-----

[-- Attachment #2: xsa351.meta --]
[-- Type: application/octet-stream, Size: 1772 bytes --]

{
  "XSA": 351,
  "SupportedVersions": [
    "master",
    "4.14",
    "4.13",
    "4.12",
    "4.11",
    "4.10"
  ],
  "Trees": [
    "xen"
  ],
  "Recipes": {
    "4.10": {
      "Recipes": {
        "xen": {
          "StableRef": "78d903e95efc5b0166b393d289a687c64016e8ef",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.11-?.patch",
            "xsa351-arm-4.11.patch"
          ]
        }
      }
    },
    "4.11": {
      "Recipes": {
        "xen": {
          "StableRef": "e274c8bdc12eb596e55233040e8b49da27150f31",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.11-?.patch",
            "xsa351-arm-4.11.patch"
          ]
        }
      }
    },
    "4.12": {
      "Recipes": {
        "xen": {
          "StableRef": "97b7b5567fba6918a656ad349051b5343b5dea2e",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.12-?.patch",
            "xsa351-arm.patch"
          ]
        }
      }
    },
    "4.13": {
      "Recipes": {
        "xen": {
          "StableRef": "0060ac29bcbdb76d49d2e248ddfcb7afa2345440",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.13-?.patch",
            "xsa351-arm.patch"
          ]
        }
      }
    },
    "4.14": {
      "Recipes": {
        "xen": {
          "StableRef": "10bb63c203f42d931fa1fa7dbbae7ce1765cecf2",
          "Prereqs": [],
          "Patches": [
            "xsa351-x86-4.14-?.patch",
            "xsa351-arm.patch"
          ]
        }
      }
    },
    "master": {
      "Recipes": {
        "xen": {
          "StableRef": "7056f2f89f03f2f804ac7e776c7b2b000cd716cd",
          "Prereqs": [],
          "Patches": [
            "xsa351-arm.patch"
          ]
        }
      }
    }
  }
}

[-- Attachment #3: xsa351-arm.patch --]
[-- Type: application/octet-stream, Size: 2418 bytes --]

From: Julien Grall <jgrall@amazon.com>
Subject: xen/arm: Always trap AMU system registers

The Activity Monitors Unit (AMU) was introduced by Armv8.4. It is
considered unsafe to expose to guests, as it might reveal
information about code executed by other guests or the host.

Arm provided a way to trap all the AMU system registers by setting
CPTR_EL2.TAM to 1.

Unfortunately, in older revisions of the specification, bit 30 (now
CPTR_EL2.TAM) was RES0. Because of that, Xen sets it to 0, and
therefore the system registers would be exposed to the guest when it is
run on processors with AMU.

As the bit is marked as UNKNOWN at boot in Armv8.4, the only safe solution
for us is to always set CPTR_EL2.TAM to 1.

Guests trying to access the AMU system registers will now receive an
undefined instruction exception. Unfortunately, this means that even
well-behaved guests may fail to boot because we don't sanitize the ID
registers.

This is a known issue with other Armv8.0+ features (e.g. SVE, Pointer
Auth). It will be taken care of separately.

This is part of XSA-351 (or XSA-93 re-born).

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index a36f145e67..22bd1bd4c6 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -151,7 +151,8 @@ void init_traps(void)
      * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
      * RES1, i.e. they would trap whether we did this write or not.
      */
-    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) | HCPTR_TTA,
+    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
+                 HCPTR_TTA | HCPTR_TAM,
                  CPTR_EL2);
 
     /*
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 3ca67f8157..d3d12a9d19 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -351,6 +351,7 @@
 #define VTCR_RES1       (_AC(1,UL)<<31)
 
 /* HCPTR Hyp. Coprocessor Trap Register */
+#define HCPTR_TAM       ((_AC(1,U)<<30))
 #define HCPTR_TTA       ((_AC(1,U)<<20))        /* Trap trace registers */
 #define HCPTR_CP(x)     ((_AC(1,U)<<(x)))       /* Trap Coprocessor x */
 #define HCPTR_CP_MASK   ((_AC(1,U)<<14)-1)
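
The effect of the one-line change above on the value written to CPTR_EL2
can be checked outside Xen.  The following is a minimal sketch, not Xen
code: the bit definitions are copied from the patch with _AC(1,U)
replaced by a plain unsigned constant, and the two helper functions are
hypothetical names introduced only for illustration.

```c
#include <stdint.h>

/* Bit definitions as in the patch; _AC(1,U) is written as UINT32_C(1). */
#define HCPTR_TAM       (UINT32_C(1) << 30)      /* Trap AMU registers */
#define HCPTR_TTA       (UINT32_C(1) << 20)      /* Trap trace registers */
#define HCPTR_CP(x)     (UINT32_C(1) << (x))     /* Trap Coprocessor x */
#define HCPTR_CP_MASK   ((UINT32_C(1) << 14) - 1)

/* Hypothetical helper: the value init_traps() wrote before the fix. */
static uint32_t cptr_el2_before(void)
{
    return (HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) | HCPTR_TTA;
}

/* Hypothetical helper: the value written with the fix applied. */
static uint32_t cptr_el2_after(void)
{
    return (HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
           HCPTR_TTA | HCPTR_TAM;
}
```

Under these definitions the two values differ only in bit 30, which is
the TAM bit: 0x001033ff before the change and 0x401033ff after it.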

[-- Attachment #4: xsa351-arm-4.11.patch --]
[-- Type: application/octet-stream, Size: 2714 bytes --]

From bdbd66cb9ba17dd1a7221f2a561f45a836f12f64 Mon Sep 17 00:00:00 2001
From: Julien Grall <jgrall@amazon.com>
Date: Tue, 10 Nov 2020 17:08:32 +0000
Subject: [PATCH] xen/arm: Always trap AMU system registers

The Activity Monitors Unit (AMU) was introduced in Armv8.4. It is
considered unsafe to expose to guests, as it might reveal information
about code executed by other guests or the host.

Arm provided a way to trap all the AMU system registers by setting
CPTR_EL2.TAM to 1.

Unfortunately, in older revisions of the specification, bit 30 (now
CPTR_EL2.TAM) was RES0. Because of that, Xen sets it to 0, and the
system registers are therefore exposed to the guest when it runs on
processors with AMU.

As the bit is marked as UNKNOWN at boot in Armv8.4, the only safe
solution for us is to always set CPTR_EL2.TAM to 1.

Guests trying to access the AMU system registers will now receive an
undefined instruction exception. Unfortunately, this means that even a
well-behaved guest may fail to boot because we don't sanitize the ID
registers.

This is a known issue with other Armv8.0+ features (e.g. SVE, Pointer
Auth). It will be taken care of separately.

This is part of XSA-351 (or XSA-93 re-born).

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
---
 xen/arch/arm/traps.c            | 3 ++-
 xen/include/asm-arm/processor.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index e930585ad6d4..c12010a722b5 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -179,7 +179,8 @@ void init_traps(void)
      * On ARM64 the TCPx bits which we set here (0..9,12,13) are all
      * RES1, i.e. they would trap whether we did this write or not.
      */
-    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) | HCPTR_TTA,
+    WRITE_SYSREG((HCPTR_CP_MASK & ~(HCPTR_CP(10) | HCPTR_CP(11))) |
+                 HCPTR_TTA | HCPTR_TAM,
                  CPTR_EL2);
 
     /* Setup hypervisor traps */
diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
index 222a02dd9935..5755cc64344a 100644
--- a/xen/include/asm-arm/processor.h
+++ b/xen/include/asm-arm/processor.h
@@ -291,6 +291,7 @@
 #define VTCR_RES1       (_AC(1,UL)<<31)
 
 /* HCPTR Hyp. Coprocessor Trap Register */
+#define HCPTR_TAM       ((_AC(1,U)<<30))
 #define HCPTR_TTA       ((_AC(1,U)<<20))        /* Trap trace registers */
 #define HCPTR_CP(x)     ((_AC(1,U)<<(x)))       /* Trap Coprocessor x */
 #define HCPTR_CP_MASK   ((_AC(1,U)<<14)-1)
-- 
2.17.1


[-- Attachment #5: xsa351-x86-4.11-1.patch --]
[-- Type: application/octet-stream, Size: 6244 bytes --]

From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com>
Subject: x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently a PV hardware domain can also be given control over the CPU
frequency, and such a guest is allowed to write to MSR_IA32_PERF_CTL.
However, since commit 322ec7c89f6 the default behavior has been to
reject accesses to MSRs that are not explicitly handled, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.

Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:

  vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
  d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0

Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.

Restore previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non-privileged
guests will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guests are already silently dropped.

Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3059178798a23ba870ff86ff54d442a07e6651fc)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 256e58d82b..3495ac9f4a 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -141,6 +141,7 @@ int init_vcpu_msr_policy(struct vcpu *v)
 
 int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
 {
+    const struct domain *d = v->domain;
     const struct cpuid_policy *cp = v->domain->arch.cpuid;
     const struct msr_domain_policy *dp = v->domain->arch.msr;
     const struct msr_vcpu_policy *vp = v->arch.msr;
@@ -212,6 +213,25 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
         break;
 
         /*
+         * These MSRs are not enumerated in CPUID.  They have been around
+         * since the Pentium 4, and implemented by other vendors.
+         *
+         * Some versions of Windows try reading these before setting up a #GP
+         * handler, and Linux has several unguarded reads as well.  Provide
+         * RAZ semantics, in general, but permit a cpufreq controller dom0 to
+         * have full access.
+         */
+    case MSR_IA32_PERF_STATUS:
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        *val = 0;
+        if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
+            break;
+        goto gp_fault;
+
+        /*
          * TODO: Implement when we have better topology representation.
     case MSR_INTEL_CORE_THREAD_COUNT:
          */
@@ -241,6 +261,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_INTEL_PLATFORM_INFO:
     case MSR_ARCH_CAPABILITIES:
+    case MSR_IA32_PERF_STATUS:
         /* Read-only */
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
@@ -345,6 +366,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         break;
     }
 
+        /*
+         * This MSR is not enumerated in CPUID.  It has been around since the
+         * Pentium 4, and implemented by other vendors.
+         *
+         * To match the RAZ semantics, implement as write-discard, except for
+         * a cpufreq controller dom0 which has full access.
+         */
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
+            break;
+        goto gp_fault;
+
     default:
         return X86EMUL_UNHANDLEABLE;
     }
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 8120ded330..755f00db33 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -816,12 +816,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
     return val;
 }
 
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
-    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            is_hardware_domain(d));
-}
-
 static int read_msr(unsigned int reg, uint64_t *val,
                     struct x86_emulate_ctxt *ctxt)
 {
@@ -1096,14 +1090,6 @@ static int write_msr(unsigned int reg, uint64_t val,
             return X86EMUL_OKAY;
         break;
 
-    case MSR_IA32_PERF_CTL:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
     case MSR_IA32_THERM_CONTROL:
     case MSR_IA32_ENERGY_PERF_BIAS:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index c0cc5d9336..7e4ad5d51b 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -920,6 +920,22 @@ extern enum cpufreq_controller {
     FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
 } cpufreq_controller;
 
+static always_inline bool is_cpufreq_controller(const struct domain *d)
+{
+    /*
+     * A PV dom0 can be nominated as the cpufreq controller, instead of using
+     * Xen's cpufreq driver, at which point dom0 gets direct access to certain
+     * MSRs.
+     *
+     * This interface only works when dom0 is identity pinned and has the same
+     * number of vCPUs as pCPUs on the system.
+     *
+     * It would be far better to paravirtualise the interface.
+     */
+    return (is_pv_domain(d) && is_hardware_domain(d) &&
+            cpufreq_controller == FREQCTL_dom0_kernel);
+}
+
 #define CPUPOOLID_NONE    -1
 
 struct cpupool *cpupool_get_by_id(int poolid);
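
The read path added to guest_rdmsr() above can be summarised with a
small stand-alone model.  This is only a sketch under stated
assumptions, not Xen code: struct model_domain, model_rdmsr_perf() and
the MODEL_* results are hypothetical names, the vendor test is reduced
to one boolean, and the hw_value/hw_ok parameters stand in for what
rdmsr_safe() would return on real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

enum model_result { MODEL_OKAY, MODEL_GP_FAULT };

/* Hypothetical stand-in for the domain state the patch consults. */
struct model_domain {
    bool intel_or_centaur;    /* models the cp->x86_vendor check */
    bool is_pv;               /* models is_pv_domain(d) */
    bool is_hardware_domain;  /* models is_hardware_domain(d) */
    bool dom0_kernel_cpufreq; /* cpufreq_controller == FREQCTL_dom0_kernel */
};

/* Mirrors the new predicate: note the added PV requirement. */
static bool model_is_cpufreq_controller(const struct model_domain *d)
{
    return d->is_pv && d->is_hardware_domain && d->dom0_kernel_cpufreq;
}

/* Models a guest read of MSR_IA32_PERF_STATUS or MSR_IA32_PERF_CTL. */
static enum model_result model_rdmsr_perf(const struct model_domain *d,
                                          uint64_t hw_value, bool hw_ok,
                                          uint64_t *val)
{
    if ( !d->intel_or_centaur )
        return MODEL_GP_FAULT;

    *val = 0;                       /* read-as-zero by default */
    if ( !model_is_cpufreq_controller(d) )
        return MODEL_OKAY;

    if ( !hw_ok )                   /* the real MSR read faulted */
        return MODEL_GP_FAULT;

    *val = hw_value;                /* controller dom0 sees hardware */
    return MODEL_OKAY;
}
```

In this model an ordinary guest always reads 0, a PV hardware domain
nominated as cpufreq controller reads the hardware value, and a non-PV
hardware domain is treated like any other guest because of the
is_pv_domain() term in the relocated predicate.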

[-- Attachment #6: xsa351-x86-4.11-2.patch --]
[-- Type: application/octet-stream, Size: 4448 bytes --]

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: x86/msr: Disallow guest access to the RAPL MSRs

Researchers have demonstrated using the RAPL interface to perform a
differential power analysis attack to recover AES keys used by other cores in
the system.

Furthermore, even privileged guests cannot use this interface correctly, due
to MSR scope and vcpu scheduling issues.  The interface would want to be
paravirtualised to be used sensibly.

Disallow access to the RAPL MSRs completely, as well as to other MSRs which
potentially expose fine-grained power information.

This is part of XSA-351.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 3495ac9f4a..99c848ff41 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -156,6 +156,15 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
@@ -266,6 +275,15 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 480d1d8102..a685dcdcca 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -96,6 +96,38 @@
 /* Lower 6 bits define the format of the address in the LBR stack */
 #define MSR_IA32_PERF_CAP_LBR_FORMAT	0x3f
 
+/*
+ * Intel Runtime Average Power Limiting (RAPL) interface.  Power plane base
+ * addresses (MSR_*_POWER_LIMIT) are model specific, but have so-far been
+ * consistent since their introduction in SandyBridge.
+ *
+ * Offsets of functionality from the power plane base is architectural, but
+ * not all power planes support all functionality.
+ */
+#define MSR_RAPL_POWER_UNIT		0x00000606
+
+#define MSR_PKG_POWER_LIMIT		0x00000610
+#define MSR_PKG_ENERGY_STATUS		0x00000611
+#define MSR_PKG_PERF_STATUS		0x00000613
+#define MSR_PKG_POWER_INFO		0x00000614
+
+#define MSR_DRAM_POWER_LIMIT		0x00000618
+#define MSR_DRAM_ENERGY_STATUS		0x00000619
+#define MSR_DRAM_PERF_STATUS		0x0000061b
+#define MSR_DRAM_POWER_INFO		0x0000061c
+
+#define MSR_PP0_POWER_LIMIT		0x00000638
+#define MSR_PP0_ENERGY_STATUS		0x00000639
+#define MSR_PP0_POLICY			0x0000063a
+
+#define MSR_PP1_POWER_LIMIT		0x00000640
+#define MSR_PP1_ENERGY_STATUS		0x00000641
+#define MSR_PP1_POLICY			0x00000642
+
+/* Intel Platform-wide power interface. */
+#define MSR_PLATFORM_ENERGY_COUNTER	0x0000064d
+#define MSR_PLATFORM_POWER_LIMIT	0x0000065c
+
 #define MSR_IA32_BNDCFGS		0x00000d90
 #define IA32_BNDCFGS_ENABLE		0x00000001
 #define IA32_BNDCFGS_PRESERVE		0x00000002
@@ -218,6 +250,8 @@
 #define MSR_K8_VM_CR			0xc0010114
 #define MSR_K8_VM_HSAVE_PA		0xc0010117
 
+#define MSR_F15H_CU_POWER		0xc001007a
+#define MSR_F15H_CU_MAX_POWER		0xc001007b
 #define MSR_AMD_FAM15H_EVNTSEL0		0xc0010200
 #define MSR_AMD_FAM15H_PERFCTR0		0xc0010201
 #define MSR_AMD_FAM15H_EVNTSEL1		0xc0010202
@@ -231,6 +265,10 @@
 #define MSR_AMD_FAM15H_EVNTSEL5		0xc001020a
 #define MSR_AMD_FAM15H_PERFCTR5		0xc001020b
 
+#define MSR_AMD_RAPL_POWER_UNIT		0xc0010299
+#define MSR_AMD_CORE_ENERGY_STATUS	0xc001029a
+#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b
+
 #define MSR_AMD_L7S0_FEATURE_MASK	0xc0011002
 #define MSR_AMD_THRM_FEATURE_MASK	0xc0011003
 #define MSR_K8_FEATURE_MASK		0xc0011004
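
The case ranges added to guest_rdmsr() and guest_wrmsr() above cover
every index between the named endpoints, not only the MSRs that have a
#define.  The following minimal sketch shows the resulting filter as a
table lookup.  It is not Xen code: the table and the function name
model_is_blocked_power_msr() are hypothetical, and the numeric values
are copied from the msr-index.h hunk of this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct msr_range {
    uint32_t first, last;   /* inclusive bounds */
};

/* Index ranges that the patch sends to gp_fault for all guests. */
static const struct msr_range blocked_power_msrs[] = {
    { 0x00000606, 0x00000606 }, /* MSR_RAPL_POWER_UNIT */
    { 0x00000610, 0x00000614 }, /* MSR_PKG_POWER_LIMIT ... POWER_INFO */
    { 0x00000618, 0x0000061c }, /* MSR_DRAM_POWER_LIMIT ... POWER_INFO */
    { 0x00000638, 0x0000063a }, /* MSR_PP0_POWER_LIMIT ... POLICY */
    { 0x00000640, 0x00000642 }, /* MSR_PP1_POWER_LIMIT ... POLICY */
    { 0x0000064d, 0x0000064d }, /* MSR_PLATFORM_ENERGY_COUNTER */
    { 0x0000065c, 0x0000065c }, /* MSR_PLATFORM_POWER_LIMIT */
    { 0xc001007a, 0xc001007b }, /* MSR_F15H_CU_POWER ... CU_MAX_POWER */
    { 0xc0010299, 0xc001029b }, /* MSR_AMD_RAPL_POWER_UNIT ... PKG_ENERGY */
};

/* Returns true if a guest access to this index would be refused. */
static bool model_is_blocked_power_msr(uint32_t msr)
{
    size_t n = sizeof(blocked_power_msrs) / sizeof(blocked_power_msrs[0]);

    for ( size_t i = 0; i < n; i++ )
        if ( msr >= blocked_power_msrs[i].first &&
             msr <= blocked_power_msrs[i].last )
            return true;

    return false;
}
```

For example, index 0x612 has no name in the header but lies inside the
package range, so it is refused in the same way as
MSR_PKG_ENERGY_STATUS (0x611), while 0x615 lies outside every range
listed here.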

[-- Attachment #7: xsa351-x86-4.12-1.patch --]
[-- Type: application/octet-stream, Size: 5992 bytes --]

From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com>
Subject: x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently a PV hardware domain can also be given control over the CPU
frequency, and such a guest is allowed to write to MSR_IA32_PERF_CTL.
However, since commit 322ec7c89f6 the default behavior has been to
reject accesses to MSRs that are not explicitly handled, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.

Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:

  vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
  d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0

Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.

Restore previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non-privileged
guests will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guests are already silently dropped.

Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3059178798a23ba870ff86ff54d442a07e6651fc)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 4677222c40..a427826ba0 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -206,6 +206,25 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = msrs->misc_features_enables.raw;
         break;
 
+        /*
+         * These MSRs are not enumerated in CPUID.  They have been around
+         * since the Pentium 4, and implemented by other vendors.
+         *
+         * Some versions of Windows try reading these before setting up a #GP
+         * handler, and Linux has several unguarded reads as well.  Provide
+         * RAZ semantics, in general, but permit a cpufreq controller dom0 to
+         * have full access.
+         */
+    case MSR_IA32_PERF_STATUS:
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        *val = 0;
+        if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
@@ -290,6 +309,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_INTEL_PLATFORM_INFO:
     case MSR_ARCH_CAPABILITIES:
+    case MSR_IA32_PERF_STATUS:
         /* Read-only */
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
@@ -394,6 +414,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         break;
     }
 
+        /*
+         * This MSR is not enumerated in CPUID.  It has been around since the
+         * Pentium 4, and implemented by other vendors.
+         *
+         * To match the RAZ semantics, implement as write-discard, except for
+         * a cpufreq controller dom0 which has full access.
+         */
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 324a2334a2..933036ea34 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -799,12 +799,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
     return val;
 }
 
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
-    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            is_hardware_domain(d));
-}
-
 static int read_msr(unsigned int reg, uint64_t *val,
                     struct x86_emulate_ctxt *ctxt)
 {
@@ -1047,14 +1041,6 @@ static int write_msr(unsigned int reg, uint64_t val,
             return X86EMUL_OKAY;
         break;
 
-    case MSR_IA32_PERF_CTL:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
     case MSR_IA32_THERM_CONTROL:
     case MSR_IA32_ENERGY_PERF_BIAS:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 819f6ede2b..b918624327 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -993,6 +993,22 @@ extern enum cpufreq_controller {
     FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
 } cpufreq_controller;
 
+static always_inline bool is_cpufreq_controller(const struct domain *d)
+{
+    /*
+     * A PV dom0 can be nominated as the cpufreq controller, instead of using
+     * Xen's cpufreq driver, at which point dom0 gets direct access to certain
+     * MSRs.
+     *
+     * This interface only works when dom0 is identity pinned and has the same
+     * number of vCPUs as pCPUs on the system.
+     *
+     * It would be far better to paravirtualise the interface.
+     */
+    return (is_pv_domain(d) && is_hardware_domain(d) &&
+            cpufreq_controller == FREQCTL_dom0_kernel);
+}
+
 #define CPUPOOLID_NONE    -1
 
 struct cpupool *cpupool_get_by_id(int poolid);

[-- Attachment #8: xsa351-x86-4.12-2.patch --]
[-- Type: application/octet-stream, Size: 4682 bytes --]

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: x86/msr: Disallow guest access to the RAPL MSRs

Researchers have demonstrated using the RAPL interface to perform a
differential power analysis attack to recover AES keys used by other cores in
the system.

Furthermore, even privileged guests cannot use this interface correctly, due
to MSR scope and vcpu scheduling issues.  The interface would want to be
paravirtualised to be used sensibly.

Disallow access to the RAPL MSRs completely, as well as to other MSRs which
potentially expose fine-grained power information.

This is part of XSA-351.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index a427826ba0..927ed625df 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -151,9 +151,18 @@ int guest_rdmsr(const struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
@@ -315,9 +324,18 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 0eb6855614..ba9e90af21 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -96,6 +96,38 @@
 /* Lower 6 bits define the format of the address in the LBR stack */
 #define MSR_IA32_PERF_CAP_LBR_FORMAT	0x3f
 
+/*
+ * Intel Runtime Average Power Limiting (RAPL) interface.  Power plane base
+ * addresses (MSR_*_POWER_LIMIT) are model specific, but have so-far been
+ * consistent since their introduction in SandyBridge.
+ *
+ * Offsets of functionality from the power plane base is architectural, but
+ * not all power planes support all functionality.
+ */
+#define MSR_RAPL_POWER_UNIT		0x00000606
+
+#define MSR_PKG_POWER_LIMIT		0x00000610
+#define MSR_PKG_ENERGY_STATUS		0x00000611
+#define MSR_PKG_PERF_STATUS		0x00000613
+#define MSR_PKG_POWER_INFO		0x00000614
+
+#define MSR_DRAM_POWER_LIMIT		0x00000618
+#define MSR_DRAM_ENERGY_STATUS		0x00000619
+#define MSR_DRAM_PERF_STATUS		0x0000061b
+#define MSR_DRAM_POWER_INFO		0x0000061c
+
+#define MSR_PP0_POWER_LIMIT		0x00000638
+#define MSR_PP0_ENERGY_STATUS		0x00000639
+#define MSR_PP0_POLICY			0x0000063a
+
+#define MSR_PP1_POWER_LIMIT		0x00000640
+#define MSR_PP1_ENERGY_STATUS		0x00000641
+#define MSR_PP1_POLICY			0x00000642
+
+/* Intel Platform-wide power interface. */
+#define MSR_PLATFORM_ENERGY_COUNTER	0x0000064d
+#define MSR_PLATFORM_POWER_LIMIT	0x0000065c
+
 #define MSR_IA32_BNDCFGS		0x00000d90
 #define IA32_BNDCFGS_ENABLE		0x00000001
 #define IA32_BNDCFGS_PRESERVE		0x00000002
@@ -236,6 +268,8 @@
 #define MSR_K8_VM_CR			0xc0010114
 #define MSR_K8_VM_HSAVE_PA		0xc0010117
 
+#define MSR_F15H_CU_POWER		0xc001007a
+#define MSR_F15H_CU_MAX_POWER		0xc001007b
 #define MSR_AMD_FAM15H_EVNTSEL0		0xc0010200
 #define MSR_AMD_FAM15H_PERFCTR0		0xc0010201
 #define MSR_AMD_FAM15H_EVNTSEL1		0xc0010202
@@ -249,6 +283,10 @@
 #define MSR_AMD_FAM15H_EVNTSEL5		0xc001020a
 #define MSR_AMD_FAM15H_PERFCTR5		0xc001020b
 
+#define MSR_AMD_RAPL_POWER_UNIT		0xc0010299
+#define MSR_AMD_CORE_ENERGY_STATUS	0xc001029a
+#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b
+
 #define MSR_AMD_L7S0_FEATURE_MASK	0xc0011002
 #define MSR_AMD_THRM_FEATURE_MASK	0xc0011003
 #define MSR_K8_FEATURE_MASK		0xc0011004

[-- Attachment #9: xsa351-x86-4.13-1.patch --]
[-- Type: application/octet-stream, Size: 5988 bytes --]

From: =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= <roger.pau@citrix.com>
Subject: x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently a PV hardware domain can also be given control over the CPU
frequency, and such a guest is allowed to write to MSR_IA32_PERF_CTL.
However, since commit 322ec7c89f6 the default behavior has been to
reject accesses to MSRs that are not explicitly handled, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.

Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:

  vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
  d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0

Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.

Restore previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non-privileged
guests will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guests are already silently dropped.

Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3059178798a23ba870ff86ff54d442a07e6651fc)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 875ac39d30..8c969197aa 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -208,6 +208,25 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = msrs->misc_features_enables.raw;
         break;
 
+        /*
+         * These MSRs are not enumerated in CPUID.  They have been around
+         * since the Pentium 4, and implemented by other vendors.
+         *
+         * Some versions of Windows try reading these before setting up a #GP
+         * handler, and Linux has several unguarded reads as well.  Provide
+         * RAZ semantics, in general, but permit a cpufreq controller dom0 to
+         * have full access.
+         */
+    case MSR_IA32_PERF_STATUS:
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        *val = 0;
+        if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
@@ -305,6 +324,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_INTEL_PLATFORM_INFO:
     case MSR_ARCH_CAPABILITIES:
+    case MSR_IA32_PERF_STATUS:
         /* Read-only */
     case MSR_TSX_FORCE_ABORT:
     case MSR_TSX_CTRL:
@@ -411,6 +431,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         break;
     }
 
+        /*
+         * This MSR is not enumerated in CPUID.  It has been around since the
+         * Pentium 4, and implemented by other vendors.
+         *
+         * To match the RAZ semantics, implement as write-discard, except for
+         * a cpufreq controller dom0 which has full access.
+         */
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 42258c6bf1..6dc4f92a84 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -776,12 +776,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
     return val;
 }
 
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
-    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            is_hardware_domain(d));
-}
-
 static int read_msr(unsigned int reg, uint64_t *val,
                     struct x86_emulate_ctxt *ctxt)
 {
@@ -1026,14 +1020,6 @@ static int write_msr(unsigned int reg, uint64_t val,
             return X86EMUL_OKAY;
         break;
 
-    case MSR_IA32_PERF_CTL:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
     case MSR_IA32_THERM_CONTROL:
     case MSR_IA32_ENERGY_PERF_BIAS:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index d6e27fc4b8..8bb5bd7b38 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1057,6 +1057,22 @@ extern enum cpufreq_controller {
     FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
 } cpufreq_controller;
 
+static always_inline bool is_cpufreq_controller(const struct domain *d)
+{
+    /*
+     * A PV dom0 can be nominated as the cpufreq controller, instead of using
+     * Xen's cpufreq driver, at which point dom0 gets direct access to certain
+     * MSRs.
+     *
+     * This interface only works when dom0 is identity pinned and has the same
+     * number of vCPUs as pCPUs on the system.
+     *
+     * It would be far better to paravirtualise the interface.
+     */
+    return (is_pv_domain(d) && is_hardware_domain(d) &&
+            cpufreq_controller == FREQCTL_dom0_kernel);
+}
+
 #define CPUPOOLID_NONE    -1
 
 struct cpupool *cpupool_get_by_id(int poolid);

[-- Attachment #10: xsa351-x86-4.13-2.patch --]
[-- Type: application/octet-stream, Size: 4800 bytes --]

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: x86/msr: Disallow guest access to the RAPL MSRs

Researchers have demonstrated using the RAPL interface to perform a
differential power analysis attack to recover AES keys used by other cores in
the system.

Furthermore, even privileged guests cannot use this interface correctly, due
to MSR scope and vCPU scheduling issues.  The interface would need to be
paravirtualised to be usable sensibly.

Disallow access to the RAPL MSRs completely, as well as to other MSRs which
potentially expose fine-grained power information.

This is part of XSA-351.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 8c969197aa..8ab6949a8e 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -152,11 +152,20 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
     case MSR_AMD64_LWP_CFG:
     case MSR_AMD64_LWP_CBADDR:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
@@ -330,11 +339,20 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
     case MSR_AMD64_LWP_CFG:
     case MSR_AMD64_LWP_CBADDR:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
         /* Not offered to guests. */
         goto gp_fault;
 
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 0eb6855614..ba9e90af21 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -96,6 +96,38 @@
 /* Lower 6 bits define the format of the address in the LBR stack */
 #define MSR_IA32_PERF_CAP_LBR_FORMAT	0x3f
 
+/*
+ * Intel Runtime Average Power Limiting (RAPL) interface.  Power plane base
+ * addresses (MSR_*_POWER_LIMIT) are model specific, but have so-far been
+ * consistent since their introduction in SandyBridge.
+ *
+ * Offsets of functionality from the power plane base is architectural, but
+ * not all power planes support all functionality.
+ */
+#define MSR_RAPL_POWER_UNIT		0x00000606
+
+#define MSR_PKG_POWER_LIMIT		0x00000610
+#define MSR_PKG_ENERGY_STATUS		0x00000611
+#define MSR_PKG_PERF_STATUS		0x00000613
+#define MSR_PKG_POWER_INFO		0x00000614
+
+#define MSR_DRAM_POWER_LIMIT		0x00000618
+#define MSR_DRAM_ENERGY_STATUS		0x00000619
+#define MSR_DRAM_PERF_STATUS		0x0000061b
+#define MSR_DRAM_POWER_INFO		0x0000061c
+
+#define MSR_PP0_POWER_LIMIT		0x00000638
+#define MSR_PP0_ENERGY_STATUS		0x00000639
+#define MSR_PP0_POLICY			0x0000063a
+
+#define MSR_PP1_POWER_LIMIT		0x00000640
+#define MSR_PP1_ENERGY_STATUS		0x00000641
+#define MSR_PP1_POLICY			0x00000642
+
+/* Intel Platform-wide power interface. */
+#define MSR_PLATFORM_ENERGY_COUNTER	0x0000064d
+#define MSR_PLATFORM_POWER_LIMIT	0x0000065c
+
 #define MSR_IA32_BNDCFGS		0x00000d90
 #define IA32_BNDCFGS_ENABLE		0x00000001
 #define IA32_BNDCFGS_PRESERVE		0x00000002
@@ -236,6 +268,8 @@
 #define MSR_K8_VM_CR			0xc0010114
 #define MSR_K8_VM_HSAVE_PA		0xc0010117
 
+#define MSR_F15H_CU_POWER		0xc001007a
+#define MSR_F15H_CU_MAX_POWER		0xc001007b
 #define MSR_AMD_FAM15H_EVNTSEL0		0xc0010200
 #define MSR_AMD_FAM15H_PERFCTR0		0xc0010201
 #define MSR_AMD_FAM15H_EVNTSEL1		0xc0010202
@@ -249,6 +283,10 @@
 #define MSR_AMD_FAM15H_EVNTSEL5		0xc001020a
 #define MSR_AMD_FAM15H_PERFCTR5		0xc001020b
 
+#define MSR_AMD_RAPL_POWER_UNIT		0xc0010299
+#define MSR_AMD_CORE_ENERGY_STATUS	0xc001029a
+#define MSR_AMD_PKG_ENERGY_STATUS	0xc001029b
+
 #define MSR_AMD_L7S0_FEATURE_MASK	0xc0011002
 #define MSR_AMD_THRM_FEATURE_MASK	0xc0011003
 #define MSR_K8_FEATURE_MASK		0xc0011004
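
Editorial sketch (not part of the patch): the case labels added to
guest_rdmsr() and guest_wrmsr() above can be read as a single "is this a
power-monitoring MSR" test.  The standalone function below reproduces that
list using the constants from the msr-index.h hunk.  The function name
is_blocked_power_msr() is hypothetical, and the "a ... b" case ranges are a
GNU C extension (accepted by GCC and Clang), as in the patch itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from the msr-index.h hunk above. */
#define MSR_RAPL_POWER_UNIT          0x00000606
#define MSR_PKG_POWER_LIMIT          0x00000610
#define MSR_PKG_POWER_INFO           0x00000614
#define MSR_DRAM_POWER_LIMIT         0x00000618
#define MSR_DRAM_POWER_INFO          0x0000061c
#define MSR_PP0_POWER_LIMIT          0x00000638
#define MSR_PP0_POLICY               0x0000063a
#define MSR_PP1_POWER_LIMIT          0x00000640
#define MSR_PP1_POLICY               0x00000642
#define MSR_PLATFORM_ENERGY_COUNTER  0x0000064d
#define MSR_PLATFORM_POWER_LIMIT     0x0000065c
#define MSR_F15H_CU_POWER            0xc001007a
#define MSR_F15H_CU_MAX_POWER        0xc001007b
#define MSR_AMD_RAPL_POWER_UNIT      0xc0010299
#define MSR_AMD_PKG_ENERGY_STATUS    0xc001029b

/* Hypothetical helper: true for every index the patch sends to gp_fault. */
static bool is_blocked_power_msr(uint32_t msr)
{
    switch ( msr )
    {
    case MSR_RAPL_POWER_UNIT:
    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
    case MSR_PLATFORM_ENERGY_COUNTER:
    case MSR_PLATFORM_POWER_LIMIT:
    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
        return true;

    default:
        return false;
    }
}
```

Because ranges are used, indices inside a range that have no named constant
in the patch (for example 0x612 in the package plane) are covered as well.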

[-- Attachment #11: xsa351-x86-4.14-1.patch --]
[-- Type: application/octet-stream, Size: 6075 bytes --]

From: Roger Pau Monné <roger.pau@citrix.com>
Subject: x86/msr: fix handling of MSR_IA32_PERF_{STATUS/CTL}
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Currently a PV hardware domain can also be given control over the CPU
frequency, and such a guest is allowed to write to MSR_IA32_PERF_CTL.
However, since commit 322ec7c89f6 the default behavior has been to
reject accesses to MSRs that are not explicitly handled, preventing PV
guests that manage CPU frequency from reading
MSR_IA32_PERF_{STATUS/CTL}.

Additionally some HVM guests (Windows at least) will attempt to read
MSR_IA32_PERF_CTL and will panic if given back a #GP fault:

  vmx.c:3035:d8v0 RDMSR 0x00000199 unimplemented
  d8v0 VIRIDIAN CRASH: 3b c0000096 fffff806871c1651 ffffda0253683720 0

Move the handling of MSR_IA32_PERF_{STATUS/CTL} to the common MSR
handling shared between HVM and PV guests, and add an explicit case
for reads to MSR_IA32_PERF_{STATUS/CTL}.

Restore the previous behavior and allow PV guests with the required
permissions to read the contents of the mentioned MSRs. Non-privileged
guests will get 0 when trying to read those registers, as writes to
MSR_IA32_PERF_CTL by such guests are already silently dropped.

Fixes: 322ec7c89f6 ('x86/pv: disallow access to unknown MSRs')
Fixes: 84e848fd7a1 ('x86/hvm: disallow access to unknown MSRs')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3059178798a23ba870ff86ff54d442a07e6651fc)

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index d72ab0fa1f..3db26faf08 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -245,6 +245,25 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
         *val = msrs->misc_features_enables.raw;
         break;
 
+        /*
+         * These MSRs are not enumerated in CPUID.  They have been around
+         * since the Pentium 4, and implemented by other vendors.
+         *
+         * Some versions of Windows try reading these before setting up a #GP
+         * handler, and Linux has several unguarded reads as well.  Provide
+         * RAZ semantics, in general, but permit a cpufreq controller dom0 to
+         * have full access.
+         */
+    case MSR_IA32_PERF_STATUS:
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        *val = 0;
+        if ( likely(!is_cpufreq_controller(d)) || rdmsr_safe(msr, *val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
@@ -343,6 +362,7 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_INTEL_CORE_THREAD_COUNT:
     case MSR_INTEL_PLATFORM_INFO:
     case MSR_ARCH_CAPABILITIES:
+    case MSR_IA32_PERF_STATUS:
         /* Read-only */
     case MSR_TEST_CTRL:
     case MSR_TSX_FORCE_ABORT:
@@ -454,6 +474,21 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
         break;
     }
 
+        /*
+         * This MSR is not enumerated in CPUID.  It has been around since the
+         * Pentium 4, and implemented by other vendors.
+         *
+         * To match the RAZ semantics, implement as write-discard, except for
+         * a cpufreq controller dom0 which has full access.
+         */
+    case MSR_IA32_PERF_CTL:
+        if ( !(cp->x86_vendor & (X86_VENDOR_INTEL | X86_VENDOR_CENTAUR)) )
+            goto gp_fault;
+
+        if ( likely(!is_cpufreq_controller(d)) || wrmsr_safe(msr, val) == 0 )
+            break;
+        goto gp_fault;
+
     case MSR_X2APIC_FIRST ... MSR_X2APIC_LAST:
         if ( !is_hvm_domain(d) || v != curr )
             goto gp_fault;
diff --git a/xen/arch/x86/pv/emul-priv-op.c b/xen/arch/x86/pv/emul-priv-op.c
index 85a9fd4767..5c7b9117ae 100644
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -820,12 +820,6 @@ static inline uint64_t guest_misc_enable(uint64_t val)
     return val;
 }
 
-static inline bool is_cpufreq_controller(const struct domain *d)
-{
-    return ((cpufreq_controller == FREQCTL_dom0_kernel) &&
-            is_hardware_domain(d));
-}
-
 static int read_msr(unsigned int reg, uint64_t *val,
                     struct x86_emulate_ctxt *ctxt)
 {
@@ -1070,14 +1064,6 @@ static int write_msr(unsigned int reg, uint64_t val,
             return X86EMUL_OKAY;
         break;
 
-    case MSR_IA32_PERF_CTL:
-        if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
-            break;
-        if ( likely(!is_cpufreq_controller(currd)) ||
-             wrmsr_safe(reg, val) == 0 )
-            return X86EMUL_OKAY;
-        break;
-
     case MSR_IA32_THERM_CONTROL:
     case MSR_IA32_ENERGY_PERF_BIAS:
         if ( boot_cpu_data.x86_vendor != X86_VENDOR_INTEL )
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index a0d87ef9d0..97ba8e0795 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1071,6 +1071,22 @@ extern enum cpufreq_controller {
     FREQCTL_none, FREQCTL_dom0_kernel, FREQCTL_xen
 } cpufreq_controller;
 
+static always_inline bool is_cpufreq_controller(const struct domain *d)
+{
+    /*
+     * A PV dom0 can be nominated as the cpufreq controller, instead of using
+     * Xen's cpufreq driver, at which point dom0 gets direct access to certain
+     * MSRs.
+     *
+     * This interface only works when dom0 is identity pinned and has the same
+     * number of vCPUs as pCPUs on the system.
+     *
+     * It would be far better to paravirtualise the interface.
+     */
+    return (is_pv_domain(d) && is_hardware_domain(d) &&
+            cpufreq_controller == FREQCTL_dom0_kernel);
+}
+
 int cpupool_move_domain(struct domain *d, struct cpupool *c);
 int cpupool_do_sysctl(struct xen_sysctl_cpupool_op *op);
 int cpupool_get_id(const struct domain *d);
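
Editorial sketch (not part of the patch): the read-as-zero and
write-discard handling of MSR_IA32_PERF_{STATUS/CTL} added to msr.c above
can be modelled without any Xen internals.  Everything in struct
perf_msr_ctx is a hypothetical stand-in: vendor_ok replaces the
cp->x86_vendor test, controller replaces is_cpufreq_controller(d), and
hw_ok/hw_value replace the outcome of rdmsr_safe()/wrmsr_safe().

```c
#include <stdbool.h>
#include <stdint.h>

enum msr_result { MSR_OK, MSR_GP_FAULT };

/* Hypothetical model state; see the lead-in for what each field replaces. */
struct perf_msr_ctx {
    bool vendor_ok;     /* policy vendor is Intel or Centaur */
    bool controller;    /* domain is the cpufreq controller dom0 */
    bool hw_ok;         /* the real MSR access would succeed */
    uint64_t hw_value;  /* contents of the real MSR */
};

/* Reads of PERF_STATUS/PERF_CTL: zero for ordinary guests. */
static enum msr_result model_rdmsr_perf(const struct perf_msr_ctx *c,
                                        uint64_t *val)
{
    if ( !c->vendor_ok )
        return MSR_GP_FAULT;

    *val = 0;                       /* RAZ default */
    if ( !c->controller )
        return MSR_OK;
    if ( !c->hw_ok )
        return MSR_GP_FAULT;

    *val = c->hw_value;             /* controller dom0 sees hardware */
    return MSR_OK;
}

/* Writes of PERF_CTL: silently dropped for ordinary guests. */
static enum msr_result model_wrmsr_perf_ctl(struct perf_msr_ctx *c,
                                            uint64_t val)
{
    if ( !c->vendor_ok )
        return MSR_GP_FAULT;
    if ( !c->controller )
        return MSR_OK;              /* write-discard */
    if ( !c->hw_ok )
        return MSR_GP_FAULT;

    c->hw_value = val;
    return MSR_OK;
}
```

Writes to MSR_IA32_PERF_STATUS are not modelled here; the patch lists that
MSR with the read-only cases in guest_wrmsr().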

[-- Attachment #12: xsa351-x86-4.14-2.patch --]
[-- Type: application/octet-stream, Size: 5173 bytes --]

From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: x86/msr: Disallow guest access to the RAPL MSRs

Researchers have demonstrated using the RAPL interface to perform a
differential power analysis attack to recover AES keys used by other cores in
the system.

Furthermore, even privileged guests cannot use this interface correctly, due
to MSR scope and vCPU scheduling issues.  The interface would need to be
paravirtualised to be usable sensibly.

Disallow access to the RAPL MSRs completely, as well as to other MSRs which
potentially expose fine-grained power information.

This is part of XSA-351.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

diff --git a/xen/arch/x86/msr.c b/xen/arch/x86/msr.c
index 3db26faf08..aa107823ac 100644
--- a/xen/arch/x86/msr.c
+++ b/xen/arch/x86/msr.c
@@ -185,6 +185,13 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
@@ -192,6 +199,8 @@ int guest_rdmsr(struct vcpu *v, uint32_t msr, uint64_t *val)
     case MSR_AMD64_LWP_CBADDR:
     case MSR_PPIN_CTL:
     case MSR_PPIN:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
     case MSR_AMD_PPIN_CTL:
     case MSR_AMD_PPIN:
         /* Not offered to guests. */
@@ -369,6 +378,13 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_TSX_CTRL:
     case MSR_MCU_OPT_CTRL:
     case MSR_RTIT_OUTPUT_BASE ... MSR_RTIT_ADDR_B(7):
+    case MSR_RAPL_POWER_UNIT:
+    case MSR_PKG_POWER_LIMIT  ... MSR_PKG_POWER_INFO:
+    case MSR_DRAM_POWER_LIMIT ... MSR_DRAM_POWER_INFO:
+    case MSR_PP0_POWER_LIMIT  ... MSR_PP0_POLICY:
+    case MSR_PP1_POWER_LIMIT  ... MSR_PP1_POLICY:
+    case MSR_PLATFORM_ENERGY_COUNTER:
+    case MSR_PLATFORM_POWER_LIMIT:
     case MSR_U_CET:
     case MSR_S_CET:
     case MSR_PL0_SSP ... MSR_INTERRUPT_SSP_TABLE:
@@ -376,6 +392,8 @@ int guest_wrmsr(struct vcpu *v, uint32_t msr, uint64_t val)
     case MSR_AMD64_LWP_CBADDR:
     case MSR_PPIN_CTL:
     case MSR_PPIN:
+    case MSR_F15H_CU_POWER ... MSR_F15H_CU_MAX_POWER:
+    case MSR_AMD_RAPL_POWER_UNIT ... MSR_AMD_PKG_ENERGY_STATUS:
     case MSR_AMD_PPIN_CTL:
     case MSR_AMD_PPIN:
         /* Not offered to guests. */
diff --git a/xen/include/asm-x86/msr-index.h b/xen/include/asm-x86/msr-index.h
index 0fe98af923..5e64ecff91 100644
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -77,6 +77,38 @@
 #define MSR_RTIT_ADDR_A(n)                 (0x00000580 + (n) * 2)
 #define MSR_RTIT_ADDR_B(n)                 (0x00000581 + (n) * 2)
 
+/*
+ * Intel Runtime Average Power Limiting (RAPL) interface.  Power plane base
+ * addresses (MSR_*_POWER_LIMIT) are model specific, but have so-far been
+ * consistent since their introduction in SandyBridge.
+ *
+ * Offsets of functionality from the power plane base is architectural, but
+ * not all power planes support all functionality.
+ */
+#define MSR_RAPL_POWER_UNIT                 0x00000606
+
+#define MSR_PKG_POWER_LIMIT                 0x00000610
+#define MSR_PKG_ENERGY_STATUS               0x00000611
+#define MSR_PKG_PERF_STATUS                 0x00000613
+#define MSR_PKG_POWER_INFO                  0x00000614
+
+#define MSR_DRAM_POWER_LIMIT                0x00000618
+#define MSR_DRAM_ENERGY_STATUS              0x00000619
+#define MSR_DRAM_PERF_STATUS                0x0000061b
+#define MSR_DRAM_POWER_INFO                 0x0000061c
+
+#define MSR_PP0_POWER_LIMIT                 0x00000638
+#define MSR_PP0_ENERGY_STATUS               0x00000639
+#define MSR_PP0_POLICY                      0x0000063a
+
+#define MSR_PP1_POWER_LIMIT                 0x00000640
+#define MSR_PP1_ENERGY_STATUS               0x00000641
+#define MSR_PP1_POLICY                      0x00000642
+
+/* Intel Platform-wide power interface. */
+#define MSR_PLATFORM_ENERGY_COUNTER         0x0000064d
+#define MSR_PLATFORM_POWER_LIMIT            0x0000065c
+
 #define MSR_U_CET                           0x000006a0
 #define MSR_S_CET                           0x000006a2
 #define  CET_SHSTK_EN                       (_AC(1, ULL) <<  0)
@@ -92,6 +124,13 @@
 #define  PASID_PASID_MASK                   0x000fffff
 #define  PASID_VALID                        (_AC(1, ULL) << 31)
 
+#define MSR_F15H_CU_POWER                   0xc001007a
+#define MSR_F15H_CU_MAX_POWER               0xc001007b
+
+#define MSR_AMD_RAPL_POWER_UNIT             0xc0010299
+#define MSR_AMD_CORE_ENERGY_STATUS          0xc001029a
+#define MSR_AMD_PKG_ENERGY_STATUS           0xc001029b
+
 /*
  * Legacy MSR constants in need of cleanup.  No new MSRs below this comment.
  */
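
Editorial sketch (not part of the patch): the msr-index.h comment above
says each RAPL power plane has a model-specific base address with
architectural offsets for its functions.  The offsets below are inferred
only from the constants defined in this patch, not from vendor
documentation, and the enum and function names are hypothetical.

```c
#include <stdint.h>

/* Power plane bases, copied from the msr-index.h hunk above. */
#define MSR_PKG_POWER_LIMIT   0x00000610
#define MSR_DRAM_POWER_LIMIT  0x00000618
#define MSR_PP0_POWER_LIMIT   0x00000638
#define MSR_PP1_POWER_LIMIT   0x00000640

/* Hypothetical names for the per-plane offsets implied by the constants. */
enum rapl_offset {
    RAPL_OFF_POWER_LIMIT   = 0,  /* MSR_*_POWER_LIMIT   */
    RAPL_OFF_ENERGY_STATUS = 1,  /* MSR_*_ENERGY_STATUS */
    RAPL_OFF_POLICY        = 2,  /* MSR_PP{0,1}_POLICY  */
    RAPL_OFF_PERF_STATUS   = 3,  /* MSR_{PKG,DRAM}_PERF_STATUS */
    RAPL_OFF_POWER_INFO    = 4,  /* MSR_{PKG,DRAM}_POWER_INFO  */
};

/* MSR index of one function within one power plane. */
static uint32_t rapl_msr(uint32_t plane_base, enum rapl_offset off)
{
    return plane_base + (uint32_t)off;
}
```

For example, rapl_msr(MSR_PKG_POWER_LIMIT, RAPL_OFF_ENERGY_STATUS) gives
0x611, matching MSR_PKG_ENERGY_STATUS.  As the comment notes, not every
plane implements every offset.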
