* L1TF Patch Series v10
@ 2019-03-14 12:50 Norbert Manthey
  2019-03-14 12:50 ` [PATCH L1TF v10 1/8] spec: add l1tf-barrier Norbert Manthey
                   ` (7 more replies)
  0 siblings, 8 replies; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 12:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel, Norbert Manthey

Dear all,

This patch series attempts to mitigate the issues that have been raised in
XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
execution on Intel hardware, an lfence instruction is required to make sure
that selected checks are not bypassed. Speculative out-of-bound accesses can
be prevented by using the array_index_nospec macro.
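
For illustration, the series relies on these two patterns (a sketch only; all
identifiers except array_index_nospec are placeholders):

    /* Clamp a guest-controlled index so it stays in range even under
     * speculation. */
    value = table[array_index_nospec(idx, table_size)];

    /* Serialise a range check before dependent loads may start; the series
     * wraps this lfence pattern in helper macros introduced in patch 2/8. */
    if ( idx >= table_size )
        return -EINVAL;
    asm volatile ( "lfence" ::: "memory" );
    value = table[idx];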

The major change compared to version 9 is patch 8/8, which slipped through the
analysis in the first rounds.

Best,
Norbert





* [PATCH L1TF v10 1/8] spec: add l1tf-barrier
  2019-03-14 12:50 L1TF Patch Series v10 Norbert Manthey
@ 2019-03-14 12:50 ` Norbert Manthey
  2019-03-14 12:50 ` [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec Norbert Manthey
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 12:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel, Norbert Manthey

To better control the runtime behavior on L1TF-vulnerable platforms, the
command line option l1tf-barrier is introduced. This option controls
whether, on vulnerable x86 platforms, the lfence instruction is used to
prevent speculative execution from bypassing the evaluation of
conditionals that are protected with the evaluate_nospec macro.

Xen is already capable of identifying L1TF-vulnerable hardware. However,
this information cannot be used for alternative patching directly, as a
CPU feature is required. To control alternative patching with the command
line option, a new x86 feature "X86_FEATURE_SC_L1TF_VULN" is introduced.
This feature is used to patch the lfence instruction into the
barrier_nospec_true function. The feature is enabled only if L1TF-vulnerable
hardware is detected and the command line option does not prevent using it.

The status of hyperthreading is considered when deciding whether to
automatically enable the additional lfence instructions. Since platforms
without hyperthreading can still be vulnerable to L1TF if the L1 cache is
not flushed properly, the additional lfence instructions are patched in if
either hyperthreading is enabled or L1 cache flushing is missing.
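
For illustration, the option would be used on the Xen command line as follows
(hypothetical boot entries; only the spec-ctrl fragment is defined by this
patch):

    # force the barrier on, regardless of the auto-detection heuristic
    ... spec-ctrl=l1tf-barrier ...

    # explicitly disable the additional lfence barriers
    ... spec-ctrl=l1tf-barrier=0 ...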

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
 docs/misc/xen-command-line.pandoc | 14 ++++++++++----
 xen/arch/x86/spec_ctrl.c          | 17 +++++++++++++++--
 xen/include/asm-x86/cpufeatures.h |  1 +
 xen/include/asm-x86/spec_ctrl.h   |  1 +
 4 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -483,9 +483,9 @@ accounting for hardware capabilities as enumerated via CPUID.
 
 Currently accepted:
 
-The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`,
-`l1d-flush` and `ssbd` are used by default if available and applicable.  They can
-be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
+The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`, `l1d-flush`,
+`l1tf-barrier` and `ssbd` are used by default if available and applicable.  They
+can be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
 won't offer them to guests.
 
 ### cpuid_mask_cpu
@@ -1902,7 +1902,7 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`).
 ### spec-ctrl (x86)
 > `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb}=<bool>,
 >              bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
->              l1d-flush}=<bool> ]`
+>              l1d-flush,l1tf-barrier}=<bool> ]`
 
 Controls for speculative execution sidechannel mitigations.  By default, Xen
 will pick the most appropriate mitigations based on compiled in support,
@@ -1968,6 +1968,12 @@ Irrespective of Xen's setting, the feature is virtualised for HVM guests to
 use.  By default, Xen will enable this mitigation on hardware believed to be
 vulnerable to L1TF.
 
+On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force
+or prevent Xen from protecting evaluations inside the hypervisor with a barrier
+instruction to not load potentially secret information into L1 cache.  By
+default, Xen will enable this mitigation on hardware believed to be vulnerable
+to L1TF.
+
 ### sync_console
 > `= <boolean>`
 
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -21,6 +21,7 @@
 #include <xen/lib.h>
 #include <xen/warning.h>
 
+#include <asm/cpuid.h>
 #include <asm/microcode.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -50,6 +51,7 @@ bool __read_mostly opt_ibpb = true;
 bool __read_mostly opt_ssbd = false;
 int8_t __read_mostly opt_eager_fpu = -1;
 int8_t __read_mostly opt_l1d_flush = -1;
+int8_t __read_mostly opt_l1tf_barrier = -1;
 
 bool __initdata bsp_delay_spec_ctrl;
 uint8_t __read_mostly default_xen_spec_ctrl;
@@ -91,6 +93,8 @@ static int __init parse_spec_ctrl(const char *s)
             if ( opt_pv_l1tf_domu < 0 )
                 opt_pv_l1tf_domu = 0;
 
+            opt_l1tf_barrier = 0;
+
         disable_common:
             opt_rsb_pv = false;
             opt_rsb_hvm = false;
@@ -157,6 +161,8 @@ static int __init parse_spec_ctrl(const char *s)
             opt_eager_fpu = val;
         else if ( (val = parse_boolean("l1d-flush", s, ss)) >= 0 )
             opt_l1d_flush = val;
+        else if ( (val = parse_boolean("l1tf-barrier", s, ss)) >= 0 )
+            opt_l1tf_barrier = val;
         else
             rc = -EINVAL;
 
@@ -248,7 +254,7 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
                "\n");
 
     /* Settings for Xen's protection, irrespective of guests. */
-    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s\n",
+    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s%s\n",
            thunk == THUNK_NONE      ? "N/A" :
            thunk == THUNK_RETPOLINE ? "RETPOLINE" :
            thunk == THUNK_LFENCE    ? "LFENCE" :
@@ -258,7 +264,8 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
            !boot_cpu_has(X86_FEATURE_SSBD)           ? "" :
            (default_xen_spec_ctrl & SPEC_CTRL_SSBD)  ? " SSBD+" : " SSBD-",
            opt_ibpb                                  ? " IBPB"  : "",
-           opt_l1d_flush                             ? " L1D_FLUSH" : "");
+           opt_l1d_flush                             ? " L1D_FLUSH" : "",
+           opt_l1tf_barrier                          ? " L1TF_BARRIER" : "");
 
     /* L1TF diagnostics, printed if vulnerable or PV shadowing is in use. */
     if ( cpu_has_bug_l1tf || opt_pv_l1tf_hwdom || opt_pv_l1tf_domu )
@@ -842,6 +849,12 @@ void __init init_speculation_mitigations(void)
     else if ( opt_l1d_flush == -1 )
         opt_l1d_flush = cpu_has_bug_l1tf && !(caps & ARCH_CAPS_SKIP_L1DFL);
 
+    /* By default, enable L1TF_VULN on L1TF-vulnerable hardware */
+    if ( opt_l1tf_barrier == -1 )
+        opt_l1tf_barrier = cpu_has_bug_l1tf && (opt_smt || !opt_l1d_flush);
+    if ( opt_l1tf_barrier > 0 )
+        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);
+
     /*
      * We do not disable HT by default on affected hardware.
      *
diff --git a/xen/include/asm-x86/cpufeatures.h b/xen/include/asm-x86/cpufeatures.h
--- a/xen/include/asm-x86/cpufeatures.h
+++ b/xen/include/asm-x86/cpufeatures.h
@@ -25,6 +25,7 @@ XEN_CPUFEATURE(XEN_SMAP,        (FSCAPINTS+0)*32+11) /* SMAP gets used by Xen it
 XEN_CPUFEATURE(LFENCE_DISPATCH, (FSCAPINTS+0)*32+12) /* lfence set as Dispatch Serialising */
 XEN_CPUFEATURE(IND_THUNK_LFENCE,(FSCAPINTS+0)*32+13) /* Use IND_THUNK_LFENCE */
 XEN_CPUFEATURE(IND_THUNK_JMP,   (FSCAPINTS+0)*32+14) /* Use IND_THUNK_JMP */
+XEN_CPUFEATURE(SC_L1TF_VULN,    (FSCAPINTS+0)*32+15) /* L1TF protection required */
 XEN_CPUFEATURE(SC_MSR_PV,       (FSCAPINTS+0)*32+16) /* MSR_SPEC_CTRL used by Xen for PV */
 XEN_CPUFEATURE(SC_MSR_HVM,      (FSCAPINTS+0)*32+17) /* MSR_SPEC_CTRL used by Xen for HVM */
 XEN_CPUFEATURE(SC_RSB_PV,       (FSCAPINTS+0)*32+18) /* RSB overwrite needed for PV */
diff --git a/xen/include/asm-x86/spec_ctrl.h b/xen/include/asm-x86/spec_ctrl.h
--- a/xen/include/asm-x86/spec_ctrl.h
+++ b/xen/include/asm-x86/spec_ctrl.h
@@ -37,6 +37,7 @@ extern bool opt_ibpb;
 extern bool opt_ssbd;
 extern int8_t opt_eager_fpu;
 extern int8_t opt_l1d_flush;
+extern int8_t opt_l1tf_barrier;
 
 extern bool bsp_delay_spec_ctrl;
 extern uint8_t default_xen_spec_ctrl;
-- 
2.7.4





* [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec
  2019-03-14 12:50 L1TF Patch Series v10 Norbert Manthey
  2019-03-14 12:50 ` [PATCH L1TF v10 1/8] spec: add l1tf-barrier Norbert Manthey
@ 2019-03-14 12:50 ` Norbert Manthey
  2019-03-14 13:19   ` Jan Beulich
  2019-03-14 12:50 ` [PATCH L1TF v10 3/8] is_control_domain: block speculation Norbert Manthey
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 12:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel, Norbert Manthey

Due to the L1TF vulnerability of Intel CPUs, loading hypervisor data into
the L1 cache is problematic, because, when hyperthreading is in use, a
guest running on the sibling hyperthread can leak this potentially secret
data.

To prevent these speculative accesses, we block speculation after
accessing the domain property field by adding lfence instructions. This
way, the CPU continues executing and loading data only once the condition
is actually evaluated.

As this protection is typically used in if statements, the lfence has to
be usable inside a conditional expression. Therefore, a function that
returns true after executing an lfence instruction is introduced. To
protect both branches of a conditional, an lfence instruction has to be
issued on each of the two branches. To be able to block speculation after
several evaluations, the generic barrier macro block_speculation is also
introduced.
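
As a usage sketch (the helpers are the ones introduced below; the condition
and the accessed data are hypothetical):

    /* Neither branch starts loading data before the (guest-influenced)
     * condition has been architecturally evaluated. */
    if ( evaluate_nospec(d->is_privileged) )
        read_privileged_state(d);       /* hypothetical */
    else
        read_unprivileged_state(d);     /* hypothetical */

    /* Alternatively, after one or more earlier checks: */
    block_speculation();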

As the L1TF vulnerability is only present on the x86 architecture, there is
no need to add protection for other architectures. Hence, the introduced
functions are defined but empty.

On the x86 architecture, the lfence instruction is not emitted by default
either. Only when an L1TF-vulnerable platform is detected is the lfence
instruction patched in via alternative patching. Similarly, PV guests are
protected with respect to L1TF by default, so the protection is furthermore
disabled in case HVM support is excluded via the build configuration.

Introducing the lfence instructions catches a lot of potential leaks with
a simple, non-intrusive code change. During performance testing, we did not
notice any measurable performance impact.

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Acked-by: Julien Grall <julien.grall@arm.com>
---
 xen/include/asm-arm/nospec.h | 25 +++++++++++++++++++++++++
 xen/include/asm-x86/nospec.h | 39 +++++++++++++++++++++++++++++++++++++++
 xen/include/xen/nospec.h     |  1 +
 3 files changed, 65 insertions(+)
 create mode 100644 xen/include/asm-arm/nospec.h
 create mode 100644 xen/include/asm-x86/nospec.h

diff --git a/xen/include/asm-arm/nospec.h b/xen/include/asm-arm/nospec.h
new file mode 100644
--- /dev/null
+++ b/xen/include/asm-arm/nospec.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
+
+#ifndef _ASM_ARM_NOSPEC_H
+#define _ASM_ARM_NOSPEC_H
+
+static inline bool evaluate_nospec(bool condition)
+{
+    return condition;
+}
+
+static inline void block_speculation(void)
+{
+}
+
+#endif /* _ASM_ARM_NOSPEC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/nospec.h b/xen/include/asm-x86/nospec.h
new file mode 100644
--- /dev/null
+++ b/xen/include/asm-x86/nospec.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
+
+#ifndef _ASM_X86_NOSPEC_H
+#define _ASM_X86_NOSPEC_H
+
+#include <asm/alternative.h>
+
+/* Allow to insert a read memory barrier into conditionals */
+static always_inline bool barrier_nospec_true(void)
+{
+#ifdef CONFIG_HVM
+    alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN);
+#endif
+    return true;
+}
+
+/* Allow to protect evaluation of conditionals with respect to speculation */
+static always_inline bool evaluate_nospec(bool condition)
+{
+    return condition ? barrier_nospec_true() : !barrier_nospec_true();
+}
+
+/* Allow to block speculative execution in generic code */
+static always_inline void block_speculation(void)
+{
+    barrier_nospec_true();
+}
+
+#endif /* _ASM_X86_NOSPEC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -8,6 +8,7 @@
 #define XEN_NOSPEC_H
 
 #include <asm/system.h>
+#include <asm/nospec.h>
 
 /**
  * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
-- 
2.7.4





* [PATCH L1TF v10 3/8] is_control_domain: block speculation
  2019-03-14 12:50 L1TF Patch Series v10 Norbert Manthey
  2019-03-14 12:50 ` [PATCH L1TF v10 1/8] spec: add l1tf-barrier Norbert Manthey
  2019-03-14 12:50 ` [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec Norbert Manthey
@ 2019-03-14 12:50 ` Norbert Manthey
  2019-03-14 12:50 ` [PATCH L1TF v10 4/8] is_hvm/pv_domain: " Norbert Manthey
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 12:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel, Norbert Manthey

Checks of domain properties, such as is_hardware_domain or is_hvm_domain,
might be bypassed by speculatively executing the subsequent instructions.
One reason these checks can be bypassed is that the macros access the
domain structure via a pointer and check a certain field. Since this
memory access is slow, the CPU predicts the outcome and speculatively
continues execution.

In case an is_control_domain check is bypassed, for example during a
hypercall, data that should only be accessible by the control domain could
be loaded into the cache.
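
A sketch of the kind of code path this hardens (the surrounding handler and
the copied data are hypothetical):

    /* Before this change, the body below could be entered speculatively
     * even for an unprivileged domain, pulling control-domain-only data
     * into the L1 cache.  With evaluate_nospec() inside the macro, the
     * access cannot start before the check is settled. */
    if ( !is_control_domain(currd) )
        return -EPERM;
    copy_to_guest(arg, &privileged_info, 1);   /* hypothetical data */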

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Acked-by: Jan Beulich <jbeulich@suse.com>

---
 xen/include/xen/sched.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -913,10 +913,10 @@ void watchdog_domain_destroy(struct domain *d);
  *    (that is, this would not be suitable for a driver domain)
  *  - There is never a reason to deny the hardware domain access to this
  */
-#define is_hardware_domain(_d) ((_d) == hardware_domain)
+#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain)
 
 /* This check is for functionality specific to a control domain */
-#define is_control_domain(_d) ((_d)->is_privileged)
+#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged)
 
 #define VM_ASSIST(d, t) (test_bit(VMASST_TYPE_ ## t, &(d)->vm_assist))
 
-- 
2.7.4





* [PATCH L1TF v10 4/8] is_hvm/pv_domain: block speculation
  2019-03-14 12:50 L1TF Patch Series v10 Norbert Manthey
                   ` (2 preceding siblings ...)
  2019-03-14 12:50 ` [PATCH L1TF v10 3/8] is_control_domain: block speculation Norbert Manthey
@ 2019-03-14 12:50 ` Norbert Manthey
  2019-04-05 15:34     ` [Xen-devel] " Andrew Cooper
  2019-03-14 12:50 ` [PATCH L1TF v10 5/8] common/memory: block speculative out-of-bound accesses Norbert Manthey
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 12:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel, Norbert Manthey

When checking whether a domain is an HVM or a PV domain, we have to make
sure that speculation cannot bypass that check and potentially access
data that should not end up in the cache for the current domain type.

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Acked-by: Jan Beulich <jbeulich@suse.com>

---
 xen/include/xen/sched.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -922,7 +922,8 @@ void watchdog_domain_destroy(struct domain *d);
 
 static inline bool is_pv_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false;
+    return IS_ENABLED(CONFIG_PV)
+           ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
 }
 
 static inline bool is_pv_vcpu(const struct vcpu *v)
@@ -953,7 +954,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu *v)
 #endif
 static inline bool is_hvm_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false;
+    return IS_ENABLED(CONFIG_HVM)
+           ? evaluate_nospec(d->guest_type == guest_type_hvm) : false;
 }
 
 static inline bool is_hvm_vcpu(const struct vcpu *v)
-- 
2.7.4





* [PATCH L1TF v10 5/8] common/memory: block speculative out-of-bound accesses
  2019-03-14 12:50 L1TF Patch Series v10 Norbert Manthey
                   ` (3 preceding siblings ...)
  2019-03-14 12:50 ` [PATCH L1TF v10 4/8] is_hvm/pv_domain: " Norbert Manthey
@ 2019-03-14 12:50 ` Norbert Manthey
  2019-03-14 12:50 ` [PATCH L1TF v10 6/8] x86/hvm: add nospec to hvmop param Norbert Manthey
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 12:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel, Norbert Manthey

The get_page_from_gfn method returns a pointer to the page that belongs
to a gfn. Before returning the pointer, the gfn is checked for validity.
Under speculation, these checks can be bypassed, so that the function
get_page is still partially executed. Consequently, the function
page_get_owner_and_reference might be partially executed as well. In
that function, the computed pointer is accessed, resulting in a
speculative out-of-bound address load. As the gfn can be controlled by
a guest, this access is problematic.

To mitigate the root cause, an lfence instruction is added via the
evaluate_nospec macro. To make the protection generic, we do not
introduce the lfence instruction for this single check, but add it to
the mfn_valid function. This way, other potentially problematic accesses
are protected as well.
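
A minimal sketch of a protected caller (hypothetical, simplified surrounding
code; mfn_valid() wraps the hardened __mfn_valid() below):

    if ( !mfn_valid(_mfn(mfn)) )        /* guest-influenced frame number */
        return -EINVAL;
    /* The lfence inside evaluate_nospec() keeps this frame-table access
     * from being issued under a mispredicted bound check. */
    pg = mfn_to_page(_mfn(mfn));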

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Acked-by: Jan Beulich <jbeulich@suse.com>

---
 xen/common/pdx.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/common/pdx.c b/xen/common/pdx.c
--- a/xen/common/pdx.c
+++ b/xen/common/pdx.c
@@ -18,6 +18,7 @@
 #include <xen/init.h>
 #include <xen/mm.h>
 #include <xen/bitops.h>
+#include <xen/nospec.h>
 
 /* Parameters for PFN/MADDR compression. */
 unsigned long __read_mostly max_pdx;
@@ -33,8 +34,9 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
 
 bool __mfn_valid(unsigned long mfn)
 {
-    return likely(mfn < max_page) &&
-           likely(!(mfn & pfn_hole_mask)) &&
+    if ( unlikely(evaluate_nospec(mfn >= max_page)) )
+        return false;
+    return likely(!(mfn & pfn_hole_mask)) &&
            likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
                            pdx_group_valid));
 }
-- 
2.7.4





* [PATCH L1TF v10 6/8] x86/hvm: add nospec to hvmop param
  2019-03-14 12:50 L1TF Patch Series v10 Norbert Manthey
                   ` (4 preceding siblings ...)
  2019-03-14 12:50 ` [PATCH L1TF v10 5/8] common/memory: block speculative out-of-bound accesses Norbert Manthey
@ 2019-03-14 12:50 ` Norbert Manthey
  2019-03-14 12:50 ` [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
  2019-03-14 12:50 ` [PATCH L1TF v10 8/8] common/domain: " Norbert Manthey
  7 siblings, 0 replies; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 12:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel, Norbert Manthey

The params array of an HVM domain can be accessed with its get and set
functions. As the index is guest controlled, make sure no out-of-bound
accesses can be performed.

As we cannot influence how future compilers might arrange the instructions
that enforce the bound, we furthermore block speculation, so that execution
only continues once the bound check is reflected in the architectural state.
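
The resulting pattern, in simplified form (the final parameter read stands in
for the later, unmodified accesses in these functions):

    if ( a.index >= HVM_NR_PARAMS )
        return -EINVAL;

    /* The bound check above is architecturally settled before a.index is
     * used for any memory access. */
    block_speculation();

    value = d->arch.hvm.params[a.index];    /* simplified, hypothetical */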

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Acked-by: Jan Beulich <jbeulich@suse.com>

---
 xen/arch/x86/hvm/hvm.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -4135,6 +4135,9 @@ static int hvmop_set_param(
     if ( a.index >= HVM_NR_PARAMS )
         return -EINVAL;
 
+    /* Make sure the above bound check is not bypassed during speculation. */
+    block_speculation();
+
     d = rcu_lock_domain_by_any_id(a.domid);
     if ( d == NULL )
         return -ESRCH;
@@ -4401,6 +4404,9 @@ static int hvmop_get_param(
     if ( a.index >= HVM_NR_PARAMS )
         return -EINVAL;
 
+    /* Make sure the above bound check is not bypassed during speculation. */
+    block_speculation();
+
     d = rcu_lock_domain_by_any_id(a.domid);
     if ( d == NULL )
         return -ESRCH;
-- 
2.7.4





* [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses
  2019-03-14 12:50 L1TF Patch Series v10 Norbert Manthey
                   ` (5 preceding siblings ...)
  2019-03-14 12:50 ` [PATCH L1TF v10 6/8] x86/hvm: add nospec to hvmop param Norbert Manthey
@ 2019-03-14 12:50 ` Norbert Manthey
  2019-03-29 17:11   ` Jan Beulich
  2019-03-14 12:50 ` [PATCH L1TF v10 8/8] common/domain: " Norbert Manthey
  7 siblings, 1 reply; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 12:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel, Norbert Manthey

Guests can issue grant table operations and provide guest-controlled
data to them. This data is also used for memory loads. To avoid
speculative out-of-bound accesses, we use the array_index_nospec macro
where applicable. However, there are also memory accesses that cannot
be protected by a single array-index clamp, for example multiple accesses
in a row. To protect these, a nospec barrier is placed between the actual
range check and the accesses via the block_speculation macro.
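
Schematically, the two mechanisms are applied as follows (identifiers are
placeholders; the real uses are in the hunks below):

    /* Single array access: clamp the guest-provided reference. */
    ent = &table[array_index_nospec(ref, nr_entries)];

    /* Several dependent accesses: fence once between check and uses. */
    if ( ref >= nr_entries )
        return GNTST_bad_gntref;
    block_speculation();
    use(shared[ref], status[ref]);          /* placeholder accesses */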

Speculative execution is not blocked in case one of the following
properties is true:
 - the path cannot be triggered by the guest
 - the path does not return to the guest
 - the path does not result in an out-of-bound access
 - the path cannot be executed repeatedly
Only a path for which none of the above properties holds allows an
attacker to actually leak continuous chunks of memory. Therefore, we only
add the penalty of protective mechanisms where a potential speculative
out-of-bound access violates all of the above properties.

As different versions of grant tables use structures of different sizes,
and the status is encoded in a separate array for version 2, speculative
execution might perform out-of-bound accesses using the version 2 layout
while the table is actually using version 1. Hence, speculation is
prevented when accessing new memory based on the grant table version. In
cases where no version-dependent memory locations are accessed on the code
paths that follow an if statement, no protection is required. No such
memory locations are accessed in the following functions after a version
check:

 * _set_status, as the header memory layout is the same
 * unmap_common, as potentially touched memory locations are allocated
                 and initialized
 * gnttab_grow_table, as the touched memory is the same for each
                branch after the conditionals
 * gnttab_transfer, as no memory access depends on the conditional
 * release_grant_for_copy, as no out-of-bound access depends on this
                conditional
 * gnttab_set_version, as in case of a version change all the memory is
                touched in both cases
 * gnttab_release_mappings, as this function is called only during domain
                destruction and control is not returned to the guest
 * mem_sharing_gref_to_gfn, as potential dangerous memory accesses are
                covered by the next evaluate_nospec
 * gnttab_get_status_frame, as the potential dangerous memory accesses
                are protected in gnttab_get_status_frame_mfn
 * gnttab_usage_print, as this function cannot be triggered by the guest

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---

Notes:
  v10: extended commit message with explanation when to exclude comparisons
       minor change in gnttab_transfer due to rebase

 xen/common/grant_table.c | 97 +++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 75 insertions(+), 22 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -37,6 +37,7 @@
 #include <xen/paging.h>
 #include <xen/keyhandler.h>
 #include <xen/vmap.h>
+#include <xen/nospec.h>
 #include <xsm/xsm.h>
 #include <asm/flushtlb.h>
 
@@ -203,8 +204,9 @@ static inline unsigned int nr_status_frames(const struct grant_table *gt)
 }
 
 #define MAPTRACK_PER_PAGE (PAGE_SIZE / sizeof(struct grant_mapping))
-#define maptrack_entry(t, e) \
-    ((t)->maptrack[(e)/MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])
+#define maptrack_entry(t, e)                                                   \
+    ((t)->maptrack[array_index_nospec(e, (t)->maptrack_limit) /                \
+                                    MAPTRACK_PER_PAGE][(e) % MAPTRACK_PER_PAGE])
 
 static inline unsigned int
 nr_maptrack_frames(struct grant_table *t)
@@ -226,10 +228,23 @@ nr_maptrack_frames(struct grant_table *t)
 static grant_entry_header_t *
 shared_entry_header(struct grant_table *t, grant_ref_t ref)
 {
-    if ( t->gt_version == 1 )
+    switch ( t->gt_version )
+    {
+    case 1:
+        /* Returned values should be independent of speculative execution */
+        block_speculation();
         return (grant_entry_header_t*)&shared_entry_v1(t, ref);
-    else
+
+    case 2:
+        /* Returned values should be independent of speculative execution */
+        block_speculation();
         return &shared_entry_v2(t, ref).hdr;
+    }
+
+    ASSERT_UNREACHABLE();
+    block_speculation();
+
+    return NULL;
 }
 
 /* Active grant entry - used for shadowing GTF_permit_access grants. */
@@ -634,14 +649,24 @@ static unsigned int nr_grant_entries(struct grant_table *gt)
     case 1:
         BUILD_BUG_ON(f2e(INITIAL_NR_GRANT_FRAMES, 1) <
                      GNTTAB_NR_RESERVED_ENTRIES);
+
+        /* Make sure we return a value independently of speculative execution */
+        block_speculation();
         return f2e(nr_grant_frames(gt), 1);
+
     case 2:
         BUILD_BUG_ON(f2e(INITIAL_NR_GRANT_FRAMES, 2) <
                      GNTTAB_NR_RESERVED_ENTRIES);
+
+        /* Make sure we return a value independently of speculative execution */
+        block_speculation();
         return f2e(nr_grant_frames(gt), 2);
 #undef f2e
     }
 
+    ASSERT_UNREACHABLE();
+    block_speculation();
+
     return 0;
 }
 
@@ -963,9 +988,13 @@ map_grant_ref(
         PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
                  op->ref, rgt->domain->domain_id);
 
-    act = active_entry_acquire(rgt, op->ref);
+    /* This call ensures the above check cannot be bypassed speculatively */
     shah = shared_entry_header(rgt, op->ref);
-    status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
+    act = active_entry_acquire(rgt, op->ref);
+
+    /* Make sure we do not access memory speculatively */
+    status = evaluate_nospec(rgt->gt_version == 1) ? &shah->flags
+                                                 : &status_entry(rgt, op->ref);
 
     /* If already pinned, check the active domid and avoid refcnt overflow. */
     if ( act->pin &&
@@ -987,7 +1016,7 @@ map_grant_ref(
 
         if ( !act->pin )
         {
-            unsigned long gfn = rgt->gt_version == 1 ?
+            unsigned long gfn = evaluate_nospec(rgt->gt_version == 1) ?
                                 shared_entry_v1(rgt, op->ref).frame :
                                 shared_entry_v2(rgt, op->ref).full_page.frame;
 
@@ -1321,6 +1350,9 @@ unmap_common(
         goto unlock_out;
     }
 
+    /* Make sure the above bound check cannot be bypassed speculatively */
+    block_speculation();
+
     act = active_entry_acquire(rgt, op->ref);
 
     /*
@@ -1418,7 +1450,7 @@ unmap_common_complete(struct gnttab_unmap_common *op)
     struct page_info *pg;
     uint16_t *status;
 
-    if ( !op->done )
+    if ( evaluate_nospec(!op->done) )
     {
         /* unmap_common() didn't do anything - nothing to complete. */
         return;
@@ -2026,6 +2058,7 @@ gnttab_prepare_for_transfer(
         goto fail;
     }
 
+    /* This call ensures the above check cannot be bypassed speculatively */
     sha = shared_entry_header(rgt, ref);
 
     scombo.word = *(u32 *)&sha->flags;
@@ -2223,7 +2256,12 @@ gnttab_transfer(
         spin_unlock(&e->page_alloc_lock);
         okay = gnttab_prepare_for_transfer(e, d, gop.ref);
 
-        if ( unlikely(!okay || assign_pages(e, page, 0, MEMF_no_refcount)) )
+        /*
+         * Make sure the reference bound check in gnttab_prepare_for_transfer
+         * is respected and speculative execution is blocked accordingly
+         */
+        if ( unlikely(!evaluate_nospec(okay)) ||
+            unlikely(assign_pages(e, page, 0, MEMF_no_refcount)) )
         {
             bool drop_dom_ref;
 
@@ -2255,7 +2293,7 @@ gnttab_transfer(
         grant_read_lock(e->grant_table);
         act = active_entry_acquire(e->grant_table, gop.ref);
 
-        if ( e->grant_table->gt_version == 1 )
+        if ( evaluate_nospec(e->grant_table->gt_version == 1) )
         {
             grant_entry_v1_t *sha = &shared_entry_v1(e->grant_table, gop.ref);
 
@@ -2316,7 +2354,7 @@ release_grant_for_copy(
     sha = shared_entry_header(rgt, gref);
     mfn = act->mfn;
 
-    if ( rgt->gt_version == 1 )
+    if ( evaluate_nospec(rgt->gt_version == 1) )
     {
         status = &sha->flags;
         td = rd;
@@ -2410,9 +2448,11 @@ acquire_grant_for_copy(
         PIN_FAIL(gt_unlock_out, GNTST_bad_gntref,
                  "Bad grant reference %#x\n", gref);
 
-    act = active_entry_acquire(rgt, gref);
+    /* This call ensures the above check cannot be bypassed speculatively */
     shah = shared_entry_header(rgt, gref);
-    if ( rgt->gt_version == 1 )
+    act = active_entry_acquire(rgt, gref);
+
+    if ( evaluate_nospec(rgt->gt_version == 1) )
     {
         sha2 = NULL;
         status = &shah->flags;
@@ -2828,6 +2868,9 @@ static int gnttab_copy_buf(const struct gnttab_copy *op,
                  op->dest.offset, dest->ptr.offset,
                  op->len, dest->len);
 
+    /* Make sure the above checks are not bypassed speculatively */
+    block_speculation();
+
     memcpy(dest->virt + op->dest.offset, src->virt + op->source.offset,
            op->len);
     gnttab_mark_dirty(dest->domain, dest->mfn);
@@ -2947,7 +2990,7 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARAM(gnttab_set_version_t) uop)
     struct grant_table *gt = currd->grant_table;
     grant_entry_v1_t reserved_entries[GNTTAB_NR_RESERVED_ENTRIES];
     int res;
-    unsigned int i;
+    unsigned int i, nr_ents;
 
     if ( copy_from_guest(&op, uop, 1) )
         return -EFAULT;
@@ -2971,7 +3014,8 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARAM(gnttab_set_version_t) uop)
      * are allowed to be in use (xenstore/xenconsole keeps them mapped).
      * (You need to change the version number for e.g. kexec.)
      */
-    for ( i = GNTTAB_NR_RESERVED_ENTRIES; i < nr_grant_entries(gt); i++ )
+    nr_ents = nr_grant_entries(gt);
+    for ( i = GNTTAB_NR_RESERVED_ENTRIES; i < nr_ents; i++ )
     {
         if ( read_atomic(&_active_entry(gt, i).pin) != 0 )
         {
@@ -3213,6 +3257,9 @@ swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
     if ( unlikely(ref_b >= nr_grant_entries(d->grant_table)))
         PIN_FAIL(out, GNTST_bad_gntref, "Bad ref-b %#x\n", ref_b);
 
+    /* Make sure the above checks are not bypassed speculatively */
+    block_speculation();
+
     /* Swapping the same ref is a no-op. */
     if ( ref_a == ref_b )
         goto out;
@@ -3225,7 +3272,7 @@ swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
     if ( act_b->pin )
         PIN_FAIL(out, GNTST_eagain, "ref b %#x busy\n", ref_b);
 
-    if ( gt->gt_version == 1 )
+    if ( evaluate_nospec(gt->gt_version == 1) )
     {
         grant_entry_v1_t shared;
 
@@ -3682,13 +3729,14 @@ void grant_table_warn_active_grants(struct domain *d)
     struct grant_table *gt = d->grant_table;
     struct active_grant_entry *act;
     grant_ref_t ref;
-    unsigned int nr_active = 0;
+    unsigned int nr_active = 0, nr_ents;
 
 #define WARN_GRANT_MAX 10
 
     grant_read_lock(gt);
 
-    for ( ref = 0; ref != nr_grant_entries(gt); ref++ )
+    nr_ents = nr_grant_entries(gt);
+    for ( ref = 0; ref != nr_ents; ref++ )
     {
         act = active_entry_acquire(gt, ref);
         if ( !act->pin )
@@ -3773,7 +3821,7 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
         rc = -EINVAL;
     else if ( ref >= nr_grant_entries(gt) )
         rc = -ENOENT;
-    else if ( gt->gt_version == 1 )
+    else if ( evaluate_nospec(gt->gt_version == 1) )
     {
         const grant_entry_v1_t *sha1 = &shared_entry_v1(gt, ref);
 
@@ -3795,7 +3843,7 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
         rc = -ENXIO;
     else if ( !rc && status )
     {
-        if ( gt->gt_version == 1 )
+        if ( evaluate_nospec(gt->gt_version == 1) )
             *status = flags;
         else
             *status = status_entry(gt, ref);
@@ -3838,6 +3886,9 @@ static int gnttab_get_status_frame_mfn(struct domain *d,
             return -EINVAL;
     }
 
+    /* Make sure idx is bounded wrt nr_status_frames */
+    block_speculation();
+
     *mfn = _mfn(virt_to_mfn(gt->status[idx]));
     return 0;
 }
@@ -3879,7 +3930,7 @@ int gnttab_map_frame(struct domain *d, unsigned long idx, gfn_t gfn, mfn_t *mfn)
 
     grant_write_lock(gt);
 
-    if ( gt->gt_version == 2 && (idx & XENMAPIDX_grant_table_status) )
+    if ( evaluate_nospec(gt->gt_version == 2) && (idx & XENMAPIDX_grant_table_status) )
     {
         idx &= ~XENMAPIDX_grant_table_status;
         status = true;
@@ -3937,6 +3988,7 @@ static void gnttab_usage_print(struct domain *rd)
     int first = 1;
     grant_ref_t ref;
     struct grant_table *gt = rd->grant_table;
+    unsigned int nr_ents;
 
     printk("      -------- active --------       -------- shared --------\n");
     printk("[ref] localdom mfn      pin          localdom gmfn     flags\n");
@@ -3949,7 +4001,8 @@ static void gnttab_usage_print(struct domain *rd)
            nr_grant_frames(gt), gt->max_grant_frames,
            nr_maptrack_frames(gt), gt->max_maptrack_frames);
 
-    for ( ref = 0; ref != nr_grant_entries(gt); ref++ )
+    nr_ents = nr_grant_entries(gt);
+    for ( ref = 0; ref != nr_ents; ref++ )
     {
         struct active_grant_entry *act;
         struct grant_entry_header *sha;
-- 
2.7.4





* [PATCH L1TF v10 8/8] common/domain: block speculative out-of-bound accesses
  2019-03-14 12:50 L1TF Patch Series v10 Norbert Manthey
                   ` (6 preceding siblings ...)
  2019-03-14 12:50 ` [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
@ 2019-03-14 12:50 ` Norbert Manthey
  2019-03-14 13:20   ` Jan Beulich
  7 siblings, 1 reply; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 12:50 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel, Norbert Manthey

When issuing a vcpu_op hypercall, guests have control over the
vcpuid variable. In the old code, this allowed speculative
out-of-bound accesses to be performed. To block this, we make use
of the domain_vcpu function.
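
For reference, domain_vcpu() already bounds the index under speculation; its
shape is roughly the following (paraphrased from xen/include/xen/sched.h of
that time, shown here for illustration only):

    static inline struct vcpu *domain_vcpu(const struct domain *d,
                                           unsigned int vcpu_id)
    {
        unsigned int idx = array_index_nospec(vcpu_id, d->max_vcpus);

        /* An out-of-range id yields NULL; the clamped idx is used for the
         * actual array access, so no out-of-bound load happens, not even
         * speculatively. */
        return vcpu_id >= d->max_vcpus ? NULL : d->vcpu[idx];
    }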

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
---
 xen/common/domain.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/common/domain.c b/xen/common/domain.c
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1365,7 +1365,7 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
     struct vcpu *v;
     long rc = 0;
 
-    if ( vcpuid >= d->max_vcpus || (v = d->vcpu[vcpuid]) == NULL )
+    if ( (v = domain_vcpu(d, vcpuid)) == NULL )
         return -ENOENT;
 
     switch ( cmd )
-- 
2.7.4





* Re: [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec
  2019-03-14 12:50 ` [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec Norbert Manthey
@ 2019-03-14 13:19   ` Jan Beulich
  2019-03-14 13:21     ` Norbert Manthey
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2019-03-14 13:19 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Bjoern Doebel

>>> On 14.03.19 at 13:50, <nmanthey@amazon.de> wrote:
> Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
> L1 cache is problematic, because when hyperthreading is used as well, a
> guest running on the sibling core can leak this potentially secret data.
> 
> To prevent these speculative accesses, we block speculation after
> accessing the domain property field by adding lfence instructions. This
> way, the CPU continues executing and loading data only once the condition
> is actually evaluated.
> 
> As this protection is typically used in if statements, the lfence has to
> come in a compatible way. Therefore, a function that returns true after an
> lfence instruction is introduced. To protect both branches after a
> conditional, an lfence instruction has to be added for the two branches.
> To be able to block speculation after several evaluations, the generic
> barrier macro block_speculation is also introduced.
> 
> As the L1TF vulnerability is only present on the x86 architecture, there is
> no need to add protection for other architectures. Hence, the introduced
> functions are defined but empty.
> 
> On the x86 architecture, by default, the lfence instruction is not present
> either. Only when a L1TF vulnerable platform is detected, the lfence
> instruction is patched in via alternative patching. Similarly, PV guests
> are protected wrt L1TF by default, so that the protection is furthermore
> disabled in case HVM is exclueded via the build configuration.
> 
> Introducing the lfence instructions catches a lot of potential leaks with
> a simple unintrusive code change. During performance testing, we did not
> notice performance effects.
> 
> This is part of the speculative hardening effort.
> 
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
> Acked-by: Julien Grall <julien.grall@arm.com>

I did give my ack on v9, and I see no indication of changes which
may have invalidated it.

Jan




* Re: [PATCH L1TF v10 8/8] common/domain: block speculative out-of-bound accesses
  2019-03-14 12:50 ` [PATCH L1TF v10 8/8] common/domain: " Norbert Manthey
@ 2019-03-14 13:20   ` Jan Beulich
  0 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2019-03-14 13:20 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Bjoern Doebel

>>> On 14.03.19 at 13:50, <nmanthey@amazon.de> wrote:
> When issuing a vcpu_op hypercall, guests have control over the
> vcpuid variable. In the old code, this allowed to perform
> speculative out-of-bound accesses. To block this, we make use
> of the domain_vcpu function.
> 
> This is part of the speculative hardening effort.
> 
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




* Re: [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec
  2019-03-14 13:19   ` Jan Beulich
@ 2019-03-14 13:21     ` Norbert Manthey
  0 siblings, 0 replies; 25+ messages in thread
From: Norbert Manthey @ 2019-03-14 13:21 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Bjoern Doebel

On 3/14/19 14:19, Jan Beulich wrote:
>>>> On 14.03.19 at 13:50, <nmanthey@amazon.de> wrote:
>> Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
>> L1 cache is problematic, because when hyperthreading is used as well, a
>> guest running on the sibling core can leak this potentially secret data.
>>
>> To prevent these speculative accesses, we block speculation after
>> accessing the domain property field by adding lfence instructions. This
>> way, the CPU continues executing and loading data only once the condition
>> is actually evaluated.
>>
>> As this protection is typically used in if statements, the lfence has to
>> come in a compatible way. Therefore, a function that returns true after an
>> lfence instruction is introduced. To protect both branches after a
>> conditional, an lfence instruction has to be added for the two branches.
>> To be able to block speculation after several evaluations, the generic
>> barrier macro block_speculation is also introduced.
>>
>> As the L1TF vulnerability is only present on the x86 architecture, there is
>> no need to add protection for other architectures. Hence, the introduced
>> functions are defined but empty.
>>
>> On the x86 architecture, by default, the lfence instruction is not present
>> either. Only when a L1TF vulnerable platform is detected, the lfence
>> instruction is patched in via alternative patching. Similarly, PV guests
>> are protected wrt L1TF by default, so that the protection is furthermore
>> disabled in case HVM is exclueded via the build configuration.
>>
>> Introducing the lfence instructions catches a lot of potential leaks with
>> a simple unintrusive code change. During performance testing, we did not
>> notice performance effects.
>>
>> This is part of the speculative hardening effort.
>>
>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>> Acked-by: Julien Grall <julien.grall@arm.com>
> I did give my ack on v9, and I see no indication of changes which
> may have invalidated it.

That is a miss on my side.

Norbert

>
> Jan
>
>




* Re: [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses
  2019-03-14 12:50 ` [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
@ 2019-03-29 17:11   ` Jan Beulich
  2019-05-20 14:27       ` [Xen-devel] " Norbert Manthey
  0 siblings, 1 reply; 25+ messages in thread
From: Jan Beulich @ 2019-03-29 17:11 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Bjoern Doebel

>>> On 14.03.19 at 13:50, <nmanthey@amazon.de> wrote:
> Guests can issue grant table operations and provide guest controlled
> data to them. This data is also used for memory loads. To avoid
> speculative out-of-bound accesses, we use the array_index_nospec macro
> where applicable. However, there are also memory accesses that cannot
> be protected by a single array protection, or multiple accesses in a
> row. To protect these, a nospec barrier is placed between the actual
> range check and the access via the block_speculation macro.
> 
> Speculative execution is not blocked in case one of the following
> properties is true:
>  - path cannot be triggered by the guest
>  - path does not return to the guest
>  - path does not result in an out-of-bound access
>  - path cannot be executed repeatedly
> Only the combination of the above properties allows to actually leak
> continuous chunks of memory. Therefore, we only add the penalty of
> protective mechanisms in case a potential speculative out-of-bound
> access matches all the above properties.
> 
> As different versions of grant tables use structures of different size,
> and the status is encoded in an array for version 2, speculative
> execution might perform out-of-bound accesses of version 2 while
> the table is actually using version 1. Hence, speculation is prevented
> when accessing new memory based on the grant table version. In cases,
> where no different memory locations are accessed on the code path that
> follow an if statement, no protection is required. No different memory
> locations are accessed in the following functionsi after a version check:
> 
>  * _set_status, as the header memory layout is the same

Isn't this rather by virtue of shared_entry_header() having got
hardened? I don't think the memory layout alone can serve as a
reason for there to be no issue - the position in memory matters
as well.

>  * unmap_common, as potentially touched memory locations are allocated
>                  and initialized

I can't seem to spot any explicit version checks in unmap_common().
Do you mean unmap_common_complete()? If so I'm afraid I don't
understand what "allocated and initialized" is supposed to mean.
The version check there looks potentially problematic to me, at
least from a purely theoretical pov.

>  * gnttab_grow_table, as the touched memory is the same for each
>                 branch after the conditionals

How that? gnttab_populate_status_frames() could be speculated
into for a v1 guest.

Next there's a version check in gnttab_setup_table(), but the function
doesn't get changed and also isn't listed here.

>  * gnttab_transfer, as no memory access depends on the conditional
>  * release_grant_for_copy, as no out-of-bound access depends on this
>                 conditional

But you add evaluate_nospec() there, and memory accesses very well
look to depend on the condition, just not inside the bodies of the if/else.

>  * gnttab_set_version, as in case of a version change all the memory is
>                 touched in both cases

And you're sure speculation through NULL pointers is impossible? And
the offset-into-table differences between v1 and v2 don't matter?

>  * gnttab_release_mappings, as this function is called only during domain
>                 destruction and control is not returned to the guest
>  * mem_sharing_gref_to_gfn, as potential dangerous memory accesses are
>                 covered by the next evaluate_nospec
>  * gnttab_get_status_frame, as the potential dangerous memory accesses
>                 are protected in gnttab_get_status_frame_mfn

But there's quite a bit of code in gnttab_get_status_frame_mfn()
before the addition you make. But I guess speculation in particular
into gnttab_grow_table() might be safe?

> @@ -963,9 +988,13 @@ map_grant_ref(
>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>                   op->ref, rgt->domain->domain_id);
>  
> -    act = active_entry_acquire(rgt, op->ref);
> +    /* This call ensures the above check cannot be bypassed speculatively */
>      shah = shared_entry_header(rgt, op->ref);

I know I've come across this several times by now, but I'm afraid I
now get the impression that the comment kind of suggests that
the call is just for this purpose, instead of fulfilling the purpose as
a side effect. Would you mind adding "also" to this (and possible
further instances)? To avoid the line growing too long, perhaps
"call" could be dropped instead.

> @@ -2410,9 +2448,11 @@ acquire_grant_for_copy(
>          PIN_FAIL(gt_unlock_out, GNTST_bad_gntref,
>                   "Bad grant reference %#x\n", gref);
>  
> -    act = active_entry_acquire(rgt, gref);
> +    /* This call ensures the above check cannot be bypassed speculatively */
>      shah = shared_entry_header(rgt, gref);
> -    if ( rgt->gt_version == 1 )
> +    act = active_entry_acquire(rgt, gref);
> +
> +    if ( evaluate_nospec(rgt->gt_version == 1) )
>      {
>          sha2 = NULL;
>          status = &shah->flags;

What about the second version check further down in this function?

> @@ -3838,6 +3886,9 @@ static int gnttab_get_status_frame_mfn(struct domain *d,
>              return -EINVAL;
>      }
>  
> +    /* Make sure idx is bounded wrt nr_status_frames */
> +    block_speculation();
> +
>      *mfn = _mfn(virt_to_mfn(gt->status[idx]));

Why not array_access_nospec()? And how is this different from
gnttab_get_shared_frame_mfn(), which you don't change?

Jan



* Re: [PATCH L1TF v10 4/8] is_hvm/pv_domain: block speculation
@ 2019-04-05 15:34     ` Andrew Cooper
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Cooper @ 2019-04-05 15:34 UTC (permalink / raw)
  To: Norbert Manthey, xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Tim Deegan, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel

On 14/03/2019 12:50, Norbert Manthey wrote:
> When checking for being an hvm domain, or PV domain, we have to make
> sure that speculation cannot bypass that check, and eventually access
> data that should not end up in cache for the current domain type.
>
> This is part of the speculative hardening effort.
>
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
> Acked-by: Jan Beulich <jbeulich@suse.com>
>
> ---
>  xen/include/xen/sched.h | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -922,7 +922,8 @@ void watchdog_domain_destroy(struct domain *d);
>  
>  static inline bool is_pv_domain(const struct domain *d)
>  {
> -    return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false;
> +    return IS_ENABLED(CONFIG_PV)
> +           ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
>  }
>  
>  static inline bool is_pv_vcpu(const struct vcpu *v)
> @@ -953,7 +954,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu *v)
>  #endif
>  static inline bool is_hvm_domain(const struct domain *d)
>  {
> -    return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false;
> +    return IS_ENABLED(CONFIG_HVM)
> +           ? evaluate_nospec(d->guest_type == guest_type_hvm) : false;
>  }

Unfortunately, this breaks the livepatch build, in a rather insidious
way.  The asm inside evaluate_nospec() forces these out-of-line.

Duplicate symbol 'domctl.c#is_pv_domain.isra.0' (ffff82d080270390 !=
ffff82d08036a005)

On forcing these to be always_inline, the next bit of breakage occurs at:

Duplicate symbol 'domctl.c#is_pv_32bit_domain' (ffff82d080270270 !=
ffff82d08036a000)

which means we now have a goose chase of throwing always_inline through
the code.
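
A sketch of what that implies for one of the helpers (assuming Xen's
always_inline wrapper; illustrative only, not the eventual fix):

    /* Forcing the helper back inline avoids the per-object local copies of
     * is_pv_domain() that produced the duplicate symbols above. */
    static always_inline bool is_pv_domain(const struct domain *d)
    {
        return IS_ENABLED(CONFIG_PV)
               ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
    }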

I'm going to have to revert this in XenServer, especially seeing as
OSSTest is currently out of action, which AFAICT, makes XenServer's
testing the only automatic testing which staging is getting.

I also need to see about upstreaming our local patch which causes the
build to properly fail in cases which we know will break the ability to
build livepatches.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH L1TF v10 4/8] is_hvm/pv_domain: block speculation
@ 2019-04-05 18:29       ` Norbert Manthey
  0 siblings, 0 replies; 25+ messages in thread
From: Norbert Manthey @ 2019-04-05 18:29 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Tim Deegan, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel

On 4/5/19 17:34, Andrew Cooper wrote:
> On 14/03/2019 12:50, Norbert Manthey wrote:
>> When checking for being an hvm domain, or PV domain, we have to make
>> sure that speculation cannot bypass that check, and eventually access
>> data that should not end up in cache for the current domain type.
>>
>> This is part of the speculative hardening effort.
>>
>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>
>> ---
>>  xen/include/xen/sched.h | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>> --- a/xen/include/xen/sched.h
>> +++ b/xen/include/xen/sched.h
>> @@ -922,7 +922,8 @@ void watchdog_domain_destroy(struct domain *d);
>>  
>>  static inline bool is_pv_domain(const struct domain *d)
>>  {
>> -    return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false;
>> +    return IS_ENABLED(CONFIG_PV)
>> +           ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
>>  }
>>  
>>  static inline bool is_pv_vcpu(const struct vcpu *v)
>> @@ -953,7 +954,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu *v)
>>  #endif
>>  static inline bool is_hvm_domain(const struct domain *d)
>>  {
>> -    return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false;
>> +    return IS_ENABLED(CONFIG_HVM)
>> +           ? evaluate_nospec(d->guest_type == guest_type_hvm) : false;
>>  }
> Unfortunately, this breaks the livepatch build, in a rather insidious
> way.  The asm inside evaluate_nospec() forces these out-of-line.
>
> Duplicate symbol 'domctl.c#is_pv_domain.isra.0' (ffff82d080270390 !=
> ffff82d08036a005)
>
> On forcing these to be always_inline, the next bit of breakage occurs at:
>
> Duplicate symbol 'domctl.c#is_pv_32bit_domain' (ffff82d080270270 !=
> ffff82d08036a000)
>
> which means we now have a goose chase of throwing always_inline through
> the code.
>
> I'm going to have to revert this in XenServer, especially seeing as
> OSSTest is currently out of action, which AFAICT, makes XenServer's
> testing the only automatic testing which staging is getting.
>
> I also need to see about upstreaming our local patch which causes the
> build to properly fail in cases which we know will break the ability to
> build livepatches.

I see, thanks for the pointer. I guess the build only fails in case you
patch some function that needs one of the two? In case you have
something that reproduces the problem, and can be shared, we might have
a look. Thanks!

Best,
Norbert




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH L1TF v10 4/8] is_hvm/pv_domain: block speculation
@ 2019-04-05 18:38         ` Andrew Cooper
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Cooper @ 2019-04-05 18:38 UTC (permalink / raw)
  To: Norbert Manthey, xen-devel
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Tim Deegan, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Bjoern Doebel

On 05/04/2019 19:29, Norbert Manthey wrote:
> On 4/5/19 17:34, Andrew Cooper wrote:
>> On 14/03/2019 12:50, Norbert Manthey wrote:
>>> When checking for being an hvm domain, or PV domain, we have to make
>>> sure that speculation cannot bypass that check, and eventually access
>>> data that should not end up in cache for the current domain type.
>>>
>>> This is part of the speculative hardening effort.
>>>
>>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>>
>>> ---
>>>  xen/include/xen/sched.h | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -922,7 +922,8 @@ void watchdog_domain_destroy(struct domain *d);
>>>  
>>>  static inline bool is_pv_domain(const struct domain *d)
>>>  {
>>> -    return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false;
>>> +    return IS_ENABLED(CONFIG_PV)
>>> +           ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
>>>  }
>>>  
>>>  static inline bool is_pv_vcpu(const struct vcpu *v)
>>> @@ -953,7 +954,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu *v)
>>>  #endif
>>>  static inline bool is_hvm_domain(const struct domain *d)
>>>  {
>>> -    return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false;
>>> +    return IS_ENABLED(CONFIG_HVM)
>>> +           ? evaluate_nospec(d->guest_type == guest_type_hvm) : false;
>>>  }
>> Unfortunately, this breaks the livepatch build, in a rather insidious
>> way.  The asm inside evaluate_nospec() forces these out-of-line.
>>
>> Duplicate symbol 'domctl.c#is_pv_domain.isra.0' (ffff82d080270390 !=
>> ffff82d08036a005)
>>
>> On forcing these to be always_inline, the next bit of breakage occurs at:
>>
>> Duplicate symbol 'domctl.c#is_pv_32bit_domain' (ffff82d080270270 !=
>> ffff82d08036a000)
>>
>> which means we now have a goose chase of throwing always_inline through
>> the code.
>>
>> I'm going to have to revert this in XenServer, especially seeing as
>> OSSTest is currently out of action, which AFAICT, makes XenServer's
>> testing the only automatic testing which staging is getting.
>>
>> I also need to see about upstreaming our local patch which causes the
>> build to properly fail in cases which we know will break the ability to
>> build livepatches.
> I see, thanks for the pointer. I guess the build only fails in case you
> patch some function that needs one of the two? In case you have
> something that reproduces the problem, and can be shared, we might have
> a look. Thanks!

The presence of duplicate symbols breaks the binary diffing logic in the
livepatch build process itself.

For runtime patch application, you are correct that it is fine in
practice so long as you don't care about patching is_pv_domain() itself.


One part of livepatching work which appears not to have been upstreamed
(and I'm trying to fix this right now), is the bit which causes a hard
failure of the hypervisor build when you select CONFIG_LIVEPATCH and
duplicate symbols are detected.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH L1TF v10 4/8] is_hvm/pv_domain: block speculation
@ 2019-04-08  9:19         ` Jan Beulich
  0 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2019-04-08  9:19 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Bjoern Doebel

>>> On 05.04.19 at 20:29, <nmanthey@amazon.de> wrote:
> On 4/5/19 17:34, Andrew Cooper wrote:
>> On 14/03/2019 12:50, Norbert Manthey wrote:
>>> When checking for being an hvm domain, or PV domain, we have to make
>>> sure that speculation cannot bypass that check, and eventually access
>>> data that should not end up in cache for the current domain type.
>>>
>>> This is part of the speculative hardening effort.
>>>
>>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>>> Acked-by: Jan Beulich <jbeulich@suse.com>
>>>
>>> ---
>>>  xen/include/xen/sched.h | 6 ++++--
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
>>> --- a/xen/include/xen/sched.h
>>> +++ b/xen/include/xen/sched.h
>>> @@ -922,7 +922,8 @@ void watchdog_domain_destroy(struct domain *d);
>>>  
>>>  static inline bool is_pv_domain(const struct domain *d)
>>>  {
>>> -    return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false;
>>> +    return IS_ENABLED(CONFIG_PV)
>>> +           ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
>>>  }
>>>  
>>>  static inline bool is_pv_vcpu(const struct vcpu *v)
>>> @@ -953,7 +954,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu 
> *v)
>>>  #endif
>>>  static inline bool is_hvm_domain(const struct domain *d)
>>>  {
>>> -    return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false;
>>> +    return IS_ENABLED(CONFIG_HVM)
>>> +           ? evaluate_nospec(d->guest_type == guest_type_hvm) : false;
>>>  }
>> Unfortunately, this breaks the livepatch build, in a rather insidious
>> way.  The asm inside evaluate_nospec() forces these out-of-line.
>>
>> Duplicate symbol 'domctl.c#is_pv_domain.isra.0' (ffff82d080270390 !=
>> ffff82d08036a005)
>>
>> On forcing these to be always_inline, the next bit of breakage occurs at:
>>
>> Duplicate symbol 'domctl.c#is_pv_32bit_domain' (ffff82d080270270 !=
>> ffff82d08036a000)
>>
>> which means we now have a goose chase of throwing always_inline through
>> the code.
>>
>> I'm going to have to revert this in XenServer, especially seeing as
>> OSSTest is currently out of action, which AFAICT, makes XenServer's
>> testing the only automatic testing which staging is getting.
>>
>> I also need to see about upstreaming our local patch which causes the
>> build to properly fail in cases which we know will break the ability to
>> build livepatches.
> 
> I see, thanks for the pointer. I guess the build only fails in case you
> patch some function that needs one of the two? In case you have
> something that reproduces the problem, and can be shared, we might have
> a look. Thanks!

I too did notice the issue while committing your patches, so there's
no special setup needed (might be compiler version dependent,
though). I merely failed to make the connection to live-patching, or
else I would have refrained from pushing the changes.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses
@ 2019-05-20 14:27       ` Norbert Manthey
  0 siblings, 0 replies; 25+ messages in thread
From: Norbert Manthey @ 2019-05-20 14:27 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Bjoern Doebel

I looked into these changes again after a while. I will split this
larger commit into smaller ones and address parts of the problem in
each of them separately.

On 3/29/19 18:11, Jan Beulich wrote:
>>>> On 14.03.19 at 13:50, <nmanthey@amazon.de> wrote:
>> Guests can issue grant table operations and provide guest controlled
>> data to them. This data is also used for memory loads. To avoid
>> speculative out-of-bound accesses, we use the array_index_nospec macro
>> where applicable. However, there are also memory accesses that cannot
>> be protected by a single array protection, or multiple accesses in a
>> row. To protect these, a nospec barrier is placed between the actual
>> range check and the access via the block_speculation macro.
>>
>> Speculative execution is not blocked in case one of the following
>> properties is true:
>>  - path cannot be triggered by the guest
>>  - path does not return to the guest
>>  - path does not result in an out-of-bound access
>>  - path cannot be executed repeatedly
>> Only the combination of the above properties allows to actually leak
>> continuous chunks of memory. Therefore, we only add the penalty of
>> protective mechanisms in case a potential speculative out-of-bound
>> access matches all the above properties.
>>
>> As different versions of grant tables use structures of different size,
>> and the status is encoded in an array for version 2, speculative
>> execution might perform out-of-bound accesses of version 2 while
>> the table is actually using version 1. Hence, speculation is prevented
>> when accessing new memory based on the grant table version. In cases,
>> where no different memory locations are accessed on the code path that
>> follow an if statement, no protection is required. No different memory
>> locations are accessed in the following functions after a version check:
>>
>>  * _set_status, as the header memory layout is the same
> Isn't this rather by virtue of shared_entry_header() having got
> hardened? I don't think the memory layout alone can serve as a
> reason for there to be no issue - the position in memory matters
> as well.
To be on the safe side, I will add a fix here as well.
>
>>  * unmap_common, as potentially touched memory locations are allocated
>>                  and initialized
> I can't seem to spot any explicit version checks in unmap_common().
> Do you mean unmap_common_complete()? If so I'm afraid I don't
> understand what "allocated and initialized" is supposed to mean.
> The version check there looks potentially problematic to me, at
> least from a purely theoretical pov.
That likely meant unmap_common_complete, and that one will be fixed.
>
>>  * gnttab_grow_table, as the touched memory is the same for each
>>                 branch after the conditionals
> How that? gnttab_populate_status_frames() could be speculated
> into for a v1 guest.
>
> Next there's a version check in gnttab_setup_table(), but the function
> doesn't get changed and also isn't listed here.
I will address both.
>
>>  * gnttab_transfer, as no memory access depends on the conditional
>>  * release_grant_for_copy, as no out-of-bound access depends on this
>>                 conditional
> But you add evaluate_nospec() there, and memory accesses very well
> look to depend on the condition, just not inside the bodies of the if/else.
That seems to be a leftover. This function is actually fixed.
>
>>  * gnttab_set_version, as in case of a version change all the memory is
>>                 touched in both cases
> And you're sure speculation through NULL pointers is impossible? And
> the offset-into-table differences between v1 and v2 don't matter?
Yes, I think this is good enough.
>
>>  * gnttab_release_mappings, as this function is called only during domain
>>                 destruction and control is not returned to the guest
>>  * mem_sharing_gref_to_gfn, as potential dangerous memory accesses are
>>                 covered by the next evaluate_nospec
>>  * gnttab_get_status_frame, as the potential dangerous memory accesses
>>                 are protected in gnttab_get_status_frame_mfn
> But there's quite a bit of code in gnttab_get_status_frame_mfn()
> before the addition you make. But I guess speculation in particular
> into gnttab_grow_table() might be safe?
I think this is safe, yes.
>
>> @@ -963,9 +988,13 @@ map_grant_ref(
>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>                   op->ref, rgt->domain->domain_id);
>>  
>> -    act = active_entry_acquire(rgt, op->ref);
>> +    /* This call ensures the above check cannot be bypassed speculatively */
>>      shah = shared_entry_header(rgt, op->ref);
> I know I've come across this several times by now, but I'm afraid I
> now get the impression that the comment kind of suggests that
> the call is just for this purpose, instead of fulfilling the purpose as
> a side effect. Would you mind adding "also" to this (and possible
> further instances)? To avoid the line growing too long, perhaps
> "call" could be dropped instead.
Yes, I will add an 'also' to these calls.
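For instance, the reworded comment might read (a sketch of the wording agreed
above, not the committed text):

    /* This also ensures the above check cannot be bypassed speculatively */
    shah = shared_entry_header(rgt, op->ref);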
>
>> @@ -2410,9 +2448,11 @@ acquire_grant_for_copy(
>>          PIN_FAIL(gt_unlock_out, GNTST_bad_gntref,
>>                   "Bad grant reference %#x\n", gref);
>>  
>> -    act = active_entry_acquire(rgt, gref);
>> +    /* This call ensures the above check cannot be bypassed speculatively */
>>      shah = shared_entry_header(rgt, gref);
>> -    if ( rgt->gt_version == 1 )
>> +    act = active_entry_acquire(rgt, gref);
>> +
>> +    if ( evaluate_nospec(rgt->gt_version == 1) )
>>      {
>>          sha2 = NULL;
>>          status = &shah->flags;
> What about the second version check further down in this function?
That one should be fine, as it exits the function afterwards, and
represents an abort path that is valid for both versions.
>
>> @@ -3838,6 +3886,9 @@ static int gnttab_get_status_frame_mfn(struct domain *d,
>>              return -EINVAL;
>>      }
>>  
>> +    /* Make sure idx is bounded wrt nr_status_frames */
>> +    block_speculation();
>> +
>>      *mfn = _mfn(virt_to_mfn(gt->status[idx]));
> Why not array_access_nospec()? And how is this different from
> gnttab_get_shared_frame_mfn(), which you don't change?

This idx access is to a version-dependent structure, and hence
array_index_nospec alone is not enough to also catch the version
difference case.

Best,
Norbert





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses
@ 2019-05-21  9:41         ` Jan Beulich
  0 siblings, 0 replies; 25+ messages in thread
From: Jan Beulich @ 2019-05-21  9:41 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Bjoern Doebel

>>> On 20.05.19 at 16:27, <nmanthey@amazon.de> wrote:
> On 3/29/19 18:11, Jan Beulich wrote:
>>>>> On 14.03.19 at 13:50, <nmanthey@amazon.de> wrote:
>>> @@ -2410,9 +2448,11 @@ acquire_grant_for_copy(
>>>          PIN_FAIL(gt_unlock_out, GNTST_bad_gntref,
>>>                   "Bad grant reference %#x\n", gref);
>>>  
>>> -    act = active_entry_acquire(rgt, gref);
>>> +    /* This call ensures the above check cannot be bypassed speculatively */
>>>      shah = shared_entry_header(rgt, gref);
>>> -    if ( rgt->gt_version == 1 )
>>> +    act = active_entry_acquire(rgt, gref);
>>> +
>>> +    if ( evaluate_nospec(rgt->gt_version == 1) )
>>>      {
>>>          sha2 = NULL;
>>>          status = &shah->flags;
>> What about the second version check further down in this function?
> That one should be fine, as it exits the function afterwards, and
> represents an abort path that is valid for both versions.

That's not obvious from looking at just the if() in question. For
example, fixup_status_for_copy_pin() gets handed "status" as
an argument, which is version dependent. It seems quite likely
indeed that no code changes are needed here, but this imo again needs
discussing/explaining in the commit message.

>>> @@ -3838,6 +3886,9 @@ static int gnttab_get_status_frame_mfn(struct domain *d,
>>>              return -EINVAL;
>>>      }
>>>  
>>> +    /* Make sure idx is bounded wrt nr_status_frames */
>>> +    block_speculation();
>>> +
>>>      *mfn = _mfn(virt_to_mfn(gt->status[idx]));
>> Why not array_access_nospec()? And how is this different from
>> gnttab_get_shared_frame_mfn(), which you don't change?
> 
> This idx access is to a version-dependent structure, and hence
> array_index_nospec alone is not enough to also catch the version
> difference case.

But the comment talks about the array bound only.
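
A possible rewording that would cover both conditions (a sketch, not the
committed wording):

    /* Make sure neither the version check nor the bounding of idx wrt
     * nr_status_frames can be bypassed speculatively. */
    block_speculation();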

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-05-21  9:41 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-14 12:50 L1TF Patch Series v10 Norbert Manthey
2019-03-14 12:50 ` [PATCH L1TF v10 1/8] spec: add l1tf-barrier Norbert Manthey
2019-03-14 12:50 ` [PATCH L1TF v10 2/8] nospec: introduce evaluate_nospec Norbert Manthey
2019-03-14 13:19   ` Jan Beulich
2019-03-14 13:21     ` Norbert Manthey
2019-03-14 12:50 ` [PATCH L1TF v10 3/8] is_control_domain: block speculation Norbert Manthey
2019-03-14 12:50 ` [PATCH L1TF v10 4/8] is_hvm/pv_domain: " Norbert Manthey
2019-04-05 15:34   ` Andrew Cooper
2019-04-05 15:34     ` [Xen-devel] " Andrew Cooper
2019-04-05 18:29     ` Norbert Manthey
2019-04-05 18:29       ` [Xen-devel] " Norbert Manthey
2019-04-05 18:38       ` Andrew Cooper
2019-04-05 18:38         ` [Xen-devel] " Andrew Cooper
2019-04-08  9:19       ` Jan Beulich
2019-04-08  9:19         ` [Xen-devel] " Jan Beulich
2019-03-14 12:50 ` [PATCH L1TF v10 5/8] common/memory: block speculative out-of-bound accesses Norbert Manthey
2019-03-14 12:50 ` [PATCH L1TF v10 6/8] x86/hvm: add nospec to hvmop param Norbert Manthey
2019-03-14 12:50 ` [PATCH L1TF v10 7/8] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
2019-03-29 17:11   ` Jan Beulich
2019-05-20 14:27     ` Norbert Manthey
2019-05-20 14:27       ` [Xen-devel] " Norbert Manthey
2019-05-21  9:41       ` Jan Beulich
2019-05-21  9:41         ` [Xen-devel] " Jan Beulich
2019-03-14 12:50 ` [PATCH L1TF v10 8/8] common/domain: " Norbert Manthey
2019-03-14 13:20   ` Jan Beulich
