* SpectreV1+L1TF Patch Series
@ 2019-01-23 11:51 Norbert Manthey
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation Norbert Manthey
                   ` (13 more replies)
  0 siblings, 14 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel

Dear all,

This patch series attempts to mitigate the issues that have been raised in
XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
execution on Intel hardware, an lfence instruction is required to make sure
that selected checks are not bypassed. Speculative out-of-bound accesses can
be prevented by using the array_index_nospec macro.

The lfence instruction should be added on x86 platforms only. To avoid
penalizing platforms that are not affected by the L1TF vulnerability, the
lfence instruction is patched in via alternative patching on Intel CPUs only.
Furthermore, a compile time configuration option allows choosing how the
evaluation of conditions is protected with the lfence instruction.

Best,
Norbert




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 150+ messages in thread

* [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
@ 2019-01-23 11:51 ` Norbert Manthey
  2019-01-23 13:07   ` Jan Beulich
  2019-01-23 13:20   ` Jan Beulich
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 02/11] is_hvm/pv_domain: " Norbert Manthey
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

Checks of domain properties, such as is_hardware_domain or is_hvm_domain,
might be bypassed under speculative execution. These checks can be
bypassed because the macros access the domain structure via a pointer and
then test a particular field. Since this memory access is slow, the CPU
predicts the loaded value and speculatively continues execution.

In case an is_control_domain check is bypassed, for example during a
hypercall, data that should only be accessible by the control domain could
be loaded into the cache.

Due to the L1TF vulnerability of Intel CPUs, loading hypervisor data into
the L1 cache is problematic, because when hyperthreading is in use, a
guest running on the sibling hyperthread can leak this potentially secret
data.

To prevent these speculative accesses, we block speculation after
accessing the domain property field by adding lfence instructions. This
way, the CPU continues executing and loading data only once the condition
is actually evaluated.

As the macros are typically used in if statements, the lfence has to be
introduced in a compatible way. Therefore, a function that returns true
after executing an lfence instruction is introduced. To protect both
branches of a conditional, an lfence instruction has to be added on each
of the two paths. As the L1TF vulnerability is only present on the x86
architecture, the macros do not use the lfence instruction on other
architectures.

Introducing the lfence instructions catches a lot of potential leaks with
a simple, unintrusive code change. During performance testing, we did not
notice any performance effects.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/include/xen/nospec.h | 15 +++++++++++++++
 xen/include/xen/sched.h  |  5 +++--
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -58,6 +58,21 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
     (typeof(_i)) (_i & _mask);                                          \
 })
 
+/*
+ * allow to insert a read memory barrier into conditionals
+ */
+#ifdef CONFIG_X86
+static inline bool lfence_true(void) { rmb(); return true; }
+#else
+static inline bool lfence_true(void) { return true; }
+#endif
+
+/*
+ * protect evaluation of conditional with respect to speculation
+ */
+#define evaluate_nospec(condition)                                      \
+    (((condition) && lfence_true()) || !lfence_true())
+
 #endif /* XEN_NOSPEC_H */
 
 /*
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -22,6 +22,7 @@
 #include <asm/atomic.h>
 #include <xen/vpci.h>
 #include <xen/wait.h>
+#include <xen/nospec.h>
 #include <public/xen.h>
 #include <public/domctl.h>
 #include <public/sysctl.h>
@@ -882,10 +883,10 @@ void watchdog_domain_destroy(struct domain *d);
  *    (that is, this would not be suitable for a driver domain)
  *  - There is never a reason to deny the hardware domain access to this
  */
-#define is_hardware_domain(_d) ((_d) == hardware_domain)
+#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain)
 
 /* This check is for functionality specific to a control domain */
-#define is_control_domain(_d) ((_d)->is_privileged)
+#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged)
 
 #define VM_ASSIST(d, t) (test_bit(VMASST_TYPE_ ## t, &(d)->vm_assist))
 
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 02/11] is_hvm/pv_domain: block speculation
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation Norbert Manthey
@ 2019-01-23 11:51 ` Norbert Manthey
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option Norbert Manthey
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

When checking whether a domain is an HVM or PV domain, we have to make
sure that speculation cannot bypass that check and subsequently load data
into the cache that should not end up there for the current domain type.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/include/xen/sched.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -892,7 +892,8 @@ void watchdog_domain_destroy(struct domain *d);
 
 static inline bool is_pv_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false;
+    return IS_ENABLED(CONFIG_PV)
+           ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
 }
 
 static inline bool is_pv_vcpu(const struct vcpu *v)
@@ -923,7 +924,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu *v)
 #endif
 static inline bool is_hvm_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false;
+    return IS_ENABLED(CONFIG_HVM)
+           ? evaluate_nospec(d->guest_type == guest_type_hvm) : false;
 }
 
 static inline bool is_hvm_vcpu(const struct vcpu *v)
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation Norbert Manthey
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 02/11] is_hvm/pv_domain: " Norbert Manthey
@ 2019-01-23 11:51 ` Norbert Manthey
  2019-01-23 13:18   ` Jan Beulich
                     ` (2 more replies)
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 04/11] x86/hvm: block speculative out-of-bound accesses Norbert Manthey
                   ` (10 subsequent siblings)
  13 siblings, 3 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

This commit introduces the configuration option L1TF_LFENCE, which
controls how the protection of privilege checks via lfence instructions
is implemented. The following four alternatives are provided:

 - do not inject lfence instructions
 - inject an lfence instruction for both outcomes of the conditional
 - inject an lfence instruction only if the conditional evaluates to
   true, so that this case cannot be entered under speculation
 - evaluate the condition and store the result in a local variable;
   before using this value, inject an lfence instruction

The different options allow trading the level of protection against the
slowdown the additional lfence instructions introduce. The default is to
protect both branches.

For non-x86 platforms, the protection is disabled by default.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/arch/x86/Kconfig     | 24 ++++++++++++++++++++++++
 xen/include/xen/nospec.h | 12 ++++++++++--
 2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -176,6 +176,30 @@ config PV_SHIM_EXCLUSIVE
 	  firmware, and will not function correctly in other scenarios.
 
 	  If unsure, say N.
+
+choice
+	prompt "Default L1TF Branch Protection?"
+
+	config L1TF_LFENCE_BOTH
+		bool "Protect both branches of certain conditionals" if HVM
+		---help---
+		  Inject an lfence instruction after the condition to be
+		  evaluated for both outcomes of the condition
+	config L1TF_LFENCE_TRUE
+		bool "Protect true branch of certain conditionals" if HVM
+		---help---
+		  Protect only the path where the condition is evaluated to true
+	config L1TF_LFENCE_INTERMEDIATE
+		bool "Protect before using certain conditionals value" if HVM
+		---help---
+		  Inject an lfence instruction after evaluating the condition
+		  but before forwarding this value from a local variable
+	config L1TF_LFENCE_NONE
+		bool "No conditional protection"
+		---help---
+		  Do not inject lfences for conditional evaluations
+endchoice
+
 endmenu
 
 source "common/Kconfig"
diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -68,10 +68,18 @@ static inline bool lfence_true(void) { return true; }
 #endif
 
 /*
- * protect evaluation of conditional with respect to speculation
+ * allow to protect evaluation of conditional with respect to speculation on x86
  */
-#define evaluate_nospec(condition)                                      \
+#if defined(CONFIG_L1TF_LFENCE_NONE) || !defined(CONFIG_X86)
+#define evaluate_nospec(condition) (condition)
+#elif defined(CONFIG_L1TF_LFENCE_BOTH)
+#define evaluate_nospec(condition)                                         \
     (((condition) && lfence_true()) || !lfence_true())
+#elif defined(CONFIG_L1TF_LFENCE_TRUE)
+#define evaluate_nospec(condition) ((condition) && lfence_true())
+#elif defined(CONFIG_L1TF_LFENCE_INTERMEDIATE)
+#define evaluate_nospec(condition) ({ bool res = (condition); rmb(); res; })
+#endif
 
 #endif /* XEN_NOSPEC_H */
 
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 04/11] x86/hvm: block speculative out-of-bound accesses
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (2 preceding siblings ...)
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option Norbert Manthey
@ 2019-01-23 11:51 ` Norbert Manthey
  2019-01-31 19:31   ` Andrew Cooper
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: " Norbert Manthey
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

There are multiple arrays in the HVM interface that are accessed with
indices provided by the guest. To avoid speculative out-of-bound
accesses, we use the array_index_nospec macro.

When blocking speculative out-of-bound accesses, arrays can be classified
into dynamic arrays and static arrays: the former are allocated at run
time, while the size of the latter is known at compile time. For static
arrays, the compiler might be able to block speculative accesses in the
future.

We introduce another macro that uses the ARRAY_SIZE macro to bound the
index. For statically sized arrays, this macro can be used instead of the
usual one. Using this macro results in more readable code and allows the
handling of this case to be changed in a single place.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/arch/x86/hvm/hvm.c   | 27 ++++++++++++++++++++++-----
 xen/include/xen/nospec.h |  6 ++++++
 2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -37,6 +37,7 @@
 #include <xen/monitor.h>
 #include <xen/warning.h>
 #include <xen/vpci.h>
+#include <xen/nospec.h>
 #include <asm/shadow.h>
 #include <asm/hap.h>
 #include <asm/current.h>
@@ -2102,7 +2103,7 @@ int hvm_mov_from_cr(unsigned int cr, unsigned int gpr)
     case 2:
     case 3:
     case 4:
-        val = curr->arch.hvm.guest_cr[cr];
+        val = array_access_nospec(curr->arch.hvm.guest_cr, cr);
         break;
     case 8:
         val = (vlapic_get_reg(vcpu_vlapic(curr), APIC_TASKPRI) & 0xf0) >> 4;
@@ -3448,13 +3449,15 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix16K_80000;
-        *msr_content = fixed_range_base[index + 1];
+        *msr_content = fixed_range_base[array_index_nospec(index + 1,
+                                   ARRAY_SIZE(v->arch.hvm.mtrr.fixed_ranges))];
         break;
     case MSR_MTRRfix4K_C0000...MSR_MTRRfix4K_F8000:
         if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix4K_C0000;
-        *msr_content = fixed_range_base[index + 3];
+        *msr_content = fixed_range_base[array_index_nospec(index + 3,
+                                   ARRAY_SIZE(v->arch.hvm.mtrr.fixed_ranges))];
         break;
     case MSR_IA32_MTRR_PHYSBASE(0)...MSR_IA32_MTRR_PHYSMASK(MTRR_VCNT_MAX - 1):
         if ( !d->arch.cpuid->basic.mtrr )
@@ -3463,7 +3466,8 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         if ( (index / 2) >=
              MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT) )
             goto gp_fault;
-        *msr_content = var_range_base[index];
+        *msr_content = var_range_base[array_index_nospec(index,
+                          MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT))];
         break;
 
     case MSR_IA32_XSS:
@@ -4026,7 +4030,8 @@ static int hvmop_set_evtchn_upcall_vector(
     if ( op.vector < 0x10 )
         return -EINVAL;
 
-    if ( op.vcpu >= d->max_vcpus || (v = d->vcpu[op.vcpu]) == NULL )
+    if ( op.vcpu >= d->max_vcpus ||
+        (v = d->vcpu[array_index_nospec(op.vcpu, d->max_vcpus)]) == NULL )
         return -ENOENT;
 
     printk(XENLOG_G_INFO "%pv: upcall vector %02x\n", v, op.vector);
@@ -4114,6 +4119,12 @@ static int hvmop_set_param(
     if ( a.index >= HVM_NR_PARAMS )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value a.index is bounded even during
+     * speculative execution.
+     */
+    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
+
     d = rcu_lock_domain_by_any_id(a.domid);
     if ( d == NULL )
         return -ESRCH;
@@ -4380,6 +4391,12 @@ static int hvmop_get_param(
     if ( a.index >= HVM_NR_PARAMS )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value a.index is bounded even during
+     * speculative execution.
+     */
+    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
+
     d = rcu_lock_domain_by_any_id(a.domid);
     if ( d == NULL )
         return -ESRCH;
diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -59,6 +59,12 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
 })
 
 /*
+ * array_access_nospec - allow nospec access for static size arrays
+ */
+#define array_access_nospec(array, index)                               \
+    (array)[array_index_nospec(index, ARRAY_SIZE(array))]
+
+/*
  * allow to insert a read memory barrier into conditionals
  */
 #ifdef CONFIG_X86
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (3 preceding siblings ...)
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 04/11] x86/hvm: block speculative out-of-bound accesses Norbert Manthey
@ 2019-01-23 11:51 ` Norbert Manthey
  2019-01-23 13:37   ` Jan Beulich
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 06/11] common/memory: " Norbert Manthey
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

Guests can issue grant table operations and provide guest controlled
data to them. This data is also used for memory loads. To avoid
speculative out-of-bound accesses, we use the array_index_nospec macro
where applicable. However, there are also memory accesses that cannot be
protected by a single array-index clamp, or that happen as multiple
accesses in a row. To protect these, an lfence instruction is placed
between the actual range check and the access, via the newly introduced
macro block_speculation.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/common/grant_table.c | 23 +++++++++++++++++++++--
 xen/include/xen/nospec.h |  9 +++++++++
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -37,6 +37,7 @@
 #include <xen/paging.h>
 #include <xen/keyhandler.h>
 #include <xen/vmap.h>
+#include <xen/nospec.h>
 #include <xsm/xsm.h>
 #include <asm/flushtlb.h>
 
@@ -963,6 +964,9 @@ map_grant_ref(
         PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
                  op->ref, rgt->domain->domain_id);
 
+    /* Make sure the above check is not bypassed speculatively */
+    op->ref = array_index_nospec(op->ref, nr_grant_entries(rgt));
+
     act = active_entry_acquire(rgt, op->ref);
     shah = shared_entry_header(rgt, op->ref);
     status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
@@ -1268,7 +1272,8 @@ unmap_common(
     }
 
     smp_rmb();
-    map = &maptrack_entry(lgt, op->handle);
+    map = &maptrack_entry(lgt, array_index_nospec(op->handle,
+                                                  lgt->maptrack_limit));
 
     if ( unlikely(!read_atomic(&map->flags)) )
     {
@@ -2026,6 +2031,9 @@ gnttab_prepare_for_transfer(
         goto fail;
     }
 
+    /* Make sure the above check is not bypassed speculatively */
+    ref = array_index_nospec(ref, nr_grant_entries(rgt));
+
     sha = shared_entry_header(rgt, ref);
 
     scombo.word = *(u32 *)&sha->flags;
@@ -2223,7 +2231,8 @@ gnttab_transfer(
         okay = gnttab_prepare_for_transfer(e, d, gop.ref);
         spin_lock(&e->page_alloc_lock);
 
-        if ( unlikely(!okay) || unlikely(e->is_dying) )
+        /* Make sure this check is not bypassed speculatively */
+        if ( evaluate_nospec(unlikely(!okay) || unlikely(e->is_dying)) )
         {
             bool_t drop_dom_ref = !domain_adjust_tot_pages(e, -1);
 
@@ -2408,6 +2417,9 @@ acquire_grant_for_copy(
         PIN_FAIL(gt_unlock_out, GNTST_bad_gntref,
                  "Bad grant reference %#x\n", gref);
 
+    /* Make sure the above check is not bypassed speculatively */
+    gref = array_index_nospec(gref, nr_grant_entries(rgt));
+
     act = active_entry_acquire(rgt, gref);
     shah = shared_entry_header(rgt, gref);
     if ( rgt->gt_version == 1 )
@@ -2826,6 +2838,9 @@ static int gnttab_copy_buf(const struct gnttab_copy *op,
                  op->dest.offset, dest->ptr.offset,
                  op->len, dest->len);
 
+    /* Make sure the above checks are not bypassed speculatively */
+    block_speculation();
+
     memcpy(dest->virt + op->dest.offset, src->virt + op->source.offset,
            op->len);
     gnttab_mark_dirty(dest->domain, dest->mfn);
@@ -3215,6 +3230,10 @@ swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
     if ( ref_a == ref_b )
         goto out;
 
+    /* Make sure the above check is not bypassed speculatively */
+    ref_a = array_index_nospec(ref_a, nr_grant_entries(d->grant_table));
+    ref_b = array_index_nospec(ref_b, nr_grant_entries(d->grant_table));
+
     act_a = active_entry_acquire(gt, ref_a);
     if ( act_a->pin )
         PIN_FAIL(out, GNTST_eagain, "ref a %#x busy\n", ref_a);
diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -87,6 +87,15 @@ static inline bool lfence_true(void) { return true; }
 #define evaluate_nospec(condition) ({ bool res = (condition); rmb(); res; })
 #endif
 
+/*
+ * allow to block speculative execution in generic code
+ */
+#ifdef CONFIG_X86
+#define block_speculation() rmb()
+#else
+#define block_speculation()
+#endif
+
 #endif /* XEN_NOSPEC_H */
 
 /*
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 06/11] common/memory: block speculative out-of-bound accesses
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (4 preceding siblings ...)
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: " Norbert Manthey
@ 2019-01-23 11:51 ` Norbert Manthey
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 07/11] nospec: enable lfence on Intel Norbert Manthey
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:51 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

The get_page_from_gfn method returns a pointer to the page that belongs
to a gfn. Before the pointer is returned, the gfn is checked for
validity. Under speculation, these checks can be bypassed, so that the
function get_page is still partially executed. Consequently, the function
page_get_owner_and_reference might be partially executed as well. In this
function, the computed pointer is accessed, resulting in a speculative
out-of-bound address load. As the gfn can be controlled by a guest, this
access is problematic.

To mitigate the root cause, an lfence instruction is added via the
evaluate_nospec macro. To make the protection generic, we do not
introduce the lfence instruction for this single check only, but add it
to the mfn_valid function. This way, other potentially problematic
accesses are protected as well.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/common/pdx.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/xen/common/pdx.c b/xen/common/pdx.c
--- a/xen/common/pdx.c
+++ b/xen/common/pdx.c
@@ -18,6 +18,7 @@
 #include <xen/init.h>
 #include <xen/mm.h>
 #include <xen/bitops.h>
+#include <xen/nospec.h>
 
 /* Parameters for PFN/MADDR compression. */
 unsigned long __read_mostly max_pdx;
@@ -33,10 +34,10 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
 
 bool __mfn_valid(unsigned long mfn)
 {
-    return likely(mfn < max_page) &&
-           likely(!(mfn & pfn_hole_mask)) &&
-           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
-                           pdx_group_valid));
+    return evaluate_nospec(likely(mfn < max_page) &&
+                           likely(!(mfn & pfn_hole_mask)) &&
+                           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
+                                           pdx_group_valid)));
 }
 
 /* Sets all bits from the most-significant 1-bit down to the LSB */
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 07/11] nospec: enable lfence on Intel
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (5 preceding siblings ...)
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 06/11] common/memory: " Norbert Manthey
@ 2019-01-23 11:57 ` Norbert Manthey
  2019-01-24 22:29   ` Andrew Cooper
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 08/11] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

While the lfence instruction was initially added for all x86 platforms,
it is useful not to penalize platforms that are unaffected by the L1TF
vulnerability. Therefore, the lfence instruction should only be
introduced if the current CPU is an Intel CPU capable of hyperthreading.
This combination of features is added to the feature set that activates
the alternative.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/include/xen/nospec.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -7,6 +7,7 @@
 #ifndef XEN_NOSPEC_H
 #define XEN_NOSPEC_H
 
+#include <asm/alternative.h>
 #include <asm/system.h>
 
 /**
@@ -68,7 +69,10 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
  * allow to insert a read memory barrier into conditionals
  */
 #ifdef CONFIG_X86
-static inline bool lfence_true(void) { rmb(); return true; }
+static inline bool lfence_true(void) {
+    alternative("", "lfence", X86_VENDOR_INTEL);
+    return true;
+}
 #else
 static inline bool lfence_true(void) { return true; }
 #endif
@@ -91,7 +95,7 @@ static inline bool lfence_true(void) { return true; }
  * allow to block speculative execution in generic code
  */
 #ifdef CONFIG_X86
-#define block_speculation() rmb()
+#define block_speculation() alternative("", "lfence", X86_VENDOR_INTEL)
 #else
 #define block_speculation()
 #endif
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 08/11] xen/evtchn: block speculative out-of-bound accesses
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (6 preceding siblings ...)
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 07/11] nospec: enable lfence on Intel Norbert Manthey
@ 2019-01-23 11:57 ` Norbert Manthey
  2019-01-24 16:56   ` Jan Beulich
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 09/11] x86/vioapic: " Norbert Manthey
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

Guests can invoke event channel operations with guest-specified data.
To avoid speculative out-of-bound accesses, we use the nospec macros.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/common/event_channel.c | 25 ++++++++++++++++++++-----
 xen/common/event_fifo.c    | 16 +++++++++++++---
 xen/include/xen/event.h    |  5 +++--
 3 files changed, 36 insertions(+), 10 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -368,8 +368,14 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
     if ( virq_is_global(virq) && (vcpu != 0) )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value virq is bounded even during
+     * speculative execution.
+     */
+    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));
+
     if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
-         ((v = d->vcpu[vcpu]) == NULL) )
+         ((v = d->vcpu[array_index_nospec(vcpu, d->max_vcpus)]) == NULL) )
         return -ENOENT;
 
     spin_lock(&d->event_lock);
@@ -419,7 +425,7 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
     long           rc = 0;
 
     if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
-         (d->vcpu[vcpu] == NULL) )
+         (d->vcpu[array_index_nospec(vcpu, d->max_vcpus)] == NULL) )
         return -ENOENT;
 
     spin_lock(&d->event_lock);
@@ -816,6 +822,12 @@ int set_global_virq_handler(struct domain *d, uint32_t virq)
     if (!virq_is_global(virq))
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value virq is bounded even during
+     * speculative execution.
+     */
+    virq = array_index_nospec(virq, ARRAY_SIZE(global_virq_handlers));
+
     if (global_virq_handlers[virq] == d)
         return 0;
 
@@ -931,7 +943,8 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
     struct evtchn *chn;
     long           rc = 0;
 
-    if ( (vcpu_id >= d->max_vcpus) || (d->vcpu[vcpu_id] == NULL) )
+    if ( (vcpu_id >= d->max_vcpus) ||
+         (d->vcpu[array_index_nospec(vcpu_id, d->max_vcpus)] == NULL) )
         return -ENOENT;
 
     spin_lock(&d->event_lock);
@@ -969,8 +982,10 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
         unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
         chn->notify_vcpu_id = vcpu_id;
         pirq_set_affinity(d, chn->u.pirq.irq,
-                          cpumask_of(d->vcpu[vcpu_id]->processor));
-        link_pirq_port(port, chn, d->vcpu[vcpu_id]);
+                          cpumask_of(d->vcpu[array_index_nospec(vcpu_id,
+                                                                d->max_vcpus)]->processor));
+        link_pirq_port(port, chn, d->vcpu[array_index_nospec(vcpu_id,
+                                                             d->max_vcpus)]);
         break;
     default:
         rc = -EINVAL;
diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -33,7 +33,8 @@ static inline event_word_t *evtchn_fifo_word_from_port(const struct domain *d,
      */
     smp_rmb();
 
-    p = port / EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
+    p = array_index_nospec(port / EVTCHN_FIFO_EVENT_WORDS_PER_PAGE,
+                           d->evtchn_fifo->num_evtchns);
     w = port % EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
 
     return d->evtchn_fifo->event_array[p] + w;
@@ -516,14 +517,23 @@ int evtchn_fifo_init_control(struct evtchn_init_control *init_control)
     gfn     = init_control->control_gfn;
     offset  = init_control->offset;
 
-    if ( vcpu_id >= d->max_vcpus || !d->vcpu[vcpu_id] )
+    if ( vcpu_id >= d->max_vcpus ||
+         !d->vcpu[array_index_nospec(vcpu_id, d->max_vcpus)] )
         return -ENOENT;
-    v = d->vcpu[vcpu_id];
+
+    v = d->vcpu[array_index_nospec(vcpu_id, d->max_vcpus)];
 
     /* Must not cross page boundary. */
     if ( offset > (PAGE_SIZE - sizeof(evtchn_fifo_control_block_t)) )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value offset is bounded even during
+     * speculative execution.
+     */
+    offset = array_index_nospec(offset,
+                              PAGE_SIZE - sizeof(evtchn_fifo_control_block_t));
+
     /* Must be 8-bytes aligned. */
     if ( offset & (8 - 1) )
         return -EINVAL;
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -13,6 +13,7 @@
 #include <xen/smp.h>
 #include <xen/softirq.h>
 #include <xen/bitops.h>
+#include <xen/nospec.h>
 #include <asm/event.h>
 
 /*
@@ -96,7 +97,7 @@ void arch_evtchn_inject(struct vcpu *v);
  * The first bucket is directly accessed via d->evtchn.
  */
 #define group_from_port(d, p) \
-    ((d)->evtchn_group[(p) / EVTCHNS_PER_GROUP])
+    array_access_nospec((d)->evtchn_group, (p) / EVTCHNS_PER_GROUP)
 #define bucket_from_port(d, p) \
     ((group_from_port(d, p))[((p) % EVTCHNS_PER_GROUP) / EVTCHNS_PER_BUCKET])
 
@@ -110,7 +111,7 @@ static inline bool_t port_is_valid(struct domain *d, unsigned int p)
 static inline struct evtchn *evtchn_from_port(struct domain *d, unsigned int p)
 {
     if ( p < EVTCHNS_PER_BUCKET )
-        return &d->evtchn[p];
+        return &d->evtchn[array_index_nospec(p, EVTCHNS_PER_BUCKET)];
     return bucket_from_port(d, p) + (p % EVTCHNS_PER_BUCKET);
 }
 
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 09/11] x86/vioapic: block speculative out-of-bound accesses
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (7 preceding siblings ...)
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 08/11] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
@ 2019-01-23 11:57 ` Norbert Manthey
  2019-01-25 16:34   ` Jan Beulich
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 10/11] x86/hvm/hpet: " Norbert Manthey
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

When interacting with the IO APIC, a guest can specify values that are
used as indices into structures, and these values are not compared
against upper bounds during speculative execution. This change prevents
such speculative out-of-bound accesses.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/arch/x86/hvm/vioapic.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -30,6 +30,7 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/sched.h>
+#include <xen/nospec.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/vpic.h>
@@ -66,6 +67,9 @@ static struct hvm_vioapic *gsi_vioapic(const struct domain *d,
 {
     unsigned int i;
 
+    /* Make sure the compiler does not optimize out the initialization */
+    OPTIMIZER_HIDE_VAR(pin);
+
     for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
     {
         struct hvm_vioapic *vioapic = domain_vioapic(d, i);
@@ -117,7 +121,8 @@ static uint32_t vioapic_read_indirect(const struct hvm_vioapic *vioapic)
             break;
         }
 
-        redir_content = vioapic->redirtbl[redir_index].bits;
+        redir_content = vioapic->redirtbl[array_index_nospec(redir_index,
+                                                       vioapic->nr_pins)].bits;
         result = (vioapic->ioregsel & 1) ? (redir_content >> 32)
                                          : redir_content;
         break;
@@ -212,7 +217,12 @@ static void vioapic_write_redirent(
     struct hvm_irq *hvm_irq = hvm_domain_irq(d);
     union vioapic_redir_entry *pent, ent;
     int unmasked = 0;
-    unsigned int gsi = vioapic->base_gsi + idx;
+    unsigned int gsi;
+
+    /* Make sure no out-of-bound value for idx can be used */
+    idx = array_index_nospec(idx, vioapic->nr_pins);
+
+    gsi = vioapic->base_gsi + idx;
 
     spin_lock(&d->arch.hvm.irq_lock);
 
@@ -378,7 +388,8 @@ static inline int pit_channel0_enabled(void)
 
 static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
 {
-    uint16_t dest = vioapic->redirtbl[pin].fields.dest_id;
+    uint16_t dest = vioapic->redirtbl
+               [pin = array_index_nospec(pin, vioapic->nr_pins)].fields.dest_id;
     uint8_t dest_mode = vioapic->redirtbl[pin].fields.dest_mode;
     uint8_t delivery_mode = vioapic->redirtbl[pin].fields.delivery_mode;
     uint8_t vector = vioapic->redirtbl[pin].fields.vector;
@@ -463,7 +474,7 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
 
 void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
 {
-    unsigned int pin;
+    unsigned int pin = 0; /* See gsi_vioapic */
     struct hvm_vioapic *vioapic = gsi_vioapic(d, irq, &pin);
     union vioapic_redir_entry *ent;
 
@@ -560,7 +571,7 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
 
 int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi)
 {
-    unsigned int pin;
+    unsigned int pin = 0; /* See gsi_vioapic */
     const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
 
     if ( !vioapic )
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 10/11] x86/hvm/hpet: block speculative out-of-bound accesses
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (8 preceding siblings ...)
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 09/11] x86/vioapic: " Norbert Manthey
@ 2019-01-23 11:57 ` Norbert Manthey
  2019-01-25 16:50   ` Jan Beulich
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 11/11] x86/CPUID: " Norbert Manthey
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

When interacting with the HPET, read and write operations can be
executed during instruction emulation, where the guest controls the
data that is used. As it is hard to predict how many instructions are
executed speculatively, we prevent out-of-bound accesses by applying
the array_index_nospec function to guest-specified addresses that are
used for HPET operations.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/arch/x86/hvm/hpet.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/hpet.c b/xen/arch/x86/hvm/hpet.c
--- a/xen/arch/x86/hvm/hpet.c
+++ b/xen/arch/x86/hvm/hpet.c
@@ -25,6 +25,7 @@
 #include <xen/sched.h>
 #include <xen/event.h>
 #include <xen/trace.h>
+#include <xen/nospec.h>
 
 #define domain_vhpet(x) (&(x)->arch.hvm.pl_time->vhpet)
 #define vcpu_vhpet(x)   (domain_vhpet((x)->domain))
@@ -124,15 +125,17 @@ static inline uint64_t hpet_read64(HPETState *h, unsigned long addr,
     case HPET_Tn_CFG(0):
     case HPET_Tn_CFG(1):
     case HPET_Tn_CFG(2):
-        return h->hpet.timers[HPET_TN(CFG, addr)].config;
+        return array_access_nospec(h->hpet.timers, HPET_TN(CFG, addr)).config;
     case HPET_Tn_CMP(0):
     case HPET_Tn_CMP(1):
     case HPET_Tn_CMP(2):
-        return hpet_get_comparator(h, HPET_TN(CMP, addr), guest_time);
+        return hpet_get_comparator(h, array_index_nospec(HPET_TN(CMP, addr),
+                                                   ARRAY_SIZE(h->hpet.timers)),
+                                   guest_time);
     case HPET_Tn_ROUTE(0):
     case HPET_Tn_ROUTE(1):
     case HPET_Tn_ROUTE(2):
-        return h->hpet.timers[HPET_TN(ROUTE, addr)].fsb;
+        return array_access_nospec(h->hpet.timers, HPET_TN(ROUTE, addr)).fsb;
     }
 
     return 0;
@@ -438,7 +441,7 @@ static int hpet_write(
     case HPET_Tn_CFG(0):
     case HPET_Tn_CFG(1):
     case HPET_Tn_CFG(2):
-        tn = HPET_TN(CFG, addr);
+        tn = array_index_nospec(HPET_TN(CFG, addr), ARRAY_SIZE(h->hpet.timers));
 
         h->hpet.timers[tn].config =
             hpet_fixup_reg(new_val, old_val,
@@ -480,7 +483,7 @@ static int hpet_write(
     case HPET_Tn_CMP(0):
     case HPET_Tn_CMP(1):
     case HPET_Tn_CMP(2):
-        tn = HPET_TN(CMP, addr);
+        tn = array_index_nospec(HPET_TN(CMP, addr), ARRAY_SIZE(h->hpet.timers));
         if ( timer_is_periodic(h, tn) &&
              !(h->hpet.timers[tn].config & HPET_TN_SETVAL) )
         {
@@ -523,7 +526,7 @@ static int hpet_write(
     case HPET_Tn_ROUTE(0):
     case HPET_Tn_ROUTE(1):
     case HPET_Tn_ROUTE(2):
-        tn = HPET_TN(ROUTE, addr);
+        tn = array_index_nospec(HPET_TN(ROUTE, addr), ARRAY_SIZE(h->hpet.timers));
         h->hpet.timers[tn].fsb = new_val;
         break;
 
-- 
2.7.4





* [PATCH SpectreV1+L1TF v4 11/11] x86/CPUID: block speculative out-of-bound accesses
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (9 preceding siblings ...)
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 10/11] x86/hvm/hpet: " Norbert Manthey
@ 2019-01-23 11:57 ` Norbert Manthey
  2019-01-24 21:05 ` SpectreV1+L1TF Patch Series Andrew Cooper
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-23 11:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel, Norbert Manthey

During instruction emulation, the cpuid instruction is emulated with
data that is controlled by the guest. As speculation might bypass
bounds checks, we have to ensure that no out-of-bound loads are
possible.

Where the looked-up index is known to be a constant, we replace the
variable with that constant directly, rather than using the
array_index_nospec macro and relying on the compiler's value
propagation.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Reviewed-by: Jan Beulich <jbeulich@suse.com>

---
 xen/arch/x86/cpuid.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
--- a/xen/arch/x86/cpuid.c
+++ b/xen/arch/x86/cpuid.c
@@ -1,6 +1,7 @@
 #include <xen/init.h>
 #include <xen/lib.h>
 #include <xen/sched.h>
+#include <xen/nospec.h>
 #include <asm/cpuid.h>
 #include <asm/hvm/hvm.h>
 #include <asm/hvm/nestedhvm.h>
@@ -629,7 +630,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
             if ( subleaf >= ARRAY_SIZE(p->cache.raw) )
                 return;
 
-            *res = p->cache.raw[subleaf];
+            *res = array_access_nospec(p->cache.raw, subleaf);
             break;
 
         case 0x7:
@@ -638,25 +639,25 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
                                  ARRAY_SIZE(p->feat.raw) - 1) )
                 return;
 
-            *res = p->feat.raw[subleaf];
+            *res = array_access_nospec(p->feat.raw, subleaf);
             break;
 
         case 0xb:
             if ( subleaf >= ARRAY_SIZE(p->topo.raw) )
                 return;
 
-            *res = p->topo.raw[subleaf];
+            *res = array_access_nospec(p->topo.raw, subleaf);
             break;
 
         case XSTATE_CPUID:
             if ( !p->basic.xsave || subleaf >= ARRAY_SIZE(p->xstate.raw) )
                 return;
 
-            *res = p->xstate.raw[subleaf];
+            *res = array_access_nospec(p->xstate.raw, subleaf);
             break;
 
         default:
-            *res = p->basic.raw[leaf];
+            *res = array_access_nospec(p->basic.raw, leaf);
             break;
         }
         break;
@@ -680,7 +681,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
                                      ARRAY_SIZE(p->extd.raw) - 1) )
             return;
 
-        *res = p->extd.raw[leaf & 0xffff];
+        *res = array_access_nospec(p->extd.raw, leaf & 0xffff);
         break;
 
     default:
@@ -847,7 +848,7 @@ void guest_cpuid(const struct vcpu *v, uint32_t leaf,
         if ( is_pv_domain(d) && is_hardware_domain(d) &&
              guest_kernel_mode(v, regs) && cpu_has_monitor &&
              regs->entry_vector == TRAP_gp_fault )
-            *res = raw_cpuid_policy.basic.raw[leaf];
+            *res = raw_cpuid_policy.basic.raw[5];
         break;
 
     case 0x7:
-- 
2.7.4





* Re: [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation Norbert Manthey
@ 2019-01-23 13:07   ` Jan Beulich
  2019-01-23 13:20     ` Julien Grall
  2019-01-23 13:20   ` Jan Beulich
  1 sibling, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-23 13:07 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
> --- a/xen/include/xen/nospec.h
> +++ b/xen/include/xen/nospec.h
> @@ -58,6 +58,21 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>      (typeof(_i)) (_i & _mask);                                          \
>  })
>  
> +/*
> + * allow to insert a read memory barrier into conditionals
> + */

Please adhere to the comment style set forth in ./CODING_STYLE.

> +#ifdef CONFIG_X86
> +static inline bool lfence_true(void) { rmb(); return true; }
> +#else
> +static inline bool lfence_true(void) { return true; }
> +#endif

This is a generic header, hence functions defined here should have
universally applicable names. "lfence", however, is an x86 term
(naming a particular instruction). I can't think of really good
alternatives, but how about one of arch_nospec_true() /
arch_fence_nospec_true() / arch_nospec_fence_true()?

Furthermore, rather than adding Kconfig control and alternatives
patching later in the series (as per the cover letter), it should be
that way from the beginning. Remember that any series may go
in piecemeal, not all in one go.

Jan




* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option Norbert Manthey
@ 2019-01-23 13:18   ` Jan Beulich
  2019-01-24 12:11     ` Norbert Manthey
  2019-01-23 13:24   ` Julien Grall
  2019-01-24 21:29   ` Andrew Cooper
  2 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-23 13:18 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
> This commit introduces the configuration option L1TF_LFENCE that allows
> to control the implementation of the protection of privilege checks via
> lfence instructions. The following four alternatives are provided:
> 
>  - not injecting lfence instructions
>  - inject an lfence instruction for both outcomes of the conditional
>  - inject an lfence instruction only if the conditional would evaluate
>    to true, so that this case cannot be entered under speculation

I'd really like to see justification for this variant, as the asymmetric
handling doesn't look overly helpful. It's also not clear to me how
someone configuring Xen should tell whether this would be a safe
selection to make. I'm tempted to request that this option become
EXPERT dependent.

>  - evaluating the condition and storing the result into a local variable;
>    before using this value, inject an lfence instruction.
> 
> The different options allow to control the level of protection vs the
> slowdown the additional lfence instructions would introduce. The default
> value is set to protecting both branches.
> 
> For non-x86 platforms, the protection is disabled by default.

At least the "by default" is wrong here.

> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -176,6 +176,30 @@ config PV_SHIM_EXCLUSIVE
>  	  firmware, and will not function correctly in other scenarios.
>  
>  	  If unsure, say N.
> +
> +choice
> +	prompt "Default L1TF Branch Protection?"
> +
> +	config L1TF_LFENCE_BOTH
> +		bool "Protect both branches of certain conditionals" if HVM
> +		---help---
> +		  Inject an lfence instruction after the condition to be
> +		  evaluated for both outcomes of the condition
> +	config L1TF_LFENCE_TRUE
> +		bool "Protect true branch of certain conditionals" if HVM
> +		---help---
> +		  Protect only the path where the condition is evaluated to true
> +	config L1TF_LFENCE_INTERMEDIATE
> +		bool "Protect before using certain conditionals value" if HVM
> +		---help---
> +		  Inject an lfence instruction after evaluating the condition
> +		  but before forwarding this value from a local variable
> +	config L1TF_LFENCE_NONE
> +		bool "No conditional protection"
> +		---help---
> +		  Do not inject lfences for conditional evaluations
> +endchoice

I guess we should settle on some default for this choice. The
description talks about a default, but I don't see it set here (and
I don't think relying merely on the order is a good idea).

> --- a/xen/include/xen/nospec.h
> +++ b/xen/include/xen/nospec.h
> @@ -68,10 +68,18 @@ static inline bool lfence_true(void) { return true; }
>  #endif
>  
>  /*
> - * protect evaluation of conditional with respect to speculation
> + * allow to protect evaluation of conditional with respect to speculation on x86
>   */
> -#define evaluate_nospec(condition)                                      \
> +#if defined(CONFIG_L1TF_LFENCE_NONE) || !defined(CONFIG_X86)
> +#define evaluate_nospec(condition) (condition)
> +#elif defined(CONFIG_L1TF_LFENCE_BOTH)
> +#define evaluate_nospec(condition)                                         \

I'm puzzled by this line changing - do you really need to move the
backslash here?

Jan




* Re: [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation
  2019-01-23 13:07   ` Jan Beulich
@ 2019-01-23 13:20     ` Julien Grall
  2019-01-23 13:40       ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Julien Grall @ 2019-01-23 13:20 UTC (permalink / raw)
  To: Jan Beulich, nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

Hi,

On 23/01/2019 13:07, Jan Beulich wrote:
>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>> --- a/xen/include/xen/nospec.h
>> +++ b/xen/include/xen/nospec.h
>> @@ -58,6 +58,21 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>>       (typeof(_i)) (_i & _mask);                                          \
>>   })
>>   
>> +/*
>> + * allow to insert a read memory barrier into conditionals
>> + */
> 
> Please adhere to the comment style set forth in ./CODING_STYLE.
> 
>> +#ifdef CONFIG_X86
>> +static inline bool lfence_true(void) { rmb(); return true; }
>> +#else
>> +static inline bool lfence_true(void) { return true; }
>> +#endif
> 
> This is a generic header, hence functions defined here should have
> universally applicable names. "lfence", however, is an x86 term
> (naming a particular instruction). I can't think of really good
> alternatives, but how about one of arch_nospec_true() /
> arch_fence_nospec_true() / arch_nospec_fence_true()?

We seem to use the term "barrier" more than "fence" in common code. So how
about arch_barrier_nospec_true/arch_nospec_barrier_true?

Cheers,

-- 
Julien Grall


* Re: [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation Norbert Manthey
  2019-01-23 13:07   ` Jan Beulich
@ 2019-01-23 13:20   ` Jan Beulich
  2019-01-24 12:07     ` Norbert Manthey
  1 sibling, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-23 13:20 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
> --- a/xen/include/xen/nospec.h
> +++ b/xen/include/xen/nospec.h
> @@ -58,6 +58,21 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>      (typeof(_i)) (_i & _mask);                                          \
>  })
>  
> +/*
> + * allow to insert a read memory barrier into conditionals
> + */
> +#ifdef CONFIG_X86
> +static inline bool lfence_true(void) { rmb(); return true; }
> +#else
> +static inline bool lfence_true(void) { return true; }
> +#endif
> +
> +/*
> + * protect evaluation of conditional with respect to speculation
> + */
> +#define evaluate_nospec(condition)                                      \
> +    (((condition) && lfence_true()) || !lfence_true())

It may be just me, but I think

#define evaluate_nospec(condition)                                      \
    ((condition) ? lfence_true() : !lfence_true())

would better express the two-way nature of this.

Jan




* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option Norbert Manthey
  2019-01-23 13:18   ` Jan Beulich
@ 2019-01-23 13:24   ` Julien Grall
  2019-01-23 13:39     ` Jan Beulich
  2019-01-24 21:29   ` Andrew Cooper
  2 siblings, 1 reply; 150+ messages in thread
From: Julien Grall @ 2019-01-23 13:24 UTC (permalink / raw)
  To: Norbert Manthey, xen-devel
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel

Hi,

On 23/01/2019 11:51, Norbert Manthey wrote:
> diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
> --- a/xen/include/xen/nospec.h
> +++ b/xen/include/xen/nospec.h
> @@ -68,10 +68,18 @@ static inline bool lfence_true(void) { return true; }
>   #endif
>   
>   /*
> - * protect evaluation of conditional with respect to speculation
> + * allow to protect evaluation of conditional with respect to speculation on x86
>    */
> -#define evaluate_nospec(condition)                                      \
> +#if defined(CONFIG_L1TF_LFENCE_NONE) || !defined(CONFIG_X86)
> +#define evaluate_nospec(condition) (condition)
> +#elif defined(CONFIG_L1TF_LFENCE_BOTH)
> +#define evaluate_nospec(condition)                                         \
>       (((condition) && lfence_true()) || !lfence_true())
> +#elif defined(CONFIG_L1TF_LFENCE_TRUE)
> +#define evaluate_nospec(condition) ((condition) && lfence_true())
> +#elif defined(CONFIG_L1TF_LFENCE_INTERMEDIATE)
> +#define evaluate_nospec(condition) ({ bool res = (condition); rmb(); res; })
> +#endif

None of the configs are defined on Arm, so can this be moved in arch-x86?

Cheers,

-- 
Julien Grall


* Re: [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: " Norbert Manthey
@ 2019-01-23 13:37   ` Jan Beulich
  2019-01-28 14:45     ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-23 13:37 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
> @@ -1268,7 +1272,8 @@ unmap_common(
>      }
>  
>      smp_rmb();
> -    map = &maptrack_entry(lgt, op->handle);
> +    map = &maptrack_entry(lgt, array_index_nospec(op->handle,
> +                                                  lgt->maptrack_limit));

It might be better to move this into maptrack_entry() itself, or
make a maptrack_entry_nospec() clone (as several but not all
uses may indeed not be in need of the extra protection). At
least the ones in steal_maptrack_handle() and
put_maptrack_handle() also look potentially suspicious.

> @@ -2223,7 +2231,8 @@ gnttab_transfer(
>          okay = gnttab_prepare_for_transfer(e, d, gop.ref);
>          spin_lock(&e->page_alloc_lock);
>  
> -        if ( unlikely(!okay) || unlikely(e->is_dying) )
> +        /* Make sure this check is not bypassed speculatively */
> +        if ( evaluate_nospec(unlikely(!okay) || unlikely(e->is_dying)) )
>          {
>              bool_t drop_dom_ref = !domain_adjust_tot_pages(e, -1);

What is it that makes this particular if() different from other
surrounding ones? In particular the version dependent code (a few
lines down from here as well as elsewhere) look to be easily
divertable onto the wrong branch, then causing out of bounds
speculative accesses due to the different (version dependent)
shared entry sizes.

> @@ -3215,6 +3230,10 @@ swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
>      if ( ref_a == ref_b )
>          goto out;
>  
> +    /* Make sure the above check is not bypassed speculatively */
> +    ref_a = array_index_nospec(ref_a, nr_grant_entries(d->grant_table));
> +    ref_b = array_index_nospec(ref_b, nr_grant_entries(d->grant_table));

I think this wants to move up ahead of the if() in context, and the
comment be changed to plural.

> --- a/xen/include/xen/nospec.h
> +++ b/xen/include/xen/nospec.h
> @@ -87,6 +87,15 @@ static inline bool lfence_true(void) { return true; }
>  #define evaluate_nospec(condition) ({ bool res = (condition); rmb(); res; })
>  #endif
>  
> +/*
> + * allow to block speculative execution in generic code
> + */

Comment style again.

> +#ifdef CONFIG_X86
> +#define block_speculation() rmb()
> +#else
> +#define block_speculation()
> +#endif

Why does this not simply resolve to what currently is named lfence_true()
(perhaps with a cast to void)? And why does this not depend on the
Kconfig setting?

Jan




* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-23 13:24   ` Julien Grall
@ 2019-01-23 13:39     ` Jan Beulich
  2019-01-23 13:44       ` Julien Grall
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-23 13:39 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

>>> On 23.01.19 at 14:24, <julien.grall@arm.com> wrote:
> On 23/01/2019 11:51, Norbert Manthey wrote:
>> --- a/xen/include/xen/nospec.h
>> +++ b/xen/include/xen/nospec.h
>> @@ -68,10 +68,18 @@ static inline bool lfence_true(void) { return true; }
>>   #endif
>>   
>>   /*
>> - * protect evaluation of conditional with respect to speculation
>> + * allow to protect evaluation of conditional with respect to speculation on x86
>>    */
>> -#define evaluate_nospec(condition)                                      \
>> +#if defined(CONFIG_L1TF_LFENCE_NONE) || !defined(CONFIG_X86)
>> +#define evaluate_nospec(condition) (condition)
>> +#elif defined(CONFIG_L1TF_LFENCE_BOTH)
>> +#define evaluate_nospec(condition)                                         \
>>       (((condition) && lfence_true()) || !lfence_true())
>> +#elif defined(CONFIG_L1TF_LFENCE_TRUE)
>> +#define evaluate_nospec(condition) ((condition) && lfence_true())
>> +#elif defined(CONFIG_L1TF_LFENCE_INTERMEDIATE)
>> +#define evaluate_nospec(condition) ({ bool res = (condition); rmb(); res; })
>> +#endif
> 
> None of the configs are defined on Arm, so can this be moved in arch-x86?

To be honest I'd like to avoid introducing asm/nospec.h for the time
being.

Jan




* Re: [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation
  2019-01-23 13:20     ` Julien Grall
@ 2019-01-23 13:40       ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-01-23 13:40 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

>>> On 23.01.19 at 14:20, <julien.grall@arm.com> wrote:
> On 23/01/2019 13:07, Jan Beulich wrote:
>>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>>> --- a/xen/include/xen/nospec.h
>>> +++ b/xen/include/xen/nospec.h
>>> @@ -58,6 +58,21 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>>>       (typeof(_i)) (_i & _mask);                                          \
>>>   })
>>>   
>>> +/*
>>> + * allow to insert a read memory barrier into conditionals
>>> + */
>> 
>> Please obey to the comment style set forth in ./CODING_STYLE.
>> 
>>> +#ifdef CONFIG_X86
>>> +static inline bool lfence_true(void) { rmb(); return true; }
>>> +#else
>>> +static inline bool lfence_true(void) { return true; }
>>> +#endif
>> 
>> This is a generic header, hence functions defined here should have
>> universally applicable names. "lfence", however, is an x86 term
>> (naming a particular instruction). I can't think of really good
>> alternatives, but how about one of arch_nospec_true() /
>> arch_fence_nospec_true() / arch_nospec_fence_true()?
> 
> We seems to use more the term "barrier" in common code over "fence". So how 
> about arch_barrier_nospec_true/arch_nospec_barrier_true?

That would be fine with me as well.

Jan




* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-23 13:39     ` Jan Beulich
@ 2019-01-23 13:44       ` Julien Grall
  2019-01-23 14:45         ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Julien Grall @ 2019-01-23 13:44 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

Hi Jan,

On 23/01/2019 13:39, Jan Beulich wrote:
>>>> On 23.01.19 at 14:24, <julien.grall@arm.com> wrote:
>> On 23/01/2019 11:51, Norbert Manthey wrote:
>>> --- a/xen/include/xen/nospec.h
>>> +++ b/xen/include/xen/nospec.h
>>> @@ -68,10 +68,18 @@ static inline bool lfence_true(void) { return true; }
>>>    #endif
>>>    
>>>    /*
>>> - * protect evaluation of conditional with respect to speculation
>>> + * allow to protect evaluation of conditional with respect to speculation on x86
>>>     */
>>> -#define evaluate_nospec(condition)                                      \
>>> +#if defined(CONFIG_L1TF_LFENCE_NONE) || !defined(CONFIG_X86)
>>> +#define evaluate_nospec(condition) (condition)
>>> +#elif defined(CONFIG_L1TF_LFENCE_BOTH)
>>> +#define evaluate_nospec(condition)                                         \
>>>        (((condition) && lfence_true()) || !lfence_true())
>>> +#elif defined(CONFIG_L1TF_LFENCE_TRUE)
>>> +#define evaluate_nospec(condition) ((condition) && lfence_true())
>>> +#elif defined(CONFIG_L1TF_LFENCE_INTERMEDIATE)
>>> +#define evaluate_nospec(condition) ({ bool res = (condition); rmb(); res; })
>>> +#endif
>>
>> None of the configs are defined on Arm, so can this be moved in arch-x86?
> 
> To be honest I'd like to avoid introducing asm/nospec.h for the time
> being.

How about adding them in system.h as we did for array_index_mask_nospec?

Cheers,

-- 
Julien Grall


* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-23 13:44       ` Julien Grall
@ 2019-01-23 14:45         ` Jan Beulich
  2019-01-24 12:21           ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-23 14:45 UTC (permalink / raw)
  To: Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

>>> On 23.01.19 at 14:44, <julien.grall@arm.com> wrote:
> On 23/01/2019 13:39, Jan Beulich wrote:
>>>>> On 23.01.19 at 14:24, <julien.grall@arm.com> wrote:
>>> On 23/01/2019 11:51, Norbert Manthey wrote:
>>>> --- a/xen/include/xen/nospec.h
>>>> +++ b/xen/include/xen/nospec.h
>>>> @@ -68,10 +68,18 @@ static inline bool lfence_true(void) { return true; }
>>>>    #endif
>>>>    
>>>>    /*
>>>> - * protect evaluation of conditional with respect to speculation
>>>> + * allow to protect evaluation of conditional with respect to speculation on x86
>>>>     */
>>>> -#define evaluate_nospec(condition)                                      \
>>>> +#if defined(CONFIG_L1TF_LFENCE_NONE) || !defined(CONFIG_X86)
>>>> +#define evaluate_nospec(condition) (condition)
>>>> +#elif defined(CONFIG_L1TF_LFENCE_BOTH)
>>>> +#define evaluate_nospec(condition)                                         \
>>>>        (((condition) && lfence_true()) || !lfence_true())
>>>> +#elif defined(CONFIG_L1TF_LFENCE_TRUE)
>>>> +#define evaluate_nospec(condition) ((condition) && lfence_true())
>>>> +#elif defined(CONFIG_L1TF_LFENCE_INTERMEDIATE)
>>>> +#define evaluate_nospec(condition) ({ bool res = (condition); rmb(); res; })
>>>> +#endif
>>>
>>> None of the configs are defined on Arm, so can this be moved in arch-x86?
>> 
>> To be honest I'd like to avoid introducing asm/nospec.h for the time
>> being.
> 
> How about adding them in system.h as we did for array_index_mask_nospec?

To tell you the truth, that's where Norbert had it originally.
I think that's not the right place though (also for
array_index_mask_nospec()). But I'll listen to a majority
thinking differently, at least as far as what is currently
lfence_true() goes. evaluate_nospec(), otoh, belongs where
it is now, I think.

Jan




* Re: [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation
  2019-01-23 13:20   ` Jan Beulich
@ 2019-01-24 12:07     ` Norbert Manthey
  2019-01-24 20:33       ` Andrew Cooper
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-24 12:07 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/23/19 14:20, Jan Beulich wrote:
>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>> --- a/xen/include/xen/nospec.h
>> +++ b/xen/include/xen/nospec.h
>> @@ -58,6 +58,21 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>>      (typeof(_i)) (_i & _mask);                                          \
>>  })
>>  
>> +/*
>> + * allow to insert a read memory barrier into conditionals
>> + */
>> +#ifdef CONFIG_X86
>> +static inline bool lfence_true(void) { rmb(); return true; }
>> +#else
>> +static inline bool lfence_true(void) { return true; }
>> +#endif
>> +
>> +/*
>> + * protect evaluation of conditional with respect to speculation
>> + */
>> +#define evaluate_nospec(condition)                                      \
>> +    (((condition) && lfence_true()) || !lfence_true())
> It may be just me, but I think
>
> #define evaluate_nospec(condition)                                      \
>     ((condition) ? lfence_true() : !lfence_true())
>
> would better express the two-way nature of this.

I compared the binary output of the two variants, and they are the same
(for my build environment). I'll switch to your variant, in case nobody
objects.

Best,
Norbert

>
> Jan
>
>



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B


* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-23 13:18   ` Jan Beulich
@ 2019-01-24 12:11     ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-24 12:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/23/19 14:18, Jan Beulich wrote:
>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>> This commit introduces the configuration option L1TF_LFENCE that allows
>> control of the implementation of the protection of privilege checks via
>> lfence instructions. The following four alternatives are provided:
>>
>>  - not injecting lfence instructions
>>  - inject an lfence instruction for both outcomes of the conditional
>>  - inject an lfence instruction only if the conditional would evaluate
>>    to true, so that this case cannot be entered under speculation
> I'd really like to see justification for this variant, as the asymmetric
> handling doesn't look overly helpful. It's also not clear to me how
> someone configuring Xen should tell whether this would be a safe
> selection to make. I'm tempted to request that this option become
> EXPERT dependent.
I will drop this option. Without properly defining which property checks
should be protected (we currently do not protect any XSM based checks
that are used in hypercalls like physdev_op), and what to protect, I
agree it's hard to judge whether this is useful.
>
>>  - evaluating the condition and storing the result in a local variable.
>>    before using this value, inject an lfence instruction.
>>
>> The different options allow to control the level of protection vs the
>> slowdown the additional lfence instructions would introduce. The default
>> value is set to protecting both branches.
>>
>> For non-x86 platforms, the protection is disabled by default.
> At least the "by default" is wrong here.
I will drop the "by default" in this sentence.
>
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -176,6 +176,30 @@ config PV_SHIM_EXCLUSIVE
>>  	  firmware, and will not function correctly in other scenarios.
>>  
>>  	  If unsure, say N.
>> +
>> +choice
>> +	prompt "Default L1TF Branch Protection?"
>> +
>> +	config L1TF_LFENCE_BOTH
>> +		bool "Protect both branches of certain conditionals" if HVM
>> +		---help---
>> +		  Inject an lfence instruction after the condition to be
>> +		  evaluated for both outcomes of the condition
>> +	config L1TF_LFENCE_TRUE
>> +		bool "Protect true branch of certain conditionals" if HVM
>> +		---help---
>> +		  Protect only the path where the condition is evaluated to true
>> +	config L1TF_LFENCE_INTERMEDIATE
>> +		bool "Protect before using certain conditionals value" if HVM
>> +		---help---
>> +		  Inject an lfence instruction after evaluating the condition
>> +		  but before forwarding this value from a local variable
>> +	config L1TF_LFENCE_NONE
>> +		bool "No conditional protection"
>> +		---help---
>> +		  Do not inject lfences for conditional evaluations
>> +endchoice
> I guess we should settle on some default for this choice. The
> description talks about a default, but I don't see it set here (and
> I don't think relying merely on the order is a good idea).
I will add a "default" statement, and pick the L1TF_LFENCE_BOTH variant
there.
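[Editorial note: in Kconfig terms that would look like the following sketch — option names taken from the patch, the `default` line being the addition:]

```
choice
	prompt "Default L1TF Branch Protection?"
	default L1TF_LFENCE_BOTH

	config L1TF_LFENCE_BOTH
		bool "Protect both branches of certain conditionals" if HVM
	...
endchoice
```

A choice-level `default` names the symbol to select when nothing else picks one, rather than relying on the order of entries.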
>
>> --- a/xen/include/xen/nospec.h
>> +++ b/xen/include/xen/nospec.h
>> @@ -68,10 +68,18 @@ static inline bool lfence_true(void) { return true; }
>>  #endif
>>  
>>  /*
>> - * protect evaluation of conditional with respect to speculation
>> + * allow to protect evaluation of conditional with respect to speculation on x86
>>   */
>> -#define evaluate_nospec(condition)                                      \
>> +#if defined(CONFIG_L1TF_LFENCE_NONE) || !defined(CONFIG_X86)
>> +#define evaluate_nospec(condition) (condition)
>> +#elif defined(CONFIG_L1TF_LFENCE_BOTH)
>> +#define evaluate_nospec(condition)                                         \
> I'm puzzled by this line changing - do you really need to move the
> backslash here?

This looks strange as a stand-alone modification, I agree. I will merge
the introduction of the barrier with the new name, and merge it with the
configuration option and the alternative patching. This way, this change
will be removed.

Best,
Norbert

>
> Jan
>
>




* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-23 14:45         ` Jan Beulich
@ 2019-01-24 12:21           ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-24 12:21 UTC (permalink / raw)
  To: Jan Beulich, Julien Grall
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel


On 1/23/19 15:45, Jan Beulich wrote:
>>>> On 23.01.19 at 14:44, <julien.grall@arm.com> wrote:
>> On 23/01/2019 13:39, Jan Beulich wrote:
>>>>>> On 23.01.19 at 14:24, <julien.grall@arm.com> wrote:
>>>> On 23/01/2019 11:51, Norbert Manthey wrote:
>>>>> --- a/xen/include/xen/nospec.h
>>>>> +++ b/xen/include/xen/nospec.h
>>>>> @@ -68,10 +68,18 @@ static inline bool lfence_true(void) { return true; }
>>>>>    #endif
>>>>>    
>>>>>    /*
>>>>> - * protect evaluation of conditional with respect to speculation
>>>>> + * allow to protect evaluation of conditional with respect to speculation on x86
>>>>>     */
>>>>> -#define evaluate_nospec(condition)                                      \
>>>>> +#if defined(CONFIG_L1TF_LFENCE_NONE) || !defined(CONFIG_X86)
>>>>> +#define evaluate_nospec(condition) (condition)
>>>>> +#elif defined(CONFIG_L1TF_LFENCE_BOTH)
>>>>> +#define evaluate_nospec(condition)                                         \
>>>>>        (((condition) && lfence_true()) || !lfence_true())
>>>>> +#elif defined(CONFIG_L1TF_LFENCE_TRUE)
>>>>> +#define evaluate_nospec(condition) ((condition) && lfence_true())
>>>>> +#elif defined(CONFIG_L1TF_LFENCE_INTERMEDIATE)
>>>>> +#define evaluate_nospec(condition) ({ bool res = (condition); rmb(); res; })
>>>>> +#endif
>>>>
>>>> None of the configs are defined on Arm, so can this be moved in arch-x86?
>>> To be honest I'd like to avoid introducing asm/nospec.h for the time
>>> being.
>> How about adding them in system.h as we did for array_index_mask_nospec?
> To tell you the truth, that's where Norbert had it originally.
> I think that's not the right place though (also for
> array_index_mask_nospec()). But I'll listen to a majority
> thinking differently, at least as far as what is currently
> lfence_true() goes. evaluate_nospec(), otoh, belongs where
> it is now, I think.

I will rename the lfence_true macro into "arch_barrier_nospec_true".
Furthermore, I will merge the introduction of the macros with the
introduction of the configuration and the alternative patching. Finally,
I'll reuse the arch_barrier_nospec_true implementation in the
evaluate_nospec macro.

Best,
Norbert

>
> Jan
>
>




* Re: [PATCH SpectreV1+L1TF v4 08/11] xen/evtchn: block speculative out-of-bound accesses
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 08/11] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
@ 2019-01-24 16:56   ` Jan Beulich
  2019-01-24 19:50     ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-24 16:56 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 23.01.19 at 12:57, <nmanthey@amazon.de> wrote:
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -368,8 +368,14 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
>      if ( virq_is_global(virq) && (vcpu != 0) )
>          return -EINVAL;
>  
> +   /*
> +    * Make sure the guest controlled value virq is bounded even during
> +    * speculative execution.
> +    */
> +    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));

I think this wants to move ahead of the if() in context, to be independent
of the particular implementation of virq_is_global() (the current shape of
which is mostly fine, perhaps with the exception of the risk of the compiler
translating the switch() there by way of a jump table). This also moves it
closer to the if() the construct is a companion to.

> @@ -816,6 +822,12 @@ int set_global_virq_handler(struct domain *d, uint32_t virq)
>      if (!virq_is_global(virq))
>          return -EINVAL;
>  
> +   /*
> +    * Make sure the guest controlled value virq is bounded even during
> +    * speculative execution.
> +    */
> +    virq = array_index_nospec(virq, ARRAY_SIZE(global_virq_handlers));

Same here then.

> @@ -931,7 +943,8 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>      struct evtchn *chn;
>      long           rc = 0;
>  
> -    if ( (vcpu_id >= d->max_vcpus) || (d->vcpu[vcpu_id] == NULL) )
> +    if ( (vcpu_id >= d->max_vcpus) ||
> +         (d->vcpu[array_index_nospec(vcpu_id, d->max_vcpus)] == NULL) )
>          return -ENOENT;
>  
>      spin_lock(&d->event_lock);
> @@ -969,8 +982,10 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>          unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
>          chn->notify_vcpu_id = vcpu_id;
>          pirq_set_affinity(d, chn->u.pirq.irq,
> -                          cpumask_of(d->vcpu[vcpu_id]->processor));
> -        link_pirq_port(port, chn, d->vcpu[vcpu_id]);
> +                          cpumask_of(d->vcpu[array_index_nospec(vcpu_id,
> +                                                                d->max_vcpus)]->processor));
> +        link_pirq_port(port, chn, d->vcpu[array_index_nospec(vcpu_id,
> +                                                             d->max_vcpus)]);

Using Andrew's new domain_vcpu() will improve readability, especially
after your change, quite a bit here. But of course code elsewhere will
benefit as well.

Jan




* Re: [PATCH SpectreV1+L1TF v4 08/11] xen/evtchn: block speculative out-of-bound accesses
  2019-01-24 16:56   ` Jan Beulich
@ 2019-01-24 19:50     ` Norbert Manthey
  2019-01-25  9:23       ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-24 19:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/24/19 17:56, Jan Beulich wrote:
>>>> On 23.01.19 at 12:57, <nmanthey@amazon.de> wrote:
>> --- a/xen/common/event_channel.c
>> +++ b/xen/common/event_channel.c
>> @@ -368,8 +368,14 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
>>      if ( virq_is_global(virq) && (vcpu != 0) )
>>          return -EINVAL;
>>  
>> +   /*
>> +    * Make sure the guest controlled value virq is bounded even during
>> +    * speculative execution.
>> +    */
>> +    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));
> I think this wants to move ahead of the if() in context, to be independent
> of the particular implementation of virq_is_global() (the current shape of
> which is mostly fine, perhaps with the exception of the risk of the compiler
> translating the switch() there by way of a jump table). This also moves it
> closer to the if() the construct is a companion to.
I understand the concern. However, because the value of virq would be
changed before the virq_is_global check, couldn't that result in
returning a wrong error code? The potential out-of-bound value is
brought back into the valid range, so that the above check might fire
incorrectly?
>
>> @@ -816,6 +822,12 @@ int set_global_virq_handler(struct domain *d, uint32_t virq)
>>      if (!virq_is_global(virq))
>>          return -EINVAL;
>>  
>> +   /*
>> +    * Make sure the guest controlled value virq is bounded even during
>> +    * speculative execution.
>> +    */
>> +    virq = array_index_nospec(virq, ARRAY_SIZE(global_virq_handlers));
> Same here then.
>
>> @@ -931,7 +943,8 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>>      struct evtchn *chn;
>>      long           rc = 0;
>>  
>> -    if ( (vcpu_id >= d->max_vcpus) || (d->vcpu[vcpu_id] == NULL) )
>> +    if ( (vcpu_id >= d->max_vcpus) ||
>> +         (d->vcpu[array_index_nospec(vcpu_id, d->max_vcpus)] == NULL) )
>>          return -ENOENT;
>>  
>>      spin_lock(&d->event_lock);
>> @@ -969,8 +982,10 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>>          unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
>>          chn->notify_vcpu_id = vcpu_id;
>>          pirq_set_affinity(d, chn->u.pirq.irq,
>> -                          cpumask_of(d->vcpu[vcpu_id]->processor));
>> -        link_pirq_port(port, chn, d->vcpu[vcpu_id]);
>> +                          cpumask_of(d->vcpu[array_index_nospec(vcpu_id,
>> +                                                                d->max_vcpus)]->processor));
>> +        link_pirq_port(port, chn, d->vcpu[array_index_nospec(vcpu_id,
>> +                                                             d->max_vcpus)]);
> Using Andrew's new domain_vcpu() will improve readability, especially
> after your change, quite a bit here. But of course code elsewhere will
> benefit as well.

You mean I should use the domain_vcpu function in both hunks, because
due to the first one, the latter can never return NULL? I will rebase
the series on top of this fresh change, and use the domain_vcpu function
for the locations where I bound a vcpu_id.

Best,
Norbert

>
> Jan
>
>





* Re: [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation
  2019-01-24 12:07     ` Norbert Manthey
@ 2019-01-24 20:33       ` Andrew Cooper
  2019-01-25  9:19         ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Andrew Cooper @ 2019-01-24 20:33 UTC (permalink / raw)
  To: Norbert Manthey, Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 24/01/2019 12:07, Norbert Manthey wrote:
> On 1/23/19 14:20, Jan Beulich wrote:
>>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>>> --- a/xen/include/xen/nospec.h
>>> +++ b/xen/include/xen/nospec.h
>>> @@ -58,6 +58,21 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>>>      (typeof(_i)) (_i & _mask);                                          \
>>>  })
>>>  
>>> +/*
>>> + * allow to insert a read memory barrier into conditionals
>>> + */
>>> +#ifdef CONFIG_X86
>>> +static inline bool lfence_true(void) { rmb(); return true; }
>>> +#else
>>> +static inline bool lfence_true(void) { return true; }
>>> +#endif
>>> +
>>> +/*
>>> + * protect evaluation of conditional with respect to speculation
>>> + */
>>> +#define evaluate_nospec(condition)                                      \
>>> +    (((condition) && lfence_true()) || !lfence_true())
>> It may be just me, but I think
>>
>> #define evaluate_nospec(condition)                                      \
>>     ((condition) ? lfence_true() : !lfence_true())
>>
>> would better express the two-way nature of this.
> I compared the binary output of the two variants, and they are the same
> (for my build environment). I'll switch to your variant, in case nobody
> objects.

Is it safe though?  The original variant is required by C to only
evaluate one of the lfence_true() blocks, whereas the second variation
could execute both of them and cmov the 1 and 0 together, which is wasteful.

~Andrew


* Re: SpectreV1+L1TF Patch Series
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (10 preceding siblings ...)
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 11/11] x86/CPUID: " Norbert Manthey
@ 2019-01-24 21:05 ` Andrew Cooper
  2019-01-28 13:56   ` Norbert Manthey
  2019-01-28  8:28 ` Jan Beulich
       [not found] ` <5C4EBD1A0200007800211954@suse.com>
  13 siblings, 1 reply; 150+ messages in thread
From: Andrew Cooper @ 2019-01-24 21:05 UTC (permalink / raw)
  To: Norbert Manthey, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel

On 23/01/2019 11:51, Norbert Manthey wrote:
> Dear all,
>
> This patch series attempts to mitigate the issues that have been raised in
> XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
> execution on Intel hardware, an lfence instruction is required to make sure
> that selected checks are not bypassed. Speculative out-of-bound accesses can
> be prevented by using the array_index_nospec macro.
>
> The lfence instruction should be added on x86 platforms only. To not affect
> platforms that are not affected by the L1TF vulnerability, the lfence
> instruction is patched in via alternative patching on Intel CPUs only.
> Furthermore, the compile time configuration allows choosing how to protect the
> evaluation of conditions with the lfence instruction.

Hello,

First of all, I've dusted off an old patch of mine and made it
speculatively safe.

https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=9e92acf1b752dfdfb294234b32d1fa9f55bfdc0f

Using the new domain_vcpu() helper should tidy up quite a few patches in
the series.


Next, to the ordering of patches.

Please introduce the Kconfig variable(s) first.  I'll follow up on that
thread about options.

Next, introduce a new synthetic feature bit to cause patching to occur,
and logic to trigger it in appropriate circumstances.  Look through the
history of include/asm-x86/cpufeatures.h to see some examples from the
previous speculative mitigation work.  In particular, you'll need a
command line parameter to control the use of this functionality when it
is compiled in.

Next, introduce eval_nospec().  To avoid interfering with other
architectures, you probably want something like this:

xen/nospec.h contains:

/*
 * Evaluate a condition in a speculation-safe way.
 * Stub implementation for builds which don't care.
 */
#ifndef eval_nospec
#define eval_nospec(x) (x)
#endif

and something containing x86's implementation.  TBH, I personally think
asm/nospec.h is overdue for introducing now.

~Andrew


* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option Norbert Manthey
  2019-01-23 13:18   ` Jan Beulich
  2019-01-23 13:24   ` Julien Grall
@ 2019-01-24 21:29   ` Andrew Cooper
  2019-01-25 10:14     ` Jan Beulich
  2 siblings, 1 reply; 150+ messages in thread
From: Andrew Cooper @ 2019-01-24 21:29 UTC (permalink / raw)
  To: Norbert Manthey, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel

On 23/01/2019 11:51, Norbert Manthey wrote:
> This commit introduces the configuration option L1TF_LFENCE that allows
> controlling the implementation of the protection of privilege checks via
> lfence instructions. The following four alternatives are provided:
>
>  - do not inject lfence instructions
>  - inject an lfence instruction for both outcomes of the conditional
>  - inject an lfence instruction only if the conditional would evaluate
>    to true, so that this case cannot be entered under speculation
>  - evaluate the condition and store the result into a local variable;
>    before using this value, inject an lfence instruction.

Can we take a step back and think about what is going on here.

TBH, I'm dubious of the overall utility of this option.  Either people
value their security and disabled HT for L1TF, or they opted for
performance instead.  However, I accept that there are plenty of people
who are playing fast and loose, for whom perhaps this series is an
intermediate they'd choose.

That is not to say I agree with the reasoning which led to that choice,
but that having such an option available in some form in Xen is likely
to be useful to some people.

So, getting to the people playing fast and loose with their security... 
They will be wanting this to gain some security, at a perf expense which
they expect to be far less than disabling HT.


In practice, you generally only need a fence for one of the two basic
blocks following a conditional, but because we are not the compiler, we
cannot evaluate which side is safe at compile time, especially as the
compiler is able to optimise sub expressions.

Therefore, the "fence on true" or "fence on false" options are fairly
useless.  On average, they will "fix" 50% of the conditional branches,
but you won't know which until you disassemble the resulting binary.

Worse is the "evaluate condition, stash result, fence, use variable"
option, which is almost completely useless.  If you work out the
resulting instruction stream, you'll have a conditional expression
calculated down into a register, then a fence, then a test register and
conditional jump into one of two basic blocks.  This takes the perf hit,
and doesn't protect either of the basic blocks for speculative
mis-execution.

The only one of these options I see which has any value is the fence on
both sides of the condition, because it is the only one which
meaningfully improves the security of the resulting binary.

~Andrew


* Re: [PATCH SpectreV1+L1TF v4 07/11] nospec: enable lfence on Intel
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 07/11] nospec: enable lfence on Intel Norbert Manthey
@ 2019-01-24 22:29   ` Andrew Cooper
  2019-01-27 20:09     ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Andrew Cooper @ 2019-01-24 22:29 UTC (permalink / raw)
  To: Norbert Manthey, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Ian Jackson, Tim Deegan, Dario Faggioli,
	Martin Pohlack, Julien Grall, Bjoern Doebel, Jan Beulich,
	Martin Mazein, Julian Stecklina, David Woodhouse

On 23/01/2019 11:57, Norbert Manthey wrote:
> While the lfence instruction was added for all x86 platforms in the
> beginning, it's useful to not block platforms that are not affected
> by the L1TF vulnerability. Therefore, the lfence instruction should
> only be introduced, in case the current CPU is an Intel CPU that is
> capable of hyper threading. This combination of features is added
> to the features that activate the alternative.
>
> This commit is part of the SpectreV1+L1TF mitigation patch series.
>
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>
> ---
>  xen/include/xen/nospec.h | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
> --- a/xen/include/xen/nospec.h
> +++ b/xen/include/xen/nospec.h
> @@ -7,6 +7,7 @@
>  #ifndef XEN_NOSPEC_H
>  #define XEN_NOSPEC_H
>  
> +#include <asm/alternative.h>
>  #include <asm/system.h>
>  
>  /**
> @@ -68,7 +69,10 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>   * allow to insert a read memory barrier into conditionals
>   */
>  #ifdef CONFIG_X86
> -static inline bool lfence_true(void) { rmb(); return true; }
> +static inline bool lfence_true(void) {
> +    alternative("", "lfence", X86_VENDOR_INTEL);

This doesn't do what you expect.  It will cause the lfences to be
patched into existence on any hardware with an FPU (before a recent
patch of mine) or with VME (after a recent patch).

~Andrew


* Re: [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation
  2019-01-24 20:33       ` Andrew Cooper
@ 2019-01-25  9:19         ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-01-25  9:19 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

>>> On 24.01.19 at 21:33, <andrew.cooper3@citrix.com> wrote:
> On 24/01/2019 12:07, Norbert Manthey wrote:
>> On 1/23/19 14:20, Jan Beulich wrote:
>>>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>>>> --- a/xen/include/xen/nospec.h
>>>> +++ b/xen/include/xen/nospec.h
>>>> @@ -58,6 +58,21 @@ static inline unsigned long 
> array_index_mask_nospec(unsigned long index,
>>>>      (typeof(_i)) (_i & _mask);                                          \
>>>>  })
>>>>  
>>>> +/*
>>>> + * allow to insert a read memory barrier into conditionals
>>>> + */
>>>> +#ifdef CONFIG_X86
>>>> +static inline bool lfence_true(void) { rmb(); return true; }
>>>> +#else
>>>> +static inline bool lfence_true(void) { return true; }
>>>> +#endif
>>>> +
>>>> +/*
>>>> + * protect evaluation of conditional with respect to speculation
>>>> + */
>>>> +#define evaluate_nospec(condition)                                      \
>>>> +    (((condition) && lfence_true()) || !lfence_true())
>>> It may be just me, but I think
>>>
>>> #define evaluate_nospec(condition)                                      \
>>>     ((condition) ? lfence_true() : !lfence_true())
>>>
>>> would better express the two-way nature of this.
>> I compared the binary output of the two variants, and they are the same
>> (for my build environment). I'll switch to your variant, in case nobody
>> objects.
> 
> Is it safe though?  The original variant is required by C to only
> evaluate one of the lfence_true() blocks, whereas the second variation
> could execute both of them and cmov the 1 and 0 together, which is wasteful.

For one I don't understand the connection between the initial
question and the explanation boiling down to a performance
concern. But I think I see what safety concern you may have.

And then I'm having difficulty following your interpretation of
what evaluation requirements C imposes: &&, ||, and ?: are all
sequence points. I'm inferring from this that, as long as the
evaluation of the expressions has no side effects, it can
happen irrespective of source arrangements, and in particular
the compiler could translate both into exactly the same code.

Neither variant excludes the asm() getting moved by the
compiler anyway, despite the volatile - this is what the gcc doc
has to say on the topic: "Note that the compiler can move even
volatile asm instructions relative to other code, including across
jump instructions." It then goes on to explain how this can be
improved in source; I wonder whether we may want to follow
that advice and add a dependency on the calculated branch
condition. But of course this may further impact performance.

What's worse, "Under certain circumstances, GCC may duplicate
(or remove duplicates of) your assembly code when optimizing"
suggests to me that neither of the two variants are really safe
from getting converted to code actually matching the behavior
of L1TF_LFENCE_INTERMEDIATE. Do we perhaps need to
further complicate things and use (using naming derived from
the current version)

static inline bool lfence_bool(bool cond) {
    asm volatile ( "lfence" : "+X" (cond) );
    return cond;
}

#define evaluate_nospec(condition) ({ \
    bool cond_ = (condition); \
    (cond_) ? lfence_bool(cond_) : !lfence_bool(cond_); \
})

Jan




* Re: [PATCH SpectreV1+L1TF v4 08/11] xen/evtchn: block speculative out-of-bound accesses
  2019-01-24 19:50     ` Norbert Manthey
@ 2019-01-25  9:23       ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-01-25  9:23 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 24.01.19 at 20:50, <nmanthey@amazon.de> wrote:
> On 1/24/19 17:56, Jan Beulich wrote:
>>>>> On 23.01.19 at 12:57, <nmanthey@amazon.de> wrote:
>>> --- a/xen/common/event_channel.c
>>> +++ b/xen/common/event_channel.c
>>> @@ -368,8 +368,14 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
>>>      if ( virq_is_global(virq) && (vcpu != 0) )
>>>          return -EINVAL;
>>>  
>>> +   /*
>>> +    * Make sure the guest controlled value virq is bounded even during
>>> +    * speculative execution.
>>> +    */
>>> +    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));
>> I think this wants to move ahead of the if() in context, to be independent
>> of the particular implementation of virq_is_global() (the current shape of
>> which is mostly fine, perhaps with the exception of the risk of the compiler
>> translating the switch() there by way of a jump table). This also moves it
>> closer to the if() the construct is a companion to.
> I understand the concern. However, because the value of virq would be
> changed before the virq_is_global check, couldn't that result in
> returning a wrong error code? The potential out-of-bound value is
> brought back into the valid range, so that the above check might fire
> incorrectly?

No - an incorrect (out of bounds) value making it into virq_is_global()
is possible during mis-speculation only anyway. Out of range values,
for the purpose of architecturally visible state, get rejected by the
first if(). In range values won't be altered by array_index_nospec().

>>> @@ -931,7 +943,8 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>>>      struct evtchn *chn;
>>>      long           rc = 0;
>>>  
>>> -    if ( (vcpu_id >= d->max_vcpus) || (d->vcpu[vcpu_id] == NULL) )
>>> +    if ( (vcpu_id >= d->max_vcpus) ||
>>> +         (d->vcpu[array_index_nospec(vcpu_id, d->max_vcpus)] == NULL) )
>>>          return -ENOENT;
>>>  
>>>      spin_lock(&d->event_lock);
>>> @@ -969,8 +982,10 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>>>          unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
>>>          chn->notify_vcpu_id = vcpu_id;
>>>          pirq_set_affinity(d, chn->u.pirq.irq,
>>> -                          cpumask_of(d->vcpu[vcpu_id]->processor));
>>> -        link_pirq_port(port, chn, d->vcpu[vcpu_id]);
>>> +                          cpumask_of(d->vcpu[array_index_nospec(vcpu_id,
>>> +                                                                d->max_vcpus)]->processor));
>>> +        link_pirq_port(port, chn, d->vcpu[array_index_nospec(vcpu_id,
>>> +                                                             d->max_vcpus)]);
>> Using Andrew's new domain_vcpu() will improve readability, especially
>> after your change, quite a bit here. But of course code elsewhere will
>> benefit as well.
> 
> You mean I should use the domain_vcpu function in both hunks, because
> due to the first one, the latter can never return NULL? I will rebase
> the series on top of this fresh change, and use the domain_vcpu function
> for the locations where I bound a vcpu_id.

Thanks - that's why Andrew had dusted off this old change of his.

Jan




* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-24 21:29   ` Andrew Cooper
@ 2019-01-25 10:14     ` Jan Beulich
  2019-01-25 10:50       ` Norbert Manthey
  2019-01-31 22:39       ` Andrew Cooper
  0 siblings, 2 replies; 150+ messages in thread
From: Jan Beulich @ 2019-01-25 10:14 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
> Worse is the "evaluate condition, stash result, fence, use variable"
> option, which is almost completely useless.  If you work out the
> resulting instruction stream, you'll have a conditional expression
> calculated down into a register, then a fence, then a test register and
> conditional jump into one of two basic blocks.  This takes the perf hit,
> and doesn't protect either of the basic blocks for speculative
> mis-execution.

How does it not protect anything? It shrinks the speculation window
to just the register test and conditional branch, which ought to be
far smaller than that behind a memory access which fails to hit any
of the caches (and perhaps even any of the TLBs). This is all the more
so as LFENCE specifically does not prevent insn fetching from
continuing.

That said I agree that the LFENCE would better sit between the
register test and the conditional branch, but as we've said so many
times before - this can't be achieved without compiler support. It's
sad enough that the default "cc" clobber of asm()-s on x86 alone
prevents this from possibly working, while my over four year old
patch to add a means to avoid this has not seen sufficient
comments to get it into some hopefully acceptable shape, but also
has not been approved as is.

Then again, following an earlier reply of mine on another sub-
thread, nothing really prevents the compiler from moving ahead
and folding the two LFENCEs of the "both branches" model into
one. It just so happens that apparently right now this never
occurs (assuming Norbert has done full generated code analysis
to confirm the intended placement).

Jan




* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-25 10:14     ` Jan Beulich
@ 2019-01-25 10:50       ` Norbert Manthey
  2019-01-25 13:09         ` Jan Beulich
  2019-01-31 22:39       ` Andrew Cooper
  1 sibling, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-25 10:50 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/25/19 11:14, Jan Beulich wrote:
>>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
>> Worse is the "evaluate condition, stash result, fence, use variable"
>> option, which is almost completely useless.  If you work out the
>> resulting instruction stream, you'll have a conditional expression
>> calculated down into a register, then a fence, then a test register and
>> conditional jump into one of two basic blocks.  This takes the perf hit,
>> and doesn't protect either of the basic blocks for speculative
>> mis-execution.
> How does it not protect anything? It shrinks the speculation window
> to just the register test and conditional branch, which ought to be
> far smaller than that behind a memory access which fails to hit any
> of the caches (and perhaps even any of the TLBs). This is all the more
> so as LFENCE specifically does not prevent insn fetching from
> continuing.
>
> That said I agree that the LFENCE would better sit between the
> register test and the conditional branch, but as we've said so many
> times before - this can't be achieved without compiler support. It's
> said enough that the default "cc" clobber of asm()-s on x86 alone
> prevents this from possibly working, while my over four year old
> patch to add a means to avoid this has not seen sufficient
> comments to get it into some hopefully acceptable shape, but also
> has not been approved as is.
>
> Then again, following an earlier reply of mine on another sub-
> thread, nothing really prevents the compiler from moving ahead
> and folding the two LFENCEs of the "both branches" model into
> one. It just so happens that apparently right now this never
> occurs (assuming Norbert has done full generated code analysis
> to confirm the intended placement).

I am happy to jump back to my earlier version without a configuration
option to protect both branches with an lfence instruction, using logic
operators. For this version, I actually looked into the object dump and
checked at various locations that the lfence statement was added for
both blocks after the jump instruction. So, the compiler I used did
not move the lfence instructions before the jump instruction and merge
them. I actually hope that the lazy evaluation of logic prevents the
compiler from doing so.

A note on performance: I created a set of micro benchmarks that call
certain hypercall+command pairs in a tight loop many times. These
hypercalls target locations I modified with this patch series. The
current state of testing shows that in the worst case the full series
adds at most 3% runtime (relative to what the same hypercall took before
the modification). The testing used the evaluate_nospec implementation
that protects both branches via logic operators. Given that those are
micro benchmarks, I expect the impact for usual user work loads is even
lower, but I did not measure any userland benchmarks yet. In case you
point me to performance tests you typically use, I can also look into
that. Thanks!

Best,
Norbert

>
> Jan
>
>





* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-25 10:50       ` Norbert Manthey
@ 2019-01-25 13:09         ` Jan Beulich
  2019-01-27 20:28           ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-25 13:09 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 25.01.19 at 11:50, <nmanthey@amazon.de> wrote:
> On 1/25/19 11:14, Jan Beulich wrote:
>>>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
>>> Worse is the "evaluate condition, stash result, fence, use variable"
>>> option, which is almost completely useless.  If you work out the
>>> resulting instruction stream, you'll have a conditional expression
>>> calculated down into a register, then a fence, then a test register and
>>> conditional jump into one of two basic blocks.  This takes the perf hit,
>>> and doesn't protect either of the basic blocks for speculative
>>> mis-execution.
>> How does it not protect anything? It shrinks the speculation window
>> to just the register test and conditional branch, which ought to be
>> far smaller than that behind a memory access which fails to hit any
>> of the caches (and perhaps even any of the TLBs). This is all the more
>> so as LFENCE specifically does not prevent insn fetching from
>> continuing.
>>
>> That said I agree that the LFENCE would better sit between the
>> register test and the conditional branch, but as we've said so many
>> times before - this can't be achieved without compiler support. It's
>> sad enough that the default "cc" clobber of asm()-s on x86 alone
>> prevents this from possibly working, while my over four year old
>> patch to add a means to avoid this has not seen sufficient
>> comments to get it into some hopefully acceptable shape, but also
>> has not been approved as is.
>>
>> Then again, following an earlier reply of mine on another sub-
>> thread, nothing really prevents the compiler from moving ahead
>> and folding the two LFENCEs of the "both branches" model into
>> one. It just so happens that apparently right now this never
>> occurs (assuming Norbert has done full generated code analysis
>> to confirm the intended placement).
> 
> I am happy to jump back to my earlier version without a configuration
> option to protect both branches with a lfence instruction, using logic
> operators.

I don't understand this, I'm afraid: What I've said was to support
my thinking of the && + || variant being identical in code and risk
to the one using ?: . I.e. I'm not asking you to switch back.

Jan




* Re: [PATCH SpectreV1+L1TF v4 09/11] x86/vioapic: block speculative out-of-bound accesses
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 09/11] x86/vioapic: " Norbert Manthey
@ 2019-01-25 16:34   ` Jan Beulich
  2019-01-28 11:03     ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-25 16:34 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 23.01.19 at 12:57, <nmanthey@amazon.de> wrote:
> @@ -66,6 +67,9 @@ static struct hvm_vioapic *gsi_vioapic(const struct domain *d,
>  {
>      unsigned int i;
>  
> +    /* Make sure the compiler does not optimize the initialization */
> +    OPTIMIZER_HIDE_VAR(pin);

Since there's no initialization here, I think it would help to add "done
in the callers". Perhaps also "optimize away" or "delete"?

And then I think you mean *pin.

> @@ -212,7 +217,12 @@ static void vioapic_write_redirent(
>      struct hvm_irq *hvm_irq = hvm_domain_irq(d);
>      union vioapic_redir_entry *pent, ent;
>      int unmasked = 0;
> -    unsigned int gsi = vioapic->base_gsi + idx;
> +    unsigned int gsi;
> +
> +    /* Make sure no out-of-bound value for idx can be used */
> +    idx = array_index_nospec(idx, vioapic->nr_pins);
> +
> +    gsi = vioapic->base_gsi + idx;

I dislike the disconnect from the respective bounds check: There's
only one caller, so the construct could be moved there, or
otherwise I'd like to see an ASSERT() added documenting that the
bounds check is expected to have happened in the caller.

> @@ -378,7 +388,8 @@ static inline int pit_channel0_enabled(void)
>  
>  static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
>  {
> -    uint16_t dest = vioapic->redirtbl[pin].fields.dest_id;
> +    uint16_t dest = vioapic->redirtbl
> +               [pin = array_index_nospec(pin, vioapic->nr_pins)].fields.dest_id;
>      uint8_t dest_mode = vioapic->redirtbl[pin].fields.dest_mode;
>      uint8_t delivery_mode = vioapic->redirtbl[pin].fields.delivery_mode;
>      uint8_t vector = vioapic->redirtbl[pin].fields.vector;

I'm sorry, but despite prior discussions I'm still not happy about
this change - all of the callers pass known good values:
- vioapic_write_redirent() gets adjusted above,
- vioapic_irq_positive_edge() gets the value passed into here
  from gsi_vioapic(), which you also take care of,
- vioapic_update_EOI() loops over all pins, so only passes in-
  range values.
Therefore I still don't see what protection this change adds.
As per above, if it was to stay, some sort of connection to the
range check(s) it guards would otherwise be nice to establish,
but I realize that adding an ASSERT() here would go against
a certain aspect of review comments I gave on earlier versions.

> @@ -463,7 +474,7 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
>  
>  void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
>  {
> -    unsigned int pin;
> +    unsigned int pin = 0; /* See gsi_vioapic */
>      struct hvm_vioapic *vioapic = gsi_vioapic(d, irq, &pin);
>      union vioapic_redir_entry *ent;
>  
> @@ -560,7 +571,7 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
>  
>  int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi)
>  {
> -    unsigned int pin;
> +    unsigned int pin = 0; /* See gsi_vioapic */
>      const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
>  
>      if ( !vioapic )

Since there are more callers of gsi_vioapic(), justification should be
added to the description why only some need adjustment (or
otherwise, just to be on the safe side as well as for consistency
all of them should be updated, in which case it would still be nice
to call out the ones which really [don't] need updating).

Jan




* Re: [PATCH SpectreV1+L1TF v4 10/11] x86/hvm/hpet: block speculative out-of-bound accesses
  2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 10/11] x86/hvm/hpet: " Norbert Manthey
@ 2019-01-25 16:50   ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-01-25 16:50 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 23.01.19 at 12:57, <nmanthey@amazon.de> wrote:
> When interacting with hpet, read and write operations can be executed
> during instruction emulation, where the guest controls the data that
> is used. As it is hard to predict the number of instructions that are
> executed speculatively, we prevent out-of-bound accesses by using the
> array_index_nospec function for guest specified addresses that should
> be used for hpet operations.
> 
> This commit is part of the SpectreV1+L1TF mitigation patch series.
> 
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one further remark:

> @@ -523,7 +526,7 @@ static int hpet_write(
>      case HPET_Tn_ROUTE(0):
>      case HPET_Tn_ROUTE(1):
>      case HPET_Tn_ROUTE(2):
> -        tn = HPET_TN(ROUTE, addr);
> +        tn = array_index_nospec(HPET_TN(ROUTE, addr), ARRAY_SIZE(h->hpet.timers));
>          h->hpet.timers[tn].fsb = new_val;
>          break;

This one, unlike the other two in this function, would be a fair
candidate for use of array_access_nospec() - tn is used just
once here.

Jan




* Re: [PATCH SpectreV1+L1TF v4 07/11] nospec: enable lfence on Intel
  2019-01-24 22:29   ` Andrew Cooper
@ 2019-01-27 20:09     ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-27 20:09 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Ian Jackson, Tim Deegan, Dario Faggioli,
	Martin Pohlack, Julien Grall, Bjoern Doebel, Jan Beulich,
	Martin Mazein, Julian Stecklina, David Woodhouse

On 1/24/19 23:29, Andrew Cooper wrote:
> On 23/01/2019 11:57, Norbert Manthey wrote:
>> While the lfence instruction was added for all x86 platform in the
>> beginning, it's useful to not block platforms that are not affected
>> by the L1TF vulnerability. Therefore, the lfence instruction should
>> only be introduced, in case the current CPU is an Intel CPU that is
>> capable of hyper threading. This combination of features is added
>> to the features that activate the alternative.
>>
>> This commit is part of the SpectreV1+L1TF mitigation patch series.
>>
>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>>
>> ---
>>  xen/include/xen/nospec.h | 8 ++++++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
>> --- a/xen/include/xen/nospec.h
>> +++ b/xen/include/xen/nospec.h
>> @@ -7,6 +7,7 @@
>>  #ifndef XEN_NOSPEC_H
>>  #define XEN_NOSPEC_H
>>  
>> +#include <asm/alternative.h>
>>  #include <asm/system.h>
>>  
>>  /**
>> @@ -68,7 +69,10 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>>   * allow to insert a read memory barrier into conditionals
>>   */
>>  #ifdef CONFIG_X86
>> -static inline bool lfence_true(void) { rmb(); return true; }
>> +static inline bool lfence_true(void) {
>> +    alternative("", "lfence", X86_VENDOR_INTEL);
> This doesn't do what you expect.  It will cause the lfences to be
> patched into existence on any hardware with an FPU (before a recent
> patch of mine) or with VME (after a recent patch).

After looking more into this, I would introduce another synthesized CPU
feature flag, so that alternative patching can use this flag to patch
in the lfence in case the detected platform is vulnerable to L1TF. I
would set this flag based on whether an L1TF-vulnerable platform is
detected and no command line option prevents this. Is this what you
envision, or am I missing something?
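As a sketch of that plan (all names here are hypothetical, and a runtime
flag stands in for what Xen would do via a synthetic feature bit plus
boot-time alternative patching):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-in for a synthesized X86_BUG_L1TF-style feature bit: set once
 * at boot when the CPU is vulnerable and the (hypothetical) command
 * line option has not disabled the mitigation.
 */
static bool cpu_has_bug_l1tf;

static void init_speculation_mitigations(bool vendor_intel,
                                         bool rdcl_no,     /* HW says not vulnerable */
                                         bool opt_barrier) /* command line control */
{
    cpu_has_bug_l1tf = vendor_intel && !rdcl_no && opt_barrier;
}

/*
 * In Xen this would be alternative("", "lfence", X86_BUG_L1TF); here a
 * runtime branch models the patched/not-patched outcomes.
 */
static inline bool lfence_true(void)
{
    if ( cpu_has_bug_l1tf )
#if defined(__x86_64__) || defined(__i386__)
        __asm__ __volatile__ ( "lfence" ::: "memory" );
#else
        __asm__ __volatile__ ( "" ::: "memory" );
#endif
    return true;
}
```

The point of the feature bit is that hardware which reports itself as not vulnerable, or a system where the administrator disabled the mitigation, never pays for the fence at all.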

Best,
Norbert

>
> ~Andrew




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-25 13:09         ` Jan Beulich
@ 2019-01-27 20:28           ` Norbert Manthey
  2019-01-28  7:35             ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-27 20:28 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/25/19 14:09, Jan Beulich wrote:
>>>> On 25.01.19 at 11:50, <nmanthey@amazon.de> wrote:
>> On 1/25/19 11:14, Jan Beulich wrote:
>>>>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
>>>> Worse is the "evaluate condition, stash result, fence, use variable"
>>>> option, which is almost completely useless.  If you work out the
>>>> resulting instruction stream, you'll have a conditional expression
>>>> calculated down into a register, then a fence, then a test register and
>>>> conditional jump into one of two basic blocks.  This takes the perf hit,
>>>> and doesn't protect either of the basic blocks for speculative
>>>> mis-execution.
>>> How does it not protect anything? It shrinks the speculation window
>>> to just the register test and conditional branch, which ought to be
>>> far smaller than that behind a memory access which fails to hit any
>>> of the caches (and perhaps even any of the TLBs). This is the more
>>> that LFENCE does specifically not prevent insn fetching from
>>> continuing.
>>>
>>> That said I agree that the LFENCE would better sit between the
>>> register test and the conditional branch, but as we've said so many
>>> times before - this can't be achieved without compiler support. It's
>>> said enough that the default "cc" clobber of asm()-s on x86 alone
>>> prevents this from possibly working, while my over four year old
>>> patch to add a means to avoid this has not seen sufficient
>>> comments to get it into some hopefully acceptable shape, but also
>>> has not been approved as is.
>>>
>>> Then again, following an earlier reply of mine on another sub-
>>> thread, nothing really prevents the compiler from moving ahead
>>> and folding the two LFENCEs of the "both branches" model into
>>> one. It just so happens that apparently right now this never
>>> occurs (assuming Norbert has done full generated code analysis
>>> to confirm the intended placement).
>> I am happy to jump back to my earlier version without a configuration
>> option to protect both branches with a lfence instruction, using logic
>> operators.
> I don't understand this, I'm afraid: What I've said was to support
> my thinking of the && + || variant being identical in code and risk
> to that using ?: . I.e. I'm not asking you to switch back.

I understand that you did not ask. However, Andrew raised concerns, and
I analyzed the binary output for the variant with logical operators.
Hence, I'd like to keep that variant with the logical operators.
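For concreteness, the two shapes being compared can be modeled in user
space with a compiler barrier standing in for the lfence (a sketch; the
macro names are hypothetical, and the real series wraps the fence in an
alternative so it is only patched in on affected hardware):

```c
#include <assert.h>
#include <stdbool.h>

/* compiler barriers standing in for lfence on x86 */
static inline bool barrier_true(void)
{
    __asm__ __volatile__ ( "" ::: "memory" );
    return true;
}

static inline bool barrier_false(void)
{
    __asm__ __volatile__ ( "" ::: "memory" );
    return false;
}

/* logical-operator variant: the && arm fences the true path, the ||
 * arm fences the false path */
#define evaluate_nospec_logical(cond) \
    (((cond) && barrier_true()) || barrier_false())

/* ternary variant: one fence per branch, same truth table */
#define evaluate_nospec_ternary(cond) \
    ((cond) ? barrier_true() : barrier_false())
```

Both reduce to the truth value of cond with a barrier evaluated on whichever branch is taken; the open question in the thread is only whether the compiler keeps each barrier inside its branch, which is why inspecting the generated code matters.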

Best,
Norbert

>
> Jan
>
>






* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-27 20:28           ` Norbert Manthey
@ 2019-01-28  7:35             ` Jan Beulich
  2019-01-28  7:56               ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-28  7:35 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 27.01.19 at 21:28, <nmanthey@amazon.de> wrote:
> On 1/25/19 14:09, Jan Beulich wrote:
>>>>> On 25.01.19 at 11:50, <nmanthey@amazon.de> wrote:
>>> On 1/25/19 11:14, Jan Beulich wrote:
>>>>>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
>>>>> Worse is the "evaluate condition, stash result, fence, use variable"
>>>>> option, which is almost completely useless.  If you work out the
>>>>> resulting instruction stream, you'll have a conditional expression
>>>>> calculated down into a register, then a fence, then a test register and
>>>>> conditional jump into one of two basic blocks.  This takes the perf hit,
>>>>> and doesn't protect either of the basic blocks for speculative
>>>>> mis-execution.
>>>> How does it not protect anything? It shrinks the speculation window
>>>> to just the register test and conditional branch, which ought to be
>>>> far smaller than that behind a memory access which fails to hit any
>>>> of the caches (and perhaps even any of the TLBs). This is the more
>>>> that LFENCE does specifically not prevent insn fetching from
>>>> continuing.
>>>>
>>>> That said I agree that the LFENCE would better sit between the
>>>> register test and the conditional branch, but as we've said so many
>>>> times before - this can't be achieved without compiler support. It's
>>>> said enough that the default "cc" clobber of asm()-s on x86 alone
>>>> prevents this from possibly working, while my over four year old
>>>> patch to add a means to avoid this has not seen sufficient
>>>> comments to get it into some hopefully acceptable shape, but also
>>>> has not been approved as is.
>>>>
>>>> Then again, following an earlier reply of mine on another sub-
>>>> thread, nothing really prevents the compiler from moving ahead
>>>> and folding the two LFENCEs of the "both branches" model into
>>>> one. It just so happens that apparently right now this never
>>>> occurs (assuming Norbert has done full generated code analysis
>>>> to confirm the intended placement).
>>> I am happy to jump back to my earlier version without a configuration
>>> option to protect both branches with a lfence instruction, using logic
>>> operators.
>> I don't understand this, I'm afraid: What I've said was to support
>> my thinking of the && + || variant being identical in code and risk
>> to that using ?: . I.e. I'm not asking you to switch back.
> 
> I understand that you did not ask. However, Andrew raised concerns, and
> I analyzed the binary output for the variant with logical operators.
> Hence, I'd like to keep that variant with the logical operators.

But didn't you say earlier that there was no difference in generated
code between the two variants?

Jan





* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-28  7:35             ` Jan Beulich
@ 2019-01-28  7:56               ` Norbert Manthey
  2019-01-28  8:24                 ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-28  7:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/28/19 08:35, Jan Beulich wrote:
>>>> On 27.01.19 at 21:28, <nmanthey@amazon.de> wrote:
>> On 1/25/19 14:09, Jan Beulich wrote:
>>>>>> On 25.01.19 at 11:50, <nmanthey@amazon.de> wrote:
>>>> On 1/25/19 11:14, Jan Beulich wrote:
>>>>>>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
>>>>>> Worse is the "evaluate condition, stash result, fence, use variable"
>>>>>> option, which is almost completely useless.  If you work out the
>>>>>> resulting instruction stream, you'll have a conditional expression
>>>>>> calculated down into a register, then a fence, then a test register and
>>>>>> conditional jump into one of two basic blocks.  This takes the perf hit,
>>>>>> and doesn't protect either of the basic blocks for speculative
>>>>>> mis-execution.
>>>>> How does it not protect anything? It shrinks the speculation window
>>>>> to just the register test and conditional branch, which ought to be
>>>>> far smaller than that behind a memory access which fails to hit any
>>>>> of the caches (and perhaps even any of the TLBs). This is the more
>>>>> that LFENCE does specifically not prevent insn fetching from
>>>>> continuing.
>>>>>
>>>>> That said I agree that the LFENCE would better sit between the
>>>>> register test and the conditional branch, but as we've said so many
>>>>> times before - this can't be achieved without compiler support. It's
>>>>> said enough that the default "cc" clobber of asm()-s on x86 alone
>>>>> prevents this from possibly working, while my over four year old
>>>>> patch to add a means to avoid this has not seen sufficient
>>>>> comments to get it into some hopefully acceptable shape, but also
>>>>> has not been approved as is.
>>>>>
>>>>> Then again, following an earlier reply of mine on another sub-
>>>>> thread, nothing really prevents the compiler from moving ahead
>>>>> and folding the two LFENCEs of the "both branches" model into
>>>>> one. It just so happens that apparently right now this never
>>>>> occurs (assuming Norbert has done full generated code analysis
>>>>> to confirm the intended placement).
>>>> I am happy to jump back to my earlier version without a configuration
>>>> option to protect both branches with a lfence instruction, using logic
>>>> operators.
>>> I don't understand this, I'm afraid: What I've said was to support
>>> my thinking of the && + || variant being identical in code and risk
>>> to that using ?: . I.e. I'm not asking you to switch back.
>> I understand that you did not ask. However, Andrew raised concerns, and
>> I analyzed the binary output for the variant with logical operators.
>> Hence, I'd like to keep that variant with the logical operators.
> But didn't you say earlier that there was no difference in generated
> code between the two variants?

Yes, for the current commit, and for the one compiler I used. Personally,
I prefer the logical-operator variant. You seem to prefer the ternary
variant, and Andrew at least raised concerns there. I would really like
to move forward somehow, but currently it is not really clear how to
achieve that.

I will try to apply a majority vote to each hunk that has been commented
on and create a v5 of the series. I am even thinking about separating
the introduction of eval_nospec and the arch_nospec_barrier macro into
another series, to move faster with the array_index_nospec-based changes
first. Guidance is very welcome.

Best,
Norbert

>
> Jan
>
>







* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-28  7:56               ` Norbert Manthey
@ 2019-01-28  8:24                 ` Jan Beulich
  2019-01-28 10:07                   ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-28  8:24 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 28.01.19 at 08:56, <nmanthey@amazon.de> wrote:
> On 1/28/19 08:35, Jan Beulich wrote:
>>>>> On 27.01.19 at 21:28, <nmanthey@amazon.de> wrote:
>>> On 1/25/19 14:09, Jan Beulich wrote:
>>>>>>> On 25.01.19 at 11:50, <nmanthey@amazon.de> wrote:
>>>>> On 1/25/19 11:14, Jan Beulich wrote:
>>>>>>>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
>>>>>>> Worse is the "evaluate condition, stash result, fence, use variable"
>>>>>>> option, which is almost completely useless.  If you work out the
>>>>>>> resulting instruction stream, you'll have a conditional expression
>>>>>>> calculated down into a register, then a fence, then a test register and
>>>>>>> conditional jump into one of two basic blocks.  This takes the perf hit,
>>>>>>> and doesn't protect either of the basic blocks for speculative
>>>>>>> mis-execution.
>>>>>> How does it not protect anything? It shrinks the speculation window
>>>>>> to just the register test and conditional branch, which ought to be
>>>>>> far smaller than that behind a memory access which fails to hit any
>>>>>> of the caches (and perhaps even any of the TLBs). This is the more
>>>>>> that LFENCE does specifically not prevent insn fetching from
>>>>>> continuing.
>>>>>>
>>>>>> That said I agree that the LFENCE would better sit between the
>>>>>> register test and the conditional branch, but as we've said so many
>>>>>> times before - this can't be achieved without compiler support. It's
>>>>>> said enough that the default "cc" clobber of asm()-s on x86 alone
>>>>>> prevents this from possibly working, while my over four year old
>>>>>> patch to add a means to avoid this has not seen sufficient
>>>>>> comments to get it into some hopefully acceptable shape, but also
>>>>>> has not been approved as is.
>>>>>>
>>>>>> Then again, following an earlier reply of mine on another sub-
>>>>>> thread, nothing really prevents the compiler from moving ahead
>>>>>> and folding the two LFENCEs of the "both branches" model into
>>>>>> one. It just so happens that apparently right now this never
>>>>>> occurs (assuming Norbert has done full generated code analysis
>>>>>> to confirm the intended placement).
>>>>> I am happy to jump back to my earlier version without a configuration
>>>>> option to protect both branches with a lfence instruction, using logic
>>>>> operators.
>>>> I don't understand this, I'm afraid: What I've said was to support
>>>> my thinking of the && + || variant being identical in code and risk
>>>> to that using ?: . I.e. I'm not asking you to switch back.
>>> I understand that you did not ask. However, Andrew raised concerns, and
>>> I analyzed the binary output for the variant with logical operators.
>>> Hence, I'd like to keep that variant with the logical operators.
>> But didn't you say earlier that there was no difference in generated
>> code between the two variants?
> 
> Yes, for the current commit, and for the one compiler I used. Personally,
> I prefer the logical-operator variant. You seem to prefer the ternary
> variant, and Andrew at least raised concerns there. I would really like
> to move forward somehow, but currently it is not really clear how to
> achieve that.

Well, being able to move forward implies getting a response to my
reply suggesting that both variants are equivalent in risk. If there
are convincing arguments that the (imo) worse variant (simply from a
readability pov) is indeed better from a risk pov (i.e. the risk of the
compiler not doing what we want it to do), I'd certainly give up my
opposition.

> I will try to apply a majority vote to each hunk that has been commented
> on and create a v5 of the series. I am even thinking about separating
> the introduction of eval_nospec and the arch_nospec_barrier macro into
> another series, to move faster with the array_index_nospec-based changes
> first. Guidance is very welcome.

I have no problem picking patches out of order for committing.
For example, I'd commit patches 10 and 11 of v4 as is once they
have the necessary release manager ack. I notice only now that
you didn't even Cc Jürgen. I guess I'll reply to the cover letter
asking for his opinion on the series as a whole.

Jan




* Re: SpectreV1+L1TF Patch Series
  2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
                   ` (11 preceding siblings ...)
  2019-01-24 21:05 ` SpectreV1+L1TF Patch Series Andrew Cooper
@ 2019-01-28  8:28 ` Jan Beulich
       [not found] ` <5C4EBD1A0200007800211954@suse.com>
  13 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-01-28  8:28 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

Jürgen,

>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
> This patch series attempts to mitigate the issue that have been raised in the
> XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
> execution on Intel hardware, an lfence instruction is required to make sure
> that selected checks are not bypassed. Speculative out-of-bound accesses can
> be prevented by using the array_index_nospec macro.
> 
> The lfence instruction should be added on x86 platforms only. To not affect
> platforms that are not affected by the L1TF vulnerability, the lfence
> instruction is patched in via alternative patching on Intel CPUs only.
> Furthermore, the compile time configuration allows to choose how to protect the
> evaluation of conditions with the lfence instruction.

I've noticed only now that you weren't Cc-ed on this series. It
clearly is something to at least be considered for 4.12. May I
ask what your view on this is? Perhaps in particular whether
you would want to set some boundary in time until which pieces
of it (as they become ready, which looks to be the case for
patches 10 and 11 at this point in time) may go in?

Jan




* Re: SpectreV1+L1TF Patch Series
       [not found] ` <5C4EBD1A0200007800211954@suse.com>
@ 2019-01-28  8:47   ` Juergen Gross
  2019-01-28  9:56     ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Juergen Gross @ 2019-01-28  8:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel, Norbert Manthey

On 28/01/2019 09:28, Jan Beulich wrote:
> Jürgen,
> 
>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>> This patch series attempts to mitigate the issue that have been raised in the
>> XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
>> execution on Intel hardware, an lfence instruction is required to make sure
>> that selected checks are not bypassed. Speculative out-of-bound accesses can
>> be prevented by using the array_index_nospec macro.
>>
>> The lfence instruction should be added on x86 platforms only. To not affect
>> platforms that are not affected by the L1TF vulnerability, the lfence
>> instruction is patched in via alternative patching on Intel CPUs only.
>> Furthermore, the compile time configuration allows to choose how to protect the
>> evaluation of conditions with the lfence instruction.
> 
> I've noticed only now that you weren't Cc-ed on this series. It
> clearly is something to at least be considered for 4.12. May I
> ask what your view on this is? Perhaps in particular whether
> you would want to set some boundary in time until which pieces
> of it (as they become ready, which looks to be the case for
> patches 10 and 11 at this point in time) may go in?

I'd say until RC3 they are fine to go in when ready. After that I'd like
to decide on a case-by-case basis.


Juergen



* Re: SpectreV1+L1TF Patch Series
  2019-01-28  8:47   ` Juergen Gross
@ 2019-01-28  9:56     ` Jan Beulich
       [not found]       ` <9C03B9BA0200004637554D14@prv1-mh.provo.novell.com>
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-28  9:56 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

>>> On 28.01.19 at 09:47, <jgross@suse.com> wrote:
> On 28/01/2019 09:28, Jan Beulich wrote:
>>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>>> This patch series attempts to mitigate the issue that have been raised in the
>>> XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
>>> execution on Intel hardware, an lfence instruction is required to make sure
>>> that selected checks are not bypassed. Speculative out-of-bound accesses can
>>> be prevented by using the array_index_nospec macro.
>>>
>>> The lfence instruction should be added on x86 platforms only. To not affect
>>> platforms that are not affected by the L1TF vulnerability, the lfence
>>> instruction is patched in via alternative patching on Intel CPUs only.
>>> Furthermore, the compile time configuration allows to choose how to protect the
>>> evaluation of conditions with the lfence instruction.
>> 
>> I've noticed only now that you weren't Cc-ed on this series. It
>> clearly is something to at least be considered for 4.12. May I
>> ask what your view on this is? Perhaps in particular whether
>> you would want to set some boundary in time until which pieces
>> of it (as they become ready, which looks to be the case for
>> patches 10 and 11 at this point in time) may go in?
> 
> I'd say until RC3 they are fine to go in when ready. After that I'd like
> to decide on a case-by-case basis.

May I interpret this as a release ack for patches 10 and 11 of
v4 then, and perhaps even generally as such an ack for other
parts of the series (with the RC3 boundary in mind)?

Jan





* Re: SpectreV1+L1TF Patch Series
@ 2019-01-28 10:01 Juergen Gross
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Juergen Gross @ 2019-01-28 10:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel, Norbert Manthey

On 28/01/2019 10:56, Jan Beulich wrote:
>>>> On 28.01.19 at 09:47, <jgross@suse.com> wrote:
>> On 28/01/2019 09:28, Jan Beulich wrote:
>>>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>>>> This patch series attempts to mitigate the issue that have been raised in the
>>>> XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
>>>> execution on Intel hardware, an lfence instruction is required to make sure
>>>> that selected checks are not bypassed. Speculative out-of-bound accesses can
>>>> be prevented by using the array_index_nospec macro.
>>>>
>>>> The lfence instruction should be added on x86 platforms only. To not affect
>>>> platforms that are not affected by the L1TF vulnerability, the lfence
>>>> instruction is patched in via alternative patching on Intel CPUs only.
>>>> Furthermore, the compile time configuration allows to choose how to protect the
>>>> evaluation of conditions with the lfence instruction.
>>>
>>> I've noticed only now that you weren't Cc-ed on this series. It
>>> clearly is something to at least be considered for 4.12. May I
>>> ask what your view on this is? Perhaps in particular whether
>>> you would want to set some boundary in time until which pieces
>>> of it (as they become ready, which looks to be the case for
>>> patches 10 and 11 at this point in time) may go in?
>>
>> I'd say until RC3 they are fine to go in when ready. After that I'd like
>> to decide on a case-by-case basis.
> 
> May I interpret this as a release ack for patches 10 and 11 of
> v4 then, and perhaps even generally as such an ack for other
> parts of the series (with the RC3 boundary in mind)?

Yes.


Juergen




* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-28  8:24                 ` Jan Beulich
@ 2019-01-28 10:07                   ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-28 10:07 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/28/19 09:24, Jan Beulich wrote:
>>>> On 28.01.19 at 08:56, <nmanthey@amazon.de> wrote:
>> On 1/28/19 08:35, Jan Beulich wrote:
>>>>>> On 27.01.19 at 21:28, <nmanthey@amazon.de> wrote:
>>>> On 1/25/19 14:09, Jan Beulich wrote:
>>>>>>>> On 25.01.19 at 11:50, <nmanthey@amazon.de> wrote:
>>>>>> On 1/25/19 11:14, Jan Beulich wrote:
>>>>>>>>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
>>>>>>>> Worse is the "evaluate condition, stash result, fence, use variable"
>>>>>>>> option, which is almost completely useless.  If you work out the
>>>>>>>> resulting instruction stream, you'll have a conditional expression
>>>>>>>> calculated down into a register, then a fence, then a test register and
>>>>>>>> conditional jump into one of two basic blocks.  This takes the perf hit,
>>>>>>>> and doesn't protect either of the basic blocks for speculative
>>>>>>>> mis-execution.
>>>>>>> How does it not protect anything? It shrinks the speculation window
>>>>>>> to just the register test and conditional branch, which ought to be
>>>>>>> far smaller than that behind a memory access which fails to hit any
>>>>>>> of the caches (and perhaps even any of the TLBs). This is the more
>>>>>>> that LFENCE does specifically not prevent insn fetching from
>>>>>>> continuing.
>>>>>>>
>>>>>>> That said I agree that the LFENCE would better sit between the
>>>>>>> register test and the conditional branch, but as we've said so many
>>>>>>> times before - this can't be achieved without compiler support. It's
>>>>>>> said enough that the default "cc" clobber of asm()-s on x86 alone
>>>>>>> prevents this from possibly working, while my over four year old
>>>>>>> patch to add a means to avoid this has not seen sufficient
>>>>>>> comments to get it into some hopefully acceptable shape, but also
>>>>>>> has not been approved as is.
>>>>>>>
>>>>>>> Then again, following an earlier reply of mine on another sub-
>>>>>>> thread, nothing really prevents the compiler from moving ahead
>>>>>>> and folding the two LFENCEs of the "both branches" model into
>>>>>>> one. It just so happens that apparently right now this never
>>>>>>> occurs (assuming Norbert has done full generated code analysis
>>>>>>> to confirm the intended placement).
>>>>>> I am happy to jump back to my earlier version without a configuration
>>>>>> option to protect both branches with a lfence instruction, using logic
>>>>>> operators.
>>>>> I don't understand this, I'm afraid: What I've said was to support
>>>>> my thinking of the && + || variant being identical in code and risk
>>>>> to that using ?: . I.e. I'm not asking you to switch back.
>>>> I understand that you did not ask. However, Andrew raised concerns, and
>>>> I analyzed the binary output for the variant with logical operators.
>>>> Hence, I'd like to keep that variant with the logical operators.
>>> But didn't you say earlier that there was no difference in generated
>>> code between the two variants?
>> Yes, for the current commit, and for the one compiler I used. Personally,
>> I prefer the logical-operator variant. You seem to prefer the ternary
>> variant, and Andrew at least raised concerns there. I would really like
>> to move forward somehow, but currently it is not really clear how to
>> achieve that.
> Well, being able to move forward implies getting a response to my
> reply suggesting that both variants are equivalent in risk. If there
> are convincing arguments that the (imo) worse variant (simply from a
> readability pov) is indeed better from a risk pov (i.e. the risk of the
> compiler not doing what we want it to do), I'd certainly give up my
> opposition.
I understand the readability concern. The C standard makes similar
promises about the semantics (left-to-right, and using sequence points).
The implementation in the end seems to be up to the compiler. The risk
is that future compilers treat the conditional operator differently than
the one I used today. I'm fine with what we have right now. Once I'm
done with a v5 candidate, I'll look into this comparison one more time.
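For reference, the two formulations under discussion can be sketched in plain C, with a C11 `atomic_thread_fence` standing in for the x86 lfence (the real series uses an alternative-patched lfence and different macro names; this only illustrates why both variants place a fence on both branches):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the series' lfence_true(): fence, then return true. */
static inline bool fence_true(void)
{
    atomic_thread_fence(memory_order_seq_cst); /* lfence on real x86 builds */
    return true;
}

/* Ternary variant: each arm of the ?: passes through a fence. */
#define eval_nospec_ternary(cond) ((cond) ? fence_true() : !fence_true())

/* Logical-operator variant: the && / || sequence points force the
 * same fence on the true path and on the false path. */
#define eval_nospec_logical(cond) (((cond) && fence_true()) || !fence_true())
```

Both macros return the value of the condition; the difference is only in which construct the compiler is asked not to merge.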
>
>> I try to apply majority vote for each hunk that has been commented and
>> create a v5 of the series. I even think about separating the
>> introduction of eval_nospec and the arch_nospec_barrier macro into
>> another series to move faster with the array_index_nospec-based changes
>> first. Guidance is very welcome.
> I have no problem picking patches out of order for committing.
> For example I'd commit patches 10 and 11 of v4 as is once it
> has the necessary release manager ack. I notice only now that
> you didn't even Cc Jürgen. I guess I'll reply to the cover letter
> asking for his opinion on the series as a whole.

To be able to merge these patches independently, I will bring back the
patch that was listed in the XSA,
xsa289/0005-nospec-introduce-method-for-static-arrays.patch, as that
function is required by patches 10 and 11.

Best,
Norbert

>
> Jan
>





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v4 09/11] x86/vioapic: block speculative out-of-bound accesses
  2019-01-25 16:34   ` Jan Beulich
@ 2019-01-28 11:03     ` Norbert Manthey
  2019-01-28 11:12       ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-28 11:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/25/19 17:34, Jan Beulich wrote:
>>>> On 23.01.19 at 12:57, <nmanthey@amazon.de> wrote:
>> @@ -66,6 +67,9 @@ static struct hvm_vioapic *gsi_vioapic(const struct domain *d,
>>  {
>>      unsigned int i;
>>  
>> +    /* Make sure the compiler does not optimize the initialization */
>> +    OPTIMIZER_HIDE_VAR(pin);
> Since there's no initialization here, I think it would help to add "done
> in the callers". Perhaps also "optimize away" or "delete"?
>
> And then I think you mean *pin.
True, I will adapt both the comment and the OPTIMIZER_HIDE_VAR call.
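As background, the idiom behind OPTIMIZER_HIDE_VAR (in both Linux and Xen) is an empty asm statement that makes the value opaque to the optimizer, so a caller-side initialization such as `*pin = 0` cannot be deleted as dead. A minimal sketch, assuming the `"+r"` read-write constraint (the exact constraint string in the tree may differ), with a hypothetical helper name:

```c
/* Empty asm: the value flows through a register the compiler
 * may not reason about, so prior writes to it must be kept. */
#define OPTIMIZER_HIDE_VAR(var) __asm__ volatile ("" : "+r" (var))

/* Hypothetical stand-in for gsi_vioapic()'s use of its pin out-parameter. */
static unsigned int read_pin(unsigned int *pin)
{
    OPTIMIZER_HIDE_VAR(*pin);   /* keep the caller's initialization of *pin */
    return *pin;
}
```

The value itself is unchanged; only the compiler's ability to prove the initialization dead is removed.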
>
>> @@ -212,7 +217,12 @@ static void vioapic_write_redirent(
>>      struct hvm_irq *hvm_irq = hvm_domain_irq(d);
>>      union vioapic_redir_entry *pent, ent;
>>      int unmasked = 0;
>> -    unsigned int gsi = vioapic->base_gsi + idx;
>> +    unsigned int gsi;
>> +
>> +    /* Make sure no out-of-bound value for idx can be used */
>> +    idx = array_index_nospec(idx, vioapic->nr_pins);
>> +
>> +    gsi = vioapic->base_gsi + idx;
> I dislike the disconnect from the respective bounds check: There's
> only one caller, so the construct could be moved there, or
> otherwise I'd like to see an ASSERT() added documenting that the
> bounds check is expected to have happened in the caller.
I agree that the idx value is used as an array index in this function
only once. However, the gsi value also uses the value of idx, and as
that is passed to other functions, I want to bound the gsi variable as
well. Therefore, I chose to have a separate assignment for the idx variable.
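The generic (non-asm) form of array_index_nospec can be illustrated as a branchless clamp: build an all-ones mask when idx < size and a zero mask otherwise, so that even a speculatively out-of-range index collapses to 0 before it reaches the array. This sketch assumes arithmetic right shift of negative values (true for the compilers Xen supports) and indices well below INTPTR_MAX; the x86 implementation in the tree uses cmp/sbb instead:

```c
#include <stdint.h>
#include <stddef.h>

static inline size_t array_index_nospec(size_t idx, size_t size)
{
    /* (idx - size) is negative exactly when idx < size; smearing its
     * sign bit across the word yields ~0 (keep idx) or 0 (clamp to 0). */
    intptr_t mask = ((intptr_t)idx - (intptr_t)size)
                    >> (sizeof(intptr_t) * 8 - 1);

    return idx & (size_t)mask;
}
```

In-range indices pass through unchanged; out-of-range ones become 0, which is always a safe array slot.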
>
>> @@ -378,7 +388,8 @@ static inline int pit_channel0_enabled(void)
>>  
>>  static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
>>  {
>> -    uint16_t dest = vioapic->redirtbl[pin].fields.dest_id;
>> +    uint16_t dest = vioapic->redirtbl
>> +               [pin = array_index_nospec(pin, vioapic->nr_pins)].fields.dest_id;
>>      uint8_t dest_mode = vioapic->redirtbl[pin].fields.dest_mode;
>>      uint8_t delivery_mode = vioapic->redirtbl[pin].fields.delivery_mode;
>>      uint8_t vector = vioapic->redirtbl[pin].fields.vector;
> I'm sorry, but despite prior discussions I'm still not happy about
> this change - all of the callers pass known good values:
> - vioapic_write_redirent() gets adjusted above,
> - vioapic_irq_positive_edge() gets the value passed into here
>   from gsi_vioapic(), which you also take care of,
> - vioapic_update_EOI() loops over all pins, so only passes in-
>   range values.
> Therefore I still don't see what protection this change adds.
> As per above, if it was to stay, some sort of connection to the
> range check(s) it guards would otherwise be nice to establish,
> but I realize that adding an ASSERT() here would go against
> a certain aspect of review comments I gave on earlier versions.
I will drop this change. As you called out, all callers are bounds-checked
already. Hence, I will not add an assert.
>
>> @@ -463,7 +474,7 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
>>  
>>  void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
>>  {
>> -    unsigned int pin;
>> +    unsigned int pin = 0; /* See gsi_vioapic */
>>      struct hvm_vioapic *vioapic = gsi_vioapic(d, irq, &pin);
>>      union vioapic_redir_entry *ent;
>>  
>> @@ -560,7 +571,7 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
>>  
>>  int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi)
>>  {
>> -    unsigned int pin;
>> +    unsigned int pin = 0; /* See gsi_vioapic */
>>      const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
>>  
>>      if ( !vioapic )
> Since there are more callers of gsi_vioapic(), justification should be
> added to the description why only some need adjustment (or
> otherwise, just to be on the safe side as well as for consistency
> all of them should be updated, in which case it would still be nice
> to call out the ones which really [don't] need updating).

I will extend the explanation in the commit message.

Best,
Norbert

>
> Jan
>
>





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v4 09/11] x86/vioapic: block speculative out-of-bound accesses
  2019-01-28 11:03     ` Norbert Manthey
@ 2019-01-28 11:12       ` Jan Beulich
  2019-01-28 12:20         ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-28 11:12 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 28.01.19 at 12:03, <nmanthey@amazon.de> wrote:
> On 1/25/19 17:34, Jan Beulich wrote:
>>>>> On 23.01.19 at 12:57, <nmanthey@amazon.de> wrote:
>>> @@ -212,7 +217,12 @@ static void vioapic_write_redirent(
>>>      struct hvm_irq *hvm_irq = hvm_domain_irq(d);
>>>      union vioapic_redir_entry *pent, ent;
>>>      int unmasked = 0;
>>> -    unsigned int gsi = vioapic->base_gsi + idx;
>>> +    unsigned int gsi;
>>> +
>>> +    /* Make sure no out-of-bound value for idx can be used */
>>> +    idx = array_index_nospec(idx, vioapic->nr_pins);
>>> +
>>> +    gsi = vioapic->base_gsi + idx;
>> I dislike the disconnect from the respective bounds check: There's
>> only one caller, so the construct could be moved there, or
>> otherwise I'd like to see an ASSERT() added documenting that the
>> bounds check is expected to have happened in the caller.
> I agree that the idx value is used as an array index in this function
> only once. However, the gsi value also uses the value of idx, and as
> that is passed to other functions, I want to bound the gsi variable as
> well. Therefore, I chose to have a separate assignment for the idx variable.

I don't mind the separate assignment, and I didn't complain
about idx being used just once. What I said is that there's
only one caller of the function. If the bounds checking was
done there, "gsi" here would be equally "bounded" afaict.
And I did suggest an alternative in case you dislike the
moving of the construct you add.

Jan




^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v4 09/11] x86/vioapic: block speculative out-of-bound accesses
  2019-01-28 11:12       ` Jan Beulich
@ 2019-01-28 12:20         ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-28 12:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/28/19 12:12, Jan Beulich wrote:
>>>> On 28.01.19 at 12:03, <nmanthey@amazon.de> wrote:
>> On 1/25/19 17:34, Jan Beulich wrote:
>>>>>> On 23.01.19 at 12:57, <nmanthey@amazon.de> wrote:
>>>> @@ -212,7 +217,12 @@ static void vioapic_write_redirent(
>>>>      struct hvm_irq *hvm_irq = hvm_domain_irq(d);
>>>>      union vioapic_redir_entry *pent, ent;
>>>>      int unmasked = 0;
>>>> -    unsigned int gsi = vioapic->base_gsi + idx;
>>>> +    unsigned int gsi;
>>>> +
>>>> +    /* Make sure no out-of-bound value for idx can be used */
>>>> +    idx = array_index_nospec(idx, vioapic->nr_pins);
>>>> +
>>>> +    gsi = vioapic->base_gsi + idx;
>>> I dislike the disconnect from the respective bounds check: There's
>>> only one caller, so the construct could be moved there, or
>>> otherwise I'd like to see an ASSERT() added documenting that the
>>> bounds check is expected to have happened in the caller.
>> I agree that the idx value is used as an array index in this function
>> only once. However, the gsi value also uses the value of idx, and as
>> that is passed to other functions, I want to bound the gsi variable as
>> well. Therefore, I chose to have a separate assignment for the idx variable.
> I don't mind the separate assignment, and I didn't complain
> about idx being used just once. What I said is that there's
> only one caller of the function. If the bounds checking was
> done there, "gsi" here would be equally "bounded" afaict.
> And I did suggest an alternative in case you dislike the
> moving of the construct you add.

Ah, I understand your previous sentence differently now. Thanks for
clarifying. I'd like to keep the nospec statements close to the
problematic use, so that eventual future callers benefit from them as
well. Therefore, I'll add an ASSERT statement with the bounds check.
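The agreed-upon shape — an ASSERT documenting the caller-side bounds check, plus the clamp before idx feeds into gsi — could look as follows. The struct and helper names are simplified stand-ins for the real vioapic code, with a generic branchless clamp replacing the tree's array_index_nospec():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vioapic { unsigned int base_gsi, nr_pins; };

/* Generic branchless clamp, standing in for array_index_nospec(). */
static size_t nospec_clamp(size_t idx, size_t size)
{
    intptr_t mask = ((intptr_t)idx - (intptr_t)size)
                    >> (sizeof(intptr_t) * 8 - 1);
    return idx & (size_t)mask;
}

static unsigned int redirent_gsi(const struct vioapic *v, unsigned int idx)
{
    assert(idx < v->nr_pins);            /* bounds check done by the caller */
    idx = nospec_clamp(idx, v->nr_pins); /* no speculative out-of-range idx */
    return v->base_gsi + idx;
}
```

The ASSERT catches callers that skip the check in debug builds, while the clamp bounds speculation in release builds.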

Best,
Norbert

>
> Jan
>
>




^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: SpectreV1+L1TF Patch Series
  2019-01-24 21:05 ` SpectreV1+L1TF Patch Series Andrew Cooper
@ 2019-01-28 13:56   ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-28 13:56 UTC (permalink / raw)
  To: Andrew Cooper, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel

On 1/24/19 22:05, Andrew Cooper wrote:
> On 23/01/2019 11:51, Norbert Manthey wrote:
>> Dear all,
>>
>> This patch series attempts to mitigate the issue that have been raised in the
>> XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
>> execution on Intel hardware, an lfence instruction is required to make sure
>> that selected checks are not bypassed. Speculative out-of-bound accesses can
>> be prevented by using the array_index_nospec macro.
>>
>> The lfence instruction should be added on x86 platforms only. To not affect
>> platforms that are not affected by the L1TF vulnerability, the lfence
>> instruction is patched in via alternative patching on Intel CPUs only.
>> Furthermore, the compile time configuration allows to choose how to protect the
>> evaluation of conditions with the lfence instruction.
> Hello,
>
> First of all, I've dusted off an old patch of mine and made it
> speculatively safe.
>
> https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=9e92acf1b752dfdfb294234b32d1fa9f55bfdc0f
>
> Using the new domain_vcpu() helper should tidy up quite a few patches in
> the series.
I will use the introduced function and apply it where I touched code,
thanks!
>
>
> Next, to the ordering of patches.
>
> Please introduce the Kconfig variable(s) first.  I'll follow up on that
> thread about options.
I will drop the Kconfig option and go with "protect both branches" only.
>
> Next, introduce a new synthetic feature bit to cause patching to occur,
> and logic to trigger it in appropriate circumstances.  Look through the
> history of include/asm-x86/cpufeatures.h to see some examples from the
> previous speculative mitigation work.  In particular, you'll need a
> command line parameter to control the use of this functionality when it
> is compiled in.
I will introduce a synthesized feature, and a command line option, and
add documentation.
>
> Next, introduce eval_nospec().  To avoid interfering with other
> architectures, you probably want something like this:
Do you want me to introduce the new macro in a separate commit, and use
it in follow-up commits? I have been told previously not to split
introduced functions from their use cases, but to merge them with at
least one. Your above commit again only introduces a function that is
unused at this point. Is there a Xen-specific style rule for this?
> xen/nospec.h contains:
>
> /*
>  * Evaluate a condition in a speculation-safe way.
>  * Stub implementation for builds which don't care.
>  */
> #ifndef eval_nospec
> #define eval_nospec(x) (x)
> #endif
>
> and something containing x86's implementation.  TBH, I personally think
> asm/nospec.h is overdue for introducing now.

For now, I would like to not introduce new files, as Jan also suggested
earlier.

Best,
Norbert

>
> ~Andrew





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-23 13:37   ` Jan Beulich
@ 2019-01-28 14:45     ` Norbert Manthey
  2019-01-28 15:09       ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-28 14:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/23/19 14:37, Jan Beulich wrote:
>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>> @@ -1268,7 +1272,8 @@ unmap_common(
>>      }
>>  
>>      smp_rmb();
>> -    map = &maptrack_entry(lgt, op->handle);
>> +    map = &maptrack_entry(lgt, array_index_nospec(op->handle,
>> +                                                  lgt->maptrack_limit));
> It might be better to move this into maptrack_entry() itself, or
> make a maptrack_entry_nospec() clone (as several but not all
> uses may indeed not be in need of the extra protection). At
> least the ones in steal_maptrack_handle() and
> put_maptrack_handle() also look potentially suspicious.
I will move the nospec protection into the macro. I would like to avoid
introducing yet another macro.
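Moving the clamp into maptrack_entry() itself might look like the sketch below. Page size, types, and names are simplified stand-ins (the real macro indexes a two-level maptrack table the same way, and Xen's code base already relies on GCC statement expressions as used here):

```c
#include <stddef.h>
#include <stdint.h>

#define MAPTRACK_PER_PAGE 4u   /* illustrative; really derived from PAGE_SIZE */

struct grant_table { unsigned int **maptrack; unsigned int maptrack_limit; };

/* Generic branchless clamp, standing in for array_index_nospec(). */
static size_t nospec_clamp(size_t idx, size_t size)
{
    intptr_t mask = ((intptr_t)idx - (intptr_t)size)
                    >> (sizeof(intptr_t) * 8 - 1);
    return idx & (size_t)mask;
}

/* Clamp once, before the handle is split into page and offset, so both
 * array indexes derived from it are bounded under speculation. */
#define maptrack_entry(t, handle)                                       \
    (*({ size_t h_ = nospec_clamp(handle, (t)->maptrack_limit);         \
         &(t)->maptrack[h_ / MAPTRACK_PER_PAGE][h_ % MAPTRACK_PER_PAGE]; }))
```

Every existing caller then gets the protection without a second macro variant.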
>
>> @@ -2223,7 +2231,8 @@ gnttab_transfer(
>>          okay = gnttab_prepare_for_transfer(e, d, gop.ref);
>>          spin_lock(&e->page_alloc_lock);
>>  
>> -        if ( unlikely(!okay) || unlikely(e->is_dying) )
>> +        /* Make sure this check is not bypassed speculatively */
>> +        if ( evaluate_nospec(unlikely(!okay) || unlikely(e->is_dying)) )
>>          {
>>              bool_t drop_dom_ref = !domain_adjust_tot_pages(e, -1);
> What is it that makes this particular if() different from other
> surrounding ones? In particular the version dependent code (a few
> lines down from here as well as elsewhere) look to be easily
> divertable onto the wrong branch, then causing out of bounds
> speculative accesses due to the different (version dependent)
> shared entry sizes.
This check evaluates the variable okay, which indicates whether the
value of gop.ref is bounded correctly. The next conditional that uses
code based on a version should be fine, even when being entered
speculatively with the wrong version setup, as the value of gop.ref is
correct (i.e. architecturally visible after this lfence) already. As the
version dependent macros are used, i.e. shared_entry_v1 and
shared_entry_v2, I do not see a risk why speculative out-of-bound access
should happen here.
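The pattern being defended here is: a helper validates the reference and returns a flag, and fencing the evaluation of that flag makes the bounded value architecturally settled before any dependent access. A reduced stand-in, with `atomic_thread_fence` replacing the patched-in lfence and a trivial range check replacing gnttab_prepare_for_transfer():

```c
#include <stdatomic.h>
#include <stdbool.h>

static inline bool fence_bool(bool b)
{
    atomic_thread_fence(memory_order_seq_cst); /* lfence on x86 builds */
    return b;
}
#define evaluate_nospec(cond) ((cond) ? fence_bool(true) : fence_bool(false))

static int transfer(unsigned int ref, unsigned int nr_entries)
{
    bool okay = ref < nr_entries; /* stands in for gnttab_prepare_for_transfer() */

    if ( evaluate_nospec(!okay) )
        return -1;                /* rejected; this path is fenced as well */

    /* Past the fence, speculation can no longer act on an unchecked ref. */
    return 0;
}
```

The fence sits on the flag, not on the index, which is exactly the disconnect the review discussion is about.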
>
>> @@ -3215,6 +3230,10 @@ swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
>>      if ( ref_a == ref_b )
>>          goto out;
>>  
>> +    /* Make sure the above check is not bypassed speculatively */
>> +    ref_a = array_index_nospec(ref_a, nr_grant_entries(d->grant_table));
>> +    ref_b = array_index_nospec(ref_b, nr_grant_entries(d->grant_table));
> I think this wants to move up ahead of the if() in context, and the
> comment be changed to plural.
I will move the code above the comparison.
>
>> --- a/xen/include/xen/nospec.h
>> +++ b/xen/include/xen/nospec.h
>> @@ -87,6 +87,15 @@ static inline bool lfence_true(void) { return true; }
>>  #define evaluate_nospec(condition) ({ bool res = (condition); rmb(); res; 
>> })
>>  #endif
>>  
>> +/*
>> + * allow to block speculative execution in generic code
>> + */
> Comment style again.
I will fix the comment.
>
>> +#ifdef CONFIG_X86
>> +#define block_speculation() rmb()
>> +#else
>> +#define block_speculation()
>> +#endif
> Why does this not simply resolve to what currently is named lfence_true()
> (perhaps with a cast to void)? And why does this not depend on the
> Kconfig setting?

I will update the definition of this macro to what is called
lfence_true() in this series, and cast it to void. I will furthermore
split the introduction of this macro and this commit.

Best,
Norbert

>
> Jan
>
>





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-28 14:45     ` Norbert Manthey
@ 2019-01-28 15:09       ` Jan Beulich
  2019-01-29  8:33         ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-28 15:09 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 28.01.19 at 15:45, <nmanthey@amazon.de> wrote:
> On 1/23/19 14:37, Jan Beulich wrote:
>>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>>> @@ -2223,7 +2231,8 @@ gnttab_transfer(
>>>          okay = gnttab_prepare_for_transfer(e, d, gop.ref);
>>>          spin_lock(&e->page_alloc_lock);
>>>  
>>> -        if ( unlikely(!okay) || unlikely(e->is_dying) )
>>> +        /* Make sure this check is not bypassed speculatively */
>>> +        if ( evaluate_nospec(unlikely(!okay) || unlikely(e->is_dying)) )
>>>          {
>>>              bool_t drop_dom_ref = !domain_adjust_tot_pages(e, -1);
>> What is it that makes this particular if() different from other
>> surrounding ones? In particular the version dependent code (a few
>> lines down from here as well as elsewhere) look to be easily
>> divertable onto the wrong branch, then causing out of bounds
>> speculative accesses due to the different (version dependent)
>> shared entry sizes.
> This check evaluates the variable okay, which indicates whether the
> value of gop.ref is bounded correctly.

How does gop.ref come into play here? The if() above does not use
or update it.

> The next conditional that uses
> code based on a version should be fine, even when being entered
> speculatively with the wrong version setup, as the value of gop.ref is
> correct (i.e. architecturally visible after this lfence) already. As the
> version dependent macros are used, i.e. shared_entry_v1 and
> shared_entry_v2, I do not see a risk why speculative out-of-bound access
> should happen here.

As said - v2 entries are larger than v1 ones. Therefore, if the
processor wrongly speculates along the v2 path, it may use
indexes valid for v1, but beyond the size when scaled by v2
element size (whereas ->shared_raw[], aliased with
->shared_v1[] and ->shared_v2[], was actually set up with v1
element size).

And please don't forget - this subsequent conditional was just an
easy example. What I'm really after is why you modify the if()
above, without there being any array index involved.

Jan




^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-28 15:09       ` Jan Beulich
@ 2019-01-29  8:33         ` Norbert Manthey
  2019-01-29  9:46           ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29  8:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/28/19 16:09, Jan Beulich wrote:
>>>> On 28.01.19 at 15:45, <nmanthey@amazon.de> wrote:
>> On 1/23/19 14:37, Jan Beulich wrote:
>>>>>> On 23.01.19 at 12:51, <nmanthey@amazon.de> wrote:
>>>> @@ -2223,7 +2231,8 @@ gnttab_transfer(
>>>>          okay = gnttab_prepare_for_transfer(e, d, gop.ref);
>>>>          spin_lock(&e->page_alloc_lock);
>>>>  
>>>> -        if ( unlikely(!okay) || unlikely(e->is_dying) )
>>>> +        /* Make sure this check is not bypassed speculatively */
>>>> +        if ( evaluate_nospec(unlikely(!okay) || unlikely(e->is_dying)) )
>>>>          {
>>>>              bool_t drop_dom_ref = !domain_adjust_tot_pages(e, -1);
>>> What is it that makes this particular if() different from other
>>> surrounding ones? In particular the version dependent code (a few
>>> lines down from here as well as elsewhere) look to be easily
>>> divertable onto the wrong branch, then causing out of bounds
>>> speculative accesses due to the different (version dependent)
>>> shared entry sizes.
>> This check evaluates the variable okay, which indicates whether the
>> value of gop.ref is bounded correctly.
> How does gop.ref come into play here? The if() above does not use
> or update it.
>
>> The next conditional that uses
>> code based on a version should be fine, even when being entered
>> speculatively with the wrong version setup, as the value of gop.ref is
>> correct (i.e. architecturally visible after this lfence) already. As the
>> version dependent macros are used, i.e. shared_entry_v1 and
>> shared_entry_v2, I do not see a risk why speculative out-of-bound access
>> should happen here.
> As said - v2 entries are larger than v1 ones. Therefore, if the
> processor wrongly speculates along the v2 path, it may use
> indexes valid for v1, but beyond the size when scaled by v2
> element size (whereas ->shared_raw[], aliased with
> ->shared_v1[] and ->shared_v2[], was actually set up with v1
> element size).
I am aware that both versions use the same base array, and access it via
different macros, which essentially partition the array based on the
size of the respective struct. The underlying raw array has the same
size for both versions. In case the CPU decides to enter the wrong
branch, but uses a valid gop.ref value, no out-of-bound accesses will
happen, because in each branch, the accesses via shared_entry_v1 or
shared_entry_v2 make sure the correct math is used to divide the raw
array into chunks of the size of the correct structure. I agree that
speculative execution might access a v1 raw array with v2 offsets, but
that does not result in an out-of-bound access. The data that is used
afterwards might be garbage, here sha->frame. Whether accesses based on
this should be protected could be another discussion, but it at least
looks complex to turn that into an exploitable pattern.
>
> And please don't forget - this subsequent conditional was just an
> easy example. What I'm really after is why you modify the if()
> above, without there being any array index involved.

The check that I protected uses the value of the variable okay, which -
at least after the introduced protecting lfence instruction - holds the
return value of the function gnttab_prepare_for_transfer. This function,
among others, checks whether gop.ref is bounded. By protecting the
evaluation of okay, I make sure to continue only in case gop.ref is
bounded. Consequently, further (speculative) execution is aware of a
valid value of gop.ref.

Best,
Norbert

>
> Jan
>
>





^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-29  8:33         ` Norbert Manthey
@ 2019-01-29  9:46           ` Jan Beulich
  2019-01-29 13:47             ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-29  9:46 UTC (permalink / raw)
  To: nmanthey
  Cc: tim, sstabellini, wei.liu2, konrad.wilk, George.Dunlap,
	andrew.cooper3, Ian.Jackson, Dario Faggioli, mpohlack,
	julien.grall, dwmw, amazein, xen-devel, jsteckli, doebel

>>> Norbert Manthey <nmanthey@amazon.de> 01/29/19 9:35 AM >>>
>I am aware that both version use the same base array, and access it via
>different macros, which essentially partition the array based on the
>size of the respective struct. The underlying raw array has the same
>size for both version.

And this is the problem afaics: If a guest has requested its grant table to
be sized as a single page, this page can fit twice as many entries for
v1 than it can fit for v2. Hence the v1 grant reference pointing at the last
entry would point at the last entry in the (not mapped) second page for v2.
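The size mismatch can be made concrete with the entry layouts from the public grant table ABI: a v1 entry is 8 bytes and a v2 entry is 16, so a 4 KiB page holds 512 v1 entries but only 256 v2 entries, and the last valid v1 index, scaled by the v2 entry size, already lands beyond the single mapped page. A small worked check (structs reproduce the ABI sizes, not the exact field names):

```c
#include <stdint.h>

#define XEN_PAGE_SIZE 4096u

/* Layouts matching the public grant table ABI entry sizes. */
struct grant_entry_v1 { uint16_t flags, domid; uint32_t frame; };  /*  8 bytes */
struct grant_entry_v2 { uint16_t flags, domid; uint32_t pad;
                        uint64_t frame; };                         /* 16 bytes */

enum {
    V1_PER_PAGE = XEN_PAGE_SIZE / sizeof(struct grant_entry_v1),   /* 512 */
    V2_PER_PAGE = XEN_PAGE_SIZE / sizeof(struct grant_entry_v2),   /* 256 */
};
```

So a v1 reference up to 511 is in range for a one-page table, while 511 * 16 bytes is past the page when (mis)speculated as v2.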


Jan



^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-29  9:46           ` Jan Beulich
@ 2019-01-29 13:47             ` Norbert Manthey
  2019-01-29 15:11               ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 13:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: tim, sstabellini, wei.liu2, konrad.wilk, George.Dunlap,
	andrew.cooper3, Ian.Jackson, Dario Faggioli, mpohlack,
	julien.grall, dwmw, amazein, xen-devel, jsteckli, doebel

On 1/29/19 10:46, Jan Beulich wrote:
>>>> Norbert Manthey <nmanthey@amazon.de> 01/29/19 9:35 AM >>>
>> I am aware that both version use the same base array, and access it via
>> different macros, which essentially partition the array based on the
>> size of the respective struct. The underlying raw array has the same
>> size for both version.
> And this is the problem afaics: If a guest has requested its grant table to
> be sized as a single page, this page can fit twice as many entries for
> v1 than it can fit for v2. Hence the v1 grant reference pointing at the last
> entry would point at the last entry in the (not mapped) second page for v2.

I might understand the code wrong, but a guest would ask to get at most
N grant frames, and this number cannot be increased afterwards, i.e. the
field gt->max_grant_frames is written exactly once. Furthermore, the
void** shared_raw array is allocated and written exactly once, with
sufficient pointers, namely gt->max_grant_frames many, in function
grant_table_init. Hence, independently of the version being used, at
least the shared_raw array cannot be used for out-of-bound accesses
during speculation with my above evaluate_nospec.

That being said, let's assume we have a v1 grant table, and speculation
uses the v2 accesses. In that case, an existing and zero-initialized
entry of shared_raw might be used in the first part of the
shared_entry_v2 macro, and even if that pointer would be non-NULL, the
page it would point to would have been cleared when growing the grant
table in function gnttab_grow_table.

Overall, I believe it is fine to let this speculation happen without
extra hardening.

Best,
Norbert

PS: I just noticed that the shared_raw array might be allocated with a
smaller size, as long as more than 1 grant_entry fits into a page.

>
>
> Jan
>





^ permalink raw reply	[flat|nested] 150+ messages in thread

* SpectreV1+L1TF Patch Series v5
  2019-01-28 10:01 SpectreV1+L1TF Patch Series Juergen Gross
@ 2019-01-29 14:43 ` Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
                     ` (8 more replies)
  0 siblings, 9 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel

Dear all,

This patch series attempts to mitigate the issues that have been raised in
XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
execution on Intel hardware, an lfence instruction is required to make sure
that selected checks are not bypassed. Speculative out-of-bound accesses can
be prevented by using the array_index_nospec macro.

The lfence instruction should be added on x86 platforms only. To avoid
penalizing platforms that are not affected by the L1TF vulnerability, the
lfence instruction is patched in via alternative patching on L1TF-vulnerable
CPUs only. To control the patching mechanism, I introduced a command line
option and a synthesized CPU feature flag.


Best,
Norbert





* [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
@ 2019-01-29 14:43   ` Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 2/9] x86/vioapic: " Norbert Manthey
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel,
	Norbert Manthey

Guests can issue event channel interaction with guest specified data.
To avoid speculative out-of-bound accesses, we use the nospec macros.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/common/event_channel.c | 25 ++++++++++++++++++-------
 xen/common/event_fifo.c    | 15 ++++++++++++---
 xen/include/xen/event.h    |  5 +++--
 3 files changed, 33 insertions(+), 12 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -365,11 +365,16 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
     if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) )
         return -EINVAL;
 
+   /*
+    * Make sure the guest controlled value virq is bounded even during
+    * speculative execution.
+    */
+    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));
+
     if ( virq_is_global(virq) && (vcpu != 0) )
         return -EINVAL;
 
-    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
-         ((v = d->vcpu[vcpu]) == NULL) )
+    if ( (vcpu < 0) || ((v = domain_vcpu(d, vcpu)) == NULL) )
         return -ENOENT;
 
     spin_lock(&d->event_lock);
@@ -418,8 +423,7 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
     int            port, vcpu = bind->vcpu;
     long           rc = 0;
 
-    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
-         (d->vcpu[vcpu] == NULL) )
+    if ( (vcpu < 0) || domain_vcpu(d, vcpu) == NULL )
         return -ENOENT;
 
     spin_lock(&d->event_lock);
@@ -813,6 +817,13 @@ int set_global_virq_handler(struct domain *d, uint32_t virq)
 
     if (virq >= NR_VIRQS)
         return -EINVAL;
+
+   /*
+    * Make sure the guest controlled value virq is bounded even during
+    * speculative execution.
+    */
+    virq = array_index_nospec(virq, ARRAY_SIZE(global_virq_handlers));
+
     if (!virq_is_global(virq))
         return -EINVAL;
 
@@ -931,7 +942,7 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
     struct evtchn *chn;
     long           rc = 0;
 
-    if ( (vcpu_id >= d->max_vcpus) || (d->vcpu[vcpu_id] == NULL) )
+    if ( !domain_vcpu(d, vcpu_id) )
         return -ENOENT;
 
     spin_lock(&d->event_lock);
@@ -969,8 +980,8 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
         unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
         chn->notify_vcpu_id = vcpu_id;
         pirq_set_affinity(d, chn->u.pirq.irq,
-                          cpumask_of(d->vcpu[vcpu_id]->processor));
-        link_pirq_port(port, chn, d->vcpu[vcpu_id]);
+                          cpumask_of(domain_vcpu(d, vcpu_id)->processor));
+        link_pirq_port(port, chn, domain_vcpu(d, vcpu_id));
         break;
     default:
         rc = -EINVAL;
diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -33,7 +33,8 @@ static inline event_word_t *evtchn_fifo_word_from_port(const struct domain *d,
      */
     smp_rmb();
 
-    p = port / EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
+    p = array_index_nospec(port / EVTCHN_FIFO_EVENT_WORDS_PER_PAGE,
+                           d->evtchn_fifo->num_evtchns);
     w = port % EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
 
     return d->evtchn_fifo->event_array[p] + w;
@@ -516,14 +517,22 @@ int evtchn_fifo_init_control(struct evtchn_init_control *init_control)
     gfn     = init_control->control_gfn;
     offset  = init_control->offset;
 
-    if ( vcpu_id >= d->max_vcpus || !d->vcpu[vcpu_id] )
+    if ( !domain_vcpu(d, vcpu_id) )
         return -ENOENT;
-    v = d->vcpu[vcpu_id];
+
+    v = domain_vcpu(d, vcpu_id);
 
     /* Must not cross page boundary. */
     if ( offset > (PAGE_SIZE - sizeof(evtchn_fifo_control_block_t)) )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value offset is bounded even during
+     * speculative execution.
+     */
+    offset = array_index_nospec(offset,
+                           PAGE_SIZE - sizeof(evtchn_fifo_control_block_t) + 1);
+
     /* Must be 8-bytes aligned. */
     if ( offset & (8 - 1) )
         return -EINVAL;
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -13,6 +13,7 @@
 #include <xen/smp.h>
 #include <xen/softirq.h>
 #include <xen/bitops.h>
+#include <xen/nospec.h>
 #include <asm/event.h>
 
 /*
@@ -96,7 +97,7 @@ void arch_evtchn_inject(struct vcpu *v);
  * The first bucket is directly accessed via d->evtchn.
  */
 #define group_from_port(d, p) \
-    ((d)->evtchn_group[(p) / EVTCHNS_PER_GROUP])
+    array_access_nospec((d)->evtchn_group, (p) / EVTCHNS_PER_GROUP)
 #define bucket_from_port(d, p) \
     ((group_from_port(d, p))[((p) % EVTCHNS_PER_GROUP) / EVTCHNS_PER_BUCKET])
 
@@ -110,7 +111,7 @@ static inline bool_t port_is_valid(struct domain *d, unsigned int p)
 static inline struct evtchn *evtchn_from_port(struct domain *d, unsigned int p)
 {
     if ( p < EVTCHNS_PER_BUCKET )
-        return &d->evtchn[p];
+        return &d->evtchn[array_index_nospec(p, EVTCHNS_PER_BUCKET)];
     return bucket_from_port(d, p) + (p % EVTCHNS_PER_BUCKET);
 }
 
-- 
2.7.4





* [PATCH SpectreV1+L1TF v5 2/9] x86/vioapic: block speculative out-of-bound accesses
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
@ 2019-01-29 14:43   ` Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: " Norbert Manthey
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel,
	Norbert Manthey

When interacting with the I/O APIC, a guest can specify values that are
used as indices into structures, and these values are not compared against
upper bounds in a way that prevents speculative out-of-bound accesses. This
change prevents these speculative accesses.

Furthermore, two variables are initialized, and the compiler is asked not
to optimize these initializations away, as the uninitialized, potentially
guest-controlled variables might otherwise be used in a speculative
out-of-bound access. As the two problematic variables are both used in the
common function gsi_vioapic, the mitigation is implemented there. Currently,
the problematic callers are the functions vioapic_irq_positive_edge and
vioapic_get_trigger_mode.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/arch/x86/hvm/vioapic.c | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -30,6 +30,7 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/sched.h>
+#include <xen/nospec.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/vpic.h>
@@ -66,6 +67,12 @@ static struct hvm_vioapic *gsi_vioapic(const struct domain *d,
 {
     unsigned int i;
 
+    /*
+     * Make sure the compiler does not optimize away the initialization done by
+     * callers
+     */
+    OPTIMIZER_HIDE_VAR(*pin);
+
     for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
     {
         struct hvm_vioapic *vioapic = domain_vioapic(d, i);
@@ -117,7 +124,8 @@ static uint32_t vioapic_read_indirect(const struct hvm_vioapic *vioapic)
             break;
         }
 
-        redir_content = vioapic->redirtbl[redir_index].bits;
+        redir_content = vioapic->redirtbl[array_index_nospec(redir_index,
+                                                       vioapic->nr_pins)].bits;
         result = (vioapic->ioregsel & 1) ? (redir_content >> 32)
                                          : redir_content;
         break;
@@ -212,7 +220,15 @@ static void vioapic_write_redirent(
     struct hvm_irq *hvm_irq = hvm_domain_irq(d);
     union vioapic_redir_entry *pent, ent;
     int unmasked = 0;
-    unsigned int gsi = vioapic->base_gsi + idx;
+    unsigned int gsi;
+
+    /* Callers of this function should make sure idx is bounded appropriately*/
+    ASSERT(idx < vioapic->nr_pins);
+
+    /* Make sure no out-of-bound value for idx can be used */
+    idx = array_index_nospec(idx, vioapic->nr_pins);
+
+    gsi = vioapic->base_gsi + idx;
 
     spin_lock(&d->arch.hvm.irq_lock);
 
@@ -467,7 +483,7 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
 
 void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
 {
-    unsigned int pin;
+    unsigned int pin = 0; /* See gsi_vioapic */
     struct hvm_vioapic *vioapic = gsi_vioapic(d, irq, &pin);
     union vioapic_redir_entry *ent;
 
@@ -564,7 +580,7 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
 
 int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi)
 {
-    unsigned int pin;
+    unsigned int pin = 0; /* See gsi_vioapic */
     const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
 
     if ( !vioapic )
-- 
2.7.4





* [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 2/9] x86/vioapic: " Norbert Manthey
@ 2019-01-29 14:43   ` Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier Norbert Manthey
                     ` (5 subsequent siblings)
  8 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel,
	Norbert Manthey

There are multiple arrays in the HVM interface that are accessed
with indices that are provided by the guest. To avoid speculative
out-of-bound accesses, we use the array_index_nospec macro.

When blocking speculative out-of-bound accesses, we can classify arrays
into dynamic arrays and static arrays. While the former are allocated at
run time, the size of the latter is known at compile time. For static
arrays, the compiler might be able to block speculative accesses in the
future.

We introduce another macro that uses the ARRAY_SIZE macro to block
speculative accesses. For statically sized arrays, this macro can be used
instead of the usual macro. Using this macro results in more readable
code, and allows the handling of this case to be changed in a single
place.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/arch/x86/hvm/hvm.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -37,6 +37,7 @@
 #include <xen/monitor.h>
 #include <xen/warning.h>
 #include <xen/vpci.h>
+#include <xen/nospec.h>
 #include <asm/shadow.h>
 #include <asm/hap.h>
 #include <asm/current.h>
@@ -2092,7 +2093,7 @@ int hvm_mov_from_cr(unsigned int cr, unsigned int gpr)
     case 2:
     case 3:
     case 4:
-        val = curr->arch.hvm.guest_cr[cr];
+        val = array_access_nospec(curr->arch.hvm.guest_cr, cr);
         break;
     case 8:
         val = (vlapic_get_reg(vcpu_vlapic(curr), APIC_TASKPRI) & 0xf0) >> 4;
@@ -3438,13 +3439,15 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix16K_80000;
-        *msr_content = fixed_range_base[index + 1];
+        *msr_content = fixed_range_base[array_index_nospec(index + 1,
+                                   ARRAY_SIZE(v->arch.hvm.mtrr.fixed_ranges))];
         break;
     case MSR_MTRRfix4K_C0000...MSR_MTRRfix4K_F8000:
         if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix4K_C0000;
-        *msr_content = fixed_range_base[index + 3];
+        *msr_content = fixed_range_base[array_index_nospec(index + 3,
+                                   ARRAY_SIZE(v->arch.hvm.mtrr.fixed_ranges))];
         break;
     case MSR_IA32_MTRR_PHYSBASE(0)...MSR_IA32_MTRR_PHYSMASK(MTRR_VCNT_MAX - 1):
         if ( !d->arch.cpuid->basic.mtrr )
@@ -3453,7 +3456,8 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         if ( (index / 2) >=
              MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT) )
             goto gp_fault;
-        *msr_content = var_range_base[index];
+        *msr_content = var_range_base[array_index_nospec(index,
+                          MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT))];
         break;
 
     case MSR_IA32_XSS:
@@ -4016,7 +4020,7 @@ static int hvmop_set_evtchn_upcall_vector(
     if ( op.vector < 0x10 )
         return -EINVAL;
 
-    if ( op.vcpu >= d->max_vcpus || (v = d->vcpu[op.vcpu]) == NULL )
+    if ( (v = domain_vcpu(d, op.vcpu)) == NULL )
         return -ENOENT;
 
     printk(XENLOG_G_INFO "%pv: upcall vector %02x\n", v, op.vector);
@@ -4104,6 +4108,12 @@ static int hvmop_set_param(
     if ( a.index >= HVM_NR_PARAMS )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value a.index is bounded even during
+     * speculative execution.
+     */
+    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
+
     d = rcu_lock_domain_by_any_id(a.domid);
     if ( d == NULL )
         return -ESRCH;
@@ -4370,6 +4380,12 @@ static int hvmop_get_param(
     if ( a.index >= HVM_NR_PARAMS )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value a.index is bounded even during
+     * speculative execution.
+     */
+    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
+
     d = rcu_lock_domain_by_any_id(a.domid);
     if ( d == NULL )
         return -ESRCH;
-- 
2.7.4





* [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
                     ` (2 preceding siblings ...)
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: " Norbert Manthey
@ 2019-01-29 14:43   ` Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec Norbert Manthey
                     ` (4 subsequent siblings)
  8 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel,
	Norbert Manthey

To better control the runtime behavior on L1TF-vulnerable platforms, the
command line option l1tf-barrier is introduced. This option controls
whether, on vulnerable x86 platforms, the lfence instruction is used to
prevent speculative execution from bypassing the evaluation of
conditionals that are protected with the evaluate_nospec macro.

Xen is already capable of identifying L1TF-vulnerable hardware. However,
this information cannot be used for alternative patching directly, as a CPU
feature flag is required. To control alternative patching with the command
line option, a new synthetic x86 feature "X86_FEATURE_SC_L1TF_VULN" is
introduced. This feature is used to patch the lfence instruction into the
arch_barrier_nospec_true function. The feature is enabled only if
L1TF-vulnerable hardware is detected and the command line option does not
prevent using it.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 docs/misc/xen-command-line.pandoc | 14 ++++++++++----
 xen/arch/x86/spec_ctrl.c          | 18 ++++++++++++++++--
 xen/include/asm-x86/cpufeatures.h |  1 +
 xen/include/asm-x86/spec_ctrl.h   |  1 +
 4 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -463,9 +463,9 @@ accounting for hardware capabilities as enumerated via CPUID.
 
 Currently accepted:
 
-The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`,
-`l1d-flush` and `ssbd` are used by default if available and applicable.  They can
-be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
+The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`, `l1d-flush`,
+`l1tf-barrier` and `ssbd` are used by default if available and applicable.  They
+can be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
 won't offer them to guests.
 
 ### cpuid_mask_cpu
@@ -1876,7 +1876,7 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`).
 ### spec-ctrl (x86)
 > `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb}=<bool>,
 >              bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
->              l1d-flush}=<bool> ]`
+>              l1d-flush,l1tf-barrier}=<bool> ]`
 
 Controls for speculative execution sidechannel mitigations.  By default, Xen
 will pick the most appropriate mitigations based on compiled in support,
@@ -1942,6 +1942,12 @@ Irrespective of Xen's setting, the feature is virtualised for HVM guests to
 use.  By default, Xen will enable this mitigation on hardware believed to be
 vulnerable to L1TF.
 
+On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force
+or prevent Xen from protecting evaluations inside the hypervisor with a barrier
+instruction to not load potentially secret information into L1 cache.  By
+default, Xen will enable this mitigation on hardware believed to be vulnerable
+to L1TF.
+
 ### sync_console
 > `= <boolean>`
 
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -21,6 +21,7 @@
 #include <xen/lib.h>
 #include <xen/warning.h>
 
+#include <asm-x86/cpuid.h>
 #include <asm/microcode.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -50,6 +51,7 @@ bool __read_mostly opt_ibpb = true;
 bool __read_mostly opt_ssbd = false;
 int8_t __read_mostly opt_eager_fpu = -1;
 int8_t __read_mostly opt_l1d_flush = -1;
+int8_t __read_mostly opt_l1tf_barrier = -1;
 
 bool __initdata bsp_delay_spec_ctrl;
 uint8_t __read_mostly default_xen_spec_ctrl;
@@ -100,6 +102,7 @@ static int __init parse_spec_ctrl(const char *s)
             opt_ibpb = false;
             opt_ssbd = false;
             opt_l1d_flush = 0;
+            opt_l1tf_barrier = 0;
         }
         else if ( val > 0 )
             rc = -EINVAL;
@@ -157,6 +160,8 @@ static int __init parse_spec_ctrl(const char *s)
             opt_eager_fpu = val;
         else if ( (val = parse_boolean("l1d-flush", s, ss)) >= 0 )
             opt_l1d_flush = val;
+        else if ( (val = parse_boolean("l1tf-barrier", s, ss)) >= 0 )
+            opt_l1tf_barrier = val;
         else
             rc = -EINVAL;
 
@@ -248,7 +253,7 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
                "\n");
 
     /* Settings for Xen's protection, irrespective of guests. */
-    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s\n",
+    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s%s\n",
            thunk == THUNK_NONE      ? "N/A" :
            thunk == THUNK_RETPOLINE ? "RETPOLINE" :
            thunk == THUNK_LFENCE    ? "LFENCE" :
@@ -258,7 +263,8 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
            !boot_cpu_has(X86_FEATURE_SSBD)           ? "" :
            (default_xen_spec_ctrl & SPEC_CTRL_SSBD)  ? " SSBD+" : " SSBD-",
            opt_ibpb                                  ? " IBPB"  : "",
-           opt_l1d_flush                             ? " L1D_FLUSH" : "");
+           opt_l1d_flush                             ? " L1D_FLUSH" : "",
+           opt_l1tf_barrier                          ? " L1TF_BARRIER" : "");
 
     /* L1TF diagnostics, printed if vulnerable or PV shadowing is in use. */
     if ( cpu_has_bug_l1tf || opt_pv_l1tf_hwdom || opt_pv_l1tf_domu )
@@ -843,6 +849,14 @@ void __init init_speculation_mitigations(void)
         opt_l1d_flush = cpu_has_bug_l1tf && !(caps & ARCH_CAPS_SKIP_L1DFL);
 
     /*
+     * By default, enable L1TF_VULN on L1TF-vulnerable hardware
+     */
+    if ( opt_l1tf_barrier == -1 )
+        opt_l1tf_barrier = cpu_has_bug_l1tf;
+    if ( cpu_has_bug_l1tf && opt_l1tf_barrier > 0)
+        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);
+
+    /*
      * We do not disable HT by default on affected hardware.
      *
      * Firstly, if the user intends to use exclusively PV, or HVM shadow
diff --git a/xen/include/asm-x86/cpufeatures.h b/xen/include/asm-x86/cpufeatures.h
--- a/xen/include/asm-x86/cpufeatures.h
+++ b/xen/include/asm-x86/cpufeatures.h
@@ -31,3 +31,4 @@ XEN_CPUFEATURE(SC_RSB_PV,       (FSCAPINTS+0)*32+18) /* RSB overwrite needed for
 XEN_CPUFEATURE(SC_RSB_HVM,      (FSCAPINTS+0)*32+19) /* RSB overwrite needed for HVM */
 XEN_CPUFEATURE(SC_MSR_IDLE,     (FSCAPINTS+0)*32+21) /* (SC_MSR_PV || SC_MSR_HVM) && default_xen_spec_ctrl */
 XEN_CPUFEATURE(XEN_LBR,         (FSCAPINTS+0)*32+22) /* Xen uses MSR_DEBUGCTL.LBR */
+XEN_CPUFEATURE(SC_L1TF_VULN,    (FSCAPINTS+0)*32+23) /* L1TF protection required */
diff --git a/xen/include/asm-x86/spec_ctrl.h b/xen/include/asm-x86/spec_ctrl.h
--- a/xen/include/asm-x86/spec_ctrl.h
+++ b/xen/include/asm-x86/spec_ctrl.h
@@ -37,6 +37,7 @@ extern bool opt_ibpb;
 extern bool opt_ssbd;
 extern int8_t opt_eager_fpu;
 extern int8_t opt_l1d_flush;
+extern int8_t opt_l1tf_barrier;
 
 extern bool bsp_delay_spec_ctrl;
 extern uint8_t default_xen_spec_ctrl;
-- 
2.7.4





* [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
                     ` (3 preceding siblings ...)
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier Norbert Manthey
@ 2019-01-29 14:43   ` Norbert Manthey
  2019-02-08  9:20     ` Julien Grall
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation Norbert Manthey
                     ` (3 subsequent siblings)
  8 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel,
	Norbert Manthey

Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
the L1 cache is problematic, because when hyperthreading is used as well,
a guest running on the sibling core can leak this potentially secret data.

To prevent these speculative accesses, we block speculation after
accessing the domain property field by adding lfence instructions. This
way, the CPU continues executing and loading data only once the condition
has actually been evaluated.

As the macros are typically used in if statements, the lfence has to be
inserted in a compatible way. Therefore, a function that returns true after
an lfence instruction is introduced. To protect both branches after a
conditional, an lfence instruction has to be added for each of the two
branches. To be able to block speculation after several evaluations, the
generic barrier macro block_speculation is also introduced.

As the L1TF vulnerability is only present on the x86 architecture, the
macros do not use the lfence instruction on other architectures, and the
protection is disabled at compile time. On x86, the lfence instruction is
not present by default either: only when an L1TF-vulnerable platform is
detected is the lfence instruction patched in via alternative patching.

Introducing the lfence instructions catches a lot of potential leaks with
a simple, unintrusive code change. During performance testing, we did not
notice any performance effects.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
---
 xen/include/xen/nospec.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -7,6 +7,7 @@
 #ifndef XEN_NOSPEC_H
 #define XEN_NOSPEC_H
 
+#include <asm/alternative.h>
 #include <asm/system.h>
 
 /**
@@ -64,6 +65,33 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
 #define array_access_nospec(array, index)                               \
     (array)[array_index_nospec(index, ARRAY_SIZE(array))]
 
+/*
+ * Allow to insert a read memory barrier into conditionals
+ */
+#if defined(CONFIG_X86) && defined(CONFIG_HVM)
+static inline bool arch_barrier_nospec_true(void) {
+    alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN);
+    return true;
+}
+#else
+static inline bool arch_barrier_nospec_true(void) { return true; }
+#endif
+
+/*
+ * Allow to protect evaluation of conditional with respect to speculation on x86
+ */
+#ifndef CONFIG_X86
+#define evaluate_nospec(condition) (condition)
+#else
+#define evaluate_nospec(condition)                                         \
+    ((condition) ? arch_barrier_nospec_true() : !arch_barrier_nospec_true())
+#endif
+
+/*
+ * Allow to block speculative execution in generic code
+ */
+#define block_speculation() (void)arch_barrier_nospec_true()
+
 #endif /* XEN_NOSPEC_H */
 
 /*
-- 
2.7.4





* [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
                     ` (4 preceding siblings ...)
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec Norbert Manthey
@ 2019-01-29 14:43   ` Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 7/9] is_hvm/pv_domain: " Norbert Manthey
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel,
	Norbert Manthey

Checks of domain properties, such as is_hardware_domain or is_hvm_domain,
might be bypassed by speculatively executing these instructions. One
reason these checks can be bypassed is that the macros access the domain
structure via a pointer and check a certain field. Since this memory
access is slow, the CPU speculates on the returned value and continues
execution.

In case an is_control_domain check is bypassed, for example during a
hypercall, data that should only be accessible by the control domain could
be loaded into the cache.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/include/xen/sched.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -23,6 +23,7 @@
 #include <asm/atomic.h>
 #include <xen/vpci.h>
 #include <xen/wait.h>
+#include <xen/nospec.h>
 #include <public/xen.h>
 #include <public/domctl.h>
 #include <public/sysctl.h>
@@ -908,10 +909,10 @@ void watchdog_domain_destroy(struct domain *d);
  *    (that is, this would not be suitable for a driver domain)
  *  - There is never a reason to deny the hardware domain access to this
  */
-#define is_hardware_domain(_d) ((_d) == hardware_domain)
+#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain)
 
 /* This check is for functionality specific to a control domain */
-#define is_control_domain(_d) ((_d)->is_privileged)
+#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged)
 
 #define VM_ASSIST(d, t) (test_bit(VMASST_TYPE_ ## t, &(d)->vm_assist))
 
-- 
2.7.4





* [PATCH SpectreV1+L1TF v5 7/9] is_hvm/pv_domain: block speculation
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
                     ` (5 preceding siblings ...)
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation Norbert Manthey
@ 2019-01-29 14:43   ` Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 9/9] common/memory: " Norbert Manthey
  8 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel,
	Norbert Manthey

When checking whether a domain is an HVM or a PV domain, we have to make
sure that speculation cannot bypass the check and eventually load data
into the cache that should not be accessible for the current domain type.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/include/xen/sched.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -918,7 +918,8 @@ void watchdog_domain_destroy(struct domain *d);
 
 static inline bool is_pv_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false;
+    return IS_ENABLED(CONFIG_PV)
+           ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
 }
 
 static inline bool is_pv_vcpu(const struct vcpu *v)
@@ -949,7 +950,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu *v)
 #endif
 static inline bool is_hvm_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false;
+    return IS_ENABLED(CONFIG_HVM)
+           ? evaluate_nospec(d->guest_type == guest_type_hvm) : false;
 }
 
 static inline bool is_hvm_vcpu(const struct vcpu *v)
-- 
2.7.4





* [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
                     ` (6 preceding siblings ...)
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 7/9] is_hvm/pv_domain: " Norbert Manthey
@ 2019-01-29 14:43   ` Norbert Manthey
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 9/9] common/memory: " Norbert Manthey
  8 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel,
	Norbert Manthey

Guests can issue grant table operations and provide guest-controlled
data with them. This data is also used for memory loads. To avoid
speculative out-of-bound accesses, we use the array_index_nospec macro
where applicable. However, some memory accesses cannot be protected by
a single array-index clamp, for example multiple accesses in a row. To
protect these, a speculation barrier is placed between the actual range
check and the access via the block_speculation macro.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/common/grant_table.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -37,6 +37,7 @@
 #include <xen/paging.h>
 #include <xen/keyhandler.h>
 #include <xen/vmap.h>
+#include <xen/nospec.h>
 #include <xsm/xsm.h>
 #include <asm/flushtlb.h>
 
@@ -203,8 +204,9 @@ static inline unsigned int nr_status_frames(const struct grant_table *gt)
 }
 
 #define MAPTRACK_PER_PAGE (PAGE_SIZE / sizeof(struct grant_mapping))
-#define maptrack_entry(t, e) \
-    ((t)->maptrack[(e)/MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])
+#define maptrack_entry(t, e)                                                   \
+    ((t)->maptrack[array_index_nospec(e, (t)->maptrack_limit)                  \
+                                     /MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])
 
 static inline unsigned int
 nr_maptrack_frames(struct grant_table *t)
@@ -963,6 +965,9 @@ map_grant_ref(
         PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
                  op->ref, rgt->domain->domain_id);
 
+    /* Make sure the above check is not bypassed speculatively */
+    op->ref = array_index_nospec(op->ref, nr_grant_entries(rgt));
+
     act = active_entry_acquire(rgt, op->ref);
     shah = shared_entry_header(rgt, op->ref);
     status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
@@ -2026,6 +2031,9 @@ gnttab_prepare_for_transfer(
         goto fail;
     }
 
+    /* Make sure the above check is not bypassed speculatively */
+    ref = array_index_nospec(ref, nr_grant_entries(rgt));
+
     sha = shared_entry_header(rgt, ref);
 
     scombo.word = *(u32 *)&sha->flags;
@@ -2223,7 +2231,8 @@ gnttab_transfer(
         okay = gnttab_prepare_for_transfer(e, d, gop.ref);
         spin_lock(&e->page_alloc_lock);
 
-        if ( unlikely(!okay) || unlikely(e->is_dying) )
+        /* Make sure this check is not bypassed speculatively */
+        if ( evaluate_nospec(unlikely(!okay) || unlikely(e->is_dying)) )
         {
             bool_t drop_dom_ref = !domain_adjust_tot_pages(e, -1);
 
@@ -2408,6 +2417,9 @@ acquire_grant_for_copy(
         PIN_FAIL(gt_unlock_out, GNTST_bad_gntref,
                  "Bad grant reference %#x\n", gref);
 
+    /* Make sure the above check is not bypassed speculatively */
+    gref = array_index_nospec(gref, nr_grant_entries(rgt));
+
     act = active_entry_acquire(rgt, gref);
     shah = shared_entry_header(rgt, gref);
     if ( rgt->gt_version == 1 )
@@ -2826,6 +2838,9 @@ static int gnttab_copy_buf(const struct gnttab_copy *op,
                  op->dest.offset, dest->ptr.offset,
                  op->len, dest->len);
 
+    /* Make sure the above checks are not bypassed speculatively */
+    block_speculation();
+
     memcpy(dest->virt + op->dest.offset, src->virt + op->source.offset,
            op->len);
     gnttab_mark_dirty(dest->domain, dest->mfn);
@@ -3211,6 +3226,10 @@ swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
     if ( unlikely(ref_b >= nr_grant_entries(d->grant_table)))
         PIN_FAIL(out, GNTST_bad_gntref, "Bad ref-b %#x\n", ref_b);
 
+    /* Make sure the above checks are not bypassed speculatively */
+    ref_a = array_index_nospec(ref_a, nr_grant_entries(d->grant_table));
+    ref_b = array_index_nospec(ref_b, nr_grant_entries(d->grant_table));
+
     /* Swapping the same ref is a no-op. */
     if ( ref_a == ref_b )
         goto out;
-- 
2.7.4





* [PATCH SpectreV1+L1TF v5 9/9] common/memory: block speculative out-of-bound accesses
  2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
                     ` (7 preceding siblings ...)
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
@ 2019-01-29 14:43   ` Norbert Manthey
  8 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-01-29 14:43 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Jan Beulich, Martin Mazein, Julian Stecklina, Bjoern Doebel,
	Norbert Manthey

The get_page_from_gfn method returns a pointer to a page that belongs
to a gfn. Before returning the pointer, the gfn is checked for being
valid. Under speculation, these checks can be bypassed, so that
the function get_page is still executed partially. Consequently, the
function page_get_owner_and_reference might be executed partially as
well. In this function, the computed pointer is accessed, resulting in
a speculative out-of-bound address load. As the gfn can be controlled by
a guest, this access is problematic.

To mitigate the root cause, an lfence instruction is added via the
evaluate_nospec macro. To make the protection generic, we do not
introduce the lfence instruction for this single check, but add it to
the mfn_valid function. This way, other potentially problematic accesses
are protected as well.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/common/pdx.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/xen/common/pdx.c b/xen/common/pdx.c
--- a/xen/common/pdx.c
+++ b/xen/common/pdx.c
@@ -18,6 +18,7 @@
 #include <xen/init.h>
 #include <xen/mm.h>
 #include <xen/bitops.h>
+#include <xen/nospec.h>
 
 /* Parameters for PFN/MADDR compression. */
 unsigned long __read_mostly max_pdx;
@@ -33,10 +34,10 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
 
 bool __mfn_valid(unsigned long mfn)
 {
-    return likely(mfn < max_page) &&
-           likely(!(mfn & pfn_hole_mask)) &&
-           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
-                           pdx_group_valid));
+    return evaluate_nospec(likely(mfn < max_page) &&
+                           likely(!(mfn & pfn_hole_mask)) &&
+                           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
+                                           pdx_group_valid)));
 }
 
 /* Sets all bits from the most-significant 1-bit down to the LSB */
-- 
2.7.4





* Re: [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-29 13:47             ` Norbert Manthey
@ 2019-01-29 15:11               ` Jan Beulich
  2019-01-30  8:06                 ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-29 15:11 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 29.01.19 at 14:47, <nmanthey@amazon.de> wrote:
> On 1/29/19 10:46, Jan Beulich wrote:
>>>>> Norbert Manthey <nmanthey@amazon.de> 01/29/19 9:35 AM >>>
>>> I am aware that both version use the same base array, and access it via
>>> different macros, which essentially partition the array based on the
>>> size of the respective struct. The underlying raw array has the same
>>> size for both version.
>> And this is the problem afaics: If a guest has requested its grant table to
>> be sized as a single page, this page can fit twice as many entries for
>> v1 than it can fit for v2. Hence the v1 grant reference pointing at the last
>> entry would point at the last entry in the (not mapped) second page for v2.
> 
> I might understand the code wrong, but a guest would ask to get at most
> N grant frames, and this number cannot be increased afterwards, i.e. the
> field gt->max_grant_frames is written exactly once. Furthermore, the
> void** shared_raw array is allocated and written exactly once with
> sufficient pointers for, namely gt->max_grant_frames many in function
> grant_table_init. Hence, independently of the version being used, at
> least the shared_raw array cannot be used for out-of-bound accesses
> during speculation with my above evaluate_nospec.

I'm afraid I'm still not following: A given number of pages is worth
twice as many grants in v1 than it is in v2. Therefore a v1 grant
reference to a grant entry tracked in the second half of the
first page would cause a speculative access to anywhere in the
second page when wrongly interpreted as a v2 ref.

> That being said, let's assume we have a v1 grant table, and speculation
> uses the v2 accesses. In that case, an existing and zero-initialized
> entry of shared_raw might be used in the first part of the
> shared_entry_v2 macro, and even if that pointer would be non-NULL, the
> page it would point to would have been cleared when growing the grant
> table in function gnttab_grow_table.

Not if the v1 ref is no smaller than half the maximum number of
v1 refs. In that case, if taken as a v2 ref, ->shared_raw[]
would need to be twice as big to cope with the larger index
(resulting from the smaller divisor in shared_entry_v2()
compared to shared_entry_v1()) in order to not be overrun.

Let's look at an example: gref 256 points into the middle of
the first page when using v1 calculations, but at the start
of the second page when using v2 calculations. Hence, if the
maximum number of grant frames was 1, we'd overrun the
array, consisting of just a single element (256 is valid as a
v1 gref in that case, but just out of bounds as a v2 one).

Furthermore, even if ->shared_raw[] itself could not be overrun,
an entry of it being NULL could be a problem with PV guests, who
can install a translation for the first page of the address space,
and thus perhaps partly control subsequent speculative execution.

Jan




* Re: [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-29 15:11               ` Jan Beulich
@ 2019-01-30  8:06                 ` Norbert Manthey
  2019-01-30 11:35                   ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-01-30  8:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/29/19 16:11, Jan Beulich wrote:
>>>> On 29.01.19 at 14:47, <nmanthey@amazon.de> wrote:
>> On 1/29/19 10:46, Jan Beulich wrote:
>>>>>> Norbert Manthey <nmanthey@amazon.de> 01/29/19 9:35 AM >>>
>>>> I am aware that both version use the same base array, and access it via
>>>> different macros, which essentially partition the array based on the
>>>> size of the respective struct. The underlying raw array has the same
>>>> size for both version.
>>> And this is the problem afaics: If a guest has requested its grant table to
>>> be sized as a single page, this page can fit twice as many entries for
>>> v1 than it can fit for v2. Hence the v1 grant reference pointing at the last
>>> entry would point at the last entry in the (not mapped) second page for v2.
>> I might understand the code wrong, but a guest would ask to get at most
>> N grant frames, and this number cannot be increased afterwards, i.e. the
>> field gt->max_grant_frames is written exactly once. Furthermore, the
>> void** shared_raw array is allocated and written exactly once with
>> sufficient pointers for, namely gt->max_grant_frames many in function
>> grant_table_init. Hence, independently of the version being used, at
>> least the shared_raw array cannot be used for out-of-bound accesses
>> during speculation with my above evaluate_nospec.
> I'm afraid I'm still not following: A given number of pages is worth
> twice as many grants in v1 than it is in v2. Therefore a v1 grant
> reference to a grant entry tracked in the second half of the
> first page would cause a speculative access to anywhere in the
> second page when wrongly interpreted as a v2 ref.
Agreed. So you want me to add another lfence to make sure the wrong
interpretation does not lead to other out-of-bound accesses down the
speculative window? In my opinion, the v1 vs v2 code does not result in
actual out-of-bound accesses, except for the NULL page case below. To
make the PV case happy, I will add the evaluate_nospec macro for the v1
vs v2 conditionals in functions with guest controlled ref indexes.
>
>> That being said, let's assume we have a v1 grant table, and speculation
>> uses the v2 accesses. In that case, an existing and zero-initialized
>> entry of shared_raw might be used in the first part of the
>> shared_entry_v2 macro, and even if that pointer would be non-NULL, the
>> page it would point to would have been cleared when growing the grant
>> table in function gnttab_grow_table.
> Not if the v1 ref is no smaller than half the maximum number of
> v1 refs. In that case, if taken as a v2 ref, ->shared_raw[]
> would need to be twice as big to cope with the larger index
> (resulting from the smaller divisor in shared_entry_v2()
> compared to shared_entry_v1()) in order to not be overrun.
>
> Let's look at an example: gref 256 points into the middle of
> the first page when using v1 calculations, but at the start
> of the second page when using v2 calculations. Hence, if the
> maximum number of grant frames was 1, we'd overrun the
> array, consisting of just a single element (256 is valid as a
> v1 gref in that case, but just out of bounds as a v2 one).
If 256 is a valid gref, then the shared_raw array holds sufficient
zero-initialized elements for such an access, even without the division
operator that is used in the shared_entry_v*() macros. Hence, no
out-of-bound access will happen here.
>
> Furthermore, even if ->shared_raw[] itself could not be overrun,
> an entry of it being NULL could be a problem with PV guests, who
> can install a translation for the first page of the address space,
> and thus perhaps partly control subsequent speculative execution.

I understand the concern. I add the evaluate_nospec as mentioned above.

Best,
Norbert

>
> Jan
>
>





* Re: [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: block speculative out-of-bound accesses
  2019-01-30  8:06                 ` Norbert Manthey
@ 2019-01-30 11:35                   ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-01-30 11:35 UTC (permalink / raw)
  To: nmanthey
  Cc: Tim Deegan, Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Andrew Cooper, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 30.01.19 at 09:06, <nmanthey@amazon.de> wrote:
> On 1/29/19 16:11, Jan Beulich wrote:
>>>>> On 29.01.19 at 14:47, <nmanthey@amazon.de> wrote:
>>> On 1/29/19 10:46, Jan Beulich wrote:
>>>>>>> Norbert Manthey <nmanthey@amazon.de> 01/29/19 9:35 AM >>>
>>>>> I am aware that both version use the same base array, and access it via
>>>>> different macros, which essentially partition the array based on the
>>>>> size of the respective struct. The underlying raw array has the same
>>>>> size for both version.
>>>> And this is the problem afaics: If a guest has requested its grant table to
>>>> be sized as a single page, this page can fit twice as many entries for
>>>> v1 than it can fit for v2. Hence the v1 grant reference pointing at the last
>>>> entry would point at the last entry in the (not mapped) second page for v2.
>>> I might understand the code wrong, but a guest would ask to get at most
>>> N grant frames, and this number cannot be increased afterwards, i.e. the
>>> field gt->max_grant_frames is written exactly once. Furthermore, the
>>> void** shared_raw array is allocated and written exactly once with
>>> sufficient pointers for, namely gt->max_grant_frames many in function
>>> grant_table_init. Hence, independently of the version being used, at
>>> least the shared_raw array cannot be used for out-of-bound accesses
>>> during speculation with my above evaluate_nospec.
>> I'm afraid I'm still not following: A give number of pages is worth
>> twice as many grants in v1 than it is in v2. Therefore a v1 grant
>> reference to a grant entry tracked in the second half of the
>> first page would cause a speculative access to anywhere in the
>> second page when wrongly interpreted as a v2 ref.
> Agreed. So you want me to add another lfence to make sure the wrong
> interpretation does not lead to other out-of-bound accesses down the
> speculative window? In my opinion, the v1 vs v2 code does not result in
> actual out-of-bound accesses, except for the NULL page case below. To
> make the PV case happy, I will add the evaluate_nospec macro for the v1
> vs v2 conditionals in functions with guest controlled ref indexes.

Please don't get me wrong - I'm not saying these have to be added.
The context here was that you added some checks but not others.
Going forward it is likely going to be important (or at least helpful)
to know where the boundaries are drawn. This is all the more so as I
think we all agree that insertion rules (or should I say guidelines)
are fuzzy enough already as long as we don't choose to guard _all_
conditionals.

>>> That being said, let's assume we have a v1 grant table, and speculation
>>> uses the v2 accesses. In that case, an existing and zero-initialized
>>> entry of shared_raw might be used in the first part of the
>>> shared_entry_v2 macro, and even if that pointer would be non-NULL, the
>>> page it would point to would have been cleared when growing the grant
>>> table in function gnttab_grow_table.
>> Not if the v1 ref is no smaller than half the maximum number of
>> v1 refs. In that case, if taken as a v2 ref, ->shared_raw[]
>> would need to be twice as big to cope with the larger index
>> (resulting from the smaller divisor in shared_entry_v2()
>> compared to shared_entry_v1()) in order to not be overrun.
>>
>> Let's look at an example: gref 256 points into the middle of
>> the first page when using v1 calculations, but at the start
>> of the second page when using v2 calculations. Hence, if the
>> maximum number of grant frames was 1, we'd overrun the
>> array, consisting of just a single element (256 is valid as a
>> v1 gref in that case, but just out of bounds as a v2 one).
> If 256 is a valid gref, then the shared_raw array holds sufficient
> zero-initialized elements for such an access, even without the division
> operator that is used in the shared_entry_v*() macros. Hence, no
> out-of-bound access will happen here.

There's no such thing as "256 is a valid gref" without saying for
what version. Since the shared table setup is driven by a number
of page frames, the number of valid grefs depends on the
version. In the single page example, 256 is a valid gref for v1,
but an out of bounds one for v2. Speculation along a v2 path
would access ->shared_raw[1] which was not set up (when
actually using v1), nor even allocated space for to store a NULL
in if ->max_grant_frames was 1.

Jan



* Re: [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses
       [not found]           ` <00FAE7AF020000F8B1E090C7@prv1-mh.provo.novell.com>
@ 2019-01-31 15:05             ` Jan Beulich
  2019-02-01 13:45               ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-31 15:05 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -365,11 +365,16 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
>      if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) )
>          return -EINVAL;
>  
> +   /*
> +    * Make sure the guest controlled value virq is bounded even during
> +    * speculative execution.
> +    */
> +    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));
> +
>      if ( virq_is_global(virq) && (vcpu != 0) )
>          return -EINVAL;
>  
> -    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
> -         ((v = d->vcpu[vcpu]) == NULL) )
> +    if ( (vcpu < 0) || ((v = domain_vcpu(d, vcpu)) == NULL) )
>          return -ENOENT;

Is there a reason for the less-than-zero check to survive?

> @@ -418,8 +423,7 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
>      int            port, vcpu = bind->vcpu;
>      long           rc = 0;
>  
> -    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
> -         (d->vcpu[vcpu] == NULL) )
> +    if ( (vcpu < 0) || domain_vcpu(d, vcpu) == NULL )
>          return -ENOENT;

I'm not sure about this one: We're not after the struct vcpu pointer
here. Right now subsequent code looks fine, but what if the actual
"vcpu" local variable was used again in a risky way further down? I
think here and elsewhere it would be best to eliminate that local
variable, and use v->vcpu_id only for subsequent consumers (or
alternatively latch the local variable's value only _after_ the call to
domain_vcpu(), which might be better especially in cases like).

> @@ -969,8 +980,8 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>          unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
>          chn->notify_vcpu_id = vcpu_id;
>          pirq_set_affinity(d, chn->u.pirq.irq,
> -                          cpumask_of(d->vcpu[vcpu_id]->processor));
> -        link_pirq_port(port, chn, d->vcpu[vcpu_id]);
> +                          cpumask_of(domain_vcpu(d, vcpu_id)->processor));
> +        link_pirq_port(port, chn, domain_vcpu(d, vcpu_id));

... this one, where you then wouldn't need to alter code other than
that actually checking the vCPU ID.

> @@ -516,14 +517,22 @@ int evtchn_fifo_init_control(struct evtchn_init_control 
> *init_control)
>      gfn     = init_control->control_gfn;
>      offset  = init_control->offset;
>  
> -    if ( vcpu_id >= d->max_vcpus || !d->vcpu[vcpu_id] )
> +    if ( !domain_vcpu(d, vcpu_id) )
>          return -ENOENT;
> -    v = d->vcpu[vcpu_id];
> +
> +    v = domain_vcpu(d, vcpu_id);

Please don't call the function twice.

Jan




* Re: [PATCH SpectreV1+L1TF v5 2/9] x86/vioapic: block speculative out-of-bound accesses
       [not found]           ` <00FA27AF020000F8B1E090C7@prv1-mh.provo.novell.com>
@ 2019-01-31 16:05             ` Jan Beulich
  2019-02-01 13:54               ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-31 16:05 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
> When interacting with io apic, a guest can specify values that are used
> as index to structures, and whose values are not compared against
> upper bounds to prevent speculative out-of-bound accesses. This change
> prevents these speculative accesses.
> 
> Furthermore, two variables are initialized and the compiler is asked to
> not optimize these initializations away, as the uninitialized, potentially
> guest controlled, variables might be used in a speculative out-of-bound
> access. As the two problematic variables are both used in the common
> function gsi_vioapic, the mitigation is implemented there. Currently,
> the problematic callers are the functions vioapic_irq_positive_edge and
> vioapic_get_trigger_mode.

I would have wished for you to say why the other two are _not_
a problem. Afaict in both cases the functions only ever get
internal data passed.

Then again I'm not convinced it's worth taking the risk that a
problematic caller gets added down the road. How about you add
initializers everywhere, clarifying in the description that it's "just
in case" for the two currently safe ones?

> This commit is part of the SpectreV1+L1TF mitigation patch series.
> 
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
> 
> ---

Btw., could you please get used to the habit of adding a brief
summary of changes for at least the most recent version here,
which aids review quite a bit?

> @@ -212,7 +220,15 @@ static void vioapic_write_redirent(
>      struct hvm_irq *hvm_irq = hvm_domain_irq(d);
>      union vioapic_redir_entry *pent, ent;
>      int unmasked = 0;
> -    unsigned int gsi = vioapic->base_gsi + idx;
> +    unsigned int gsi;
> +
> +    /* Callers of this function should make sure idx is bounded appropriately*/

Missing blank at the end of the comment (which, if this was the
only open point, would be easy enough to adjust while committing).

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: block speculative out-of-bound accesses
       [not found]           ` <00F867AF020000F8B1E090C7@prv1-mh.provo.novell.com>
@ 2019-01-31 16:19             ` Jan Beulich
  2019-01-31 20:02               ` Andrew Cooper
  2019-02-01 14:05               ` Norbert Manthey
  0 siblings, 2 replies; 150+ messages in thread
From: Jan Beulich @ 2019-01-31 16:19 UTC (permalink / raw)
  To: nmanthey, Andrew Cooper
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Tim Deegan, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
> There are multiple arrays in the HVM interface that are accessed
> with indices that are provided by the guest. To avoid speculative
> out-of-bound accesses, we use the array_index_nospec macro.
> 
> When blocking speculative out-of-bound accesses, we can classify arrays
> into dynamic arrays and static arrays. While the former are allocated
> at run time, the size of the latter is known at compile time.
> For static arrays, the compiler might be able to block speculative accesses
> in the future.
> 
> We introduce another macro that uses the ARRAY_SIZE macro to block
> speculative accesses. For arrays that are statically accessed, this macro
> can be used instead of the usual macro. Using this macro results in more
> readable code, and allows the handling of this case to be modified in a
> single place.

I think this paragraph is stale now.

> @@ -3453,7 +3456,8 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
>          if ( (index / 2) >=
>               MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT) )
>              goto gp_fault;
> -        *msr_content = var_range_base[index];
> +        *msr_content = var_range_base[array_index_nospec(index,
> +                          MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT))];
>          break;

I clearly should have noticed this earlier on - the bound passed into
the macro is not in line with the if() condition. I think you're funneling
half the number of entries into array slot 0.

> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>      if ( a.index >= HVM_NR_PARAMS )
>          return -EINVAL;
>  
> +    /*
> +     * Make sure the guest controlled value a.index is bounded even during
> +     * speculative execution.
> +     */
> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);

I'd like to come back to this model of updating local variables:
Is this really safe to do? If such a variable lives in memory
(which here it quite likely does), does speculation always
recognize the update to the value? Wouldn't it rather read
what's currently in that slot, and re-do the calculation in case
a subsequent write happens? (I know I did suggest doing so
earlier on, so I apologize if this results in you having to go
back to some earlier used model.)

Jan




* Re: [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier
       [not found]           ` <0101A7AF020000F8B1E090C7@prv1-mh.provo.novell.com>
@ 2019-01-31 16:35             ` Jan Beulich
  2019-02-05 14:23               ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-01-31 16:35 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
> @@ -1942,6 +1942,12 @@ Irrespective of Xen's setting, the feature is virtualised for HVM guests to
>  use.  By default, Xen will enable this mitigation on hardware believed to 
> be
>  vulnerable to L1TF.
>  
> +On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force
> +or prevent Xen from protecting evaluations inside the hypervisor with a barrier
> +instruction to not load potentially secret information into L1 cache.  By
> +default, Xen will enable this mitigation on hardware believed to be vulnerable
> +to L1TF.

... and having SMT enabled, since aiui this is a non-issue without.

> --- a/xen/arch/x86/spec_ctrl.c
> +++ b/xen/arch/x86/spec_ctrl.c
> @@ -21,6 +21,7 @@
>  #include <xen/lib.h>
>  #include <xen/warning.h>
>  
> +#include <asm-x86/cpuid.h>

asm/cpuid.h please

> @@ -100,6 +102,7 @@ static int __init parse_spec_ctrl(const char *s)
>              opt_ibpb = false;
>              opt_ssbd = false;
>              opt_l1d_flush = 0;
> +            opt_l1tf_barrier = 0;
>          }
>          else if ( val > 0 )
>              rc = -EINVAL;

Is this really something we want "spec-ctrl=no-xen" to disable?
It would seem to me that this should be restricted to "spec-ctrl=no".

> @@ -843,6 +849,14 @@ void __init init_speculation_mitigations(void)
>          opt_l1d_flush = cpu_has_bug_l1tf && !(caps & ARCH_CAPS_SKIP_L1DFL);
>  
>      /*
> +     * By default, enable L1TF_VULN on L1TF-vulnerable hardware
> +     */

This ought to be a single line comment.

> +    if ( opt_l1tf_barrier == -1 )
> +        opt_l1tf_barrier = cpu_has_bug_l1tf;

At the very least opt_smt should be taken into account here. But
I guess this setting of the default may need to be deferred
further, until the topology of the system is known (there may
not be any hyperthreads after all).

> +    if ( cpu_has_bug_l1tf && opt_l1tf_barrier > 0)
> +        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);

Why the left side of the &&?

> +    /*
>       * We do not disable HT by default on affected hardware.
>       *
>       * Firstly, if the user intends to use exclusively PV, or HVM shadow

Furthermore, as per the comment and logic here and below a
!HVM configuration ought to be safe too, unless "pv-l1tf=" was
used (in which case we defer to the admin anyway), so it's
questionable whether the whole logic should be there in the
first place in this case. This would then in particular keep all
of this out for the PV shim.

> --- a/xen/include/asm-x86/cpufeatures.h
> +++ b/xen/include/asm-x86/cpufeatures.h
> @@ -31,3 +31,4 @@ XEN_CPUFEATURE(SC_RSB_PV,       (FSCAPINTS+0)*32+18) /* RSB overwrite needed for
>  XEN_CPUFEATURE(SC_RSB_HVM,      (FSCAPINTS+0)*32+19) /* RSB overwrite needed for HVM */
>  XEN_CPUFEATURE(SC_MSR_IDLE,     (FSCAPINTS+0)*32+21) /* (SC_MSR_PV || SC_MSR_HVM) && default_xen_spec_ctrl */
>  XEN_CPUFEATURE(XEN_LBR,         (FSCAPINTS+0)*32+22) /* Xen uses MSR_DEBUGCTL.LBR */
> +XEN_CPUFEATURE(SC_L1TF_VULN,    (FSCAPINTS+0)*32+23) /* L1TF protection required */

Would you mind using one of the unused slots above first?

Jan




* Re: [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec
       [not found]           ` <0101E7AF020000F8B1E090C7@prv1-mh.provo.novell.com>
@ 2019-01-31 17:05             ` Jan Beulich
  2019-02-05 14:32               ` Norbert Manthey
       [not found]               ` <A18FF6C80200006BB1E090C7@prv1-mh.provo.novell.com>
  0 siblings, 2 replies; 150+ messages in thread
From: Jan Beulich @ 2019-01-31 17:05 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
> Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
> L1 cache is problematic, because when hyperthreading is used as well, a
> guest running on the sibling core can leak this potentially secret data.
> 
> To prevent these speculative accesses, we block speculation after
> accessing the domain property field by adding lfence instructions. This
> way, the CPU continues executing and loading data only once the condition
> is actually evaluated.
> 
> As the macros are typically used in if statements, the lfence has to come
> in a compatible way. Therefore, a function that returns true after an
> lfence instruction is introduced. To protect both branches after a
> conditional, an lfence instruction has to be added for the two branches.
> To be able to block speculation after several evaluations, the generic
> barrier macro block_speculation is also introduced.
> 
> As the L1TF vulnerability is only present on the x86 architecture, the
> macros will not use the lfence instruction on other architectures and the
> protection is disabled during compilation. By default, the lfence
> instruction is not present either. Only when an L1TF-vulnerable platform
> is detected is the lfence instruction patched in via alternative patching.
> 
> Introducing the lfence instructions catches a lot of potential leaks with
> a simple, unintrusive code change. During performance testing, we did not
> notice any performance effects.
> 
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

Looks okay to me now, but I'm going to wait with giving an ack
until perhaps others have given comments, as some of this
was not entirely uncontroversial. There are a few cosmetic
issues left though:

> @@ -64,6 +65,33 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>  #define array_access_nospec(array, index)                               \
>      (array)[array_index_nospec(index, ARRAY_SIZE(array))]
>  
> +/*
> + * Allow to insert a read memory barrier into conditionals
> + */

Here and below, please make single line comments really be
single lines.

> +#if defined(CONFIG_X86) && defined(CONFIG_HVM)
> +static inline bool arch_barrier_nospec_true(void) {

The brace belongs on its own line.

> +    alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN);
> +    return true;
> +}
> +#else
> +static inline bool arch_barrier_nospec_true(void) { return true; }

This could be avoided if you placed the #if inside the
function body.

> +#endif
> +
> +/*
> + * Allow to protect evaluation of conditional with respect to speculation on x86
> + */
> +#ifndef CONFIG_X86

Why is this conditional different from the one above?

> +#define evaluate_nospec(condition) (condition)
> +#else
> +#define evaluate_nospec(condition)                                         \
> +    ((condition) ? arch_barrier_nospec_true() : !arch_barrier_nospec_true())
> +#endif
> +
> +/*
> + * Allow to block speculative execution in generic code
> + */
> +#define block_speculation() (void)arch_barrier_nospec_true()

Missing an outer pair of parentheses.

Jan




* Re: [PATCH SpectreV1+L1TF v4 04/11] x86/hvm: block speculative out-of-bound accesses
  2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 04/11] x86/hvm: block speculative out-of-bound accesses Norbert Manthey
@ 2019-01-31 19:31   ` Andrew Cooper
  2019-02-01  9:06     ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Andrew Cooper @ 2019-01-31 19:31 UTC (permalink / raw)
  To: Norbert Manthey, xen-devel
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel

On 23/01/2019 11:51, Norbert Manthey wrote:
> There are multiple arrays in the HVM interface that are accessed
> with indices that are provided by the guest. To avoid speculative
> out-of-bound accesses, we use the array_index_nospec macro.
>
> When blocking speculative out-of-bound accesses, we can classify arrays
> into dynamic arrays and static arrays. While the former are allocated
> at run time, the size of the latter is known at compile time.
> For static arrays, the compiler might be able to block speculative accesses
> in the future.
>
> We introduce another macro that uses the ARRAY_SIZE macro to block
> speculative accesses. For arrays that are statically accessed, this macro
> can be used instead of the usual macro. Using this macro results in more
> readable code, and allows the handling of this case to be modified in a
> single place.
>
> This commit is part of the SpectreV1+L1TF mitigation patch series.
>
> Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>
> ---
>  xen/arch/x86/hvm/hvm.c   | 27 ++++++++++++++++++++++-----
>  xen/include/xen/nospec.h |  6 ++++++
>  2 files changed, 28 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -37,6 +37,7 @@
>  #include <xen/monitor.h>
>  #include <xen/warning.h>
>  #include <xen/vpci.h>
> +#include <xen/nospec.h>
>  #include <asm/shadow.h>
>  #include <asm/hap.h>
>  #include <asm/current.h>
> @@ -2102,7 +2103,7 @@ int hvm_mov_from_cr(unsigned int cr, unsigned int gpr)
>      case 2:
>      case 3:
>      case 4:
> -        val = curr->arch.hvm.guest_cr[cr];
> +        val = array_access_nospec(curr->arch.hvm.guest_cr, cr);

This is an interesting case - we don't actually need protection here.

This path is called exclusively from intercepts, so cr is strictly one
of 0, 2, 3, 4, 8 even under adversarial speculation.  However, as
guest_cr[] is only 5 entries long, the 8 case can still result in an OoB
read.

However, given that the 8 index is in the hw_cr[] array and guaranteed
to be in the cache by this point, an attacker can't gain any additional
information by poisoning the switch logic.

(And furthermore, almost all hardware in the past decade has TPR
shadowing support, at which point we don't even hit the cr8 intercept.)

~Andrew


* Re: [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-01-31 16:19             ` [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: " Jan Beulich
@ 2019-01-31 20:02               ` Andrew Cooper
  2019-02-01  8:23                 ` Jan Beulich
  2019-02-01 14:05               ` Norbert Manthey
  1 sibling, 1 reply; 150+ messages in thread
From: Andrew Cooper @ 2019-01-31 20:02 UTC (permalink / raw)
  To: Jan Beulich, nmanthey
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Ian Jackson, Tim Deegan,
	Dario Faggioli, Martin Pohlack, Julien Grall, Bjoern Doebel,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse

On 31/01/2019 16:19, Jan Beulich wrote:
>
>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>      if ( a.index >= HVM_NR_PARAMS )
>>          return -EINVAL;
>>  
>> +    /*
>> +     * Make sure the guest controlled value a.index is bounded even during
>> +     * speculative execution.
>> +     */
>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
> I'd like to come back to this model of updating local variables:
> Is this really safe to do? If such a variable lives in memory
> (which here it quite likely does), does speculation always
> recognize the update to the value? Wouldn't it rather read
> what's currently in that slot, and re-do the calculation in case
> a subsequent write happens? (I know I did suggest doing so
> earlier on, so I apologize if this results in you having to go
> back to some earlier used model.)

I'm afraid that is a very complicated set of questions to answer.

The processor needs to track write=>read dependencies to avoid wasting a
large quantity of time doing erroneous speculation, therefore it does. 
Pending writes which have happened under speculation are forwarded to
dependant instructions.

This behaviour is what gives rise to Bounds Check Bypass Store - a half
spectre-v1 gadget but with a store rather than a load.  You can e.g.
speculatively modify the return address on the stack, and hijack
speculation to an attacker controlled address for a brief period of
time.  If the speculation window is long enough, the processor first
follows the RSB/RAS (correctly), then later notices that the real value
on the stack was different, discards the speculation from the RSB/RAS
and uses the attacker controlled value instead, then eventually notices
that all of this was bogus and rewinds back to the original branch.

Another corner case is Speculative Store Bypass, where memory
disambiguation speculation can miss the fact that there is a real
write=>read dependency, and cause speculation using the older stale
value for a period of time.


As to overall safety, array_index_nospec() only works as intended when
the index remains in a register between the cmp/sbb which bounds it
under speculation, and the array access.  There is no way to guarantee
this property, as the compiler can spill any value if it thinks it needs to.

The general safety of the construct relies on the fact that an
optimising compiler will do its very best to avoid spilling variables to
the stack.  As with all of these issues, you can only confirm whether
you are no longer vulnerable by inspecting the eventual compiled code.

~Andrew


* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-25 10:14     ` Jan Beulich
  2019-01-25 10:50       ` Norbert Manthey
@ 2019-01-31 22:39       ` Andrew Cooper
  2019-02-01  8:02         ` Jan Beulich
  1 sibling, 1 reply; 150+ messages in thread
From: Andrew Cooper @ 2019-01-31 22:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

On 25/01/2019 10:14, Jan Beulich wrote:
>>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
>> Worse is the "evaluate condition, stash result, fence, use variable"
>> option, which is almost completely useless.  If you work out the
>> resulting instruction stream, you'll have a conditional expression
>> calculated down into a register, then a fence, then a test register and
>> conditional jump into one of two basic blocks.  This takes the perf hit,
>> and doesn't protect either of the basic blocks for speculative
>> mis-execution.
> How does it not protect anything? It shrinks the speculation window
> to just the register test and conditional branch,

A speculation window starts at a number of arbitrary points, and persists
until the processor has confirmed the speculation precondition was true
or false.  There can be multiple overlapping speculative windows open at
a single time.

> which ought to be
> far smaller than that behind a memory access which fails to hit any
> of the caches (and perhaps even any of the TLBs). This is the more
> that LFENCE does specifically not prevent insn fetching from
> continuing.

I'm afraid that isn't relevant.

For the attack described in this series, the speculation window which
matters starts with a conditional jump.  In this scenario, the fact that
you have stashed the value and issued a fence doesn't stop an attacker
from controlling the conditional jump.

The lfence doesn't interact with the branch predictor.  Any poisoned
predictions will survive.

As a result, the only safe course of action is to let the processor
follow the prediction, *then* wait for speculation to catch up with
reality and see whether the prediction was correct.  As such, code is
only safe when the fence is at the head of both basic blocks.

> That said I agree that the LFENCE would better sit between the
> register test and the conditional branch, but as we've said so many
> times before - this can't be achieved without compiler support.

It also doesn't fix the problem.

Both of these examples do narrow the speculation to just having each
basic block entered with each others legitimate entry condition, but the
following code sample is still vulnerable to leakage under these two
related strategies.

int foo(int a, int b, int c)
{
    if ( eval_nospec(a) )
        return array_b[b];
    else
        return array_c[c];
}

> Then again, following an earlier reply of mine on another sub-
> thread, nothing really prevents the compiler from moving ahead
> and folding the two LFENCEs of the "both branches" model into
> one. It just so happens that apparently right now this never
> occurs (assuming Norbert has done full generated code analysis
> to confirm the intended placement).

Following on from that other thread, eval_nospec() is only useful if we
can guarantee that it places fences at the head of each basic block,
rather than elsewhere.

~Andrew


* Re: [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option
  2019-01-31 22:39       ` Andrew Cooper
@ 2019-02-01  8:02         ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-02-01  8:02 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

>>> On 31.01.19 at 23:39, <andrew.cooper3@citrix.com> wrote:
> On 25/01/2019 10:14, Jan Beulich wrote:
>>>>> On 24.01.19 at 22:29, <andrew.cooper3@citrix.com> wrote:
>>> Worse is the "evaluate condition, stash result, fence, use variable"
>>> option, which is almost completely useless.  If you work out the
>>> resulting instruction stream, you'll have a conditional expression
>>> calculated down into a register, then a fence, then a test register and
>>> conditional jump into one of two basic blocks.  This takes the perf hit,
>>> and doesn't protect either of the basic blocks for speculative
>>> mis-execution.
>> How does it not protect anything? It shrinks the speculation window
>> to just the register test and conditional branch,
> 
> A speculation window starts at a number of arbitrary points, and persists
> until the processor has confirmed the speculation precondition was true
> or false.  There can be multiple overlapping speculative windows open at
> a single time.
> 
>> which ought to be
>> far smaller than that behind a memory access which fails to hit any
>> of the caches (and perhaps even any of the TLBs). This is the more
>> that LFENCE does specifically not prevent insn fetching from
>> continuing.
> 
> I'm afraid that isn't relevant.
> 
> For the attack described in this series, the speculation window which
> matters starts with a conditional jump.  In this scenario, the fact that
> you have stashed the value and issued a fence doesn't stop an attacker
> from controlling the conditional jump.
> 
> The lfence doesn't interact with the branch predictor.  Any poisoned
> predictions will survive.
> 
> As a result, the only safe course of action is to let the processor
> follow the prediction, *then* wait for speculation to catch up with
> reality and see whether the prediction was correct.  As such, code is
> only safe when the fence is at the head of both basic blocks.
> 
>> That said I agree that the LFENCE would better sit between the
>> register test and the conditional branch, but as we've said so many
>> times before - this can't be achieved without compiler support.
> 
> It also doesn't fix the problem.
> 
> Both of these examples do narrow the speculation to just having each
> basic block entered with each others legitimate entry condition, but the
> following code sample is still vulnerable to leakage under these two
> related strategies.
> 
> int foo(int a, int b, int c)
> {
>     if ( eval_nospec(a) )
>         return array_b[b];
>     else
>         return array_c[c];
> }

All fine, but as you say, speculation starts at the conditional
branch. If the LFENCE sits immediately ahead of it, how far can
speculation actually make it before it gets canceled? I'm not
putting under question that the best we can do is adding one
fence on each side, but as we're anyway debating how to
balance added security vs lost performance, I remain not fully
convinced that this isn't an option someone may want to pick.

>> Then again, following an earlier reply of mine on another sub-
>> thread, nothing really prevents the compiler from moving ahead
>> and folding the two LFENCEs of the "both branches" model into
>> one. It just so happens that apparently right now this never
>> occurs (assuming Norbert has done full generated code analysis
>> to confirm the intended placement).
> 
> Following on from that other thread, eval_nospec() is only useful if we
> can guarantee that it places fences at the head of each basic block,
> rather than elsewhere.

We clearly agree on this aspect. The questionable point though
isn't what is useful, but what the compiler might possibly do with
the constructs we add.

Jan




* Re: [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-01-31 20:02               ` Andrew Cooper
@ 2019-02-01  8:23                 ` Jan Beulich
  2019-02-01 14:06                   ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-01  8:23 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Tim Deegan, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, nmanthey,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

>>> On 31.01.19 at 21:02, <andrew.cooper3@citrix.com> wrote:
> On 31/01/2019 16:19, Jan Beulich wrote:
>>
>>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>>      if ( a.index >= HVM_NR_PARAMS )
>>>          return -EINVAL;
>>>  
>>> +    /*
>>> +     * Make sure the guest controlled value a.index is bounded even during
>>> +     * speculative execution.
>>> +     */
>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>> I'd like to come back to this model of updating local variables:
>> Is this really safe to do? If such a variable lives in memory
>> (which here it quite likely does), does speculation always
>> recognize the update to the value? Wouldn't it rather read
>> what's currently in that slot, and re-do the calculation in case
>> a subsequent write happens? (I know I did suggest doing so
>> earlier on, so I apologize if this results in you having to go
>> back to some earlier used model.)
> 
> I'm afraid that is a very complicated set of questions to answer.
> 
> The processor needs to track write=>read dependencies to avoid wasting a
> large quantity of time doing erroneous speculation, therefore it does. 
> Pending writes which have happened under speculation are forwarded to
> dependant instructions.
> 
> This behaviour is what gives rise to Bounds Check Bypass Store - a half
> spectre-v1 gadget but with a store rather than a load.  You can e.g.
> speculatively modify the return address on the stack, and hijack
> speculation to an attacker controlled address for a brief period of
> time.  If the speculation window is long enough, the processor first
> follows the RSB/RAS (correctly), then later notices that the real value
> on the stack was different, discards the speculation from the RSB/RAS
> and uses the attacker controlled value instead, then eventually notices
> that all of this was bogus and rewinds back to the original branch.
> 
> Another corner case is Speculative Store Bypass, where memory
> disambiguation speculation can miss the fact that there is a real
> write=>read dependency, and cause speculation using the older stale
> value for a period of time.
> 
> 
> As to overall safety, array_index_nospec() only works as intended when
> the index remains in a register between the cmp/sbb which bounds it
> under speculation, and the array access.  There is no way to guarantee
> this property, as the compiler can spill any value if it thinks it needs to.
> 
> The general safety of the construct relies on the fact that an
> optimising compiler will do its very best to avoid spilling variables to
> the stack.

"Its very best" may be extremely limited with enough variables.
Even if we were to annotate them with the "register" keyword,
that still wouldn't help, as that's only a hint. We simply have no
way to control which variables the compiler wants to hold in
registers. I dare to guess that in the particular example above
it's rather unlikely to be put in a register.

In any event it looks like you support my suspicion that earlier
comments of mine may have driven things into a less safe
direction, and we instead need to accept the more heavy
clutter of scattering around array_{access,index}_nospec()
at all use sites instead of latching the result of
array_index_nospec() into whatever shape of local variable.

Which raises another interesting question: Can't CSE and
alike get in the way here? OPTIMIZER_HIDE_VAR() expands
to a non-volatile asm() (and as per remarks elsewhere I'm
unconvinced adding volatile would actually help), so the
compiler recognizing the same multiple times (perhaps in a
loop) could make it decide to calculate the thing just once.
array_index_mask_nospec() in effect is a pure (and actually
even const) function, and the lack of a respective attribute
doesn't make the compiler not treat it as such if it recognized
the fact. (In effect what I had asked Norbert to do to limit
the clutter was just CSE which the compiler may or may not
have recognized anyway. IOW I'm not convinced going back
would actually buy us anything.)

>  As with all of these issues, you can only confirm whether
> you are no longer vulnerable by inspecting the eventual compiled code.

Which is nothing one can sensibly do, because any change (to
code or the tool chain) would immediately invalidate all of the
previously accumulated results.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v4 04/11] x86/hvm: block speculative out-of-bound accesses
  2019-01-31 19:31   ` Andrew Cooper
@ 2019-02-01  9:06     ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-02-01  9:06 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk,
	George Dunlap, Tim Deegan, Ian Jackson, Dario Faggioli,
	Martin Pohlack, Julien Grall, nmanthey, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, David Woodhouse, Bjoern Doebel

>>> On 31.01.19 at 20:31, <andrew.cooper3@citrix.com> wrote:
> On 23/01/2019 11:51, Norbert Manthey wrote:
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -37,6 +37,7 @@
>>  #include <xen/monitor.h>
>>  #include <xen/warning.h>
>>  #include <xen/vpci.h>
>> +#include <xen/nospec.h>
>>  #include <asm/shadow.h>
>>  #include <asm/hap.h>
>>  #include <asm/current.h>
>> @@ -2102,7 +2103,7 @@ int hvm_mov_from_cr(unsigned int cr, unsigned int gpr)
>>      case 2:
>>      case 3:
>>      case 4:
>> -        val = curr->arch.hvm.guest_cr[cr];
>> +        val = array_access_nospec(curr->arch.hvm.guest_cr, cr);
> 
> This is an interesting case - we don't actually need protection here.
> 
> This path is called exclusively from intercepts, so cr is strictly one
> of 0, 2, 3, 4, 8 even under adversarial speculation.  However, as
> guest_cr[] is only 5 entries long, the 8 case can still result in an OoB
> read.
> 
> However, given that the 8 index is in the hw_cr[] array and guaranteed
> to be in the cache by this point, an attacker can't gain any additional
> information by poisoning the switch logic.

Question though is - do we want to make the safety of our
code dependent on such (easily and un-noticeably changeable)
layout considerations? I'm not opposed (and I've used similar
arguments for overruns by 1 elsewhere, albeit in cases where
the layout wasn't as far away from the code in question as it
is here, and where the two fields were adjacent), but perhaps
we'd then want a BUILD_BUG_ON() with a suitable comment
(and carefully coded to avoid potential array-index-out-of-
bounds diagnostics)?

Jan




* Re: [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses
  2019-01-31 15:05             ` [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses Jan Beulich
@ 2019-02-01 13:45               ` Norbert Manthey
  2019-02-01 14:08                 ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-01 13:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/31/19 16:05, Jan Beulich wrote:
>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>> --- a/xen/common/event_channel.c
>> +++ b/xen/common/event_channel.c
>> @@ -365,11 +365,16 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
>>      if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) )
>>          return -EINVAL;
>>  
>> +   /*
>> +    * Make sure the guest controlled value virq is bounded even during
>> +    * speculative execution.
>> +    */
>> +    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));
>> +
>>      if ( virq_is_global(virq) && (vcpu != 0) )
>>          return -EINVAL;
>>  
>> -    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
>> -         ((v = d->vcpu[vcpu]) == NULL) )
>> +    if ( (vcpu < 0) || ((v = domain_vcpu(d, vcpu)) == NULL) )
>>          return -ENOENT;
> Is there a reason for the less-than-zero check to survive?
Yes, domain_vcpu uses unsigned integers, and I want to return the proper
error code in case somebody passes a vcpu number that would overflow
into the valid range.
>
>> @@ -418,8 +423,7 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
>>      int            port, vcpu = bind->vcpu;
>>      long           rc = 0;
>>  
>> -    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
>> -         (d->vcpu[vcpu] == NULL) )
>> +    if ( (vcpu < 0) || domain_vcpu(d, vcpu) == NULL )
>>          return -ENOENT;
> I'm not sure about this one: We're not after the struct vcpu pointer
> here. Right now subsequent code looks fine, but what if the actual
> "vcpu" local variable was used again in a risky way further down? I
> think here and elsewhere it would be best to eliminate that local
> variable, and use v->vcpu_id only for subsequent consumers (or
> alternatively latch the local variable's value only _after_ the call to
> domain_vcpu(), which might be better especially in cases like this).

I agree with getting rid of using the local variable. As discussed
elsewhere, updating such a variable might not fix the problem. However,
in this commit I want to avoid speculative out-of-bound accesses using a
guest controlled variable (vcpu). Hence, I add protection to the
locations where it is used as index. As the domain_vcpu function comes
with protection, I prefer this function over explicitly using
array_index_nospec, if possible.

>
>> @@ -969,8 +980,8 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>>          unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
>>          chn->notify_vcpu_id = vcpu_id;
>>          pirq_set_affinity(d, chn->u.pirq.irq,
>> -                          cpumask_of(d->vcpu[vcpu_id]->processor));
>> -        link_pirq_port(port, chn, d->vcpu[vcpu_id]);
>> +                          cpumask_of(domain_vcpu(d, vcpu_id)->processor));
>> +        link_pirq_port(port, chn, domain_vcpu(d, vcpu_id));
> ... this one, where you then wouldn't need to alter code other than
> that actually checking the vCPU ID.
Instead, I will introduce a struct vcpu variable, assign it in the first
check of the function, and continue using this variable instead of
performing array accesses again in this function.
>
>> @@ -516,14 +517,22 @@ int evtchn_fifo_init_control(struct evtchn_init_control 
>> *init_control)
>>      gfn     = init_control->control_gfn;
>>      offset  = init_control->offset;
>>  
>> -    if ( vcpu_id >= d->max_vcpus || !d->vcpu[vcpu_id] )
>> +    if ( !domain_vcpu(d, vcpu_id) )
>>          return -ENOENT;
>> -    v = d->vcpu[vcpu_id];
>> +
>> +    v = domain_vcpu(d, vcpu_id);
> Please don't call the function twice.

I will assign the variable as part of the if statement.

Best,
Norbert

>
> Jan
>
>




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B


* Re: [PATCH SpectreV1+L1TF v5 2/9] x86/vioapic: block speculative out-of-bound accesses
  2019-01-31 16:05             ` [PATCH SpectreV1+L1TF v5 2/9] x86/vioapic: " Jan Beulich
@ 2019-02-01 13:54               ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-01 13:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/31/19 17:05, Jan Beulich wrote:
>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>> When interacting with io apic, a guest can specify values that are used
>> as index to structures, and whose values are not compared against
>> upper bounds to prevent speculative out-of-bound accesses. This change
>> prevents these speculative accesses.
>>
>> Furthermore, two variables are initialized and the compiler is asked to
>> not optimize these initializations, as the uninitialized, potentially
>> guest controlled, variables might be used in a speculative out-of-bound
>> access. As the two problematic variables are both used in the common
>> function gsi_vioapic, the mitigation is implemented there. Currently,
>> the problematic callers are the functions vioapic_irq_positive_edge and
>> vioapic_get_trigger_mode.
> I would have wished for you to say why the other two are _not_
> a problem. Afaict in both cases the functions only ever get
> internal data passed.
>
> Then again I'm not convinced it's worth taking the risk that a
> problematic caller gets added down the road. How about you add
> initializers everywhere, clarifying in the description that it's "just
> in case" for the two currently safe ones?
I will add the other initialization and update the commit message.
>
>> This commit is part of the SpectreV1+L1TF mitigation patch series.
>>
>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>>
>> ---
> Btw., could you please get used to the habit of adding a brief
> summary of changes for at least the most recent version here,
> which aids review quite a bit?
I will start to do this with the next version.
>
>> @@ -212,7 +220,15 @@ static void vioapic_write_redirent(
>>      struct hvm_irq *hvm_irq = hvm_domain_irq(d);
>>      union vioapic_redir_entry *pent, ent;
>>      int unmasked = 0;
>> -    unsigned int gsi = vioapic->base_gsi + idx;
>> +    unsigned int gsi;
>> +
>> +    /* Callers of this function should make sure idx is bounded appropriately*/
> Missing blank at the end of the comment (which, if this was the
> only open point, would be easy enough to adjust while committing).

Will fix.

Best,
Norbert

>
> Jan
>
>


* Re: [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-01-31 16:19             ` [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: " Jan Beulich
  2019-01-31 20:02               ` Andrew Cooper
@ 2019-02-01 14:05               ` Norbert Manthey
  1 sibling, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-01 14:05 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Tim Deegan, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/31/19 17:19, Jan Beulich wrote:
>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>> There are multiple arrays in the HVM interface that are accessed
>> with indices that are provided by the guest. To avoid speculative
>> out-of-bound accesses, we use the array_index_nospec macro.
>>
>> When blocking speculative out-of-bound accesses, we can classify arrays
>> into dynamic arrays and static arrays. Where the former are allocated
>> during run time, the size of the latter is known during compile time.
>> On static arrays, compiler might be able to block speculative accesses
>> in the future.
>>
>> We introduce another macro that uses the ARRAY_SIZE macro to block
>> speculative accesses. For arrays that are statically accessed, this macro
>> can be used instead of the usual macro. Using this macro results in more
>> readable code, and allows modifying the way this case is handled in a
>> single place.
> I think this paragraph is stale now.
I will drop the paragraph.
>
>> @@ -3453,7 +3456,8 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
>>          if ( (index / 2) >=
>>               MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT) )
>>              goto gp_fault;
>> -        *msr_content = var_range_base[index];
>> +        *msr_content = var_range_base[array_index_nospec(index,
>> +                          MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT))];
>>          break;
> I clearly should have noticed this earlier on - the bound passed into
> the macro is not in line with the if() condition. I think you're funneling
> half the number of entries into array slot 0.
I will fix the bound that's used in the array_index_nospec macro.
>
>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>      if ( a.index >= HVM_NR_PARAMS )
>>          return -EINVAL;
>>  
>> +    /*
>> +     * Make sure the guest controlled value a.index is bounded even during
>> +     * speculative execution.
>> +     */
>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
> I'd like to come back to this model of updating local variables:
> Is this really safe to do? If such a variable lives in memory
> (which here it quite likely does), does speculation always
> recognize the update to the value? Wouldn't it rather read
> what's currently in that slot, and re-do the calculation in case
> a subsequent write happens? (I know I did suggest doing so
> earlier on, so I apologize if this results in you having to go
> back to some earlier used model.)

I will reply to this on the thread that evolved.

Best,
Norbert

>
> Jan
>
>


* Re: [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-01  8:23                 ` Jan Beulich
@ 2019-02-01 14:06                   ` Norbert Manthey
  2019-02-01 14:31                     ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-01 14:06 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper
  Cc: Juergen Gross, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Tim Deegan, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/1/19 09:23, Jan Beulich wrote:
>>>> On 31.01.19 at 21:02, <andrew.cooper3@citrix.com> wrote:
>> On 31/01/2019 16:19, Jan Beulich wrote:
>>>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>          return -EINVAL;
>>>>  
>>>> +    /*
>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>> +     * speculative execution.
>>>> +     */
>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>> I'd like to come back to this model of updating local variables:
>>> Is this really safe to do? If such a variable lives in memory
>>> (which here it quite likely does), does speculation always
>>> recognize the update to the value? Wouldn't it rather read
>>> what's currently in that slot, and re-do the calculation in case
>>> a subsequent write happens? (I know I did suggest doing so
>>> earlier on, so I apologize if this results in you having to go
>>> back to some earlier used model.)
>> I'm afraid that is a very complicated set of questions to answer.
>>
>> The processor needs to track write=>read dependencies to avoid wasting a
>> large quantity of time doing erroneous speculation, therefore it does. 
>> Pending writes which have happened under speculation are forwarded to
>> dependant instructions.
>>
>> This behaviour is what gives rise to Bounds Check Bypass Store - a half
>> spectre-v1 gadget, but with a store rather than a load.  You can e.g.
>> speculatively modify the return address on the stack, and hijack
>> speculation to an attacker controlled address for a brief period of
>> time.  If the speculation window is long enough, the processor first
>> follows the RSB/RAS (correctly), then later notices that the real value
>> on the stack was different, discards the speculation from the RSB/RAS
>> and uses the attacker controlled value instead, then eventually notices
>> that all of this was bogus and rewinds back to the original branch.
>>
>> Another corner case is Speculative Store Bypass, where memory
>> disambiguation speculation can miss the fact that there is a real
>> write=>read dependency, and cause speculation using the older stale
>> value for a period of time.
>>
>>
>> As to overall safety, array_index_nospec() only works as intended when
>> the index remains in a register between the cmp/sbb which bounds it
>> under speculation, and the array access.  There is no way to guarantee
>> this property, as the compiler can spill any value if it thinks it needs to.
>>
>> The general safety of the construct relies on the fact that an
>> optimising compiler will do its very best to avoid spilling variables to
>> the stack.
> "Its very best" may be extremely limited with enough variables.
> Even if we were to annotate them with the "register" keyword,
> that still wouldn't help, as that's only a hint. We simply have no
> way to control which variables the compiler wants to hold in
> registers. I dare to guess that in the particular example above
> it's rather unlikely to be put in a register.
>
> In any event it looks like you support my suspicion that earlier
> comments of mine may have driven things into a less safe
> direction, and we instead need to accept the more heavy
> clutter of scattering around array_{access,index}_nospec()
> at all use sites instead of latching the result of
> array_index_nospec() into whatever shape of local variable.
>
> Which raises another interesting question: Can't CSE and
> alike get in the way here? OPTIMIZER_HIDE_VAR() expands
> to a non-volatile asm() (and as per remarks elsewhere I'm
> unconvinced adding volatile would actually help), so the
> compiler recognizing the same multiple times (perhaps in a
> loop) could make it decide to calculate the thing just once.
> array_index_mask_nospec() in effect is a pure (and actually
> even const) function, and the lack of a respective attribute
> doesn't make the compiler not treat it as such if it recognized
> the fact. (In effect what I had asked Norbert to do to limit
> the clutter was just CSE which the compiler may or may not
> have recognized anyway. IOW I'm not convinced going back
> would actually buy us anything.)

So this means I should stick to the current approach and continue
updating variables after their bound check with an array_index_nospec
call, correct?

Best,
Norbert


* Re: [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses
  2019-02-01 13:45               ` Norbert Manthey
@ 2019-02-01 14:08                 ` Jan Beulich
  2019-02-05 13:42                   ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-01 14:08 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 01.02.19 at 14:45, <nmanthey@amazon.de> wrote:
> On 1/31/19 16:05, Jan Beulich wrote:
>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>> --- a/xen/common/event_channel.c
>>> +++ b/xen/common/event_channel.c
>>> @@ -365,11 +365,16 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
>>>      if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) )
>>>          return -EINVAL;
>>>  
>>> +   /*
>>> +    * Make sure the guest controlled value virq is bounded even during
>>> +    * speculative execution.
>>> +    */
>>> +    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));
>>> +
>>>      if ( virq_is_global(virq) && (vcpu != 0) )
>>>          return -EINVAL;
>>>  
>>> -    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
>>> -         ((v = d->vcpu[vcpu]) == NULL) )
>>> +    if ( (vcpu < 0) || ((v = domain_vcpu(d, vcpu)) == NULL) )
>>>          return -ENOENT;
>> Is there a reason for the less-than-zero check to survive?
> Yes, domain_vcpu uses unsigned integers, and I want to return the proper
> error code in case somebody passes a vcpu number that would overflow
> into the valid range.

I don't see how an overflow into the valid range could occur: Negative
numbers, when converted to unsigned, become large positive numbers.
If anything in this regard were to change here, it would be the type of _both_
local variables (which get initialized from a field of type uint32_t).

>>> @@ -418,8 +423,7 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
>>>      int            port, vcpu = bind->vcpu;
>>>      long           rc = 0;
>>>  
>>> -    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
>>> -         (d->vcpu[vcpu] == NULL) )
>>> +    if ( (vcpu < 0) || domain_vcpu(d, vcpu) == NULL )
>>>          return -ENOENT;
>> I'm not sure about this one: We're not after the struct vcpu pointer
>> here. Right now subsequent code looks fine, but what if the actual
>> "vcpu" local variable was used again in a risky way further down? I
>> think here and elsewhere it would be best to eliminate that local
>> variable, and use v->vcpu_id only for subsequent consumers (or
>> alternatively latch the local variable's value only _after_ the call to
>> domain_vcpu(), which might be better especially in cases like this).
> 
> I agree with getting rid of using the local variable. As discussed
> elsewhere, updating such a variable might not fix the problem. However,
> in this commit I want to avoid speculative out-of-bound accesses using a
> guest controlled variable (vcpu). Hence, I add protection to the
> locations where it is used as index. As the domain_vcpu function comes
> with protection, I prefer this function over explicitly using
> array_index_nospec, if possible.

But domain_vcpu() does not alter an out of bounds value passed
into it in any way, i.e. subsequent array accesses using that value
would still be an issue. IOW in the case here what you do is
sufficient because there's no array access in the first place. It's
debatable whether any change is needed at all here (there would
need to be a speculation path which could observe the result of
the speculative write into chn->notify_vcpu_id).

Jan


* Re: [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-01 14:06                   ` Norbert Manthey
@ 2019-02-01 14:31                     ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-02-01 14:31 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 01.02.19 at 15:06, <nmanthey@amazon.de> wrote:
> On 2/1/19 09:23, Jan Beulich wrote:
>>>>> On 31.01.19 at 21:02, <andrew.cooper3@citrix.com> wrote:
>>> On 31/01/2019 16:19, Jan Beulich wrote:
>>>>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>>          return -EINVAL;
>>>>>  
>>>>> +    /*
>>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>>> +     * speculative execution.
>>>>> +     */
>>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>> I'd like to come back to this model of updating local variables:
>>>> Is this really safe to do? If such a variable lives in memory
>>>> (which here it quite likely does), does speculation always
>>>> recognize the update to the value? Wouldn't it rather read
>>>> what's currently in that slot, and re-do the calculation in case
>>>> a subsequent write happens? (I know I did suggest doing so
>>>> earlier on, so I apologize if this results in you having to go
>>>> back to some earlier used model.)
>>> I'm afraid that is a very complicated set of questions to answer.
>>>
>>> The processor needs to track write=>read dependencies to avoid wasting a
>>> large quantity of time doing erroneous speculation, therefore it does. 
>>> Pending writes which have happened under speculation are forwarded to
>>> dependant instructions.
>>>
>>> This behaviour is what gives rise to Bounds Check Bypass Store - a half
>>> spectre-v1 gadget, but with a store rather than a load.  You can e.g.
>>> speculatively modify the return address on the stack, and hijack
>>> speculation to an attacker controlled address for a brief period of
>>> time.  If the speculation window is long enough, the processor first
>>> follows the RSB/RAS (correctly), then later notices that the real value
>>> on the stack was different, discards the speculation from the RSB/RAS
>>> and uses the attacker controlled value instead, then eventually notices
>>> that all of this was bogus and rewinds back to the original branch.
>>>
>>> Another corner case is Speculative Store Bypass, where memory
>>> disambiguation speculation can miss the fact that there is a real
>>> write=>read dependency, and cause speculation using the older stale
>>> value for a period of time.
>>>
>>>
>>> As to overall safety, array_index_nospec() only works as intended when
>>> the index remains in a register between the cmp/sbb which bounds it
>>> under speculation, and the array access.  There is no way to guarantee
>>> this property, as the compiler can spill any value if it thinks it needs to.
>>>
>>> The general safety of the construct relies on the fact that an
>>> optimising compiler will do its very best to avoid spilling variables to
>>> the stack.
>> "Its very best" may be extremely limited with enough variables.
>> Even if we were to annotate them with the "register" keyword,
>> that still wouldn't help, as that's only a hint. We simply have no
>> way to control which variables the compiler wants to hold in
>> registers. I dare to guess that in the particular example above
>> it's rather unlikely to be put in a register.
>>
>> In any event it looks like you support my suspicion that earlier
>> comments of mine may have driven things into a less safe
>> direction, and we instead need to accept the more heavy
>> clutter of scattering around array_{access,index}_nospec()
>> at all use sites instead of latching the result of
>> array_index_nospec() into whatever shape of local variable.
>>
>> Which raises another interesting question: Can't CSE and
>> alike get in the way here? OPTIMIZER_HIDE_VAR() expands
>> to a non-volatile asm() (and as per remarks elsewhere I'm
>> unconvinced adding volatile would actually help), so the
>> compiler recognizing the same multiple times (perhaps in a
>> loop) could make it decide to calculate the thing just once.
>> array_index_mask_nospec() in effect is a pure (and actually
>> even const) function, and the lack of a respective attribute
>> doesn't make the compiler not treat it as such if it recognized
>> the fact. (In effect what I had asked Norbert to do to limit
>> the clutter was just CSE which the compiler may or may not
>> have recognized anyway. IOW I'm not convinced going back
>> would actually buy us anything.)
> 
> So this means I should stick to the current approach and continue
> updating variables after their bound check with an array_index_nospec
> call, correct?

Well, yes, at least for now I'm not convinced going back and
re-introduce the heavier code churn would buy us much. But
we'll have to see whether e.g. Andrew is of a different opinion.

Jan


* Re: [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses
  2019-02-01 14:08                 ` Jan Beulich
@ 2019-02-05 13:42                   ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-05 13:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/1/19 15:08, Jan Beulich wrote:
>>>> On 01.02.19 at 14:45, <nmanthey@amazon.de> wrote:
>> On 1/31/19 16:05, Jan Beulich wrote:
>>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>>> --- a/xen/common/event_channel.c
>>>> +++ b/xen/common/event_channel.c
>>>> @@ -365,11 +365,16 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
>>>>      if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) )
>>>>          return -EINVAL;
>>>>  
>>>> +   /*
>>>> +    * Make sure the guest controlled value virq is bounded even during
>>>> +    * speculative execution.
>>>> +    */
>>>> +    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));
>>>> +
>>>>      if ( virq_is_global(virq) && (vcpu != 0) )
>>>>          return -EINVAL;
>>>>  
>>>> -    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
>>>> -         ((v = d->vcpu[vcpu]) == NULL) )
>>>> +    if ( (vcpu < 0) || ((v = domain_vcpu(d, vcpu)) == NULL) )
>>>>          return -ENOENT;
>>> Is there a reason for the less-than-zero check to survive?
>> Yes, domain_vcpu uses unsigned integers, and I want to return the proper
>> error code, in case somebody comes with a vcpu number that would
>> overflow into the valid range.
> I don't see how an overflow into the valid range could occur: Negative
> numbers, when converted to unsigned, become large positive numbers.
> If anything in this regard were to change here, it would be the type of _both_
> local variables (which get initialized from a field of type uint32_t).
True, I will drop the < 0 check as well.
>
>>>> @@ -418,8 +423,7 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
>>>>      int            port, vcpu = bind->vcpu;
>>>>      long           rc = 0;
>>>>  
>>>> -    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
>>>> -         (d->vcpu[vcpu] == NULL) )
>>>> +    if ( (vcpu < 0) || domain_vcpu(d, vcpu) == NULL )
>>>>          return -ENOENT;
>>> I'm not sure about this one: We're not after the struct vcpu pointer
>>> here. Right now subsequent code looks fine, but what if the actual
>>> "vcpu" local variable was used again in a risky way further down? I
>>> think here and elsewhere it would be best to eliminate that local
>>> variable, and use v->vcpu_id only for subsequent consumers (or
>>> alternatively latch the local variable's value only _after_ the call to
>>> domain_vcpu(), which might be better especially in cases like this one).
>> I agree with getting rid of using the local variable. As discussed
>> elsewhere, updating such a variable might not fix the problem. However,
>> in this commit I want to avoid speculative out-of-bound accesses using a
>> guest controlled variable (vcpu). Hence, I add protection to the
>> locations where it is used as index. As the domain_vcpu function comes
>> with protection, I prefer this function over explicitly using
>> array_index_nospec, if possible.
> But domain_vcpu() does not alter an out of bounds value passed
> into it in any way, i.e. subsequent array accesses using that value
> would still be an issue. IOW in the case here what you do is
> sufficient because there's no array access in the first place. It's
> debatable whether any change is needed at all here (there would
> need to be a speculation path which could observe the result of
> the speculative write into chn->notify_vcpu_id).

In this function, the access to d->vcpu[vcpu] has to be protected, which
is achieved by using the domain_vcpu function. The rest of the function
does not read the vcpu variable, as you mentioned. Therefore, I would
keep this version of the fix, and also drop the sign check as above.

Best,
Norbert

>
> Jan
>
>




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier
  2019-01-31 16:35             ` [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier Jan Beulich
@ 2019-02-05 14:23               ` Norbert Manthey
  2019-02-05 14:43                 ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-05 14:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/31/19 17:35, Jan Beulich wrote:
>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>> @@ -1942,6 +1942,12 @@ Irrespective of Xen's setting, the feature is virtualised for HVM guests to
>>  use.  By default, Xen will enable this mitigation on hardware believed to 
>> be
>>  vulnerable to L1TF.
>>  
>> +On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force
>> +or prevent Xen from protecting evaluations inside the hypervisor with a barrier
>> +instruction to not load potentially secret information into L1 cache.  By
>> +default, Xen will enable this mitigation on hardware believed to be vulnerable
>> +to L1TF.
> ... and having SMT enabled, since aiui this is a non-issue without.
In case flushing the L1 cache is not enabled, that is still an issue,
because the transition guest -> hypervisor -> guest would still allow
hypervisor data to be retrieved from the cache. Do you want me to extend
the logic to consider L1 cache flushing as well?
>
>> --- a/xen/arch/x86/spec_ctrl.c
>> +++ b/xen/arch/x86/spec_ctrl.c
>> @@ -21,6 +21,7 @@
>>  #include <xen/lib.h>
>>  #include <xen/warning.h>
>>  
>> +#include <asm-x86/cpuid.h>
> asm/cpuid.h please
Will fix.
>
>> @@ -100,6 +102,7 @@ static int __init parse_spec_ctrl(const char *s)
>>              opt_ibpb = false;
>>              opt_ssbd = false;
>>              opt_l1d_flush = 0;
>> +            opt_l1tf_barrier = 0;
>>          }
>>          else if ( val > 0 )
>>              rc = -EINVAL;
> Is this really something we want "spec-ctrl=no-xen" to disable?
> It would seem to me that this should be restricted to "spec-ctrl=no".
I have no strong opinion here. If you ask me to move it somewhere else,
I will do that. I just want to make sure it's disabled in case
speculation mitigations should be disabled.
>
>> @@ -843,6 +849,14 @@ void __init init_speculation_mitigations(void)
>>          opt_l1d_flush = cpu_has_bug_l1tf && !(caps & ARCH_CAPS_SKIP_L1DFL);
>>  
>>      /*
>> +     * By default, enable L1TF_VULN on L1TF-vulnerable hardware
>> +     */
> This ought to be a single line comment.
Will fix.
>
>> +    if ( opt_l1tf_barrier == -1 )
>> +        opt_l1tf_barrier = cpu_has_bug_l1tf;
> At the very least opt_smt should be taken into account here. But
> I guess this setting of the default may need to be deferred
> further, until the topology of the system is known (there may
> not be any hyperthreads after all).
Again, cache flushing also has to be considered. So, I would like to
keep it like this for now.
>
>> +    if ( cpu_has_bug_l1tf && opt_l1tf_barrier > 0)
>> +        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);
> Why the left side of the &&?
IMHO, the CPU flag L1TF should only be set when the CPU is reported to
be vulnerable, even if the command line wants to enforce mitigations.
>
>> +    /*
>>       * We do not disable HT by default on affected hardware.
>>       *
>>       * Firstly, if the user intends to use exclusively PV, or HVM shadow
> Furthermore, as per the comment and logic here and below a
> !HVM configuration ought to be safe too, unless "pv-l1tf=" was
> used (in which case we defer to the admin anyway), so it's
> questionable whether the whole logic should be there in the
> first place in this case. This would then in particular keep all
> of this out for the PV shim.
For the PV shim, I could add pv-shim to my check before enabling the CPU
flag.
>> --- a/xen/include/asm-x86/cpufeatures.h
>> +++ b/xen/include/asm-x86/cpufeatures.h
>> @@ -31,3 +31,4 @@ XEN_CPUFEATURE(SC_RSB_PV,       (FSCAPINTS+0)*32+18) /* RSB overwrite needed for
>>  XEN_CPUFEATURE(SC_RSB_HVM,      (FSCAPINTS+0)*32+19) /* RSB overwrite needed for HVM */
>>  XEN_CPUFEATURE(SC_MSR_IDLE,     (FSCAPINTS+0)*32+21) /* (SC_MSR_PV || SC_MSR_HVM) && default_xen_spec_ctrl */
>>  XEN_CPUFEATURE(XEN_LBR,         (FSCAPINTS+0)*32+22) /* Xen uses MSR_DEBUGCTL.LBR */
>> +XEN_CPUFEATURE(SC_L1TF_VULN,    (FSCAPINTS+0)*32+23) /* L1TF protection required */
> Would you mind using one of the unused slots above first?

I will pick an unused slot.

Best,
Norbert

>
> Jan
>
>





* Re: [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec
  2019-01-31 17:05             ` [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec Jan Beulich
@ 2019-02-05 14:32               ` Norbert Manthey
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
       [not found]               ` <A18FF6C80200006BB1E090C7@prv1-mh.provo.novell.com>
  1 sibling, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-05 14:32 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 1/31/19 18:05, Jan Beulich wrote:
>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>> Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
>> L1 cache is problematic, because when hyperthreading is used as well, a
>> guest running on the sibling core can leak this potentially secret data.
>>
>> To prevent these speculative accesses, we block speculation after
>> accessing the domain property field by adding lfence instructions. This
>> way, the CPU continues executing and loading data only once the condition
>> is actually evaluated.
>>
>> As the macros are typically used in if statements, the lfence has to come
>> in a compatible way. Therefore, a function that returns true after an
>> lfence instruction is introduced. To protect both branches after a
>> conditional, an lfence instruction has to be added for the two branches.
>> To be able to block speculation after several evaluations, the generic
>> barrier macro block_speculation is also introduced.
>>
>> As the L1TF vulnerability is only present on the x86 architecture, the
>> macros will not use the lfence instruction on other architectures and the
>> protection is disabled during compilation. By default, the lfence
>> instruction is not present either. Only when a L1TF vulnerable platform
>> is detected, the lfence instruction is patched in via alternative patching.
>>
>> Introducing the lfence instructions catches a lot of potential leaks with
>> a simple unintrusive code change. During performance testing, we did not
>> notice performance effects.
>>
>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
> Looks okay to me now, but I'm going to wait with giving an ack
> until perhaps others have given comments, as some of this
> was not entirely uncontroversial. There are a few cosmetic
> issues left though:
>
>> @@ -64,6 +65,33 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>>  #define array_access_nospec(array, index)                               \
>>      (array)[array_index_nospec(index, ARRAY_SIZE(array))]
>>  
>> +/*
>> + * Allow to insert a read memory barrier into conditionals
>> + */
> Here and below, please make single line comments really be
> single lines.
Will fix.
>
>> +#if defined(CONFIG_X86) && defined(CONFIG_HVM)
>> +static inline bool arch_barrier_nospec_true(void) {
> The brace belongs on its own line.
Will fix.
>
>> +    alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN);
>> +    return true;
>> +}
>> +#else
>> +static inline bool arch_barrier_nospec_true(void) { return true; }
> This could be avoided if you placed the #if inside the
> function body.
I will move the #if inside.
>
>> +#endif
>> +
>> +/*
>> + * Allow to protect evaluation of conditional with respect to speculation on x86
>> + */
>> +#ifndef CONFIG_X86
> Why is this conditional different from the one above?
You are right, the two defines should be equal.
>
>> +#define evaluate_nospec(condition) (condition)
>> +#else
>> +#define evaluate_nospec(condition)                                         \
>> +    ((condition) ? arch_barrier_nospec_true() : !arch_barrier_nospec_true())
>> +#endif
>> +
>> +/*
>> + * Allow to block speculative execution in generic code
>> + */
>> +#define block_speculation() (void)arch_barrier_nospec_true()
> Missing an outer pair of parentheses.

Will add them.

Best,
Norbert

>
> Jan
>
>



* Re: [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier
  2019-02-05 14:23               ` Norbert Manthey
@ 2019-02-05 14:43                 ` Jan Beulich
  2019-02-06 13:02                   ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-05 14:43 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 05.02.19 at 15:23, <nmanthey@amazon.de> wrote:
> On 1/31/19 17:35, Jan Beulich wrote:
>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>> @@ -1942,6 +1942,12 @@ Irrespective of Xen's setting, the feature is 
> virtualised for HVM guests to
>>>  use.  By default, Xen will enable this mitigation on hardware believed to 
>>> be
>>>  vulnerable to L1TF.
>>>  
>>> +On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force
>>> +or prevent Xen from protecting evaluations inside the hypervisor with a barrier
>>> +instruction to not load potentially secret information into L1 cache.  By
>>> +default, Xen will enable this mitigation on hardware believed to be vulnerable
>>> +to L1TF.
>> ... and having SMT enabled, since aiui this is a non-issue without.
> In case flushing the L1 cache is not enabled, that is still an issue,
> because the transition guest -> hypervisor -> guest would still allow
> hypervisor data to be retrieved from the cache. Do you want me to extend
> the logic to consider L1 cache flushing as well?

Well, I wouldn't be overly concerned of people disabling it from the
command line, but being kind to people without updated microcode
is perhaps a good idea.

>>> @@ -100,6 +102,7 @@ static int __init parse_spec_ctrl(const char *s)
>>>              opt_ibpb = false;
>>>              opt_ssbd = false;
>>>              opt_l1d_flush = 0;
>>> +            opt_l1tf_barrier = 0;
>>>          }
>>>          else if ( val > 0 )
>>>              rc = -EINVAL;
>> Is this really something we want "spec-ctrl=no-xen" to disable?
>> It would seem to me that this should be restricted to "spec-ctrl=no".
> I have no strong opinion here. If you ask me to move it somewhere else,
> I will do that. I just want to make sure it's disabled in case
> speculation mitigations should be disabled.

Unless anyone else voices a different opinion, I'd like to see it
moved as suggested.

>>> @@ -843,6 +849,14 @@ void __init init_speculation_mitigations(void)
>>>          opt_l1d_flush = cpu_has_bug_l1tf && !(caps & ARCH_CAPS_SKIP_L1DFL);
>>>  
>>>      /*
>>> +     * By default, enable L1TF_VULN on L1TF-vulnerable hardware
>>> +     */
>> This ought to be a single line comment.
> Will fix.
>>
>>> +    if ( opt_l1tf_barrier == -1 )
>>> +        opt_l1tf_barrier = cpu_has_bug_l1tf;
>> At the very least opt_smt should be taken into account here. But
>> I guess this setting of the default may need to be deferred
>> further, until the topology of the system is known (there may
>> not be any hyperthreads after all).
> Again, cache flushing also has to be considered. So, I would like to
> keep it like this for now.

With the "for now" aspect properly explained in the description,
I guess that would be fine as a first step.

>>> +    if ( cpu_has_bug_l1tf && opt_l1tf_barrier > 0)
>>> +        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);
>> Why the left side of the &&?
> IMHO, the CPU flag L1TF should only be set when the CPU is reported to
> be vulnerable, even if the command line wants to enforce mitigations.

What's the command line option good for if it doesn't trigger
patching in of the LFENCEs? Command line options exist, among
other purposes, to aid mitigating flaws in our determination of
what is a vulnerable platform.

>>> +    /*
>>>       * We do not disable HT by default on affected hardware.
>>>       *
>>>       * Firstly, if the user intends to use exclusively PV, or HVM shadow
>> Furthermore, as per the comment and logic here and below a
>> !HVM configuration ought to be safe too, unless "pv-l1tf=" was
>> used (in which case we defer to the admin anyway), so it's
>> questionable whether the whole logic should be there in the
>> first place in this case. This would then in particular keep all
>> of this out for the PV shim.
> For the PV shim, I could add pv-shim to my check before enabling the CPU
> flag.

But the PV shim is just a special case. I'd like this code to be
compiled out for all !HVM configurations.

Jan




* Re: [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier
  2019-02-05 14:43                 ` Jan Beulich
@ 2019-02-06 13:02                   ` Norbert Manthey
  2019-02-06 13:20                     ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-06 13:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/5/19 15:43, Jan Beulich wrote:
>>>> On 05.02.19 at 15:23, <nmanthey@amazon.de> wrote:
>> On 1/31/19 17:35, Jan Beulich wrote:
>>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>>> @@ -1942,6 +1942,12 @@ Irrespective of Xen's setting, the feature is 
>> virtualised for HVM guests to
>>>>  use.  By default, Xen will enable this mitigation on hardware believed to 
>>>> be
>>>>  vulnerable to L1TF.
>>>>  
>>>> +On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force
>>>> +or prevent Xen from protecting evaluations inside the hypervisor with a barrier
>>>> +instruction to not load potentially secret information into L1 cache.  By
>>>> +default, Xen will enable this mitigation on hardware believed to be vulnerable
>>>> +to L1TF.
>>> ... and having SMT enabled, since aiui this is a non-issue without.
>> In case flushing the L1 cache is not enabled, that is still an issue,
>> because the transition guest -> hypervisor -> guest would allow to
>> retrieve hypervisor data from the cache still. Do you want me to extend
>> the logic to consider L1 cache flushing as well?
> Well, I wouldn't be overly concerned of people disabling it from the
> command line, but being kind to people without updated microcode
> is perhaps a good idea.
I will extend the commit message to state that the CPU flag is set
automatically, independently of SMT and cache flushing.
>
>>>> @@ -100,6 +102,7 @@ static int __init parse_spec_ctrl(const char *s)
>>>>              opt_ibpb = false;
>>>>              opt_ssbd = false;
>>>>              opt_l1d_flush = 0;
>>>> +            opt_l1tf_barrier = 0;
>>>>          }
>>>>          else if ( val > 0 )
>>>>              rc = -EINVAL;
>>> Is this really something we want "spec-ctrl=no-xen" to disable?
>>> It would seem to me that this should be restricted to "spec-ctrl=no".
>> I have no strong opinion here. If you ask me to move it somewhere else,
>> I will do that. I just want to make sure it's disabled in case
>> speculation mitigations should be disabled.
> Unless anyone else voices a different opinion, I'd like to see it
> moved as suggested.
I will move the change above the disable_common label.
>>>> @@ -843,6 +849,14 @@ void __init init_speculation_mitigations(void)
>>>>          opt_l1d_flush = cpu_has_bug_l1tf && !(caps & ARCH_CAPS_SKIP_L1DFL);
>>>>  
>>>>      /*
>>>> +     * By default, enable L1TF_VULN on L1TF-vulnerable hardware
>>>> +     */
>>> This ought to be a single line comment.
>> Will fix.
>>>> +    if ( opt_l1tf_barrier == -1 )
>>>> +        opt_l1tf_barrier = cpu_has_bug_l1tf;
>>> At the very least opt_smt should be taken into account here. But
>>> I guess this setting of the default may need to be deferred
>>> further, until the topology of the system is known (there may
>>> not be any hyperthreads after all).
>> Again, cache flushing also has to be considered. So, I would like to
>> keep it like this for now.
> With the "for now" aspect properly explained in the description,
> I guess that would be fine as a first step.
I will extend the commit message accordingly.
>
>>>> +    if ( cpu_has_bug_l1tf && opt_l1tf_barrier > 0)
>>>> +        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);
>>> Why the left side of the &&?
>> IMHO, the CPU flag L1TF should only be set when the CPU is reported to
>> be vulnerable, even if the command line wants to enforce mitigations.
> What's the command line option good for if it doesn't trigger
> patching in of the LFENCEs? Command line options exist, among
> other purposes, to aid mitigating flaws in our determination of
> what is a vulnerable platform.
I will remove the extra conditional and enable patching based on the
command line only.
>
>>>> +    /*
>>>>       * We do not disable HT by default on affected hardware.
>>>>       *
>>>>       * Firstly, if the user intends to use exclusively PV, or HVM shadow
>>> Furthermore, as per the comment and logic here and below a
>>> !HVM configuration ought to be safe too, unless "pv-l1tf=" was
>>> used (in which case we defer to the admin anyway), so it's
>>> questionable whether the whole logic should be there in the
>>> first place in this case. This would then in particular keep all
>>> of this out for the PV shim.
>> For the PV shim, I could add pv-shim to my check before enabling the CPU
>> flag.
> But the PV shim is just a special case. I'd like this code to be
> compiled out for all !HVM configurations.

The patch that introduces the evaluate_nospec macro does that already.
Based on defined(CONFIG_HVM), lfence patching is disabled there.

Do you want me to wrap this command line option into CONFIG_HVM checks
as well?

Best,
Norbert






* Re: [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier
  2019-02-06 13:02                   ` Norbert Manthey
@ 2019-02-06 13:20                     ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-02-06 13:20 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 06.02.19 at 14:02, <nmanthey@amazon.de> wrote:
> On 2/5/19 15:43, Jan Beulich wrote:
>>>>> On 05.02.19 at 15:23, <nmanthey@amazon.de> wrote:
>>> On 1/31/19 17:35, Jan Beulich wrote:
>>>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>>>> +    /*
>>>>>       * We do not disable HT by default on affected hardware.
>>>>>       *
>>>>>       * Firstly, if the user intends to use exclusively PV, or HVM shadow
>>>> Furthermore, as per the comment and logic here and below a
>>>> !HVM configuration ought to be safe too, unless "pv-l1tf=" was
>>>> used (in which case we defer to the admin anyway), so it's
>>>> questionable whether the whole logic should be there in the
>>>> first place in this case. This would then in particular keep all
>>>> of this out for the PV shim.
>>> For the PV shim, I could add pv-shim to my check before enabling the CPU
>>> flag.
>> But the PV shim is just a special case. I'd like this code to be
>> compiled out for all !HVM configurations.
> 
> The patch that introduces the evaluate_nospec macro does that already.
> Based on defined(CONFIG_HVM), lfence patching is disabled there.

Oh, right.

> Do you want me to wrap this command line option into CONFIG_HVM checks
> as well?

That would be nice; I have a patch for post-4.12 where I do
something similar to opt_xpti_*. Therefore if you didn't do it
here, I'd probably submit a fixup patch down the road.

Jan




* Re: [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses
       [not found]           ` <0104A7AF020000F8B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-06 14:52             ` Jan Beulich
  2019-02-06 15:06               ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-06 14:52 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
> @@ -963,6 +965,9 @@ map_grant_ref(
>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>                   op->ref, rgt->domain->domain_id);
>  
> +    /* Make sure the above check is not bypassed speculatively */
> +    op->ref = array_index_nospec(op->ref, nr_grant_entries(rgt));
> +
>      act = active_entry_acquire(rgt, op->ref);
>      shah = shared_entry_header(rgt, op->ref);
>      status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);

Just FTR - this is a case where the change, according to prior
discussion, is pretty unlikely to help at all. The compiler will have
a hard time realizing that it could keep the result in a register past
the active_entry_acquire() invocation, as that - due to the spin
lock acquired there - acts as a compiler barrier. And looking at
generated code (gcc 8.2) confirms that there's a reload from the
stack.

> @@ -2026,6 +2031,9 @@ gnttab_prepare_for_transfer(
>          goto fail;
>      }
>  
> +    /* Make sure the above check is not bypassed speculatively */
> +    ref = array_index_nospec(ref, nr_grant_entries(rgt));
> +
>      sha = shared_entry_header(rgt, ref);
>  
>      scombo.word = *(u32 *)&sha->flags;
> @@ -2223,7 +2231,8 @@ gnttab_transfer(
>          okay = gnttab_prepare_for_transfer(e, d, gop.ref);
>          spin_lock(&e->page_alloc_lock);
>  
> -        if ( unlikely(!okay) || unlikely(e->is_dying) )
> +        /* Make sure this check is not bypassed speculatively */
> +        if ( evaluate_nospec(unlikely(!okay) || unlikely(e->is_dying)) )

I'm still not really happy about this. The comment isn't helpful in
connecting the use of evaluate_nospec() to the problem site
(in the earlier hunk, which I've left in context), and I still don't
understand why the e->is_dying is getting wrapped as well.
Plus it occurs to me now that you're liable to render unlikely()
ineffective here. So how about

        if ( unlikely(evaluate_nospec(!okay)) || unlikely(e->is_dying) )

?

Jan




* Re: [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation
       [not found]           ` <010527AF020000F8B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-06 15:03             ` Jan Beulich
  2019-02-06 15:36               ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-06 15:03 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
> @@ -908,10 +909,10 @@ void watchdog_domain_destroy(struct domain *d);
>   *    (that is, this would not be suitable for a driver domain)
>   *  - There is never a reason to deny the hardware domain access to this
>   */
> -#define is_hardware_domain(_d) ((_d) == hardware_domain)
> +#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain)
>  
>  /* This check is for functionality specific to a control domain */
> -#define is_control_domain(_d) ((_d)->is_privileged)
> +#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged)

I'm afraid there's another fly in the ointment here: While looking at
the still questionable grant table change I've started wondering
about constructs like

    case XENMEM_machphys_mapping:
    {
        struct xen_machphys_mapping mapping = {
            .v_start = MACH2PHYS_VIRT_START,
            .v_end   = MACH2PHYS_VIRT_END,
            .max_mfn = MACH2PHYS_NR_ENTRIES - 1
        };

        if ( !mem_hotplug && is_hardware_domain(current->domain) )
            mapping.max_mfn = max_page - 1;
        if ( copy_to_guest(arg, &mapping, 1) )
            return -EFAULT;

        return 0;
    }

Granted the example here could be easily re-arranged, but there
are others where this is less easy or not possible at all. What I'm
trying to get at are constructs where such-protected
predicates sit on the right side of && or || - afaict (also from
looking at some much simplified code examples) the intended
protection is gone in these cases.

Jan




* Re: [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-06 14:52             ` [PATCH SpectreV1+L1TF v5 " Jan Beulich
@ 2019-02-06 15:06               ` Norbert Manthey
  2019-02-06 15:53                 ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-06 15:06 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/6/19 15:52, Jan Beulich wrote:
>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>> @@ -963,6 +965,9 @@ map_grant_ref(
>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>                   op->ref, rgt->domain->domain_id);
>>  
>> +    /* Make sure the above check is not bypassed speculatively */
>> +    op->ref = array_index_nospec(op->ref, nr_grant_entries(rgt));
>> +
>>      act = active_entry_acquire(rgt, op->ref);
>>      shah = shared_entry_header(rgt, op->ref);
>>      status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
> Just FTR - this is a case where the change, according to prior
> discussion, is pretty unlikely to help at all. The compiler will have
> a hard time realizing that it could keep the result in a register past
> the active_entry_acquire() invocation, as that - due to the spin
> lock acquired there - acts as a compiler barrier. And looking at
> generated code (gcc 8.2) confirms that there's a reload from the
> stack.
I could change this back to a prior version that protects each read
operation.
>> @@ -2026,6 +2031,9 @@ gnttab_prepare_for_transfer(
>>          goto fail;
>>      }
>>  
>> +    /* Make sure the above check is not bypassed speculatively */
>> +    ref = array_index_nospec(ref, nr_grant_entries(rgt));
>> +
>>      sha = shared_entry_header(rgt, ref);
>>  
>>      scombo.word = *(u32 *)&sha->flags;
>> @@ -2223,7 +2231,8 @@ gnttab_transfer(
>>          okay = gnttab_prepare_for_transfer(e, d, gop.ref);
>>          spin_lock(&e->page_alloc_lock);
>>  
>> -        if ( unlikely(!okay) || unlikely(e->is_dying) )
>> +        /* Make sure this check is not bypassed speculatively */
>> +        if ( evaluate_nospec(unlikely(!okay) || unlikely(e->is_dying)) )
> I'm still not really happy about this. The comment isn't helpful in
> connecting the use of evaluate_nospec() to the problem site
> (in the earlier hunk, which I've left in context), and I still don't
> understand why the e->is_dying is getting wrapped as well.
> Plus it occurs to me now that you're liable to render unlikely()
> ineffective here. So how about
>
>         if ( unlikely(evaluate_nospec(!okay)) || unlikely(e->is_dying) )
>
> ?

I will move the evaluate_nospec closer to the evaluation of okay, and
improve the comment to mention that the okay variable indicates whether
the current reference is actually valid.

Best,
Norbert




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v5 9/9] common/memory: block speculative out-of-bound accesses
       [not found]           ` <20F3469E02000096B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-06 15:25             ` Jan Beulich
  2019-02-06 15:39               ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-06 15:25 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
> @@ -33,10 +34,10 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
>  
>  bool __mfn_valid(unsigned long mfn)
>  {
> -    return likely(mfn < max_page) &&
> -           likely(!(mfn & pfn_hole_mask)) &&
> -           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
> -                           pdx_group_valid));
> +    return evaluate_nospec(likely(mfn < max_page) &&
> +                           likely(!(mfn & pfn_hole_mask)) &&
> +                           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
> +                                           pdx_group_valid)));

Other than in the questionable grant table case, here I agree that
you want to wrap the entire construct. This has an unwanted effect
though: The test_bit() may still be speculated into with an out-of-
bounds mfn. (As mentioned elsewhere, operations on bit arrays are
an open issue altogether.) I therefore think you want to split this into
two:

bool __mfn_valid(unsigned long mfn)
{
    return likely(evaluate_nospec(mfn < max_page)) &&
           evaluate_nospec(likely(!(mfn & pfn_hole_mask)) &&
                           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
                                           pdx_group_valid)));
}

Jan




* Re: [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation
  2019-02-06 15:03             ` [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation Jan Beulich
@ 2019-02-06 15:36               ` Norbert Manthey
  2019-02-06 16:01                 ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-06 15:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/6/19 16:03, Jan Beulich wrote:
>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>> @@ -908,10 +909,10 @@ void watchdog_domain_destroy(struct domain *d);
>>   *    (that is, this would not be suitable for a driver domain)
>>   *  - There is never a reason to deny the hardware domain access to this
>>   */
>> -#define is_hardware_domain(_d) ((_d) == hardware_domain)
>> +#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain)
>>  
>>  /* This check is for functionality specific to a control domain */
>> -#define is_control_domain(_d) ((_d)->is_privileged)
>> +#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged)
> I'm afraid there's another fly in the ointment here: While looking at
> the still questionable grant table change I've started wondering
> about constructs like
>
>     case XENMEM_machphys_mapping:
>     {
>         struct xen_machphys_mapping mapping = {
>             .v_start = MACH2PHYS_VIRT_START,
>             .v_end   = MACH2PHYS_VIRT_END,
>             .max_mfn = MACH2PHYS_NR_ENTRIES - 1
>         };
>
>         if ( !mem_hotplug && is_hardware_domain(current->domain) )
>             mapping.max_mfn = max_page - 1;
>         if ( copy_to_guest(arg, &mapping, 1) )
>             return -EFAULT;
>
>         return 0;
>     }
>
> Granted the example here could be easily re-arranged, but there
> are others where this is less easy or not possible at all. What I'm
> trying to get at are constructs where the such-protected
> predicates sit on the right side of && or || - afaict (also from
> looking at some much simplified code examples) the intended
> protection is gone in these cases.

I do not follow this. Independently of other conditionals in the if
statement, there should be an lfence instruction between the
"is_hardware_domain(...)" evaluation and accessing the max_page variable
- in case the code actually protects accessing that variable via that
function.

I validated this property for the above code snippet in the generated
assembly. However, I just noticed another problem: while my initial
version just placed the lfence instruction right into the code, now the
arch_barrier_nospec_true method is called via callq. I would like to get
the instructions to be embedded into the code directly, without the call
detour. In case I cannot force the compiler to do that, I would go back
to using a fixed lfence statement on all x86 platforms.

Best,
Norbert






* Re: [PATCH SpectreV1+L1TF v5 9/9] common/memory: block speculative out-of-bound accesses
  2019-02-06 15:25             ` [PATCH SpectreV1+L1TF v5 9/9] common/memory: block speculative out-of-bound accesses Jan Beulich
@ 2019-02-06 15:39               ` Norbert Manthey
  2019-02-06 16:08                 ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-06 15:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/6/19 16:25, Jan Beulich wrote:
>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>> @@ -33,10 +34,10 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
>>  
>>  bool __mfn_valid(unsigned long mfn)
>>  {
>> -    return likely(mfn < max_page) &&
>> -           likely(!(mfn & pfn_hole_mask)) &&
>> -           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>> -                           pdx_group_valid));
>> +    return evaluate_nospec(likely(mfn < max_page) &&
>> +                           likely(!(mfn & pfn_hole_mask)) &&
>> +                           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>> +                                           pdx_group_valid)));
> Other than in the questionable grant table case, here I agree that
> you want to wrap the entire construct. This has an unwanted effect
> though: The test_bit() may still be speculated into with an out-of-
> bounds mfn. (As mentioned elsewhere, operations on bit arrays are
> an open issue altogether.) I therefore think you want to split this into
> two:
>
> bool __mfn_valid(unsigned long mfn)
> {
>     return likely(evaluate_nospec(mfn < max_page)) &&
>            evaluate_nospec(likely(!(mfn & pfn_hole_mask)) &&
>                            likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>                                            pdx_group_valid)));
> }

I can split the code. However, I wonder whether the test_bit accesses
should be protected separately, or rather as part of the test_bit
method itself. Do you already have plans to do that? In that case I
would not have to modify the code.

Best,
Norbert





* Re: [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-06 15:06               ` Norbert Manthey
@ 2019-02-06 15:53                 ` Jan Beulich
  2019-02-07  9:50                   ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-06 15:53 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 06.02.19 at 16:06, <nmanthey@amazon.de> wrote:
> On 2/6/19 15:52, Jan Beulich wrote:
>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>> @@ -963,6 +965,9 @@ map_grant_ref(
>>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>>                   op->ref, rgt->domain->domain_id);
>>>  
>>> +    /* Make sure the above check is not bypassed speculatively */
>>> +    op->ref = array_index_nospec(op->ref, nr_grant_entries(rgt));
>>> +
>>>      act = active_entry_acquire(rgt, op->ref);
>>>      shah = shared_entry_header(rgt, op->ref);
>>>      status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
>> Just FTR - this is a case where the change, according to prior
>> discussion, is pretty unlikely to help at all. The compiler will have
>> a hard time realizing that it could keep the result in a register past
>> the active_entry_acquire() invocation, as that - due to the spin
>> lock acquired there - acts as a compiler barrier. And looking at
>> generated code (gcc 8.2) confirms that there's a reload from the
>> stack.
> I could change this back to a prior version that protects each read
> operation.

That or use block_speculation() with a comment explaining why.

Also - why are there no changes at all to the unmap_grant_ref() /
unmap_and_replace() call paths? Note in particular the security
related comment next to the bounds check of op->ref there. I've
gone through earlier review rounds, but I couldn't find an indication
that this might have been the result of review feedback.

Jan




* Re: [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation
  2019-02-06 15:36               ` Norbert Manthey
@ 2019-02-06 16:01                 ` Jan Beulich
  2019-02-07 10:02                   ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-06 16:01 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 06.02.19 at 16:36, <nmanthey@amazon.de> wrote:
> On 2/6/19 16:03, Jan Beulich wrote:
>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>> @@ -908,10 +909,10 @@ void watchdog_domain_destroy(struct domain *d);
>>>   *    (that is, this would not be suitable for a driver domain)
>>>   *  - There is never a reason to deny the hardware domain access to this
>>>   */
>>> -#define is_hardware_domain(_d) ((_d) == hardware_domain)
>>> +#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain)
>>>  
>>>  /* This check is for functionality specific to a control domain */
>>> -#define is_control_domain(_d) ((_d)->is_privileged)
>>> +#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged)
>> I'm afraid there's another fly in the ointment here: While looking at
>> the still questionable grant table change I've started wondering
>> about constructs like
>>
>>     case XENMEM_machphys_mapping:
>>     {
>>         struct xen_machphys_mapping mapping = {
>>             .v_start = MACH2PHYS_VIRT_START,
>>             .v_end   = MACH2PHYS_VIRT_END,
>>             .max_mfn = MACH2PHYS_NR_ENTRIES - 1
>>         };
>>
>>         if ( !mem_hotplug && is_hardware_domain(current->domain) )
>>             mapping.max_mfn = max_page - 1;
>>         if ( copy_to_guest(arg, &mapping, 1) )
>>             return -EFAULT;
>>
>>         return 0;
>>     }
>>
>> Granted the example here could be easily re-arranged, but there
>> are others where this is less easy or not possible at all. What I'm
>> trying to get at are constructs where the such-protected
>> predicates sit on the right side of && or || - afaict (also from
>> looking at some much simplified code examples) the intended
>> protection is gone in these cases.
> 
> I do not follow this. Independently of other conditionals in the if
> statement, there should be an lfence instruction between the
> "is_hardware_domain(...)" evaluation and accessing the max_page variable
> - in case the code actually protects accessing that variable via that
> function.

Hmm, yes, I may have been confused by looking at the && and ||
variants of the generated code, and mixing up the cases.

> I validated this property for the above code snippet in the generated
> assembly. However, I just noticed another problem: while my initial
> version just placed the lfence instruction right into the code, now the
> arch_barrier_nospec_true method is called via callq. I would like to get
> the instructions to be embedded into the code directly, without the call
> detour. In case I cannot force the compiler to do that, I would go back
> to using a fixed lfence statement on all x86 platforms.

I think we had made pretty clear that incurring the overhead even
onto unaffected platforms is not an option. Did you try whether
adding always_inline helps? (I take it that this is another case of
the size-of-asm issue that's being worked on in Linux as well iirc.)

Jan




* Re: [PATCH SpectreV1+L1TF v5 9/9] common/memory: block speculative out-of-bound accesses
  2019-02-06 15:39               ` Norbert Manthey
@ 2019-02-06 16:08                 ` Jan Beulich
  2019-02-07  7:20                   ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-06 16:08 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 06.02.19 at 16:39, <nmanthey@amazon.de> wrote:
> On 2/6/19 16:25, Jan Beulich wrote:
>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>> @@ -33,10 +34,10 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
>>>  
>>>  bool __mfn_valid(unsigned long mfn)
>>>  {
>>> -    return likely(mfn < max_page) &&
>>> -           likely(!(mfn & pfn_hole_mask)) &&
>>> -           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>>> -                           pdx_group_valid));
>>> +    return evaluate_nospec(likely(mfn < max_page) &&
>>> +                           likely(!(mfn & pfn_hole_mask)) &&
>>> +                           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>>> +                                           pdx_group_valid)));
>> Other than in the questionable grant table case, here I agree that
>> you want to wrap the entire construct. This has an unwanted effect
>> though: The test_bit() may still be speculated into with an out-of-
>> bounds mfn. (As mentioned elsewhere, operations on bit arrays are
>> an open issue altogether.) I therefore think you want to split this into
>> two:
>>
>> bool __mfn_valid(unsigned long mfn)
>> {
>>     return likely(evaluate_nospec(mfn < max_page)) &&
>>            evaluate_nospec(likely(!(mfn & pfn_hole_mask)) &&
>>                            likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>>                                            pdx_group_valid)));
>> }
> 
> I can split the code. However, I wonder whether the test_bit accesses
> should be protected separately, or rather as part of the test_bit
> method itself. Do you already have plans to do that? In that case I
> would not have to modify the code.

I don't think we want to do that in test_bit() and friends
themselves, as that would likely produce more unnecessary
changes than necessary ones. Even the change here
already looks to have much bigger impact than would be
wanted, as in the common case MFNs aren't guest controlled.
ISTR that originally you had modified just a single call site,
but I can't seem to find that in my inbox anymore. If that
was the case, what exactly were the criteria upon which
you had chosen this sole caller?

Jan




* Re: [PATCH SpectreV1+L1TF v5 9/9] common/memory: block speculative out-of-bound accesses
  2019-02-06 16:08                 ` Jan Beulich
@ 2019-02-07  7:20                   ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-07  7:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/6/19 17:08, Jan Beulich wrote:
>>>> On 06.02.19 at 16:39, <nmanthey@amazon.de> wrote:
>> On 2/6/19 16:25, Jan Beulich wrote:
>>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>>> @@ -33,10 +34,10 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
>>>>  
>>>>  bool __mfn_valid(unsigned long mfn)
>>>>  {
>>>> -    return likely(mfn < max_page) &&
>>>> -           likely(!(mfn & pfn_hole_mask)) &&
>>>> -           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>>>> -                           pdx_group_valid));
>>>> +    return evaluate_nospec(likely(mfn < max_page) &&
>>>> +                           likely(!(mfn & pfn_hole_mask)) &&
>>>> +                           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>>>> +                                           pdx_group_valid)));
>>> Other than in the questionable grant table case, here I agree that
>>> you want to wrap the entire construct. This has an unwanted effect
>>> though: The test_bit() may still be speculated into with an out-of-
>>> bounds mfn. (As mentioned elsewhere, operations on bit arrays are
>>> an open issue altogether.) I therefore think you want to split this into
>>> two:
>>>
>>> bool __mfn_valid(unsigned long mfn)
>>> {
>>>     return likely(evaluate_nospec(mfn < max_page)) &&
>>>            evaluate_nospec(likely(!(mfn & pfn_hole_mask)) &&
>>>                            likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>>>                                            pdx_group_valid)));
>>> }
>> I can split the code. However, I wonder whether the test_bit accesses
>> should be protected separately, or actually as part of the test_bit
>> method themselves. Do you have any plans to do that already, because in
>> that case I would not have to modify the code.
> I don't think we want to do that in test_bit() and friends
> themselves, as that would likely produce more unnecessary
> changes than necessary ones. Even the change here
> already looks to have much bigger impact than would be
> wanted, as in the common case MFNs aren't guest controlled.
> ISTR that originally you had modified just a single call site,
> but I can't seem to find that in my inbox anymore. If that
> was the case, what exactly were the criteria upon which
> you had chosen this sole caller?

I understand that these fixes should not go into test_bit itself. I
could add a local array_index_nospec fix for this call, so that no
additional lfence has to be introduced.

I picked the specific caller in the first versions because there was a
direct path from a hypercall where the guest had full control over the mfn.
IIRC, that call was not spotted by tooling, but by manual analysis.
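A local fix at the call site could look roughly like the sketch below. The helper names are hypothetical; the bit-array layout mirrors the generic one-unsigned-long-per-BITS_PER_LONG-bits scheme, and the clamp models array_index_nospec:

```c
#include <assert.h>
#include <limits.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Branch-free index clamp modelling array_index_nospec(): idx when in
 * bounds, 0 otherwise.  Assumes idx and nbits fit in a signed long. */
static inline unsigned long clamp_index_nospec(unsigned long idx,
                                               unsigned long nbits)
{
    unsigned long mask =
        (unsigned long)((long)(idx - nbits) >> (MODEL_BITS_PER_LONG - 1));

    return idx & mask;
}

/* test_bit with the bit number clamped before the array access, so a
 * speculatively out-of-bounds bit number cannot index past the array. */
static inline int test_bit_nospec(unsigned long nr,
                                  const unsigned long *addr,
                                  unsigned long nbits)
{
    nr = clamp_index_nospec(nr, nbits);
    return (int)((addr[nr / MODEL_BITS_PER_LONG] >>
                  (nr % MODEL_BITS_PER_LONG)) & 1);
}
```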

Best,
Norbert





* Re: [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-06 15:53                 ` Jan Beulich
@ 2019-02-07  9:50                   ` Norbert Manthey
  2019-02-07 10:20                     ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-07  9:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/6/19 16:53, Jan Beulich wrote:
>>>> On 06.02.19 at 16:06, <nmanthey@amazon.de> wrote:
>> On 2/6/19 15:52, Jan Beulich wrote:
>>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>>> @@ -963,6 +965,9 @@ map_grant_ref(
>>>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>>>                   op->ref, rgt->domain->domain_id);
>>>>  
>>>> +    /* Make sure the above check is not bypassed speculatively */
>>>> +    op->ref = array_index_nospec(op->ref, nr_grant_entries(rgt));
>>>> +
>>>>      act = active_entry_acquire(rgt, op->ref);
>>>>      shah = shared_entry_header(rgt, op->ref);
>>>>      status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
>>> Just FTR - this is a case where the change, according to prior
>>> discussion, is pretty unlikely to help at all. The compiler will have
>>> a hard time realizing that it could keep the result in a register past
>>> the active_entry_acquire() invocation, as that - due to the spin
>>> lock acquired there - acts as a compiler barrier. And looking at
>>> generated code (gcc 8.2) confirms that there's a reload from the
>>> stack.
>> I could change this back to a prior version that protects each read
>> operation.
> That or use block_speculation() with a comment explaining why.
>
> Also - why are there no changes at all to the unmap_grant_ref() /
> unmap_and_replace() call paths? Note in particular the security
> related comment next to the bounds check of op->ref there. I've
> gone through earlier review rounds, but I couldn't find an indication
> that this might have been the result of review feedback.

You are right. I am not sure whether I had a fix placed there in the
beginning. For the next iteration, I will replace the first "smp_rmb();"
in the function unmap_common with the "block_speculation" macro.

The other check, unlikely(op->ref >= nr_grant_entries(rgt)), can only
go out of bounds in the unmap case if the map->ref entry has been
out of bounds beforehand. I did not find an assignment that is not
protected by a bounds check and either a speculation barrier or
array_index_nospec.

Best,
Norbert






* Re: [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation
  2019-02-06 16:01                 ` Jan Beulich
@ 2019-02-07 10:02                   ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-07 10:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/6/19 17:01, Jan Beulich wrote:
>>>> On 06.02.19 at 16:36, <nmanthey@amazon.de> wrote:
>> On 2/6/19 16:03, Jan Beulich wrote:
>>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>>> @@ -908,10 +909,10 @@ void watchdog_domain_destroy(struct domain *d);
>>>>   *    (that is, this would not be suitable for a driver domain)
>>>>   *  - There is never a reason to deny the hardware domain access to this
>>>>   */
>>>> -#define is_hardware_domain(_d) ((_d) == hardware_domain)
>>>> +#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain)
>>>>  
>>>>  /* This check is for functionality specific to a control domain */
>>>> -#define is_control_domain(_d) ((_d)->is_privileged)
>>>> +#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged)
>>> snip
>> I validated this property for the above code snippet in the generated
>> assembly. However, I just noticed another problem: while my initial
>> version just placed the lfence instruction right into the code, now the
>> arch_barrier_nospec_true method is called via callq. I would like to get
>> the instructions to be embedded into the code directly, without the call
>> detour. In case I cannot force the compiler to do that, I would go back
>> to using a fixed lfence statement on all x86 platforms.
> I think we had made pretty clear that incurring the overhead even
> onto unaffected platforms is not an option. Did you try whether
> adding always_inline helps? (I take it that this is another case of
> the size-of-asm issue that's being worked on in Linux as well iirc.)

I fully understand that just using lfence everywhere is not an option.

I just tested the always_inline option, and that works for my binary. I
will adapt the function definition accordingly.
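As a sketch of the direction discussed (attribute spelling as in gcc/clang; the nop-vs-lfence selection that Xen does via alternatives patching is approximated here with a compile-time architecture check, and the names are invented for the sketch):

```c
#include <stdbool.h>

/* Force the helper to be inlined so the fence sits directly in the
 * caller instead of behind a callq. */
#define model_always_inline inline __attribute__((always_inline))

static model_always_inline void barrier_nospec_sketch(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("lfence" ::: "memory");
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* evaluate_nospec-style wrapper: the condition is evaluated by the
 * caller, then fenced before the result can steer further accesses. */
static model_always_inline bool evaluate_nospec_sketch(bool cond)
{
    barrier_nospec_sketch();
    return cond;
}
```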

Best,
Norbert

>
> Jan
>
>





* Re: [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-07  9:50                   ` Norbert Manthey
@ 2019-02-07 10:20                     ` Norbert Manthey
  2019-02-07 14:00                       ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-07 10:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel


On 2/7/19 10:50, Norbert Manthey wrote:
> On 2/6/19 16:53, Jan Beulich wrote:
>>>>> On 06.02.19 at 16:06, <nmanthey@amazon.de> wrote:
>>> On 2/6/19 15:52, Jan Beulich wrote:
>>>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>>>> @@ -963,6 +965,9 @@ map_grant_ref(
>>>>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>>>>                   op->ref, rgt->domain->domain_id);
>>>>>  
>>>>> +    /* Make sure the above check is not bypassed speculatively */
>>>>> +    op->ref = array_index_nospec(op->ref, nr_grant_entries(rgt));
>>>>> +
>>>>>      act = active_entry_acquire(rgt, op->ref);
>>>>>      shah = shared_entry_header(rgt, op->ref);
>>>>>      status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
>>>> Just FTR - this is a case where the change, according to prior
>>>> discussion, is pretty unlikely to help at all. The compiler will have
>>>> a hard time realizing that it could keep the result in a register past
>>>> the active_entry_acquire() invocation, as that - due to the spin
>>>> lock acquired there - acts as a compiler barrier. And looking at
>>>> generated code (gcc 8.2) confirms that there's a reload from the
>>>> stack.
>>> I could change this back to a prior version that protects each read
>>> operation.
>> That or use block_speculation() with a comment explaining why.
>>
>> Also - why are there no changes at all to the unmap_grant_ref() /
>> unmap_and_replace() call paths? Note in particular the security
>> related comment next to the bounds check of op->ref there. I've
>> gone through earlier review rounds, but I couldn't find an indication
>> that this might have been the result of review feedback.
> You are right. I am not sure whether I had a fix placed there in the
> beginning. For the next iteration, I will replace the first "smp_rmb();"
> in the function unmap_common with the "block_speculation" macro.
I just checked this one more time. The maptrack_entry macro has already
been extended with the array_index_nospec macro, so that the assignment
to the map variable stays in bounds. Therefore, I will not actually
introduce the block_speculation macro.
>
> The other check, unlikely(op->ref >= nr_grant_entries(rgt)), can only
> go out of bounds in the unmap case if the map->ref entry has been
> out of bounds beforehand. I did not find an assignment that is not
> protected by a bounds check and either a speculation barrier or
> array_index_nospec.
>
> Best,
> Norbert
>
>




* Re: [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-07 10:20                     ` Norbert Manthey
@ 2019-02-07 14:00                       ` Jan Beulich
  2019-02-07 16:20                         ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-07 14:00 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 07.02.19 at 11:20, <nmanthey@amazon.de> wrote:

> On 2/7/19 10:50, Norbert Manthey wrote:
>> On 2/6/19 16:53, Jan Beulich wrote:
>>>>>> On 06.02.19 at 16:06, <nmanthey@amazon.de> wrote:
>>>> On 2/6/19 15:52, Jan Beulich wrote:
>>>>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>>>>> @@ -963,6 +965,9 @@ map_grant_ref(
>>>>>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>>>>>                   op->ref, rgt->domain->domain_id);
>>>>>>  
>>>>>> +    /* Make sure the above check is not bypassed speculatively */
>>>>>> +    op->ref = array_index_nospec(op->ref, nr_grant_entries(rgt));
>>>>>> +
>>>>>>      act = active_entry_acquire(rgt, op->ref);
>>>>>>      shah = shared_entry_header(rgt, op->ref);
>>>>>>      status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
>>>>> Just FTR - this is a case where the change, according to prior
>>>>> discussion, is pretty unlikely to help at all. The compiler will have
>>>>> a hard time realizing that it could keep the result in a register past
>>>>> the active_entry_acquire() invocation, as that - due to the spin
>>>>> lock acquired there - acts as a compiler barrier. And looking at
>>>>> generated code (gcc 8.2) confirms that there's a reload from the
>>>>> stack.
>>>> I could change this back to a prior version that protects each read
>>>> operation.
>>> That or use block_speculation() with a comment explaining why.
>>>
>>> Also - why are there no changes at all to the unmap_grant_ref() /
>>> unmap_and_replace() call paths? Note in particular the security
>>> related comment next to the bounds check of op->ref there. I've
>>> gone through earlier review rounds, but I couldn't find an indication
>>> that this might have been the result of review feedback.
>> You are right. I am not sure whether I had a fix placed there in the
>> beginning. I will replace the first "smp_rmb();" in function
>> unmap_common for the next iteration with the "block_speculation" macro.
>> I just checked this one more time. The maptrack_entry macro has been
>> extended with the array_index_nospec macro already, so that the
>> assignment to the map variable is in bounds. Therefore, I actually will
>> not introduce the block_speculation macro.

unmap_common() uses maptrack_entry() with op->handle. I didn't
refer to that, because - as you say - maptrack_entry() is itself
getting hardened already. Instead I am, as said, referring to
map->ref / op->ref.

And no, replacing _any_ smp_rmb() would not be correct: The
barriers are needed unconditionally, whereas block_speculation()
inserts a barrier only in a subset of cases (for example never on
Arm).

>> The other check unlikely(op->ref >= nr_grant_entries(rgt)) can only
>> reach out-of-bounds for the unmap case, in case the map->ref entry has
>> been out-of-bounds beforehand. I did not find an assignment that is not
>> protected by a bound check and a speculation barrier or array_nospec_index.

I can only refer you to the comment there again. In essence, the prior
bounds check done may have been against the grant table limits of
another domain. You may want to look at the full commit introducing this
comment.

Jan




* Re: [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-07 14:00                       ` Jan Beulich
@ 2019-02-07 16:20                         ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-07 16:20 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Julien Grall, David Woodhouse,
	Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/7/19 15:00, Jan Beulich wrote:
>>>> On 07.02.19 at 11:20, <nmanthey@amazon.de> wrote:
>> On 2/7/19 10:50, Norbert Manthey wrote:
>>> On 2/6/19 16:53, Jan Beulich wrote:
>>>>>>> On 06.02.19 at 16:06, <nmanthey@amazon.de> wrote:
>>>>> On 2/6/19 15:52, Jan Beulich wrote:
>>>>>>>>> On 29.01.19 at 15:43, <nmanthey@amazon.de> wrote:
>>>>>>> @@ -963,6 +965,9 @@ map_grant_ref(
>>>>>>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>>>>>>                   op->ref, rgt->domain->domain_id);
>>>>>>>  
>>>>>>> +    /* Make sure the above check is not bypassed speculatively */
>>>>>>> +    op->ref = array_index_nospec(op->ref, nr_grant_entries(rgt));
>>>>>>> +
>>>>>>>      act = active_entry_acquire(rgt, op->ref);
>>>>>>>      shah = shared_entry_header(rgt, op->ref);
>>>>>>>      status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
>>>>>> Just FTR - this is a case where the change, according to prior
>>>>>> discussion, is pretty unlikely to help at all. The compiler will have
>>>>>> a hard time realizing that it could keep the result in a register past
>>>>>> the active_entry_acquire() invocation, as that - due to the spin
>>>>>> lock acquired there - acts as a compiler barrier. And looking at
>>>>>> generated code (gcc 8.2) confirms that there's a reload from the
>>>>>> stack.
>>>>> I could change this back to a prior version that protects each read
>>>>> operation.
>>>> That or use block_speculation() with a comment explaining why.
>>>>
>>>> Also - why are there no changes at all to the unmap_grant_ref() /
>>>> unmap_and_replace() call paths? Note in particular the security
>>>> related comment next to the bounds check of op->ref there. I've
>>>> gone through earlier review rounds, but I couldn't find an indication
>>>> that this might have been the result of review feedback.
>>> You are right. I am not sure whether I had a fix placed there in the
>>> beginning. I will replace the first "smp_rmb();" in function
>>> unmap_common for the next iteration with the "block_speculation" macro.
>> I just checked this one more time. The maptrack_entry macro has been
>> extended with the array_index_nospec macro already, so that the
>> assignment to the map variable is in bounds. Therefore, I actually will
>> not introduce the block_speculation macro.
> unmap_common() uses maptrack_entry() with op->handle. I didn't
> refer to that, because - as you say - maptrack_entry() is itself
> getting hardened already. Instead I am, as said, referring to
> map->ref / op->ref.
>
> And no, replacing _any_ smp_rmb() would not be correct: The
> barriers are needed unconditionally, whereas block_speculation()
> inserts a barrier only in a subset of cases (for example never on
> Arm).
Right. I will protect the index operations based on op->ref in
unmap_common via array_index_nospec.
>
>>> The other check unlikely(op->ref >= nr_grant_entries(rgt)) can only
>>> reach out-of-bounds for the unmap case, in case the map->ref entry has
>>> been out-of-bounds beforehand. I did not find an assignment that is not
>>> protected by a bound check and a speculation barrier or array_nospec_index.
> I can only refer you to the comment there again. In essence, the prior
> bounds check done may have been against the grant table limits of
> another domain. You may want to look at the full commit introducing this
> comment.

In unmap_common_complete, IMHO it is sufficient to evaluate the first
check, op->done, via evaluate_nospec, so that the return is taken in case
nothing has been done; invalid values of op->ref should then not be used
under speculation, nor out-of-bounds. On the other hand, this function is
always called after gnttab_flush_tlb. I did not spot a good indicator
that that function blocks speculation, hence I would still add the
macro.

Best,
Norbert






* Re: [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec
  2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec Norbert Manthey
@ 2019-02-08  9:20     ` Julien Grall
  0 siblings, 0 replies; 150+ messages in thread
From: Julien Grall @ 2019-02-08  9:20 UTC (permalink / raw)
  To: Norbert Manthey, xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, David Woodhouse, Jan Beulich,
	Martin Mazein, Julian Stecklina, Bjoern Doebel

Hi,

On 29/01/2019 14:43, Norbert Manthey wrote:
> Since the L1TF vulnerability of Intel CPUs, loading hypervisor data into
> L1 cache is problemetic, because when hyperthreading is used as well, a

s/problemetic/problematic/

> guest running on the sibling core can leak this potentially secret data.
> 
> To prevent these speculative accesses, we block speculation after
> accessing the domain property field by adding lfence instructions. This
> way, the CPU continues executing and loading data only once the condition
> is actually evaluated.
> 
> As the macros are typically used in if statements, the lfence has to come
> in a compatible way. Therefore, a function that returns true after an
> lfence instruction is introduced. To protect both branches after a
> conditional, an lfence instruction has to be added for the two branches.
> To be able to block speculation after several evalauations, the generic

s/evalauations/evaluations/

> barrier macro block_speculation is also introduced.
> 
> As the L1TF vulnerability is only present on the x86 architecture, the
> macros will not use the lfence instruction on other architectures and the
> protection is disabled during compilation.

This sentence is a bit misleading because the lfence instruction does not exist 
on Arm. A better wording would be:

"As the L1TF vulnerability is only present on the x86 architecture, there is no 
need to add protection for other architectures."

> By default, the lfence
> instruction is not present either. Only when a L1TF vulnerable platform
> is detected, the lfence instruction is patched in via alterantive patching.

s/alterantive/alternative/

> 
> Introducing the lfence instructions catches a lot of potential leaks with
> a simple unintrusive code change. During performance testing, we did not
> notice performance effects.
> 
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
> ---
>   xen/include/xen/nospec.h | 28 ++++++++++++++++++++++++++++
>   1 file changed, 28 insertions(+)
> 
> diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
> --- a/xen/include/xen/nospec.h
> +++ b/xen/include/xen/nospec.h
> @@ -7,6 +7,7 @@
>   #ifndef XEN_NOSPEC_H
>   #define XEN_NOSPEC_H
>   
> +#include <asm/alternative.h>

I don't want asm/alternative.h to be included here when it is not necessary on 
Arm. This is one of the reason why I suggested to have arch specific code in 
arch specific header rather than in common headers.

>   #include <asm/system.h>
>   
>   /**
> @@ -64,6 +65,33 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
>   #define array_access_nospec(array, index)                               \
>       (array)[array_index_nospec(index, ARRAY_SIZE(array))]
>   
> +/*
> + * Allow to insert a read memory barrier into conditionals
> + */
> +#if defined(CONFIG_X86) && defined(CONFIG_HVM)

I am not an x86 expert, however I think you should explain in the commit message 
why this is only built for HVM.

> +static inline bool arch_barrier_nospec_true(void) {
> +    alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN);
> +    return true;
> +}
> +#else
> +static inline bool arch_barrier_nospec_true(void) { return true; }
> +#endif
> +
> +/*
> + * Allow to protect evaluation of conditional with respect to speculation on x86
> + */
> +#ifndef CONFIG_X86
> +#define evaluate_nospec(condition) (condition)
> +#else
> +#define evaluate_nospec(condition)                                         \
> +    ((condition) ? arch_barrier_nospec_true() : !arch_barrier_nospec_true())
> +#endif
> +
> +/*
> + * Allow to block speculative execution in generic code
> + */
> +#define block_speculation() (void)arch_barrier_nospec_true()
> +
>   #endif /* XEN_NOSPEC_H */
>   
>   /*
> 

Cheers,

-- 
Julien Grall


* SpectreV1+L1TF Patch Series v6
  2019-02-05 14:32               ` Norbert Manthey
@ 2019-02-08 13:44                 ` Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
                                     ` (9 more replies)
  0 siblings, 10 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel

Dear all,

This patch series attempts to mitigate the issues that have been raised in
XSA-289 (https://xenbits.xen.org/xsa/advisory-289.html). To block speculative
execution on Intel hardware, an lfence instruction is required to make sure
that selected checks are not bypassed. Speculative out-of-bound accesses can
be prevented by using the array_index_nospec macro.

The major changes between v5 and v6 of this series are the introduction of
asm-specific nospec.h files that provide macros to add lfence instructions
to the evaluation of conditionals. Furthermore, we now try to avoid updating
variables that might not be located in a register.

Best,
Norbert





* [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
@ 2019-02-08 13:44                   ` Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 2/9] x86/vioapic: " Norbert Manthey
                                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel, Norbert Manthey

Guests can trigger event channel interactions with guest-specified data.
To avoid speculative out-of-bound accesses, we use the nospec macros or
the domain_vcpu function.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---

Notes:
  v6: drop vcpu < 0 check
      use struct vcpu in evtchn_bind_vcpu
      do not call domain_vcpu twice in evtchn_fifo_word_from_port

 xen/common/event_channel.c | 34 +++++++++++++++++++++++-----------
 xen/common/event_fifo.c    | 13 ++++++++++---
 xen/include/xen/event.h    |  5 +++--
 3 files changed, 36 insertions(+), 16 deletions(-)

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -365,11 +365,16 @@ int evtchn_bind_virq(evtchn_bind_virq_t *bind, evtchn_port_t port)
     if ( (virq < 0) || (virq >= ARRAY_SIZE(v->virq_to_evtchn)) )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value virq is bounded even during
+     * speculative execution.
+     */
+    virq = array_index_nospec(virq, ARRAY_SIZE(v->virq_to_evtchn));
+
     if ( virq_is_global(virq) && (vcpu != 0) )
         return -EINVAL;
 
-    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
-         ((v = d->vcpu[vcpu]) == NULL) )
+    if ( (v = domain_vcpu(d, vcpu)) == NULL )
         return -ENOENT;
 
     spin_lock(&d->event_lock);
@@ -418,8 +423,7 @@ static long evtchn_bind_ipi(evtchn_bind_ipi_t *bind)
     int            port, vcpu = bind->vcpu;
     long           rc = 0;
 
-    if ( (vcpu < 0) || (vcpu >= d->max_vcpus) ||
-         (d->vcpu[vcpu] == NULL) )
+    if ( domain_vcpu(d, vcpu) == NULL )
         return -ENOENT;
 
     spin_lock(&d->event_lock);
@@ -813,6 +817,13 @@ int set_global_virq_handler(struct domain *d, uint32_t virq)
 
     if (virq >= NR_VIRQS)
         return -EINVAL;
+
+    /*
+     * Make sure the guest controlled value virq is bounded even during
+     * speculative execution.
+     */
+    virq = array_index_nospec(virq, ARRAY_SIZE(global_virq_handlers));
+
     if (!virq_is_global(virq))
         return -EINVAL;
 
@@ -930,8 +941,9 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
     struct domain *d = current->domain;
     struct evtchn *chn;
     long           rc = 0;
+    struct vcpu   *v;
 
-    if ( (vcpu_id >= d->max_vcpus) || (d->vcpu[vcpu_id] == NULL) )
+    if ( (v = domain_vcpu(d, vcpu_id)) == NULL )
         return -ENOENT;
 
     spin_lock(&d->event_lock);
@@ -955,22 +967,22 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
     {
     case ECS_VIRQ:
         if ( virq_is_global(chn->u.virq) )
-            chn->notify_vcpu_id = vcpu_id;
+            chn->notify_vcpu_id = v->vcpu_id;
         else
             rc = -EINVAL;
         break;
     case ECS_UNBOUND:
     case ECS_INTERDOMAIN:
-        chn->notify_vcpu_id = vcpu_id;
+        chn->notify_vcpu_id = v->vcpu_id;
         break;
     case ECS_PIRQ:
-        if ( chn->notify_vcpu_id == vcpu_id )
+        if ( chn->notify_vcpu_id == v->vcpu_id )
             break;
         unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
-        chn->notify_vcpu_id = vcpu_id;
+        chn->notify_vcpu_id = v->vcpu_id;
         pirq_set_affinity(d, chn->u.pirq.irq,
-                          cpumask_of(d->vcpu[vcpu_id]->processor));
-        link_pirq_port(port, chn, d->vcpu[vcpu_id]);
+                          cpumask_of(v->processor));
+        link_pirq_port(port, chn, v);
         break;
     default:
         rc = -EINVAL;
diff --git a/xen/common/event_fifo.c b/xen/common/event_fifo.c
--- a/xen/common/event_fifo.c
+++ b/xen/common/event_fifo.c
@@ -33,7 +33,8 @@ static inline event_word_t *evtchn_fifo_word_from_port(const struct domain *d,
      */
     smp_rmb();
 
-    p = port / EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
+    p = array_index_nospec(port / EVTCHN_FIFO_EVENT_WORDS_PER_PAGE,
+                           d->evtchn_fifo->num_evtchns);
     w = port % EVTCHN_FIFO_EVENT_WORDS_PER_PAGE;
 
     return d->evtchn_fifo->event_array[p] + w;
@@ -516,14 +517,20 @@ int evtchn_fifo_init_control(struct evtchn_init_control *init_control)
     gfn     = init_control->control_gfn;
     offset  = init_control->offset;
 
-    if ( vcpu_id >= d->max_vcpus || !d->vcpu[vcpu_id] )
+    if ( (v = domain_vcpu(d, vcpu_id)) == NULL )
         return -ENOENT;
-    v = d->vcpu[vcpu_id];
 
     /* Must not cross page boundary. */
     if ( offset > (PAGE_SIZE - sizeof(evtchn_fifo_control_block_t)) )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value offset is bounded even during
+     * speculative execution.
+     */
+    offset = array_index_nospec(offset,
+                           PAGE_SIZE - sizeof(evtchn_fifo_control_block_t) + 1);
+
     /* Must be 8-bytes aligned. */
     if ( offset & (8 - 1) )
         return -EINVAL;
diff --git a/xen/include/xen/event.h b/xen/include/xen/event.h
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -13,6 +13,7 @@
 #include <xen/smp.h>
 #include <xen/softirq.h>
 #include <xen/bitops.h>
+#include <xen/nospec.h>
 #include <asm/event.h>
 
 /*
@@ -103,7 +104,7 @@ void arch_evtchn_inject(struct vcpu *v);
  * The first bucket is directly accessed via d->evtchn.
  */
 #define group_from_port(d, p) \
-    ((d)->evtchn_group[(p) / EVTCHNS_PER_GROUP])
+    array_access_nospec((d)->evtchn_group, (p) / EVTCHNS_PER_GROUP)
 #define bucket_from_port(d, p) \
     ((group_from_port(d, p))[((p) % EVTCHNS_PER_GROUP) / EVTCHNS_PER_BUCKET])
 
@@ -117,7 +118,7 @@ static inline bool_t port_is_valid(struct domain *d, unsigned int p)
 static inline struct evtchn *evtchn_from_port(struct domain *d, unsigned int p)
 {
     if ( p < EVTCHNS_PER_BUCKET )
-        return &d->evtchn[p];
+        return &d->evtchn[array_index_nospec(p, EVTCHNS_PER_BUCKET)];
     return bucket_from_port(d, p) + (p % EVTCHNS_PER_BUCKET);
 }
 
-- 
2.7.4





* [PATCH SpectreV1+L1TF v6 2/9] x86/vioapic: block speculative out-of-bound accesses
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
@ 2019-02-08 13:44                   ` Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: " Norbert Manthey
                                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel, Norbert Manthey

When interacting with the IO APIC, a guest can specify values that are
used as indices into structures, and these values are not compared
against upper bounds in a way that prevents speculative out-of-bound
accesses. This change prevents these speculative accesses.

Furthermore, variables are initialized and the compiler is asked not to
optimize away these initializations, as the uninitialized, potentially
guest controlled, variables might be used in a speculative out-of-bound
access. Out of the four initialized variables, two are potentially
problematic, namely the ones in the functions vioapic_irq_positive_edge
and vioapic_get_trigger_mode.

As the two problematic variables are both used in the common function
gsi_vioapic, the mitigation is implemented there. As the access pattern
of the currently non-guest-controlled functions might change in the
future, the other variables are initialized as well.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---

Notes:
  v6: Explain initialization in commit message
      Initialize pin in all 4 functions that call gsi_vioapic
      Fix space in comment

 xen/arch/x86/hvm/vioapic.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
--- a/xen/arch/x86/hvm/vioapic.c
+++ b/xen/arch/x86/hvm/vioapic.c
@@ -30,6 +30,7 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/sched.h>
+#include <xen/nospec.h>
 #include <public/hvm/ioreq.h>
 #include <asm/hvm/io.h>
 #include <asm/hvm/vpic.h>
@@ -66,6 +67,12 @@ static struct hvm_vioapic *gsi_vioapic(const struct domain *d,
 {
     unsigned int i;
 
+    /*
+     * Make sure the compiler does not optimize away the initialization done by
+     * callers
+     */
+    OPTIMIZER_HIDE_VAR(*pin);
+
     for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
     {
         struct hvm_vioapic *vioapic = domain_vioapic(d, i);
@@ -117,7 +124,8 @@ static uint32_t vioapic_read_indirect(const struct hvm_vioapic *vioapic)
             break;
         }
 
-        redir_content = vioapic->redirtbl[redir_index].bits;
+        redir_content = vioapic->redirtbl[array_index_nospec(redir_index,
+                                                       vioapic->nr_pins)].bits;
         result = (vioapic->ioregsel & 1) ? (redir_content >> 32)
                                          : redir_content;
         break;
@@ -212,7 +220,15 @@ static void vioapic_write_redirent(
     struct hvm_irq *hvm_irq = hvm_domain_irq(d);
     union vioapic_redir_entry *pent, ent;
     int unmasked = 0;
-    unsigned int gsi = vioapic->base_gsi + idx;
+    unsigned int gsi;
+
+    /* Callers of this function should make sure idx is bounded appropriately */
+    ASSERT(idx < vioapic->nr_pins);
+
+    /* Make sure no out-of-bound value for idx can be used */
+    idx = array_index_nospec(idx, vioapic->nr_pins);
+
+    gsi = vioapic->base_gsi + idx;
 
     spin_lock(&d->arch.hvm.irq_lock);
 
@@ -467,7 +483,7 @@ static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
 
 void vioapic_irq_positive_edge(struct domain *d, unsigned int irq)
 {
-    unsigned int pin;
+    unsigned int pin = 0; /* See gsi_vioapic */
     struct hvm_vioapic *vioapic = gsi_vioapic(d, irq, &pin);
     union vioapic_redir_entry *ent;
 
@@ -542,7 +558,7 @@ void vioapic_update_EOI(struct domain *d, u8 vector)
 
 int vioapic_get_mask(const struct domain *d, unsigned int gsi)
 {
-    unsigned int pin;
+    unsigned int pin = 0; /* See gsi_vioapic */
     const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
 
     if ( !vioapic )
@@ -553,7 +569,7 @@ int vioapic_get_mask(const struct domain *d, unsigned int gsi)
 
 int vioapic_get_vector(const struct domain *d, unsigned int gsi)
 {
-    unsigned int pin;
+    unsigned int pin = 0; /* See gsi_vioapic */
     const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
 
     if ( !vioapic )
@@ -564,7 +580,7 @@ int vioapic_get_vector(const struct domain *d, unsigned int gsi)
 
 int vioapic_get_trigger_mode(const struct domain *d, unsigned int gsi)
 {
-    unsigned int pin;
+    unsigned int pin = 0; /* See gsi_vioapic */
     const struct hvm_vioapic *vioapic = gsi_vioapic(d, gsi, &pin);
 
     if ( !vioapic )
-- 
2.7.4





* [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 2/9] x86/vioapic: " Norbert Manthey
@ 2019-02-08 13:44                   ` Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 4/9] spec: add l1tf-barrier Norbert Manthey
                                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel, Norbert Manthey

There are multiple arrays in the HVM interface that are accessed
with indices that are provided by the guest. To avoid speculative
out-of-bound accesses, we use the array_index_nospec macro.

When blocking speculative out-of-bound accesses, we can classify arrays
into dynamic arrays and static arrays. While the former are allocated
during run time, the size of the latter is known at compile time. For
static arrays, the compiler might be able to block speculative accesses
in the future.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---

Notes:
  v6: Match commit message with code
      Fix nospec bound in hvm_msr_read_intercept

 xen/arch/x86/hvm/hvm.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -37,6 +37,7 @@
 #include <xen/monitor.h>
 #include <xen/warning.h>
 #include <xen/vpci.h>
+#include <xen/nospec.h>
 #include <asm/shadow.h>
 #include <asm/hap.h>
 #include <asm/current.h>
@@ -2092,7 +2093,7 @@ int hvm_mov_from_cr(unsigned int cr, unsigned int gpr)
     case 2:
     case 3:
     case 4:
-        val = curr->arch.hvm.guest_cr[cr];
+        val = array_access_nospec(curr->arch.hvm.guest_cr, cr);
         break;
     case 8:
         val = (vlapic_get_reg(vcpu_vlapic(curr), APIC_TASKPRI) & 0xf0) >> 4;
@@ -3438,13 +3439,15 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix16K_80000;
-        *msr_content = fixed_range_base[index + 1];
+        *msr_content = fixed_range_base[array_index_nospec(index + 1,
+                                   ARRAY_SIZE(v->arch.hvm.mtrr.fixed_ranges))];
         break;
     case MSR_MTRRfix4K_C0000...MSR_MTRRfix4K_F8000:
         if ( !d->arch.cpuid->basic.mtrr )
             goto gp_fault;
         index = msr - MSR_MTRRfix4K_C0000;
-        *msr_content = fixed_range_base[index + 3];
+        *msr_content = fixed_range_base[array_index_nospec(index + 3,
+                                   ARRAY_SIZE(v->arch.hvm.mtrr.fixed_ranges))];
         break;
     case MSR_IA32_MTRR_PHYSBASE(0)...MSR_IA32_MTRR_PHYSMASK(MTRR_VCNT_MAX - 1):
         if ( !d->arch.cpuid->basic.mtrr )
@@ -3453,7 +3456,8 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
         if ( (index / 2) >=
              MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT) )
             goto gp_fault;
-        *msr_content = var_range_base[index];
+        *msr_content = var_range_base[array_index_nospec(index,
+                        2*MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT))];
         break;
 
     case MSR_IA32_XSS:
@@ -4016,7 +4020,7 @@ static int hvmop_set_evtchn_upcall_vector(
     if ( op.vector < 0x10 )
         return -EINVAL;
 
-    if ( op.vcpu >= d->max_vcpus || (v = d->vcpu[op.vcpu]) == NULL )
+    if ( (v = domain_vcpu(d, op.vcpu)) == NULL )
         return -ENOENT;
 
     printk(XENLOG_G_INFO "%pv: upcall vector %02x\n", v, op.vector);
@@ -4104,6 +4108,12 @@ static int hvmop_set_param(
     if ( a.index >= HVM_NR_PARAMS )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value a.index is bounded even during
+     * speculative execution.
+     */
+    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
+
     d = rcu_lock_domain_by_any_id(a.domid);
     if ( d == NULL )
         return -ESRCH;
@@ -4370,6 +4380,12 @@ static int hvmop_get_param(
     if ( a.index >= HVM_NR_PARAMS )
         return -EINVAL;
 
+    /*
+     * Make sure the guest controlled value a.index is bounded even during
+     * speculative execution.
+     */
+    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
+
     d = rcu_lock_domain_by_any_id(a.domid);
     if ( d == NULL )
         return -ESRCH;
-- 
2.7.4





* [PATCH SpectreV1+L1TF v6 4/9] spec: add l1tf-barrier
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
                                     ` (2 preceding siblings ...)
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: " Norbert Manthey
@ 2019-02-08 13:44                   ` Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec Norbert Manthey
                                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel, Norbert Manthey

To better control the runtime behavior on L1TF vulnerable platforms, the
command line option l1tf-barrier is introduced. This option controls
whether on vulnerable x86 platforms the lfence instruction is used to
prevent speculative execution from bypassing the evaluation of
conditionals that are protected with the evaluate_nospec macro.

Xen is already capable of identifying L1TF vulnerable hardware. However,
this information cannot be used for alternative patching, as a CPU feature
is required. To control alternative patching with the command line option,
a new x86 feature "X86_FEATURE_SC_L1TF_VULN" is introduced. This feature
is used to patch the lfence instruction into the arch_barrier_nospec_true
function. The feature is enabled only if L1TF vulnerable hardware is
detected and the command line option does not prevent using this feature.

The status of hyperthreading is not considered when automatically enabling
the lfence instruction, because platforms without hyperthreading can still
be vulnerable to L1TF in case the L1 cache is not flushed properly.
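
The tri-state default/forced-off/forced-on behavior described above can be
modeled stand-alone. The function name below is illustrative, not Xen's
actual code; in Xen a positive result additionally forces the synthetic
X86_FEATURE_SC_L1TF_VULN feature used for alternative patching:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-alone model of the tri-state opt_l1tf_barrier logic:
 *  -1 = not set on the command line, 0 = forced off, 1 = forced on.
 */
static int resolve_l1tf_barrier(int opt, bool cpu_has_bug_l1tf)
{
    if (opt == -1)            /* no explicit choice: follow detection */
        opt = cpu_has_bug_l1tf;
    return opt;               /* > 0 means: patch in the lfence */
}
```

An explicit `l1tf-barrier=<bool>` on the command line thus wins over
hardware detection, matching the v6 note below.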

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---

Notes:
  v6: Move disabling l1tf-barrier into spec-ctrl=no
      Use gap in existing flags
      Force barrier based on commandline, independently of L1TF detection

 docs/misc/xen-command-line.pandoc | 14 ++++++++++----
 xen/arch/x86/spec_ctrl.c          | 17 +++++++++++++++--
 xen/include/asm-x86/cpufeatures.h |  1 +
 xen/include/asm-x86/spec_ctrl.h   |  1 +
 4 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -483,9 +483,9 @@ accounting for hardware capabilities as enumerated via CPUID.
 
 Currently accepted:
 
-The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`,
-`l1d-flush` and `ssbd` are used by default if available and applicable.  They can
-be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
+The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`, `l1d-flush`,
+`l1tf-barrier` and `ssbd` are used by default if available and applicable.  They
+can be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
 won't offer them to guests.
 
 ### cpuid_mask_cpu
@@ -1896,7 +1896,7 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`).
 ### spec-ctrl (x86)
 > `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb}=<bool>,
 >              bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
->              l1d-flush}=<bool> ]`
+>              l1d-flush,l1tf-barrier}=<bool> ]`
 
 Controls for speculative execution sidechannel mitigations.  By default, Xen
 will pick the most appropriate mitigations based on compiled in support,
@@ -1962,6 +1962,12 @@ Irrespective of Xen's setting, the feature is virtualised for HVM guests to
 use.  By default, Xen will enable this mitigation on hardware believed to be
 vulnerable to L1TF.
 
+On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force
+or prevent Xen from protecting evaluations inside the hypervisor with a barrier
+instruction to not load potentially secret information into L1 cache.  By
+default, Xen will enable this mitigation on hardware believed to be vulnerable
+to L1TF.
+
 ### sync_console
 > `= <boolean>`
 
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -21,6 +21,7 @@
 #include <xen/lib.h>
 #include <xen/warning.h>
 
+#include <asm/cpuid.h>
 #include <asm/microcode.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -50,6 +51,7 @@ bool __read_mostly opt_ibpb = true;
 bool __read_mostly opt_ssbd = false;
 int8_t __read_mostly opt_eager_fpu = -1;
 int8_t __read_mostly opt_l1d_flush = -1;
+int8_t __read_mostly opt_l1tf_barrier = -1;
 
 bool __initdata bsp_delay_spec_ctrl;
 uint8_t __read_mostly default_xen_spec_ctrl;
@@ -91,6 +93,8 @@ static int __init parse_spec_ctrl(const char *s)
             if ( opt_pv_l1tf_domu < 0 )
                 opt_pv_l1tf_domu = 0;
 
+            opt_l1tf_barrier = 0;
+
         disable_common:
             opt_rsb_pv = false;
             opt_rsb_hvm = false;
@@ -157,6 +161,8 @@ static int __init parse_spec_ctrl(const char *s)
             opt_eager_fpu = val;
         else if ( (val = parse_boolean("l1d-flush", s, ss)) >= 0 )
             opt_l1d_flush = val;
+        else if ( (val = parse_boolean("l1tf-barrier", s, ss)) >= 0 )
+            opt_l1tf_barrier = val;
         else
             rc = -EINVAL;
 
@@ -248,7 +254,7 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
                "\n");
 
     /* Settings for Xen's protection, irrespective of guests. */
-    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s\n",
+    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s%s\n",
            thunk == THUNK_NONE      ? "N/A" :
            thunk == THUNK_RETPOLINE ? "RETPOLINE" :
            thunk == THUNK_LFENCE    ? "LFENCE" :
@@ -258,7 +264,8 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
            !boot_cpu_has(X86_FEATURE_SSBD)           ? "" :
            (default_xen_spec_ctrl & SPEC_CTRL_SSBD)  ? " SSBD+" : " SSBD-",
            opt_ibpb                                  ? " IBPB"  : "",
-           opt_l1d_flush                             ? " L1D_FLUSH" : "");
+           opt_l1d_flush                             ? " L1D_FLUSH" : "",
+           opt_l1tf_barrier                          ? " L1TF_BARRIER" : "");
 
     /* L1TF diagnostics, printed if vulnerable or PV shadowing is in use. */
     if ( cpu_has_bug_l1tf || opt_pv_l1tf_hwdom || opt_pv_l1tf_domu )
@@ -842,6 +849,12 @@ void __init init_speculation_mitigations(void)
     else if ( opt_l1d_flush == -1 )
         opt_l1d_flush = cpu_has_bug_l1tf && !(caps & ARCH_CAPS_SKIP_L1DFL);
 
+    /* By default, enable L1TF_VULN on L1TF-vulnerable hardware */
+    if ( opt_l1tf_barrier == -1 )
+        opt_l1tf_barrier = cpu_has_bug_l1tf;
+    if ( opt_l1tf_barrier > 0 )
+        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);
+
     /*
      * We do not disable HT by default on affected hardware.
      *
diff --git a/xen/include/asm-x86/cpufeatures.h b/xen/include/asm-x86/cpufeatures.h
--- a/xen/include/asm-x86/cpufeatures.h
+++ b/xen/include/asm-x86/cpufeatures.h
@@ -25,6 +25,7 @@ XEN_CPUFEATURE(XEN_SMAP,        (FSCAPINTS+0)*32+11) /* SMAP gets used by Xen it
 XEN_CPUFEATURE(LFENCE_DISPATCH, (FSCAPINTS+0)*32+12) /* lfence set as Dispatch Serialising */
 XEN_CPUFEATURE(IND_THUNK_LFENCE,(FSCAPINTS+0)*32+13) /* Use IND_THUNK_LFENCE */
 XEN_CPUFEATURE(IND_THUNK_JMP,   (FSCAPINTS+0)*32+14) /* Use IND_THUNK_JMP */
+XEN_CPUFEATURE(SC_L1TF_VULN,    (FSCAPINTS+0)*32+15) /* L1TF protection required */
 XEN_CPUFEATURE(SC_MSR_PV,       (FSCAPINTS+0)*32+16) /* MSR_SPEC_CTRL used by Xen for PV */
 XEN_CPUFEATURE(SC_MSR_HVM,      (FSCAPINTS+0)*32+17) /* MSR_SPEC_CTRL used by Xen for HVM */
 XEN_CPUFEATURE(SC_RSB_PV,       (FSCAPINTS+0)*32+18) /* RSB overwrite needed for PV */
diff --git a/xen/include/asm-x86/spec_ctrl.h b/xen/include/asm-x86/spec_ctrl.h
--- a/xen/include/asm-x86/spec_ctrl.h
+++ b/xen/include/asm-x86/spec_ctrl.h
@@ -37,6 +37,7 @@ extern bool opt_ibpb;
 extern bool opt_ssbd;
 extern int8_t opt_eager_fpu;
 extern int8_t opt_l1d_flush;
+extern int8_t opt_l1tf_barrier;
 
 extern bool bsp_delay_spec_ctrl;
 extern uint8_t default_xen_spec_ctrl;
-- 
2.7.4





* [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
                                     ` (3 preceding siblings ...)
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 4/9] spec: add l1tf-barrier Norbert Manthey
@ 2019-02-08 13:44                   ` Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 6/9] is_control_domain: block speculation Norbert Manthey
                                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel, Norbert Manthey

Since the discovery of the L1TF vulnerability in Intel CPUs, loading
hypervisor data into the L1 cache is problematic, because when
hyperthreading is used as well, a guest running on the sibling core can
leak this potentially secret data.

To prevent these speculative accesses, we block speculation after
accessing the domain property field by adding lfence instructions. This
way, the CPU continues executing and loading data only once the condition
is actually evaluated.

As the macros are typically used in if statements, the lfence has to be
introduced in a compatible way. Therefore, a function that returns true
after executing an lfence instruction is introduced. To protect both
branches of a conditional, an lfence instruction has to be added to each
of the two branches. To allow blocking speculation after several
evaluations, the generic barrier macro block_speculation is also
introduced.

As the L1TF vulnerability is only present on the x86 architecture, there is
no need to add protection for other architectures. Hence, the introduced
macros are defined but empty.

On the x86 architecture, the lfence instruction is not present by default
either. Only when an L1TF vulnerable platform is detected is the lfence
instruction patched in via alternative patching. Similarly, PV guests are
protected with respect to L1TF by default, so the protection is
furthermore disabled in case HVM support is excluded via the build
configuration.

Introducing the lfence instructions catches a lot of potential leaks with
a simple unintrusive code change. During performance testing, we did not
notice performance effects.
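
The shape of the introduced macros can be sketched in portable C. This is
a model, not the patched code: a sequentially consistent fence stands in
for the lfence, which Xen only patches in via alternatives on vulnerable
hardware.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the evaluate_nospec pattern.  The fence below is a portable
 * stand-in for lfence; it serializes the condition's evaluation, and
 * both branches of the ternary pass through it.
 */
static inline bool barrier_nospec_true(void)
{
    __atomic_thread_fence(__ATOMIC_SEQ_CST); /* stand-in for lfence */
    return true;
}

#define evaluate_nospec(cond) \
    ((cond) ? barrier_nospec_true() : !barrier_nospec_true())

/* Generic barrier usable after several evaluations. */
#define block_speculation() ((void)barrier_nospec_true())
```

The macro's architectural value is unchanged, true stays true and false
stays false; only speculation past the condition is cut off on both paths.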

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---

Notes:
  v6: Introduce asm nospec.h files
      Check CONFIG_HVM consistently
      Extend commit message to explain CONFIG_HVM and new files
      Fix typos in commit message

 xen/include/asm-arm/nospec.h | 20 ++++++++++++++++++++
 xen/include/asm-x86/nospec.h | 39 +++++++++++++++++++++++++++++++++++++++
 xen/include/xen/nospec.h     |  1 +
 3 files changed, 60 insertions(+)
 create mode 100644 xen/include/asm-arm/nospec.h
 create mode 100644 xen/include/asm-x86/nospec.h

diff --git a/xen/include/asm-arm/nospec.h b/xen/include/asm-arm/nospec.h
new file mode 100644
--- /dev/null
+++ b/xen/include/asm-arm/nospec.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
+
+#ifndef _ASM_ARM_NOSPEC_H
+#define _ASM_ARM_NOSPEC_H
+
+#define evaluate_nospec(condition) (condition)
+
+#define block_speculation()
+
+#endif /* _ASM_ARM_NOSPEC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-x86/nospec.h b/xen/include/asm-x86/nospec.h
new file mode 100644
--- /dev/null
+++ b/xen/include/asm-x86/nospec.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
+
+#ifndef _ASM_X86_NOSPEC_H
+#define _ASM_X86_NOSPEC_H
+
+#include <asm/alternative.h>
+#include <asm/system.h>
+
+/* Allow to insert a read memory barrier into conditionals */
+static always_inline bool arch_barrier_nospec_true(void)
+{
+#if defined(CONFIG_HVM)
+    alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN);
+#endif
+    return true;
+}
+
+/* Allow to protect evaluation of conditionals with respect to speculation */
+#if defined(CONFIG_HVM)
+#define evaluate_nospec(condition)                                         \
+    ((condition) ? arch_barrier_nospec_true() : !arch_barrier_nospec_true())
+#else
+#define evaluate_nospec(condition) (condition)
+#endif
+
+/* Allow to block speculative execution in generic code */
+#define block_speculation() (void)arch_barrier_nospec_true()
+
+#endif /* _ASM_X86_NOSPEC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -8,6 +8,7 @@
 #define XEN_NOSPEC_H
 
 #include <asm/system.h>
+#include <asm/nospec.h>
 
 /**
  * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
-- 
2.7.4





* [PATCH SpectreV1+L1TF v6 6/9] is_control_domain: block speculation
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
                                     ` (4 preceding siblings ...)
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec Norbert Manthey
@ 2019-02-08 13:44                   ` Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 7/9] is_hvm/pv_domain: " Norbert Manthey
                                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel, Norbert Manthey

Checks of domain properties, such as is_hardware_domain or is_hvm_domain,
might be bypassed by speculatively executing the instructions that follow
them. The reason these checks can be bypassed is that the macros access
the domain structure via a pointer and then check a certain field. Since
this memory access is slow, the CPU speculates on the returned value and
continues execution.

In case an is_control_domain check is bypassed, for example during a
hypercall, data that should only be accessible by the control domain could
be loaded into the cache.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---

Notes:
  v6: Drop nospec.h include

 xen/include/xen/sched.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -913,10 +913,10 @@ void watchdog_domain_destroy(struct domain *d);
  *    (that is, this would not be suitable for a driver domain)
  *  - There is never a reason to deny the hardware domain access to this
  */
-#define is_hardware_domain(_d) ((_d) == hardware_domain)
+#define is_hardware_domain(_d) evaluate_nospec((_d) == hardware_domain)
 
 /* This check is for functionality specific to a control domain */
-#define is_control_domain(_d) ((_d)->is_privileged)
+#define is_control_domain(_d) evaluate_nospec((_d)->is_privileged)
 
 #define VM_ASSIST(d, t) (test_bit(VMASST_TYPE_ ## t, &(d)->vm_assist))
 
-- 
2.7.4





* [PATCH SpectreV1+L1TF v6 7/9] is_hvm/pv_domain: block speculation
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
                                     ` (5 preceding siblings ...)
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 6/9] is_control_domain: block speculation Norbert Manthey
@ 2019-02-08 13:44                   ` Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
                                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel, Norbert Manthey

When checking whether a domain is an HVM or a PV domain, we have to make
sure that speculation cannot bypass that check and subsequently access
data that should not end up in the cache for the current domain type.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---
 xen/include/xen/sched.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -922,7 +922,8 @@ void watchdog_domain_destroy(struct domain *d);
 
 static inline bool is_pv_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_PV) ? d->guest_type == guest_type_pv : false;
+    return IS_ENABLED(CONFIG_PV)
+           ? evaluate_nospec(d->guest_type == guest_type_pv) : false;
 }
 
 static inline bool is_pv_vcpu(const struct vcpu *v)
@@ -953,7 +954,8 @@ static inline bool is_pv_64bit_vcpu(const struct vcpu *v)
 #endif
 static inline bool is_hvm_domain(const struct domain *d)
 {
-    return IS_ENABLED(CONFIG_HVM) ? d->guest_type == guest_type_hvm : false;
+    return IS_ENABLED(CONFIG_HVM)
+           ? evaluate_nospec(d->guest_type == guest_type_hvm) : false;
 }
 
 static inline bool is_hvm_vcpu(const struct vcpu *v)
-- 
2.7.4





* [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
                                     ` (6 preceding siblings ...)
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 7/9] is_hvm/pv_domain: " Norbert Manthey
@ 2019-02-08 13:44                   ` Norbert Manthey
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 9/9] common/memory: " Norbert Manthey
  2019-02-08 14:32                   ` SpectreV1+L1TF Patch Series v6 Julien Grall
  9 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel, Norbert Manthey

Guests can issue grant table operations and provide guest controlled
data to them. This data is also used for memory loads. To avoid
speculative out-of-bound accesses, we use the array_index_nospec macro
where applicable. However, some memory accesses cannot be protected by a
single array-index clamp, or occur as multiple accesses in a row. To
protect these, a speculation barrier is placed between the actual range
check and the access via the block_speculation macro.

As different versions of grant tables use structures of different size,
and the status is encoded in an array for version 2, speculative
execution might touch zero-initialized structures of version 2 while
the table is actually using version 1. As PV guests can have control
over their NULL page, these accesses are prevented by protecting the
grant table version evaluation.

This commit is part of the SpectreV1+L1TF mitigation patch series.
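
For reference, the clamping primitive used throughout this patch can be
sketched in generic C, mirroring the non-assembly fallback: a branchless
mask that is all-ones when the index is in bounds and zero otherwise.
This sketch relies on an arithmetic right shift of a negative long, which
GCC and Clang provide.

```c
#include <assert.h>
#include <limits.h>

/*
 * Generic sketch of the mask: ~0UL when index < size, 0 otherwise,
 * computed without a branch the CPU could misspeculate past.
 */
static inline unsigned long array_index_mask_nospec(unsigned long index,
                                                    unsigned long size)
{
    return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                           (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp a possibly guest-controlled index into [0, size). */
#define array_index_nospec(index, size) \
    ((index) & array_index_mask_nospec(index, size))
```

Note that an out-of-bounds index is forced to 0 rather than rejected; the
caller's architectural bound check still decides whether the access
happens at all.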

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---

Notes:
  v6: Explain version 1 vs version 2 case in commit message
      Protect grant table version checks
      Use block_speculation in map_grant_ref instead of updating op->ref
      Move evaluate_nospec closer to the okay variable in gnttab_transfer

 xen/common/grant_table.c | 48 ++++++++++++++++++++++++++++++++++++------------
 1 file changed, 36 insertions(+), 12 deletions(-)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -37,6 +37,7 @@
 #include <xen/paging.h>
 #include <xen/keyhandler.h>
 #include <xen/vmap.h>
+#include <xen/nospec.h>
 #include <xsm/xsm.h>
 #include <asm/flushtlb.h>
 
@@ -203,8 +204,9 @@ static inline unsigned int nr_status_frames(const struct grant_table *gt)
 }
 
 #define MAPTRACK_PER_PAGE (PAGE_SIZE / sizeof(struct grant_mapping))
-#define maptrack_entry(t, e) \
-    ((t)->maptrack[(e)/MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])
+#define maptrack_entry(t, e)                                                   \
+    ((t)->maptrack[array_index_nospec(e, (t)->maptrack_limit)                  \
+                                     /MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])
 
 static inline unsigned int
 nr_maptrack_frames(struct grant_table *t)
@@ -963,9 +965,13 @@ map_grant_ref(
         PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
                  op->ref, rgt->domain->domain_id);
 
+    /* Make sure the above check is not bypassed speculatively */
+    block_speculation();
+
     act = active_entry_acquire(rgt, op->ref);
     shah = shared_entry_header(rgt, op->ref);
-    status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
+    status = evaluate_nospec(rgt->gt_version == 1) ? &shah->flags
+                                                 : &status_entry(rgt, op->ref);
 
     /* If already pinned, check the active domid and avoid refcnt overflow. */
     if ( act->pin &&
@@ -987,7 +993,7 @@ map_grant_ref(
 
         if ( !act->pin )
         {
-            unsigned long gfn = rgt->gt_version == 1 ?
+            unsigned long gfn = evaluate_nospec(rgt->gt_version == 1) ?
                                 shared_entry_v1(rgt, op->ref).frame :
                                 shared_entry_v2(rgt, op->ref).full_page.frame;
 
@@ -1321,7 +1327,8 @@ unmap_common(
         goto unlock_out;
     }
 
-    act = active_entry_acquire(rgt, op->ref);
+    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
+                                                       nr_grant_entries(rgt)));
 
     /*
      * Note that we (ab)use the active entry lock here to protect against
@@ -1418,7 +1425,7 @@ unmap_common_complete(struct gnttab_unmap_common *op)
     struct page_info *pg;
     uint16_t *status;
 
-    if ( !op->done )
+    if ( evaluate_nospec(!op->done) )
     {
         /* unmap_common() didn't do anything - nothing to complete. */
         return;
@@ -2026,6 +2033,9 @@ gnttab_prepare_for_transfer(
         goto fail;
     }
 
+    /* Make sure the above check is not bypassed speculatively */
+    ref = array_index_nospec(ref, nr_grant_entries(rgt));
+
     sha = shared_entry_header(rgt, ref);
 
     scombo.word = *(u32 *)&sha->flags;
@@ -2223,7 +2233,11 @@ gnttab_transfer(
         okay = gnttab_prepare_for_transfer(e, d, gop.ref);
         spin_lock(&e->page_alloc_lock);
 
-        if ( unlikely(!okay) || unlikely(e->is_dying) )
+        /*
+         * Make sure the reference bound check in gnttab_prepare_for_transfer
+         * is respected and speculative execution is blocked accordingly
+         */
+        if ( unlikely(!evaluate_nospec(okay)) || unlikely(e->is_dying) )
         {
             bool_t drop_dom_ref = !domain_adjust_tot_pages(e, -1);
 
@@ -2253,7 +2267,7 @@ gnttab_transfer(
         grant_read_lock(e->grant_table);
         act = active_entry_acquire(e->grant_table, gop.ref);
 
-        if ( e->grant_table->gt_version == 1 )
+        if ( evaluate_nospec(e->grant_table->gt_version == 1) )
         {
             grant_entry_v1_t *sha = &shared_entry_v1(e->grant_table, gop.ref);
 
@@ -2408,9 +2422,12 @@ acquire_grant_for_copy(
         PIN_FAIL(gt_unlock_out, GNTST_bad_gntref,
                  "Bad grant reference %#x\n", gref);
 
+    /* Make sure the above check is not bypassed speculatively */
+    gref = array_index_nospec(gref, nr_grant_entries(rgt));
+
     act = active_entry_acquire(rgt, gref);
     shah = shared_entry_header(rgt, gref);
-    if ( rgt->gt_version == 1 )
+    if ( evaluate_nospec(rgt->gt_version == 1) )
     {
         sha2 = NULL;
         status = &shah->flags;
@@ -2826,6 +2843,9 @@ static int gnttab_copy_buf(const struct gnttab_copy *op,
                  op->dest.offset, dest->ptr.offset,
                  op->len, dest->len);
 
+    /* Make sure the above checks are not bypassed speculatively */
+    block_speculation();
+
     memcpy(dest->virt + op->dest.offset, src->virt + op->source.offset,
            op->len);
     gnttab_mark_dirty(dest->domain, dest->mfn);
@@ -3211,6 +3231,10 @@ swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
     if ( unlikely(ref_b >= nr_grant_entries(d->grant_table)))
         PIN_FAIL(out, GNTST_bad_gntref, "Bad ref-b %#x\n", ref_b);
 
+    /* Make sure the above checks are not bypassed speculatively */
+    ref_a = array_index_nospec(ref_a, nr_grant_entries(d->grant_table));
+    ref_b = array_index_nospec(ref_b, nr_grant_entries(d->grant_table));
+
     /* Swapping the same ref is a no-op. */
     if ( ref_a == ref_b )
         goto out;
@@ -3223,7 +3247,7 @@ swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
     if ( act_b->pin )
         PIN_FAIL(out, GNTST_eagain, "ref b %#x busy\n", ref_b);
 
-    if ( gt->gt_version == 1 )
+    if ( evaluate_nospec(gt->gt_version == 1) )
     {
         grant_entry_v1_t shared;
 
@@ -3771,7 +3795,7 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
         rc = -EINVAL;
     else if ( ref >= nr_grant_entries(gt) )
         rc = -ENOENT;
-    else if ( gt->gt_version == 1 )
+    else if ( evaluate_nospec(gt->gt_version == 1) )
     {
         const grant_entry_v1_t *sha1 = &shared_entry_v1(gt, ref);
 
@@ -3793,7 +3817,7 @@ int mem_sharing_gref_to_gfn(struct grant_table *gt, grant_ref_t ref,
         rc = -ENXIO;
     else if ( !rc && status )
     {
-        if ( gt->gt_version == 1 )
+        if ( evaluate_nospec(gt->gt_version == 1) )
             *status = flags;
         else
             *status = status_entry(gt, ref);
-- 
2.7.4





* [PATCH SpectreV1+L1TF v6 9/9] common/memory: block speculative out-of-bound accesses
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
                                     ` (7 preceding siblings ...)
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
@ 2019-02-08 13:44                   ` Norbert Manthey
  2019-02-08 14:32                   ` SpectreV1+L1TF Patch Series v6 Julien Grall
  9 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-08 13:44 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	Julien Grall, David Woodhouse, Jan Beulich, Martin Mazein,
	Julian Stecklina, Bjoern Doebel, Norbert Manthey

The get_page_from_gfn method returns a pointer to a page that belongs
to a gfn. Before returning the pointer, the gfn is checked for being
valid. Under speculation, these checks can be bypassed, so that
the function get_page is still executed partially. Consequently, the
function page_get_owner_and_reference might be executed partially as
well. In this function, the computed pointer is accessed, resulting in
a speculative out-of-bound address load. As the gfn can be controlled by
a guest, this access is problematic.

To mitigate the root cause, an lfence instruction is added via the
evaluate_nospec macro. To make the protection generic, we do not
introduce the lfence instruction for this single check, but add it to
the mfn_valid function. This way, other potentially problematic accesses
are protected as well.

This commit is part of the SpectreV1+L1TF mitigation patch series.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>

---

Notes:
  v6: Add array_index_nospec to test_bit call

 xen/common/pdx.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/common/pdx.c b/xen/common/pdx.c
--- a/xen/common/pdx.c
+++ b/xen/common/pdx.c
@@ -18,6 +18,7 @@
 #include <xen/init.h>
 #include <xen/mm.h>
 #include <xen/bitops.h>
+#include <xen/nospec.h>
 
 /* Parameters for PFN/MADDR compression. */
 unsigned long __read_mostly max_pdx;
@@ -33,10 +34,11 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
 
 bool __mfn_valid(unsigned long mfn)
 {
-    return likely(mfn < max_page) &&
-           likely(!(mfn & pfn_hole_mask)) &&
-           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
-                           pdx_group_valid));
+    return evaluate_nospec(
+        likely(mfn < max_page) &&
+        likely(!(mfn & pfn_hole_mask)) &&
+        likely(test_bit(pfn_to_pdx(array_index_nospec(mfn, max_page))
+                                   / PDX_GROUP_COUNT, pdx_group_valid)));
 }
 
 /* Sets all bits from the most-significant 1-bit down to the LSB */
-- 
2.7.4





* Re: SpectreV1+L1TF Patch Series v6
  2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
                                     ` (8 preceding siblings ...)
  2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 9/9] common/memory: " Norbert Manthey
@ 2019-02-08 14:32                   ` Julien Grall
  9 siblings, 0 replies; 150+ messages in thread
From: Julien Grall @ 2019-02-08 14:32 UTC (permalink / raw)
  To: Norbert Manthey, xen-devel
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, Pawel Wieczorkiewicz,
	David Woodhouse, Jan Beulich, Martin Mazein, Julian Stecklina,
	Bjoern Doebel

Hi,

Please don't send the next version in reply-to a random e-mail from the previous 
version. Instead you should create a new thread to make things easier for review.

Cheers,

-- 
Julien Grall


* Re: [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses
       [not found]                   ` <01CCEAAF02000039B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-12 13:08                     ` Jan Beulich
  2019-02-14 13:10                       ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-12 13:08 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
> @@ -813,6 +817,13 @@ int set_global_virq_handler(struct domain *d, uint32_t virq)
>  
>      if (virq >= NR_VIRQS)
>          return -EINVAL;
> +
> +   /*
> +    * Make sure the guest controlled value virq is bounded even during
> +    * speculative execution.
> +    */
> +    virq = array_index_nospec(virq, ARRAY_SIZE(global_virq_handlers));
> +
>      if (!virq_is_global(virq))
>          return -EINVAL;

Didn't we agree earlier on that this addition is pointless, as the only
caller is the XEN_DOMCTL_set_virq_handler handler, and most
domctl-s (including this one) are excluded from security considerations
due to XSA-77?

> @@ -955,22 +967,22 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>      {
>      case ECS_VIRQ:
>          if ( virq_is_global(chn->u.virq) )
> -            chn->notify_vcpu_id = vcpu_id;
> +            chn->notify_vcpu_id = v->vcpu_id;
>          else
>              rc = -EINVAL;
>          break;
>      case ECS_UNBOUND:
>      case ECS_INTERDOMAIN:
> -        chn->notify_vcpu_id = vcpu_id;
> +        chn->notify_vcpu_id = v->vcpu_id;
>          break;
>      case ECS_PIRQ:
> -        if ( chn->notify_vcpu_id == vcpu_id )
> +        if ( chn->notify_vcpu_id == v->vcpu_id )
>              break;
>          unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
> -        chn->notify_vcpu_id = vcpu_id;
> +        chn->notify_vcpu_id = v->vcpu_id;

Right now we understand why all of these changes are done, but
without a comment this is liable to be converted back as an
optimization down the road.

Everything else here looks fine to me now.

Jan




* Re: [PATCH SpectreV1+L1TF v6 2/9] x86/vioapic: block speculative out-of-bound accesses
       [not found]                   ` <01CC2AAF02000039B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-12 13:16                     ` Jan Beulich
  2019-02-14 13:16                       ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-12 13:16 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
> When interacting with io apic, a guest can specify values that are used
> as index to structures, and whose values are not compared against
> upper bounds to prevent speculative out-of-bound accesses. This change
> prevents these speculative accesses.
> 
> Furthermore, variables are initialized and the compiler is asked to not
> optimize these initializations, as the uninitialized, potentially guest
> controlled, variables might be used in a speculative out-of-bound access.

Uninitialized variables can't be guest controlled, not even potentially.
What we want to avoid here is speculation with uninitialized values
(or really stale data still on the stack from use by other code),
regardless of direct guest control.

> Out of the four initialized variables, two are potentially problematic,
> namely ones in the functions vioapic_irq_positive_edge and
> vioapic_get_trigger_mode.
> 
> As the two problematic variables are both used in the common function
> gsi_vioapic, the mitigation is implemented there. As the access pattern
> of the currently non-guest-controlled functions might change in the
> future as well, the other variables are initialized as well.
> 
> This commit is part of the SpectreV1+L1TF mitigation patch series.

Oh, I didn't pay attention in patch 1: You had meant to change this
wording to something including "speculative hardening" (throughout
the series).

> @@ -212,7 +220,15 @@ static void vioapic_write_redirent(
>      struct hvm_irq *hvm_irq = hvm_domain_irq(d);
>      union vioapic_redir_entry *pent, ent;
>      int unmasked = 0;
> -    unsigned int gsi = vioapic->base_gsi + idx;
> +    unsigned int gsi;
> +
> +    /* Callers of this function should make sure idx is bounded appropriately */
> +    ASSERT(idx < vioapic->nr_pins);
> +
> +    /* Make sure no out-of-bound value for idx can be used */

out-of-bounds

I'm fine now with all the code changes here.

Jan




* Re: [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
       [not found]                   ` <01CE6AAF02000039B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-12 13:25                     ` Jan Beulich
  2019-02-12 14:05                       ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-12 13:25 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
> @@ -3453,7 +3456,8 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
>          if ( (index / 2) >=
>               MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT) )
>              goto gp_fault;
> -        *msr_content = var_range_base[index];
> +        *msr_content = var_range_base[array_index_nospec(index,
> +                        2*MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT))];

Missing blanks around *. This alone would be easy to adjust while
committing, but there's still the only partially discussed question
regarding ...

> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>      if ( a.index >= HVM_NR_PARAMS )
>          return -EINVAL;
>  
> +    /*
> +     * Make sure the guest controlled value a.index is bounded even during
> +     * speculative execution.
> +     */
> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
> +
>      d = rcu_lock_domain_by_any_id(a.domid);
>      if ( d == NULL )
>          return -ESRCH;
> @@ -4370,6 +4380,12 @@ static int hvmop_get_param(
>      if ( a.index >= HVM_NR_PARAMS )
>          return -EINVAL;
>  
> +    /*
> +     * Make sure the guest controlled value a.index is bounded even during
> +     * speculative execution.
> +     */
> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);

... the usefulness of these two. To make forward progress it may
be worthwhile to split off these two changes into a separate patch.
If you're fine with this, I could strip these two before committing,
in which case the remaining change is
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan




* Re: [PATCH SpectreV1+L1TF v6 4/9] spec: add l1tf-barrier
       [not found]                   ` <01CFAAAF02000039B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-12 13:44                     ` Jan Beulich
  2019-02-15  9:13                       ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-12 13:44 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
> To control the runtime behavior on L1TF vulnerable platforms better, the
> command line option l1tf-barrier is introduced. This option controls
> whether on vulnerable x86 platforms the lfence instruction is used to
> prevent speculative execution from bypassing the evaluation of
> conditionals that are protected with the evaluate_nospec macro.
> 
> By now, Xen is capable of identifying L1TF vulnerable hardware. However,
> this information cannot be used for alternative patching, as a CPU feature
> is required. To control alternative patching with the command line option,
> a new x86 feature "X86_FEATURE_SC_L1TF_VULN" is introduced. This feature
> is used to patch the lfence instruction into the arch_barrier_nospec_true
> function. The feature is enabled only if L1TF vulnerable hardware is
> detected and the command line option does not prevent using this feature.
> 
> The status of hyperthreading is not considered when automatically enabling
> adding the lfence instruction, because platforms without hyperthreading
> can still be vulnerable to L1TF in case the L1 cache is not flushed
> properly.
> 
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
> 
> ---
> 
> Notes:
>   v6: Move disabling l1tf-barrier into spec-ctrl=no
>       Use gap in existing flags
>       Force barrier based on commandline, independently of L1TF detection
> 
>  docs/misc/xen-command-line.pandoc | 14 ++++++++++----
>  xen/arch/x86/spec_ctrl.c          | 17 +++++++++++++++--
>  xen/include/asm-x86/cpufeatures.h |  1 +
>  xen/include/asm-x86/spec_ctrl.h   |  1 +
>  4 files changed, 27 insertions(+), 6 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -483,9 +483,9 @@ accounting for hardware capabilities as enumerated via CPUID.
>  
>  Currently accepted:
>  
> -The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`,
> -`l1d-flush` and `ssbd` are used by default if available and applicable.  They can
> -be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
> +The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`, `l1d-flush`,
> +`l1tf-barrier` and `ssbd` are used by default if available and applicable.  They
> +can be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
>  won't offer them to guests.
>  
>  ### cpuid_mask_cpu
> @@ -1896,7 +1896,7 @@ By default SSBD will be mitigated at runtime (i.e `ssbd=runtime`).
>  ### spec-ctrl (x86)
>  > `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb}=<bool>,
>  >              bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
> ->              l1d-flush}=<bool> ]`
> +>              l1d-flush,l1tf-barrier}=<bool> ]`
>  
>  Controls for speculative execution sidechannel mitigations.  By default, Xen
>  will pick the most appropriate mitigations based on compiled in support,
> @@ -1962,6 +1962,12 @@ Irrespective of Xen's setting, the feature is virtualised for HVM guests to
>  use.  By default, Xen will enable this mitigation on hardware believed to be
>  vulnerable to L1TF.
>  
> +On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to force
> +or prevent Xen from protecting evaluations inside the hypervisor with a barrier
> +instruction to not load potentially secret information into L1 cache.  By
> +default, Xen will enable this mitigation on hardware believed to be vulnerable
> +to L1TF.
> +
>  ### sync_console
>  > `= <boolean>`
>  
> diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
> --- a/xen/arch/x86/spec_ctrl.c
> +++ b/xen/arch/x86/spec_ctrl.c
> @@ -21,6 +21,7 @@
>  #include <xen/lib.h>
>  #include <xen/warning.h>
>  
> +#include <asm/cpuid.h>
>  #include <asm/microcode.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
> @@ -50,6 +51,7 @@ bool __read_mostly opt_ibpb = true;
>  bool __read_mostly opt_ssbd = false;
>  int8_t __read_mostly opt_eager_fpu = -1;
>  int8_t __read_mostly opt_l1d_flush = -1;
> +int8_t __read_mostly opt_l1tf_barrier = -1;
>  
>  bool __initdata bsp_delay_spec_ctrl;
>  uint8_t __read_mostly default_xen_spec_ctrl;
> @@ -91,6 +93,8 @@ static int __init parse_spec_ctrl(const char *s)
>              if ( opt_pv_l1tf_domu < 0 )
>                  opt_pv_l1tf_domu = 0;
>  
> +            opt_l1tf_barrier = 0;
> +
>          disable_common:
>              opt_rsb_pv = false;
>              opt_rsb_hvm = false;
> @@ -157,6 +161,8 @@ static int __init parse_spec_ctrl(const char *s)
>              opt_eager_fpu = val;
>          else if ( (val = parse_boolean("l1d-flush", s, ss)) >= 0 )
>              opt_l1d_flush = val;
> +        else if ( (val = parse_boolean("l1tf-barrier", s, ss)) >= 0 )
> +            opt_l1tf_barrier = val;
>          else
>              rc = -EINVAL;
>  
> @@ -248,7 +254,7 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
>                 "\n");
>  
>      /* Settings for Xen's protection, irrespective of guests. */
> -    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s\n",
> +    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s%s\n",
>             thunk == THUNK_NONE      ? "N/A" :
>             thunk == THUNK_RETPOLINE ? "RETPOLINE" :
>             thunk == THUNK_LFENCE    ? "LFENCE" :
> @@ -258,7 +264,8 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
>             !boot_cpu_has(X86_FEATURE_SSBD)           ? "" :
>             (default_xen_spec_ctrl & SPEC_CTRL_SSBD)  ? " SSBD+" : " SSBD-",
>             opt_ibpb                                  ? " IBPB"  : "",
> -           opt_l1d_flush                             ? " L1D_FLUSH" : "");
> +           opt_l1d_flush                             ? " L1D_FLUSH" : "",
> +           opt_l1tf_barrier                          ? " L1TF_BARRIER" : "");
>  
>      /* L1TF diagnostics, printed if vulnerable or PV shadowing is in use. */
>      if ( cpu_has_bug_l1tf || opt_pv_l1tf_hwdom || opt_pv_l1tf_domu )
> @@ -842,6 +849,12 @@ void __init init_speculation_mitigations(void)
>      else if ( opt_l1d_flush == -1 )
>          opt_l1d_flush = cpu_has_bug_l1tf && !(caps & ARCH_CAPS_SKIP_L1DFL);
>  
> +    /* By default, enable L1TF_VULN on L1TF-vulnerable hardware */
> +    if ( opt_l1tf_barrier == -1 )
> +        opt_l1tf_barrier = cpu_has_bug_l1tf;
> +    if ( opt_l1tf_barrier > 0)
> +        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);

Did we end with a misunderstanding in the v5 discussion? I didn't
answer your question regarding whether to also consider L1D
flush availability (on top of my request to consider SMT) with a
straight "yes", but I think it was still clear that my more extensive
response boiled down to a "yes". Oh, I see now - the same topic
was discussed in two places, and for the second I then said that
with the "for now" aspect properly stated (which you now do)
I'd be fine.

So let me put it this way: Is taking into consideration at least
opt_smt and opt_l1d_flush (but putting on the side the "too
early" aspect of the determination here) very difficult to do,
or would that leave un-addressed concerns of yours? If not,
may I ask that you go at least that little step further? As said
before - we'd like to avoid penalizing configurations as well as
setups which aren't affected. In particular it would seem
pretty bad of us to further penalize people who have set
"smt=0" and who use up-to-date microcode.

Also in the second if() there's yet again a missing blank.

Jan



* Re: [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec
       [not found]                   ` <01CFEAAF02000039B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-12 13:50                     ` Jan Beulich
  2019-02-14 13:37                       ` Norbert Manthey
  2019-02-12 14:12                     ` Jan Beulich
  1 sibling, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-12 13:50 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
> --- /dev/null
> +++ b/xen/include/asm-x86/nospec.h
> @@ -0,0 +1,39 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
> +
> +#ifndef _ASM_X86_NOSPEC_H
> +#define _ASM_X86_NOSPEC_H
> +
> +#include <asm/alternative.h>
> +#include <asm/system.h>
> +
> +/* Allow to insert a read memory barrier into conditionals */
> +static always_inline bool arch_barrier_nospec_true(void)

Now that this is x86-specific (and not used by common code),
I don't think the arch_ prefix is warranted anymore.

> +{
> +#if defined(CONFIG_HVM)

Here and below I'd prefer if you used the shorter #ifdef.

> +    alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN);
> +#endif
> +    return true;
> +}
> +
> +/* Allow to protect evaluation of conditionals with respect to speculation */
> +#if defined(CONFIG_HVM)
> +#define evaluate_nospec(condition)                                         \
> +    ((condition) ? arch_barrier_nospec_true() : !arch_barrier_nospec_true())
> +#else
> +#define evaluate_nospec(condition) (condition)
> +#endif
> +
> +/* Allow to block speculative execution in generic code */
> +#define block_speculation() (void)arch_barrier_nospec_true()

I'm pretty sure that I did point out before that this lacks an
outer pair of parentheses.

Jan




* Re: [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-12 13:25                     ` [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: " Jan Beulich
@ 2019-02-12 14:05                       ` Norbert Manthey
  2019-02-12 14:14                         ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-12 14:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/12/19 14:25, Jan Beulich wrote:
>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>> @@ -3453,7 +3456,8 @@ int hvm_msr_read_intercept(unsigned int msr, uint64_t *msr_content)
>>          if ( (index / 2) >=
>>               MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT) )
>>              goto gp_fault;
>> -        *msr_content = var_range_base[index];
>> +        *msr_content = var_range_base[array_index_nospec(index,
>> +                        2*MASK_EXTR(v->arch.hvm.mtrr.mtrr_cap, MTRRcap_VCNT))];
> Missing blanks around *. This alone would be easy to adjust while
> committing, but there's still the only partially discussed question
> regarding ...
>
>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>      if ( a.index >= HVM_NR_PARAMS )
>>          return -EINVAL;
>>  
>> +    /*
>> +     * Make sure the guest controlled value a.index is bounded even during
>> +     * speculative execution.
>> +     */
>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>> +
>>      d = rcu_lock_domain_by_any_id(a.domid);
>>      if ( d == NULL )
>>          return -ESRCH;
>> @@ -4370,6 +4380,12 @@ static int hvmop_get_param(
>>      if ( a.index >= HVM_NR_PARAMS )
>>          return -EINVAL;
>>  
>> +    /*
>> +     * Make sure the guest controlled value a.index is bounded even during
>> +     * speculative execution.
>> +     */
>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
> ... the usefulness of these two. To make forward progress it may
> be worthwhile to split off these two changes into a separate patch.
> If you're fine with this, I could strip these two before committing,
> in which case the remaining change is
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Taking apart the commit is fine with me. I will submit a follow-up
change that does not update the values but fixes the reads.

Best,
Norbert





* Re: [PATCH SpectreV1+L1TF v6 6/9] is_control_domain: block speculation
       [not found]                   ` <01CF2AAF02000039B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-12 14:11                     ` Jan Beulich
  2019-02-14 13:45                       ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-12 14:11 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
> Checks of domain properties, such as is_hardware_domain or is_hvm_domain,
> might be bypassed by speculatively executing these instructions. A reason
> for bypassing these checks is that these macros access the domain
> structure via a pointer, and check a certain field. Since this memory
> access is slow, the CPU assumes a returned value and continues the
> execution.
> 
> In case an is_control_domain check is bypassed, for example during a
> hypercall, data that should only be accessible by the control domain could
> be loaded into the cache.
> 
> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
> 
> ---
> 
> Notes:
>   v6: Drop nospec.h include

And this was because of what? I think it is good practice to include
other headers which added definitions rely on, even if in practice
_right now_ that header gets included already by other means. If
there's some recursion in header dependencies, then it would have
been nice if you had pointed out the actual issue.

Jan




* Re: [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec
       [not found]                   ` <01CFEAAF02000039B1E090C7@prv1-mh.provo.novell.com>
  2019-02-12 13:50                     ` [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec Jan Beulich
@ 2019-02-12 14:12                     ` Jan Beulich
  2019-02-14 13:42                       ` Norbert Manthey
  1 sibling, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-12 14:12 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
> --- /dev/null
> +++ b/xen/include/asm-x86/nospec.h
> @@ -0,0 +1,39 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
> +
> +#ifndef _ASM_X86_NOSPEC_H
> +#define _ASM_X86_NOSPEC_H
> +
> +#include <asm/alternative.h>
> +#include <asm/system.h>

Isn't the latter unnecessary now? You don't use any *mb() construct
anymore.

Jan




* Re: [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-12 14:05                       ` Norbert Manthey
@ 2019-02-12 14:14                         ` Jan Beulich
  2019-02-15  8:05                           ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-12 14:14 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 12.02.19 at 15:05, <nmanthey@amazon.de> wrote:
> On 2/12/19 14:25, Jan Beulich wrote:
>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>>      if ( a.index >= HVM_NR_PARAMS )
>>>          return -EINVAL;
>>>  
>>> +    /*
>>> +     * Make sure the guest controlled value a.index is bounded even during
>>> +     * speculative execution.
>>> +     */
>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>> +
>>>      d = rcu_lock_domain_by_any_id(a.domid);
>>>      if ( d == NULL )
>>>          return -ESRCH;
>>> @@ -4370,6 +4380,12 @@ static int hvmop_get_param(
>>>      if ( a.index >= HVM_NR_PARAMS )
>>>          return -EINVAL;
>>>  
>>> +    /*
>>> +     * Make sure the guest controlled value a.index is bounded even during
>>> +     * speculative execution.
>>> +     */
>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>> ... the usefulness of these two. To make forward progress it may
>> be worthwhile to split off these two changes into a separate patch.
>> If you're fine with this, I could strip these two before committing,
>> in which case the remaining change is
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> Taking apart the commit is fine with me. I will submit a follow up
> change that does not update the values but fixes the reads.

As pointed out during the v5 discussion, I'm unconvinced that if
you do so the compiler can't re-introduce the issue via CSE. I'd
really like a reliable solution to be determined first.

Jan




* Re: [PATCH SpectreV1+L1TF v6 9/9] common/memory: block speculative out-of-bound accesses
       [not found]                   ` <23D9419E02000017B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-12 14:31                     ` Jan Beulich
  2019-02-14 14:04                       ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-12 14:31 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
> @@ -33,10 +34,11 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
>  
>  bool __mfn_valid(unsigned long mfn)
>  {
> -    return likely(mfn < max_page) &&
> -           likely(!(mfn & pfn_hole_mask)) &&
> -           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
> -                           pdx_group_valid));
> +    return evaluate_nospec(
> +        likely(mfn < max_page) &&
> +        likely(!(mfn & pfn_hole_mask)) &&
> +        likely(test_bit(pfn_to_pdx(array_index_nospec(mfn, max_page))
> +                                   / PDX_GROUP_COUNT, pdx_group_valid)));
>  }

How about this instead:

bool __mfn_valid(unsigned long mfn)
{
    if ( unlikely(evaluate_nospec(mfn >= max_page)) )
        return false;
    return likely(!(mfn & pfn_hole_mask)) &&
           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
                           pdx_group_valid));
}

Initially I really just wanted to improve the line wrapping (at the
very least the / was misplaced), but I think this variant guards
against all that's needed without even introducing wrapping
headaches.

Jan




* Re: [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses
       [not found]                   ` <01CEAAAF02000039B1E090C7@prv1-mh.provo.novell.com>
@ 2019-02-13 11:50                     ` Jan Beulich
  2019-02-15  9:55                       ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-13 11:50 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
> Guests can issue grant table operations and provide guest controlled
> data to them. This data is also used for memory loads. To avoid
> speculative out-of-bound accesses, we use the array_index_nospec macro
> where applicable. However, there are also memory accesses that cannot
> be protected by a single array protection, or multiple accesses in a
> row. To protect these, a nospec barrier is placed between the actual
> range check and the access via the block_speculation macro.
> 
> As different versions of grant tables use structures of different size,
> and the status is encoded in an array for version 2, speculative
> execution might touch zero-initialized structures of version 2 while
> the table is actually using version 1.

Why zero-initialized? Did I still not succeed demonstrating to you
that speculation along a v2 path can actually overrun v1 arrays,
not just access parts which may still be zero-initialized?

> @@ -203,8 +204,9 @@ static inline unsigned int nr_status_frames(const struct grant_table *gt)
>  }
>  
>  #define MAPTRACK_PER_PAGE (PAGE_SIZE / sizeof(struct grant_mapping))
> -#define maptrack_entry(t, e) \
> -    ((t)->maptrack[(e)/MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])
> +#define maptrack_entry(t, e)                                                   \
> +    ((t)->maptrack[array_index_nospec(e, (t)->maptrack_limit)                  \
> +                                     /MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])

I would have hoped that the pointing out of similar formatting
issues elsewhere would have had an impact here as well, but
I see the / is still wrongly at the beginning of a line, and is still
not followed by a blank (would be "preceded" if it was well
placed). And while I realize it's only code movement, adding
the missing blanks around % would be appreciated too at this
occasion.

> @@ -963,9 +965,13 @@ map_grant_ref(
>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>                   op->ref, rgt->domain->domain_id);
>  
> +    /* Make sure the above check is not bypassed speculatively */
> +    block_speculation();
> +
>      act = active_entry_acquire(rgt, op->ref);
>      shah = shared_entry_header(rgt, op->ref);
> -    status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
> +    status = evaluate_nospec(rgt->gt_version == 1) ? &shah->flags
> +                                                 : &status_entry(rgt, op->ref);

Did you consider folding the two pairs of fences you emit into
one? Moving up the assignment to status ought to achieve this,
as then the block_speculation() could be dropped afaict.

Then again you don't alter shared_entry_header(). If there's
a reason for you not having done so, then a second fence
here is needed in any event.

What about the version check in nr_grant_entries()? It appears
to me as if at least its use in grant_map_exists() (which simply is
the first one I've found) is problematic without an adjustment.
Even worse, ...

> @@ -1321,7 +1327,8 @@ unmap_common(
>          goto unlock_out;
>      }
>  
> -    act = active_entry_acquire(rgt, op->ref);
> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
> +                                                       nr_grant_entries(rgt)));

... you add a use e.g. here to _guard_ against speculation.

And what about _set_status(), unmap_common_complete(),
gnttab_grow_table(), gnttab_setup_table(),
release_grant_for_copy(), the 2nd one in acquire_grant_for_copy(),
several ones in gnttab_set_version(), gnttab_release_mappings(),
the 3rd one in mem_sharing_gref_to_gfn(), gnttab_map_frame(),
and gnttab_get_status_frame()?

Jan




* Re: [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses
  2019-02-12 13:08                     ` [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses Jan Beulich
@ 2019-02-14 13:10                       ` Norbert Manthey
  2019-02-14 13:20                         ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-14 13:10 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel


On 2/12/19 14:08, Jan Beulich wrote:
>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>> @@ -813,6 +817,13 @@ int set_global_virq_handler(struct domain *d, uint32_t virq)
>>  
>>      if (virq >= NR_VIRQS)
>>          return -EINVAL;
>> +
>> +   /*
>> +    * Make sure the guest controlled value virq is bounded even during
>> +    * speculative execution.
>> +    */
>> +    virq = array_index_nospec(virq, ARRAY_SIZE(global_virq_handlers));
>> +
>>      if (!virq_is_global(virq))
>>          return -EINVAL;
> Didn't we agree earlier on that this addition is pointless, as the only
> caller is the XEN_DOMCTL_set_virq_handler handler, and most
> domctl-s (including this one) are excluded from security considerations
> due to XSA-77?
I do not recall such a comment, but agree that this hunk can be dropped.
>
>> @@ -955,22 +967,22 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>>      {
>>      case ECS_VIRQ:
>>          if ( virq_is_global(chn->u.virq) )
>> -            chn->notify_vcpu_id = vcpu_id;
>> +            chn->notify_vcpu_id = v->vcpu_id;
>>          else
>>              rc = -EINVAL;
>>          break;
>>      case ECS_UNBOUND:
>>      case ECS_INTERDOMAIN:
>> -        chn->notify_vcpu_id = vcpu_id;
>> +        chn->notify_vcpu_id = v->vcpu_id;
>>          break;
>>      case ECS_PIRQ:
>> -        if ( chn->notify_vcpu_id == vcpu_id )
>> +        if ( chn->notify_vcpu_id == v->vcpu_id )
>>              break;
>>          unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
>> -        chn->notify_vcpu_id = vcpu_id;
>> +        chn->notify_vcpu_id = v->vcpu_id;
> Right now we understand why all of these changes are done, but
> without a comment this is liable to be converted back as an
> optimization down the road.

I will extend the commit message accordingly.

Best,
Norbert

>
> Everything else here looks fine to me now.
>
> Jan
>
>




* Re: [PATCH SpectreV1+L1TF v6 2/9] x86/vioapic: block speculative out-of-bound accesses
  2019-02-12 13:16                     ` [PATCH SpectreV1+L1TF v6 2/9] x86/vioapic: " Jan Beulich
@ 2019-02-14 13:16                       ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-14 13:16 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/12/19 14:16, Jan Beulich wrote:
>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>> When interacting with io apic, a guest can specify values that are used
>> as index to structures, and whose values are not compared against
>> upper bounds to prevent speculative out-of-bound accesses. This change
>> prevents these speculative accesses.
>>
>> Furthermore, variables are initialized and the compiler is asked to not
>> optimize these initializations, as the uninitialized, potentially guest
>> controlled, variables might be used in a speculative out-of-bound access.
> Uninitialized variables can't be guest controlled, not even potentially.
> What we want to avoid here is speculation with uninitialized values
> (or really stale data still on the stack from use by other code),
> regardless of direct guest control.
I will drop the part "potentially guest controlled".
>
>> Out of the four initialized variables, two are potentially problematic,
>> namely ones in the functions vioapic_irq_positive_edge and
>> vioapic_get_trigger_mode.
>>
>> As the two problematic variables are both used in the common function
>> gsi_vioapic, the mitigation is implemented there. As the access pattern
>> of the currently non-guest-controlled functions might change in the
>> future as well, the other variables are initialized as well.
>>
>> This commit is part of the SpectreV1+L1TF mitigation patch series.
> Oh, I didn't pay attention in patch 1: You had meant to change this
> wording to something including "speculative hardening" (throughout
> the series).
That slipped through as I did not add that right after the discussion. I
added this to the whole series now.
>
>> @@ -212,7 +220,15 @@ static void vioapic_write_redirent(
>>      struct hvm_irq *hvm_irq = hvm_domain_irq(d);
>>      union vioapic_redir_entry *pent, ent;
>>      int unmasked = 0;
>> -    unsigned int gsi = vioapic->base_gsi + idx;
>> +    unsigned int gsi;
>> +
>> +    /* Callers of this function should make sure idx is bounded appropriately */
>> +    ASSERT(idx < vioapic->nr_pins);
>> +
>> +    /* Make sure no out-of-bound value for idx can be used */
> out-of-bounds

Will fix.

Best,
Norbert

>
> I'm fine now with all the code changes here.
>
> Jan
>
>




* Re: [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses
  2019-02-14 13:10                       ` Norbert Manthey
@ 2019-02-14 13:20                         ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-02-14 13:20 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 14.02.19 at 14:10, <nmanthey@amazon.de> wrote:
> On 2/12/19 14:08, Jan Beulich wrote:
>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>> @@ -955,22 +967,22 @@ long evtchn_bind_vcpu(unsigned int port, unsigned int vcpu_id)
>>>      {
>>>      case ECS_VIRQ:
>>>          if ( virq_is_global(chn->u.virq) )
>>> -            chn->notify_vcpu_id = vcpu_id;
>>> +            chn->notify_vcpu_id = v->vcpu_id;
>>>          else
>>>              rc = -EINVAL;
>>>          break;
>>>      case ECS_UNBOUND:
>>>      case ECS_INTERDOMAIN:
>>> -        chn->notify_vcpu_id = vcpu_id;
>>> +        chn->notify_vcpu_id = v->vcpu_id;
>>>          break;
>>>      case ECS_PIRQ:
>>> -        if ( chn->notify_vcpu_id == vcpu_id )
>>> +        if ( chn->notify_vcpu_id == v->vcpu_id )
>>>              break;
>>>          unlink_pirq_port(chn, d->vcpu[chn->notify_vcpu_id]);
>>> -        chn->notify_vcpu_id = vcpu_id;
>>> +        chn->notify_vcpu_id = v->vcpu_id;
>> Right now we understand why all of these changes are done, but
>> without a comment this is liable to be converted back as an
>> optimization down the road.
> 
> I will extend the commit message accordingly.

Actually the request was for a comment addition, not for a commit
message adjustment.

Jan




* Re: [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec
  2019-02-12 13:50                     ` [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec Jan Beulich
@ 2019-02-14 13:37                       ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-14 13:37 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/12/19 14:50, Jan Beulich wrote:
>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>> --- /dev/null
>> +++ b/xen/include/asm-x86/nospec.h
>> @@ -0,0 +1,39 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
>> +
>> +#ifndef _ASM_X86_NOSPEC_H
>> +#define _ASM_X86_NOSPEC_H
>> +
>> +#include <asm/alternative.h>
>> +#include <asm/system.h>
>> +
>> +/* Allow to insert a read memory barrier into conditionals */
>> +static always_inline bool arch_barrier_nospec_true(void)
> Now that this is x86-specific (and not used by common code),
> I don't think the arch_ prefix is warranted anymore.
I will drop the prefix.
>> +{
>> +#if defined(CONFIG_HVM)
> Here and below I'd prefer if you used the shorter #ifdef.
I will use the short version.
>
>> +    alternative("", "lfence", X86_FEATURE_SC_L1TF_VULN);
>> +#endif
>> +    return true;
>> +}
>> +
>> +/* Allow to protect evaluation of conditionals with respect to speculation */
>> +#if defined(CONFIG_HVM)
>> +#define evaluate_nospec(condition)                                         \
>> +    ((condition) ? arch_barrier_nospec_true() : !arch_barrier_nospec_true())
>> +#else
>> +#define evaluate_nospec(condition) (condition)
>> +#endif
>> +
>> +/* Allow to block speculative execution in generic code */
>> +#define block_speculation() (void)arch_barrier_nospec_true()
> I'm pretty sure that I did point out before that this lacks an
> outer pair of parentheses.

You did. I will add them.

Best,
Norbert

>
> Jan
>
>




* Re: [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec
  2019-02-12 14:12                     ` Jan Beulich
@ 2019-02-14 13:42                       ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-14 13:42 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/12/19 15:12, Jan Beulich wrote:
>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>> --- /dev/null
>> +++ b/xen/include/asm-x86/nospec.h
>> @@ -0,0 +1,39 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. */
>> +
>> +#ifndef _ASM_X86_NOSPEC_H
>> +#define _ASM_X86_NOSPEC_H
>> +
>> +#include <asm/alternative.h>
>> +#include <asm/system.h>
> Isn't the latter unnecessary now? You don't use any *mb() construct
> anymore.

True, I deleted this include.

Best,
Norbert

>
> Jan
>
>




* Re: [PATCH SpectreV1+L1TF v6 6/9] is_control_domain: block speculation
  2019-02-12 14:11                     ` [PATCH SpectreV1+L1TF v6 6/9] is_control_domain: block speculation Jan Beulich
@ 2019-02-14 13:45                       ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-14 13:45 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/12/19 15:11, Jan Beulich wrote:
>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>> Checks of domain properties, such as is_hardware_domain or is_hvm_domain,
>> might be bypassed by speculatively executing these instructions. A reason
>> for bypassing these checks is that these macros access the domain
>> structure via a pointer, and check a certain field. Since this memory
>> access is slow, the CPU assumes a returned value and continues the
>> execution.
>>
>> In case an is_control_domain check is bypassed, for example during a
>> hypercall, data that should only be accessible by the control domain could
>> be loaded into the cache.
>>
>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>>
>> ---
>>
>> Notes:
>>   v6: Drop nospec.h include
> And this was because of what? I think it is good practice to include
> other headers which added definitions rely on, even if in practice
> _right now_ that header gets included already by other means. If
> there's some recursion in header dependencies, then it would have
> been nice if you had pointed out the actual issue.

The nospec.h header has been introduced by the commit "xen/sched:
Introduce domain_vcpu() helper" between my v4 and v6, so I had to drop
my include there. The sched.h file still includes the nospec.h file, I
just do not have to add it any more. I could have been a bit more
verbose in the notes section.

Best,
Norbert

>
> Jan
>
>




* Re: [PATCH SpectreV1+L1TF v6 9/9] common/memory: block speculative out-of-bound accesses
  2019-02-12 14:31                     ` [PATCH SpectreV1+L1TF v6 9/9] common/memory: block speculative out-of-bound accesses Jan Beulich
@ 2019-02-14 14:04                       ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-14 14:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/12/19 15:31, Jan Beulich wrote:
>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>> @@ -33,10 +34,11 @@ unsigned long __read_mostly pdx_group_valid[BITS_TO_LONGS(
>>  
>>  bool __mfn_valid(unsigned long mfn)
>>  {
>> -    return likely(mfn < max_page) &&
>> -           likely(!(mfn & pfn_hole_mask)) &&
>> -           likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>> -                           pdx_group_valid));
>> +    return evaluate_nospec(
>> +        likely(mfn < max_page) &&
>> +        likely(!(mfn & pfn_hole_mask)) &&
>> +        likely(test_bit(pfn_to_pdx(array_index_nospec(mfn, max_page))
>> +                                   / PDX_GROUP_COUNT, pdx_group_valid)));
>>  }
> How about this instead:
>
> bool __mfn_valid(unsigned long mfn)
> {
>     if ( unlikely(evaluate_nospec(mfn >= max_page)) )
>         return false;
>     return likely(!(mfn & pfn_hole_mask)) &&
>            likely(test_bit(pfn_to_pdx(mfn) / PDX_GROUP_COUNT,
>                            pdx_group_valid));
> }
>
> Initially I really just wanted to improve the line wrapping (at the
> very least the / was misplaced), but I think this variant guards
> against all that's needed without even introducing wrapping
> headaches.

That works as well, I will adapt the commit accordingly.

Best,
Norbert

>
> Jan
>
>




* Re: [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-12 14:14                         ` Jan Beulich
@ 2019-02-15  8:05                           ` Norbert Manthey
  2019-02-15  8:55                             ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-15  8:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/12/19 15:14, Jan Beulich wrote:
>>>> On 12.02.19 at 15:05, <nmanthey@amazon.de> wrote:
>> On 2/12/19 14:25, Jan Beulich wrote:
>>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>          return -EINVAL;
>>>>  
>>>> +    /*
>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>> +     * speculative execution.
>>>> +     */
>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>> +
>>>>      d = rcu_lock_domain_by_any_id(a.domid);
>>>>      if ( d == NULL )
>>>>          return -ESRCH;
>>>> @@ -4370,6 +4380,12 @@ static int hvmop_get_param(
>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>          return -EINVAL;
>>>>  
>>>> +    /*
>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>> +     * speculative execution.
>>>> +     */
>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>> ... the usefulness of these two. To make forward progress it may
>>> be worthwhile to split off these two changes into a separate patch.
>>> If you're fine with this, I could strip these two before committing,
>>> in which case the remaining change is
>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> Taking apart the commit is fine with me. I will submit a follow up
>> change that does not update the values but fixes the reads.
> As pointed out during the v5 discussion, I'm unconvinced that if
> you do so the compiler can't re-introduce the issue via CSE. I'd
> really like a reliable solution to be determined first.

I cannot give a guarantee what future compilers might do. Furthermore, I
do not want to wait until all/most compilers ship with such a
controllable guarantee. While I would love to have a reliable solution
as well, I'd go with what we can do today for now, and re-iterate once
we have something more stable.

Best,
Norbert

>
> Jan
>
>




* Re: [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-15  8:05                           ` Norbert Manthey
@ 2019-02-15  8:55                             ` Jan Beulich
  2019-02-15 10:50                               ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-15  8:55 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 15.02.19 at 09:05, <nmanthey@amazon.de> wrote:
> On 2/12/19 15:14, Jan Beulich wrote:
>>>>> On 12.02.19 at 15:05, <nmanthey@amazon.de> wrote:
>>> On 2/12/19 14:25, Jan Beulich wrote:
>>>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>>>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>>          return -EINVAL;
>>>>>  
>>>>> +    /*
>>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>>> +     * speculative execution.
>>>>> +     */
>>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>>> +
>>>>>      d = rcu_lock_domain_by_any_id(a.domid);
>>>>>      if ( d == NULL )
>>>>>          return -ESRCH;
>>>>> @@ -4370,6 +4380,12 @@ static int hvmop_get_param(
>>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>>          return -EINVAL;
>>>>>  
>>>>> +    /*
>>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>>> +     * speculative execution.
>>>>> +     */
>>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>> ... the usefulness of these two. To make forward progress it may
>>>> be worthwhile to split off these two changes into a separate patch.
>>>> If you're fine with this, I could strip these two before committing,
>>>> in which case the remaining change is
>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>> Taking apart the commit is fine with me. I will submit a follow up
>>> change that does not update the values but fixes the reads.
>> As pointed out during the v5 discussion, I'm unconvinced that if
>> you do so the compiler can't re-introduce the issue via CSE. I'd
>> really like a reliable solution to be determined first.
> 
> I cannot give a guarantee what future compilers might do. Furthermore, I
> do not want to wait until all/most compilers ship with such a
> controllable guarantee.

Guarantee? Future compilers are (hopefully) going to get better at
optimizing, and hence are (again hopefully) going to find more
opportunities for CSE. So the problem is going to get worse rather
than better, and the changes you're proposing to re-instate are
therefore more like false promises.

Jan




* Re: [PATCH SpectreV1+L1TF v6 4/9] spec: add l1tf-barrier
  2019-02-12 13:44                     ` [PATCH SpectreV1+L1TF v6 4/9] spec: add l1tf-barrier Jan Beulich
@ 2019-02-15  9:13                       ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-15  9:13 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/12/19 14:44, Jan Beulich wrote:
>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>> To control the runtime behavior on L1TF vulnerable platforms better, the
>> command line option l1tf-barrier is introduced. This option controls
>> whether on vulnerable x86 platforms the lfence instruction is used to
>> prevent speculative execution from bypassing the evaluation of
>> conditionals that are protected with the evaluate_nospec macro.
>>
>> By now, Xen is capable of identifying L1TF vulnerable hardware. However,
>> this information cannot be used for alternative patching, as a CPU feature
>> is required. To control alternative patching with the command line option,
>> a new x86 feature "X86_FEATURE_SC_L1TF_VULN" is introduced. This feature
>> is used to patch the lfence instruction into the arch_barrier_nospec_true
>> function. The feature is enabled only if L1TF vulnerable hardware is
>> detected and the command line option does not prevent using this feature.
>>
>> The status of hyperthreading is not considered when automatically enabling
>> adding the lfence instruction, because platforms without hyperthreading
>> can still be vulnerable to L1TF in case the L1 cache is not flushed
>> properly.
>>
>> Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
>>
>> ---
>>
>> Notes:
>>   v6: Move disabling l1tf-barrier into spec-ctrl=no
>>       Use gap in existing flags
>>       Force barrier based on commandline, independently of L1TF detection
>>
>>  docs/misc/xen-command-line.pandoc | 14 ++++++++++----
>>  xen/arch/x86/spec_ctrl.c          | 17 +++++++++++++++--
>>  xen/include/asm-x86/cpufeatures.h |  1 +
>>  xen/include/asm-x86/spec_ctrl.h   |  1 +
>>  4 files changed, 27 insertions(+), 6 deletions(-)
>>
>> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -483,9 +483,9 @@ accounting for hardware capabilities as enumerated via 
>> CPUID.
>>  
>>  Currently accepted:
>>  
>> -The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`,
>> -`l1d-flush` and `ssbd` are used by default if available and applicable.  They 
>> can
>> -be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, and
>> +The Speculation Control hardware features `ibrsb`, `stibp`, `ibpb`, 
>> `l1d-flush`,
>> +`l1tf-barrier` and `ssbd` are used by default if available and applicable.  
>> They
>> +can be ignored, e.g. `no-ibrsb`, at which point Xen won't use them itself, 
>> and
>>  won't offer them to guests.
>>  
>>  ### cpuid_mask_cpu
>> @@ -1896,7 +1896,7 @@ By default SSBD will be mitigated at runtime (i.e 
>> `ssbd=runtime`).
>>  ### spec-ctrl (x86)
>>  > `= List of [ <bool>, xen=<bool>, {pv,hvm,msr-sc,rsb}=<bool>,
>>  >              bti-thunk=retpoline|lfence|jmp, {ibrs,ibpb,ssbd,eager-fpu,
>> ->              l1d-flush}=<bool> ]`
>> +>              l1d-flush,l1tf-barrier}=<bool> ]`
>>  
>>  Controls for speculative execution sidechannel mitigations.  By default, 
>> Xen
>>  will pick the most appropriate mitigations based on compiled in support,
>> @@ -1962,6 +1962,12 @@ Irrespective of Xen's setting, the feature is 
>> virtualised for HVM guests to
>>  use.  By default, Xen will enable this mitigation on hardware believed to 
>> be
>>  vulnerable to L1TF.
>>  
>> +On hardware vulnerable to L1TF, the `l1tf-barrier=` option can be used to 
>> force
>> +or prevent Xen from protecting evaluations inside the hypervisor with a 
>> barrier
>> +instruction to not load potentially secret information into L1 cache.  By
>> +default, Xen will enable this mitigation on hardware believed to be 
>> vulnerable
>> +to L1TF.
>> +
>>  ### sync_console
>>  > `= <boolean>`
>>  
>> diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
>> --- a/xen/arch/x86/spec_ctrl.c
>> +++ b/xen/arch/x86/spec_ctrl.c
>> @@ -21,6 +21,7 @@
>>  #include <xen/lib.h>
>>  #include <xen/warning.h>
>>  
>> +#include <asm/cpuid.h>
>>  #include <asm/microcode.h>
>>  #include <asm/msr.h>
>>  #include <asm/processor.h>
>> @@ -50,6 +51,7 @@ bool __read_mostly opt_ibpb = true;
>>  bool __read_mostly opt_ssbd = false;
>>  int8_t __read_mostly opt_eager_fpu = -1;
>>  int8_t __read_mostly opt_l1d_flush = -1;
>> +int8_t __read_mostly opt_l1tf_barrier = -1;
>>  
>>  bool __initdata bsp_delay_spec_ctrl;
>>  uint8_t __read_mostly default_xen_spec_ctrl;
>> @@ -91,6 +93,8 @@ static int __init parse_spec_ctrl(const char *s)
>>              if ( opt_pv_l1tf_domu < 0 )
>>                  opt_pv_l1tf_domu = 0;
>>  
>> +            opt_l1tf_barrier = 0;
>> +
>>          disable_common:
>>              opt_rsb_pv = false;
>>              opt_rsb_hvm = false;
>> @@ -157,6 +161,8 @@ static int __init parse_spec_ctrl(const char *s)
>>              opt_eager_fpu = val;
>>          else if ( (val = parse_boolean("l1d-flush", s, ss)) >= 0 )
>>              opt_l1d_flush = val;
>> +        else if ( (val = parse_boolean("l1tf-barrier", s, ss)) >= 0 )
>> +            opt_l1tf_barrier = val;
>>          else
>>              rc = -EINVAL;
>>  
>> @@ -248,7 +254,7 @@ static void __init print_details(enum ind_thunk thunk, 
>> uint64_t caps)
>>                 "\n");
>>  
>>      /* Settings for Xen's protection, irrespective of guests. */
>> -    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s\n",
>> +    printk("  Xen settings: BTI-Thunk %s, SPEC_CTRL: %s%s, Other:%s%s%s\n",
>>             thunk == THUNK_NONE      ? "N/A" :
>>             thunk == THUNK_RETPOLINE ? "RETPOLINE" :
>>             thunk == THUNK_LFENCE    ? "LFENCE" :
>> @@ -258,7 +264,8 @@ static void __init print_details(enum ind_thunk thunk, 
>> uint64_t caps)
>>             !boot_cpu_has(X86_FEATURE_SSBD)           ? "" :
>>             (default_xen_spec_ctrl & SPEC_CTRL_SSBD)  ? " SSBD+" : " SSBD-",
>>             opt_ibpb                                  ? " IBPB"  : "",
>> -           opt_l1d_flush                             ? " L1D_FLUSH" : "");
>> +           opt_l1d_flush                             ? " L1D_FLUSH" : "",
>> +           opt_l1tf_barrier                          ? " L1TF_BARRIER" : 
>> "");
>>  
>>      /* L1TF diagnostics, printed if vulnerable or PV shadowing is in use. 
>> */
>>      if ( cpu_has_bug_l1tf || opt_pv_l1tf_hwdom || opt_pv_l1tf_domu )
>> @@ -842,6 +849,12 @@ void __init init_speculation_mitigations(void)
>>      else if ( opt_l1d_flush == -1 )
>>          opt_l1d_flush = cpu_has_bug_l1tf && !(caps & ARCH_CAPS_SKIP_L1DFL);
>>  
>> +    /* By default, enable L1TF_VULN on L1TF-vulnerable hardware */
>> +    if ( opt_l1tf_barrier == -1 )
>> +        opt_l1tf_barrier = cpu_has_bug_l1tf;
>> +    if ( opt_l1tf_barrier > 0)
>> +        setup_force_cpu_cap(X86_FEATURE_SC_L1TF_VULN);
> Did we end with a misunderstanding in the v5 discussion? I didn't
> answer your question regarding whether to also consider L1D
> flush availability (on top of my request to consider SMT) with a
> straight "yes", but I think it was still clear that my more extensive
> response boiled down to a "yes". Oh, I see now - the same topic
> was discussed in two places, and for the second I then said that
> with the "for now" aspect properly stated (which you now do)
> I'd be fine.
>
> So let me put it this way: Is taking into consideration at least
> opt_smt and opt_l1d_flush (but putting on the side the "too
> early" aspect of the determination here) very difficult to do,
> or would that leave un-addressed concerns of yours? If not,
> may I ask that you go at least that little step further? As said
> before - we'd like to avoid penalizing configurations as well as
> setups which aren't affected. In particular it would seem
> pretty bad of us to further penalize people who have set
> "smt=0" and who use up-to-date microcode.
I understand that smt=0 should not be penalized. However, smt=0 is only
actually safe if L1D flushing is used as well. I will extend the logic
to set the CPU flag automatically in case L1TF-vulnerable hardware has
been detected and either smt != 0 or L1D flushing is disabled, i.e.
opt_l1tf_barrier = cpu_has_bug_l1tf && (opt_smt != 0 || opt_l1d_flush == 0);
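The proposed default can be sketched as a small standalone helper (a hypothetical function, not the actual spec_ctrl.c code; the parameter names mirror the Xen options, and -1 encodes "no command line override"):

```c
#include <stdbool.h>

/*
 * Sketch of the proposed l1tf-barrier default, NOT the actual Xen code:
 * enable the barrier when the hardware is L1TF-vulnerable and either
 * SMT is still enabled or L1D flushing is disabled.  A value of -1
 * means the option was not set on the command line.
 */
static int decide_l1tf_barrier(int opt_l1tf_barrier, bool cpu_has_bug_l1tf,
                               int opt_smt, int opt_l1d_flush)
{
    if ( opt_l1tf_barrier == -1 )
        opt_l1tf_barrier = cpu_has_bug_l1tf &&
                           (opt_smt != 0 || opt_l1d_flush == 0);

    return opt_l1tf_barrier;
}
```

In particular, a host with smt=0 and L1D flushing enabled gets the barrier disabled by default, while an explicit command line setting always wins.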
>
> Also in the second if() there's yet again a missing blank.

Will fix.

Best,
Norbert

>
> Jan
>




Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 150+ messages in thread

* Re: [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-13 11:50                     ` [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: " Jan Beulich
@ 2019-02-15  9:55                       ` Norbert Manthey
  2019-02-15 10:34                         ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-15  9:55 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/13/19 12:50, Jan Beulich wrote:
>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>> Guests can issue grant table operations and provide guest controlled
>> data to them. This data is also used for memory loads. To avoid
>> speculative out-of-bound accesses, we use the array_index_nospec macro
>> where applicable. However, there are also memory accesses that cannot
>> be protected by a single array protection, or multiple accesses in a
>> row. To protect these, a nospec barrier is placed between the actual
>> range check and the access via the block_speculation macro.
>>
>> As different versions of grant tables use structures of different size,
>> and the status is encoded in an array for version 2, speculative
>> execution might touch zero-initialized structures of version 2 while
>> the table is actually using version 1.
> Why zero-initialized? Did I still not succeed demonstrating to you
> that speculation along a v2 path can actually overrun v1 arrays,
> not just access parts which may still be zero-initialized?
I believe a speculative v2 access can touch data that has been written
by valid v1 accesses before, zero-initialized data, or the NULL page.
Given the macros used for the access, I do not believe that a v2 access
can touch a page located behind a page holding valid v1 data.
>
>> @@ -203,8 +204,9 @@ static inline unsigned int nr_status_frames(const struct grant_table *gt)
>>  }
>>  
>>  #define MAPTRACK_PER_PAGE (PAGE_SIZE / sizeof(struct grant_mapping))
>> -#define maptrack_entry(t, e) \
>> -    ((t)->maptrack[(e)/MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])
>> +#define maptrack_entry(t, e)                                                   \
>> +    ((t)->maptrack[array_index_nospec(e, (t)->maptrack_limit)                  \
>> +                                     /MAPTRACK_PER_PAGE][(e)%MAPTRACK_PER_PAGE])
> I would have hoped that the pointing out of similar formatting
> issues elsewhere would have had an impact here as well, but
> I see the / is still wrongly at the beginning of a line, and is still
> not followed by a blank (would be "preceded" if it was well
> placed). And while I realize it's only code movement, adding
> the missing blanks around % would be appreciated too at this
> occasion.
I will move the "/" to the upper line, and add the space around the "%".
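For reference, the clamping performed by array_index_nospec() can be sketched portably; this mirrors the branch-free masking fallback used by Linux and Xen (the real implementations live in nospec.h headers and may use architecture-specific code instead):

```c
/* Branch-free mask: ~0UL when index < size, 0 otherwise.  Mirrors the
 * generic Linux/Xen fallback; architectures may override it. */
static inline unsigned long array_index_mask_nospec(unsigned long index,
                                                    unsigned long size)
{
    return ~(long)(index | (size - index - 1)) >> (sizeof(long) * 8 - 1);
}

/* Clamp index so that even a mispredicted bounds check cannot read
 * beyond the array: out-of-range indices collapse to 0. */
static inline unsigned long array_index_nospec(unsigned long index,
                                               unsigned long size)
{
    return index & array_index_mask_nospec(index, size);
}
```

With this clamping in place, the e / MAPTRACK_PER_PAGE computation in maptrack_entry() can never index past maptrack_limit, even under misprediction.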
>
>> @@ -963,9 +965,13 @@ map_grant_ref(
>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>                   op->ref, rgt->domain->domain_id);
>>  
>> +    /* Make sure the above check is not bypassed speculatively */
>> +    block_speculation();
>> +
>>      act = active_entry_acquire(rgt, op->ref);
>>      shah = shared_entry_header(rgt, op->ref);
>> -    status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
>> +    status = evaluate_nospec(rgt->gt_version == 1) ? &shah->flags
>> +                                                 : &status_entry(rgt, op->ref);
> Did you consider folding the two pairs of fences you emit into
> one? Moving up the assignment to status ought to achieve this,
> as then the block_speculation() could be dropped afaict.
>
> Then again you don't alter shared_entry_header(). If there's
> a reason for you not having done so, then a second fence
> here is needed in any event.
Instead of the block_speculation() macro, I can also protect the op->ref
usage before evaluate_nospec via the array_index_nospec function.
>
> What about the version check in nr_grant_entries()? It appears
> to me as if at least its use in grant_map_exists() (which simply is
> the first one I've found) is problematic without an adjustment.
> Even worse, ...
>
>> @@ -1321,7 +1327,8 @@ unmap_common(
>>          goto unlock_out;
>>      }
>>  
>> -    act = active_entry_acquire(rgt, op->ref);
>> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
>> +                                                       nr_grant_entries(rgt)));
> ... you add a use e.g. here to _guard_ against speculation.
The adjustment you propose is to exchange the switch statement in
nr_grant_entries with an if( evaluate_nospec(gt->gt_version == 1) ), so
that the returned values are not speculated? Already before this
modification, the function is called and not inlined. Do you want me to
cache the value in functions that call this method regularly, to avoid
the penalty of the introduced lfence for each call?
>
> And what about _set_status(), unmap_common_complete(),
> gnttab_grow_table(), gnttab_setup_table(),
> release_grant_for_copy(), the 2nd one in acquire_grant_for_copy(),
> several ones in gnttab_set_version(), gnttab_release_mappings(),
> the 3rd one in mem_sharing_gref_to_gfn(), gnttab_map_frame(),
> and gnttab_get_status_frame()?

Protecting the function itself should make it unnecessary to modify the
speculation guards in these functions. I would have to check each of
them to see whether the guest actually has control, and whether it makes
sense to introduce a _nospec variant of the nr_grant_entries function to
avoid penalizing every caller. Do you have an opinion on this?

Best,
Norbert

>
> Jan
>
>







* Re: [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-15  9:55                       ` Norbert Manthey
@ 2019-02-15 10:34                         ` Jan Beulich
  2019-02-18 13:49                           ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-15 10:34 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 15.02.19 at 10:55, <nmanthey@amazon.de> wrote:
> On 2/13/19 12:50, Jan Beulich wrote:
>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>> Guests can issue grant table operations and provide guest controlled
>>> data to them. This data is also used for memory loads. To avoid
>>> speculative out-of-bound accesses, we use the array_index_nospec macro
>>> where applicable. However, there are also memory accesses that cannot
>>> be protected by a single array protection, or multiple accesses in a
>>> row. To protect these, a nospec barrier is placed between the actual
>>> range check and the access via the block_speculation macro.
>>>
>>> As different versions of grant tables use structures of different size,
>>> and the status is encoded in an array for version 2, speculative
>>> execution might touch zero-initialized structures of version 2 while
>>> the table is actually using version 1.
>> Why zero-initialized? Did I still not succeed demonstrating to you
>> that speculation along a v2 path can actually overrun v1 arrays,
>> not just access parts which may still be zero-initialized?
> I believe a speculative v2 access can touch data that has been written
> by valid v1 accesses before, zero initialized data, or touch the NULL
> page. Given the macros for the access I do not believe that a v2 access
> can touch a page that is located behind a page holding valid v1 data.

I've given examples before of how I see this to be possible. Would
you mind going back to one of the instances, and explaining to me
how you do _not_ see any room for an overrun there? Having
given examples, I simply don't know how else I can explain this to
you without knowing at what specific part of the explanation we
diverge. (And no, I'm not excluding that I'm making up an issue
where there is none.)

>>> @@ -963,9 +965,13 @@ map_grant_ref(
>>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>>                   op->ref, rgt->domain->domain_id);
>>>  
>>> +    /* Make sure the above check is not bypassed speculatively */
>>> +    block_speculation();
>>> +
>>>      act = active_entry_acquire(rgt, op->ref);
>>>      shah = shared_entry_header(rgt, op->ref);
>>> -    status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
>>> +    status = evaluate_nospec(rgt->gt_version == 1) ? &shah->flags
>>> +                                                 : &status_entry(rgt, op->ref);
>> Did you consider folding the two pairs of fences you emit into
>> one? Moving up the assignment to status ought to achieve this,
>> as then the block_speculation() could be dropped afaict.
>>
>> Then again you don't alter shared_entry_header(). If there's
>> a reason for you not having done so, then a second fence
>> here is needed in any event.
> Instead of the block_speculation() macro, I can also protect the op->ref
> usage before evaluate_nospec via the array_index_nospec function.

That's an option (as before), but doesn't help with shared_entry_header()'s
evaluation of gt_version.

>> What about the version check in nr_grant_entries()? It appears
>> to me as if at least its use in grant_map_exists() (which simply is
>> the first one I've found) is problematic without an adjustment.
>> Even worse, ...
>>
>>> @@ -1321,7 +1327,8 @@ unmap_common(
>>>          goto unlock_out;
>>>      }
>>>  
>>> -    act = active_entry_acquire(rgt, op->ref);
>>> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
>>> +                                                       
> nr_grant_entries(rgt)));
>> ... you add a use e.g. here to _guard_ against speculation.
> The adjustment you propose is to exchange the switch statement in
> nr_grant_entries with a if( evaluate_nospec( gt->gt_version == 1 ), so
> that the returned values are not speculated?

At this point I'm not proposing a particular solution. I'm just
putting on the table an issue left un-addressed. I certainly
wouldn't welcome converting the switch() to an if(), even if
right now there's no v3 on the horizon. (It's actually quite
the inverse: If someone came and submitted a patch to change
the various if()-s on gt_version to switch()-es, I'd welcome this.)

> Already before this
> modification the function is called and not inlined.

How does this matter when considering speculation?

> Do you want me to
> cache the value in functions that call this method regularly to avoid
> the penalty of the introduced lfence for each call?

That would go back to the question of what good it does to
latch the value into a local variable when you don't know whether
the compiler will put that variable in a register or in memory.
IOW I'm afraid that to be on the safe side there's no way
around the repeated LFENCEs.

>> And what about _set_status(), unmap_common_complete(),
>> gnttab_grow_table(), gnttab_setup_table(),
>> release_grant_for_copy(), the 2nd one in acquire_grant_for_copy(),
>> several ones in gnttab_set_version(), gnttab_release_mappings(),
>> the 3rd one in mem_sharing_gref_to_gfn(), gnttab_map_frame(),
>> and gnttab_get_status_frame()?
> 
> Protecting the function itself should allow to not modify the
> speculation guards in these functions. I would have to check each of
> them, whether the guest actually has control, and whether it makes sense
> to introduce a _nospec variant of the nr_grant_entries function to not
> penalize everywhere. Do you have an opinion on this?

As per above, yes, I think the only sufficiently reliable way is
to modify nr_grant_entries() itself. The question is how to
correctly do this without replacing the switch() there, all the more
so since the other change of yours has deliberately forced the
compiler into using rows of compares instead of jump tables (not
that I'd expect the compiler to have used a jump table here, i.e.
the remark is just wrt the general case).
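As a sketch of the direction discussed here (assuming names from the series; this is not the actual patch), fencing inside nr_grant_entries() itself could look like the following, with a plain compiler barrier standing in for Xen's alternative-patched lfence:

```c
/*
 * Sketch, NOT the actual Xen patch: keep the switch() on gt_version,
 * but serialize after the version-dependent computation so callers
 * cannot speculatively use a bound derived from the wrong version.
 */
#define PAGE_SIZE 4096u
#define GREFS_PER_PAGE_V1 (PAGE_SIZE / 8)   /* sizeof(grant_entry_v1_t) */
#define GREFS_PER_PAGE_V2 (PAGE_SIZE / 16)  /* sizeof(grant_entry_v2_t) */

/* Stand-in for Xen's block_speculation(); the real macro patches in an
 * lfence on affected Intel CPUs via the alternatives mechanism. */
#define block_speculation() __asm__ volatile ( "" ::: "memory" )

struct grant_table {
    unsigned int gt_version;
    unsigned int nr_grant_frames;
};

static unsigned int nr_grant_entries_nospec(const struct grant_table *gt)
{
    unsigned int num = 0;

    switch ( gt->gt_version )
    {
    case 1:
        num = gt->nr_grant_frames * GREFS_PER_PAGE_V1;
        break;
    case 2:
        num = gt->nr_grant_frames * GREFS_PER_PAGE_V2;
        break;
    }

    /* Fence before returning the bound to the caller. */
    block_speculation();
    return num;
}
```

This keeps the switch() extensible for a potential v3, at the cost of one fence per call, which is the trade-off debated above.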

Jan




* Re: [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-15  8:55                             ` Jan Beulich
@ 2019-02-15 10:50                               ` Norbert Manthey
  2019-02-15 11:46                                 ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-15 10:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/15/19 09:55, Jan Beulich wrote:
>>>> On 15.02.19 at 09:05, <nmanthey@amazon.de> wrote:
>> On 2/12/19 15:14, Jan Beulich wrote:
>>>>>> On 12.02.19 at 15:05, <nmanthey@amazon.de> wrote:
>>>> On 2/12/19 14:25, Jan Beulich wrote:
>>>>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>>>>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>>>          return -EINVAL;
>>>>>>  
>>>>>> +    /*
>>>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>>>> +     * speculative execution.
>>>>>> +     */
>>>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>>>> +
>>>>>>      d = rcu_lock_domain_by_any_id(a.domid);
>>>>>>      if ( d == NULL )
>>>>>>          return -ESRCH;
>>>>>> @@ -4370,6 +4380,12 @@ static int hvmop_get_param(
>>>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>>>          return -EINVAL;
>>>>>>  
>>>>>> +    /*
>>>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>>>> +     * speculative execution.
>>>>>> +     */
>>>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>>> ... the usefulness of these two. To make forward progress it may
>>>>> be worthwhile to split off these two changes into a separate patch.
>>>>> If you're fine with this, I could strip these two before committing,
>>>>> in which case the remaining change is
>>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>> Taking apart the commit is fine with me. I will submit a follow up
>>>> change that does not update the values but fixes the reads.
>>> As pointed out during the v5 discussion, I'm unconvinced that if
>>> you do so the compiler can't re-introduce the issue via CSE. I'd
>>> really like a reliable solution to be determined first.
>> I cannot give a guarantee what future compilers might do. Furthermore, I
>> do not want to wait until all/most compilers ship with such a
>> controllable guarantee.
> Guarantee? Future compilers are (hopefully) going to get better at
> optimizing, and hence are (again hopefully) going to find more
> opportunities for CSE. So the problem is going to get worse rather
> than better, and the changes you're proposing to re-instate are
> therefore more like false promises.

I do not want to dive into the future of compilers here. I would like to
fix the issue for today's compilers now, and not wait until compilers
have evolved one way or another. For this patch, the relevant
information is whether it should go in like this, or whether you want me
to protect all the reads instead. Is there more data I should provide to
help make this decision?

Best,
Norbert

>
> Jan
>
>







* Re: [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-15 10:50                               ` Norbert Manthey
@ 2019-02-15 11:46                                 ` Jan Beulich
  2019-02-18 14:47                                   ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-15 11:46 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 15.02.19 at 11:50, <nmanthey@amazon.de> wrote:
> On 2/15/19 09:55, Jan Beulich wrote:
>>>>> On 15.02.19 at 09:05, <nmanthey@amazon.de> wrote:
>>> On 2/12/19 15:14, Jan Beulich wrote:
>>>>>>> On 12.02.19 at 15:05, <nmanthey@amazon.de> wrote:
>>>>> On 2/12/19 14:25, Jan Beulich wrote:
>>>>>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>>>>>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>>>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>>>>          return -EINVAL;
>>>>>>>  
>>>>>>> +    /*
>>>>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>>>>> +     * speculative execution.
>>>>>>> +     */
>>>>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>>>>> +
>>>>>>>      d = rcu_lock_domain_by_any_id(a.domid);
>>>>>>>      if ( d == NULL )
>>>>>>>          return -ESRCH;
>>>>>>> @@ -4370,6 +4380,12 @@ static int hvmop_get_param(
>>>>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>>>>          return -EINVAL;
>>>>>>>  
>>>>>>> +    /*
>>>>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>>>>> +     * speculative execution.
>>>>>>> +     */
>>>>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>>>> ... the usefulness of these two. To make forward progress it may
>>>>>> be worthwhile to split off these two changes into a separate patch.
>>>>>> If you're fine with this, I could strip these two before committing,
>>>>>> in which case the remaining change is
>>>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>>> Taking apart the commit is fine with me. I will submit a follow up
>>>>> change that does not update the values but fixes the reads.
>>>> As pointed out during the v5 discussion, I'm unconvinced that if
>>>> you do so the compiler can't re-introduce the issue via CSE. I'd
>>>> really like a reliable solution to be determined first.
>>> I cannot give a guarantee what future compilers might do. Furthermore, I
>>> do not want to wait until all/most compilers ship with such a
>>> controllable guarantee.
>> Guarantee? Future compilers are (hopefully) going to get better at
>> optimizing, and hence are (again hopefully) going to find more
>> opportunities for CSE. So the problem is going to get worse rather
>> than better, and the changes you're proposing to re-instate are
>> therefore more like false promises.
> 
> I do not want to dive into compilers future here. I would like to fix
> the issue for todays compilers now and not wait until compilers evolved
> one way or another. For this patch, the relevant information is whether
> it should go in like this, or whether you want me to protect all the
> reads instead. Is there more data I shall provide to help make this
> decision?

I understand that you're not happy with what I've said, and you're
unlikely to become any happier with what I'll add. But please
understand that _if_ we make any changes to address issues with
speculation, the goal has to be that we don't have to come back
and re-investigate after every new compiler release. But please

Even beyond that - if, as you say, we'd limit ourselves to current
compilers, did you check that all of them at any optimization level
or with any other flags passed which may affect code generation
produce non-vulnerable code? And in particular considering the
case here never recognize CSE potential where we would like them
not to?

A code change is, imo, not even worthwhile considering to be put
in if it is solely based on the observations made with a limited set
of compilers and/or options. This might indeed help you, if you
care only about one specific environment. But by putting this in
(and perhaps even backporting it) we're sort of stating that the
issue is under control (to the best of our abilities, and for the given
area of code). For everyone.

So, to answer your question: From what we know, we simply
can't take a decision, at least not between the two proposed
variants of how to change the code. If there was a variant that
firmly worked, then there would not even be a need for any
discussion. And again from what we know, there is one
requirement that needs to be fulfilled for a change to be
considered "firmly working": The index needs to be in a register.
There must not be a way for the compiler to undermine this,
be it by CSE or any other means.

Considering changes done elsewhere, of course this may be
taken with a grain of salt. In other places we also expect the
compiler to not emit unreasonable code (e.g. needlessly
spilling registers to memory just to then reload them). But
while that's (imo) a fine expectation to have when an array
index is used just once, it is unavoidably more complicated in
the case here as well as in the grant table one.

Jan




* Re: [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-15 10:34                         ` Jan Beulich
@ 2019-02-18 13:49                           ` Norbert Manthey
  2019-02-18 16:08                             ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-18 13:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/15/19 11:34, Jan Beulich wrote:
>>>> On 15.02.19 at 10:55, <nmanthey@amazon.de> wrote:
>> On 2/13/19 12:50, Jan Beulich wrote:
>>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>>> Guests can issue grant table operations and provide guest controlled
>>>> data to them. This data is also used for memory loads. To avoid
>>>> speculative out-of-bound accesses, we use the array_index_nospec macro
>>>> where applicable. However, there are also memory accesses that cannot
>>>> be protected by a single array protection, or multiple accesses in a
>>>> row. To protect these, a nospec barrier is placed between the actual
>>>> range check and the access via the block_speculation macro.
>>>>
>>>> As different versions of grant tables use structures of different size,
>>>> and the status is encoded in an array for version 2, speculative
>>>> execution might touch zero-initialized structures of version 2 while
>>>> the table is actually using version 1.
>>> Why zero-initialized? Did I still not succeed demonstrating to you
>>> that speculation along a v2 path can actually overrun v1 arrays,
>>> not just access parts which may still be zero-initialized?
>> I believe a speculative v2 access can touch data that has been written
>> by valid v1 accesses before, zero initialized data, or touch the NULL
>> page. Given the macros for the access I do not believe that a v2 access
>> can touch a page that is located behind a page holding valid v1 data.
> I've given examples before of how I see this to be possible. Would
> you mind going back to one of the instances, and explaining to me
> how you do _not_ see any room for an overrun there? Having
> given examples, I simply don't know how else I can explain this to
> you without knowing at what specific part of the explanation we
> diverge. (And no, I'm not excluding that I'm making up an issue
> where there is none.)
What we want to rule out is that the table actually uses version 1,
while speculation uses version 2 paths, right? I assume you are
referring to this example from your earlier email.

On 1/29/19 16:11, Jan Beulich wrote:
> Let's look at an example: gref 256 points into the middle of
> the first page when using v1 calculations, but at the start
> of the second page when using v2 calculations. Hence, if the
> maximum number of grant frames was 1, we'd overrun the
> array, consisting of just a single element (256 is valid as a
> v1 gref in that case, but just out of bounds as a v2 one).

From how I read your example and my explanation, the key difference is
in the size of the shared_raw array. In case gref 256 is a valid v1
handle, the shared_raw array has space for at least 256 entries, as
shared_raw was allocated for the number of requested entries. The access
to shared_raw is controlled with the macro shared_entry_v2:

#define SHGNT_PER_PAGE_V2 (PAGE_SIZE / sizeof(grant_entry_v2_t))
#define shared_entry_v2(t, e) \
    ((t)->shared_v2[(e)/SHGNT_PER_PAGE_V2][(e)%SHGNT_PER_PAGE_V2])

Since the first index into the shared_v2 array is e/SHGNT_PER_PAGE_V2,
it has to be less than the size of that array. Hence, shared_raw will
not be overrun (neither for version 1 nor version 2). However, this
division might select an element of shared_raw that has not been
initialized by version 1 accesses before. As shared_raw is zero
initialized right after allocation, such an access might then hit the
NULL page.

The second index in the macro can only reach within a single page, as
the value e is reduced modulo the number of entries per page for the
respective version (the version 1 macro uses the corresponding value for
the modulo operation). Either this refers to the NULL page, or it refers
to a page that has been (partially) initialized by version 1. I do not
see how an out-of-bounds access would be possible there.

>>>> @@ -963,9 +965,13 @@ map_grant_ref(
>>>>          PIN_FAIL(unlock_out, GNTST_bad_gntref, "Bad ref %#x for d%d\n",
>>>>                   op->ref, rgt->domain->domain_id);
>>>>  
>>>> +    /* Make sure the above check is not bypassed speculatively */
>>>> +    block_speculation();
>>>> +
>>>>      act = active_entry_acquire(rgt, op->ref);
>>>>      shah = shared_entry_header(rgt, op->ref);
>>>> -    status = rgt->gt_version == 1 ? &shah->flags : &status_entry(rgt, op->ref);
>>>> +    status = evaluate_nospec(rgt->gt_version == 1) ? &shah->flags
>>>> +                                                 : &status_entry(rgt, op->ref);
>>> Did you consider folding the two pairs of fences you emit into
>>> one? Moving up the assignment to status ought to achieve this,
>>> as then the block_speculation() could be dropped afaict.
>>>
>>> Then again you don't alter shared_entry_header(). If there's
>>> a reason for you not having done so, then a second fence
>>> here is needed in any event.
>> Instead of the block_speculation() macro, I can also protect the op->ref
>> usage before evaluate_nospec via the array_index_nospec function.
> That's an option (as before), but doesn't help with shared_entry_header()'s
> evaluation of gt_version.
That check would have to be protected separately, locally in that
function, similar to the nr_grant_entries function.
>
>>> What about the version check in nr_grant_entries()? It appears
>>> to me as if at least its use in grant_map_exists() (which simply is
>>> the first one I've found) is problematic without an adjustment.
>>> Even worse, ...
>>>
>>>> @@ -1321,7 +1327,8 @@ unmap_common(
>>>>          goto unlock_out;
>>>>      }
>>>>  
>>>> -    act = active_entry_acquire(rgt, op->ref);
>>>> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
>>>> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
>>>> +                                                       nr_grant_entries(rgt)));
>>> ... you add a use e.g. here to _guard_ against speculation.
>> The adjustment you propose is to exchange the switch statement in
>> nr_grant_entries with a if( evaluate_nospec( gt->gt_version == 1 ), so
>> that the returned values are not speculated?
> At this point I'm not proposing a particular solution. I'm just
> putting on the table an issue left un-addressed. I certainly
> wouldn't welcome converting the switch() to an if(), even if
> right now there's no v3 on the horizon. (It's actually quite
> the inverse: If someone came and submitted a patch to change
> the various if()-s on gt_version to switch()-es, I'd welcome this.)
I am happy to add block_speculation() macros into each case of the
switch statement.
>> Already before this
>> modification the function is called and not inlined.
> How does this matter when considering speculation?
Likely not at all.
>
>> Do you want me to
>> cache the value in functions that call this method regularly to avoid
>> the penalty of the introduced lfence for each call?
> That would go back to the question of what good it does to
> latch value into a local variable when you don't know whether
> the compiler will put that variable in a register or in memory.
> IOW I'm afraid that to be on the safe side there's no way
> around the repeated LFENCEs.
The difference here would be that once the value is stored into a local
variable (no matter whether that ends up in memory or in a register) and
an lfence has been executed, this value can be trusted and does not have
to be checked again, as it is no longer guest controlled.
>
>>> And what about _set_status(), unmap_common_complete(),
>>> gnttab_grow_table(), gnttab_setup_table(),
>>> release_grant_for_copy(), the 2nd one in acquire_grant_for_copy(),
>>> several ones in gnttab_set_version(), gnttab_release_mappings(),
>>> the 3rd one in mem_sharing_gref_to_gfn(), gnttab_map_frame(),
>>> and gnttab_get_status_frame()?
>> Protecting the function itself should allow to not modify the
>> speculation guards in these functions. I would have to check each of
>> them, whether the guest actually has control, and whether it makes sense
>> to introduce a _nospec variant of the nr_grant_entries function to not
>> penalize everywhere. Do you have an opinion on this?
> As per above, yes, I think the only sufficiently reliable way is
> to modify nr_grant_entries() itself. The question is how to
> correctly do this without replacing the switch() there, the more
> that the other change of yours has deliberately forced the
> compiler into using rows of compares instead of jump tables (not
> that I'd expect the compiler to have used a jump table here, i.e.
> the remark is just wrt the general case).

As explained above, using the block_speculation() macro in each case
statement should result in code similar to converting the switch into a
chain of if statements protected by evaluate_nospec macros.

Best,
Norbert






* Re: [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-15 11:46                                 ` Jan Beulich
@ 2019-02-18 14:47                                   ` Norbert Manthey
  2019-02-18 15:56                                     ` Jan Beulich
  0 siblings, 1 reply; 150+ messages in thread
From: Norbert Manthey @ 2019-02-18 14:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/15/19 12:46, Jan Beulich wrote:
>>>> On 15.02.19 at 11:50, <nmanthey@amazon.de> wrote:
>> On 2/15/19 09:55, Jan Beulich wrote:
>>>>>> On 15.02.19 at 09:05, <nmanthey@amazon.de> wrote:
>>>> On 2/12/19 15:14, Jan Beulich wrote:
>>>>>>>> On 12.02.19 at 15:05, <nmanthey@amazon.de> wrote:
>>>>>> On 2/12/19 14:25, Jan Beulich wrote:
>>>>>>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>>>>>>> @@ -4104,6 +4108,12 @@ static int hvmop_set_param(
>>>>>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>>>>>          return -EINVAL;
>>>>>>>>  
>>>>>>>> +    /*
>>>>>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>>>>>> +     * speculative execution.
>>>>>>>> +     */
>>>>>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>>>>>> +
>>>>>>>>      d = rcu_lock_domain_by_any_id(a.domid);
>>>>>>>>      if ( d == NULL )
>>>>>>>>          return -ESRCH;
>>>>>>>> @@ -4370,6 +4380,12 @@ static int hvmop_get_param(
>>>>>>>>      if ( a.index >= HVM_NR_PARAMS )
>>>>>>>>          return -EINVAL;
>>>>>>>>  
>>>>>>>> +    /*
>>>>>>>> +     * Make sure the guest controlled value a.index is bounded even during
>>>>>>>> +     * speculative execution.
>>>>>>>> +     */
>>>>>>>> +    a.index = array_index_nospec(a.index, HVM_NR_PARAMS);
>>>>>>> ... the usefulness of these two. To make forward progress it may
>>>>>>> be worthwhile to split off these two changes into a separate patch.
>>>>>>> If you're fine with this, I could strip these two before committing,
>>>>>>> in which case the remaining change is
>>>>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>>>> Taking apart the commit is fine with me. I will submit a follow up
>>>>>> change that does not update the values but fixes the reads.
>>>>> As pointed out during the v5 discussion, I'm unconvinced that if
>>>>> you do so the compiler can't re-introduce the issue via CSE. I'd
>>>>> really like a reliable solution to be determined first.
>>>> I cannot give a guarantee what future compilers might do. Furthermore, I
>>>> do not want to wait until all/most compilers ship with such a
>>>> controllable guarantee.
>>> Guarantee? Future compilers are (hopefully) going to get better at
>>> optimizing, and hence are (again hopefully) going to find more
>>> opportunities for CSE. So the problem is going to get worse rather
>>> than better, and the changes you're proposing to re-instate are
>>> therefore more like false promises.
>> I do not want to dive into compilers future here. I would like to fix
>> the issue for todays compilers now and not wait until compilers evolved
>> one way or another. For this patch, the relevant information is whether
>> it should go in like this, or whether you want me to protect all the
>> reads instead. Is there more data I shall provide to help make this
>> decision?
> I understand that you're not happy with what I've said, and you're
> unlikely to become any happier with what I'll add. But please
> understand that _if_ we make any changes to address issues with
> speculation, the goal has to be that we don't have to come back
> an re-investigate after every new compiler release.
>
> Even beyond that - if, as you say, we'd limit ourselves to current
> compilers, did you check that all of them at any optimization level
> or with any other flags passed which may affect code generation
> produce non-vulnerable code? And in particular considering the
> case here never recognize CSE potential where we would like them
> not to?
>
> A code change is, imo, not even worthwhile considering to be put
> in if it is solely based on the observations made with a limited set
> of compilers and/or options. This might indeed help you, if you
> care only about one specific environment. But by putting this in
> (and perhaps even backporting it) we're sort of stating that the
> issue is under control (to the best of our abilities, and for the given
> area of code). For everyone.
I do not see how a fix for problems like the one discussed could enter
the code base given the above conditions. However, for this very
specific fix, there fortunately is a comparison against a constant, and
many instructions execute before the potential speculative out-of-bounds
access could happen, so not fixing the two accesses above is fine for
me. While I cannot guarantee that it is not exploitable, we did not
manage to come up with a PoC for these two places with the effort we put
into this.
> So, to answer your question: From what we know, we simply
> can't take a decision, at least not between the two proposed
> variants of how to change the code. If there was a variant that
> firmly worked, then there would not even be a need for any
> discussion. And again from what we know, there is one
> requirement that needs to be fulfilled for a change to be
> considered "firmly working": The index needs to be in a register.
> There must not be a way for the compiler to undermine this,
> be it by CSE or any other means.
>
> Considering changes done elsewhere, of course this may be
> taken with a grain of salt. In other places we also expect the
> compiler to not emit unreasonable code (e.g. needlessly
> spilling registers to memory just to then reload them). But
> while that's (imo) a fine expectation to have when an array
> index is used just once, it is unavoidably more complicated in
> the case here as well as in the grant table one.

Unless you outline a path forward to fix the above two gadgets, I will
not include the above hunks in the next version of the series.

Best,
Norbert






* Re: [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: block speculative out-of-bound accesses
  2019-02-18 14:47                                   ` Norbert Manthey
@ 2019-02-18 15:56                                     ` Jan Beulich
  0 siblings, 0 replies; 150+ messages in thread
From: Jan Beulich @ 2019-02-18 15:56 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 18.02.19 at 15:47, <nmanthey@amazon.de> wrote:
> On 2/15/19 12:46, Jan Beulich wrote:
>> A code change is, imo, not even worthwhile considering to be put
>> in if it is solely based on the observations made with a limited set
>> of compilers and/or options. This might indeed help you, if you
>> care only about one specific environment. But by putting this in
>> (and perhaps even backporting it) we're sort of stating that the
>> issue is under control (to the best of our abilities, and for the given
>> area of code). For everyone.
> I do not see how a fix for problems like the discussed one could enter
> the code base given the above conditions.

Well, on one hand I can understand your frustration. Otoh the
fundamental thing here is that "fix" to me means something that
actually fixes an issue independent of "weather conditions". I'd
at best call it a workaround here, yet even then I question its
usefulness given the limitations.

But please don't forget - I'm not the only one who can potentially
approve of changes which are proposed only in the hope that
they may help, without any guarantees. If other maintainers
think we should take such changes, I won't veto them going in
as long as it is made crystal clear that the same underlying issue
may re-surface at any time, for code that was supposedly "fixed"
already. It's just that I'm not going to ack anything like this myself.

> However, for this very
> specific fix, there fortunately is a comparison wrt a constant, and
> there are many instructions until the potential speculative out-of-bound
> access might happen, so that not fixing the two above access is fine for
> me. While I cannot guarantee that it is not possible, we did not manage
> to come up with a PoC for these two places with the effort we put into this.

Okay, thanks for letting us know.

>> So, to answer your question: From what we know, we simply
>> can't take a decision, at least not between the two proposed
>> variants of how to change the code. If there was a variant that
>> firmly worked, then there would not even be a need for any
>> discussion. And again from what we know, there is one
>> requirement that needs to be fulfilled for a change to be
>> considered "firmly working": The index needs to be in a register.
>> There must not be a way for the compiler to undermine this,
>> be it by CSE or any other means.
>>
>> Considering changes done elsewhere, of course this may be
>> taken with a grain of salt. In other places we also expect the
>> compiler to not emit unreasonable code (e.g. needlessly
>> spilling registers to memory just to then reload them). But
>> while that's (imo) a fine expectation to have when an array
>> index is used just once, it is unavoidably more complicated in
>> the case here as well as in the grant table one.
> 
> Unless you outline a path forward to fix the above two gadgets, I will
> not include the above hunks in the next version of the series.

I would be more than happy to outline a path, but I simply see
none which would provide guarantees.

Jan




* Re: [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-18 13:49                           ` Norbert Manthey
@ 2019-02-18 16:08                             ` Jan Beulich
  2019-02-19 21:47                               ` Norbert Manthey
  0 siblings, 1 reply; 150+ messages in thread
From: Jan Beulich @ 2019-02-18 16:08 UTC (permalink / raw)
  To: nmanthey
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

>>> On 18.02.19 at 14:49, <nmanthey@amazon.de> wrote:
> On 2/15/19 11:34, Jan Beulich wrote:
>>>>> On 15.02.19 at 10:55, <nmanthey@amazon.de> wrote:
>>> On 2/13/19 12:50, Jan Beulich wrote:
>>>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>>>> Guests can issue grant table operations and provide guest controlled
>>>>> data to them. This data is also used for memory loads. To avoid
>>>>> speculative out-of-bound accesses, we use the array_index_nospec macro
>>>>> where applicable. However, there are also memory accesses that cannot
>>>>> be protected by a single array protection, or multiple accesses in a
>>>>> row. To protect these, a nospec barrier is placed between the actual
>>>>> range check and the access via the block_speculation macro.
>>>>>
>>>>> As different versions of grant tables use structures of different size,
>>>>> and the status is encoded in an array for version 2, speculative
>>>>> execution might touch zero-initialized structures of version 2 while
>>>>> the table is actually using version 1.
>>>> Why zero-initialized? Did I still not succeed demonstrating to you
>>>> that speculation along a v2 path can actually overrun v1 arrays,
>>>> not just access parts which may still be zero-initialized?
>>> I believe a speculative v2 access can touch data that has been written
>>> by valid v1 accesses before, zero initialized data, or touch the NULL
>>> page. Given the macros for the access I do not believe that a v2 access
>>> can touch a page that is located behind a page holding valid v1 data.
>> I've given examples before of how I see this to be possible. Would
>> you mind going back to one of the instances, and explaining to me
>> how you do _not_ see any room for an overrun there? Having
>> given examples, I simply don't know how else I can explain this to
>> you without knowing at what specific part of the explanation we
>> diverge. (And no, I'm not excluding that I'm making up an issue
>> where there is none.)
> What we want to rule out is that the table actually uses version 1,
> while speculation might use version 2, right? I assume you refer to
> this example from your earlier email.
> 
> On 1/29/19 16:11, Jan Beulich wrote:
>> Let's look at an example: gref 256 points into the middle of
>> the first page when using v1 calculations, but at the start
>> of the second page when using v2 calculations. Hence, if the
>> maximum number of grant frames was 1, we'd overrun the
>> array, consisting of just a single element (256 is valid as a
>> v1 gref in that case, but just out of bounds as a v2 one).
> 
> From how I read your example and my explanation, the key difference is
> in the size of the shared_raw array. In case gref 256 is a valid v1
> handle, then the shared_raw array has space for at least 256 entries, as
> shared_raw was allocated for the number of requested entries. The access
> to shared_raw is controlled with the macro shared_entry_v2:
>  222 #define SHGNT_PER_PAGE_V2 (PAGE_SIZE / sizeof(grant_entry_v2_t))
>  223 #define shared_entry_v2(t, e) \
>  224     ((t)->shared_v2[(e)/SHGNT_PER_PAGE_V2][(e)%SHGNT_PER_PAGE_V2])
> Since the direct access to the shared_v2 array depends on the
> SHGNT_PER_PAGE_V2 value, this has to be less than the size of that
> array. Hence, shared_raw will not be overrun (neither for version 1 nor
> version 2). However, this division might result in accessing an element
> of shared_raw that has not been initialized by version1 before. However,
> right after allocation, shared_raw is zero initialized. Hence, this
> might result in an access of the NULL page.

The question is: How much of shared_raw[] will be zero-initialized?
The example I've given uses relatively small grant reference values,
so for the purpose here let's assume gt->max_grant_frames is 1.
In this case shared_raw[] is exactly one entry in size. Hence the
speculative access you describe will not necessarily access the NULL
page.

Obviously the same issue exists with higher limits and higher grant
reference numbers.

>>>>> @@ -1321,7 +1327,8 @@ unmap_common(
>>>>>          goto unlock_out;
>>>>>      }
>>>>>  
>>>>> -    act = active_entry_acquire(rgt, op->ref);
>>>>> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
>>>>> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
>>>>> +                                                       nr_grant_entries(rgt)));
>>>> ... you add a use e.g. here to _guard_ against speculation.
>>> The adjustment you propose is to exchange the switch statement in
>>> nr_grant_entries with a if( evaluate_nospec( gt->gt_version == 1 ), so
>>> that the returned values are not speculated?
>> At this point I'm not proposing a particular solution. I'm just
>> putting on the table an issue left un-addressed. I certainly
>> wouldn't welcome converting the switch() to an if(), even if
>> right now there's no v3 on the horizon. (It's actually quite
>> the inverse: If someone came and submitted a patch to change
>> the various if()-s on gt_version to switch()-es, I'd welcome this.)
> I am happy to add block_speculation() macros into each case of the
> switch statement.

Ugly, but perhaps the only possible solution at this point.

>>> Do you want me to
>>> cache the value in functions that call this method regularly to avoid
>>> the penalty of the introduced lfence for each call?
>> That would go back to the question of what good it does to
>> latch value into a local variable when you don't know whether
>> the compiler will put that variable in a register or in memory.
>> IOW I'm afraid that to be on the safe side there's no way
>> around the repeated LFENCEs.
> The difference here would be that in case the value is stored into a
> local variable (independently of memory or register) and an lfence was
> executed, this value can be trusted and does not have to be checked
> again, as it's no longer guest controlled.

Ah, yes, you're right (it just wasn't clear to me that you implied
adding a fence together with the caching of the value). So perhaps
that's then also the way to go for the hunks under discussion in
patch 3?

Jan



* Re: [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses
  2019-02-18 16:08                             ` Jan Beulich
@ 2019-02-19 21:47                               ` Norbert Manthey
  0 siblings, 0 replies; 150+ messages in thread
From: Norbert Manthey @ 2019-02-19 21:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Tim Deegan, Stefano Stabellini, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Dario Faggioli, Martin Pohlack, wipawel, Julien Grall,
	David Woodhouse, Martin Mazein(amazein),
	xen-devel, Julian Stecklina, Bjoern Doebel

On 2/18/19 17:08, Jan Beulich wrote:
>>>> On 18.02.19 at 14:49, <nmanthey@amazon.de> wrote:
>> On 2/15/19 11:34, Jan Beulich wrote:
>>>>>> On 15.02.19 at 10:55, <nmanthey@amazon.de> wrote:
>>>> On 2/13/19 12:50, Jan Beulich wrote:
>>>>>>>> On 08.02.19 at 14:44, <nmanthey@amazon.de> wrote:
>>>>>> Guests can issue grant table operations and provide guest controlled
>>>>>> data to them. This data is also used for memory loads. To avoid
>>>>>> speculative out-of-bound accesses, we use the array_index_nospec macro
>>>>>> where applicable. However, there are also memory accesses that cannot
>>>>>> be protected by a single array protection, or multiple accesses in a
>>>>>> row. To protect these, a nospec barrier is placed between the actual
>>>>>> range check and the access via the block_speculation macro.
>>>>>>
>>>>>> As different versions of grant tables use structures of different size,
>>>>>> and the status is encoded in an array for version 2, speculative
>>>>>> execution might touch zero-initialized structures of version 2 while
>>>>>> the table is actually using version 1.
>>>>> Why zero-initialized? Did I still not succeed demonstrating to you
>>>>> that speculation along a v2 path can actually overrun v1 arrays,
>>>>> not just access parts which may still be zero-initialized?
>>>> I believe a speculative v2 access can touch data that has been written
>>>> by valid v1 accesses before, zero initialized data, or touch the NULL
>>>> page. Given the macros for the access I do not believe that a v2 access
>>>> can touch a page that is located behind a page holding valid v1 data.
>>> I've given examples before of how I see this to be possible. Would
>>> you mind going back to one of the instances, and explaining to me
>>> how you do _not_ see any room for an overrun there? Having
>>> given examples, I simply don't know how else I can explain this to
>>> you without knowing at what specific part of the explanation we
>>> diverge. (And no, I'm not excluding that I'm making up an issue
>>> where there is none.)
>> What we want to rule out is that the table actually uses version 1,
>> while speculation might use version 2, right? I assume you refer to
>> this example from your earlier email.
>>
>> On 1/29/19 16:11, Jan Beulich wrote:
>>> Let's look at an example: gref 256 points into the middle of
>>> the first page when using v1 calculations, but at the start
>>> of the second page when using v2 calculations. Hence, if the
>>> maximum number of grant frames was 1, we'd overrun the
>>> array, consisting of just a single element (256 is valid as a
>>> v1 gref in that case, but just out of bounds as a v2 one).
>> From how I read your example and my explanation, the key difference is
>> in the size of the shared_raw array. In case gref 256 is a valid v1
>> handle, then the shared_raw array has space for at least 256 entries, as
>> shared_raw was allocated for the number of requested entries. The access
>> to shared_raw is controlled with the macro shared_entry_v2:
>>  222 #define SHGNT_PER_PAGE_V2 (PAGE_SIZE / sizeof(grant_entry_v2_t))
>>  223 #define shared_entry_v2(t, e) \
>>  224     ((t)->shared_v2[(e)/SHGNT_PER_PAGE_V2][(e)%SHGNT_PER_PAGE_V2])
>> Since the direct access to the shared_v2 array depends on the
>> SHGNT_PER_PAGE_V2 value, this has to be less than the size of that
>> array. Hence, shared_raw will not be overrun (neither for version 1 nor
>> version 2). However, this division might result in accessing an element
>> of shared_raw that has not been initialized by version1 before. However,
>> right after allocation, shared_raw is zero initialized. Hence, this
>> might result in an access of the NULL page.
> The question is: How much of shared_raw[] will be zero-initialized?
> The example I've given uses relatively small grant reference values,
> so for the purpose here let's assume gt->max_grant_frames is 1.
> In this case shared_raw[] is exactly one entry in size. Hence the
> speculative access you describe will not necessarily access the NULL
> page.
>
> Obviously the same issue exists with higher limits and higher grant
> reference numbers.
The explanation for this problem is really simple: I mixed up grant
frames and grant entries. I agree that shared_raw can be accessed
out-of-bounds and should be protected. I will adapt the commit message
accordingly and revise the modifications I added to the code base.
>
>>>>>> @@ -1321,7 +1327,8 @@ unmap_common(
>>>>>>          goto unlock_out;
>>>>>>      }
>>>>>>  
>>>>>> -    act = active_entry_acquire(rgt, op->ref);
>>>>>> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
>>>>>> +    act = active_entry_acquire(rgt, array_index_nospec(op->ref,
>>>>>> +                                                       nr_grant_entries(rgt)));
>>>>> ... you add a use e.g. here to _guard_ against speculation.
>>>> The adjustment you propose is to exchange the switch statement in
>>>> nr_grant_entries with a if( evaluate_nospec( gt->gt_version == 1 ), so
>>>> that the returned values are not speculated?
>>> At this point I'm not proposing a particular solution. I'm just
>>> putting on the table an issue left un-addressed. I certainly
>>> wouldn't welcome converting the switch() to an if(), even if
>>> right now there's no v3 on the horizon. (It's actually quite
>>> the inverse: If someone came and submitted a patch to change
>>> the various if()-s on gt_version to switch()-es, I'd welcome this.)
>> I am happy to add block_speculation() macros into each case of the
>> switch statement.
> Ugly, but perhaps the only possible solution at this point.
>
>>>> Do you want me to
>>>> cache the value in functions that call this method regularly to avoid
>>>> the penalty of the introduced lfence for each call?
>>> That would go back to the question of what good it does to
>>> latch value into a local variable when you don't know whether
>>> the compiler will put that variable in a register or in memory.
>>> IOW I'm afraid that to be on the safe side there's no way
>>> around the repeated LFENCEs.
>> The difference here would be that in case the value is stored into a
>> local variable (independently of memory or register) and an lfence was
>> executed, this value can be trusted and does not have to be checked
>> again, as it's no longer guest controlled.
> Ah, yes, you're right (it just wasn't clear to me that you implied
> adding a fence together with the caching of the value). So perhaps
> that's then also the way to go for the hunks under discussion in
> patch 3?

Well, yes. That should effectively bound the a.index values in the other
hunks in patch 3 as well. I will adapt that patch accordingly. So far, I
had stepped back from this solution, as I want to avoid using the lfence
instruction as much as possible.

Best,
Norbert





Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Christian Schlaeger, Ralf Herbrich
Ust-ID: DE 289 237 879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B



end of thread, other threads:[~2019-02-19 21:47 UTC | newest]

Thread overview: 150+ messages
2019-01-23 11:51 SpectreV1+L1TF Patch Series Norbert Manthey
2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 01/11] is_control_domain: block speculation Norbert Manthey
2019-01-23 13:07   ` Jan Beulich
2019-01-23 13:20     ` Julien Grall
2019-01-23 13:40       ` Jan Beulich
2019-01-23 13:20   ` Jan Beulich
2019-01-24 12:07     ` Norbert Manthey
2019-01-24 20:33       ` Andrew Cooper
2019-01-25  9:19         ` Jan Beulich
2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 02/11] is_hvm/pv_domain: " Norbert Manthey
2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 03/11] config: introduce L1TF_LFENCE option Norbert Manthey
2019-01-23 13:18   ` Jan Beulich
2019-01-24 12:11     ` Norbert Manthey
2019-01-23 13:24   ` Julien Grall
2019-01-23 13:39     ` Jan Beulich
2019-01-23 13:44       ` Julien Grall
2019-01-23 14:45         ` Jan Beulich
2019-01-24 12:21           ` Norbert Manthey
2019-01-24 21:29   ` Andrew Cooper
2019-01-25 10:14     ` Jan Beulich
2019-01-25 10:50       ` Norbert Manthey
2019-01-25 13:09         ` Jan Beulich
2019-01-27 20:28           ` Norbert Manthey
2019-01-28  7:35             ` Jan Beulich
2019-01-28  7:56               ` Norbert Manthey
2019-01-28  8:24                 ` Jan Beulich
2019-01-28 10:07                   ` Norbert Manthey
2019-01-31 22:39       ` Andrew Cooper
2019-02-01  8:02         ` Jan Beulich
2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 04/11] x86/hvm: block speculative out-of-bound accesses Norbert Manthey
2019-01-31 19:31   ` Andrew Cooper
2019-02-01  9:06     ` Jan Beulich
2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 05/11] common/grant_table: " Norbert Manthey
2019-01-23 13:37   ` Jan Beulich
2019-01-28 14:45     ` Norbert Manthey
2019-01-28 15:09       ` Jan Beulich
2019-01-29  8:33         ` Norbert Manthey
2019-01-29  9:46           ` Jan Beulich
2019-01-29 13:47             ` Norbert Manthey
2019-01-29 15:11               ` Jan Beulich
2019-01-30  8:06                 ` Norbert Manthey
2019-01-30 11:35                   ` Jan Beulich
2019-01-23 11:51 ` [PATCH SpectreV1+L1TF v4 06/11] common/memory: " Norbert Manthey
2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 07/11] nospec: enable lfence on Intel Norbert Manthey
2019-01-24 22:29   ` Andrew Cooper
2019-01-27 20:09     ` Norbert Manthey
2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 08/11] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
2019-01-24 16:56   ` Jan Beulich
2019-01-24 19:50     ` Norbert Manthey
2019-01-25  9:23       ` Jan Beulich
2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 09/11] x86/vioapic: " Norbert Manthey
2019-01-25 16:34   ` Jan Beulich
2019-01-28 11:03     ` Norbert Manthey
2019-01-28 11:12       ` Jan Beulich
2019-01-28 12:20         ` Norbert Manthey
2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 10/11] x86/hvm/hpet: " Norbert Manthey
2019-01-25 16:50   ` Jan Beulich
2019-01-23 11:57 ` [PATCH SpectreV1+L1TF v4 11/11] x86/CPUID: " Norbert Manthey
2019-01-24 21:05 ` SpectreV1+L1TF Patch Series Andrew Cooper
2019-01-28 13:56   ` Norbert Manthey
2019-01-28  8:28 ` Jan Beulich
     [not found] ` <5C4EBD1A0200007800211954@suse.com>
2019-01-28  8:47   ` Juergen Gross
2019-01-28  9:56     ` Jan Beulich
     [not found]       ` <9C03B9BA0200004637554D14@prv1-mh.provo.novell.com>
     [not found]         ` <00FAA7AF020000F8B1E090C7@prv1-mh.provo.novell.com>
     [not found]           ` <00FAE7AF020000F8B1E090C7@prv1-mh.provo.novell.com>
2019-01-31 15:05             ` [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses Jan Beulich
2019-02-01 13:45               ` Norbert Manthey
2019-02-01 14:08                 ` Jan Beulich
2019-02-05 13:42                   ` Norbert Manthey
     [not found]           ` <00FA27AF020000F8B1E090C7@prv1-mh.provo.novell.com>
2019-01-31 16:05             ` [PATCH SpectreV1+L1TF v5 2/9] x86/vioapic: " Jan Beulich
2019-02-01 13:54               ` Norbert Manthey
     [not found]           ` <00F867AF020000F8B1E090C7@prv1-mh.provo.novell.com>
2019-01-31 16:19             ` [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: " Jan Beulich
2019-01-31 20:02               ` Andrew Cooper
2019-02-01  8:23                 ` Jan Beulich
2019-02-01 14:06                   ` Norbert Manthey
2019-02-01 14:31                     ` Jan Beulich
2019-02-01 14:05               ` Norbert Manthey
     [not found]           ` <0101A7AF020000F8B1E090C7@prv1-mh.provo.novell.com>
2019-01-31 16:35             ` [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier Jan Beulich
2019-02-05 14:23               ` Norbert Manthey
2019-02-05 14:43                 ` Jan Beulich
2019-02-06 13:02                   ` Norbert Manthey
2019-02-06 13:20                     ` Jan Beulich
     [not found]           ` <0101E7AF020000F8B1E090C7@prv1-mh.provo.novell.com>
2019-01-31 17:05             ` [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec Jan Beulich
2019-02-05 14:32               ` Norbert Manthey
2019-02-08 13:44                 ` SpectreV1+L1TF Patch Series v6 Norbert Manthey
2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 2/9] x86/vioapic: " Norbert Manthey
2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: " Norbert Manthey
2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 4/9] spec: add l1tf-barrier Norbert Manthey
2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec Norbert Manthey
2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 6/9] is_control_domain: block speculation Norbert Manthey
2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 7/9] is_hvm/pv_domain: " Norbert Manthey
2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
2019-02-08 13:44                   ` [PATCH SpectreV1+L1TF v6 9/9] common/memory: " Norbert Manthey
2019-02-08 14:32                   ` SpectreV1+L1TF Patch Series v6 Julien Grall
     [not found]               ` <A18FF6C80200006BB1E090C7@prv1-mh.provo.novell.com>
     [not found]                 ` <01CCAAAF02000039B1E090C7@prv1-mh.provo.novell.com>
     [not found]                   ` <01CCEAAF02000039B1E090C7@prv1-mh.provo.novell.com>
2019-02-12 13:08                     ` [PATCH SpectreV1+L1TF v6 1/9] xen/evtchn: block speculative out-of-bound accesses Jan Beulich
2019-02-14 13:10                       ` Norbert Manthey
2019-02-14 13:20                         ` Jan Beulich
     [not found]                   ` <01CC2AAF02000039B1E090C7@prv1-mh.provo.novell.com>
2019-02-12 13:16                     ` [PATCH SpectreV1+L1TF v6 2/9] x86/vioapic: " Jan Beulich
2019-02-14 13:16                       ` Norbert Manthey
     [not found]                   ` <01CE6AAF02000039B1E090C7@prv1-mh.provo.novell.com>
2019-02-12 13:25                     ` [PATCH SpectreV1+L1TF v6 3/9] x86/hvm: " Jan Beulich
2019-02-12 14:05                       ` Norbert Manthey
2019-02-12 14:14                         ` Jan Beulich
2019-02-15  8:05                           ` Norbert Manthey
2019-02-15  8:55                             ` Jan Beulich
2019-02-15 10:50                               ` Norbert Manthey
2019-02-15 11:46                                 ` Jan Beulich
2019-02-18 14:47                                   ` Norbert Manthey
2019-02-18 15:56                                     ` Jan Beulich
     [not found]                   ` <01CFAAAF02000039B1E090C7@prv1-mh.provo.novell.com>
2019-02-12 13:44                     ` [PATCH SpectreV1+L1TF v6 4/9] spec: add l1tf-barrier Jan Beulich
2019-02-15  9:13                       ` Norbert Manthey
     [not found]                   ` <01CFEAAF02000039B1E090C7@prv1-mh.provo.novell.com>
2019-02-12 13:50                     ` [PATCH SpectreV1+L1TF v6 5/9] nospec: introduce evaluate_nospec Jan Beulich
2019-02-14 13:37                       ` Norbert Manthey
2019-02-12 14:12                     ` Jan Beulich
2019-02-14 13:42                       ` Norbert Manthey
     [not found]                   ` <01CF2AAF02000039B1E090C7@prv1-mh.provo.novell.com>
2019-02-12 14:11                     ` [PATCH SpectreV1+L1TF v6 6/9] is_control_domain: block speculation Jan Beulich
2019-02-14 13:45                       ` Norbert Manthey
     [not found]                   ` <23D9419E02000017B1E090C7@prv1-mh.provo.novell.com>
2019-02-12 14:31                     ` [PATCH SpectreV1+L1TF v6 9/9] common/memory: block speculative out-of-bound accesses Jan Beulich
2019-02-14 14:04                       ` Norbert Manthey
     [not found]                   ` <01CEAAAF02000039B1E090C7@prv1-mh.provo.novell.com>
2019-02-13 11:50                     ` [PATCH SpectreV1+L1TF v6 8/9] common/grant_table: " Jan Beulich
2019-02-15  9:55                       ` Norbert Manthey
2019-02-15 10:34                         ` Jan Beulich
2019-02-18 13:49                           ` Norbert Manthey
2019-02-18 16:08                             ` Jan Beulich
2019-02-19 21:47                               ` Norbert Manthey
     [not found]           ` <0104A7AF020000F8B1E090C7@prv1-mh.provo.novell.com>
2019-02-06 14:52             ` [PATCH SpectreV1+L1TF v5 " Jan Beulich
2019-02-06 15:06               ` Norbert Manthey
2019-02-06 15:53                 ` Jan Beulich
2019-02-07  9:50                   ` Norbert Manthey
2019-02-07 10:20                     ` Norbert Manthey
2019-02-07 14:00                       ` Jan Beulich
2019-02-07 16:20                         ` Norbert Manthey
     [not found]           ` <010527AF020000F8B1E090C7@prv1-mh.provo.novell.com>
2019-02-06 15:03             ` [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation Jan Beulich
2019-02-06 15:36               ` Norbert Manthey
2019-02-06 16:01                 ` Jan Beulich
2019-02-07 10:02                   ` Norbert Manthey
     [not found]           ` <20F3469E02000096B1E090C7@prv1-mh.provo.novell.com>
2019-02-06 15:25             ` [PATCH SpectreV1+L1TF v5 9/9] common/memory: block speculative out-of-bound accesses Jan Beulich
2019-02-06 15:39               ` Norbert Manthey
2019-02-06 16:08                 ` Jan Beulich
2019-02-07  7:20                   ` Norbert Manthey
2019-01-28 10:01 SpectreV1+L1TF Patch Series Juergen Gross
2019-01-29 14:43 ` SpectreV1+L1TF Patch Series v5 Norbert Manthey
2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 1/9] xen/evtchn: block speculative out-of-bound accesses Norbert Manthey
2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 2/9] x86/vioapic: " Norbert Manthey
2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 3/9] x86/hvm: " Norbert Manthey
2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 4/9] spec: add l1tf-barrier Norbert Manthey
2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 5/9] nospec: introduce evaluate_nospec Norbert Manthey
2019-02-08  9:20     ` Julien Grall
2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 6/9] is_control_domain: block speculation Norbert Manthey
2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 7/9] is_hvm/pv_domain: " Norbert Manthey
2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 8/9] common/grant_table: block speculative out-of-bound accesses Norbert Manthey
2019-01-29 14:43   ` [PATCH SpectreV1+L1TF v5 9/9] common/memory: " Norbert Manthey
