[PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests
@ 2018-08-21 15:37 Juergen Gross
  2018-08-21 15:37 ` [PATCH v2 1/2] x86/xen: don't write ptes directly in 32-bit PV guests Juergen Gross
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Juergen Gross @ 2018-08-21 15:37 UTC (permalink / raw)
  To: linux-kernel, xen-devel, x86
  Cc: boris.ostrovsky, hpa, tglx, mingo, Juergen Gross

While the hypervisor emulates plain writes to PTEs happily, this is
much slower than issuing a hypercall for PTE modifcations. And writing
a PTE via two 32-bit write instructions (especially when clearing the
PTE) will result in an intermediate L1TF vulnerable PTE.

Writes to PAE PTEs should always be done with 64-bit writes or via
hypercalls.

Juergen Gross (2):
  x86/xen: don't write ptes directly in 32-bit PV guests
  x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear

 arch/x86/include/asm/pgtable-3level.h | 7 +++----
 arch/x86/xen/mmu_pv.c                 | 7 +++----
 2 files changed, 6 insertions(+), 8 deletions(-)

-- 
2.13.7


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] x86/xen: don't write ptes directly in 32-bit PV guests
  2018-08-21 15:37 [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests Juergen Gross
@ 2018-08-21 15:37 ` Juergen Gross
  2018-08-21 15:37 ` [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear Juergen Gross
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Juergen Gross @ 2018-08-21 15:37 UTC (permalink / raw)
  To: linux-kernel, xen-devel, x86
  Cc: boris.ostrovsky, hpa, tglx, mingo, Juergen Gross

In some cases 32-bit PAE PV guests still write PTEs directly instead of
using hypercalls. This is especially bad when clearing a PTE as this is
done via 32-bit writes which will produce intermediate L1TF attackable
PTEs.

Change the code to use hypercalls instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 arch/x86/xen/mmu_pv.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 9e7012858420..9396b4d17064 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -434,14 +434,13 @@ static void xen_set_pud(pud_t *ptr, pud_t val)
 static void xen_set_pte_atomic(pte_t *ptep, pte_t pte)
 {
 	trace_xen_mmu_set_pte_atomic(ptep, pte);
-	set_64bit((u64 *)ptep, native_pte_val(pte));
+	__xen_set_pte(ptep, pte);
 }
 
 static void xen_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
 	trace_xen_mmu_pte_clear(mm, addr, ptep);
-	if (!xen_batched_set_pte(ptep, native_make_pte(0)))
-		native_pte_clear(mm, addr, ptep);
+	__xen_set_pte(ptep, native_make_pte(0));
 }
 
 static void xen_pmd_clear(pmd_t *pmdp)
@@ -1569,7 +1568,7 @@ static void __init xen_set_pte_init(pte_t *ptep, pte_t pte)
 		pte = __pte_ma(((pte_val_ma(*ptep) & _PAGE_RW) | ~_PAGE_RW) &
 			       pte_val_ma(pte));
 #endif
-	native_set_pte(ptep, pte);
+	__xen_set_pte(ptep, pte);
 }
 
 /* Early in boot, while setting up the initial pagetable, assume
-- 
2.13.7


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
  2018-08-21 15:37 [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests Juergen Gross
  2018-08-21 15:37 ` [PATCH v2 1/2] x86/xen: don't write ptes directly in 32-bit PV guests Juergen Gross
@ 2018-08-21 15:37 ` Juergen Gross
  2018-08-21 15:57   ` [Xen-devel] " Jan Beulich
  2018-08-26 10:25   ` Thomas Gleixner
  2018-08-27 16:03 ` [Xen-devel] [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests Jason Andryuk
  2018-08-27 19:25 ` Boris Ostrovsky
  3 siblings, 2 replies; 8+ messages in thread
From: Juergen Gross @ 2018-08-21 15:37 UTC (permalink / raw)
  To: linux-kernel, xen-devel, x86
  Cc: boris.ostrovsky, hpa, tglx, mingo, Juergen Gross

Using only 32-bit writes for the pte will result in an intermediate
L1TF vulnerable PTE. When running as a Xen PV guest this will at once
switch the guest to shadow mode resulting in a loss of performance.

Use arch_atomic64_xchg() instead which will perform the requested
operation atomically with all 64 bits.

Some performance considerations according to:

https://software.intel.com/sites/default/files/managed/ad/dc/Intel-Xeon-Scalable-Processor-throughput-latency.pdf

The main number should be the latency, as there is no tight loop around
native_ptep_get_and_clear().

"lock cmpxchg8b" has a latency of 20 cycles, while "lock xchg" (with a
memory operand) isn't mentioned in that document. "lock xadd" (with xadd
having 3 cycles less latency than xchg) has a latency of 11, so we can
assume a latency of 14 for "lock xchg".

Signed-off-by: Juergen Gross <jgross@suse.com>
---
In case adding about 6 cycles for native_ptep_get_and_clear() is believed
to be too bad I can modify the patch to add a paravirt function for that
purpose in order to add the overhead for Xen guests only (in fact the
overhead for Xen guests will be less, as only one instruction writing to
the PTE has to be emulated by the hypervisor).
---
 arch/x86/include/asm/pgtable-3level.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
index a564084c6141..f8b1ad2c3828 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_X86_PGTABLE_3LEVEL_H
 #define _ASM_X86_PGTABLE_3LEVEL_H
 
+#include <asm/atomic64_32.h>
+
 /*
  * Intel Physical Address Extension (PAE) Mode - three-level page
  * tables on PPro+ CPUs.
@@ -150,10 +152,7 @@ static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
 {
 	pte_t res;
 
-	/* xchg acts as a barrier before the setting of the high bits */
-	res.pte_low = xchg(&ptep->pte_low, 0);
-	res.pte_high = ptep->pte_high;
-	ptep->pte_high = 0;
+	res.pte = (pteval_t)arch_atomic64_xchg((atomic64_t *)ptep, 0);
 
 	return res;
 }
-- 
2.13.7


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
  2018-08-21 15:37 ` [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear Juergen Gross
@ 2018-08-21 15:57   ` Jan Beulich
  2018-08-26 10:25   ` Thomas Gleixner
  1 sibling, 0 replies; 8+ messages in thread
From: Jan Beulich @ 2018-08-21 15:57 UTC (permalink / raw)
  To: Juergen Gross
  Cc: the arch/x86 maintainers, tglx, xen-devel, Boris Ostrovsky,
	mingo, linux-kernel, hpa

>>> On 21.08.18 at 17:37, <jgross@suse.com> wrote:
> Using only 32-bit writes for the pte will result in an intermediate
> L1TF vulnerable PTE. When running as a Xen PV guest this will at once
> switch the guest to shadow mode resulting in a loss of performance.
> 
> Use arch_atomic64_xchg() instead which will perform the requested
> operation atomically with all 64 bits.
> 
> Some performance considerations according to:
> 
> https://software.intel.com/sites/default/files/managed/ad/dc/Intel-Xeon-Scal 
> able-Processor-throughput-latency.pdf
> 
> The main number should be the latency, as there is no tight loop around
> native_ptep_get_and_clear().
> 
> "lock cmpxchg8b" has a latency of 20 cycles, while "lock xchg" (with a
> memory operand) isn't mentioned in that document. "lock xadd" (with xadd
> having 3 cycles less latency than xchg) has a latency of 11, so we can
> assume a latency of 14 for "lock xchg".
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with one further remark:

> @@ -150,10 +152,7 @@ static inline pte_t native_ptep_get_and_clear(pte_t *ptep)
>  {
>  	pte_t res;
>  
> -	/* xchg acts as a barrier before the setting of the high bits */
> -	res.pte_low = xchg(&ptep->pte_low, 0);
> -	res.pte_high = ptep->pte_high;
> -	ptep->pte_high = 0;
> +	res.pte = (pteval_t)arch_atomic64_xchg((atomic64_t *)ptep, 0);

Is the cast on the return value really needed here?

Jan



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
  2018-08-21 15:37 ` [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear Juergen Gross
  2018-08-21 15:57   ` [Xen-devel] " Jan Beulich
@ 2018-08-26 10:25   ` Thomas Gleixner
  1 sibling, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2018-08-26 10:25 UTC (permalink / raw)
  To: Juergen Gross; +Cc: linux-kernel, xen-devel, x86, boris.ostrovsky, hpa, mingo

On Tue, 21 Aug 2018, Juergen Gross wrote:

> Using only 32-bit writes for the pte will result in an intermediate
> L1TF vulnerable PTE. When running as a Xen PV guest this will at once
> switch the guest to shadow mode resulting in a loss of performance.
> 
> Use arch_atomic64_xchg() instead which will perform the requested
> operation atomically with all 64 bits.
> 
> Some performance considerations according to:
> 
> https://software.intel.com/sites/default/files/managed/ad/dc/Intel-Xeon-Scalable-Processor-throughput-latency.pdf
> 
> The main number should be the latency, as there is no tight loop around
> native_ptep_get_and_clear().
> 
> "lock cmpxchg8b" has a latency of 20 cycles, while "lock xchg" (with a
> memory operand) isn't mentioned in that document. "lock xadd" (with xadd
> having 3 cycles less latency than xchg) has a latency of 11, so we can
> assume a latency of 14 for "lock xchg".
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests
  2018-08-21 15:37 [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests Juergen Gross
  2018-08-21 15:37 ` [PATCH v2 1/2] x86/xen: don't write ptes directly in 32-bit PV guests Juergen Gross
  2018-08-21 15:37 ` [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear Juergen Gross
@ 2018-08-27 16:03 ` Jason Andryuk
  2018-08-27 16:07   ` Jason Andryuk
  2018-08-27 19:25 ` Boris Ostrovsky
  3 siblings, 1 reply; 8+ messages in thread
From: Jason Andryuk @ 2018-08-27 16:03 UTC (permalink / raw)
  To: Juergen Gross
  Cc: open list, xen-devel, x86, Boris Ostrovsky, Ingo Molnar,
	Thomas Gleixner, H. Peter Anvin

On Tue, Aug 21, 2018 at 11:40 AM Juergen Gross <jgross@suse.com> wrote:
>
> While the hypervisor emulates plain writes to PTEs happily, this is
> much slower than issuing a hypercall for PTE modifcations. And writing
> a PTE via two 32-bit write instructions (especially when clearing the
> PTE) will result in an intermediate L1TF vulnerable PTE.
>
> Writes to PAE PTEs should always be done with 64-bit writes or via
> hypercalls.
>
> Juergen Gross (2):
>   x86/xen: don't write ptes directly in 32-bit PV guests
>   x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
>

I tested both patches on 4.14, changing patch 2 to atomic64_xchg since
arch_atomic64_xchg doesn't exist.

I haven't seen https://bugzilla.kernel.org/show_bug.cgi?id=198497
trigger since incorporating these patch. Without the patches, I would
have seen it trigger by now.  Also, I've confirmed Xen does not enable
page table shadowing.  For what it's worth, the PTEs that would
trigger Xen shadowing (0x8000'0002'0000'0000) are the same as those
that triggered bug 198497.  There was at least 1 non-Xen user affected
by 198497, but this at least seems to fix it for me.

Tested-by: Jason Andryuk <jandryuk@gmail.com>

Thank you.
Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Xen-devel] [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests
  2018-08-27 16:03 ` [Xen-devel] [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests Jason Andryuk
@ 2018-08-27 16:07   ` Jason Andryuk
  0 siblings, 0 replies; 8+ messages in thread
From: Jason Andryuk @ 2018-08-27 16:07 UTC (permalink / raw)
  To: Juergen Gross
  Cc: open list, xen-devel, x86, Boris Ostrovsky, Ingo Molnar,
	Thomas Gleixner, H. Peter Anvin

On Mon, Aug 27, 2018 at 12:03 PM Jason Andryuk <jandryuk@gmail.com> wrote:
>
> On Tue, Aug 21, 2018 at 11:40 AM Juergen Gross <jgross@suse.com> wrote:
> >
> > While the hypervisor emulates plain writes to PTEs happily, this is
> > much slower than issuing a hypercall for PTE modifcations. And writing
> > a PTE via two 32-bit write instructions (especially when clearing the
> > PTE) will result in an intermediate L1TF vulnerable PTE.
> >
> > Writes to PAE PTEs should always be done with 64-bit writes or via
> > hypercalls.
> >
> > Juergen Gross (2):
> >   x86/xen: don't write ptes directly in 32-bit PV guests
> >   x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
> >
>
> I tested both patches on 4.14, changing patch 2 to atomic64_xchg since
> arch_atomic64_xchg doesn't exist.
>
> I haven't seen https://bugzilla.kernel.org/show_bug.cgi?id=198497
> trigger since incorporating these patch. Without the patches, I would
> have seen it trigger by now.  Also, I've confirmed Xen does not enable
> page table shadowing.  For what it's worth, the PTEs that would
> trigger Xen shadowing (0x8000'0002'0000'0000) are the same as those
> that triggered bug 198497.  There was at least 1 non-Xen user affected
> by 198497, but this at least seems to fix it for me.
>
> Tested-by: Jason Andryuk <jandryuk@gmail.com>

Also, can these patches be Cc: stable@vger.kernel.org?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests
  2018-08-21 15:37 [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests Juergen Gross
                   ` (2 preceding siblings ...)
  2018-08-27 16:03 ` [Xen-devel] [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests Jason Andryuk
@ 2018-08-27 19:25 ` Boris Ostrovsky
  3 siblings, 0 replies; 8+ messages in thread
From: Boris Ostrovsky @ 2018-08-27 19:25 UTC (permalink / raw)
  To: Juergen Gross, linux-kernel, xen-devel, x86; +Cc: hpa, tglx, mingo, stable

On 08/21/2018 11:37 AM, Juergen Gross wrote:
> While the hypervisor emulates plain writes to PTEs happily, this is
> much slower than issuing a hypercall for PTE modifcations. And writing
> a PTE via two 32-bit write instructions (especially when clearing the
> PTE) will result in an intermediate L1TF vulnerable PTE.
>
> Writes to PAE PTEs should always be done with 64-bit writes or via
> hypercalls.
>
> Juergen Gross (2):
>   x86/xen: don't write ptes directly in 32-bit PV guests
>   x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
>
>  arch/x86/include/asm/pgtable-3level.h | 7 +++----
>  arch/x86/xen/mmu_pv.c                 | 7 +++----
>  2 files changed, 6 insertions(+), 8 deletions(-)
>


Applied to for-linus-19b.

(+stable.)

-boris

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-08-27 19:24 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-21 15:37 [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests Juergen Gross
2018-08-21 15:37 ` [PATCH v2 1/2] x86/xen: don't write ptes directly in 32-bit PV guests Juergen Gross
2018-08-21 15:37 ` [PATCH v2 2/2] x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear Juergen Gross
2018-08-21 15:57   ` [Xen-devel] " Jan Beulich
2018-08-26 10:25   ` Thomas Gleixner
2018-08-27 16:03 ` [Xen-devel] [PATCH v2 0/2] x86/xen: avoid 32-bit writes to PTEs in PV PAE guests Jason Andryuk
2018-08-27 16:07   ` Jason Andryuk
2018-08-27 19:25 ` Boris Ostrovsky

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).