* [MODERATED] [PATCH 0/8] L1TFv8 2
@ 2018-06-13 22:48 Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 1/8] L1TFv8 0 Andi Kleen
                   ` (9 more replies)
  0 siblings, 10 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-13 22:48 UTC (permalink / raw)
  To: speck

This is v8 of the native OS patchkit to mitigate the L1TF side channel.
It does not cover KVM.

This version addresses the latest review feedback. The mitigation
setup has been moved into check_bugs, and the memory size checking
patch is now integrated into the standard setup. The swap mitigation
has been split into two patches. Various other changes.
For more details see the individual changelogs.

Andi Kleen (6):
  x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_MASK
  x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation
  x86/speculation/l1tf: Make sure the first page is always reserved
  x86/speculation/l1tf: Add sysfs reporting for l1tf
  x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE
    mappings
  x86/speculation/l1tf: Limit swap file size to MAX_PA/2

Linus Torvalds (2):
  x86/speculation/l1tf: Change order of offset/type in swap entry
  x86/speculation/l1tf: Protect swap entries against L1TF

 arch/x86/include/asm/cpufeatures.h    |  2 ++
 arch/x86/include/asm/page_32_types.h  |  9 ++++--
 arch/x86/include/asm/pgtable-2level.h | 17 ++++++++++++
 arch/x86/include/asm/pgtable-3level.h |  2 ++
 arch/x86/include/asm/pgtable-invert.h | 32 +++++++++++++++++++++
 arch/x86/include/asm/pgtable.h        | 52 ++++++++++++++++++++++++++---------
 arch/x86/include/asm/pgtable_64.h     | 38 +++++++++++++++++--------
 arch/x86/include/asm/processor.h      |  5 ++++
 arch/x86/kernel/cpu/bugs.c            | 40 +++++++++++++++++++++++++++
 arch/x86/kernel/cpu/common.c          | 20 ++++++++++++++
 arch/x86/kernel/setup.c               |  6 ++++
 arch/x86/mm/init.c                    | 15 ++++++++++
 arch/x86/mm/mmap.c                    | 21 ++++++++++++++
 drivers/base/cpu.c                    |  8 ++++++
 include/asm-generic/pgtable.h         | 12 ++++++++
 include/linux/cpu.h                   |  2 ++
 include/linux/swapfile.h              |  2 ++
 mm/memory.c                           | 37 ++++++++++++++++++-------
 mm/mprotect.c                         | 49 +++++++++++++++++++++++++++++++++
 mm/swapfile.c                         | 46 ++++++++++++++++++++-----------
 20 files changed, 363 insertions(+), 52 deletions(-)
 create mode 100644 arch/x86/include/asm/pgtable-invert.h

-- 
2.14.4


* [MODERATED] [PATCH 1/8] L1TFv8 0
  2018-06-13 22:48 [MODERATED] [PATCH 0/8] L1TFv8 2 Andi Kleen
@ 2018-06-13 22:48 ` Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 2/8] L1TFv8 4 Andi Kleen
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-13 22:48 UTC (permalink / raw)
  To: speck

We need to protect memory inside the guest against L1TF
by inverting the right bits so that unmapped entries point to non-existent memory.

The hypervisor should already protect itself against the guest by flushing
the caches as needed, but pages inside the guest are not protected against
attacks from other processes in that guest.

Our inverted PTE mask has to match the host to provide the full
protection for all pages the host could possibly map into our guest.
The host is likely 64bit and may use more than 43 bits of
memory. We want to set all possible bits to be safe here.

On 32bit PAE the max PTE mask is currently set to 44 bits because that is
the limit imposed by 32bit unsigned long PFNs in the VMs. This limits
the mask to be below what the host could possibly use for physical
pages.

The L1TF PROT_NONE protection code uses the PTE masks to determine
what bits to invert to make sure the higher bits are set for unmapped
entries to prevent L1TF speculation attacks against EPT inside guests.

We want to invert all bits that could be used by the host.

So increase the mask on 32bit PAE to 52 to match 64bit.

The real limit for a 32bit OS is still 44 bits.

All Linux PTEs are created from unsigned long PFNs, so they cannot be
higher than 44 bits on a 32bit kernel. So these extra PFN
bits should never be set. The only users of this macro are using
it to look at PTEs, so it's safe.
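
As a quick sanity check, here is an illustrative user-space sketch of
the mask arithmetic (not part of the patch; it assumes a 12-bit
PAGE_SHIFT):

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
	const int page_shift = 12;
	/* 44 = 32-bit unsigned long PFN + 12-bit in-page offset */
	const uint64_t mask44 = ((1ULL << 44) - 1) & ~((1ULL << page_shift) - 1);
	/* 52 matches the maximum physical address width of a 64bit host */
	const uint64_t mask52 = ((1ULL << 52) - 1) & ~((1ULL << page_shift) - 1);

	printf("44-bit PTE PFN mask: 0x%016" PRIx64 "\n", mask44);
	printf("52-bit PTE PFN mask: 0x%016" PRIx64 "\n", mask52);
	return 0;
}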

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-By: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

---

v2: Improve commit message.
---
 arch/x86/include/asm/page_32_types.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/page_32_types.h b/arch/x86/include/asm/page_32_types.h
index aa30c3241ea7..0d5c739eebd7 100644
--- a/arch/x86/include/asm/page_32_types.h
+++ b/arch/x86/include/asm/page_32_types.h
@@ -29,8 +29,13 @@
 #define N_EXCEPTION_STACKS 1
 
 #ifdef CONFIG_X86_PAE
-/* 44=32+12, the limit we can fit into an unsigned long pfn */
-#define __PHYSICAL_MASK_SHIFT	44
+/*
+ * This is beyond the 44 bit limit imposed by the 32bit long pfns,
+ * but we need the full mask to make sure inverted PROT_NONE
+ * entries have all the host bits set in a guest.
+ * The real limit is still 44 bits.
+ */
+#define __PHYSICAL_MASK_SHIFT	52
 #define __VIRTUAL_MASK_SHIFT	32
 
 #else  /* !CONFIG_X86_PAE */
-- 
2.14.4


* [MODERATED] [PATCH 2/8] L1TFv8 4
  2018-06-13 22:48 [MODERATED] [PATCH 0/8] L1TFv8 2 Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 1/8] L1TFv8 0 Andi Kleen
@ 2018-06-13 22:48 ` Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 3/8] L1TFv8 5 Andi Kleen
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-13 22:48 UTC (permalink / raw)
  To: speck

Here's a patch that switches the order of "type" and
"offset" in the x86-64 encoding in preparation for the next
patch, which inverts the swap entry to protect against L1TF.

That means that now the offset is bits 9-58 in the page table, and that
the type is in the bits that hardware generally doesn't care about.

That, in turn, means that if you have a desktop chip with only 40 bits of
physical addressing, now that the offset starts at bit 9, you still have
to have 30 bits of offset actually *in use* until bit 39 ends up being
clear.

So that's 4 terabytes of swap space (because the offset is counted in
pages, so 30 bits of offset is 42 bits of actual coverage). With bigger
physical addressing, that obviously grows further, until you hit the limit
of the offset (at 50 bits of offset - 62 bits of actual swap file
coverage).
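
For experimentation, a minimal user-space sketch of the new encoding
(mirroring the SWP_* macros in the patch below; it assumes 5 type bits,
the offset starting at bit 9, and 4 KiB pages) that also checks the
4 terabyte figure:

#include <stdio.h>
#include <inttypes.h>

#define SWP_TYPE_BITS		5
#define SWP_OFFSET_FIRST_BIT	9
#define SWP_OFFSET_SHIFT	(SWP_OFFSET_FIRST_BIT + SWP_TYPE_BITS)

static uint64_t swp_entry(uint64_t type, uint64_t offset)
{
	return (offset << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) |
	       (type << (64 - SWP_TYPE_BITS));
}

static uint64_t swp_offset(uint64_t entry)
{
	return entry << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT;
}

int main(void)
{
	/* a 30-bit offset occupies pte bits 9..38, below bit 39 */
	uint64_t offset = (1ULL << 30) - 1;
	uint64_t e = swp_entry(1, offset);

	printf("entry %#" PRIx64 " decodes back to offset %#" PRIx64 "\n",
	       e, swp_offset(e));
	printf("2^30 pages * 4 KiB = %llu TiB of swap\n",
	       (unsigned long long)((1ULL << 30) * 4096 >> 40));
	return 0;
}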

[updated description and minor tweaks by AK]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-By: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 arch/x86/include/asm/pgtable_64.h | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 877bc27718ae..bce04fd39372 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -273,7 +273,7 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
  *
  * |     ...            | 11| 10|  9|8|7|6|5| 4| 3|2| 1|0| <- bit number
  * |     ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | OFFSET (14->63) | TYPE (9-13)  |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) |  OFFSET (9-58)  |0|0|X|X| X| X|X|SD|0| <- swp entry
  *
  * G (8) is aliased and used as a PROT_NONE indicator for
  * !present ptes.  We need to start storing swap entries above
@@ -287,19 +287,28 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
  * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
  * but also L and G.
  */
-#define SWP_TYPE_FIRST_BIT (_PAGE_BIT_PROTNONE + 1)
-#define SWP_TYPE_BITS 5
-/* Place the offset above the type: */
-#define SWP_OFFSET_FIRST_BIT (SWP_TYPE_FIRST_BIT + SWP_TYPE_BITS)
+#define SWP_TYPE_BITS		5
+
+#define SWP_OFFSET_FIRST_BIT	(_PAGE_BIT_PROTNONE + 1)
+
+/* We always extract/encode the offset by shifting it all the way up, and then down again */
+#define SWP_OFFSET_SHIFT	(SWP_OFFSET_FIRST_BIT+SWP_TYPE_BITS)
 
 #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS)
 
-#define __swp_type(x)			(((x).val >> (SWP_TYPE_FIRST_BIT)) \
-					 & ((1U << SWP_TYPE_BITS) - 1))
-#define __swp_offset(x)			((x).val >> SWP_OFFSET_FIRST_BIT)
-#define __swp_entry(type, offset)	((swp_entry_t) { \
-					 ((type) << (SWP_TYPE_FIRST_BIT)) \
-					 | ((offset) << SWP_OFFSET_FIRST_BIT) })
+/* Extract the high bits for type */
+#define __swp_type(x) ((x).val >> (64 - SWP_TYPE_BITS))
+
+/* Shift up (to get rid of type), then down to get value */
+#define __swp_offset(x) ((x).val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT)
+
+/*
+ * Shift the offset up "too far" by TYPE bits, then down again
+ */
+#define __swp_entry(type, offset) ((swp_entry_t) { \
+	((unsigned long)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
+	| ((unsigned long)(type) << (64-SWP_TYPE_BITS)) })
+
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val((pte)) })
 #define __pmd_to_swp_entry(pmd)		((swp_entry_t) { pmd_val((pmd)) })
 #define __swp_entry_to_pte(x)		((pte_t) { .pte = (x).val })
-- 
2.14.4


* [MODERATED] [PATCH 3/8] L1TFv8 5
  2018-06-13 22:48 [MODERATED] [PATCH 0/8] L1TFv8 2 Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 1/8] L1TFv8 0 Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 2/8] L1TFv8 4 Andi Kleen
@ 2018-06-13 22:48 ` Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 4/8] L1TFv8 8 Andi Kleen
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-13 22:48 UTC (permalink / raw)
  To: speck

With L1 Terminal Fault the CPU speculates into unmapped PTEs, and the
resulting side effects allow reading the memory the PTE is pointing
to, if its contents are still in the L1 cache.

For swapped out pages Linux uses unmapped PTEs and stores a swap entry
into them.

We need to make sure the swap entry is not pointing to valid memory,
which requires setting higher bits (between bit 36 and bit 45) that
are inside the CPU's physical address space, but outside any real
memory.

To do this we invert the offset to make sure the higher bits are always
set, as long as the swap file is not too big.

Note there is no workaround for 32bit !PAE, or on systems which
have more than MAX_PA/2 worth of memory. The latter case is very unlikely
to happen on real systems.
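
An illustrative user-space sketch (same assumptions as before: 5 type
bits, offset starting at bit 9) showing that with the inversion even a
small offset produces an entry whose high physical bits are set:

#include <stdio.h>
#include <inttypes.h>

#define SWP_TYPE_BITS		5
#define SWP_OFFSET_SHIFT	(9 + SWP_TYPE_BITS)

/* inverted encode/decode, mirroring the macros in the patch below */
static uint64_t swp_entry(uint64_t type, uint64_t offset)
{
	return (~offset << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) |
	       (type << (64 - SWP_TYPE_BITS));
}

static uint64_t swp_offset(uint64_t entry)
{
	return ~entry << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT;
}

int main(void)
{
	uint64_t e = swp_entry(1, 0x1000);

	printf("entry for offset 0x1000: %#" PRIx64 "\n", e);
	printf("decodes back to:         %#" PRIx64 "\n", swp_offset(e));
	return 0;
}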

[updated description and minor tweaks by AK]

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-By: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
v2: Split out patch that swaps fields.
---
 arch/x86/include/asm/pgtable_64.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index bce04fd39372..593c3cf259dd 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -273,7 +273,7 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
  *
  * |     ...            | 11| 10|  9|8|7|6|5| 4| 3|2| 1|0| <- bit number
  * |     ...            |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names
- * | TYPE (59-63) |  OFFSET (9-58)  |0|0|X|X| X| X|X|SD|0| <- swp entry
+ * | TYPE (59-63) | ~OFFSET (9-58)  |0|0|X|X| X| X|X|SD|0| <- swp entry
  *
  * G (8) is aliased and used as a PROT_NONE indicator for
  * !present ptes.  We need to start storing swap entries above
@@ -286,6 +286,9 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
  *
  * Bit 7 in swp entry should be 0 because pmd_present checks not only P,
  * but also L and G.
+ *
+ * The offset is inverted by a binary not operation to make the high
+ * physical bits set.
  */
 #define SWP_TYPE_BITS		5
 
@@ -300,13 +303,15 @@ static inline int pgd_large(pgd_t pgd) { return 0; }
 #define __swp_type(x) ((x).val >> (64 - SWP_TYPE_BITS))
 
 /* Shift up (to get rid of type), then down to get value */
-#define __swp_offset(x) ((x).val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT)
+#define __swp_offset(x) (~(x).val << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT)
 
 /*
  * Shift the offset up "too far" by TYPE bits, then down again
+ * The offset is inverted by a binary not operation to make the high
+ * physical bits set.
  */
 #define __swp_entry(type, offset) ((swp_entry_t) { \
-	((unsigned long)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
+	(~(unsigned long)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
 	| ((unsigned long)(type) << (64-SWP_TYPE_BITS)) })
 
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { pte_val((pte)) })
-- 
2.14.4


* [MODERATED] [PATCH 4/8] L1TFv8 8
  2018-06-13 22:48 [MODERATED] [PATCH 0/8] L1TFv8 2 Andi Kleen
                   ` (2 preceding siblings ...)
  2018-06-13 22:48 ` [MODERATED] [PATCH 3/8] L1TFv8 5 Andi Kleen
@ 2018-06-13 22:48 ` Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 5/8] L1TFv8 3 Andi Kleen
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-13 22:48 UTC (permalink / raw)
  To: speck

We also need to protect PTEs that are set to PROT_NONE against
L1TF speculation attacks.

This is important inside guests, because L1TF speculation
bypasses physical page remapping. While the VM has its own
mitigations preventing leaking data from other VMs into
the guest, this would still risk leaking the wrong page
inside the current guest.

This uses the same technique as Linus' swap entry patch:
while an entry is in PROTNONE state we invert the
complete PFN part of it. This ensures the entry
points to non-existent memory because the high bits are set.

The inversion is done by pte/pmd_modify and pfn_pte/pfn_pmd/pfn_pud for
PROTNONE entries, and pte/pmd/pud_pfn undo it.

We assume that no one tries to touch the PFN part of
a PTE without using these primitives.
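
For illustration, a user-space sketch of the flip used by
flip_protnone_guard() in the patch below: (val & ~mask) | (~val & mask)
inverts exactly the bits selected by the mask, and doing it twice
restores the original value (the 52-bit PFN mask here is a stand-in):

#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

static uint64_t flip_pfn(uint64_t val, uint64_t mask)
{
	/* invert only the bits selected by 'mask', leave the flag bits alone */
	return (val & ~mask) | (~val & mask);
}

int main(void)
{
	const uint64_t pfn_mask = ((1ULL << 52) - 1) & ~((1ULL << 12) - 1);
	uint64_t pte = (0x1234ULL << 12) | 0x67;	/* pfn 0x1234 plus flags */
	uint64_t inv = flip_pfn(pte, pfn_mask);

	printf("pte:      0x%016" PRIx64 "\n", pte);
	printf("inverted: 0x%016" PRIx64 "\n", inv);

	/* flipping twice is a no-op, which is what lets pte_pfn() undo it */
	assert(flip_pfn(inv, pfn_mask) == pte);
	return 0;
}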

This doesn't handle the case where MMIO is at the top
of the CPU's physical address space. If such an MMIO region
was exposed by an unprivileged driver for mmap
it would be possible to attack some real memory.
However, this situation is rather unlikely.

For 32bit non PAE we don't try inversion because
there are really not enough bits to protect anything.

Q: Why does the guest need to be protected when the
hypervisor already has L1TF mitigations?
A: Here's an example:
You have physical pages 1 and 2. They get mapped into a guest as
GPA 1 -> PA 2
GPA 2 -> PA 1
through EPT.

The L1TF speculation ignores the EPT remapping.

Now the guest kernel maps GPA 1 to process A and GPA 2 to process B,
and they belong to different users and should be isolated.

A sets the GPA 1 -> PA 2 PTE to PROT_NONE to bypass the EPT remapping
and gets read access to the underlying physical page, which
in this case points to PA 2, so it can read process B's data,
if it happened to be in L1.

So we broke isolation inside the guest.

There's nothing the hypervisor can do about this. This
mitigation has to be done in the guest.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-By: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

---
v2: Use new helper to generate XOR mask to invert (Linus)
v3: Use inline helper for protnone mask checking
v4: Use inline helpers to check for PROT_NONE changes
---
 arch/x86/include/asm/pgtable-2level.h | 17 ++++++++++++++
 arch/x86/include/asm/pgtable-3level.h |  2 ++
 arch/x86/include/asm/pgtable-invert.h | 32 +++++++++++++++++++++++++
 arch/x86/include/asm/pgtable.h        | 44 ++++++++++++++++++++++++-----------
 arch/x86/include/asm/pgtable_64.h     |  2 ++
 5 files changed, 84 insertions(+), 13 deletions(-)
 create mode 100644 arch/x86/include/asm/pgtable-invert.h

diff --git a/arch/x86/include/asm/pgtable-2level.h b/arch/x86/include/asm/pgtable-2level.h
index 685ffe8a0eaf..60d0f9015317 100644
--- a/arch/x86/include/asm/pgtable-2level.h
+++ b/arch/x86/include/asm/pgtable-2level.h
@@ -95,4 +95,21 @@ static inline unsigned long pte_bitop(unsigned long value, unsigned int rightshi
 #define __pte_to_swp_entry(pte)		((swp_entry_t) { (pte).pte_low })
 #define __swp_entry_to_pte(x)		((pte_t) { .pte = (x).val })
 
+/* No inverted PFNs on 2 level page tables */
+
+static inline u64 protnone_mask(u64 val)
+{
+	return 0;
+}
+
+static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask)
+{
+	return val;
+}
+
+static inline bool __pte_needs_invert(u64 val)
+{
+	return false;
+}
+
 #endif /* _ASM_X86_PGTABLE_2LEVEL_H */
diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
index f24df59c40b2..76ab26a99e6e 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -295,4 +295,6 @@ static inline pte_t gup_get_pte(pte_t *ptep)
 	return pte;
 }
 
+#include <asm/pgtable-invert.h>
+
 #endif /* _ASM_X86_PGTABLE_3LEVEL_H */
diff --git a/arch/x86/include/asm/pgtable-invert.h b/arch/x86/include/asm/pgtable-invert.h
new file mode 100644
index 000000000000..177564187fc0
--- /dev/null
+++ b/arch/x86/include/asm/pgtable-invert.h
@@ -0,0 +1,32 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_PGTABLE_INVERT_H
+#define _ASM_PGTABLE_INVERT_H 1
+
+#ifndef __ASSEMBLY__
+
+static inline bool __pte_needs_invert(u64 val)
+{
+	return (val & (_PAGE_PRESENT|_PAGE_PROTNONE)) == _PAGE_PROTNONE;
+}
+
+/* Get a mask to xor with the page table entry to get the correct pfn. */
+static inline u64 protnone_mask(u64 val)
+{
+	return __pte_needs_invert(val) ?  ~0ull : 0;
+}
+
+static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask)
+{
+	/*
+	 * When a PTE transitions from NONE to !NONE or vice-versa
+	 * invert the PFN part to stop speculation.
+	 * pte_pfn undoes this when needed.
+	 */
+	if (__pte_needs_invert(oldval) != __pte_needs_invert(val))
+		val = (val & ~mask) | (~val & mask);
+	return val;
+}
+
+#endif /* __ASSEMBLY__ */
+
+#endif
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index f1633de5a675..10dcd9e597c6 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -185,19 +185,29 @@ static inline int pte_special(pte_t pte)
 	return pte_flags(pte) & _PAGE_SPECIAL;
 }
 
+/* Entries that were set to PROT_NONE are inverted */
+
+static inline u64 protnone_mask(u64 val);
+
 static inline unsigned long pte_pfn(pte_t pte)
 {
-	return (pte_val(pte) & PTE_PFN_MASK) >> PAGE_SHIFT;
+	unsigned long pfn = pte_val(pte);
+	pfn ^= protnone_mask(pfn);
+	return (pfn & PTE_PFN_MASK) >> PAGE_SHIFT;
 }
 
 static inline unsigned long pmd_pfn(pmd_t pmd)
 {
-	return (pmd_val(pmd) & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
+	unsigned long pfn = pmd_val(pmd);
+	pfn ^= protnone_mask(pfn);
+	return (pfn & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
 }
 
 static inline unsigned long pud_pfn(pud_t pud)
 {
-	return (pud_val(pud) & pud_pfn_mask(pud)) >> PAGE_SHIFT;
+	unsigned long pfn = pud_val(pud);
+	pfn ^= protnone_mask(pfn);
+	return (pfn & pud_pfn_mask(pud)) >> PAGE_SHIFT;
 }
 
 static inline unsigned long p4d_pfn(p4d_t p4d)
@@ -545,25 +555,33 @@ static inline pgprotval_t check_pgprot(pgprot_t pgprot)
 
 static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 {
-	return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
-		     check_pgprot(pgprot));
+	phys_addr_t pfn = page_nr << PAGE_SHIFT;
+	pfn ^= protnone_mask(pgprot_val(pgprot));
+	pfn &= PTE_PFN_MASK;
+	return __pte(pfn | check_pgprot(pgprot));
 }
 
 static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
 {
-	return __pmd(((phys_addr_t)page_nr << PAGE_SHIFT) |
-		     check_pgprot(pgprot));
+	phys_addr_t pfn = page_nr << PAGE_SHIFT;
+	pfn ^= protnone_mask(pgprot_val(pgprot));
+	pfn &= PHYSICAL_PMD_PAGE_MASK;
+	return __pmd(pfn | check_pgprot(pgprot));
 }
 
 static inline pud_t pfn_pud(unsigned long page_nr, pgprot_t pgprot)
 {
-	return __pud(((phys_addr_t)page_nr << PAGE_SHIFT) |
-		     check_pgprot(pgprot));
+	phys_addr_t pfn = page_nr << PAGE_SHIFT;
+	pfn ^= protnone_mask(pgprot_val(pgprot));
+	pfn &= PHYSICAL_PUD_PAGE_MASK;
+	return __pud(pfn | check_pgprot(pgprot));
 }
 
+static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask);
+
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
-	pteval_t val = pte_val(pte);
+	pteval_t val = pte_val(pte), oldval = val;
 
 	/*
 	 * Chop off the NX bit (if present), and add the NX portion of
@@ -571,17 +589,17 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 	 */
 	val &= _PAGE_CHG_MASK;
 	val |= check_pgprot(newprot) & ~_PAGE_CHG_MASK;
-
+	val = flip_protnone_guard(oldval, val, PTE_PFN_MASK);
 	return __pte(val);
 }
 
 static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
 {
-	pmdval_t val = pmd_val(pmd);
+	pmdval_t val = pmd_val(pmd), oldval = val;
 
 	val &= _HPAGE_CHG_MASK;
 	val |= check_pgprot(newprot) & ~_HPAGE_CHG_MASK;
-
+	val = flip_protnone_guard(oldval, val, PHYSICAL_PMD_PAGE_MASK);
 	return __pmd(val);
 }
 
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 593c3cf259dd..ea99272ab63e 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -357,5 +357,7 @@ static inline bool gup_fast_permitted(unsigned long start, int nr_pages,
 	return true;
 }
 
+#include <asm/pgtable-invert.h>
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASM_X86_PGTABLE_64_H */
-- 
2.14.4


* [MODERATED] [PATCH 5/8] L1TFv8 3
  2018-06-13 22:48 [MODERATED] [PATCH 0/8] L1TFv8 2 Andi Kleen
                   ` (3 preceding siblings ...)
  2018-06-13 22:48 ` [MODERATED] [PATCH 4/8] L1TFv8 8 Andi Kleen
@ 2018-06-13 22:48 ` Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 6/8] L1TFv8 7 Andi Kleen
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-13 22:48 UTC (permalink / raw)
  To: speck

The L1TF workaround doesn't make any attempt to mitigate speculative
accesses to the first physical page for zeroed PTEs. Normally
it only contains some data from the early real mode BIOS.

I couldn't convince myself we always reserve the first page in
all configurations, so add an extra reservation call to
make sure it is really reserved. In most configurations (e.g.
with the standard reservations) it's likely a nop.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-By: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

---
v2: improve comment
---
 arch/x86/kernel/setup.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 5c623dfe39d1..89fd35349412 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -823,6 +823,12 @@ void __init setup_arch(char **cmdline_p)
 	memblock_reserve(__pa_symbol(_text),
 			 (unsigned long)__bss_stop - (unsigned long)_text);
 
+	/*
+	 * Make sure page 0 is always reserved because on systems with
+	 * L1TF its contents can be leaked to user processes.
+	 */
+	memblock_reserve(0, PAGE_SIZE);
+
 	early_reserve_initrd();
 
 	/*
-- 
2.14.4


* [MODERATED] [PATCH 6/8] L1TFv8 7
  2018-06-13 22:48 [MODERATED] [PATCH 0/8] L1TFv8 2 Andi Kleen
                   ` (4 preceding siblings ...)
  2018-06-13 22:48 ` [MODERATED] [PATCH 5/8] L1TFv8 3 Andi Kleen
@ 2018-06-13 22:48 ` Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 7/8] L1TFv8 1 Andi Kleen
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-13 22:48 UTC (permalink / raw)
  To: speck

L1TF core kernel workarounds are cheap and normally always enabled.
However, we still want to report in sysfs whether the system is vulnerable
or mitigated. Add the necessary checks.

- We extend the existing checks for Meltdown to determine if the system is
vulnerable. This excludes some Atom CPUs which don't have this
problem.
- We check for 32bit non PAE and warn.
- If the system has more than MAX_PA/2 physical memory the
page table inversion workarounds don't protect the system against
the L1TF attack anymore, because an inverted physical address
will point to valid memory. Print a warning in this case
and report that the system is vulnerable (see the sketch below).
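
A user-space sketch of the MAX_PA/2 arithmetic behind that check
(4 KiB pages assumed; phys_bits stands in for
boot_cpu_data.x86_phys_bits):

#include <stdio.h>
#include <inttypes.h>

static uint64_t l1tf_pfn_limit(int phys_bits)
{
	return (1ULL << (phys_bits - 1 - 12)) - 1;	/* same formula as the patch */
}

int main(void)
{
	int bits[] = { 36, 39, 46 };

	for (int i = 0; i < 3; i++) {
		/* half_pa is one page below 2^(phys_bits - 1), i.e. MAX_PA/2 */
		uint64_t half_pa = l1tf_pfn_limit(bits[i]) << 12;

		printf("%2d phys bits: warn if RAM sits at or above ~%llu GiB\n",
		       bits[i], (unsigned long long)((half_pa + 4096) >> 30));
	}
	return 0;
}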

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-By: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

---
v2: Use positive instead of negative flag for WA. Fix override
reporting.
v3: Fix L1TF_WA flag setting
v4: Rebase to SSB tree
v5: Minor cleanups. No functional changes.
Don't mark atoms and knights as vulnerable
v6: Change _WA to _FIX
v7: Use common sysfs function
v8: Improve commit message
Move mitigation check into check_bugs.
Integrate memory size checking into this patch
White space changes. Move l1tf_pfn_limit here.
---
 arch/x86/include/asm/cpufeatures.h |  2 ++
 arch/x86/include/asm/processor.h   |  5 +++++
 arch/x86/kernel/cpu/bugs.c         | 40 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/common.c       | 20 +++++++++++++++++++
 drivers/base/cpu.c                 |  8 ++++++++
 include/linux/cpu.h                |  2 ++
 6 files changed, 77 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index fb00a2fca990..3b0bdd7d6b71 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -219,6 +219,7 @@
 #define X86_FEATURE_IBPB		( 7*32+26) /* Indirect Branch Prediction Barrier */
 #define X86_FEATURE_STIBP		( 7*32+27) /* Single Thread Indirect Branch Predictors */
 #define X86_FEATURE_ZEN			( 7*32+28) /* "" CPU is AMD family 0x17 (Zen) */
+#define X86_FEATURE_L1TF_FIX		( 7*32+29) /* "" L1TF workaround used */
 
 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW		( 8*32+ 0) /* Intel TPR Shadow */
@@ -371,5 +372,6 @@
 #define X86_BUG_SPECTRE_V1		X86_BUG(15) /* CPU is affected by Spectre variant 1 attack with conditional branches */
 #define X86_BUG_SPECTRE_V2		X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
 #define X86_BUG_SPEC_STORE_BYPASS	X86_BUG(17) /* CPU is affected by speculative store bypass attack */
+#define X86_BUG_L1TF			X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 21a114914ba4..1c6cedafbe94 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -181,6 +181,11 @@ extern const struct seq_operations cpuinfo_op;
 
 extern void cpu_detect(struct cpuinfo_x86 *c);
 
+static inline unsigned long l1tf_pfn_limit(void)
+{
+	return BIT(boot_cpu_data.x86_phys_bits - 1 - PAGE_SHIFT) - 1;
+}
+
 extern void early_cpu_init(void);
 extern void identify_boot_cpu(void);
 extern void identify_secondary_cpu(struct cpuinfo_x86 *);
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 7416fc206b4a..88effa6ad53d 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -27,9 +27,11 @@
 #include <asm/pgtable.h>
 #include <asm/set_memory.h>
 #include <asm/intel-family.h>
+#include <asm/e820/api.h>
 
 static void __init spectre_v2_select_mitigation(void);
 static void __init ssb_select_mitigation(void);
+static void __init l1tf_select_mitigation(void);
 
 /*
  * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any
@@ -81,6 +83,8 @@ void __init check_bugs(void)
 	 */
 	ssb_select_mitigation();
 
+	l1tf_select_mitigation();
+
 #ifdef CONFIG_X86_32
 	/*
 	 * Check whether we are able to run this kernel safely on SMP.
@@ -205,6 +209,32 @@ static void x86_amd_ssb_disable(void)
 		wrmsrl(MSR_AMD64_LS_CFG, msrval);
 }
 
+static void __init l1tf_select_mitigation(void)
+{
+	u64 half_pa;
+
+	if (!boot_cpu_has_bug(X86_BUG_L1TF))
+		return;
+
+#if CONFIG_PGTABLE_LEVELS == 2
+	pr_warn("Kernel not compiled for PAE. No mitigation for L1TF\n");
+	return;
+#endif
+
+	/*
+	 * This is extremely unlikely to happen because almost all
+	 * systems have far more MAX_PA/2 than RAM can be fit into
+	 * DIMM slots.
+	 */
+	half_pa = (u64)l1tf_pfn_limit() << PAGE_SHIFT;
+	if (e820__mapped_any(half_pa, ULLONG_MAX - half_pa, E820_TYPE_RAM)) {
+		pr_warn("System has more than MAX_PA/2 memory. L1TF mitigation not effective.\n");
+		return;
+	}
+
+	setup_force_cpu_cap(X86_FEATURE_L1TF_FIX);
+}
+
 #ifdef RETPOLINE
 static bool spectre_v2_bad_module;
 
@@ -681,6 +711,11 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
 	case X86_BUG_SPEC_STORE_BYPASS:
 		return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);
 
+	case X86_BUG_L1TF:
+		if (boot_cpu_has(X86_FEATURE_L1TF_FIX))
+			return sprintf(buf, "Mitigation: Page Table Inversion\n");
+		break;
+
 	default:
 		break;
 	}
@@ -707,4 +742,9 @@ ssize_t cpu_show_spec_store_bypass(struct device *dev, struct device_attribute *
 {
 	return cpu_show_common(dev, attr, buf, X86_BUG_SPEC_STORE_BYPASS);
 }
+
+ssize_t cpu_show_l1tf(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	return cpu_show_common(dev, attr, buf, X86_BUG_L1TF);
+}
 #endif
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 38276f58d3bf..3bb0fa42edef 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -958,6 +958,21 @@ static const __initconst struct x86_cpu_id cpu_no_spec_store_bypass[] = {
 	{}
 };
 
+static const __initconst struct x86_cpu_id cpu_no_l1tf[] = {
+	/* in addition to cpu_no_speculation */
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_SILVERMONT1	},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_SILVERMONT2	},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_AIRMONT		},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_MERRIFIELD	},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_MOOREFIELD	},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_GOLDMONT	},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_DENVERTON	},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_ATOM_GEMINI_LAKE	},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_XEON_PHI_KNL		},
+	{ X86_VENDOR_INTEL,	6,	INTEL_FAM6_XEON_PHI_KNM		},
+	{}
+};
+
 static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 {
 	u64 ia32_cap = 0;
@@ -983,6 +998,11 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 		return;
 
 	setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
+
+	if (x86_match_cpu(cpu_no_l1tf))
+		return;
+
+	setup_force_cpu_bug(X86_BUG_L1TF);
 }
 
 /*
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 30cc9c877ebb..eb9443d5bae1 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -540,16 +540,24 @@ ssize_t __weak cpu_show_spec_store_bypass(struct device *dev,
 	return sprintf(buf, "Not affected\n");
 }
 
+ssize_t __weak cpu_show_l1tf(struct device *dev,
+			     struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "Not affected\n");
+}
+
 static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
 static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
 static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
 static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
+static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL);
 
 static struct attribute *cpu_root_vulnerabilities_attrs[] = {
 	&dev_attr_meltdown.attr,
 	&dev_attr_spectre_v1.attr,
 	&dev_attr_spectre_v2.attr,
 	&dev_attr_spec_store_bypass.attr,
+	&dev_attr_l1tf.attr,
 	NULL
 };
 
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index a97a63eef59f..d3da5184aa1c 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -55,6 +55,8 @@ extern ssize_t cpu_show_spectre_v2(struct device *dev,
 				   struct device_attribute *attr, char *buf);
 extern ssize_t cpu_show_spec_store_bypass(struct device *dev,
 					  struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_l1tf(struct device *dev,
+			     struct device_attribute *attr, char *buf);
 
 extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,
-- 
2.14.4


* [MODERATED] [PATCH 7/8] L1TFv8 1
  2018-06-13 22:48 [MODERATED] [PATCH 0/8] L1TFv8 2 Andi Kleen
                   ` (5 preceding siblings ...)
  2018-06-13 22:48 ` [MODERATED] [PATCH 6/8] L1TFv8 7 Andi Kleen
@ 2018-06-13 22:48 ` Andi Kleen
  2018-06-13 22:48 ` [MODERATED] [PATCH 8/8] L1TFv8 6 Andi Kleen
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-13 22:48 UTC (permalink / raw)
  To: speck

For L1TF, PROT_NONE mappings are protected by inverting the PFN in the
page table entry. This sets the high bits in the CPU's address space,
thus making sure an unmapped entry does not point to valid
cached memory.

Some server system BIOSes put the MMIO mappings high up in the physical
address space. If such a high mapping was exposed to an unprivileged
user they could attack low memory by setting such a mapping to
PROT_NONE. This could happen through a special device driver
which is not access protected. Normal /dev/mem is of course
access protected.

To avoid this we forbid PROT_NONE mappings or mprotect for high MMIO
mappings.

Valid page mappings are allowed because the system is then unsafe
anyway.

We don't expect users to commonly use PROT_NONE on MMIO. But
to minimize any impact here we only do this if the mapping actually
refers to a high MMIO address (defined as the MAX_PA-1 bit being set),
and also skip the check for root.
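
For a concrete sense of the threshold, a small sketch assuming 46
physical address bits and 4 KiB pages:

#include <stdio.h>

int main(void)
{
	int phys_bits = 46;
	/* pfns above this have the MAX_PA-1 bit set and count as "high MMIO" */
	unsigned long long limit_pfn = (1ULL << (phys_bits - 1 - 12)) - 1;

	printf("high MMIO starts above pfn %#llx (physical address %llu GiB)\n",
	       limit_pfn, ((limit_pfn + 1) << 12) >> 30);
	return 0;
}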

For mmaps this is straightforward and can be handled in vm_insert_pfn
and in remap_pfn_range().

For mprotect it's a bit trickier. At the point where we're looking at the
actual PTEs a lot of state has already been changed and would be difficult
to undo on an error. Since this is an uncommon case we use a separate
early page table walk pass for MMIO PROT_NONE mappings that
checks for this condition early. For non-MMIO and non-PROT_NONE mappings
there are no changes.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-By: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

---
v2: Use new helpers added earlier
v3: Fix inverted check added in v3
v4: Use l1tf_pfn_limit (Thomas)
Add comment for locked down kernels
v5: Use boot_cpu_has_bug. Check bug early in arch_has_pfn_modify_check
---
 arch/x86/include/asm/pgtable.h |  8 +++++++
 arch/x86/mm/mmap.c             | 21 ++++++++++++++++++
 include/asm-generic/pgtable.h  | 12 +++++++++++
 mm/memory.c                    | 37 ++++++++++++++++++++++---------
 mm/mprotect.c                  | 49 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 117 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 10dcd9e597c6..049f1f0f11c8 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1338,6 +1338,14 @@ static inline bool pud_access_permitted(pud_t pud, bool write)
 	return __pte_access_permitted(pud_val(pud), write);
 }
 
+#define __HAVE_ARCH_PFN_MODIFY_ALLOWED 1
+extern bool pfn_modify_allowed(unsigned long pfn, pgprot_t prot);
+
+static inline bool arch_has_pfn_modify_check(void)
+{
+	return boot_cpu_has_bug(X86_BUG_L1TF);
+}
+
 #include <asm-generic/pgtable.h>
 #endif	/* __ASSEMBLY__ */
 
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index 48c591251600..f40ab8185d94 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -240,3 +240,24 @@ int valid_mmap_phys_addr_range(unsigned long pfn, size_t count)
 
 	return phys_addr_valid(addr + count - 1);
 }
+
+/*
+ * Only allow root to set high MMIO mappings to PROT_NONE.
+ * This prevents an unpriv. user to set them to PROT_NONE and invert
+ * them, then pointing to valid memory for L1TF speculation.
+ *
+ * Note: for locked down kernels may want to disable the root override.
+ */
+bool pfn_modify_allowed(unsigned long pfn, pgprot_t prot)
+{
+	if (!boot_cpu_has_bug(X86_BUG_L1TF))
+		return true;
+	if (!__pte_needs_invert(pgprot_val(prot)))
+		return true;
+	/* If it's real memory always allow */
+	if (pfn_valid(pfn))
+		return true;
+	if (pfn > l1tf_pfn_limit() && !capable(CAP_SYS_ADMIN))
+		return false;
+	return true;
+}
diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index f59639afaa39..0ecc1197084b 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -1097,4 +1097,16 @@ static inline void init_espfix_bsp(void) { }
 #endif
 #endif
 
+#ifndef __HAVE_ARCH_PFN_MODIFY_ALLOWED
+static inline bool pfn_modify_allowed(unsigned long pfn, pgprot_t prot)
+{
+	return true;
+}
+
+static inline bool arch_has_pfn_modify_check(void)
+{
+	return false;
+}
+#endif
+
 #endif /* _ASM_GENERIC_PGTABLE_H */
diff --git a/mm/memory.c b/mm/memory.c
index 01f5464e0fd2..fe497cecd2ab 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1891,6 +1891,9 @@ int vm_insert_pfn_prot(struct vm_area_struct *vma, unsigned long addr,
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return -EFAULT;
 
+	if (!pfn_modify_allowed(pfn, pgprot))
+		return -EACCES;
+
 	track_pfn_insert(vma, &pgprot, __pfn_to_pfn_t(pfn, PFN_DEV));
 
 	ret = insert_pfn(vma, addr, __pfn_to_pfn_t(pfn, PFN_DEV), pgprot,
@@ -1926,6 +1929,9 @@ static int __vm_insert_mixed(struct vm_area_struct *vma, unsigned long addr,
 
 	track_pfn_insert(vma, &pgprot, pfn);
 
+	if (!pfn_modify_allowed(pfn_t_to_pfn(pfn), pgprot))
+		return -EACCES;
+
 	/*
 	 * If we don't have pte special, then we have to use the pfn_valid()
 	 * based VM_MIXEDMAP scheme (see vm_normal_page), and thus we *must*
@@ -1973,6 +1979,7 @@ static int remap_pte_range(struct mm_struct *mm, pmd_t *pmd,
 {
 	pte_t *pte;
 	spinlock_t *ptl;
+	int err = 0;
 
 	pte = pte_alloc_map_lock(mm, pmd, addr, &ptl);
 	if (!pte)
@@ -1980,12 +1987,16 @@ static int remap_pte_range(struct mm_struct *mm, pmd_t *pmd,
 	arch_enter_lazy_mmu_mode();
 	do {
 		BUG_ON(!pte_none(*pte));
+		if (!pfn_modify_allowed(pfn, prot)) {
+			err = -EACCES;
+			break;
+		}
 		set_pte_at(mm, addr, pte, pte_mkspecial(pfn_pte(pfn, prot)));
 		pfn++;
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 	arch_leave_lazy_mmu_mode();
 	pte_unmap_unlock(pte - 1, ptl);
-	return 0;
+	return err;
 }
 
 static inline int remap_pmd_range(struct mm_struct *mm, pud_t *pud,
@@ -1994,6 +2005,7 @@ static inline int remap_pmd_range(struct mm_struct *mm, pud_t *pud,
 {
 	pmd_t *pmd;
 	unsigned long next;
+	int err;
 
 	pfn -= addr >> PAGE_SHIFT;
 	pmd = pmd_alloc(mm, pud, addr);
@@ -2002,9 +2014,10 @@ static inline int remap_pmd_range(struct mm_struct *mm, pud_t *pud,
 	VM_BUG_ON(pmd_trans_huge(*pmd));
 	do {
 		next = pmd_addr_end(addr, end);
-		if (remap_pte_range(mm, pmd, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot))
-			return -ENOMEM;
+		err = remap_pte_range(mm, pmd, addr, next,
+				pfn + (addr >> PAGE_SHIFT), prot);
+		if (err)
+			return err;
 	} while (pmd++, addr = next, addr != end);
 	return 0;
 }
@@ -2015,6 +2028,7 @@ static inline int remap_pud_range(struct mm_struct *mm, p4d_t *p4d,
 {
 	pud_t *pud;
 	unsigned long next;
+	int err;
 
 	pfn -= addr >> PAGE_SHIFT;
 	pud = pud_alloc(mm, p4d, addr);
@@ -2022,9 +2036,10 @@ static inline int remap_pud_range(struct mm_struct *mm, p4d_t *p4d,
 		return -ENOMEM;
 	do {
 		next = pud_addr_end(addr, end);
-		if (remap_pmd_range(mm, pud, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot))
-			return -ENOMEM;
+		err = remap_pmd_range(mm, pud, addr, next,
+				pfn + (addr >> PAGE_SHIFT), prot);
+		if (err)
+			return err;
 	} while (pud++, addr = next, addr != end);
 	return 0;
 }
@@ -2035,6 +2050,7 @@ static inline int remap_p4d_range(struct mm_struct *mm, pgd_t *pgd,
 {
 	p4d_t *p4d;
 	unsigned long next;
+	int err;
 
 	pfn -= addr >> PAGE_SHIFT;
 	p4d = p4d_alloc(mm, pgd, addr);
@@ -2042,9 +2058,10 @@ static inline int remap_p4d_range(struct mm_struct *mm, pgd_t *pgd,
 		return -ENOMEM;
 	do {
 		next = p4d_addr_end(addr, end);
-		if (remap_pud_range(mm, p4d, addr, next,
-				pfn + (addr >> PAGE_SHIFT), prot))
-			return -ENOMEM;
+		err = remap_pud_range(mm, p4d, addr, next,
+				pfn + (addr >> PAGE_SHIFT), prot);
+		if (err)
+			return err;
 	} while (p4d++, addr = next, addr != end);
 	return 0;
 }
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 625608bc8962..6d331620b9e5 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -306,6 +306,42 @@ unsigned long change_protection(struct vm_area_struct *vma, unsigned long start,
 	return pages;
 }
 
+static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
+			       unsigned long next, struct mm_walk *walk)
+{
+	return pfn_modify_allowed(pte_pfn(*pte), *(pgprot_t *)(walk->private)) ?
+		0 : -EACCES;
+}
+
+static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
+				   unsigned long addr, unsigned long next,
+				   struct mm_walk *walk)
+{
+	return pfn_modify_allowed(pte_pfn(*pte), *(pgprot_t *)(walk->private)) ?
+		0 : -EACCES;
+}
+
+static int prot_none_test(unsigned long addr, unsigned long next,
+			  struct mm_walk *walk)
+{
+	return 0;
+}
+
+static int prot_none_walk(struct vm_area_struct *vma, unsigned long start,
+			   unsigned long end, unsigned long newflags)
+{
+	pgprot_t new_pgprot = vm_get_page_prot(newflags);
+	struct mm_walk prot_none_walk = {
+		.pte_entry = prot_none_pte_entry,
+		.hugetlb_entry = prot_none_hugetlb_entry,
+		.test_walk = prot_none_test,
+		.mm = current->mm,
+		.private = &new_pgprot,
+	};
+
+	return walk_page_range(start, end, &prot_none_walk);
+}
+
 int
 mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev,
 	unsigned long start, unsigned long end, unsigned long newflags)
@@ -323,6 +359,19 @@ mprotect_fixup(struct vm_area_struct *vma, struct vm_area_struct **pprev,
 		return 0;
 	}
 
+	/*
+	 * Do PROT_NONE PFN permission checks here when we can still
+	 * bail out without undoing a lot of state. This is a rather
+	 * uncommon case, so doesn't need to be very optimized.
+	 */
+	if (arch_has_pfn_modify_check() &&
+	    (vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) &&
+	    (newflags & (VM_READ|VM_WRITE|VM_EXEC)) == 0) {
+		error = prot_none_walk(vma, start, end, newflags);
+		if (error)
+			return error;
+	}
+
 	/*
 	 * If we make a private mapping writable we increase our commit;
 	 * but (without finer accounting) cannot reduce our commit if we
-- 
2.14.4


* [MODERATED] [PATCH 8/8] L1TFv8 6
  2018-06-13 22:48 [MODERATED] [PATCH 0/8] L1TFv8 2 Andi Kleen
                   ` (6 preceding siblings ...)
  2018-06-13 22:48 ` [MODERATED] [PATCH 7/8] L1TFv8 1 Andi Kleen
@ 2018-06-13 22:48 ` Andi Kleen
       [not found] ` <20180614150632.E064C61183@crypto-ml.lab.linutronix.de>
       [not found] ` <20180613225434.1CDC8610FD@crypto-ml.lab.linutronix.de>
  9 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-13 22:48 UTC (permalink / raw)
  To: speck

For the L1TF workaround we want to limit the swap file size to below
MAX_PA/2, so that the inverted higher bits of the swap offset never
point to valid memory.

Add a way for the architecture to override the swap file
size check in swapfile.c and add an x86 specific max swapfile check
function that enforces that limit.

The check is only enabled if the CPU is vulnerable to L1TF.

In VMs with 42bit MAX_PA the typical limit is 2TB now;
on a native system with 46bit PA it is 32TB. The limit
is only per individual swap file, so it's always possible
to exceed these limits with multiple swap files or
partitions.
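
A quick user-space check of those numbers (4 KiB pages assumed; the
cap is l1tf_pfn_limit() + 1 pages, i.e. 2^(MAX_PA - 1) bytes):

#include <stdio.h>

int main(void)
{
	int max_pa[] = { 42, 46 };

	for (int i = 0; i < 2; i++) {
		unsigned long long pages = 1ULL << (max_pa[i] - 1 - 12);

		printf("MAX_PA %d: %llu pages = %llu TiB of swap per file\n",
		       max_pa[i], pages, pages << 12 >> 40);
	}
	return 0;
}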

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-By: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>

---
v2: Use new helper for maxpa_mask computation.
v3: Use l1tf_pfn_limit (Thomas)
Reformat comment
v4: Use boot_cpu_has_bug
v5: Move l1tf_pfn_limit to earlier patch
---
 arch/x86/mm/init.c       | 15 +++++++++++++++
 include/linux/swapfile.h |  2 ++
 mm/swapfile.c            | 46 ++++++++++++++++++++++++++++++----------------
 3 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index fec82b577c18..0cd3a534b7eb 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -4,6 +4,8 @@
 #include <linux/swap.h>
 #include <linux/memblock.h>
 #include <linux/bootmem.h>	/* for max_low_pfn */
+#include <linux/swapfile.h>
+#include <linux/swapops.h>
 
 #include <asm/set_memory.h>
 #include <asm/e820/api.h>
@@ -878,3 +880,16 @@ void update_cache_mode_entry(unsigned entry, enum page_cache_mode cache)
 	__cachemode2pte_tbl[cache] = __cm_idx2pte(entry);
 	__pte2cachemode_tbl[entry] = cache;
 }
+
+unsigned long max_swapfile_size(void)
+{
+	unsigned long pages;
+
+	pages = generic_max_swapfile_size();
+
+	if (boot_cpu_has_bug(X86_BUG_L1TF)) {
+		/* Limit the swap file size to MAX_PA/2 for L1TF workaround */
+		pages = min_t(unsigned long, l1tf_pfn_limit() + 1, pages);
+	}
+	return pages;
+}
diff --git a/include/linux/swapfile.h b/include/linux/swapfile.h
index 06bd7b096167..e06febf62978 100644
--- a/include/linux/swapfile.h
+++ b/include/linux/swapfile.h
@@ -10,5 +10,7 @@ extern spinlock_t swap_lock;
 extern struct plist_head swap_active_head;
 extern struct swap_info_struct *swap_info[];
 extern int try_to_unuse(unsigned int, bool, unsigned long);
+extern unsigned long generic_max_swapfile_size(void);
+extern unsigned long max_swapfile_size(void);
 
 #endif /* _LINUX_SWAPFILE_H */
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 78a015fcec3b..6ac2757d5997 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2909,6 +2909,35 @@ static int claim_swapfile(struct swap_info_struct *p, struct inode *inode)
 	return 0;
 }
 
+
+/*
+ * Find out how many pages are allowed for a single swap device. There
+ * are two limiting factors:
+ * 1) the number of bits for the swap offset in the swp_entry_t type, and
+ * 2) the number of bits in the swap pte, as defined by the different
+ * architectures.
+ *
+ * In order to find the largest possible bit mask, a swap entry with
+ * swap type 0 and swap offset ~0UL is created, encoded to a swap pte,
+ * decoded to a swp_entry_t again, and finally the swap offset is
+ * extracted.
+ *
+ * This will mask all the bits from the initial ~0UL mask that can't
+ * be encoded in either the swp_entry_t or the architecture definition
+ * of a swap pte.
+ */
+unsigned long generic_max_swapfile_size(void)
+{
+	return swp_offset(pte_to_swp_entry(
+			swp_entry_to_pte(swp_entry(0, ~0UL)))) + 1;
+}
+
+/* Can be overridden by an architecture for additional checks. */
+__weak unsigned long max_swapfile_size(void)
+{
+	return generic_max_swapfile_size();
+}
+
 static unsigned long read_swap_header(struct swap_info_struct *p,
 					union swap_header *swap_header,
 					struct inode *inode)
@@ -2944,22 +2973,7 @@ static unsigned long read_swap_header(struct swap_info_struct *p,
 	p->cluster_next = 1;
 	p->cluster_nr = 0;
 
-	/*
-	 * Find out how many pages are allowed for a single swap
-	 * device. There are two limiting factors: 1) the number
-	 * of bits for the swap offset in the swp_entry_t type, and
-	 * 2) the number of bits in the swap pte as defined by the
-	 * different architectures. In order to find the
-	 * largest possible bit mask, a swap entry with swap type 0
-	 * and swap offset ~0UL is created, encoded to a swap pte,
-	 * decoded to a swp_entry_t again, and finally the swap
-	 * offset is extracted. This will mask all the bits from
-	 * the initial ~0UL mask that can't be encoded in either
-	 * the swp_entry_t or the architecture definition of a
-	 * swap pte.
-	 */
-	maxpages = swp_offset(pte_to_swp_entry(
-			swp_entry_to_pte(swp_entry(0, ~0UL)))) + 1;
+	maxpages = max_swapfile_size();
 	last_page = swap_header->info.last_page;
 	if (!last_page) {
 		pr_warn("Empty swap-file\n");
-- 
2.14.4


* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
       [not found] ` <20180614150632.E064C61183@crypto-ml.lab.linutronix.de>
@ 2018-06-21  9:02   ` Vlastimil Babka
  2018-06-21 11:43     ` Vlastimil Babka
  0 siblings, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2018-06-21  9:02 UTC (permalink / raw)
  To: speck

On 06/14/2018 12:48 AM, speck for Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
> Subject:  x86/speculation/l1tf: Limit swap file size to MAX_PA/2
> 
> For the L1TF workaround we want to limit the swap file size to below
> MAX_PA/2, so that the higher bits of the swap offset inverted never
> point to valid memory.
> 
> Add a way for the architecture to override the swap file
> size check in swapfile.c and add a x86 specific max swapfile check
> function that enforces that limit.
> 
> The check is only enabled if the CPU is vulnerable to L1TF.
> 
> In VMs with 42bit MAX_PA the typical limit is 2TB now,
> on a native system with 46bit PA it is 32TB. The limit
> is only per individual swap file, so it's always possible
> to exceed these limits with multiple swap files or
> partitions.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Acked-By: Dave Hansen <dave.hansen@intel.com>
> Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
> 
> ---
> v2: Use new helper for maxpa_mask computation.
> v3: Use l1tf_pfn_limit (Thomas)
> Reformat comment
> v4: Use boot_cpu_has_bug
> v5: Move l1tf_pfn_limit to earlier patch
> ---
>  arch/x86/mm/init.c       | 15 +++++++++++++++
>  include/linux/swapfile.h |  2 ++
>  mm/swapfile.c            | 46 ++++++++++++++++++++++++++++++----------------
>  3 files changed, 47 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index fec82b577c18..0cd3a534b7eb 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -4,6 +4,8 @@
>  #include <linux/swap.h>
>  #include <linux/memblock.h>
>  #include <linux/bootmem.h>	/* for max_low_pfn */
> +#include <linux/swapfile.h>
> +#include <linux/swapops.h>
>  
>  #include <asm/set_memory.h>
>  #include <asm/e820/api.h>
> @@ -878,3 +880,16 @@ void update_cache_mode_entry(unsigned entry, enum page_cache_mode cache)
>  	__cachemode2pte_tbl[cache] = __cm_idx2pte(entry);
>  	__pte2cachemode_tbl[entry] = cache;
>  }
> +
> +unsigned long max_swapfile_size(void)
> +{
> +	unsigned long pages;
> +
> +	pages = generic_max_swapfile_size();
> +
> +	if (boot_cpu_has_bug(X86_BUG_L1TF)) {
> +		/* Limit the swap file size to MAX_PA/2 for L1TF workaround */
> +		pages = min_t(unsigned long, l1tf_pfn_limit() + 1, pages);

Is this actually correct? IIUC l1tf_pfn_limit() is in page granularity,
and pages are encoded in bits 12 to $LIMIT, but we have swap offsets in
bits 9 to $LIMIT (after patch 2/8), i.e. 3 bits more? Same for the
limits described in the changelog?


* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-21  9:02   ` [MODERATED] " Vlastimil Babka
@ 2018-06-21 11:43     ` Vlastimil Babka
  2018-06-21 13:17       ` Vlastimil Babka
  2018-06-22 15:46       ` Vlastimil Babka
  0 siblings, 2 replies; 26+ messages in thread
From: Vlastimil Babka @ 2018-06-21 11:43 UTC (permalink / raw)
  To: speck

On 06/21/2018 11:02 AM, speck for Vlastimil Babka wrote:
> On 06/14/2018 12:48 AM, speck for Andi Kleen wrote:
>> +unsigned long max_swapfile_size(void)
>> +{
>> +	unsigned long pages;
>> +
>> +	pages = generic_max_swapfile_size();
>> +
>> +	if (boot_cpu_has_bug(X86_BUG_L1TF)) {
>> +		/* Limit the swap file size to MAX_PA/2 for L1TF workaround */
>> +		pages = min_t(unsigned long, l1tf_pfn_limit() + 1, pages);
> 
> Is this actually correct? IIUC l1tf_pfn_limit() is in page granularity,
> which are encoded in bits 12 to $LIMIT., but we have swap offsets in
> bits 9 to $LIMIT (after patch 2/8), i.e. 3 bits more? Same for the
> limits described in the changelog?

Yeah, I was able to verify this with some printk's, constructing a pte
with the max allowed offset and printing it. In a VM with 42bit limits, the
pte is 7ffffc000000000, so the unusable bits start at bit 38, not 41.

Also, after more digging into this, I suspect that the PAE case is
currently not mitigated. The pgtable-3level.h macros don't seem to flip
the bits. Also swap entries there use only the high pte word, whereas
most of the safe-to-use bits are in the low word.


* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-21 11:43     ` Vlastimil Babka
@ 2018-06-21 13:17       ` Vlastimil Babka
  2018-06-21 14:38         ` Michal Hocko
                           ` (2 more replies)
  2018-06-22 15:46       ` Vlastimil Babka
  1 sibling, 3 replies; 26+ messages in thread
From: Vlastimil Babka @ 2018-06-21 13:17 UTC (permalink / raw)
  To: speck

On 06/21/2018 01:43 PM, speck for Vlastimil Babka wrote:
> On 06/21/2018 11:02 AM, speck for Vlastimil Babka wrote:
>> On 06/14/2018 12:48 AM, speck for Andi Kleen wrote:
>>> +unsigned long max_swapfile_size(void)
>>> +{
>>> +	unsigned long pages;
>>> +
>>> +	pages = generic_max_swapfile_size();
>>> +
>>> +	if (boot_cpu_has_bug(X86_BUG_L1TF)) {
>>> +		/* Limit the swap file size to MAX_PA/2 for L1TF workaround */
>>> +		pages = min_t(unsigned long, l1tf_pfn_limit() + 1, pages);
>>
>> Is this actually correct? IIUC l1tf_pfn_limit() is in page granularity,
>> which are encoded in bits 12 to $LIMIT., but we have swap offsets in
>> bits 9 to $LIMIT (after patch 2/8), i.e. 3 bits more? Same for the
>> limits described in the changelog?
> 
> Yeah, I was able to verify this with some printk's, constructing a pte
> with max allowed offset and printing it. In VM with 42bit limits, the
> pte is 7ffffc000000000, so the unusable bits start with 38, not 41.

Here's a patch for the 64bit case. The testing pte is then 7fffe0000000000,
so all bits up to bit 40 are used.
Not sure what to do with the 32bit case, we'll probably have to start using
both words of the pte to avoid tiny offsets?

----8<----
From ca293624a2cc776d9d87edc9497dd406dbe460c0 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Thu, 21 Jun 2018 12:36:29 +0200
Subject: [PATCH] x86/speculation/l1tf: extend 64bit swap file size limit

The previous patch has limited swap file size so that large offsets cannot
clear bits above MAX_PA/2 in the pte and interfere with L1TF mitigation.

It assumed that offsets are encoded starting with bit 12, same as pfn. But
on x86_64, we encode offsets starting with bit 9. We can thus raise the limit
by 3 bits. That means 16TB with 42bit MAX_PA and 256TB with 46bit MAX_PA.
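
A quick check of the raised limits (a sketch assuming 4 KiB pages and
the inverted encoding from patch 3/8; the type-0, maximum-offset
construction of the test pte is a guess, but it reproduces the value
quoted above):

#include <stdio.h>
#include <inttypes.h>

#define SWP_TYPE_BITS		5
#define SWP_OFFSET_FIRST_BIT	9
#define SWP_OFFSET_SHIFT	(SWP_OFFSET_FIRST_BIT + SWP_TYPE_BITS)

/* inverted encoding, as in __swp_entry() after patch 3/8 */
static uint64_t swp_entry(uint64_t type, uint64_t offset)
{
	return (~offset << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) |
	       (type << (64 - SWP_TYPE_BITS));
}

int main(void)
{
	int max_pa[] = { 42, 46 };

	for (int i = 0; i < 2; i++) {
		/* 3 extra bits because offsets start at bit 9, not bit 12 */
		unsigned long long pages = 1ULL << (max_pa[i] - 1 - 12 + 3);

		printf("MAX_PA %d: %llu TiB per swap file\n",
		       max_pa[i], (pages << 12) >> 40);
	}

	/* the largest allowed offset with MAX_PA=42 gives the test pte above */
	printf("test pte: %" PRIx64 "\n", swp_entry(0, (1ULL << 32) - 1));
	return 0;
}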

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 arch/x86/mm/init.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 0cd3a534b7eb..f1e03047de90 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -889,7 +889,15 @@ unsigned long max_swapfile_size(void)
 
 	if (boot_cpu_has_bug(X86_BUG_L1TF)) {
 		/* Limit the swap file size to MAX_PA/2 for L1TF workaround */
-		pages = min_t(unsigned long, l1tf_pfn_limit() + 1, pages);
+		unsigned long l1tf_limit = l1tf_pfn_limit() + 1;
+		/*
+		 * We encode swap offsets also with 3 bits below those for pfn
+		 * which makes the usable limit higher.
+		 */
+#ifdef CONFIG_X86_64
+		l1tf_limit <<= PAGE_SHIFT - SWP_OFFSET_FIRST_BIT;
+#endif
+		pages = min_t(unsigned long, l1tf_limit, pages);
 	}
 	return pages;
 }
-- 
2.17.1


* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-21 13:17       ` Vlastimil Babka
@ 2018-06-21 14:38         ` Michal Hocko
  2018-06-21 14:38         ` Thomas Gleixner
  2018-06-21 20:32         ` [MODERATED] " Andi Kleen
  2 siblings, 0 replies; 26+ messages in thread
From: Michal Hocko @ 2018-06-21 14:38 UTC (permalink / raw)
  To: speck

On Thu 21-06-18 15:17:56, speck for Vlastimil Babka wrote:
[...]
> >From ca293624a2cc776d9d87edc9497dd406dbe460c0 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Thu, 21 Jun 2018 12:36:29 +0200
> Subject: [PATCH] x86/speculation/l1tf: extend 64bit swap file size limit
> 
> The previous patch has limited swap file size so that large offsets cannot
> clear bits above MAX_PA/2 in the pte and interfere with L1TF mitigation.
> 
> It assumed that offsets are encoded starting with bit 12, same as pfn. But
> on x86_64, we encode offsets starting with bit 9. We can thus raise the limit
> by 3 bits. That means 16TB with 42bit MAX_PA and 256TB with 46bit MAX_PA.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Yeah, you are right. I missed that previously. Not that this would be
really critical; who would want to use that insane amount of swap space
anyway?

Acked-by: Michal Hocko <mhocko@suse.com>
> ---
>  arch/x86/mm/init.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index 0cd3a534b7eb..f1e03047de90 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -889,7 +889,15 @@ unsigned long max_swapfile_size(void)
>  
>  	if (boot_cpu_has_bug(X86_BUG_L1TF)) {
>  		/* Limit the swap file size to MAX_PA/2 for L1TF workaround */
> -		pages = min_t(unsigned long, l1tf_pfn_limit() + 1, pages);
> +		unsigned long l1tf_limit = l1tf_pfn_limit() + 1;
> +		/*
> +		 * We encode swap offsets also with 3 bits below those for pfn
> +		 * which makes the usable limit higher.
> +		 */
> +#ifdef CONFIG_X86_64
> +		l1tf_limit <<= PAGE_SHIFT - SWP_OFFSET_FIRST_BIT;
> +#endif
> +		pages = min_t(unsigned long, l1tf_limit, pages);
>  	}
>  	return pages;
>  }
> -- 
> 2.17.1
> 
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 8/8] L1TFv8 6
  2018-06-21 13:17       ` Vlastimil Babka
  2018-06-21 14:38         ` Michal Hocko
@ 2018-06-21 14:38         ` Thomas Gleixner
  2018-06-21 20:32         ` [MODERATED] " Andi Kleen
  2 siblings, 0 replies; 26+ messages in thread
From: Thomas Gleixner @ 2018-06-21 14:38 UTC (permalink / raw)
  To: speck

On Thu, 21 Jun 2018, speck for Vlastimil Babka wrote:
> ----8<----
> >From ca293624a2cc776d9d87edc9497dd406dbe460c0 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Thu, 21 Jun 2018 12:36:29 +0200
> Subject: [PATCH] x86/speculation/l1tf: extend 64bit swap file size limit
> 
> The previous patch has limited swap file size so that large offsets cannot
> clear bits above MAX_PA/2 in the pte and interfere with L1TF mitigation.
> 
> It assumed that offsets are encoded starting with bit 12, same as pfn. But
> on x86_64, we encode offsets starting with bit 9. We can thus raise the limit
> by 3 bits. That means 16TB with 42bit MAX_PA and 256TB with 46bit MAX_PA.

Applied.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-21 13:17       ` Vlastimil Babka
  2018-06-21 14:38         ` Michal Hocko
  2018-06-21 14:38         ` Thomas Gleixner
@ 2018-06-21 20:32         ` Andi Kleen
  2 siblings, 0 replies; 26+ messages in thread
From: Andi Kleen @ 2018-06-21 20:32 UTC (permalink / raw)
  To: speck

> ----8<----
> >From ca293624a2cc776d9d87edc9497dd406dbe460c0 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Thu, 21 Jun 2018 12:36:29 +0200
> Subject: [PATCH] x86/speculation/l1tf: extend 64bit swap file size limit
> 
> The previous patch has limited swap file size so that large offsets cannot
> clear bits above MAX_PA/2 in the pte and interfere with L1TF mitigation.
> 
> It assumed that offsets are encoded starting with bit 12, same as pfn. But
> on x86_64, we encode offsets starting with bit 9. We can thus raise the limit
> by 3 bits. That means 16TB with 42bit MAX_PA and 256TB with 46bit MAX_PA.

Thanks, looks good.

Acked-by: Andi Kleen <ak@linux.intel.com>

-Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-21 11:43     ` Vlastimil Babka
  2018-06-21 13:17       ` Vlastimil Babka
@ 2018-06-22 15:46       ` Vlastimil Babka
  2018-06-22 16:56         ` Andi Kleen
  1 sibling, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2018-06-22 15:46 UTC (permalink / raw)
  To: speck

On 06/21/2018 01:43 PM, speck for Vlastimil Babka wrote:
> On 06/21/2018 11:02 AM, speck for Vlastimil Babka wrote:
>> On 06/14/2018 12:48 AM, speck for Andi Kleen wrote:
>>> +unsigned long max_swapfile_size(void)
>>> +{
>>> +	unsigned long pages;
>>> +
>>> +	pages = generic_max_swapfile_size();
>>> +
>>> +	if (boot_cpu_has_bug(X86_BUG_L1TF)) {
>>> +		/* Limit the swap file size to MAX_PA/2 for L1TF workaround */
>>> +		pages = min_t(unsigned long, l1tf_pfn_limit() + 1, pages);
>>
>> Is this actually correct? IIUC l1tf_pfn_limit() is in page granularity,
>> which are encoded in bits 12 to $LIMIT, but we have swap offsets in
>> bits 9 to $LIMIT (after patch 2/8), i.e. 3 bits more? Same for the
>> limits described in the changelog?
> 
> Yeah, I was able to verify this with some printk's, constructing a pte
> with max allowed offset and printing it. In VM with 42bit limits, the
> pte is 7ffffc000000000, so the unusable bits start with 38, not 41.
> 
> Also after more digging into this, I also suspect that the PAE case is
> currently not mitigating. The pgtable-3level.h macros don't seem to flip
> the bits. Also swap entries there use only the high pte word, whereas
> most of the safe to use bits are in the low word.

I've been trying to fix the PAE case and here's the current result. Note
that it's only compile tested, so it's just an RFC and testing is welcome. I
changed the swap entry format to mimic the 64bit one, as neither 32bit
word has enough "safe" bits to avoid limiting the swap size to a few GB.

Because the macro machinery doesn't expect the arch-dependent swap entry
format to be 32bit and the pte to be 64bit, the result is even more macros,
sorry about that.
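
For reference, here is a minimal userspace model of the new encoding (not
part of the patch; it assumes _PAGE_BIT_PROTNONE == 8, i.e.
SWP_OFFSET_FIRST_BIT == 9, and just re-implements the arithmetic of the
macros below) to show that the round trip works and that the offset bits
end up inverted in the pte:

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t pteval_t;

#define SWP_TYPE_BITS		5
#define SWP_OFFSET_FIRST_BIT	9	/* _PAGE_BIT_PROTNONE + 1, assumed */
#define SWP_OFFSET_SHIFT	(SWP_OFFSET_FIRST_BIT + SWP_TYPE_BITS)

/* same arithmetic as __swp_pteval_entry() in the patch */
static pteval_t swp_to_pte(unsigned long type, unsigned long long offset)
{
	return (~(pteval_t)offset << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) |
	       ((pteval_t)type << (64 - SWP_TYPE_BITS));
}

/* same arithmetic as __pteval_swp_type() / __pteval_swp_offset() */
static unsigned long pte_to_type(pteval_t pte)
{
	return pte >> (64 - SWP_TYPE_BITS);
}

static unsigned long long pte_to_offset(pteval_t pte)
{
	return ~pte << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT;
}

int main(void)
{
	pteval_t pte = swp_to_pte(3, 0x12345);

	/* the offset is stored inverted, so small offsets leave the high
	 * physical address bits set, pointing outside valid memory */
	printf("pte = %016llx\n", (unsigned long long)pte);
	assert(pte_to_type(pte) == 3);
	assert(pte_to_offset(pte) == 0x12345);
	return 0;
}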

-----8<-----
From 6f8c1176e99fbf56dc8a29a4d279a5770e45fd4f Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Fri, 22 Jun 2018 17:39:33 +0200
Subject: [PATCH] adjust PAE swap encoding for l1tf

---
 arch/x86/include/asm/pgtable-3level.h | 35 +++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
index 76ab26a99e6e..a1d9ab21f8ea 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -241,12 +241,43 @@ static inline pud_t native_pudp_get_and_clear(pud_t *pudp)
 #endif
 
 /* Encode and de-code a swap entry */
+#define SWP_TYPE_BITS		5
+
+#define SWP_OFFSET_FIRST_BIT	(_PAGE_BIT_PROTNONE + 1)
+
+/* We always extract/encode the offset by shifting it all the way up, and then down again */
+#define SWP_OFFSET_SHIFT	(SWP_OFFSET_FIRST_BIT+SWP_TYPE_BITS)
+
 #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5)
 #define __swp_type(x)			(((x).val) & 0x1f)
 #define __swp_offset(x)			((x).val >> 5)
 #define __swp_entry(type, offset)	((swp_entry_t){(type) | (offset) << 5})
-#define __pte_to_swp_entry(pte)		((swp_entry_t){ (pte).pte_high })
-#define __swp_entry_to_pte(x)		((pte_t){ { .pte_high = (x).val } })
+
+/*
+ * Normally, __swp_entry() converts from arch-independent swp_entry_t to
+ * arch-dependent swp_entry_t, and __swp_entry_to_pte() just stores the result
+ * to pte. But here we have 32bit swp_entry_t and 64bit pte, and need to use the
+ * whole 64 bits. Thus, we shift the "real" arch-dependent conversion to
+ * __swp_entry_to_pte() through the following helper macro based on 64bit
+ * __swp_entry().
+ */
+#define __swp_pteval_entry(type, offset) ((pteval_t) { \
+	(~(pteval_t)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
+	| ((pteval_t)(type) << (64-SWP_TYPE_BITS)) })
+
+#define __swp_entry_to_pte(x)	((pte_t){ .pte = \
+		__swp_pteval_entry(__swp_type(x), __swp_offset(x)) })
+/*
+ * Analogically, __pte_to_swp_entry() doesn't just extract the arch-dependent
+ * swp_entry_t, but also has to convert it from 64bit to the 32bit
+ * intermediate representation, using the following macros based on 64bit
+ * __swp_type() and __swp_offset().
+ */
+#define __pteval_swp_type(x) ((unsigned long)((x).pte >> (64 - SWP_TYPE_BITS)))
+#define __pteval_swp_offset(x) ((unsigned long)(~((x).pte) << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT))
+
+#define __pte_to_swp_entry(pte)	(__swp_entry(__pteval_swp_type(pte), \
+					     __pteval_swp_offset(pte)))
 
 #define gup_get_pte gup_get_pte
 /*
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-22 15:46       ` Vlastimil Babka
@ 2018-06-22 16:56         ` Andi Kleen
  2018-06-25  7:04           ` Vlastimil Babka
  0 siblings, 1 reply; 26+ messages in thread
From: Andi Kleen @ 2018-06-22 16:56 UTC (permalink / raw)
  To: speck

> Because the macro machinery doesn't expect the arch-dependent swap entry
> format to be 32bit and pte to be 64bit, the result is even more macros,
> sorry about that.

Seems ugly and complicated. Perhaps it's better to just sacrifice the three bits.
Doubt anyone will really need it anyways, especially not on 32bit systems.

-Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-22 16:56         ` Andi Kleen
@ 2018-06-25  7:04           ` Vlastimil Babka
  2018-06-25 20:31             ` Andi Kleen
  0 siblings, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2018-06-25  7:04 UTC (permalink / raw)
  To: speck

On 06/22/2018 06:56 PM, speck for Andi Kleen wrote:
>> Because the macro machinery doesn't expect the arch-dependent swap entry
>> format to be 32bit and pte to be 64bit, the result is even more macros,
>> sorry about that.
> 
> Seems ugly and complicated. Perhaps it's better to just sacrifice the three bits.
> Doubt anyone will really need it anyways, especially not on 32bit systems.

What three bits? You seem to be confusing this with my previous fix for
64bit max swap size, but this is something quite different.

Before this patch, the PAE code did not flip the offset bits and was using
the high pte word. That means bits 32-36 for type and 37-63 for offset.
The lower word was zeroed, thus systems with 4GB or less memory should be
safe; for 4GB to 128GB the swap type controls the "vulnerable" memory
locations, and above that the offset does too. Is it correct that the 32bit
PAE HW phys limit is 64GB, but that in a virtualized 32bit-pae guest on
64bit HW (which you were concerned about) that limit doesn't apply?

Now if we put the swap entry into the lower word starting with bit 9 (like
64bit), with 5 type bits we have 18 bits left for the swap offset. That's
just 1GB. In the high word we have bits to avoid for L1TF (40 to 51
at least?) so that's even worse. IMHO we have to use the whole 64bit
entry then, which is what the patch does.
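
Spelling out the arithmetic for the lower-word option: 32 - 9 (first
offset bit) - 5 (type bits) = 18 offset bits, and 2^18 pages * 4KB/page
= 1GB of swap.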

Vlastimil

> -Andi
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-25  7:04           ` Vlastimil Babka
@ 2018-06-25 20:31             ` Andi Kleen
  2018-06-26 12:01               ` Vlastimil Babka
  0 siblings, 1 reply; 26+ messages in thread
From: Andi Kleen @ 2018-06-25 20:31 UTC (permalink / raw)
  To: speck

On Mon, Jun 25, 2018 at 09:04:34AM +0200, speck for Vlastimil Babka wrote:
> On 06/22/2018 06:56 PM, speck for Andi Kleen wrote:
> >> Because the macro machinery doesn't expect the arch-dependent swap entry
> >> format to be 32bit and pte to be 64bit, the result is even more macros,
> >> sorry about that.
> > 
> > Seems ugly and complicated. Perhaps it's better to just sacrifice the three bits.
> > Doubt anyone will really need it anyways, especially not on 32bit systems.
> 
> What three bits? You seem to be confusing this with my previous fix for
> 64bit max swap size, but this is something quite different.

You're right.

> 
> Before this patch, PAE code did not flip the offset bits, and was using
> the high pte word. That means bits 32-36 for type, 37-63 for offset.
> Lower word was zeroed, thus systems with 4GB or less memory should be
> safe, for 4GB to 128GB the swap type controls the "vulnerable" memory
> locations, above that also the offset. Is it correct that 32bit PAE HW
> phys limit is 64GB, but in the virtualized 32bit-pae guest on 64bit HW
> (that you were concerned about) that limit doesn't apply?

AFAIK it never applies on modern systems.

> 
> Now if we put the swap entry to the lower word starting with bit 9 (like
> 64bit), with 5 bits type we have 18 bits left for swap offset. That's
> just one 1GB. In the high word we have bits to avoid for L1TF (40 to 51
> at least?) so that's even worse. IMHO we have to use the whole 64bit
> entry then, which is what the patch does.

Ok.

-Andi

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-25 20:31             ` Andi Kleen
@ 2018-06-26 12:01               ` Vlastimil Babka
  2018-06-26 12:57                 ` Michal Hocko
  2018-06-27  9:14                 ` Thomas Gleixner
  0 siblings, 2 replies; 26+ messages in thread
From: Vlastimil Babka @ 2018-06-26 12:01 UTC (permalink / raw)
  To: speck

On 06/25/2018 10:31 PM, speck for Andi Kleen wrote:
> On Mon, Jun 25, 2018 at 09:04:34AM +0200, speck for Vlastimil Babka wrote:
>>> Seems ugly and complicated. Perhaps it's better to just sacrifice the three bits.
>>> Doubt anyone will really need it anyways, especially not on 32bit systems.
>>
>> What three bits? You seem to be confusing this with my previous fix for
>> 64bit max swap size, but this is something quite different.
> 
> You're right.
> 
>>
>> Before this patch, PAE code did not flip the offset bits, and was using
>> the high pte word. That means bits 32-36 for type, 37-63 for offset.
>> Lower word was zeroed, thus systems with 4GB or less memory should be
>> safe, for 4GB to 128GB the swap type controls the "vulnerable" memory
>> locations, above that also the offset. Is it correct that 32bit PAE HW
>> phys limit is 64GB, but in the virtualized 32bit-pae guest on 64bit HW
>> (that you were concerned about) that limit doesn't apply?
> 
> AFAIK it never applies on modern systems.
> 
>>
>> Now if we put the swap entry to the lower word starting with bit 9 (like
>> 64bit), with 5 bits type we have 18 bits left for swap offset. That's
>> just one 1GB. In the high word we have bits to avoid for L1TF (40 to 51
>> at least?) so that's even worse. IMHO we have to use the whole 64bit
>> entry then, which is what the patch does.
> 
> Ok.

Thanks. Here's an updated patch with a changelog, and it has also been tested.

----8<----
From 94b19f2277984594eda826a315cb49d6be5375b5 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka@suse.cz>
Date: Fri, 22 Jun 2018 17:39:33 +0200
Subject: [PATCH] x86/speculation/l1tf: protect PAE swap entries against L1TF

The PAE 3-level paging code currently doesn't mitigate L1TF by flipping the
offset bits, and uses the high PTE word, thus bits 32-36 for type, 37-63 for
offset. The lower word is zeroed, thus systems with less than 4GB memory are
safe. With 4GB to 128GB the swap type selects the memory locations vulnerable
to L1TF; with even more memory, the swap offset also influences the address.
This might be a problem with 32bit PAE guests running on large 64bit hosts.

By continuing to keep the whole swap entry in either high or low 32bit word of
PTE we would limit the swap size too much. Thus this patch uses the whole PAE
PTE with the same layout as the 64bit version does. The macros just become a
bit tricky since they assume the arch-dependent swp_entry_t to be 32bit.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
 arch/x86/include/asm/pgtable-3level.h | 35 +++++++++++++++++++++++++--
 arch/x86/mm/init.c                    |  2 +-
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
index 76ab26a99e6e..a1d9ab21f8ea 100644
--- a/arch/x86/include/asm/pgtable-3level.h
+++ b/arch/x86/include/asm/pgtable-3level.h
@@ -241,12 +241,43 @@ static inline pud_t native_pudp_get_and_clear(pud_t *pudp)
 #endif
 
 /* Encode and de-code a swap entry */
+#define SWP_TYPE_BITS		5
+
+#define SWP_OFFSET_FIRST_BIT	(_PAGE_BIT_PROTNONE + 1)
+
+/* We always extract/encode the offset by shifting it all the way up, and then down again */
+#define SWP_OFFSET_SHIFT	(SWP_OFFSET_FIRST_BIT+SWP_TYPE_BITS)
+
 #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5)
 #define __swp_type(x)			(((x).val) & 0x1f)
 #define __swp_offset(x)			((x).val >> 5)
 #define __swp_entry(type, offset)	((swp_entry_t){(type) | (offset) << 5})
-#define __pte_to_swp_entry(pte)		((swp_entry_t){ (pte).pte_high })
-#define __swp_entry_to_pte(x)		((pte_t){ { .pte_high = (x).val } })
+
+/*
+ * Normally, __swp_entry() converts from arch-independent swp_entry_t to
+ * arch-dependent swp_entry_t, and __swp_entry_to_pte() just stores the result
+ * to pte. But here we have 32bit swp_entry_t and 64bit pte, and need to use the
+ * whole 64 bits. Thus, we shift the "real" arch-dependent conversion to
+ * __swp_entry_to_pte() through the following helper macro based on 64bit
+ * __swp_entry().
+ */
+#define __swp_pteval_entry(type, offset) ((pteval_t) { \
+	(~(pteval_t)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
+	| ((pteval_t)(type) << (64-SWP_TYPE_BITS)) })
+
+#define __swp_entry_to_pte(x)	((pte_t){ .pte = \
+		__swp_pteval_entry(__swp_type(x), __swp_offset(x)) })
+/*
+ * Analogically, __pte_to_swp_entry() doesn't just extract the arch-dependent
+ * swp_entry_t, but also has to convert it from 64bit to the 32bit
+ * intermediate representation, using the following macros based on 64bit
+ * __swp_type() and __swp_offset().
+ */
+#define __pteval_swp_type(x) ((unsigned long)((x).pte >> (64 - SWP_TYPE_BITS)))
+#define __pteval_swp_offset(x) ((unsigned long)(~((x).pte) << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT))
+
+#define __pte_to_swp_entry(pte)	(__swp_entry(__pteval_swp_type(pte), \
+					     __pteval_swp_offset(pte)))
 
 #define gup_get_pte gup_get_pte
 /*
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index c0870df32b2d..862191ed3d6e 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -896,7 +896,7 @@ unsigned long max_swapfile_size(void)
 		 * We encode swap offsets also with 3 bits below those for pfn
 		 * which makes the usable limit higher.
 		 */
-#ifdef CONFIG_X86_64
+#if CONFIG_PGTABLE_LEVELS > 2
 		l1tf_limit <<= PAGE_SHIFT - SWP_OFFSET_FIRST_BIT;
 #endif
 		pages = min_t(unsigned long, l1tf_limit, pages);
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-26 12:01               ` Vlastimil Babka
@ 2018-06-26 12:57                 ` Michal Hocko
  2018-06-26 13:05                   ` Michal Hocko
  2018-06-27  9:14                 ` Thomas Gleixner
  1 sibling, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2018-06-26 12:57 UTC (permalink / raw)
  To: speck

On Tue 26-06-18 14:01:18, speck for Vlastimil Babka wrote:
> On 06/25/2018 10:31 PM, speck for Andi Kleen wrote:
[...]
> >From 94b19f2277984594eda826a315cb49d6be5375b5 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Fri, 22 Jun 2018 17:39:33 +0200
> Subject: [PATCH] x86/speculation/l1tf: protect PAE swap entries against L1TF
> 
> The PAE 3-level paging code currently doesn't mitigate L1TF by flipping the
> offset bits, and uses the high PTE word, thus bits 32-36 for type, 37-63 for
> offset. The lower word is zeroed, thus systems with less than 4GB memory are
> safe. With 4GB to 128GB the swap type selects the memory locations vulnerable
> to L1TF; with even more memory, also the swap offset influences the address.
> This might be a problem with 32bit PAE guests running on large 64bit hosts.
> 
> By continuing to keep the whole swap entry in either high or low 32bit word of
> PTE we would limit the swap size too much. Thus this patch uses the whole PAE
> PTE with the same layout as the 64bit version does. The macros just become a
> bit tricky since they assume the arch-dependent swp_entry_t to be 32bit.

I had expected this to be even uglier, but it seems quite sane in the
end.

> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  arch/x86/include/asm/pgtable-3level.h | 35 +++++++++++++++++++++++++--
>  arch/x86/mm/init.c                    |  2 +-
>  2 files changed, 34 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h
> index 76ab26a99e6e..a1d9ab21f8ea 100644
> --- a/arch/x86/include/asm/pgtable-3level.h
> +++ b/arch/x86/include/asm/pgtable-3level.h
> @@ -241,12 +241,43 @@ static inline pud_t native_pudp_get_and_clear(pud_t *pudp)
>  #endif
>  
>  /* Encode and de-code a swap entry */
> +#define SWP_TYPE_BITS		5
> +
> +#define SWP_OFFSET_FIRST_BIT	(_PAGE_BIT_PROTNONE + 1)
> +
> +/* We always extract/encode the offset by shifting it all the way up, and then down again */
> +#define SWP_OFFSET_SHIFT	(SWP_OFFSET_FIRST_BIT+SWP_TYPE_BITS)
> +
>  #define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > 5)
>  #define __swp_type(x)			(((x).val) & 0x1f)
>  #define __swp_offset(x)			((x).val >> 5)
>  #define __swp_entry(type, offset)	((swp_entry_t){(type) | (offset) << 5})
> -#define __pte_to_swp_entry(pte)		((swp_entry_t){ (pte).pte_high })
> -#define __swp_entry_to_pte(x)		((pte_t){ { .pte_high = (x).val } })
> +
> +/*
> + * Normally, __swp_entry() converts from arch-independent swp_entry_t to
> + * arch-dependent swp_entry_t, and __swp_entry_to_pte() just stores the result
> + * to pte. But here we have 32bit swp_entry_t and 64bit pte, and need to use the
> + * whole 64 bits. Thus, we shift the "real" arch-dependent conversion to
> + * __swp_entry_to_pte() through the following helper macro based on 64bit
> + * __swp_entry().
> + */
> +#define __swp_pteval_entry(type, offset) ((pteval_t) { \
> +	(~(pteval_t)(offset) << SWP_OFFSET_SHIFT >> SWP_TYPE_BITS) \
> +	| ((pteval_t)(type) << (64-SWP_TYPE_BITS)) })
> +
> +#define __swp_entry_to_pte(x)	((pte_t){ .pte = \
> +		__swp_pteval_entry(__swp_type(x), __swp_offset(x)) })
> +/*
> + * Analogically, __pte_to_swp_entry() doesn't just extract the arch-dependent
> + * swp_entry_t, but also has to convert it from 64bit to the 32bit
> + * intermediate representation, using the following macros based on 64bit
> + * __swp_type() and __swp_offset().
> + */
> +#define __pteval_swp_type(x) ((unsigned long)((x).pte >> (64 - SWP_TYPE_BITS)))
> +#define __pteval_swp_offset(x) ((unsigned long)(~((x).pte) << SWP_TYPE_BITS >> SWP_OFFSET_SHIFT))
> +
> +#define __pte_to_swp_entry(pte)	(__swp_entry(__pteval_swp_type(pte), \
> +					     __pteval_swp_offset(pte)))
>  
>  #define gup_get_pte gup_get_pte
>  /*
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index c0870df32b2d..862191ed3d6e 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -896,7 +896,7 @@ unsigned long max_swapfile_size(void)
>  		 * We encode swap offsets also with 3 bits below those for pfn
>  		 * which makes the usable limit higher.
>  		 */
> -#ifdef CONFIG_X86_64
> +#if CONFIG_PGTABLE_LEVELS > 2
>  		l1tf_limit <<= PAGE_SHIFT - SWP_OFFSET_FIRST_BIT;
>  #endif
>  		pages = min_t(unsigned long, l1tf_limit, pages);
> -- 
> 2.17.1
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 8/8] L1TFv8 6
  2018-06-26 12:57                 ` Michal Hocko
@ 2018-06-26 13:05                   ` Michal Hocko
  0 siblings, 0 replies; 26+ messages in thread
From: Michal Hocko @ 2018-06-26 13:05 UTC (permalink / raw)
  To: speck

On Tue 26-06-18 14:57:50, speck for Michal Hocko wrote:
> On Tue 26-06-18 14:01:18, speck for Vlastimil Babka wrote:
> > On 06/25/2018 10:31 PM, speck for Andi Kleen wrote:
> [...]
> > >From 94b19f2277984594eda826a315cb49d6be5375b5 Mon Sep 17 00:00:00 2001
> > From: Vlastimil Babka <vbabka@suse.cz>
> > Date: Fri, 22 Jun 2018 17:39:33 +0200
> > Subject: [PATCH] x86/speculation/l1tf: protect PAE swap entries against L1TF
> > 
> > The PAE 3-level paging code currently doesn't mitigate L1TF by flipping the
> > offset bits, and uses the high PTE word, thus bits 32-36 for type, 37-63 for
> > offset. The lower word is zeroed, thus systems with less than 4GB memory are
> > safe. With 4GB to 128GB the swap type selects the memory locations vulnerable
> > to L1TF; with even more memory, also the swap offset influences the address.
> > This might be a problem with 32bit PAE guests running on large 64bit hosts.
> > 
> > By continuing to keep the whole swap entry in either high or low 32bit word of
> > PTE we would limit the swap size too much. Thus this patch uses the whole PAE
> > PTE with the same layout as the 64bit version does. The macros just become a
> > bit tricky since they assume the arch-dependent swp_entry_t to be 32bit.
> 
> I have expected this to be even ugglier but it seems quite sane in the
> end.

And just for the record: I was worried that the 64b swap entry could
lead to RMW issues, because on 32b we do not do a single write to the u64,
but all the swap entry constructors I can see build the entry locally and
only make it visible later. Regular ptes handle that already; I was worried
that having swap entries be special could lead to some subtle issues, but I
cannot see any in the code.
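
To make that concrete, a toy userspace model of the pattern being
described (not the actual kernel helpers; names and the barrier are
stand-ins):

#include <stdint.h>
#include <stdio.h>

/* toy model of a PAE pte: a 64bit value stored as two 32bit halves */
typedef struct { uint32_t pte_low, pte_high; } pae_pte_t;

/* rough stand-in for the PAE native_set_pte(): two separate stores */
static void publish_pte(pae_pte_t *ptep, pae_pte_t pte)
{
	ptep->pte_high = pte.pte_high;
	__sync_synchronize();			/* stand-in for smp_wmb() */
	ptep->pte_low = pte.pte_low;
}

int main(void)
{
	pae_pte_t slot = { 0, 0 };

	/* the swap entry is assembled completely in a local pte first ... */
	uint64_t swp = 0x7fffe000000c3ULL;	/* made-up encoded swap entry */
	pae_pte_t swp_pte = { (uint32_t)swp, (uint32_t)(swp >> 32) };

	/* ... and only then published, so no reader ever sees a live pte
	 * being modified word by word (no in-place RMW) */
	publish_pte(&slot, swp_pte);

	printf("stored %08x%08x\n", slot.pte_high, slot.pte_low);
	return 0;
}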
 
> > Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> Thanks!
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 8/8] L1TFv8 6
  2018-06-26 12:01               ` Vlastimil Babka
  2018-06-26 12:57                 ` Michal Hocko
@ 2018-06-27  9:14                 ` Thomas Gleixner
  1 sibling, 0 replies; 26+ messages in thread
From: Thomas Gleixner @ 2018-06-27  9:14 UTC (permalink / raw)
  To: speck

On Tue, 26 Jun 2018, speck for Vlastimil Babka wrote:
> Thanks. Here's an updated patch with changelog, and it has been also tested.
> 
> ----8<----
> >From 94b19f2277984594eda826a315cb49d6be5375b5 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@suse.cz>
> Date: Fri, 22 Jun 2018 17:39:33 +0200
> Subject: [PATCH] x86/speculation/l1tf: protect PAE swap entries against L1TF
> 
> The PAE 3-level paging code currently doesn't mitigate L1TF by flipping the
> offset bits, and uses the high PTE word, thus bits 32-36 for type, 37-63 for
> offset. The lower word is zeroed, thus systems with less than 4GB memory are
> safe. With 4GB to 128GB the swap type selects the memory locations vulnerable
> to L1TF; with even more memory, also the swap offset influences the address.
> This might be a problem with 32bit PAE guests running on large 64bit hosts.
> 
> By continuing to keep the whole swap entry in either high or low 32bit word of
> PTE we would limit the swap size too much. Thus this patch uses the whole PAE
> PTE with the same layout as the 64bit version does. The macros just become a
> bit tricky since they assume the arch-dependent swp_entry_t to be 32bit.
> 
> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>

Applied and pushed out. Thanks Vlastimil!

	tglx

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [MODERATED] Re: x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation
       [not found] ` <20180613225434.1CDC8610FD@crypto-ml.lab.linutronix.de>
@ 2018-06-27 15:51   ` Michal Hocko
  2018-06-28  8:05     ` [MODERATED] Re: [PATCH 4/8] L1TFv8 8 Vlastimil Babka
  0 siblings, 1 reply; 26+ messages in thread
From: Michal Hocko @ 2018-06-27 15:51 UTC (permalink / raw)
  To: speck

On Wed 13-06-18 15:48:24, speck for Andi Kleen wrote:
[...]
>  static inline unsigned long pte_pfn(pte_t pte)
>  {
> -	return (pte_val(pte) & PTE_PFN_MASK) >> PAGE_SHIFT;
> +	unsigned long pfn = pte_val(pte);
> +	pfn ^= protnone_mask(pfn);
> +	return (pfn & PTE_PFN_MASK) >> PAGE_SHIFT;
>  }
>  
>  static inline unsigned long pmd_pfn(pmd_t pmd)
>  {
> -	return (pmd_val(pmd) & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
> +	unsigned long pfn = pmd_val(pmd);
> +	pfn ^= protnone_mask(pfn);
> +	return (pfn & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
>  }
>  
>  static inline unsigned long pud_pfn(pud_t pud)
>  {
> -	return (pud_val(pud) & pud_pfn_mask(pud)) >> PAGE_SHIFT;
> +	unsigned long pfn = pud_val(pud);
> +	pfn ^= protnone_mask(pfn);
> +	return (pfn & pud_pfn_mask(pud)) >> PAGE_SHIFT;
>  }
>  
>  static inline unsigned long p4d_pfn(p4d_t p4d)
> @@ -545,25 +555,33 @@ static inline pgprotval_t check_pgprot(pgprot_t pgprot)
>  
>  static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
>  {
> -	return __pte(((phys_addr_t)page_nr << PAGE_SHIFT) |
> -		     check_pgprot(pgprot));
> +	phys_addr_t pfn = page_nr << PAGE_SHIFT;
> +	pfn ^= protnone_mask(pgprot_val(pgprot));
> +	pfn &= PTE_PFN_MASK;
> +	return __pte(pfn | check_pgprot(pgprot));
>  }
>  
>  static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
>  {
> -	return __pmd(((phys_addr_t)page_nr << PAGE_SHIFT) |
> -		     check_pgprot(pgprot));
> +	phys_addr_t pfn = page_nr << PAGE_SHIFT;
> +	pfn ^= protnone_mask(pgprot_val(pgprot));
> +	pfn &= PHYSICAL_PMD_PAGE_MASK;
> +	return __pmd(pfn | check_pgprot(pgprot));
>  }
>  
>  static inline pud_t pfn_pud(unsigned long page_nr, pgprot_t pgprot)
>  {
> -	return __pud(((phys_addr_t)page_nr << PAGE_SHIFT) |
> -		     check_pgprot(pgprot));
> +	phys_addr_t pfn = page_nr << PAGE_SHIFT;
> +	pfn ^= protnone_mask(pgprot_val(pgprot));
> +	pfn &= PHYSICAL_PUD_PAGE_MASK;
> +	return __pud(pfn | check_pgprot(pgprot));
>  }

Jan Beulich has noticed that these are not correct on 32b PAE systems
because phys_addr_t is wider than unsigned long. So we need an explicit
cast for pfn_* and to use phys_addr_t for the other direction. I think we
want the patch below.
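
To make the truncation concrete first, a quick userspace illustration (not
part of the patch; the values are made up), assuming a 32bit build where
unsigned long is 32bit and phys_addr_t is 64bit as with CONFIG_PAE:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* pte_val() of a pte mapping memory above 4GB (made-up value) */
	uint64_t pteval = 0x0000001234567000ULL;

	uint32_t as_ulong    = (uint32_t)pteval; /* old: unsigned long pfn = pte_val(pte) */
	uint64_t as_physaddr = pteval;		 /* fix: phys_addr_t pfn = pte_val(pte) */

	printf("truncated %08x vs full %012llx\n",
	       as_ulong, (unsigned long long)as_physaddr);

	/* other direction: a pfn above 4GB shifted left in 32bit overflows */
	uint32_t page_nr = 0x123456;
	printf("32bit shift %08x vs 64bit shift %012llx\n",
	       page_nr << 12,
	       (unsigned long long)((uint64_t)page_nr << 12));
	return 0;
}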

From d3050e2b99e9070defcd990b7bc31a4b433367c5 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.cz>
Date: Wed, 27 Jun 2018 17:46:50 +0200
Subject: [PATCH] x86/speculation/l1tf: fix up pte->pfn conversion for PAE

Jan has noticed that pte_pfn and co. resp. pfn_pte are incorrect for
CONFIG_PAE because phys_addr_t is wider than unsigned long, and so the
pte_val result resp. the left shift would get truncated. Fix this up by
using proper types.

Noticed-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 arch/x86/include/asm/pgtable.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 6a090a76fdca..26fd42a91946 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -191,21 +191,21 @@ static inline u64 protnone_mask(u64 val);
 
 static inline unsigned long pte_pfn(pte_t pte)
 {
-	unsigned long pfn = pte_val(pte);
+	phys_addr_t pfn = pte_val(pte);
 	pfn ^= protnone_mask(pfn);
 	return (pfn & PTE_PFN_MASK) >> PAGE_SHIFT;
 }
 
 static inline unsigned long pmd_pfn(pmd_t pmd)
 {
-	unsigned long pfn = pmd_val(pmd);
+	phys_addr_t pfn = pmd_val(pmd);
 	pfn ^= protnone_mask(pfn);
 	return (pfn & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
 }
 
 static inline unsigned long pud_pfn(pud_t pud)
 {
-	unsigned long pfn = pud_val(pud);
+	phys_addr_t pfn = pud_val(pud);
 	pfn ^= protnone_mask(pfn);
 	return (pfn & pud_pfn_mask(pud)) >> PAGE_SHIFT;
 }
@@ -555,7 +555,7 @@ static inline pgprotval_t check_pgprot(pgprot_t pgprot)
 
 static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 {
-	phys_addr_t pfn = page_nr << PAGE_SHIFT;
+	phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
 	pfn ^= protnone_mask(pgprot_val(pgprot));
 	pfn &= PTE_PFN_MASK;
 	return __pte(pfn | check_pgprot(pgprot));
@@ -563,7 +563,7 @@ static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
 
 static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
 {
-	phys_addr_t pfn = page_nr << PAGE_SHIFT;
+	phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
 	pfn ^= protnone_mask(pgprot_val(pgprot));
 	pfn &= PHYSICAL_PMD_PAGE_MASK;
 	return __pmd(pfn | check_pgprot(pgprot));
@@ -571,7 +571,7 @@ static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
 
 static inline pud_t pfn_pud(unsigned long page_nr, pgprot_t pgprot)
 {
-	phys_addr_t pfn = page_nr << PAGE_SHIFT;
+	phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
 	pfn ^= protnone_mask(pgprot_val(pgprot));
 	pfn &= PHYSICAL_PUD_PAGE_MASK;
 	return __pud(pfn | check_pgprot(pgprot));
-- 
2.12.3

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 4/8] L1TFv8 8
  2018-06-27 15:51   ` [MODERATED] Re: x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation Michal Hocko
@ 2018-06-28  8:05     ` Vlastimil Babka
  2018-06-29 12:22       ` Michal Hocko
  0 siblings, 1 reply; 26+ messages in thread
From: Vlastimil Babka @ 2018-06-28  8:05 UTC (permalink / raw)
  To: speck

On 06/27/2018 05:51 PM, speck for Michal Hocko wrote:
> Jan Beulich has noticed that these are not correct on 32b PAE systems
> because phys_addr_t is wider than unsigned long. So we need an explicit
> cast for pfn_* and use phys_addr_t for the other direction. I think we want
> the following:
> 
>>From d3050e2b99e9070defcd990b7bc31a4b433367c5 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.cz>
> Date: Wed, 27 Jun 2018 17:46:50 +0200
> Subject: [PATCH] x86/speculation/l1tf: fix up pte->pfn conversion for PAE
> 
> Jan has noticed that pte_pfn and co. resp. pfn_pte are incorrect for
> CONFIG_PAE because phys_addr_t is wider than unsigned long and so the
> > pte_val resp. shift left would get truncated. Fix this up by using
> proper types.
> 
> Noticed-by: Jan Beulich <JBeulich@suse.com>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Good catch. Looks good to me, and some basic printk tests on manually
created and modified pte's confirm that the problem does exist and the
fix works.

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  arch/x86/include/asm/pgtable.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 6a090a76fdca..26fd42a91946 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -191,21 +191,21 @@ static inline u64 protnone_mask(u64 val);
>  
>  static inline unsigned long pte_pfn(pte_t pte)
>  {
> -	unsigned long pfn = pte_val(pte);
> +	phys_addr_t pfn = pte_val(pte);
>  	pfn ^= protnone_mask(pfn);
>  	return (pfn & PTE_PFN_MASK) >> PAGE_SHIFT;
>  }
>  
>  static inline unsigned long pmd_pfn(pmd_t pmd)
>  {
> -	unsigned long pfn = pmd_val(pmd);
> +	phys_addr_t pfn = pmd_val(pmd);
>  	pfn ^= protnone_mask(pfn);
>  	return (pfn & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
>  }
>  
>  static inline unsigned long pud_pfn(pud_t pud)
>  {
> -	unsigned long pfn = pud_val(pud);
> +	phys_addr_t pfn = pud_val(pud);
>  	pfn ^= protnone_mask(pfn);
>  	return (pfn & pud_pfn_mask(pud)) >> PAGE_SHIFT;
>  }
> @@ -555,7 +555,7 @@ static inline pgprotval_t check_pgprot(pgprot_t pgprot)
>  
>  static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
>  {
> -	phys_addr_t pfn = page_nr << PAGE_SHIFT;
> +	phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
>  	pfn ^= protnone_mask(pgprot_val(pgprot));
>  	pfn &= PTE_PFN_MASK;
>  	return __pte(pfn | check_pgprot(pgprot));
> @@ -563,7 +563,7 @@ static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
>  
>  static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
>  {
> -	phys_addr_t pfn = page_nr << PAGE_SHIFT;
> +	phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
>  	pfn ^= protnone_mask(pgprot_val(pgprot));
>  	pfn &= PHYSICAL_PMD_PAGE_MASK;
>  	return __pmd(pfn | check_pgprot(pgprot));
> @@ -571,7 +571,7 @@ static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
>  
>  static inline pud_t pfn_pud(unsigned long page_nr, pgprot_t pgprot)
>  {
> -	phys_addr_t pfn = page_nr << PAGE_SHIFT;
> +	phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
>  	pfn ^= protnone_mask(pgprot_val(pgprot));
>  	pfn &= PHYSICAL_PUD_PAGE_MASK;
>  	return __pud(pfn | check_pgprot(pgprot));
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [MODERATED] Re: [PATCH 4/8] L1TFv8 8
  2018-06-28  8:05     ` [MODERATED] Re: [PATCH 4/8] L1TFv8 8 Vlastimil Babka
@ 2018-06-29 12:22       ` Michal Hocko
  0 siblings, 0 replies; 26+ messages in thread
From: Michal Hocko @ 2018-06-29 12:22 UTC (permalink / raw)
  To: speck

On Thu 28-06-18 10:05:47, speck for Vlastimil Babka wrote:
> On 06/27/2018 05:51 PM, speck for Michal Hocko wrote:
> > Jan Beulich has noticed that these are not correct on 32b PAE systems
> > because phys_addr_t is wider than unsigned long. So we need an explicit
> > cast for pfn_* and use phys_addr_t for the other direction. I think we want
> > the following:
> > 
> >>From d3050e2b99e9070defcd990b7bc31a4b433367c5 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.cz>
> > Date: Wed, 27 Jun 2018 17:46:50 +0200
> > Subject: [PATCH] x86/speculation/l1tf: fix up pte->pfn conversion for PAE
> > 
> > Jan has noticed that pte_pfn and co. resp. pfn_pte are incorrect for
> > CONFIG_PAE because phys_addr_t is wider than unsigned long and so the
> > > pte_val resp. shift left would get truncated. Fix this up by using
> > proper types.
> > 
> > Noticed-by: Jan Beulich <JBeulich@suse.com>
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> 
> Good catch. Looks good to me, and some basic printk tests on manually
> created and modified pte's confirm that the problem does exist and the
> fix works.
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>

Thanks for the review. Btw. could you add
Fixes: 6b28baca9b1f ("x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation")

if you haven't pushed this yet, Thomas?

> 
> > ---
> >  arch/x86/include/asm/pgtable.h | 12 ++++++------
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> > index 6a090a76fdca..26fd42a91946 100644
> > --- a/arch/x86/include/asm/pgtable.h
> > +++ b/arch/x86/include/asm/pgtable.h
> > @@ -191,21 +191,21 @@ static inline u64 protnone_mask(u64 val);
> >  
> >  static inline unsigned long pte_pfn(pte_t pte)
> >  {
> > -	unsigned long pfn = pte_val(pte);
> > +	phys_addr_t pfn = pte_val(pte);
> >  	pfn ^= protnone_mask(pfn);
> >  	return (pfn & PTE_PFN_MASK) >> PAGE_SHIFT;
> >  }
> >  
> >  static inline unsigned long pmd_pfn(pmd_t pmd)
> >  {
> > -	unsigned long pfn = pmd_val(pmd);
> > +	phys_addr_t pfn = pmd_val(pmd);
> >  	pfn ^= protnone_mask(pfn);
> >  	return (pfn & pmd_pfn_mask(pmd)) >> PAGE_SHIFT;
> >  }
> >  
> >  static inline unsigned long pud_pfn(pud_t pud)
> >  {
> > -	unsigned long pfn = pud_val(pud);
> > +	phys_addr_t pfn = pud_val(pud);
> >  	pfn ^= protnone_mask(pfn);
> >  	return (pfn & pud_pfn_mask(pud)) >> PAGE_SHIFT;
> >  }
> > @@ -555,7 +555,7 @@ static inline pgprotval_t check_pgprot(pgprot_t pgprot)
> >  
> >  static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
> >  {
> > -	phys_addr_t pfn = page_nr << PAGE_SHIFT;
> > +	phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
> >  	pfn ^= protnone_mask(pgprot_val(pgprot));
> >  	pfn &= PTE_PFN_MASK;
> >  	return __pte(pfn | check_pgprot(pgprot));
> > @@ -563,7 +563,7 @@ static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot)
> >  
> >  static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
> >  {
> > -	phys_addr_t pfn = page_nr << PAGE_SHIFT;
> > +	phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
> >  	pfn ^= protnone_mask(pgprot_val(pgprot));
> >  	pfn &= PHYSICAL_PMD_PAGE_MASK;
> >  	return __pmd(pfn | check_pgprot(pgprot));
> > @@ -571,7 +571,7 @@ static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot)
> >  
> >  static inline pud_t pfn_pud(unsigned long page_nr, pgprot_t pgprot)
> >  {
> > -	phys_addr_t pfn = page_nr << PAGE_SHIFT;
> > +	phys_addr_t pfn = (phys_addr_t)page_nr << PAGE_SHIFT;
> >  	pfn ^= protnone_mask(pgprot_val(pgprot));
> >  	pfn &= PHYSICAL_PUD_PAGE_MASK;
> >  	return __pud(pfn | check_pgprot(pgprot));
> > 
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2018-06-29 14:18 UTC | newest]

Thread overview: 26+ messages
2018-06-13 22:48 [MODERATED] [PATCH 0/8] L1TFv8 2 Andi Kleen
2018-06-13 22:48 ` [MODERATED] [PATCH 1/8] L1TFv8 0 Andi Kleen
2018-06-13 22:48 ` [MODERATED] [PATCH 2/8] L1TFv8 4 Andi Kleen
2018-06-13 22:48 ` [MODERATED] [PATCH 3/8] L1TFv8 5 Andi Kleen
2018-06-13 22:48 ` [MODERATED] [PATCH 4/8] L1TFv8 8 Andi Kleen
2018-06-13 22:48 ` [MODERATED] [PATCH 5/8] L1TFv8 3 Andi Kleen
2018-06-13 22:48 ` [MODERATED] [PATCH 6/8] L1TFv8 7 Andi Kleen
2018-06-13 22:48 ` [MODERATED] [PATCH 7/8] L1TFv8 1 Andi Kleen
2018-06-13 22:48 ` [MODERATED] [PATCH 8/8] L1TFv8 6 Andi Kleen
     [not found] ` <20180614150632.E064C61183@crypto-ml.lab.linutronix.de>
2018-06-21  9:02   ` [MODERATED] " Vlastimil Babka
2018-06-21 11:43     ` Vlastimil Babka
2018-06-21 13:17       ` Vlastimil Babka
2018-06-21 14:38         ` Michal Hocko
2018-06-21 14:38         ` Thomas Gleixner
2018-06-21 20:32         ` [MODERATED] " Andi Kleen
2018-06-22 15:46       ` Vlastimil Babka
2018-06-22 16:56         ` Andi Kleen
2018-06-25  7:04           ` Vlastimil Babka
2018-06-25 20:31             ` Andi Kleen
2018-06-26 12:01               ` Vlastimil Babka
2018-06-26 12:57                 ` Michal Hocko
2018-06-26 13:05                   ` Michal Hocko
2018-06-27  9:14                 ` Thomas Gleixner
     [not found] ` <20180613225434.1CDC8610FD@crypto-ml.lab.linutronix.de>
2018-06-27 15:51   ` [MODERATED] Re: x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation Michal Hocko
2018-06-28  8:05     ` [MODERATED] Re: [PATCH 4/8] L1TFv8 8 Vlastimil Babka
2018-06-29 12:22       ` Michal Hocko
