All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch V2 0/5] x86/kaiser: Boot time disabling and debug support
@ 2017-11-26 23:14 ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

This patch series applies on top of

     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/mm

It contains the following updates:

  - Don't set NX/PAGE_GLOBAL unconditionally

  - Get rid of the compile time PAGE_GLOBAL disabling

  - Add debug support for WX mappings in the KAISER shadow table

  - Provide debug files to dump the kernel and the user page table for the
    current task.

  - Add a boot time switch to disable KAISER. This does not yet take care of
    the 8k PGD allocations, but that can be done on top.

Changes vs. V1:

  - Prevent setting PAGE_GLOBAL/NX when not supported or disabled

  - Restructured the debug stuff a bit

  - Extended the boot time disable to debug stuff

I tried to reenable paravirt by disabling kaiser at boot time when XEN_PV
is detected. XEN_PV is the only one having CR3 access paravirtualized,
which will explode nicely in the enter/exit code.

But enabling KAISER has some weird not yet debugged side effects even on
KVM guests. Will look at that tomorrow morning.

Thanks,

	tglx

---
 arch/x86/entry/calling.h             |    7 +++
 arch/x86/include/asm/kaiser.h        |   10 ++++
 arch/x86/include/asm/pgtable.h       |    1 
 arch/x86/include/asm/pgtable_64.h    |    9 +++
 arch/x86/include/asm/pgtable_types.h |   16 ------
 arch/x86/mm/debug_pagetables.c       |   81 ++++++++++++++++++++++++++++++++---
 arch/x86/mm/dump_pagetables.c        |   32 +++++++++++--
 arch/x86/mm/init.c                   |   14 ++++--
 arch/x86/mm/kaiser.c                 |   42 +++++++++++++++++-
 arch/x86/mm/pageattr.c               |   16 +++---
 security/Kconfig                     |    2 
 11 files changed, 190 insertions(+), 40 deletions(-)

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 0/5] x86/kaiser: Boot time disabling and debug support
@ 2017-11-26 23:14 ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

This patch series applies on top of

     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/mm

It contains the following updates:

  - Don't set NX/PAGE_GLOBAL unconditionally

  - Get rid of the compile time PAGE_GLOBAL disabling

  - Add debug support for WX mappings in the KAISER shadow table

  - Provide debug files to dump the kernel and the user page table for the
    current task.

  - Add a boot time switch to disable KAISER. This does not yet take care of
    the 8k PGD allocations, but that can be done on top.

Changes vs. V1:

  - Prevent setting PAGE_GLOBAL/NX when not supported or disabled

  - Restructured the debug stuff a bit

  - Extended the boot time disable to debug stuff

I tried to reenable paravirt by disabling kaiser at boot time when XEN_PV
is detected. XEN_PV is the only one having CR3 access paravirtualized,
which will explode nicely in the enter/exit code.

But enabling KAISER has some weird not yet debugged side effects even on
KVM guests. Will look at that tomorrow morning.

Thanks,

	tglx

---
 arch/x86/entry/calling.h             |    7 +++
 arch/x86/include/asm/kaiser.h        |   10 ++++
 arch/x86/include/asm/pgtable.h       |    1 
 arch/x86/include/asm/pgtable_64.h    |    9 +++
 arch/x86/include/asm/pgtable_types.h |   16 ------
 arch/x86/mm/debug_pagetables.c       |   81 ++++++++++++++++++++++++++++++++---
 arch/x86/mm/dump_pagetables.c        |   32 +++++++++++--
 arch/x86/mm/init.c                   |   14 ++++--
 arch/x86/mm/kaiser.c                 |   42 +++++++++++++++++-
 arch/x86/mm/pageattr.c               |   16 +++---
 security/Kconfig                     |    2 
 11 files changed, 190 insertions(+), 40 deletions(-)



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 1/5] x86/kaiser: Respect disabled CPU features
  2017-11-26 23:14 ` Thomas Gleixner
@ 2017-11-26 23:14   ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-kaiser--Respect-disabled-features.patch --]
[-- Type: text/plain, Size: 1958 bytes --]

PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
line, but KAISER sets them unconditionally.

Add proper protection against that.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable_64.h |    3 ++-
 arch/x86/mm/kaiser.c              |   12 +++++++++++-
 2 files changed, 13 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -222,7 +222,8 @@ static inline pgd_t kaiser_set_shadow_pg
 			 * wrong CR3 value, userspace will crash
 			 * instead of running.
 			 */
-			pgd.pgd |= _PAGE_NX;
+			if (__supported_pte_mask & _PAGE_NX)
+				pgd.pgd |= _PAGE_NX;
 		}
 	} else if (pgd_userspace_access(*pgdp)) {
 		/*
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -42,6 +42,8 @@
 
 #define KAISER_WALK_ATOMIC  0x1
 
+static pteval_t kaiser_pte_mask __ro_after_init = ~(_PAGE_NX | _PAGE_GLOBAL);
+
 /*
  * At runtime, the only things we map are some things for CPU
  * hotplug, and stacks for new processes.  No two CPUs will ever
@@ -244,11 +246,14 @@ static pte_t *kaiser_shadow_pagetable_wa
 int kaiser_add_user_map(const void *__start_addr, unsigned long size,
 			unsigned long flags)
 {
-	pte_t *pte;
 	unsigned long start_addr = (unsigned long)__start_addr;
 	unsigned long address = start_addr & PAGE_MASK;
 	unsigned long end_addr = PAGE_ALIGN(start_addr + size);
 	unsigned long target_address;
+	pte_t *pte;
+
+	/* Clear not supported bits */
+	flags &= kaiser_pte_mask;
 
 	for (; address < end_addr; address += PAGE_SIZE) {
 		target_address = get_pa_from_kernel_map(address);
@@ -308,6 +313,11 @@ static void __init kaiser_init_all_pgds(
 	pgd_t *pgd;
 	int i;
 
+	if (__supported_pte_mask & _PAGE_NX)
+		kaiser_pte_mask |= _PAGE_NX;
+	if (boot_cpu_has(X86_FEATURE_PGE))
+		kaiser_pte_mask |= _PAGE_GLOBAL;
+
 	pgd = kernel_to_shadow_pgdp(pgd_offset_k(0UL));
 	for (i = PTRS_PER_PGD / 2; i < PTRS_PER_PGD; i++) {
 		/*

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 1/5] x86/kaiser: Respect disabled CPU features
@ 2017-11-26 23:14   ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-kaiser--Respect-disabled-features.patch --]
[-- Type: text/plain, Size: 2185 bytes --]

PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
line, but KAISER sets them unconditionally.

Add proper protection against that.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable_64.h |    3 ++-
 arch/x86/mm/kaiser.c              |   12 +++++++++++-
 2 files changed, 13 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -222,7 +222,8 @@ static inline pgd_t kaiser_set_shadow_pg
 			 * wrong CR3 value, userspace will crash
 			 * instead of running.
 			 */
-			pgd.pgd |= _PAGE_NX;
+			if (__supported_pte_mask & _PAGE_NX)
+				pgd.pgd |= _PAGE_NX;
 		}
 	} else if (pgd_userspace_access(*pgdp)) {
 		/*
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -42,6 +42,8 @@
 
 #define KAISER_WALK_ATOMIC  0x1
 
+static pteval_t kaiser_pte_mask __ro_after_init = ~(_PAGE_NX | _PAGE_GLOBAL);
+
 /*
  * At runtime, the only things we map are some things for CPU
  * hotplug, and stacks for new processes.  No two CPUs will ever
@@ -244,11 +246,14 @@ static pte_t *kaiser_shadow_pagetable_wa
 int kaiser_add_user_map(const void *__start_addr, unsigned long size,
 			unsigned long flags)
 {
-	pte_t *pte;
 	unsigned long start_addr = (unsigned long)__start_addr;
 	unsigned long address = start_addr & PAGE_MASK;
 	unsigned long end_addr = PAGE_ALIGN(start_addr + size);
 	unsigned long target_address;
+	pte_t *pte;
+
+	/* Clear not supported bits */
+	flags &= kaiser_pte_mask;
 
 	for (; address < end_addr; address += PAGE_SIZE) {
 		target_address = get_pa_from_kernel_map(address);
@@ -308,6 +313,11 @@ static void __init kaiser_init_all_pgds(
 	pgd_t *pgd;
 	int i;
 
+	if (__supported_pte_mask & _PAGE_NX)
+		kaiser_pte_mask |= _PAGE_NX;
+	if (boot_cpu_has(X86_FEATURE_PGE))
+		kaiser_pte_mask |= _PAGE_GLOBAL;
+
 	pgd = kernel_to_shadow_pgdp(pgd_offset_k(0UL));
 	for (i = PTRS_PER_PGD / 2; i < PTRS_PER_PGD; i++) {
 		/*


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 2/5] x86/kaiser: Simplify disabling of global pages
  2017-11-26 23:14 ` Thomas Gleixner
@ 2017-11-26 23:14   ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-kaiser--Make-page-global-disabling-sane.patch --]
[-- Type: text/plain, Size: 3990 bytes --]

The current way of disabling global pages at compile time prevents boot
time disabling of kaiser and creates unnecessary indirections.

Global pages can be supressed by __supported_pte_mask as well. The shadow
mappings set PAGE_GLOBAL for the minimal kernel mappings which are required
for entry/exit. These mappings are setup manually so the filtering does not
take place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable_types.h |   16 +---------------
 arch/x86/mm/init.c                   |   13 ++++++++++---
 arch/x86/mm/pageattr.c               |   16 ++++++++--------
 3 files changed, 19 insertions(+), 26 deletions(-)

--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -191,23 +191,9 @@ enum page_cache_mode {
 #define PAGE_READONLY_EXEC	__pgprot(_PAGE_PRESENT | _PAGE_USER |	\
 					 _PAGE_ACCESSED)
 
-/*
- * Disable global pages for anything using the default
- * __PAGE_KERNEL* macros.
- *
- * PGE will still be enabled and _PAGE_GLOBAL may still be used carefully
- * for a few selected kernel mappings which must be visible to userspace,
- * when KAISER is enabled, like the entry/exit code and data.
- */
-#ifdef CONFIG_KAISER
-#define __PAGE_KERNEL_GLOBAL	0
-#else
-#define __PAGE_KERNEL_GLOBAL	_PAGE_GLOBAL
-#endif
-
 #define __PAGE_KERNEL_EXEC						\
 	(_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED |	\
-	 __PAGE_KERNEL_GLOBAL)
+	 _PAGE_GLOBAL)
 #define __PAGE_KERNEL		(__PAGE_KERNEL_EXEC | _PAGE_NX)
 
 #define __PAGE_KERNEL_RO		(__PAGE_KERNEL & ~_PAGE_RW)
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -161,6 +161,13 @@ struct map_range {
 
 static int page_size_mask;
 
+static void enable_global_pages(void)
+{
+#ifndef CONFIG_KAISER
+	__supported_pte_mask |= _PAGE_GLOBAL;
+#endif
+}
+
 static void __init probe_page_size_mask(void)
 {
 	/*
@@ -179,11 +186,11 @@ static void __init probe_page_size_mask(
 		cr4_set_bits_and_update_boot(X86_CR4_PSE);
 
 	/* Enable PGE if available */
+	__supported_pte_mask |= _PAGE_GLOBAL;
 	if (boot_cpu_has(X86_FEATURE_PGE)) {
 		cr4_set_bits_and_update_boot(X86_CR4_PGE);
-		__supported_pte_mask |= _PAGE_GLOBAL;
-	} else
-		__supported_pte_mask &= ~_PAGE_GLOBAL;
+		enable_global_pages();
+	}
 
 	/* Enable 1 GB linear kernel mappings if available: */
 	if (direct_gbpages && boot_cpu_has(X86_FEATURE_GBPAGES)) {
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -585,9 +585,9 @@ try_preserve_large_page(pte_t *kpte, uns
 	 * for the ancient hardware that doesn't support it.
 	 */
 	if (pgprot_val(req_prot) & _PAGE_PRESENT)
-		pgprot_val(req_prot) |= _PAGE_PSE | __PAGE_KERNEL_GLOBAL;
+		pgprot_val(req_prot) |= _PAGE_PSE | _PAGE_GLOBAL;
 	else
-		pgprot_val(req_prot) &= ~(_PAGE_PSE | __PAGE_KERNEL_GLOBAL);
+		pgprot_val(req_prot) &= ~(_PAGE_PSE | _PAGE_GLOBAL);
 
 	req_prot = canon_pgprot(req_prot);
 
@@ -705,9 +705,9 @@ static int
 	 * for the ancient hardware that doesn't support it.
 	 */
 	if (pgprot_val(ref_prot) & _PAGE_PRESENT)
-		pgprot_val(ref_prot) |= __PAGE_KERNEL_GLOBAL;
+		pgprot_val(ref_prot) |= _PAGE_GLOBAL;
 	else
-		pgprot_val(ref_prot) &= ~__PAGE_KERNEL_GLOBAL;
+		pgprot_val(ref_prot) &= ~_PAGE_GLOBAL;
 
 	/*
 	 * Get the target pfn from the original entry:
@@ -938,9 +938,9 @@ static void populate_pte(struct cpa_data
 	 * support it.
 	 */
 	if (pgprot_val(pgprot) & _PAGE_PRESENT)
-		pgprot_val(pgprot) |= __PAGE_KERNEL_GLOBAL;
+		pgprot_val(pgprot) |= _PAGE_GLOBAL;
 	else
-		pgprot_val(pgprot) &= ~__PAGE_KERNEL_GLOBAL;
+		pgprot_val(pgprot) &= ~_PAGE_GLOBAL;
 
 	pgprot = canon_pgprot(pgprot);
 
@@ -1242,9 +1242,9 @@ static int __change_page_attr(struct cpa
 		 * support it.
 		 */
 		if (pgprot_val(new_prot) & _PAGE_PRESENT)
-			pgprot_val(new_prot) |= __PAGE_KERNEL_GLOBAL;
+			pgprot_val(new_prot) |= _PAGE_GLOBAL;
 		else
-			pgprot_val(new_prot) &= ~__PAGE_KERNEL_GLOBAL;
+			pgprot_val(new_prot) &= ~_PAGE_GLOBAL;
 
 		/*
 		 * We need to keep the pfn from the existing PTE,

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 2/5] x86/kaiser: Simplify disabling of global pages
@ 2017-11-26 23:14   ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-kaiser--Make-page-global-disabling-sane.patch --]
[-- Type: text/plain, Size: 4217 bytes --]

The current way of disabling global pages at compile time prevents boot
time disabling of kaiser and creates unnecessary indirections.

Global pages can be supressed by __supported_pte_mask as well. The shadow
mappings set PAGE_GLOBAL for the minimal kernel mappings which are required
for entry/exit. These mappings are setup manually so the filtering does not
take place.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable_types.h |   16 +---------------
 arch/x86/mm/init.c                   |   13 ++++++++++---
 arch/x86/mm/pageattr.c               |   16 ++++++++--------
 3 files changed, 19 insertions(+), 26 deletions(-)

--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -191,23 +191,9 @@ enum page_cache_mode {
 #define PAGE_READONLY_EXEC	__pgprot(_PAGE_PRESENT | _PAGE_USER |	\
 					 _PAGE_ACCESSED)
 
-/*
- * Disable global pages for anything using the default
- * __PAGE_KERNEL* macros.
- *
- * PGE will still be enabled and _PAGE_GLOBAL may still be used carefully
- * for a few selected kernel mappings which must be visible to userspace,
- * when KAISER is enabled, like the entry/exit code and data.
- */
-#ifdef CONFIG_KAISER
-#define __PAGE_KERNEL_GLOBAL	0
-#else
-#define __PAGE_KERNEL_GLOBAL	_PAGE_GLOBAL
-#endif
-
 #define __PAGE_KERNEL_EXEC						\
 	(_PAGE_PRESENT | _PAGE_RW | _PAGE_DIRTY | _PAGE_ACCESSED |	\
-	 __PAGE_KERNEL_GLOBAL)
+	 _PAGE_GLOBAL)
 #define __PAGE_KERNEL		(__PAGE_KERNEL_EXEC | _PAGE_NX)
 
 #define __PAGE_KERNEL_RO		(__PAGE_KERNEL & ~_PAGE_RW)
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -161,6 +161,13 @@ struct map_range {
 
 static int page_size_mask;
 
+static void enable_global_pages(void)
+{
+#ifndef CONFIG_KAISER
+	__supported_pte_mask |= _PAGE_GLOBAL;
+#endif
+}
+
 static void __init probe_page_size_mask(void)
 {
 	/*
@@ -179,11 +186,11 @@ static void __init probe_page_size_mask(
 		cr4_set_bits_and_update_boot(X86_CR4_PSE);
 
 	/* Enable PGE if available */
+	__supported_pte_mask |= _PAGE_GLOBAL;
 	if (boot_cpu_has(X86_FEATURE_PGE)) {
 		cr4_set_bits_and_update_boot(X86_CR4_PGE);
-		__supported_pte_mask |= _PAGE_GLOBAL;
-	} else
-		__supported_pte_mask &= ~_PAGE_GLOBAL;
+		enable_global_pages();
+	}
 
 	/* Enable 1 GB linear kernel mappings if available: */
 	if (direct_gbpages && boot_cpu_has(X86_FEATURE_GBPAGES)) {
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -585,9 +585,9 @@ try_preserve_large_page(pte_t *kpte, uns
 	 * for the ancient hardware that doesn't support it.
 	 */
 	if (pgprot_val(req_prot) & _PAGE_PRESENT)
-		pgprot_val(req_prot) |= _PAGE_PSE | __PAGE_KERNEL_GLOBAL;
+		pgprot_val(req_prot) |= _PAGE_PSE | _PAGE_GLOBAL;
 	else
-		pgprot_val(req_prot) &= ~(_PAGE_PSE | __PAGE_KERNEL_GLOBAL);
+		pgprot_val(req_prot) &= ~(_PAGE_PSE | _PAGE_GLOBAL);
 
 	req_prot = canon_pgprot(req_prot);
 
@@ -705,9 +705,9 @@ static int
 	 * for the ancient hardware that doesn't support it.
 	 */
 	if (pgprot_val(ref_prot) & _PAGE_PRESENT)
-		pgprot_val(ref_prot) |= __PAGE_KERNEL_GLOBAL;
+		pgprot_val(ref_prot) |= _PAGE_GLOBAL;
 	else
-		pgprot_val(ref_prot) &= ~__PAGE_KERNEL_GLOBAL;
+		pgprot_val(ref_prot) &= ~_PAGE_GLOBAL;
 
 	/*
 	 * Get the target pfn from the original entry:
@@ -938,9 +938,9 @@ static void populate_pte(struct cpa_data
 	 * support it.
 	 */
 	if (pgprot_val(pgprot) & _PAGE_PRESENT)
-		pgprot_val(pgprot) |= __PAGE_KERNEL_GLOBAL;
+		pgprot_val(pgprot) |= _PAGE_GLOBAL;
 	else
-		pgprot_val(pgprot) &= ~__PAGE_KERNEL_GLOBAL;
+		pgprot_val(pgprot) &= ~_PAGE_GLOBAL;
 
 	pgprot = canon_pgprot(pgprot);
 
@@ -1242,9 +1242,9 @@ static int __change_page_attr(struct cpa
 		 * support it.
 		 */
 		if (pgprot_val(new_prot) & _PAGE_PRESENT)
-			pgprot_val(new_prot) |= __PAGE_KERNEL_GLOBAL;
+			pgprot_val(new_prot) |= _PAGE_GLOBAL;
 		else
-			pgprot_val(new_prot) &= ~__PAGE_KERNEL_GLOBAL;
+			pgprot_val(new_prot) &= ~_PAGE_GLOBAL;
 
 		/*
 		 * We need to keep the pfn from the existing PTE,


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 3/5] x86/dump_pagetables: Check KAISER shadow page table for WX pages
  2017-11-26 23:14 ` Thomas Gleixner
@ 2017-11-26 23:14   ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-kaiser--Check-shadow-pagetable-for-WX-mappings.patch --]
[-- Type: text/plain, Size: 2661 bytes --]

ptdump_walk_pgd_level_checkwx() checks the kernel page table for WX pages,
but does not check the KAISER shadow page table.

Restructure the code so that dmesg output is selected by an explicit
argument and not implicit via checking the pgd argument for !NULL. 
Add the check for the shadow page table.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable.h |    1 +
 arch/x86/mm/debug_pagetables.c |    2 +-
 arch/x86/mm/dump_pagetables.c  |   27 ++++++++++++++++++++++-----
 3 files changed, 24 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -28,6 +28,7 @@ extern pgd_t early_top_pgt[PTRS_PER_PGD]
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
 
 void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd);
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd);
 void ptdump_walk_pgd_level_checkwx(void);
 
 #ifdef CONFIG_DEBUG_WX
--- a/arch/x86/mm/debug_pagetables.c
+++ b/arch/x86/mm/debug_pagetables.c
@@ -5,7 +5,7 @@
 
 static int ptdump_show(struct seq_file *m, void *v)
 {
-	ptdump_walk_pgd_level(m, NULL);
+	ptdump_walk_pgd_level_debugfs(m, NULL);
 	return 0;
 }
 
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -447,7 +447,7 @@ static inline bool is_hypervisor_range(i
 }
 
 static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
-				       bool checkwx)
+				       bool checkwx, bool dmesg)
 {
 #ifdef CONFIG_X86_64
 	pgd_t *start = (pgd_t *) &init_top_pgt;
@@ -460,7 +460,7 @@ static void ptdump_walk_pgd_level_core(s
 
 	if (pgd) {
 		start = pgd;
-		st.to_dmesg = true;
+		st.to_dmesg = dmesg;
 	}
 
 	st.check_wx = checkwx;
@@ -498,13 +498,30 @@ static void ptdump_walk_pgd_level_core(s
 
 void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd)
 {
-	ptdump_walk_pgd_level_core(m, pgd, false);
+	ptdump_walk_pgd_level_core(m, pgd, false, true);
+}
+
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd)
+{
+	ptdump_walk_pgd_level_core(m, pgd, false, false);
+}
+EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level_debugfs);
+
+void ptdump_walk_shadow_pgd_level_checkwx(void)
+{
+#ifdef CONFIG_KAISER
+	pgd_t *pgd = (pgd_t *) &init_top_pgt;
+
+	pr_info("x86/mm: Checking shadow page tables\n");
+	pgd += PTRS_PER_PGD;
+	ptdump_walk_pgd_level_core(NULL, pgd, true, false);
+#endif
 }
-EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level);
 
 void ptdump_walk_pgd_level_checkwx(void)
 {
-	ptdump_walk_pgd_level_core(NULL, NULL, true);
+	ptdump_walk_pgd_level_core(NULL, NULL, true, false);
+	ptdump_walk_shadow_pgd_level_checkwx();
 }
 
 static int __init pt_dump_init(void)

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 3/5] x86/dump_pagetables: Check KAISER shadow page table for WX pages
@ 2017-11-26 23:14   ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-kaiser--Check-shadow-pagetable-for-WX-mappings.patch --]
[-- Type: text/plain, Size: 2888 bytes --]

ptdump_walk_pgd_level_checkwx() checks the kernel page table for WX pages,
but does not check the KAISER shadow page table.

Restructure the code so that dmesg output is selected by an explicit
argument and not implicit via checking the pgd argument for !NULL. 
Add the check for the shadow page table.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable.h |    1 +
 arch/x86/mm/debug_pagetables.c |    2 +-
 arch/x86/mm/dump_pagetables.c  |   27 ++++++++++++++++++++++-----
 3 files changed, 24 insertions(+), 6 deletions(-)

--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -28,6 +28,7 @@ extern pgd_t early_top_pgt[PTRS_PER_PGD]
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
 
 void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd);
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd);
 void ptdump_walk_pgd_level_checkwx(void);
 
 #ifdef CONFIG_DEBUG_WX
--- a/arch/x86/mm/debug_pagetables.c
+++ b/arch/x86/mm/debug_pagetables.c
@@ -5,7 +5,7 @@
 
 static int ptdump_show(struct seq_file *m, void *v)
 {
-	ptdump_walk_pgd_level(m, NULL);
+	ptdump_walk_pgd_level_debugfs(m, NULL);
 	return 0;
 }
 
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -447,7 +447,7 @@ static inline bool is_hypervisor_range(i
 }
 
 static void ptdump_walk_pgd_level_core(struct seq_file *m, pgd_t *pgd,
-				       bool checkwx)
+				       bool checkwx, bool dmesg)
 {
 #ifdef CONFIG_X86_64
 	pgd_t *start = (pgd_t *) &init_top_pgt;
@@ -460,7 +460,7 @@ static void ptdump_walk_pgd_level_core(s
 
 	if (pgd) {
 		start = pgd;
-		st.to_dmesg = true;
+		st.to_dmesg = dmesg;
 	}
 
 	st.check_wx = checkwx;
@@ -498,13 +498,30 @@ static void ptdump_walk_pgd_level_core(s
 
 void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd)
 {
-	ptdump_walk_pgd_level_core(m, pgd, false);
+	ptdump_walk_pgd_level_core(m, pgd, false, true);
+}
+
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd)
+{
+	ptdump_walk_pgd_level_core(m, pgd, false, false);
+}
+EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level_debugfs);
+
+void ptdump_walk_shadow_pgd_level_checkwx(void)
+{
+#ifdef CONFIG_KAISER
+	pgd_t *pgd = (pgd_t *) &init_top_pgt;
+
+	pr_info("x86/mm: Checking shadow page tables\n");
+	pgd += PTRS_PER_PGD;
+	ptdump_walk_pgd_level_core(NULL, pgd, true, false);
+#endif
 }
-EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level);
 
 void ptdump_walk_pgd_level_checkwx(void)
 {
-	ptdump_walk_pgd_level_core(NULL, NULL, true);
+	ptdump_walk_pgd_level_core(NULL, NULL, true, false);
+	ptdump_walk_shadow_pgd_level_checkwx();
 }
 
 static int __init pt_dump_init(void)


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 4/5] x86/mm/debug_pagetables: Allow dumping current pagetables
  2017-11-26 23:14 ` Thomas Gleixner
@ 2017-11-26 23:14   ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-mm-dump_pagetables--Add-support-for-KAISER-tables.patch --]
[-- Type: text/plain, Size: 4404 bytes --]

Add two debugfs files which allow to dump the pagetable of the current task.

current_page_tables_knl dumps the regular page table. This is the page
table which is normally shared between kernel and user space. If KAISER is
enabled this is the kernel space mapping.

If KAISER is enabled the second file, current_page_tables_usr, dumps the
user space page table.

These files allow to verify the resulting page tables for KAISER, but even
in the non KAISER case its useful to be able to inspect user space page
tables of current for debugging purposes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable.h |    2 -
 arch/x86/mm/debug_pagetables.c |   81 +++++++++++++++++++++++++++++++++++++----
 arch/x86/mm/dump_pagetables.c  |    4 +-
 3 files changed, 79 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -28,7 +28,7 @@ extern pgd_t early_top_pgt[PTRS_PER_PGD]
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
 
 void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd);
-void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd);
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool shadow);
 void ptdump_walk_pgd_level_checkwx(void);
 
 #ifdef CONFIG_DEBUG_WX
--- a/arch/x86/mm/debug_pagetables.c
+++ b/arch/x86/mm/debug_pagetables.c
@@ -5,7 +5,7 @@
 
 static int ptdump_show(struct seq_file *m, void *v)
 {
-	ptdump_walk_pgd_level_debugfs(m, NULL);
+	ptdump_walk_pgd_level_debugfs(m, NULL, false);
 	return 0;
 }
 
@@ -22,21 +22,90 @@ static const struct file_operations ptdu
 	.release	= single_release,
 };
 
-static struct dentry *pe;
+static int ptdump_show_curknl(struct seq_file *m, void *v)
+{
+	if (current->mm->pgd) {
+		down_read(&current->mm->mmap_sem);
+		ptdump_walk_pgd_level_debugfs(m, current->mm->pgd, false);
+		up_read(&current->mm->mmap_sem);
+	}
+	return 0;
+}
+
+static int ptdump_open_curknl(struct inode *inode, struct file *filp)
+{
+	return single_open(filp, ptdump_show_curknl, NULL);
+}
+
+static const struct file_operations ptdump_curknl_fops = {
+	.owner		= THIS_MODULE,
+	.open		= ptdump_open_curknl,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+#ifdef CONFIG_KAISER
+static int ptdump_show_curusr(struct seq_file *m, void *v)
+{
+	if (current->mm->pgd) {
+		down_read(&current->mm->mmap_sem);
+		ptdump_walk_pgd_level_debugfs(m, current->mm->pgd, true);
+		up_read(&current->mm->mmap_sem);
+	}
+	return 0;
+}
+
+static int ptdump_open_curusr(struct inode *inode, struct file *filp)
+{
+	return single_open(filp, ptdump_show_curusr, NULL);
+}
+
+static const struct file_operations ptdump_curusr_fops = {
+	.owner		= THIS_MODULE,
+	.open		= ptdump_open_curusr,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+#endif
+
+static struct dentry *pe_knl, *pe_curknl, *pe_curusr;
+
+static void pt_dump_debug_remove_files(void)
+{
+	debugfs_remove_recursive(pe_knl);
+	debugfs_remove_recursive(pe_curknl);
+	debugfs_remove_recursive(pe_curusr);
+}
 
 static int __init pt_dump_debug_init(void)
 {
-	pe = debugfs_create_file("kernel_page_tables", S_IRUSR, NULL, NULL,
-				 &ptdump_fops);
-	if (!pe)
+	pe_knl = debugfs_create_file("kernel_page_tables", S_IRUSR, NULL, NULL,
+				     &ptdump_fops);
+	if (!pe_knl)
 		return -ENOMEM;
 
+	pe_curknl = debugfs_create_file("current_page_tables_knl", S_IRUSR,
+					NULL, NULL, &ptdump_curknl_fops);
+	if (!pe_curknl)
+		goto err;
+
+#ifdef CONFIG_KAISER
+	pe_curusr = debugfs_create_file("current_page_tables_usr", S_IRUSR,
+					NULL, NULL, &ptdump_curusr_fops);
+	if (!pe_curusr)
+		goto err;
+#endif
 	return 0;
+err:
+	pt_dump_debug_remove_files();
+	return -ENOMEM;
 }
 
 static void __exit pt_dump_debug_exit(void)
 {
-	debugfs_remove_recursive(pe);
+	pt_dump_debug_remove_files();
 }
 
 module_init(pt_dump_debug_init);
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -501,8 +501,10 @@ void ptdump_walk_pgd_level(struct seq_fi
 	ptdump_walk_pgd_level_core(m, pgd, false, true);
 }
 
-void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd)
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool shadow)
 {
+	if (shadow)
+		pgd += PTRS_PER_PGD;
 	ptdump_walk_pgd_level_core(m, pgd, false, false);
 }
 EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level_debugfs);

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 4/5] x86/mm/debug_pagetables: Allow dumping current pagetables
@ 2017-11-26 23:14   ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-mm-dump_pagetables--Add-support-for-KAISER-tables.patch --]
[-- Type: text/plain, Size: 4631 bytes --]

Add two debugfs files which allow to dump the pagetable of the current task.

current_page_tables_knl dumps the regular page table. This is the page
table which is normally shared between kernel and user space. If KAISER is
enabled this is the kernel space mapping.

If KAISER is enabled the second file, current_page_tables_usr, dumps the
user space page table.

These files allow to verify the resulting page tables for KAISER, but even
in the non KAISER case its useful to be able to inspect user space page
tables of current for debugging purposes.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/pgtable.h |    2 -
 arch/x86/mm/debug_pagetables.c |   81 +++++++++++++++++++++++++++++++++++++----
 arch/x86/mm/dump_pagetables.c  |    4 +-
 3 files changed, 79 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -28,7 +28,7 @@ extern pgd_t early_top_pgt[PTRS_PER_PGD]
 int __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
 
 void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd);
-void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd);
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool shadow);
 void ptdump_walk_pgd_level_checkwx(void);
 
 #ifdef CONFIG_DEBUG_WX
--- a/arch/x86/mm/debug_pagetables.c
+++ b/arch/x86/mm/debug_pagetables.c
@@ -5,7 +5,7 @@
 
 static int ptdump_show(struct seq_file *m, void *v)
 {
-	ptdump_walk_pgd_level_debugfs(m, NULL);
+	ptdump_walk_pgd_level_debugfs(m, NULL, false);
 	return 0;
 }
 
@@ -22,21 +22,90 @@ static const struct file_operations ptdu
 	.release	= single_release,
 };
 
-static struct dentry *pe;
+static int ptdump_show_curknl(struct seq_file *m, void *v)
+{
+	if (current->mm->pgd) {
+		down_read(&current->mm->mmap_sem);
+		ptdump_walk_pgd_level_debugfs(m, current->mm->pgd, false);
+		up_read(&current->mm->mmap_sem);
+	}
+	return 0;
+}
+
+static int ptdump_open_curknl(struct inode *inode, struct file *filp)
+{
+	return single_open(filp, ptdump_show_curknl, NULL);
+}
+
+static const struct file_operations ptdump_curknl_fops = {
+	.owner		= THIS_MODULE,
+	.open		= ptdump_open_curknl,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+#ifdef CONFIG_KAISER
+static int ptdump_show_curusr(struct seq_file *m, void *v)
+{
+	if (current->mm->pgd) {
+		down_read(&current->mm->mmap_sem);
+		ptdump_walk_pgd_level_debugfs(m, current->mm->pgd, true);
+		up_read(&current->mm->mmap_sem);
+	}
+	return 0;
+}
+
+static int ptdump_open_curusr(struct inode *inode, struct file *filp)
+{
+	return single_open(filp, ptdump_show_curusr, NULL);
+}
+
+static const struct file_operations ptdump_curusr_fops = {
+	.owner		= THIS_MODULE,
+	.open		= ptdump_open_curusr,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+#endif
+
+static struct dentry *pe_knl, *pe_curknl, *pe_curusr;
+
+static void pt_dump_debug_remove_files(void)
+{
+	debugfs_remove_recursive(pe_knl);
+	debugfs_remove_recursive(pe_curknl);
+	debugfs_remove_recursive(pe_curusr);
+}
 
 static int __init pt_dump_debug_init(void)
 {
-	pe = debugfs_create_file("kernel_page_tables", S_IRUSR, NULL, NULL,
-				 &ptdump_fops);
-	if (!pe)
+	pe_knl = debugfs_create_file("kernel_page_tables", S_IRUSR, NULL, NULL,
+				     &ptdump_fops);
+	if (!pe_knl)
 		return -ENOMEM;
 
+	pe_curknl = debugfs_create_file("current_page_tables_knl", S_IRUSR,
+					NULL, NULL, &ptdump_curknl_fops);
+	if (!pe_curknl)
+		goto err;
+
+#ifdef CONFIG_KAISER
+	pe_curusr = debugfs_create_file("current_page_tables_usr", S_IRUSR,
+					NULL, NULL, &ptdump_curusr_fops);
+	if (!pe_curusr)
+		goto err;
+#endif
 	return 0;
+err:
+	pt_dump_debug_remove_files();
+	return -ENOMEM;
 }
 
 static void __exit pt_dump_debug_exit(void)
 {
-	debugfs_remove_recursive(pe);
+	pt_dump_debug_remove_files();
 }
 
 module_init(pt_dump_debug_init);
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -501,8 +501,10 @@ void ptdump_walk_pgd_level(struct seq_fi
 	ptdump_walk_pgd_level_core(m, pgd, false, true);
 }
 
-void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd)
+void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool shadow)
 {
+	if (shadow)
+		pgd += PTRS_PER_PGD;
 	ptdump_walk_pgd_level_core(m, pgd, false, false);
 }
 EXPORT_SYMBOL_GPL(ptdump_walk_pgd_level_debugfs);


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-26 23:14 ` Thomas Gleixner
@ 2017-11-26 23:14   ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-kaiser--Add-boottime-disable-switch.patch --]
[-- Type: text/plain, Size: 6942 bytes --]

KAISER comes with overhead. The most expensive part is the CR3 switching in
the entry code.

Add a command line parameter which allows to disable KAISER at boot time.

Most code pathes simply check a variable, but the entry code uses a static
branch. The other code pathes cannot use a static branch because they are
used before jump label patching is possible. Not an issue as the code
pathes are not so performance sensitive as the entry/exit code.

This makes KAISER depend on JUMP_LABEL and on a GCC which supports
it, but that's a resonable requirement.

The PGD allocation is still 8k when CONFIG_KAISER is enabled. This can be
addressed on top of this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/calling.h          |    7 +++++++
 arch/x86/include/asm/kaiser.h     |   10 ++++++++++
 arch/x86/include/asm/pgtable_64.h |    6 ++++++
 arch/x86/mm/dump_pagetables.c     |    5 ++++-
 arch/x86/mm/init.c                |    7 ++++---
 arch/x86/mm/kaiser.c              |   30 ++++++++++++++++++++++++++++++
 security/Kconfig                  |    2 +-
 7 files changed, 62 insertions(+), 5 deletions(-)

--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -210,18 +210,23 @@ For 32-bit we have the following convent
 .endm
 
 .macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
+	STATIC_JUMP_IF_FALSE .Lend_\@, kaiser_enabled_key, def=1
 	mov	%cr3, \scratch_reg
 	ADJUST_KERNEL_CR3 \scratch_reg
 	mov	\scratch_reg, %cr3
+.Lend_\@:
 .endm
 
 .macro SWITCH_TO_USER_CR3 scratch_reg:req
+	STATIC_JUMP_IF_FALSE .Lend_\@, kaiser_enabled_key, def=1
 	mov	%cr3, \scratch_reg
 	ADJUST_USER_CR3 \scratch_reg
 	mov	\scratch_reg, %cr3
+.Lend_\@:
 .endm
 
 .macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
+	STATIC_JUMP_IF_FALSE .Ldone_\@, kaiser_enabled_key, def=1
 	movq	%cr3, %r\scratch_reg
 	movq	%r\scratch_reg, \save_reg
 	/*
@@ -244,11 +249,13 @@ For 32-bit we have the following convent
 .endm
 
 .macro RESTORE_CR3 save_reg:req
+	STATIC_JUMP_IF_FALSE .Lend_\@, kaiser_enabled_key, def=1
 	/*
 	 * The CR3 write could be avoided when not changing its value,
 	 * but would require a CR3 read *and* a scratch register.
 	 */
 	movq	\save_reg, %cr3
+.Lend_\@:
 .endm
 
 #else /* CONFIG_KAISER=n: */
--- a/arch/x86/include/asm/kaiser.h
+++ b/arch/x86/include/asm/kaiser.h
@@ -56,6 +56,16 @@ extern void kaiser_remove_mapping(unsign
  */
 extern void kaiser_init(void);
 
+/* True if kaiser is enabled at boot time */
+extern struct static_key_true kaiser_enabled_key;
+extern bool kaiser_enabled;
+extern void kaiser_check_cmdline(void);
+
+#else /* CONFIG_KAISER */
+
+#define kaiser_enabled		(false)
+static inline void kaiser_check_cmdline(void) { }
+
 #endif
 
 #endif /* __ASSEMBLY__ */
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -175,6 +175,9 @@ static inline p4d_t *shadow_to_kernel_p4
 {
 	return ptr_clear_bit(p4dp, KAISER_PGTABLE_SWITCH_BIT);
 }
+
+extern bool kaiser_enabled;
+
 #endif /* CONFIG_KAISER */
 
 /*
@@ -208,6 +211,9 @@ static inline bool pgd_userspace_access(
 static inline pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd)
 {
 #ifdef CONFIG_KAISER
+	if (!kaiser_enabled)
+		return pgd;
+
 	if (pgd_userspace_access(pgd)) {
 		if (pgdp_maps_userspace(pgdp)) {
 			/*
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -20,6 +20,7 @@
 #include <linux/seq_file.h>
 
 #include <asm/pgtable.h>
+#include <asm/kaiser.h>
 
 /*
  * The dumper groups pagetable entries of the same type into one, and for
@@ -503,7 +504,7 @@ void ptdump_walk_pgd_level(struct seq_fi
 
 void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool shadow)
 {
-	if (shadow)
+	if (shadow && kaiser_enabled)
 		pgd += PTRS_PER_PGD;
 	ptdump_walk_pgd_level_core(m, pgd, false, false);
 }
@@ -514,6 +515,8 @@ void ptdump_walk_shadow_pgd_level_checkw
 #ifdef CONFIG_KAISER
 	pgd_t *pgd = (pgd_t *) &init_top_pgt;
 
+	if (!kaiser_enabled)
+		return;
 	pr_info("x86/mm: Checking shadow page tables\n");
 	pgd += PTRS_PER_PGD;
 	ptdump_walk_pgd_level_core(NULL, pgd, true, false);
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -20,6 +20,7 @@
 #include <asm/kaslr.h>
 #include <asm/hypervisor.h>
 #include <asm/cpufeature.h>
+#include <asm/kaiser.h>
 
 /*
  * We need to define the tracepoints somewhere, and tlb.c
@@ -163,9 +164,8 @@ static int page_size_mask;
 
 static void enable_global_pages(void)
 {
-#ifndef CONFIG_KAISER
-	__supported_pte_mask |= _PAGE_GLOBAL;
-#endif
+	if (!kaiser_enabled)
+		__supported_pte_mask |= _PAGE_GLOBAL;
 }
 
 static void __init probe_page_size_mask(void)
@@ -656,6 +656,7 @@ void __init init_mem_mapping(void)
 {
 	unsigned long end;
 
+	kaiser_check_cmdline();
 	probe_page_size_mask();
 	setup_pcid();
 
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -34,6 +34,7 @@
 #include <linux/mm.h>
 #include <linux/uaccess.h>
 
+#include <asm/cmdline.h>
 #include <asm/kaiser.h>
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -44,6 +45,16 @@
 
 static pteval_t kaiser_pte_mask __ro_after_init = ~(_PAGE_NX | _PAGE_GLOBAL);
 
+/* Global flag for boot time kaiser enable/disable */
+bool kaiser_enabled __ro_after_init = true;
+DEFINE_STATIC_KEY_TRUE(kaiser_enabled_key);
+
+void __init kaiser_check_cmdline(void)
+{
+	if (cmdline_find_option_bool(boot_command_line, "nokaiser"))
+		kaiser_enabled = false;
+}
+
 /*
  * At runtime, the only things we map are some things for CPU
  * hotplug, and stacks for new processes.  No two CPUs will ever
@@ -252,6 +263,9 @@ int kaiser_add_user_map(const void *__st
 	unsigned long target_address;
 	pte_t *pte;
 
+	if (!kaiser_enabled)
+		return 0;
+
 	/* Clear not supported bits */
 	flags &= kaiser_pte_mask;
 
@@ -402,6 +416,9 @@ void __init kaiser_init(void)
 {
 	int cpu;
 
+	if (!kaiser_enabled)
+		return;
+
 	kaiser_init_all_pgds();
 
 	for_each_possible_cpu(cpu) {
@@ -436,6 +453,16 @@ void __init kaiser_init(void)
 	kaiser_add_mapping_cpu_entry(0);
 }
 
+static int __init kaiser_boottime_control(void)
+{
+	if (!kaiser_enabled) {
+		static_branch_disable(&kaiser_enabled_key);
+		pr_info("kaiser: Disabled on command line\n");
+	}
+	return 0;
+}
+subsys_initcall(kaiser_boottime_control);
+
 int kaiser_add_mapping(unsigned long addr, unsigned long size,
 		       unsigned long flags)
 {
@@ -446,6 +473,9 @@ void kaiser_remove_mapping(unsigned long
 {
 	unsigned long addr;
 
+	if (!kaiser_enabled)
+		return;
+
 	/* The shadow page tables always use small pages: */
 	for (addr = start; addr < start + size; addr += PAGE_SIZE) {
 		/*
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -56,7 +56,7 @@ config SECURITY_NETWORK
 
 config KAISER
 	bool "Remove the kernel mapping in user mode"
-	depends on X86_64 && SMP && !PARAVIRT
+	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
 	help
 	  This feature reduces the number of hardware side channels by
 	  ensuring that the majority of kernel addresses are not mapped

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-26 23:14   ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-26 23:14 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

[-- Attachment #1: x86-kaiser--Add-boottime-disable-switch.patch --]
[-- Type: text/plain, Size: 7169 bytes --]

KAISER comes with overhead. The most expensive part is the CR3 switching in
the entry code.

Add a command line parameter which allows to disable KAISER at boot time.

Most code pathes simply check a variable, but the entry code uses a static
branch. The other code pathes cannot use a static branch because they are
used before jump label patching is possible. Not an issue as the code
pathes are not so performance sensitive as the entry/exit code.

This makes KAISER depend on JUMP_LABEL and on a GCC which supports
it, but that's a resonable requirement.

The PGD allocation is still 8k when CONFIG_KAISER is enabled. This can be
addressed on top of this.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/entry/calling.h          |    7 +++++++
 arch/x86/include/asm/kaiser.h     |   10 ++++++++++
 arch/x86/include/asm/pgtable_64.h |    6 ++++++
 arch/x86/mm/dump_pagetables.c     |    5 ++++-
 arch/x86/mm/init.c                |    7 ++++---
 arch/x86/mm/kaiser.c              |   30 ++++++++++++++++++++++++++++++
 security/Kconfig                  |    2 +-
 7 files changed, 62 insertions(+), 5 deletions(-)

--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -210,18 +210,23 @@ For 32-bit we have the following convent
 .endm
 
 .macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
+	STATIC_JUMP_IF_FALSE .Lend_\@, kaiser_enabled_key, def=1
 	mov	%cr3, \scratch_reg
 	ADJUST_KERNEL_CR3 \scratch_reg
 	mov	\scratch_reg, %cr3
+.Lend_\@:
 .endm
 
 .macro SWITCH_TO_USER_CR3 scratch_reg:req
+	STATIC_JUMP_IF_FALSE .Lend_\@, kaiser_enabled_key, def=1
 	mov	%cr3, \scratch_reg
 	ADJUST_USER_CR3 \scratch_reg
 	mov	\scratch_reg, %cr3
+.Lend_\@:
 .endm
 
 .macro SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg:req save_reg:req
+	STATIC_JUMP_IF_FALSE .Ldone_\@, kaiser_enabled_key, def=1
 	movq	%cr3, %r\scratch_reg
 	movq	%r\scratch_reg, \save_reg
 	/*
@@ -244,11 +249,13 @@ For 32-bit we have the following convent
 .endm
 
 .macro RESTORE_CR3 save_reg:req
+	STATIC_JUMP_IF_FALSE .Lend_\@, kaiser_enabled_key, def=1
 	/*
 	 * The CR3 write could be avoided when not changing its value,
 	 * but would require a CR3 read *and* a scratch register.
 	 */
 	movq	\save_reg, %cr3
+.Lend_\@:
 .endm
 
 #else /* CONFIG_KAISER=n: */
--- a/arch/x86/include/asm/kaiser.h
+++ b/arch/x86/include/asm/kaiser.h
@@ -56,6 +56,16 @@ extern void kaiser_remove_mapping(unsign
  */
 extern void kaiser_init(void);
 
+/* True if kaiser is enabled at boot time */
+extern struct static_key_true kaiser_enabled_key;
+extern bool kaiser_enabled;
+extern void kaiser_check_cmdline(void);
+
+#else /* CONFIG_KAISER */
+
+#define kaiser_enabled		(false)
+static inline void kaiser_check_cmdline(void) { }
+
 #endif
 
 #endif /* __ASSEMBLY__ */
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -175,6 +175,9 @@ static inline p4d_t *shadow_to_kernel_p4
 {
 	return ptr_clear_bit(p4dp, KAISER_PGTABLE_SWITCH_BIT);
 }
+
+extern bool kaiser_enabled;
+
 #endif /* CONFIG_KAISER */
 
 /*
@@ -208,6 +211,9 @@ static inline bool pgd_userspace_access(
 static inline pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd)
 {
 #ifdef CONFIG_KAISER
+	if (!kaiser_enabled)
+		return pgd;
+
 	if (pgd_userspace_access(pgd)) {
 		if (pgdp_maps_userspace(pgdp)) {
 			/*
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -20,6 +20,7 @@
 #include <linux/seq_file.h>
 
 #include <asm/pgtable.h>
+#include <asm/kaiser.h>
 
 /*
  * The dumper groups pagetable entries of the same type into one, and for
@@ -503,7 +504,7 @@ void ptdump_walk_pgd_level(struct seq_fi
 
 void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool shadow)
 {
-	if (shadow)
+	if (shadow && kaiser_enabled)
 		pgd += PTRS_PER_PGD;
 	ptdump_walk_pgd_level_core(m, pgd, false, false);
 }
@@ -514,6 +515,8 @@ void ptdump_walk_shadow_pgd_level_checkw
 #ifdef CONFIG_KAISER
 	pgd_t *pgd = (pgd_t *) &init_top_pgt;
 
+	if (!kaiser_enabled)
+		return;
 	pr_info("x86/mm: Checking shadow page tables\n");
 	pgd += PTRS_PER_PGD;
 	ptdump_walk_pgd_level_core(NULL, pgd, true, false);
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -20,6 +20,7 @@
 #include <asm/kaslr.h>
 #include <asm/hypervisor.h>
 #include <asm/cpufeature.h>
+#include <asm/kaiser.h>
 
 /*
  * We need to define the tracepoints somewhere, and tlb.c
@@ -163,9 +164,8 @@ static int page_size_mask;
 
 static void enable_global_pages(void)
 {
-#ifndef CONFIG_KAISER
-	__supported_pte_mask |= _PAGE_GLOBAL;
-#endif
+	if (!kaiser_enabled)
+		__supported_pte_mask |= _PAGE_GLOBAL;
 }
 
 static void __init probe_page_size_mask(void)
@@ -656,6 +656,7 @@ void __init init_mem_mapping(void)
 {
 	unsigned long end;
 
+	kaiser_check_cmdline();
 	probe_page_size_mask();
 	setup_pcid();
 
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -34,6 +34,7 @@
 #include <linux/mm.h>
 #include <linux/uaccess.h>
 
+#include <asm/cmdline.h>
 #include <asm/kaiser.h>
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -44,6 +45,16 @@
 
 static pteval_t kaiser_pte_mask __ro_after_init = ~(_PAGE_NX | _PAGE_GLOBAL);
 
+/* Global flag for boot time kaiser enable/disable */
+bool kaiser_enabled __ro_after_init = true;
+DEFINE_STATIC_KEY_TRUE(kaiser_enabled_key);
+
+void __init kaiser_check_cmdline(void)
+{
+	if (cmdline_find_option_bool(boot_command_line, "nokaiser"))
+		kaiser_enabled = false;
+}
+
 /*
  * At runtime, the only things we map are some things for CPU
  * hotplug, and stacks for new processes.  No two CPUs will ever
@@ -252,6 +263,9 @@ int kaiser_add_user_map(const void *__st
 	unsigned long target_address;
 	pte_t *pte;
 
+	if (!kaiser_enabled)
+		return 0;
+
 	/* Clear not supported bits */
 	flags &= kaiser_pte_mask;
 
@@ -402,6 +416,9 @@ void __init kaiser_init(void)
 {
 	int cpu;
 
+	if (!kaiser_enabled)
+		return;
+
 	kaiser_init_all_pgds();
 
 	for_each_possible_cpu(cpu) {
@@ -436,6 +453,16 @@ void __init kaiser_init(void)
 	kaiser_add_mapping_cpu_entry(0);
 }
 
+static int __init kaiser_boottime_control(void)
+{
+	if (!kaiser_enabled) {
+		static_branch_disable(&kaiser_enabled_key);
+		pr_info("kaiser: Disabled on command line\n");
+	}
+	return 0;
+}
+subsys_initcall(kaiser_boottime_control);
+
 int kaiser_add_mapping(unsigned long addr, unsigned long size,
 		       unsigned long flags)
 {
@@ -446,6 +473,9 @@ void kaiser_remove_mapping(unsigned long
 {
 	unsigned long addr;
 
+	if (!kaiser_enabled)
+		return;
+
 	/* The shadow page tables always use small pages: */
 	for (addr = start; addr < start + size; addr += PAGE_SIZE) {
 		/*
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -56,7 +56,7 @@ config SECURITY_NETWORK
 
 config KAISER
 	bool "Remove the kernel mapping in user mode"
-	depends on X86_64 && SMP && !PARAVIRT
+	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
 	help
 	  This feature reduces the number of hardware side channels by
 	  ensuring that the majority of kernel addresses are not mapped


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 4/5] x86/mm/debug_pagetables: Allow dumping current pagetables
  2017-11-26 23:14   ` Thomas Gleixner
@ 2017-11-27  9:41     ` Peter Zijlstra
  -1 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27  9:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 12:14:07AM +0100, Thomas Gleixner wrote:
>  static int __init pt_dump_debug_init(void)
>  {
> +	pe_knl = debugfs_create_file("kernel_page_tables", S_IRUSR, NULL, NULL,
> +				     &ptdump_fops);
> +	if (!pe_knl)
>  		return -ENOMEM;
>  
> +	pe_curknl = debugfs_create_file("current_page_tables_knl", S_IRUSR,
> +					NULL, NULL, &ptdump_curknl_fops);
> +	if (!pe_curknl)
> +		goto err;
> +
> +#ifdef CONFIG_KAISER
> +	pe_curusr = debugfs_create_file("current_page_tables_usr", S_IRUSR,
> +					NULL, NULL, &ptdump_curusr_fops);
> +	if (!pe_curusr)
> +		goto err;
> +#endif
>  	return 0;
> +err:
> +	pt_dump_debug_remove_files();
> +	return -ENOMEM;
>  }


Could we pretty please use the octal permission thing? I can't read
thise S_crap nonsense.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 4/5] x86/mm/debug_pagetables: Allow dumping current pagetables
@ 2017-11-27  9:41     ` Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27  9:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 12:14:07AM +0100, Thomas Gleixner wrote:
>  static int __init pt_dump_debug_init(void)
>  {
> +	pe_knl = debugfs_create_file("kernel_page_tables", S_IRUSR, NULL, NULL,
> +				     &ptdump_fops);
> +	if (!pe_knl)
>  		return -ENOMEM;
>  
> +	pe_curknl = debugfs_create_file("current_page_tables_knl", S_IRUSR,
> +					NULL, NULL, &ptdump_curknl_fops);
> +	if (!pe_curknl)
> +		goto err;
> +
> +#ifdef CONFIG_KAISER
> +	pe_curusr = debugfs_create_file("current_page_tables_usr", S_IRUSR,
> +					NULL, NULL, &ptdump_curusr_fops);
> +	if (!pe_curusr)
> +		goto err;
> +#endif
>  	return 0;
> +err:
> +	pt_dump_debug_remove_files();
> +	return -ENOMEM;
>  }


Could we pretty please use the octal permission thing? I can't read
thise S_crap nonsense.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-26 23:14   ` Thomas Gleixner
@ 2017-11-27  9:48     ` Peter Zijlstra
  -1 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27  9:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 12:14:08AM +0100, Thomas Gleixner wrote:
> KAISER comes with overhead. The most expensive part is the CR3 switching in
> the entry code.
> 
> Add a command line parameter which allows to disable KAISER at boot time.
> 
> Most code pathes simply check a variable, but the entry code uses a static
> branch. The other code pathes cannot use a static branch because they are
> used before jump label patching is possible. Not an issue as the code
> pathes are not so performance sensitive as the entry/exit code.
> 
> This makes KAISER depend on JUMP_LABEL and on a GCC which supports
> it, but that's a resonable requirement.
> 
> The PGD allocation is still 8k when CONFIG_KAISER is enabled. This can be
> addressed on top of this.

So in patch 15 Andy notes that we should probably also disable the
SYSCALL trampoline when we disable KAISER.

  https://lkml.kernel.org/r/20171124172411.19476-16-mingo@kernel.org

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-27  9:48     ` Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27  9:48 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 12:14:08AM +0100, Thomas Gleixner wrote:
> KAISER comes with overhead. The most expensive part is the CR3 switching in
> the entry code.
> 
> Add a command line parameter which allows to disable KAISER at boot time.
> 
> Most code pathes simply check a variable, but the entry code uses a static
> branch. The other code pathes cannot use a static branch because they are
> used before jump label patching is possible. Not an issue as the code
> pathes are not so performance sensitive as the entry/exit code.
> 
> This makes KAISER depend on JUMP_LABEL and on a GCC which supports
> it, but that's a resonable requirement.
> 
> The PGD allocation is still 8k when CONFIG_KAISER is enabled. This can be
> addressed on top of this.

So in patch 15 Andy notes that we should probably also disable the
SYSCALL trampoline when we disable KAISER.

  https://lkml.kernel.org/r/20171124172411.19476-16-mingo@kernel.org

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
  2017-11-26 23:14   ` Thomas Gleixner
@ 2017-11-27  9:57     ` Peter Zijlstra
  -1 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27  9:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 12:14:04AM +0100, Thomas Gleixner wrote:
> PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
> line, but KAISER sets them unconditionally.

So KAISER is x86_64 only, right? AFAIK there is no x86_64 without NX
support. So would it not make sense to mandate NX for KAISER?, that is
instead of making "noexec" + KAISER work, make "noexec" kill KAISER +
emit a warning.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
@ 2017-11-27  9:57     ` Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27  9:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 12:14:04AM +0100, Thomas Gleixner wrote:
> PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
> line, but KAISER sets them unconditionally.

So KAISER is x86_64 only, right? AFAIK there is no x86_64 without NX
support. So would it not make sense to mandate NX for KAISER?, that is
instead of making "noexec" + KAISER work, make "noexec" kill KAISER +
emit a warning.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
  2017-11-27  9:41     ` Peter Zijlstra
@ 2017-11-27 10:06       ` Ingo Molnar
  -1 siblings, 0 replies; 64+ messages in thread
From: Ingo Molnar @ 2017-11-27 10:06 UTC (permalink / raw)
  To: Peter Zijlstra, Linus Torvalds
  Cc: Thomas Gleixner, LKML, Dave Hansen, Andy Lutomirski,
	Borislav Petkov, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
	Josh Poimboeuf, Linus Torvalds, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Nov 27, 2017 at 12:14:07AM +0100, Thomas Gleixner wrote:
> >  static int __init pt_dump_debug_init(void)
> >  {
> > +	pe_knl = debugfs_create_file("kernel_page_tables", S_IRUSR, NULL, NULL,
> > +				     &ptdump_fops);
> > +	if (!pe_knl)
> >  		return -ENOMEM;
> >  
> > +	pe_curknl = debugfs_create_file("current_page_tables_knl", S_IRUSR,
> > +					NULL, NULL, &ptdump_curknl_fops);
> > +	if (!pe_curknl)
> > +		goto err;
> > +
> > +#ifdef CONFIG_KAISER
> > +	pe_curusr = debugfs_create_file("current_page_tables_usr", S_IRUSR,
> > +					NULL, NULL, &ptdump_curusr_fops);
> > +	if (!pe_curusr)
> > +		goto err;
> > +#endif
> >  	return 0;
> > +err:
> > +	pt_dump_debug_remove_files();
> > +	return -ENOMEM;
> >  }
> 
> 
> Could we pretty please use the octal permission thing? I can't read
> thise S_crap nonsense.

They are completely unreadable to me too. So if we added these helpers I sent a 
year ago:

	https://lwn.net/Articles/696231/

Then the above could be written as:

	pe_curknl = debugfs_create_file("current_page_tables_knl", PERM_r________,
					NULL, NULL, &ptdump_curknl_fops);

... etc., which is much more readable IMHO. Not only that, it would be trivial to 
_write_ permission masks as well. I just wrote this:

	PERM_rw_r__r__

Which is so much more readable than "S_IRUGO|S_IWUSR" or even "0644". The former 
pattern is used 527 times in the kernel ...

The patch below adds it to the current kernel.

Thanks,

	Ingo
---
Signed-off-by: Ingo Molnar <mingo@kernel.org>

 include/linux/stat.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/include/linux/stat.h b/include/linux/stat.h
index 22484e44544d..fc389c3a8692 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -6,6 +6,38 @@
 #include <asm/stat.h>
 #include <uapi/linux/stat.h>
 
+/*
+ * Human readable symbolic definitions for common
+ * file permissions:
+ */
+#define PERM_r________	0400
+#define PERM_r__r_____	0440
+#define PERM_r__r__r__	0444
+
+#define PERM_rw_______	0600
+#define PERM_rw_r_____	0640
+#define PERM_rw_r__r__	0644
+#define PERM_rw_rw_r__	0664
+#define PERM_rw_rw_rw_	0666
+
+#define PERM__w_______	0200
+#define PERM__w__w____	0220
+#define PERM__w__w__w_	0222
+
+#define PERM_r_x______	0500
+#define PERM_r_xr_x___	0550
+#define PERM_r_xr_xr_x	0555
+
+#define PERM_rwx______	0700
+#define PERM_rwxr_x___	0750
+#define PERM_rwxr_xr_x	0755
+#define PERM_rwxrwxr_x	0775
+#define PERM_rwxrwxrwx	0777
+
+#define PERM__wx______	0300
+#define PERM__wx_wx___	0330
+#define PERM__wx_wx_wx	0333
+
 #define S_IRWXUGO	(S_IRWXU|S_IRWXG|S_IRWXO)
 #define S_IALLUGO	(S_ISUID|S_ISGID|S_ISVTX|S_IRWXUGO)
 #define S_IRUGO		(S_IRUSR|S_IRGRP|S_IROTH)

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
@ 2017-11-27 10:06       ` Ingo Molnar
  0 siblings, 0 replies; 64+ messages in thread
From: Ingo Molnar @ 2017-11-27 10:06 UTC (permalink / raw)
  To: Peter Zijlstra, Linus Torvalds
  Cc: Thomas Gleixner, LKML, Dave Hansen, Andy Lutomirski,
	Borislav Petkov, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
	Josh Poimboeuf, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Nov 27, 2017 at 12:14:07AM +0100, Thomas Gleixner wrote:
> >  static int __init pt_dump_debug_init(void)
> >  {
> > +	pe_knl = debugfs_create_file("kernel_page_tables", S_IRUSR, NULL, NULL,
> > +				     &ptdump_fops);
> > +	if (!pe_knl)
> >  		return -ENOMEM;
> >  
> > +	pe_curknl = debugfs_create_file("current_page_tables_knl", S_IRUSR,
> > +					NULL, NULL, &ptdump_curknl_fops);
> > +	if (!pe_curknl)
> > +		goto err;
> > +
> > +#ifdef CONFIG_KAISER
> > +	pe_curusr = debugfs_create_file("current_page_tables_usr", S_IRUSR,
> > +					NULL, NULL, &ptdump_curusr_fops);
> > +	if (!pe_curusr)
> > +		goto err;
> > +#endif
> >  	return 0;
> > +err:
> > +	pt_dump_debug_remove_files();
> > +	return -ENOMEM;
> >  }
> 
> 
> Could we pretty please use the octal permission thing? I can't read
> thise S_crap nonsense.

They are completely unreadable to me too. So if we added these helpers I sent a 
year ago:

	https://lwn.net/Articles/696231/

Then the above could be written as:

	pe_curknl = debugfs_create_file("current_page_tables_knl", PERM_r________,
					NULL, NULL, &ptdump_curknl_fops);

... etc., which is much more readable IMHO. Not only that, it would be trivial to 
_write_ permission masks as well. I just wrote this:

	PERM_rw_r__r__

Which is so much more readable than "S_IRUGO|S_IWUSR" or even "0644". The former 
pattern is used 527 times in the kernel ...

The patch below adds it to the current kernel.

Thanks,

	Ingo
---
Signed-off-by: Ingo Molnar <mingo@kernel.org>

 include/linux/stat.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/include/linux/stat.h b/include/linux/stat.h
index 22484e44544d..fc389c3a8692 100644
--- a/include/linux/stat.h
+++ b/include/linux/stat.h
@@ -6,6 +6,38 @@
 #include <asm/stat.h>
 #include <uapi/linux/stat.h>
 
+/*
+ * Human readable symbolic definitions for common
+ * file permissions:
+ */
+#define PERM_r________	0400
+#define PERM_r__r_____	0440
+#define PERM_r__r__r__	0444
+
+#define PERM_rw_______	0600
+#define PERM_rw_r_____	0640
+#define PERM_rw_r__r__	0644
+#define PERM_rw_rw_r__	0664
+#define PERM_rw_rw_rw_	0666
+
+#define PERM__w_______	0200
+#define PERM__w__w____	0220
+#define PERM__w__w__w_	0222
+
+#define PERM_r_x______	0500
+#define PERM_r_xr_x___	0550
+#define PERM_r_xr_xr_x	0555
+
+#define PERM_rwx______	0700
+#define PERM_rwxr_x___	0750
+#define PERM_rwxr_xr_x	0755
+#define PERM_rwxrwxr_x	0775
+#define PERM_rwxrwxrwx	0777
+
+#define PERM__wx______	0300
+#define PERM__wx_wx___	0330
+#define PERM__wx_wx_wx	0333
+
 #define S_IRWXUGO	(S_IRWXU|S_IRWXG|S_IRWXO)
 #define S_IALLUGO	(S_ISUID|S_ISGID|S_ISVTX|S_IRWXUGO)
 #define S_IRUGO		(S_IRUSR|S_IRGRP|S_IROTH)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-27  9:48     ` Peter Zijlstra
@ 2017-11-27 10:22       ` Peter Zijlstra
  -1 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27 10:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 10:48:46AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 27, 2017 at 12:14:08AM +0100, Thomas Gleixner wrote:
> > KAISER comes with overhead. The most expensive part is the CR3 switching in
> > the entry code.
> > 
> > Add a command line parameter which allows to disable KAISER at boot time.
> > 
> > Most code pathes simply check a variable, but the entry code uses a static
> > branch. The other code pathes cannot use a static branch because they are
> > used before jump label patching is possible. Not an issue as the code
> > pathes are not so performance sensitive as the entry/exit code.
> > 
> > This makes KAISER depend on JUMP_LABEL and on a GCC which supports
> > it, but that's a resonable requirement.
> > 
> > The PGD allocation is still 8k when CONFIG_KAISER is enabled. This can be
> > addressed on top of this.
> 
> So in patch 15 Andy notes that we should probably also disable the
> SYSCALL trampoline when we disable KAISER.
> 
>   https://lkml.kernel.org/r/20171124172411.19476-16-mingo@kernel.org

Could be a simple as this.. but I've not tested.


diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f4f4ab8525bd..1be393a97421 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1442,7 +1442,10 @@ void syscall_init(void)
 		(entry_SYSCALL_64_trampoline - _entry_trampoline);
 
 	wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
-	wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
+	if (kaiser_enabled)
+		wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
+	else
+		wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
 
 #ifdef CONFIG_IA32_EMULATION
 	wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat);

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-27 10:22       ` Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27 10:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 10:48:46AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 27, 2017 at 12:14:08AM +0100, Thomas Gleixner wrote:
> > KAISER comes with overhead. The most expensive part is the CR3 switching in
> > the entry code.
> > 
> > Add a command line parameter which allows to disable KAISER at boot time.
> > 
> > Most code pathes simply check a variable, but the entry code uses a static
> > branch. The other code pathes cannot use a static branch because they are
> > used before jump label patching is possible. Not an issue as the code
> > pathes are not so performance sensitive as the entry/exit code.
> > 
> > This makes KAISER depend on JUMP_LABEL and on a GCC which supports
> > it, but that's a resonable requirement.
> > 
> > The PGD allocation is still 8k when CONFIG_KAISER is enabled. This can be
> > addressed on top of this.
> 
> So in patch 15 Andy notes that we should probably also disable the
> SYSCALL trampoline when we disable KAISER.
> 
>   https://lkml.kernel.org/r/20171124172411.19476-16-mingo@kernel.org

Could be a simple as this.. but I've not tested.


diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f4f4ab8525bd..1be393a97421 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1442,7 +1442,10 @@ void syscall_init(void)
 		(entry_SYSCALL_64_trampoline - _entry_trampoline);
 
 	wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
-	wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
+	if (kaiser_enabled)
+		wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
+	else
+		wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
 
 #ifdef CONFIG_IA32_EMULATION
 	wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat);

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
  2017-11-27  9:57     ` Peter Zijlstra
@ 2017-11-27 11:47       ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 11:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Peter Zijlstra wrote:
> On Mon, Nov 27, 2017 at 12:14:04AM +0100, Thomas Gleixner wrote:
> > PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
> > line, but KAISER sets them unconditionally.
> 
> So KAISER is x86_64 only, right? AFAIK there is no x86_64 without NX
> support. So would it not make sense to mandate NX for KAISER?, that is
> instead of making "noexec" + KAISER work, make "noexec" kill KAISER +
> emit a warning.

OTOH, disabling NX is a simple way to verify that DEBUG_WX works correctly
also on the shadow maps.

But surely we can drop the PAGE_GLOBAL thing, as all 64bit systems have it.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
@ 2017-11-27 11:47       ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 11:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Peter Zijlstra wrote:
> On Mon, Nov 27, 2017 at 12:14:04AM +0100, Thomas Gleixner wrote:
> > PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
> > line, but KAISER sets them unconditionally.
> 
> So KAISER is x86_64 only, right? AFAIK there is no x86_64 without NX
> support. So would it not make sense to mandate NX for KAISER?, that is
> instead of making "noexec" + KAISER work, make "noexec" kill KAISER +
> emit a warning.

OTOH, disabling NX is a simple way to verify that DEBUG_WX works correctly
also on the shadow maps.

But surely we can drop the PAGE_GLOBAL thing, as all 64bit systems have it.

Thanks,

	tglx

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 2/5] x86/kaiser: Simplify disabling of global pages
  2017-11-26 23:14   ` Thomas Gleixner
@ 2017-11-27 11:49     ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 11:49 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

On Mon, 27 Nov 2017, Thomas Gleixner wrote:
>  	/*
> @@ -179,11 +186,11 @@ static void __init probe_page_size_mask(
>  		cr4_set_bits_and_update_boot(X86_CR4_PSE);
>  
>  	/* Enable PGE if available */
> +	__supported_pte_mask |= _PAGE_GLOBAL;

Bah. Late night reject fixup wreckage. That wants to be

	__supported_pte_mask &= ~_PAGE_GLOBAL;

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 2/5] x86/kaiser: Simplify disabling of global pages
@ 2017-11-27 11:49     ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 11:49 UTC (permalink / raw)
  To: LKML
  Cc: Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Peter Zijlstra, Rik van Riel, daniel.gruss,
	hughd, keescook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

On Mon, 27 Nov 2017, Thomas Gleixner wrote:
>  	/*
> @@ -179,11 +186,11 @@ static void __init probe_page_size_mask(
>  		cr4_set_bits_and_update_boot(X86_CR4_PSE);
>  
>  	/* Enable PGE if available */
> +	__supported_pte_mask |= _PAGE_GLOBAL;

Bah. Late night reject fixup wreckage. That wants to be

	__supported_pte_mask &= ~_PAGE_GLOBAL;

Thanks,

	tglx

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-27 10:22       ` Peter Zijlstra
@ 2017-11-27 11:50         ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 11:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Peter Zijlstra wrote:
> On Mon, Nov 27, 2017 at 10:48:46AM +0100, Peter Zijlstra wrote:
> > On Mon, Nov 27, 2017 at 12:14:08AM +0100, Thomas Gleixner wrote:
> > > KAISER comes with overhead. The most expensive part is the CR3 switching in
> > > the entry code.
> > > 
> > > Add a command line parameter which allows to disable KAISER at boot time.
> > > 
> > > Most code pathes simply check a variable, but the entry code uses a static
> > > branch. The other code pathes cannot use a static branch because they are
> > > used before jump label patching is possible. Not an issue as the code
> > > pathes are not so performance sensitive as the entry/exit code.
> > > 
> > > This makes KAISER depend on JUMP_LABEL and on a GCC which supports
> > > it, but that's a resonable requirement.
> > > 
> > > The PGD allocation is still 8k when CONFIG_KAISER is enabled. This can be
> > > addressed on top of this.
> > 
> > So in patch 15 Andy notes that we should probably also disable the
> > SYSCALL trampoline when we disable KAISER.
> > 
> >   https://lkml.kernel.org/r/20171124172411.19476-16-mingo@kernel.org
> 
> Could be a simple as this.. but I've not tested.

That's only one part of it. I think we need to fiddle with the exit side as
well.

Thanks,

	tglx

> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index f4f4ab8525bd..1be393a97421 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1442,7 +1442,10 @@ void syscall_init(void)
>  		(entry_SYSCALL_64_trampoline - _entry_trampoline);
>  
>  	wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
> -	wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
> +	if (kaiser_enabled)
> +		wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
> +	else
> +		wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
>  
>  #ifdef CONFIG_IA32_EMULATION
>  	wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat);
> 

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-27 11:50         ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 11:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Peter Zijlstra wrote:
> On Mon, Nov 27, 2017 at 10:48:46AM +0100, Peter Zijlstra wrote:
> > On Mon, Nov 27, 2017 at 12:14:08AM +0100, Thomas Gleixner wrote:
> > > KAISER comes with overhead. The most expensive part is the CR3 switching in
> > > the entry code.
> > > 
> > > Add a command line parameter which allows to disable KAISER at boot time.
> > > 
> > > Most code pathes simply check a variable, but the entry code uses a static
> > > branch. The other code pathes cannot use a static branch because they are
> > > used before jump label patching is possible. Not an issue as the code
> > > pathes are not so performance sensitive as the entry/exit code.
> > > 
> > > This makes KAISER depend on JUMP_LABEL and on a GCC which supports
> > > it, but that's a resonable requirement.
> > > 
> > > The PGD allocation is still 8k when CONFIG_KAISER is enabled. This can be
> > > addressed on top of this.
> > 
> > So in patch 15 Andy notes that we should probably also disable the
> > SYSCALL trampoline when we disable KAISER.
> > 
> >   https://lkml.kernel.org/r/20171124172411.19476-16-mingo@kernel.org
> 
> Could be a simple as this.. but I've not tested.

That's only one part of it. I think we need to fiddle with the exit side as
well.

Thanks,

	tglx

> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index f4f4ab8525bd..1be393a97421 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1442,7 +1442,10 @@ void syscall_init(void)
>  		(entry_SYSCALL_64_trampoline - _entry_trampoline);
>  
>  	wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
> -	wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
> +	if (kaiser_enabled)
> +		wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
> +	else
> +		wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
>  
>  #ifdef CONFIG_IA32_EMULATION
>  	wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat);
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
  2017-11-27 11:47       ` Thomas Gleixner
@ 2017-11-27 12:31         ` Brian Gerst
  -1 siblings, 0 replies; 64+ messages in thread
From: Brian Gerst @ 2017-11-27 12:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, Daniel Gruss, Hugh Dickins,
	Kees Cook, Linux-MM, michael.schwarz, moritz.lipp,
	richard.fellner

On Mon, Nov 27, 2017 at 6:47 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 27 Nov 2017, Peter Zijlstra wrote:
>> On Mon, Nov 27, 2017 at 12:14:04AM +0100, Thomas Gleixner wrote:
>> > PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
>> > line, but KAISER sets them unconditionally.
>>
>> So KAISER is x86_64 only, right? AFAIK there is no x86_64 without NX
>> support. So would it not make sense to mandate NX for KAISER?, that is
>> instead of making "noexec" + KAISER work, make "noexec" kill KAISER +
>> emit a warning.
>
> OTOH, disabling NX is a simple way to verify that DEBUG_WX works correctly
> also on the shadow maps.
>
> But surely we can drop the PAGE_GLOBAL thing, as all 64bit systems have it.

I seem to recall that some virtualized environments (maybe Xen?) don't
support global pages.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
@ 2017-11-27 12:31         ` Brian Gerst
  0 siblings, 0 replies; 64+ messages in thread
From: Brian Gerst @ 2017-11-27 12:31 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, Daniel Gruss, Hugh Dickins,
	Kees Cook, Linux-MM, michael.schwarz, moritz.lipp,
	richard.fellner

On Mon, Nov 27, 2017 at 6:47 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 27 Nov 2017, Peter Zijlstra wrote:
>> On Mon, Nov 27, 2017 at 12:14:04AM +0100, Thomas Gleixner wrote:
>> > PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
>> > line, but KAISER sets them unconditionally.
>>
>> So KAISER is x86_64 only, right? AFAIK there is no x86_64 without NX
>> support. So would it not make sense to mandate NX for KAISER?, that is
>> instead of making "noexec" + KAISER work, make "noexec" kill KAISER +
>> emit a warning.
>
> OTOH, disabling NX is a simple way to verify that DEBUG_WX works correctly
> also on the shadow maps.
>
> But surely we can drop the PAGE_GLOBAL thing, as all 64bit systems have it.

I seem to recall that some virtualized environments (maybe Xen?) don't
support global pages.

--
Brian Gerst

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-27 11:50         ` Thomas Gleixner
@ 2017-11-27 12:49           ` Peter Zijlstra
  -1 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27 12:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 12:50:45PM +0100, Thomas Gleixner wrote:
> On Mon, 27 Nov 2017, Peter Zijlstra wrote:
> > On Mon, Nov 27, 2017 at 10:48:46AM +0100, Peter Zijlstra wrote:

> > > So in patch 15 Andy notes that we should probably also disable the
> > > SYSCALL trampoline when we disable KAISER.
> > > 
> > >   https://lkml.kernel.org/r/20171124172411.19476-16-mingo@kernel.org
> > 
> > Could be a simple as this.. but I've not tested.
> 
> That's only one part of it. I think we need to fiddle with the exit side as
> well.

So I assumed that the patches were bisectable. From that I figured the
exit path (patch 14 in that set) would work either way.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-27 12:49           ` Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27 12:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 12:50:45PM +0100, Thomas Gleixner wrote:
> On Mon, 27 Nov 2017, Peter Zijlstra wrote:
> > On Mon, Nov 27, 2017 at 10:48:46AM +0100, Peter Zijlstra wrote:

> > > So in patch 15 Andy notes that we should probably also disable the
> > > SYSCALL trampoline when we disable KAISER.
> > > 
> > >   https://lkml.kernel.org/r/20171124172411.19476-16-mingo@kernel.org
> > 
> > Could be a simple as this.. but I've not tested.
> 
> That's only one part of it. I think we need to fiddle with the exit side as
> well.

So I assumed that the patches were bisectable. From that I figured the
exit path (patch 14 in that set) would work either way.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
  2017-11-27 12:31         ` Brian Gerst
@ 2017-11-27 13:18           ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 13:18 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Peter Zijlstra, LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, Daniel Gruss, Hugh Dickins,
	Kees Cook, Linux-MM, michael.schwarz, moritz.lipp,
	richard.fellner

On Mon, 27 Nov 2017, Brian Gerst wrote:
> On Mon, Nov 27, 2017 at 6:47 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Mon, 27 Nov 2017, Peter Zijlstra wrote:
> >> On Mon, Nov 27, 2017 at 12:14:04AM +0100, Thomas Gleixner wrote:
> >> > PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
> >> > line, but KAISER sets them unconditionally.
> >>
> >> So KAISER is x86_64 only, right? AFAIK there is no x86_64 without NX
> >> support. So would it not make sense to mandate NX for KAISER?, that is
> >> instead of making "noexec" + KAISER work, make "noexec" kill KAISER +
> >> emit a warning.
> >
> > OTOH, disabling NX is a simple way to verify that DEBUG_WX works correctly
> > also on the shadow maps.
> >
> > But surely we can drop the PAGE_GLOBAL thing, as all 64bit systems have it.
> 
> I seem to recall that some virtualized environments (maybe Xen?) don't
> support global pages.

Uuurg. Ok, we leave it as is for now. Better safe than sorry. It does no
harm.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
@ 2017-11-27 13:18           ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 13:18 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Peter Zijlstra, LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, Daniel Gruss, Hugh Dickins,
	Kees Cook, Linux-MM, michael.schwarz, moritz.lipp,
	richard.fellner

On Mon, 27 Nov 2017, Brian Gerst wrote:
> On Mon, Nov 27, 2017 at 6:47 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Mon, 27 Nov 2017, Peter Zijlstra wrote:
> >> On Mon, Nov 27, 2017 at 12:14:04AM +0100, Thomas Gleixner wrote:
> >> > PAGE_NX and PAGE_GLOBAL might be not supported or disabled on the command
> >> > line, but KAISER sets them unconditionally.
> >>
> >> So KAISER is x86_64 only, right? AFAIK there is no x86_64 without NX
> >> support. So would it not make sense to mandate NX for KAISER?, that is
> >> instead of making "noexec" + KAISER work, make "noexec" kill KAISER +
> >> emit a warning.
> >
> > OTOH, disabling NX is a simple way to verify that DEBUG_WX works correctly
> > also on the shadow maps.
> >
> > But surely we can drop the PAGE_GLOBAL thing, as all 64bit systems have it.
> 
> I seem to recall that some virtualized environments (maybe Xen?) don't
> support global pages.

Uuurg. Ok, we leave it as is for now. Better safe than sorry. It does no
harm.

Thanks,

	tglx

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
  2017-11-26 23:14   ` Thomas Gleixner
@ 2017-11-27 18:11     ` Dave Hansen
  -1 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:11 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

> --- a/arch/x86/include/asm/pgtable_64.h
> +++ b/arch/x86/include/asm/pgtable_64.h
> @@ -222,7 +222,8 @@ static inline pgd_t kaiser_set_shadow_pg
>  			 * wrong CR3 value, userspace will crash
>  			 * instead of running.
>  			 */
> -			pgd.pgd |= _PAGE_NX;
> +			if (__supported_pte_mask & _PAGE_NX)
> +				pgd.pgd |= _PAGE_NX;
>  		}

Thanks for catching that.  It's definitely a bug.  Although,
practically, it's hard to hit, right?  I think everything 64-bit
supports NX unless the hypervisor disabled it or something.

>  	} else if (pgd_userspace_access(*pgdp)) {
>  		/*
> --- a/arch/x86/mm/kaiser.c
> +++ b/arch/x86/mm/kaiser.c
> @@ -42,6 +42,8 @@
>  
>  #define KAISER_WALK_ATOMIC  0x1
>  
> +static pteval_t kaiser_pte_mask __ro_after_init = ~(_PAGE_NX | _PAGE_GLOBAL);

Do we need a comment on this, like:

/*
 * The NX and GLOBAL bits are not supported on all CPUs.
 * We will add them back to this mask at runtime in
 * kaiser_init_all_pgds() if supported.
 */

>  /*
>   * At runtime, the only things we map are some things for CPU
>   * hotplug, and stacks for new processes.  No two CPUs will ever
> @@ -244,11 +246,14 @@ static pte_t *kaiser_shadow_pagetable_wa
>  int kaiser_add_user_map(const void *__start_addr, unsigned long size,
>  			unsigned long flags)
>  {
> -	pte_t *pte;
>  	unsigned long start_addr = (unsigned long)__start_addr;
>  	unsigned long address = start_addr & PAGE_MASK;
>  	unsigned long end_addr = PAGE_ALIGN(start_addr + size);
>  	unsigned long target_address;
> +	pte_t *pte;
> +
> +	/* Clear not supported bits */
> +	flags &= kaiser_pte_mask;

Should we be warning on these if we clear them?  Seems kinda funky to
silently fix them up.

>  	for (; address < end_addr; address += PAGE_SIZE) {
>  		target_address = get_pa_from_kernel_map(address);
> @@ -308,6 +313,11 @@ static void __init kaiser_init_all_pgds(
>  	pgd_t *pgd;
>  	int i;
>  
> +	if (__supported_pte_mask & _PAGE_NX)
> +		kaiser_pte_mask |= _PAGE_NX;
> +	if (boot_cpu_has(X86_FEATURE_PGE))
> +		kaiser_pte_mask |= _PAGE_GLOBAL;

Practically, I guess boot_cpu_has(X86_FEATURE_PGE) == (cr4_read() &
X86_CR4_PGE).  But, in a slow path like this, is it perhaps better to
just be checking CR4 directly?

Looks functionally fine to me, though.  Feel free to add my Reviewed-by
or Acked-by.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
@ 2017-11-27 18:11     ` Dave Hansen
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:11 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

> --- a/arch/x86/include/asm/pgtable_64.h
> +++ b/arch/x86/include/asm/pgtable_64.h
> @@ -222,7 +222,8 @@ static inline pgd_t kaiser_set_shadow_pg
>  			 * wrong CR3 value, userspace will crash
>  			 * instead of running.
>  			 */
> -			pgd.pgd |= _PAGE_NX;
> +			if (__supported_pte_mask & _PAGE_NX)
> +				pgd.pgd |= _PAGE_NX;
>  		}

Thanks for catching that.  It's definitely a bug.  Although,
practically, it's hard to hit, right?  I think everything 64-bit
supports NX unless the hypervisor disabled it or something.

>  	} else if (pgd_userspace_access(*pgdp)) {
>  		/*
> --- a/arch/x86/mm/kaiser.c
> +++ b/arch/x86/mm/kaiser.c
> @@ -42,6 +42,8 @@
>  
>  #define KAISER_WALK_ATOMIC  0x1
>  
> +static pteval_t kaiser_pte_mask __ro_after_init = ~(_PAGE_NX | _PAGE_GLOBAL);

Do we need a comment on this, like:

/*
 * The NX and GLOBAL bits are not supported on all CPUs.
 * We will add them back to this mask at runtime in
 * kaiser_init_all_pgds() if supported.
 */

>  /*
>   * At runtime, the only things we map are some things for CPU
>   * hotplug, and stacks for new processes.  No two CPUs will ever
> @@ -244,11 +246,14 @@ static pte_t *kaiser_shadow_pagetable_wa
>  int kaiser_add_user_map(const void *__start_addr, unsigned long size,
>  			unsigned long flags)
>  {
> -	pte_t *pte;
>  	unsigned long start_addr = (unsigned long)__start_addr;
>  	unsigned long address = start_addr & PAGE_MASK;
>  	unsigned long end_addr = PAGE_ALIGN(start_addr + size);
>  	unsigned long target_address;
> +	pte_t *pte;
> +
> +	/* Clear not supported bits */
> +	flags &= kaiser_pte_mask;

Should we be warning on these if we clear them?  Seems kinda funky to
silently fix them up.

>  	for (; address < end_addr; address += PAGE_SIZE) {
>  		target_address = get_pa_from_kernel_map(address);
> @@ -308,6 +313,11 @@ static void __init kaiser_init_all_pgds(
>  	pgd_t *pgd;
>  	int i;
>  
> +	if (__supported_pte_mask & _PAGE_NX)
> +		kaiser_pte_mask |= _PAGE_NX;
> +	if (boot_cpu_has(X86_FEATURE_PGE))
> +		kaiser_pte_mask |= _PAGE_GLOBAL;

Practically, I guess boot_cpu_has(X86_FEATURE_PGE) == (cr4_read() &
X86_CR4_PGE).  But, in a slow path like this, is it perhaps better to
just be checking CR4 directly?

Looks functionally fine to me, though.  Feel free to add my Reviewed-by
or Acked-by.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 2/5] x86/kaiser: Simplify disabling of global pages
  2017-11-26 23:14   ` Thomas Gleixner
@ 2017-11-27 18:15     ` Dave Hansen
  -1 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:15 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> +static void enable_global_pages(void)
> +{
> +#ifndef CONFIG_KAISER
> +	__supported_pte_mask |= _PAGE_GLOBAL;
> +#endif
> +}
> +
>  static void __init probe_page_size_mask(void)
>  {
>  	/*
> @@ -179,11 +186,11 @@ static void __init probe_page_size_mask(
>  		cr4_set_bits_and_update_boot(X86_CR4_PSE);
>  
>  	/* Enable PGE if available */
> +	__supported_pte_mask |= _PAGE_GLOBAL;
>  	if (boot_cpu_has(X86_FEATURE_PGE)) {
>  		cr4_set_bits_and_update_boot(X86_CR4_PGE);
> -		__supported_pte_mask |= _PAGE_GLOBAL;
> -	} else
> -		__supported_pte_mask &= ~_PAGE_GLOBAL;
> +		enable_global_pages();
> +	}

This looks a little funky.  Doesn't this or _PAGE_GLOBAL into
__supported_pte_mask twice?  Once before the if(), and then a second
time via enable_global_pages() inside the if?

Did you intend for it to be masked *out* in the first one and then or'd
back in via enable_global_pages()?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 2/5] x86/kaiser: Simplify disabling of global pages
@ 2017-11-27 18:15     ` Dave Hansen
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:15 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> +static void enable_global_pages(void)
> +{
> +#ifndef CONFIG_KAISER
> +	__supported_pte_mask |= _PAGE_GLOBAL;
> +#endif
> +}
> +
>  static void __init probe_page_size_mask(void)
>  {
>  	/*
> @@ -179,11 +186,11 @@ static void __init probe_page_size_mask(
>  		cr4_set_bits_and_update_boot(X86_CR4_PSE);
>  
>  	/* Enable PGE if available */
> +	__supported_pte_mask |= _PAGE_GLOBAL;
>  	if (boot_cpu_has(X86_FEATURE_PGE)) {
>  		cr4_set_bits_and_update_boot(X86_CR4_PGE);
> -		__supported_pte_mask |= _PAGE_GLOBAL;
> -	} else
> -		__supported_pte_mask &= ~_PAGE_GLOBAL;
> +		enable_global_pages();
> +	}

This looks a little funky.  Doesn't this or _PAGE_GLOBAL into
__supported_pte_mask twice?  Once before the if(), and then a second
time via enable_global_pages() inside the if?

Did you intend for it to be masked *out* in the first one and then or'd
back in via enable_global_pages()?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 3/5] x86/dump_pagetables: Check KAISER shadow page table for WX pages
  2017-11-26 23:14   ` Thomas Gleixner
@ 2017-11-27 18:17     ` Dave Hansen
  -1 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:17 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> +void ptdump_walk_shadow_pgd_level_checkwx(void)
> +{
> +#ifdef CONFIG_KAISER
> +	pgd_t *pgd = (pgd_t *) &init_top_pgt;
> +
> +	pr_info("x86/mm: Checking shadow page tables\n");
> +	pgd += PTRS_PER_PGD;
> +	ptdump_walk_pgd_level_core(NULL, pgd, true, false);
> +#endif
>  }

We have the kernel_to_shadow_pgdp() function to use instead of "pgd +=
PTRS_PER_PGD;".  Should it be used instead?

Otherwise, looks good to me.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 3/5] x86/dump_pagetables: Check KAISER shadow page table for WX pages
@ 2017-11-27 18:17     ` Dave Hansen
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:17 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> +void ptdump_walk_shadow_pgd_level_checkwx(void)
> +{
> +#ifdef CONFIG_KAISER
> +	pgd_t *pgd = (pgd_t *) &init_top_pgt;
> +
> +	pr_info("x86/mm: Checking shadow page tables\n");
> +	pgd += PTRS_PER_PGD;
> +	ptdump_walk_pgd_level_core(NULL, pgd, true, false);
> +#endif
>  }

We have the kernel_to_shadow_pgdp() function to use instead of "pgd +=
PTRS_PER_PGD;".  Should it be used instead?

Otherwise, looks good to me.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 4/5] x86/mm/debug_pagetables: Allow dumping current pagetables
  2017-11-26 23:14   ` Thomas Gleixner
@ 2017-11-27 18:18     ` Dave Hansen
  -1 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:18 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> -void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd)
> +void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool shadow)
>  {
> +	if (shadow)
> +		pgd += PTRS_PER_PGD;
>  	ptdump_walk_pgd_level_core(m, pgd, false, false);
>  }

Same comment about calculating the shadow pgd.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 4/5] x86/mm/debug_pagetables: Allow dumping current pagetables
@ 2017-11-27 18:18     ` Dave Hansen
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:18 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> -void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd)
> +void ptdump_walk_pgd_level_debugfs(struct seq_file *m, pgd_t *pgd, bool shadow)
>  {
> +	if (shadow)
> +		pgd += PTRS_PER_PGD;
>  	ptdump_walk_pgd_level_core(m, pgd, false, false);
>  }

Same comment about calculating the shadow pgd.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-26 23:14   ` Thomas Gleixner
@ 2017-11-27 18:22     ` Dave Hansen
  -1 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:22 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -56,7 +56,7 @@ config SECURITY_NETWORK
>  
>  config KAISER
>  	bool "Remove the kernel mapping in user mode"
> -	depends on X86_64 && SMP && !PARAVIRT
> +	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
>  	help
>  	  This feature reduces the number of hardware side channels by
>  	  ensuring that the majority of kernel addresses are not mapped

One of the reasons for doing the runtime-disable was to get rid of the
!PARAVIRT dependency.  I can add a follow-on here that will act as if we
did "nokaiser" whenever Xen is in play so we can remove this dependency.

I just hope Xen is detectable early enough to do the static patching.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-27 18:22     ` Dave Hansen
  0 siblings, 0 replies; 64+ messages in thread
From: Dave Hansen @ 2017-11-27 18:22 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> --- a/security/Kconfig
> +++ b/security/Kconfig
> @@ -56,7 +56,7 @@ config SECURITY_NETWORK
>  
>  config KAISER
>  	bool "Remove the kernel mapping in user mode"
> -	depends on X86_64 && SMP && !PARAVIRT
> +	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
>  	help
>  	  This feature reduces the number of hardware side channels by
>  	  ensuring that the majority of kernel addresses are not mapped

One of the reasons for doing the runtime-disable was to get rid of the
!PARAVIRT dependency.  I can add a follow-on here that will act as if we
did "nokaiser" whenever Xen is in play so we can remove this dependency.

I just hope Xen is detectable early enough to do the static patching.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
  2017-11-27 18:11     ` Dave Hansen
@ 2017-11-27 18:37       ` Kees Cook
  -1 siblings, 0 replies; 64+ messages in thread
From: Kees Cook @ 2017-11-27 18:37 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Thomas Gleixner, LKML, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
	Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, Rik van Riel,
	Daniel Gruss, Hugh Dickins, Linux-MM, michael.schwarz,
	moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 10:11 AM, Dave Hansen
<dave.hansen@linux.intel.com> wrote:
>> --- a/arch/x86/include/asm/pgtable_64.h
>> +++ b/arch/x86/include/asm/pgtable_64.h
>> @@ -222,7 +222,8 @@ static inline pgd_t kaiser_set_shadow_pg
>>                        * wrong CR3 value, userspace will crash
>>                        * instead of running.
>>                        */
>> -                     pgd.pgd |= _PAGE_NX;
>> +                     if (__supported_pte_mask & _PAGE_NX)
>> +                             pgd.pgd |= _PAGE_NX;
>>               }
>
> Thanks for catching that.  It's definitely a bug.  Although,
> practically, it's hard to hit, right?  I think everything 64-bit
> supports NX unless the hypervisor disabled it or something.

There was a very narrow window where x86_64 machines were made without
NX. :( This is reflected in x86_report_nx(), though maybe we should
add a "OMG, why?" when 64-bit but no NX. ;)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 1/5] x86/kaiser: Respect disabled CPU features
@ 2017-11-27 18:37       ` Kees Cook
  0 siblings, 0 replies; 64+ messages in thread
From: Kees Cook @ 2017-11-27 18:37 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Thomas Gleixner, LKML, Andy Lutomirski, Ingo Molnar,
	Borislav Petkov, Brian Gerst, Denys Vlasenko, H. Peter Anvin,
	Josh Poimboeuf, Linus Torvalds, Peter Zijlstra, Rik van Riel,
	Daniel Gruss, Hugh Dickins, Linux-MM, michael.schwarz,
	moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 10:11 AM, Dave Hansen
<dave.hansen@linux.intel.com> wrote:
>> --- a/arch/x86/include/asm/pgtable_64.h
>> +++ b/arch/x86/include/asm/pgtable_64.h
>> @@ -222,7 +222,8 @@ static inline pgd_t kaiser_set_shadow_pg
>>                        * wrong CR3 value, userspace will crash
>>                        * instead of running.
>>                        */
>> -                     pgd.pgd |= _PAGE_NX;
>> +                     if (__supported_pte_mask & _PAGE_NX)
>> +                             pgd.pgd |= _PAGE_NX;
>>               }
>
> Thanks for catching that.  It's definitely a bug.  Although,
> practically, it's hard to hit, right?  I think everything 64-bit
> supports NX unless the hypervisor disabled it or something.

There was a very narrow window where x86_64 machines were made without
NX. :( This is reflected in x86_report_nx(), though maybe we should
add a "OMG, why?" when 64-bit but no NX. ;)

-Kees

-- 
Kees Cook
Pixel Security

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-27 18:22     ` Dave Hansen
@ 2017-11-27 19:00       ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 19:00 UTC (permalink / raw)
  To: Dave Hansen
  Cc: LKML, Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Dave Hansen wrote:

> On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> > --- a/security/Kconfig
> > +++ b/security/Kconfig
> > @@ -56,7 +56,7 @@ config SECURITY_NETWORK
> >  
> >  config KAISER
> >  	bool "Remove the kernel mapping in user mode"
> > -	depends on X86_64 && SMP && !PARAVIRT
> > +	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
> >  	help
> >  	  This feature reduces the number of hardware side channels by
> >  	  ensuring that the majority of kernel addresses are not mapped
> 
> One of the reasons for doing the runtime-disable was to get rid of the
> !PARAVIRT dependency.  I can add a follow-on here that will act as if we
> did "nokaiser" whenever Xen is in play so we can remove this dependency.
> 
> I just hope Xen is detectable early enough to do the static patching.

Yes, it is. I'm currently trying to figure out why it fails on a KVM guest.

If I boot with 'nokaiser' on the command line it works. If kaiser is
runtime enabled then some early klibc user space in the ramdisk
explodes. Not sure yet whats going on.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-27 19:00       ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 19:00 UTC (permalink / raw)
  To: Dave Hansen
  Cc: LKML, Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Dave Hansen wrote:

> On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> > --- a/security/Kconfig
> > +++ b/security/Kconfig
> > @@ -56,7 +56,7 @@ config SECURITY_NETWORK
> >  
> >  config KAISER
> >  	bool "Remove the kernel mapping in user mode"
> > -	depends on X86_64 && SMP && !PARAVIRT
> > +	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
> >  	help
> >  	  This feature reduces the number of hardware side channels by
> >  	  ensuring that the majority of kernel addresses are not mapped
> 
> One of the reasons for doing the runtime-disable was to get rid of the
> !PARAVIRT dependency.  I can add a follow-on here that will act as if we
> did "nokaiser" whenever Xen is in play so we can remove this dependency.
> 
> I just hope Xen is detectable early enough to do the static patching.

Yes, it is. I'm currently trying to figure out why it fails on a KVM guest.

If I boot with 'nokaiser' on the command line it works. If kaiser is
runtime enabled then some early klibc user space in the ramdisk
explodes. Not sure yet whats going on.

Thanks,

	tglx

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-27 19:00       ` Thomas Gleixner
@ 2017-11-27 19:18         ` Josh Poimboeuf
  -1 siblings, 0 replies; 64+ messages in thread
From: Josh Poimboeuf @ 2017-11-27 19:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dave Hansen, LKML, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 08:00:19PM +0100, Thomas Gleixner wrote:
> On Mon, 27 Nov 2017, Dave Hansen wrote:
> 
> > On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> > > --- a/security/Kconfig
> > > +++ b/security/Kconfig
> > > @@ -56,7 +56,7 @@ config SECURITY_NETWORK
> > >  
> > >  config KAISER
> > >  	bool "Remove the kernel mapping in user mode"
> > > -	depends on X86_64 && SMP && !PARAVIRT
> > > +	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
> > >  	help
> > >  	  This feature reduces the number of hardware side channels by
> > >  	  ensuring that the majority of kernel addresses are not mapped
> > 
> > One of the reasons for doing the runtime-disable was to get rid of the
> > !PARAVIRT dependency.  I can add a follow-on here that will act as if we
> > did "nokaiser" whenever Xen is in play so we can remove this dependency.
> > 
> > I just hope Xen is detectable early enough to do the static patching.
> 
> Yes, it is. I'm currently trying to figure out why it fails on a KVM guest.
> 
> If I boot with 'nokaiser' on the command line it works. If kaiser is
> runtime enabled then some early klibc user space in the ramdisk
> explodes. Not sure yet whats going on.

I'm also seeing weirdness with PARAVIRT+KAISER on kvm.  The symptoms
aren't consistent.  Sometimes it boots, sometimes it hangs before the
login prompt, sometimes there are user space seg faults.

It almost seems like the interrupt handler is corrupting user space
state somehow.

This is with tip/WIP.x86/mm plus a patch to remove the KAISER dependency
on !PARAVIRT.

-- 
Josh

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-27 19:18         ` Josh Poimboeuf
  0 siblings, 0 replies; 64+ messages in thread
From: Josh Poimboeuf @ 2017-11-27 19:18 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Dave Hansen, LKML, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 08:00:19PM +0100, Thomas Gleixner wrote:
> On Mon, 27 Nov 2017, Dave Hansen wrote:
> 
> > On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> > > --- a/security/Kconfig
> > > +++ b/security/Kconfig
> > > @@ -56,7 +56,7 @@ config SECURITY_NETWORK
> > >  
> > >  config KAISER
> > >  	bool "Remove the kernel mapping in user mode"
> > > -	depends on X86_64 && SMP && !PARAVIRT
> > > +	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
> > >  	help
> > >  	  This feature reduces the number of hardware side channels by
> > >  	  ensuring that the majority of kernel addresses are not mapped
> > 
> > One of the reasons for doing the runtime-disable was to get rid of the
> > !PARAVIRT dependency.  I can add a follow-on here that will act as if we
> > did "nokaiser" whenever Xen is in play so we can remove this dependency.
> > 
> > I just hope Xen is detectable early enough to do the static patching.
> 
> Yes, it is. I'm currently trying to figure out why it fails on a KVM guest.
> 
> If I boot with 'nokaiser' on the command line it works. If kaiser is
> runtime enabled then some early klibc user space in the ramdisk
> explodes. Not sure yet whats going on.

I'm also seeing weirdness with PARAVIRT+KAISER on kvm.  The symptoms
aren't consistent.  Sometimes it boots, sometimes it hangs before the
login prompt, sometimes there are user space seg faults.

It almost seems like the interrupt handler is corrupting user space
state somehow.

This is with tip/WIP.x86/mm plus a patch to remove the KAISER dependency
on !PARAVIRT.

-- 
Josh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
  2017-11-27 10:06       ` Ingo Molnar
@ 2017-11-27 19:21         ` Linus Torvalds
  -1 siblings, 0 replies; 64+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Thomas Gleixner, LKML, Dave Hansen,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	H. Peter Anvin, Josh Poimboeuf, Rik van Riel, Daniel Gruss,
	Hugh Dickins, Kees Cook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

On Mon, Nov 27, 2017 at 2:06 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> +/*
> + * Human readable symbolic definitions for common
> + * file permissions:
> + */
> +#define PERM_r________ 0400
> +#define PERM_r__r_____ 0440
> +#define PERM_r__r__r__ 0444

I'm not a fan. Particularly as you have a very random set of
permissions (rx and wx? Not very common), but also because it's just
not that legible.

I've argued several times that we shouldn't use the defines at all.
The octal format isn't any less legible than any #define I've ever
seen, and is generally _more_ legible.

What's wrong with just using 0400 for "read by user"?

                Linus

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
@ 2017-11-27 19:21         ` Linus Torvalds
  0 siblings, 0 replies; 64+ messages in thread
From: Linus Torvalds @ 2017-11-27 19:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Thomas Gleixner, LKML, Dave Hansen,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	H. Peter Anvin, Josh Poimboeuf, Rik van Riel, Daniel Gruss,
	Hugh Dickins, Kees Cook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

On Mon, Nov 27, 2017 at 2:06 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
>
> +/*
> + * Human readable symbolic definitions for common
> + * file permissions:
> + */
> +#define PERM_r________ 0400
> +#define PERM_r__r_____ 0440
> +#define PERM_r__r__r__ 0444

I'm not a fan. Particularly as you have a very random set of
permissions (rx and wx? Not very common), but also because it's just
not that legible.

I've argued several times that we shouldn't use the defines at all.
The octal format isn't any less legible than any #define I've ever
seen, and is generally _more_ legible.

What's wrong with just using 0400 for "read by user"?

                Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 2/5] x86/kaiser: Simplify disabling of global pages
  2017-11-27 18:15     ` Dave Hansen
@ 2017-11-27 20:28       ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 20:28 UTC (permalink / raw)
  To: Dave Hansen
  Cc: LKML, Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Dave Hansen wrote:

> On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> > +static void enable_global_pages(void)
> > +{
> > +#ifndef CONFIG_KAISER
> > +	__supported_pte_mask |= _PAGE_GLOBAL;
> > +#endif
> > +}
> > +
> >  static void __init probe_page_size_mask(void)
> >  {
> >  	/*
> > @@ -179,11 +186,11 @@ static void __init probe_page_size_mask(
> >  		cr4_set_bits_and_update_boot(X86_CR4_PSE);
> >  
> >  	/* Enable PGE if available */
> > +	__supported_pte_mask |= _PAGE_GLOBAL;
> >  	if (boot_cpu_has(X86_FEATURE_PGE)) {
> >  		cr4_set_bits_and_update_boot(X86_CR4_PGE);
> > -		__supported_pte_mask |= _PAGE_GLOBAL;
> > -	} else
> > -		__supported_pte_mask &= ~_PAGE_GLOBAL;
> > +		enable_global_pages();
> > +	}
> 
> This looks a little funky.  Doesn't this or _PAGE_GLOBAL into
> __supported_pte_mask twice?  Once before the if(), and then a second
> time via enable_global_pages() inside the if?
> 
> Did you intend for it to be masked *out* in the first one and then or'd
> back in via enable_global_pages()?

It's fixed already ...

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 2/5] x86/kaiser: Simplify disabling of global pages
@ 2017-11-27 20:28       ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 20:28 UTC (permalink / raw)
  To: Dave Hansen
  Cc: LKML, Andy Lutomirski, Ingo Molnar, Borislav Petkov, Brian Gerst,
	Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Dave Hansen wrote:

> On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> > +static void enable_global_pages(void)
> > +{
> > +#ifndef CONFIG_KAISER
> > +	__supported_pte_mask |= _PAGE_GLOBAL;
> > +#endif
> > +}
> > +
> >  static void __init probe_page_size_mask(void)
> >  {
> >  	/*
> > @@ -179,11 +186,11 @@ static void __init probe_page_size_mask(
> >  		cr4_set_bits_and_update_boot(X86_CR4_PSE);
> >  
> >  	/* Enable PGE if available */
> > +	__supported_pte_mask |= _PAGE_GLOBAL;
> >  	if (boot_cpu_has(X86_FEATURE_PGE)) {
> >  		cr4_set_bits_and_update_boot(X86_CR4_PGE);
> > -		__supported_pte_mask |= _PAGE_GLOBAL;
> > -	} else
> > -		__supported_pte_mask &= ~_PAGE_GLOBAL;
> > +		enable_global_pages();
> > +	}
> 
> This looks a little funky.  Doesn't this or _PAGE_GLOBAL into
> __supported_pte_mask twice?  Once before the if(), and then a second
> time via enable_global_pages() inside the if?
> 
> Did you intend for it to be masked *out* in the first one and then or'd
> back in via enable_global_pages()?

It's fixed already ...

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-27 19:18         ` Josh Poimboeuf
@ 2017-11-27 20:47           ` Thomas Gleixner
  -1 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 20:47 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Dave Hansen, LKML, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Josh Poimboeuf wrote:

> On Mon, Nov 27, 2017 at 08:00:19PM +0100, Thomas Gleixner wrote:
> > On Mon, 27 Nov 2017, Dave Hansen wrote:
> > 
> > > On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> > > > --- a/security/Kconfig
> > > > +++ b/security/Kconfig
> > > > @@ -56,7 +56,7 @@ config SECURITY_NETWORK
> > > >  
> > > >  config KAISER
> > > >  	bool "Remove the kernel mapping in user mode"
> > > > -	depends on X86_64 && SMP && !PARAVIRT
> > > > +	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
> > > >  	help
> > > >  	  This feature reduces the number of hardware side channels by
> > > >  	  ensuring that the majority of kernel addresses are not mapped
> > > 
> > > One of the reasons for doing the runtime-disable was to get rid of the
> > > !PARAVIRT dependency.  I can add a follow-on here that will act as if we
> > > did "nokaiser" whenever Xen is in play so we can remove this dependency.
> > > 
> > > I just hope Xen is detectable early enough to do the static patching.
> > 
> > Yes, it is. I'm currently trying to figure out why it fails on a KVM guest.
> > 
> > If I boot with 'nokaiser' on the command line it works. If kaiser is
> > runtime enabled then some early klibc user space in the ramdisk
> > explodes. Not sure yet whats going on.
> 
> I'm also seeing weirdness with PARAVIRT+KAISER on kvm.  The symptoms
> aren't consistent.  Sometimes it boots, sometimes it hangs before the
> login prompt, sometimes there are user space seg faults.
> 
> It almost seems like the interrupt handler is corrupting user space
> state somehow.
> 
> This is with tip/WIP.x86/mm plus a patch to remove the KAISER dependency
> on !PARAVIRT.

See the patches I posted. Its the PV patching of flush_tlb_single() ...

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-27 20:47           ` Thomas Gleixner
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Gleixner @ 2017-11-27 20:47 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Dave Hansen, LKML, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Linus Torvalds,
	Peter Zijlstra, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, 27 Nov 2017, Josh Poimboeuf wrote:

> On Mon, Nov 27, 2017 at 08:00:19PM +0100, Thomas Gleixner wrote:
> > On Mon, 27 Nov 2017, Dave Hansen wrote:
> > 
> > > On 11/26/2017 03:14 PM, Thomas Gleixner wrote:
> > > > --- a/security/Kconfig
> > > > +++ b/security/Kconfig
> > > > @@ -56,7 +56,7 @@ config SECURITY_NETWORK
> > > >  
> > > >  config KAISER
> > > >  	bool "Remove the kernel mapping in user mode"
> > > > -	depends on X86_64 && SMP && !PARAVIRT
> > > > +	depends on X86_64 && SMP && !PARAVIRT && JUMP_LABEL
> > > >  	help
> > > >  	  This feature reduces the number of hardware side channels by
> > > >  	  ensuring that the majority of kernel addresses are not mapped
> > > 
> > > One of the reasons for doing the runtime-disable was to get rid of the
> > > !PARAVIRT dependency.  I can add a follow-on here that will act as if we
> > > did "nokaiser" whenever Xen is in play so we can remove this dependency.
> > > 
> > > I just hope Xen is detectable early enough to do the static patching.
> > 
> > Yes, it is. I'm currently trying to figure out why it fails on a KVM guest.
> > 
> > If I boot with 'nokaiser' on the command line it works. If kaiser is
> > runtime enabled then some early klibc user space in the ramdisk
> > explodes. Not sure yet whats going on.
> 
> I'm also seeing weirdness with PARAVIRT+KAISER on kvm.  The symptoms
> aren't consistent.  Sometimes it boots, sometimes it hangs before the
> login prompt, sometimes there are user space seg faults.
> 
> It almost seems like the interrupt handler is corrupting user space
> state somehow.
> 
> This is with tip/WIP.x86/mm plus a patch to remove the KAISER dependency
> on !PARAVIRT.

See the patches I posted. Its the PV patching of flush_tlb_single() ...

Thanks,

	tglx

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
  2017-11-27 10:22       ` Peter Zijlstra
@ 2017-11-27 21:43         ` Peter Zijlstra
  -1 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27 21:43 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 11:22:41AM +0100, Peter Zijlstra wrote:

> Could be a simple as this.. but I've not tested.
> 
> 
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index f4f4ab8525bd..1be393a97421 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1442,7 +1442,10 @@ void syscall_init(void)
>  		(entry_SYSCALL_64_trampoline - _entry_trampoline);
>  
>  	wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
> -	wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
> +	if (kaiser_enabled)
> +		wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
> +	else
> +		wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
>  
>  #ifdef CONFIG_IA32_EMULATION
>  	wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat);

Seems to work:

root@ivb-ep:~# rdmsr --all 0xc0000082 | uniq
ffffffff81beb780
root@ivb-ep:~# grep ffffffff81beb780 /proc/kallsyms 
ffffffff81beb780 T entry_SYSCALL_64

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [patch V2 5/5] x86/kaiser: Add boottime disable switch
@ 2017-11-27 21:43         ` Peter Zijlstra
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Zijlstra @ 2017-11-27 21:43 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: LKML, Dave Hansen, Andy Lutomirski, Ingo Molnar, Borislav Petkov,
	Brian Gerst, Denys Vlasenko, H. Peter Anvin, Josh Poimboeuf,
	Linus Torvalds, Rik van Riel, daniel.gruss, hughd, keescook,
	linux-mm, michael.schwarz, moritz.lipp, richard.fellner

On Mon, Nov 27, 2017 at 11:22:41AM +0100, Peter Zijlstra wrote:

> Could be a simple as this.. but I've not tested.
> 
> 
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index f4f4ab8525bd..1be393a97421 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1442,7 +1442,10 @@ void syscall_init(void)
>  		(entry_SYSCALL_64_trampoline - _entry_trampoline);
>  
>  	wrmsr(MSR_STAR, 0, (__USER32_CS << 16) | __KERNEL_CS);
> -	wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
> +	if (kaiser_enabled)
> +		wrmsrl(MSR_LSTAR, SYSCALL64_entry_trampoline);
> +	else
> +		wrmsrl(MSR_LSTAR, (unsigned long)entry_SYSCALL_64);
>  
>  #ifdef CONFIG_IA32_EMULATION
>  	wrmsrl(MSR_CSTAR, (unsigned long)entry_SYSCALL_compat);

Seems to work:

root@ivb-ep:~# rdmsr --all 0xc0000082 | uniq
ffffffff81beb780
root@ivb-ep:~# grep ffffffff81beb780 /proc/kallsyms 
ffffffff81beb780 T entry_SYSCALL_64


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
  2017-11-27 19:21         ` Linus Torvalds
@ 2017-11-28 10:54           ` Ingo Molnar
  -1 siblings, 0 replies; 64+ messages in thread
From: Ingo Molnar @ 2017-11-28 10:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Thomas Gleixner, LKML, Dave Hansen,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	H. Peter Anvin, Josh Poimboeuf, Rik van Riel, Daniel Gruss,
	Hugh Dickins, Kees Cook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, Nov 27, 2017 at 2:06 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > +/*
> > + * Human readable symbolic definitions for common
> > + * file permissions:
> > + */
> > +#define PERM_r________ 0400
> > +#define PERM_r__r_____ 0440
> > +#define PERM_r__r__r__ 0444
> 
> I'm not a fan. Particularly as you have a very random set of
> permissions (rx and wx? Not very common), but also because it's just
> not that legible.
> 
> I've argued several times that we shouldn't use the defines at all.
> The octal format isn't any less legible than any #define I've ever
> seen, and is generally _more_ legible.
> 
> What's wrong with just using 0400 for "read by user"?

Yeah, the octal format isn't all that bad - at least relative to the symbolic 
obfuscation defines.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
@ 2017-11-28 10:54           ` Ingo Molnar
  0 siblings, 0 replies; 64+ messages in thread
From: Ingo Molnar @ 2017-11-28 10:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Thomas Gleixner, LKML, Dave Hansen,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	H. Peter Anvin, Josh Poimboeuf, Rik van Riel, Daniel Gruss,
	Hugh Dickins, Kees Cook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, Nov 27, 2017 at 2:06 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > +/*
> > + * Human readable symbolic definitions for common
> > + * file permissions:
> > + */
> > +#define PERM_r________ 0400
> > +#define PERM_r__r_____ 0440
> > +#define PERM_r__r__r__ 0444
> 
> I'm not a fan. Particularly as you have a very random set of
> permissions (rx and wx? Not very common), but also because it's just
> not that legible.
> 
> I've argued several times that we shouldn't use the defines at all.
> The octal format isn't any less legible than any #define I've ever
> seen, and is generally _more_ legible.
> 
> What's wrong with just using 0400 for "read by user"?

Yeah, the octal format isn't all that bad - at least relative to the symbolic 
obfuscation defines.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
  2017-11-27 19:21         ` Linus Torvalds
@ 2017-11-28 11:12           ` Ingo Molnar
  -1 siblings, 0 replies; 64+ messages in thread
From: Ingo Molnar @ 2017-11-28 11:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Thomas Gleixner, LKML, Dave Hansen,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	H. Peter Anvin, Josh Poimboeuf, Rik van Riel, Daniel Gruss,
	Hugh Dickins, Kees Cook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, Nov 27, 2017 at 2:06 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > +/*
> > + * Human readable symbolic definitions for common
> > + * file permissions:
> > + */
> > +#define PERM_r________ 0400
> > +#define PERM_r__r_____ 0440
> > +#define PERM_r__r__r__ 0444
> 
> I'm not a fan. Particularly as you have a very random set of
> permissions (rx and wx? Not very common),

So I originally created those defines based on a grep of patterns used in the 
kernel, and added the 'wx' variants for completeness.

We would only need a small subset. Here's a git grep based histogram of octal file 
permission masks used in the kernel source:

      # mode
     21 0200
      8 0220
     14 0222
     33 0400
     11 0440
    219 0444
     91 0555
     39 0600
    906 0644
     12 0660
     12 0664
     18 0666
     14 0755
     31 0777

So there's literally only 14 variants used, and 0644 and 0444 make up 95% of the 
cases. We get the patch below if we extend these existing patterns using their 
natural (looking) generators to a complete group - 19 patterns that should cover 
all the sensible combinations.

> but also because it's just not that legible.

Fair enough.

Thanks,

	Ingo

---
 include/linux/stat.h |   28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

Index: tip/include/linux/stat.h
===================================================================
--- tip.orig/include/linux/stat.h
+++ tip/include/linux/stat.h
@@ -6,6 +6,34 @@
 #include <asm/stat.h>
 #include <uapi/linux/stat.h>
 
+/*
+ * Human readable symbolic definitions for common
+ * file permissions:
+ */
+#define PERM_r________	0400
+#define PERM_r__r_____	0440
+#define PERM_r__r__r__	0444
+
+#define PERM_rw_______	0600
+#define PERM_rw_r_____	0640
+#define PERM_rw_r__r__	0644
+#define PERM_rw_rw_r__	0664
+#define PERM_rw_rw_rw_	0666
+
+#define PERM__w_______	0200
+#define PERM__w__w____	0220
+#define PERM__w__w__w_	0222
+
+#define PERM_r_x______	0500
+#define PERM_r_xr_x___	0550
+#define PERM_r_xr_xr_x	0555
+
+#define PERM_rwx______	0700
+#define PERM_rwxr_x___	0750
+#define PERM_rwxr_xr_x	0755
+#define PERM_rwxrwxr_x	0775
+#define PERM_rwxrwxrwx	0777
+
 #define S_IRWXUGO	(S_IRWXU|S_IRWXG|S_IRWXO)
 #define S_IALLUGO	(S_ISUID|S_ISGID|S_ISVTX|S_IRWXUGO)
 #define S_IRUGO		(S_IRUSR|S_IRGRP|S_IROTH)

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
@ 2017-11-28 11:12           ` Ingo Molnar
  0 siblings, 0 replies; 64+ messages in thread
From: Ingo Molnar @ 2017-11-28 11:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Thomas Gleixner, LKML, Dave Hansen,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	H. Peter Anvin, Josh Poimboeuf, Rik van Riel, Daniel Gruss,
	Hugh Dickins, Kees Cook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, Nov 27, 2017 at 2:06 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > +/*
> > + * Human readable symbolic definitions for common
> > + * file permissions:
> > + */
> > +#define PERM_r________ 0400
> > +#define PERM_r__r_____ 0440
> > +#define PERM_r__r__r__ 0444
> 
> I'm not a fan. Particularly as you have a very random set of
> permissions (rx and wx? Not very common),

So I originally created those defines based on a grep of patterns used in the 
kernel, and added the 'wx' variants for completeness.

We would only need a small subset. Here's a git grep based histogram of octal file 
permission masks used in the kernel source:

      # mode
     21 0200
      8 0220
     14 0222
     33 0400
     11 0440
    219 0444
     91 0555
     39 0600
    906 0644
     12 0660
     12 0664
     18 0666
     14 0755
     31 0777

So there's literally only 14 variants used, and 0644 and 0444 make up 95% of the 
cases. We get the patch below if we extend these existing patterns using their 
natural (looking) generators to a complete group - 19 patterns that should cover 
all the sensible combinations.

> but also because it's just not that legible.

Fair enough.

Thanks,

	Ingo

---
 include/linux/stat.h |   28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

Index: tip/include/linux/stat.h
===================================================================
--- tip.orig/include/linux/stat.h
+++ tip/include/linux/stat.h
@@ -6,6 +6,34 @@
 #include <asm/stat.h>
 #include <uapi/linux/stat.h>
 
+/*
+ * Human readable symbolic definitions for common
+ * file permissions:
+ */
+#define PERM_r________	0400
+#define PERM_r__r_____	0440
+#define PERM_r__r__r__	0444
+
+#define PERM_rw_______	0600
+#define PERM_rw_r_____	0640
+#define PERM_rw_r__r__	0644
+#define PERM_rw_rw_r__	0664
+#define PERM_rw_rw_rw_	0666
+
+#define PERM__w_______	0200
+#define PERM__w__w____	0220
+#define PERM__w__w__w_	0222
+
+#define PERM_r_x______	0500
+#define PERM_r_xr_x___	0550
+#define PERM_r_xr_xr_x	0555
+
+#define PERM_rwx______	0700
+#define PERM_rwxr_x___	0750
+#define PERM_rwxr_xr_x	0755
+#define PERM_rwxrwxr_x	0775
+#define PERM_rwxrwxrwx	0777
+
 #define S_IRWXUGO	(S_IRWXU|S_IRWXG|S_IRWXO)
 #define S_IALLUGO	(S_ISUID|S_ISGID|S_ISVTX|S_IRWXUGO)
 #define S_IRUGO		(S_IRUSR|S_IRGRP|S_IROTH)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
  2017-11-28 11:12           ` Ingo Molnar
@ 2017-11-29  8:52             ` Michael Ellerman
  -1 siblings, 0 replies; 64+ messages in thread
From: Michael Ellerman @ 2017-11-29  8:52 UTC (permalink / raw)
  To: Ingo Molnar, Linus Torvalds
  Cc: Peter Zijlstra, Thomas Gleixner, LKML, Dave Hansen,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	H. Peter Anvin, Josh Poimboeuf, Rik van Riel, Daniel Gruss,
	Hugh Dickins, Kees Cook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

Ingo Molnar <mingo@kernel.org> writes:
...
> Index: tip/include/linux/stat.h
> ===================================================================
> --- tip.orig/include/linux/stat.h
> +++ tip/include/linux/stat.h
> @@ -6,6 +6,34 @@
>  #include <asm/stat.h>
>  #include <uapi/linux/stat.h>
>  
> +/*
> + * Human readable symbolic definitions for common
> + * file permissions:
> + */
> +#define PERM_r________	0400
> +#define PERM_r__r_____	0440
> +#define PERM_r__r__r__	0444
> +
> +#define PERM_rw_______	0600
> +#define PERM_rw_r_____	0640
> +#define PERM_rw_r__r__	0644
> +#define PERM_rw_rw_r__	0664
> +#define PERM_rw_rw_rw_	0666
> +
> +#define PERM__w_______	0200
> +#define PERM__w__w____	0220
> +#define PERM__w__w__w_	0222
> +
> +#define PERM_r_x______	0500
> +#define PERM_r_xr_x___	0550
> +#define PERM_r_xr_xr_x	0555
> +
> +#define PERM_rwx______	0700
> +#define PERM_rwxr_x___	0750
> +#define PERM_rwxr_xr_x	0755
> +#define PERM_rwxrwxr_x	0775
> +#define PERM_rwxrwxrwx	0777

I see what you're trying to do with all the explicit underscores, but it
does make them look kinda ugly.

What if you just used underscores to separate the user/group/other, and
the unset permission bits are just omitted.

Then the two most common cases would be:

  PERM_rw_r_r
  PERM_r_r_r

Both of those read nicely I think. ie. the first is "perm read write,
read, read".

Full set would be:

#define PERM_r			0400
#define PERM_r_r		0440
#define PERM_r_r_r		0444

#define PERM_rw			0600
#define PERM_rw_r		0640
#define PERM_rw_r_r		0644
#define PERM_rw_rw_r		0664
#define PERM_rw_rw_rw		0666

#define PERM_w			0200
#define PERM_w_w		0220
#define PERM_w_w_w		0222

#define PERM_rx			0500
#define PERM_rx_rx		0550
#define PERM_rx_rx_rx		0555

#define PERM_rwx		0700
#define PERM_rwx_rx		0750
#define PERM_rwx_rx_rx		0755
#define PERM_rwx_rwx_rx		0775
#define PERM_rwx_rwx_rwx	0777


cheers

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions
@ 2017-11-29  8:52             ` Michael Ellerman
  0 siblings, 0 replies; 64+ messages in thread
From: Michael Ellerman @ 2017-11-29  8:52 UTC (permalink / raw)
  To: Ingo Molnar, Linus Torvalds
  Cc: Peter Zijlstra, Thomas Gleixner, LKML, Dave Hansen,
	Andy Lutomirski, Borislav Petkov, Brian Gerst, Denys Vlasenko,
	H. Peter Anvin, Josh Poimboeuf, Rik van Riel, Daniel Gruss,
	Hugh Dickins, Kees Cook, linux-mm, michael.schwarz, moritz.lipp,
	richard.fellner

Ingo Molnar <mingo@kernel.org> writes:
...
> Index: tip/include/linux/stat.h
> ===================================================================
> --- tip.orig/include/linux/stat.h
> +++ tip/include/linux/stat.h
> @@ -6,6 +6,34 @@
>  #include <asm/stat.h>
>  #include <uapi/linux/stat.h>
>  
> +/*
> + * Human readable symbolic definitions for common
> + * file permissions:
> + */
> +#define PERM_r________	0400
> +#define PERM_r__r_____	0440
> +#define PERM_r__r__r__	0444
> +
> +#define PERM_rw_______	0600
> +#define PERM_rw_r_____	0640
> +#define PERM_rw_r__r__	0644
> +#define PERM_rw_rw_r__	0664
> +#define PERM_rw_rw_rw_	0666
> +
> +#define PERM__w_______	0200
> +#define PERM__w__w____	0220
> +#define PERM__w__w__w_	0222
> +
> +#define PERM_r_x______	0500
> +#define PERM_r_xr_x___	0550
> +#define PERM_r_xr_xr_x	0555
> +
> +#define PERM_rwx______	0700
> +#define PERM_rwxr_x___	0750
> +#define PERM_rwxr_xr_x	0755
> +#define PERM_rwxrwxr_x	0775
> +#define PERM_rwxrwxrwx	0777

I see what you're trying to do with all the explicit underscores, but it
does make them look kinda ugly.

What if you just used underscores to separate the user/group/other, and
the unset permission bits are just omitted.

Then the two most common cases would be:

  PERM_rw_r_r
  PERM_r_r_r

Both of those read nicely I think. ie. the first is "perm read write,
read, read".

Full set would be:

#define PERM_r			0400
#define PERM_r_r		0440
#define PERM_r_r_r		0444

#define PERM_rw			0600
#define PERM_rw_r		0640
#define PERM_rw_r_r		0644
#define PERM_rw_rw_r		0664
#define PERM_rw_rw_rw		0666

#define PERM_w			0200
#define PERM_w_w		0220
#define PERM_w_w_w		0222

#define PERM_rx			0500
#define PERM_rx_rx		0550
#define PERM_rx_rx_rx		0555

#define PERM_rwx		0700
#define PERM_rwx_rx		0750
#define PERM_rwx_rx_rx		0755
#define PERM_rwx_rwx_rx		0775
#define PERM_rwx_rwx_rwx	0777


cheers

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2017-11-29  8:53 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-26 23:14 [patch V2 0/5] x86/kaiser: Boot time disabling and debug support Thomas Gleixner
2017-11-26 23:14 ` Thomas Gleixner
2017-11-26 23:14 ` [patch V2 1/5] x86/kaiser: Respect disabled CPU features Thomas Gleixner
2017-11-26 23:14   ` Thomas Gleixner
2017-11-27  9:57   ` Peter Zijlstra
2017-11-27  9:57     ` Peter Zijlstra
2017-11-27 11:47     ` Thomas Gleixner
2017-11-27 11:47       ` Thomas Gleixner
2017-11-27 12:31       ` Brian Gerst
2017-11-27 12:31         ` Brian Gerst
2017-11-27 13:18         ` Thomas Gleixner
2017-11-27 13:18           ` Thomas Gleixner
2017-11-27 18:11   ` Dave Hansen
2017-11-27 18:11     ` Dave Hansen
2017-11-27 18:37     ` Kees Cook
2017-11-27 18:37       ` Kees Cook
2017-11-26 23:14 ` [patch V2 2/5] x86/kaiser: Simplify disabling of global pages Thomas Gleixner
2017-11-26 23:14   ` Thomas Gleixner
2017-11-27 11:49   ` Thomas Gleixner
2017-11-27 11:49     ` Thomas Gleixner
2017-11-27 18:15   ` Dave Hansen
2017-11-27 18:15     ` Dave Hansen
2017-11-27 20:28     ` Thomas Gleixner
2017-11-27 20:28       ` Thomas Gleixner
2017-11-26 23:14 ` [patch V2 3/5] x86/dump_pagetables: Check KAISER shadow page table for WX pages Thomas Gleixner
2017-11-26 23:14   ` Thomas Gleixner
2017-11-27 18:17   ` Dave Hansen
2017-11-27 18:17     ` Dave Hansen
2017-11-26 23:14 ` [patch V2 4/5] x86/mm/debug_pagetables: Allow dumping current pagetables Thomas Gleixner
2017-11-26 23:14   ` Thomas Gleixner
2017-11-27  9:41   ` Peter Zijlstra
2017-11-27  9:41     ` Peter Zijlstra
2017-11-27 10:06     ` [PATCH] vfs: Add PERM_* symbolic helpers for common file mode/permissions Ingo Molnar
2017-11-27 10:06       ` Ingo Molnar
2017-11-27 19:21       ` Linus Torvalds
2017-11-27 19:21         ` Linus Torvalds
2017-11-28 10:54         ` Ingo Molnar
2017-11-28 10:54           ` Ingo Molnar
2017-11-28 11:12         ` Ingo Molnar
2017-11-28 11:12           ` Ingo Molnar
2017-11-29  8:52           ` Michael Ellerman
2017-11-29  8:52             ` Michael Ellerman
2017-11-27 18:18   ` [patch V2 4/5] x86/mm/debug_pagetables: Allow dumping current pagetables Dave Hansen
2017-11-27 18:18     ` Dave Hansen
2017-11-26 23:14 ` [patch V2 5/5] x86/kaiser: Add boottime disable switch Thomas Gleixner
2017-11-26 23:14   ` Thomas Gleixner
2017-11-27  9:48   ` Peter Zijlstra
2017-11-27  9:48     ` Peter Zijlstra
2017-11-27 10:22     ` Peter Zijlstra
2017-11-27 10:22       ` Peter Zijlstra
2017-11-27 11:50       ` Thomas Gleixner
2017-11-27 11:50         ` Thomas Gleixner
2017-11-27 12:49         ` Peter Zijlstra
2017-11-27 12:49           ` Peter Zijlstra
2017-11-27 21:43       ` Peter Zijlstra
2017-11-27 21:43         ` Peter Zijlstra
2017-11-27 18:22   ` Dave Hansen
2017-11-27 18:22     ` Dave Hansen
2017-11-27 19:00     ` Thomas Gleixner
2017-11-27 19:00       ` Thomas Gleixner
2017-11-27 19:18       ` Josh Poimboeuf
2017-11-27 19:18         ` Josh Poimboeuf
2017-11-27 20:47         ` Thomas Gleixner
2017-11-27 20:47           ` Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.