All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Hugh Dickins <hughd@google.com>
Subject: [PATCH 4.9 06/39] kaiser: do not set _PAGE_NX on pgd_none
Date: Wed,  3 Jan 2018 21:11:20 +0100	[thread overview]
Message-ID: <20180103195104.346548837@linuxfoundation.org> (raw)
In-Reply-To: <20180103195104.066528044@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Hugh Dickins <hughd@google.com>


native_pgd_clear() uses native_set_pgd(), so native_set_pgd() must
avoid setting the _PAGE_NX bit on an otherwise pgd_none() entry:
usually that just generated a warning on exit, but sometimes
more mysterious and damaging failures (our production machines
could not complete booting).

The original fix to this just avoided adding _PAGE_NX to
an empty entry; but eventually more problems surfaced with kexec,
and EFI mapping expected to be a problem too.  So now instead
change native_set_pgd() to update shadow only if _PAGE_USER:

A few places (kernel/machine_kexec_64.c, platform/efi/efi_64.c for sure)
use set_pgd() to set up a temporary internal virtual address space, with
physical pages remapped at what Kaiser regards as userspace addresses:
Kaiser then assumes a shadow pgd follows, which it will try to corrupt.

This appears to be responsible for the recent kexec and kdump failures;
though it's unclear how those did not manifest as a problem before.
Ah, the shadow pgd will only be assumed to "follow" if the requested
pgd is on an even-numbered page: so I suppose it was going wrong 50%
of the time all along.

What we need is a flag to set_pgd(), to tell it we're dealing with
userspace.  Er, isn't that what the pgd's _PAGE_USER bit is saying?
Add a test for that.  But we cannot do the same for pgd_clear()
(which may be called to clear corrupted entries - set aside the
question of "corrupt in which pgd?" until later), so there just
rely on pgd_clear() not being called in the problematic cases -
with a WARN_ON_ONCE() which should fire half the time if it is.

But this is getting too big for an inline function: move it into
arch/x86/mm/kaiser.c (which then demands a boot/compressed mod);
and de-void and de-space native_get_shadow/normal_pgd() while here.

Also make an unnecessary change to KASLR's init_trampoline(): it was
using set_pgd() to assign a pgd-value to a global variable (not in a
pg directory page), which was rather scary given Kaiser's previous
set_pgd() implementation: not a problem now, but too scary to leave
as was, it could easily blow up if we have to change set_pgd() again.

Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/boot/compressed/misc.h   |    1 
 arch/x86/include/asm/pgtable_64.h |   51 +++++++++-----------------------------
 arch/x86/mm/kaiser.c              |   42 +++++++++++++++++++++++++++++++
 arch/x86/mm/kaslr.c               |    4 +-
 4 files changed, 58 insertions(+), 40 deletions(-)

--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -9,6 +9,7 @@
  */
 #undef CONFIG_PARAVIRT
 #undef CONFIG_PARAVIRT_SPINLOCKS
+#undef CONFIG_KAISER
 #undef CONFIG_KASAN
 
 #include <linux/linkage.h>
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -107,61 +107,36 @@ static inline void native_pud_clear(pud_
 }
 
 #ifdef CONFIG_KAISER
-static inline pgd_t * native_get_shadow_pgd(pgd_t *pgdp)
+extern pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd);
+
+static inline pgd_t *native_get_shadow_pgd(pgd_t *pgdp)
 {
-	return (pgd_t *)(void*)((unsigned long)(void*)pgdp | (unsigned long)PAGE_SIZE);
+	return (pgd_t *)((unsigned long)pgdp | (unsigned long)PAGE_SIZE);
 }
 
-static inline pgd_t * native_get_normal_pgd(pgd_t *pgdp)
+static inline pgd_t *native_get_normal_pgd(pgd_t *pgdp)
 {
-	return (pgd_t *)(void*)((unsigned long)(void*)pgdp &  ~(unsigned long)PAGE_SIZE);
+	return (pgd_t *)((unsigned long)pgdp & ~(unsigned long)PAGE_SIZE);
 }
 #else
-static inline pgd_t * native_get_shadow_pgd(pgd_t *pgdp)
+static inline pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd)
+{
+	return pgd;
+}
+static inline pgd_t *native_get_shadow_pgd(pgd_t *pgdp)
 {
 	BUILD_BUG_ON(1);
 	return NULL;
 }
-static inline pgd_t * native_get_normal_pgd(pgd_t *pgdp)
+static inline pgd_t *native_get_normal_pgd(pgd_t *pgdp)
 {
 	return pgdp;
 }
 #endif /* CONFIG_KAISER */
 
-/*
- * Page table pages are page-aligned.  The lower half of the top
- * level is used for userspace and the top half for the kernel.
- * This returns true for user pages that need to get copied into
- * both the user and kernel copies of the page tables, and false
- * for kernel pages that should only be in the kernel copy.
- */
-static inline bool is_userspace_pgd(void *__ptr)
-{
-	unsigned long ptr = (unsigned long)__ptr;
-
-	return ((ptr % PAGE_SIZE) < (PAGE_SIZE / 2));
-}
-
 static inline void native_set_pgd(pgd_t *pgdp, pgd_t pgd)
 {
-#ifdef CONFIG_KAISER
-	pteval_t extra_kern_pgd_flags = 0;
-	/* Do we need to also populate the shadow pgd? */
-	if (is_userspace_pgd(pgdp)) {
-		native_get_shadow_pgd(pgdp)->pgd = pgd.pgd;
-		/*
-		 * Even if the entry is *mapping* userspace, ensure
-		 * that userspace can not use it.  This way, if we
-		 * get out to userspace running on the kernel CR3,
-		 * userspace will crash instead of running.
-		 */
-		extra_kern_pgd_flags = _PAGE_NX;
-	}
-	pgdp->pgd = pgd.pgd;
-	pgdp->pgd |= extra_kern_pgd_flags;
-#else /* CONFIG_KAISER */
-	*pgdp = pgd;
-#endif
+	*pgdp = kaiser_set_shadow_pgd(pgdp, pgd);
 }
 
 static inline void native_pgd_clear(pgd_t *pgd)
--- a/arch/x86/mm/kaiser.c
+++ b/arch/x86/mm/kaiser.c
@@ -302,4 +302,46 @@ void kaiser_remove_mapping(unsigned long
 		unmap_pud_range_nofree(pgd, addr, end);
 	}
 }
+
+/*
+ * Page table pages are page-aligned.  The lower half of the top
+ * level is used for userspace and the top half for the kernel.
+ * This returns true for user pages that need to get copied into
+ * both the user and kernel copies of the page tables, and false
+ * for kernel pages that should only be in the kernel copy.
+ */
+static inline bool is_userspace_pgd(pgd_t *pgdp)
+{
+	return ((unsigned long)pgdp % PAGE_SIZE) < (PAGE_SIZE / 2);
+}
+
+pgd_t kaiser_set_shadow_pgd(pgd_t *pgdp, pgd_t pgd)
+{
+	/*
+	 * Do we need to also populate the shadow pgd?  Check _PAGE_USER to
+	 * skip cases like kexec and EFI which make temporary low mappings.
+	 */
+	if (pgd.pgd & _PAGE_USER) {
+		if (is_userspace_pgd(pgdp)) {
+			native_get_shadow_pgd(pgdp)->pgd = pgd.pgd;
+			/*
+			 * Even if the entry is *mapping* userspace, ensure
+			 * that userspace can not use it.  This way, if we
+			 * get out to userspace running on the kernel CR3,
+			 * userspace will crash instead of running.
+			 */
+			pgd.pgd |= _PAGE_NX;
+		}
+	} else if (!pgd.pgd) {
+		/*
+		 * pgd_clear() cannot check _PAGE_USER, and is even used to
+		 * clear corrupted pgd entries: so just rely on cases like
+		 * kexec and EFI never to be using pgd_clear().
+		 */
+		if (!WARN_ON_ONCE((unsigned long)pgdp & PAGE_SIZE) &&
+		    is_userspace_pgd(pgdp))
+			native_get_shadow_pgd(pgdp)->pgd = pgd.pgd;
+	}
+	return pgd;
+}
 #endif /* CONFIG_KAISER */
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -189,6 +189,6 @@ void __meminit init_trampoline(void)
 		*pud_tramp = *pud;
 	}
 
-	set_pgd(&trampoline_pgd_entry,
-		__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
+	/* Avoid set_pgd(), in case it's complicated by CONFIG_KAISER */
+	trampoline_pgd_entry = __pgd(_KERNPG_TABLE | __pa(pud_page_tramp));
 }

  parent reply	other threads:[~2018-01-03 20:15 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-03 20:11 [PATCH 4.9 00/39] 4.9.75-stable review Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 01/39] tcp_bbr: reset full pipe detection on loss recovery undo Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 02/39] tcp_bbr: reset long-term bandwidth sampling " Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 03/39] x86/boot: Add early cmdline parsing for options with arguments Greg Kroah-Hartman
2018-01-03 20:11   ` Greg Kroah-Hartman
2018-01-03 20:11   ` Greg Kroah-Hartman
2018-01-03 20:11   ` Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 04/39] KAISER: Kernel Address Isolation Greg Kroah-Hartman
2018-01-03 20:11   ` [kernel-hardening] " Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 05/39] kaiser: merged update Greg Kroah-Hartman
2018-01-03 20:11 ` Greg Kroah-Hartman [this message]
2018-01-03 20:11 ` [PATCH 4.9 07/39] kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 08/39] kaiser: fix build and FIXME in alloc_ldt_struct() Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 09/39] kaiser: KAISER depends on SMP Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 10/39] kaiser: fix regs to do_nmi() ifndef CONFIG_KAISER Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 11/39] kaiser: fix perf crashes Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 12/39] kaiser: ENOMEM if kaiser_pagetable_walk() NULL Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 13/39] kaiser: tidied up asm/kaiser.h somewhat Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 14/39] kaiser: tidied up kaiser_add/remove_mapping slightly Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 15/39] kaiser: align addition to x86/mm/Makefile Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 16/39] kaiser: cleanups while trying for gold link Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 17/39] kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 18/39] kaiser: delete KAISER_REAL_SWITCH option Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 19/39] kaiser: vmstat show NR_KAISERTABLE as nr_overhead Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 20/39] kaiser: enhanced by kernel and user PCIDs Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 21/39] kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 22/39] kaiser: PCID 0 for kernel and 128 for user Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 23/39] kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 24/39] kaiser: paranoid_entry pass cr3 need to paranoid_exit Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 25/39] kaiser: kaiser_remove_mapping() move along the pgd Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 26/39] kaiser: fix unlikely error in alloc_ldt_struct() Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 27/39] kaiser: add "nokaiser" boot option, using ALTERNATIVE Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 28/39] x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 29/39] x86/kaiser: Check boottime cmdline params Greg Kroah-Hartman
2018-01-04  0:16   ` Ben Hutchings
2018-01-04  7:05     ` Borislav Petkov
2018-01-04  7:38       ` Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 30/39] kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 31/39] kaiser: drop is_atomic arg to kaiser_pagetable_walk() Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 32/39] kaiser: asm/tlbflush.h handle noPGE at lower level Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 33/39] kaiser: kaiser_flush_tlb_on_return_to_user() check PCID Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 34/39] x86/paravirt: Dont patch flush_tlb_single Greg Kroah-Hartman
2018-01-03 20:11   ` Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 35/39] x86/kaiser: Reenable PARAVIRT Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 36/39] kaiser: disabled on Xen PV Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 37/39] x86/kaiser: Move feature detection up Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 38/39] KPTI: Rename to PAGE_TABLE_ISOLATION Greg Kroah-Hartman
2018-01-03 20:11 ` [PATCH 4.9 39/39] KPTI: Report when enabled Greg Kroah-Hartman
2018-01-04  1:24 ` [PATCH 4.9 00/39] 4.9.75-stable review Ben Hutchings
2018-01-04  4:07   ` Hugh Dickins
2018-01-04  4:18     ` Andy Lutomirski
2018-01-04  7:39   ` Greg Kroah-Hartman
2018-01-04  7:41     ` Greg Kroah-Hartman
2018-01-04  7:01 ` Naresh Kamboju
2018-01-04  8:10   ` Greg Kroah-Hartman
2018-01-04 10:07 ` kernelci.org bot
2018-01-05  0:10   ` Kevin Hilman
2018-01-04 17:04 ` Guenter Roeck
2018-01-04 22:01 ` Shuah Khan
2018-01-05 10:23 ` Alice Ferrazzi
2018-01-05 12:12   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180103195104.346548837@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.