From: Andy Lutomirski
Date: Mon, 4 Dec 2017 14:27:57 -0800
Subject: Re: [patch 31/60] x86/mm/kpti: Add mapping helper functions
To: Thomas Gleixner
Cc: LKML, X86 ML, Linus Torvalds, Andy Lutomirsky, Peter Zijlstra, Dave Hansen, Borislav Petkov, Greg KH, Kees Cook, Hugh Dickins, Brian Gerst, Josh Poimboeuf, Denys Vlasenko, Rik van Riel, Boris Ostrovsky, Juergen Gross, David Laight, Eduardo Valentin, aliguori@amazon.com, Will Deacon, Daniel Gruss, Dave Hansen
In-Reply-To: <20171204150607.391576490@linutronix.de>
References: <20171204140706.296109558@linutronix.de> <20171204150607.391576490@linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Dec 4, 2017 at 6:07 AM, Thomas Gleixner wrote:
> From: Dave Hansen
>
> Add the pagetable helper functions to manage the separate user space page
> tables.
>
> [ tglx: Split out from the big combo kaiser patch ]

> +/*
> + * Take a PGD location (pgdp) and a pgd value that needs to be set there.
> + * Populates the user copy and returns the resulting PGD that must be set in
> + * the kernel copy of the page tables.
> + */
> +static inline pgd_t kpti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
> +{
> +#ifdef CONFIG_KERNEL_PAGE_TABLE_ISOLATION
> +	if (!static_cpu_has_bug(X86_BUG_CPU_SECURE_MODE_KPTI))
> +		return pgd;
> +
> +	if (pgd_userspace_access(pgd)) {
> +		if (pgdp_maps_userspace(pgdp)) {
> +			/*
> +			 * The user page tables get the full PGD,
> +			 * accessible from userspace:
> +			 */
> +			kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;
> +			/*
> +			 * For the copy of the pgd that the kernel uses,
> +			 * make it unusable to userspace. This ensures that,
> +			 * in case of a return to userspace with the
> +			 * kernel CR3 value, userspace will crash instead
> +			 * of running.
> +			 *
> +			 * Note: NX might not be available or might be disabled.
> +			 */
> +			if (__supported_pte_mask & _PAGE_NX)
> +				pgd.pgd |= _PAGE_NX;
> +		}
> +	} else if (pgd_userspace_access(*pgdp)) {
> +		/*
> +		 * We are clearing a _PAGE_USER PGD for which we presumably
> +		 * populated the user PGD. We must now clear the user PGD
> +		 * entry.
> +		 */
> +		if (pgdp_maps_userspace(pgdp)) {
> +			kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;
> +		} else {
> +			/*
> +			 * Attempted to clear a _PAGE_USER PGD which is in
> +			 * the kernel portion of the address space. PGDs
> +			 * are pre-populated and we never clear them.
> +			 */
> +			WARN_ON_ONCE(1);
> +		}
> +	} else {
> +		/*
> +		 * _PAGE_USER was not set in either the PGD being set or
> +		 * cleared. All kernel PGDs should be pre-populated so
> +		 * this should never happen after boot.
> +		 */
> +		WARN_ON_ONCE(system_state == SYSTEM_RUNNING);
> +	}
> +#endif
> +	/* return the copy of the PGD we want the kernel to use: */
> +	return pgd;
> +}
> +

I mentioned this earlier, but I think this should be:

	VM_BUG_ON(pgdp points to a usermode table);

	if (pgdp_maps_userspace(pgdp)) {
		/* Install the pgd as requested into the usermode tables. */
		kernel_to_user_pgdp(pgdp)->pgd = pgd.pgd;

		if (pgd_val(pgd) & _PAGE_USER) {
			/*
			 * This is a normal user pgd -- the kernelmode mapping
			 * should have NX set to prevent erroneous usermode
			 * execution with the kernel tables.
			 */
			return __pgd(pgd_val(pgd) | _PAGE_NX);
		} else {
			/* This is a weird mapping, e.g. EFI. Map it straight through. */
			return pgd;
		}
	} else {
		/*
		 * We can get here due to vmalloc, a vmalloc fault, memory
		 * hot-add, or initial setup of kernelmode page tables.
		 * Regardless of which particular code path we're in, these
		 * mappings should not be automatically propagated to the
		 * usermode tables.
		 */
		return pgd;
	}
}

That should make all the VSYSCALL nastiness go away.
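
For anyone who wants to poke at the decision logic suggested above outside the kernel, here is a minimal user-space sketch in C. Everything prefixed sk_ or SK_ is a hypothetical stand-in invented for illustration: the paired-table struct, the index-based slot lookup, and the bit constants are assumptions, not kernel definitions. It also assumes NX is always supported and that the lower half of the PGD slots maps userspace. It models only the three outcomes (user pgd, non-user pgd in the user range, kernel-range pgd) and is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for _PAGE_USER, _PAGE_NX and PTRS_PER_PGD. */
#define SK_PAGE_USER    (UINT64_C(1) << 2)
#define SK_PAGE_NX      (UINT64_C(1) << 63)
#define SK_PTRS_PER_PGD 512u

typedef struct { uint64_t pgd; } sk_pgd_t;

/*
 * Assumption: the kernel copy and the user copy of the PGD are kept as a
 * pair, so slot i of one corresponds to slot i of the other.
 */
struct sk_pgd_pair {
	sk_pgd_t kernel[SK_PTRS_PER_PGD];
	sk_pgd_t user[SK_PTRS_PER_PGD];
};

/* Assumption: the lower half of the slots covers user addresses. */
static bool sk_maps_userspace(unsigned int idx)
{
	return idx < SK_PTRS_PER_PGD / 2;
}

/*
 * Mirror of the suggested flow: populate the user copy where appropriate
 * and return the value the caller should store in the kernel copy.
 */
static sk_pgd_t sk_set_user_pgd(struct sk_pgd_pair *t, unsigned int idx,
				sk_pgd_t pgd)
{
	/* Stand-in for the VM_BUG_ON: only kernel-copy slots are valid. */
	assert(idx < SK_PTRS_PER_PGD);

	if (sk_maps_userspace(idx)) {
		/* Install the pgd as requested into the user copy. */
		t->user[idx].pgd = pgd.pgd;

		/* Normal user pgd: the kernel copy gets NX set. */
		if (pgd.pgd & SK_PAGE_USER)
			return (sk_pgd_t){ pgd.pgd | SK_PAGE_NX };

		/* Non-user mapping in the user range: pass it through. */
		return pgd;
	}

	/* Kernel-range entry: never propagated to the user copy. */
	return pgd;
}

/* Convenience wrapper: store the returned value in the kernel copy. */
static void sk_install_pgd(struct sk_pgd_pair *t, unsigned int idx,
			   sk_pgd_t pgd)
{
	t->kernel[idx] = sk_set_user_pgd(t, idx, pgd);
}
```

Calling sk_install_pgd() on a user-range slot with SK_PAGE_USER set leaves the user copy with the value as given and the kernel copy with NX added. The same call on a kernel-range slot touches only the kernel copy.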