From: Andy Lutomirski <luto@kernel.org>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>, X86 ML <x86@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Dave Hansen <dave.hansen@intel.com>,
Borislav Petkov <bpetkov@suse.de>,
Greg KH <gregkh@linuxfoundation.org>,
Kees Cook <keescook@google.com>, Hugh Dickins <hughd@google.com>,
Brian Gerst <brgerst@gmail.com>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Denys Vlasenko <dvlasenk@redhat.com>,
Rik van Riel <riel@redhat.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Juergen Gross <jgross@suse.com>,
David Laight <David.Laight@aculab.com>,
Eduardo Valentin <eduval@amazon.com>,
aliguori@amazon.com, Will Deacon <will.deacon@arm.com>,
Daniel Gruss <daniel.gruss@iaik.tugraz.at>,
Dave Hansen <dave.hansen@linux.intel.com>,
Ingo Molnar <mingo@kernel.org>,
michael.schwarz@iaik.tugraz.at, Borislav Petkov <bp@alien8.de>,
moritz.lipp@iaik.tugraz.at, richard.fellner@student.tugraz.at
Subject: Re: [patch 51/60] x86/mm: Allow flushing for future ASID switches
Date: Mon, 4 Dec 2017 14:22:54 -0800
Message-ID: <CALCETrX+DaXTTTQs_b_9nq2BpkCxrTv5FzLgyMd23wxczXn=GQ@mail.gmail.com>
In-Reply-To: <20171204150609.002009374@linutronix.de>
On Mon, Dec 4, 2017 at 6:07 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> If changing the page tables in such a way that an invalidation of all
> contexts (aka. PCIDs / ASIDs) is required, they can be actively invalidated
> by:
>
> 1. INVPCID for each PCID (works for single pages too).
>
> 2. Load CR3 with each PCID without the NOFLUSH bit set
>
> 3. Load CR3 with the NOFLUSH bit set for each and do INVLPG for each address.
>
> But, none of these are really feasible since there are ~6 ASIDs (12 with
> KERNEL_PAGE_TABLE_ISOLATION) at the time that invalidation is required.
> Instead of actively invalidating them, invalidate the *current* context and
> also mark the cpu_tlbstate _quickly_ to indicate future invalidation to be
> required.
>
> At the next context-switch, look for this indicator
> ('invalidate_other' being set) and invalidate all of the
> cpu_tlbstate.ctxs[] entries.
>
> This ensures that any future context switches will do a full flush
> of the TLB, picking up the previous changes.
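For readers following along, the deferred scheme described in the changelog above can be modelled in plain userspace C. This is only a sketch under stated assumptions: struct cpu_tlb_model, NR_ASIDS, mark_other_asids_stale() and clear_stale_asids() are hypothetical names that do not appear in the patch, and a ctx_id of 0 is assumed to mean "this slot caches nothing usable".

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_ASIDS 6 /* assumption: mirrors the "~6 ASIDs" in the changelog */

/* Hypothetical stand-in for the per-CPU cpu_tlbstate. */
struct cpu_tlb_model {
	unsigned int loaded_asid;   /* slot currently in use */
	bool invalidate_other;      /* debt recorded by the cheap path */
	uint64_t ctx_id[NR_ASIDS];  /* assumed: 0 means "nothing usable cached" */
};

/* Cheap path: run when the page tables change. Only records the debt. */
static void mark_other_asids_stale(struct cpu_tlb_model *cpu)
{
	cpu->invalidate_other = true;
}

/* Deferred path: run at the next context switch, before choosing a slot. */
static void clear_stale_asids(struct cpu_tlb_model *cpu)
{
	if (!cpu->invalidate_other)
		return;

	for (unsigned int asid = 0; asid < NR_ASIDS; asid++) {
		/* The loaded slot was kept up to date while it ran. */
		if (asid != cpu->loaded_asid)
			cpu->ctx_id[asid] = 0;
	}
	cpu->invalidate_other = false;
}
```

In this model the cost of a page table change is a single store; the walk over the slots is paid once, at the next switch, no matter how many changes were made in between.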
>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Denys Vlasenko <dvlasenk@redhat.com>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: michael.schwarz@iaik.tugraz.at
> Cc: daniel.gruss@iaik.tugraz.at
> Cc: Brian Gerst <brgerst@gmail.com>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: hughd@google.com
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: moritz.lipp@iaik.tugraz.at
> Cc: keescook@google.com
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: richard.fellner@student.tugraz.at
> Link: https://lkml.kernel.org/r/20171123003507.E8C327F5@viggo.jf.intel.com
>
> ---
> arch/x86/include/asm/tlbflush.h | 42 ++++++++++++++++++++++++++++++----------
> arch/x86/mm/tlb.c | 37 +++++++++++++++++++++++++++++++++++
> 2 files changed, 69 insertions(+), 10 deletions(-)
>
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -188,6 +188,17 @@ struct tlb_state {
> bool is_lazy;
>
> /*
> + * If set we changed the page tables in such a way that we
> + * needed an invalidation of all contexts (aka. PCIDs / ASIDs).
> + * This tells us to go invalidate all the non-loaded ctxs[]
> + * on the next context switch.
> + *
> + * The current ctx was kept up-to-date as it ran and does not
> + * need to be invalidated.
> + */
> + bool invalidate_other;
> +
> + /*
> * Access to this CR4 shadow and to H/W CR4 is protected by
> * disabling interrupts when modifying either one.
> */
> @@ -267,6 +278,19 @@ static inline unsigned long cr4_read_sha
> return this_cpu_read(cpu_tlbstate.cr4);
> }
>
> +static inline void invalidate_pcid_other(void)
> +{
> + /*
> + * With global pages, all of the shared kernel page tables
> + * are set as _PAGE_GLOBAL. We have no shared nonglobals
> + * and nothing to do here.
> + */
> + if (!static_cpu_has_bug(X86_BUG_CPU_SECURE_MODE_KPTI))
> + return;
I think I'd be more comfortable if this check were in the caller, not
here. Shouldn't a function called invalidate_pcid_other() do what the
name says?
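A rough userspace sketch of that split, assuming a made-up caller and a plain bool in place of the CPU bug check. struct cpu_tlb_model, flush_one_kernel_address() and kpti_active are hypothetical and only illustrate where the check would live:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU cpu_tlbstate. */
struct cpu_tlb_model {
	bool invalidate_other;
};

/* Does what the name says, unconditionally. */
static void invalidate_pcid_other(struct cpu_tlb_model *cpu)
{
	cpu->invalidate_other = true;
}

/*
 * Hypothetical caller. It decides whether other PCIDs can hold stale
 * non-global kernel entries at all; kpti_active stands in for the
 * static_cpu_has_bug() test in the patch.
 */
static void flush_one_kernel_address(struct cpu_tlb_model *cpu, bool kpti_active)
{
	/* ... the flush of the current context would happen here ... */

	if (kpti_active)
		invalidate_pcid_other(cpu);
}
```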
> +
> + this_cpu_write(cpu_tlbstate.invalidate_other, true);
Why do we need this extra variable instead of just looping over all
other ASIDs and invalidating them? It would be something like:
    for (i = 0; i < TLB_NR_DYN_ASIDS; i++) {
            if (i != this_cpu_read(cpu_tlbstate.loaded_mm_asid))
                    this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0);
    }
modulo epic whitespace damage and possible typos.
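The same loop as a self-contained userspace model, in case it helps to play with it outside the kernel. Assumptions are labelled: struct tlb_state_model and asid_slot_matches() are hypothetical, TLB_NR_DYN_ASIDS is fixed at 6 to match the changelog, ctxs[] is treated as zero-indexed so the walk starts at slot 0, and a ctx_id of 0 is assumed never to match a live mm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TLB_NR_DYN_ASIDS 6 /* assumption: the "~6 ASIDs" from the changelog */

struct tlb_context {
	uint64_t ctx_id;
};

/* Hypothetical stand-in for the per-CPU cpu_tlbstate. */
struct tlb_state_model {
	unsigned int loaded_mm_asid;
	struct tlb_context ctxs[TLB_NR_DYN_ASIDS];
};

/* Eager variant: forget every cached context except the loaded one. */
static void invalidate_other_asids(struct tlb_state_model *st)
{
	assert(st->loaded_mm_asid < TLB_NR_DYN_ASIDS);

	for (unsigned int i = 0; i < TLB_NR_DYN_ASIDS; i++) {
		if (i != st->loaded_mm_asid)
			st->ctxs[i].ctx_id = 0;
	}
}

/*
 * A slot can be reused without a flush only if it still records the
 * mm's ctx_id. Zero is assumed to be an invalid ctx_id.
 */
static bool asid_slot_matches(const struct tlb_state_model *st,
			      unsigned int asid, uint64_t mm_ctx_id)
{
	assert(asid < TLB_NR_DYN_ASIDS);
	return mm_ctx_id != 0 && st->ctxs[asid].ctx_id == mm_ctx_id;
}
```

With at most TLB_NR_DYN_ASIDS stores per call, the eager walk is cheap in this model; whether that holds on a hot kernel path is a separate question the model does not answer.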
> +}
> +
> /*
> * Save some of cr4 feature set we're using (e.g. Pentium 4MB
> * enable and PPro Global page enable), so that any CPU's that boot
> @@ -341,24 +365,22 @@ static inline void __native_flush_tlb_si
>
> static inline void __flush_tlb_all(void)
> {
> - if (boot_cpu_has(X86_FEATURE_PGE))
> + if (boot_cpu_has(X86_FEATURE_PGE)) {
> __flush_tlb_global();
> - else
> + } else {
> __flush_tlb();
> -
> - /*
> - * Note: if we somehow had PCID but not PGE, then this wouldn't work --
> - * we'd end up flushing kernel translations for the current ASID but
> - * we might fail to flush kernel translations for other cached ASIDs.
> - *
> - * To avoid this issue, we force PCID off if PGE is off.
> - */
> + }
> }
>
> static inline void __flush_tlb_one(unsigned long addr)
> {
> count_vm_tlb_event(NR_TLB_LOCAL_FLUSH_ONE);
> __flush_tlb_single(addr);
> + /*
> + * Invalidate other address spaces inaccessible to single-page
> + * invalidation:
> + */
Ugh. If I'm reading this right, __flush_tlb_single() means "flush one
user address" and __flush_tlb_one() means "flush one kernel address".
That's, um, not exactly obvious. Could this be at least commented
better?
--Andy