From: Andy Lutomirski
Date: Tue, 5 Dec 2017 13:46:36 -0800
Subject: Re: [patch 53/60] x86/mm: Use/Fix PCID to optimize user/kernel switches
To: Thomas Gleixner
Cc: LKML, X86 ML, Linus Torvalds, Andy Lutomirski, Peter Zijlstra,
 Dave Hansen, Borislav Petkov, Greg KH, Kees Cook, Hugh Dickins,
 Brian Gerst, Josh Poimboeuf, Denys Vlasenko, Rik van Riel,
 Boris Ostrovsky, Juergen Gross, David Laight, Eduardo Valentin,
 aliguori@amazon.com, Will Deacon, Daniel Gruss
In-Reply-To: <20171204150609.179192470@linutronix.de>
References: <20171204140706.296109558@linutronix.de> <20171204150609.179192470@linutronix.de>

On Mon, Dec 4, 2017 at 6:07 AM, Thomas Gleixner wrote:
> We can use PCID to retain the TLBs across CR3 switches, including
> those now part of the user/kernel switch. This increases the
> performance of kernel entry/exit at the cost of more
> expensive/complicated TLB flushing.
>
> Now that we have two address spaces, one for the kernel and one for
> user space, we need two PCIDs per mm. We use the top PCID bit to
> indicate a user PCID (just like we use the PFN LSB for the PGD).
> Since we do TLB invalidation from kernel space, the existing code
> will only invalidate the kernel PCID; we augment that by marking the
> corresponding user PCID invalid, and upon switching back to
> userspace, use a flushing CR3 write for the switch.
>
> In order to access the user_pcid_flush_mask we use per-CPU storage,
> which means the previously established SWAPGS vs. CR3 ordering is
> now mandatory.
>
> This memory access requires an additional scratch register. Most
> sites have a functioning stack and can spill one (RAX); sites
> without a functional stack need to provide the second scratch
> register some other way.
>
> Note: PCID is generally available on Intel Sandy Bridge and later CPUs.
> Note: Up until this point, TLB flushing was broken in this series.

I haven't checked carefully which patch introduces this bug, but it
seems that, with this applied, nothing propagates non-mm-switch-related
flushes to usermode. Shouldn't flush_tlb_func_common() contain a call
to invalidate_user_asid() near the bottom? Alternatively, it could be
in local_flush_tlb() and __flush_tlb_single() (or whatever the hell the
flush-one-usermode-TLB function ends up being called).

Also, on a somewhat related note, __flush_tlb_single() is called from
both flush_tlb_func_common() and do_kernel_range_flush(). That sounds
wrong.
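
For reference, a rough, self-contained sketch (plain C, compilable
stand-alone) of the user-PCID bookkeeping the changelog describes:
kernel-side invalidation only marks the user ASID stale in a per-CPU
mask, and the exit-to-usermode CR3 build either keeps the NOFLUSH bit
or turns into a flushing write. The names (kern_pcid, user_pcid,
invalidate_user_asid, PTI_USER_PCID_BIT, user_pcid_flush_mask) and the
exact ASID-to-PCID mapping are illustrative assumptions, not
necessarily what the final patches use.

/*
 * Illustrative model only: single CPU, no locking, and the helper
 * names are assumptions rather than the series' final API.
 */
#include <stdio.h>
#include <stdint.h>

#define PTI_USER_PCID_BIT	11		/* assumed: top PCID bit marks the user variant */
#define CR3_NOFLUSH		(1ULL << 63)	/* CR3 bit 63: do not flush this PCID */

/* Per-CPU (here: single CPU) mask of user ASIDs that still need a flush. */
static unsigned long user_pcid_flush_mask;

static uint16_t kern_pcid(uint16_t asid)
{
	/* Assumed mapping: PCID 0 stays reserved, so hardware PCID = asid + 1. */
	return (uint16_t)(asid + 1);
}

static uint16_t user_pcid(uint16_t asid)
{
	/* The user address space uses the same PCID with the top bit set. */
	return (uint16_t)(kern_pcid(asid) | (1u << PTI_USER_PCID_BIT));
}

/* Kernel-side TLB invalidation: defer the user flush by marking the ASID stale. */
static void invalidate_user_asid(uint16_t asid)
{
	user_pcid_flush_mask |= 1ul << asid;
}

/* Build the CR3 value used when switching back to the user page tables. */
static uint64_t build_user_cr3(uint64_t pgd_pa, uint16_t asid)
{
	uint64_t cr3 = pgd_pa | user_pcid(asid);

	if (user_pcid_flush_mask & (1ul << asid)) {
		/* A flush is pending: clear the bit and do a flushing CR3 write. */
		user_pcid_flush_mask &= ~(1ul << asid);
		return cr3;
	}
	/* Nothing pending: keep the user TLB entries alive across the switch. */
	return cr3 | CR3_NOFLUSH;
}

int main(void)
{
	uint16_t asid = 2;		/* some mm's ASID slot */
	uint64_t user_pgd = 0x1000;	/* fake, page-aligned user PGD address */

	/* First exit to userspace: no flush pending, NOFLUSH bit stays set. */
	printf("exit 1: cr3=%#llx\n",
	       (unsigned long long)build_user_cr3(user_pgd, asid));

	/* Kernel-side flush of this mm (e.g. after munmap): only mark the user ASID. */
	invalidate_user_asid(asid);

	/* Next exit: the deferred flush becomes a flushing CR3 write. */
	printf("exit 2: cr3=%#llx\n",
	       (unsigned long long)build_user_cr3(user_pgd, asid));
	return 0;
}

In that toy model, the point of the reply above is that the plain
kernel-side flush paths (flush_tlb_func_common() and the single-page
flush helper) would also need the invalidate_user_asid() call;
otherwise the per-CPU bit is never set and stale user-PCID entries
survive the next return to usermode.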