From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1752506AbdLEWGV (ORCPT ); Tue, 5 Dec 2017 17:06:21 -0500
Received: from bombadil.infradead.org ([65.50.211.133]:51990 "EHLO
    bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1751500AbdLEWGU (ORCPT );
    Tue, 5 Dec 2017 17:06:20 -0500
Date: Tue, 5 Dec 2017 23:05:57 +0100
From: Peter Zijlstra
To: Andy Lutomirski
Cc: Thomas Gleixner, LKML, X86 ML, Linus Torvalds, Dave Hansen,
    Borislav Petkov, Greg KH, Kees Cook, Hugh Dickins, Brian Gerst,
    Josh Poimboeuf, Denys Vlasenko, Rik van Riel, Boris Ostrovsky,
    Juergen Gross, David Laight, Eduardo Valentin, aliguori@amazon.com,
    Will Deacon, Daniel Gruss
Subject: Re: [patch 53/60] x86/mm: Use/Fix PCID to optimize user/kernel switches
Message-ID: <20171205220557.GX3165@worktop.lehotels.local>
References: <20171204140706.296109558@linutronix.de>
    <20171204150609.179192470@linutronix.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.22.1 (2013-10-16)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 05, 2017 at 01:46:36PM -0800, Andy Lutomirski wrote:
> On Mon, Dec 4, 2017 at 6:07 AM, Thomas Gleixner wrote:
> > We can use PCID to retain the TLBs across CR3 switches; including
> > those now part of the user/kernel switch. This increases performance
> > of kernel entry/exit at the cost of more expensive/complicated TLB
> > flushing.
> >
> > Now that we have two address spaces, one for kernel and one for user
> > space, we need two PCIDs per mm. We use the top PCID bit to indicate a
> > user PCID (just like we use the PFN LSB for the PGD).
> > Since we do TLB invalidation from kernel space, the existing code will
> > only invalidate the kernel PCID, we augment that by marking the
> > corresponding user PCID invalid, and upon switching back to userspace,
> > use a flushing CR3 write for the switch.
> >
> > In order to access the user_pcid_flush_mask we use PER_CPU storage,
> > which means the previously established SWAPGS vs CR3 ordering is now
> > mandatory and required.
> >
> > Having to do this memory access does require additional registers,
> > most sites have a functioning stack and we can spill one (RAX), sites
> > without functional stack need to otherwise provide the second scratch
> > register.
> >
> > Note: PCID is generally available on Intel Sandybridge and later CPUs.
> > Note: Up until this point TLB flushing was broken in this series.
>
> I haven't checked that hard which patch introduces this bug, but it
> seems that, with this applied, nothing propagates
> non-mm-switch-related flushes to usermode. Shouldn't
> flush_tlb_func_common() contain a call to invalidate_user_asid() near
> the bottom? Alternatively, it could be in local_flush_tlb() and
> __flush_tlb_single() (or whatever the hell the flush-one-usermode-TLB
> function ends up being called).

__native_flush_tlb_single() has the invalidate_user_asid()
__native_flush_tlb() has the invalidate_user_asid().

Which should be exactly that last option you mention.

> Also, on a somewhat related note, __flush_tlb_single() is called from
> both flush_tlb_func_common() and do_kernel_range_flush. That sounds
> wrong.

Fixed that in the patches I sent out earlier today.
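
For readers following the thread, the deferred user-PCID flush from the
quoted changelog can be modelled in plain user-space C. This is only a
sketch under stated assumptions, not the kernel code: invalidate_user_asid()
and user_pcid_flush_mask are names taken from the thread, but their bodies
here are a guess at the idea, and user_pcid(), user_cr3_for_exit(), the macro
names, the 16-bit mask width and the ASID numbering are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and layout: a model of the idea, not kernel code. */
#define PCID_BITS      12
#define USER_PCID_BIT  (1u << (PCID_BITS - 1))  /* top PCID bit marks a user PCID */
#define CR3_NOFLUSH    (1ULL << 63)             /* CR3 bit 63: keep this PCID's TLB entries */
#define MAX_ASIDS      16                       /* assumption: one mask bit per ASID */

/* Stand-in for the per-CPU user_pcid_flush_mask.
 * Bit n set means the user PCID belonging to ASID n is stale. */
uint16_t user_pcid_flush_mask;

/* User PCID for an ASID: same number with the top PCID bit set. */
uint16_t user_pcid(uint16_t asid)
{
    assert(asid < MAX_ASIDS);
    return (uint16_t)(asid | USER_PCID_BIT);
}

/* Kernel-side flush paths cannot flush the user PCID directly,
 * so they only record that it needs flushing later. */
void invalidate_user_asid(uint16_t asid)
{
    assert(asid < MAX_ASIDS);
    user_pcid_flush_mask |= (uint16_t)(1u << asid);
}

/* Build the CR3 value for the return to user space. A pending
 * invalidation is consumed and yields a flushing write (NOFLUSH
 * clear); otherwise the cached user translations are kept. */
uint64_t user_cr3_for_exit(uint64_t user_pgd_pa, uint16_t asid)
{
    uint64_t cr3 = user_pgd_pa | user_pcid(asid);

    if (user_pcid_flush_mask & (1u << asid)) {
        user_pcid_flush_mask &= (uint16_t)~(1u << asid);
        return cr3;                 /* flushing CR3 write */
    }
    return cr3 | CR3_NOFLUSH;       /* non-flushing CR3 write */
}
```

In this model, a call to invalidate_user_asid(3) followed by
user_cr3_for_exit(pgd, 3) returns a value with bit 63 clear once, and later
exits for the same ASID get bit 63 set again until the next invalidation.
That matches the thread's point that the invalidation has to be recorded in
the flush primitives themselves, since nothing else propagates kernel-side
flushes to the user PCID.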