From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [RFC, PATCH] x86_64: KAISER - do not map kernel in user mode
From: Daniel Gruss
To: Christoph Hellwig
CC: kernel list, kernel-hardening@lists.openwall.com, "clementine.maurice@iaik.tugraz.at", "moritz.lipp@iaik.tugraz.at", Michael Schwarz, Richard Fellner, kirill.shutemov@linux.intel.com, Ingo Molnar, "anders.fogh@gdata-adan.de"
Date: Fri, 5 May 2017 09:40:05 +0200
References: <9df77051-ac01-bfe9-3cf7-4c2ecbcb9292@iaik.tugraz.at> <20170504154717.GA24353@infradead.org>
In-Reply-To: <20170504154717.GA24353@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04.05.2017 17:47, Christoph Hellwig wrote:
> I'll try to read the paper.
> In the meantime: how different is your
> approach from the one here?
>
> https://lwn.net/Articles/39283/
>
> and how different is the performance impact?

The approach sounds very similar, but we need fewer changes because we don't want to change memory allocation, only split the virtual memory. Everything can stay where it is.

We found that the CR3 switch seems to be significantly faster on modern microarchitectures (we ran our performance tests on a Skylake i7-6700K). We think the TLB may use the full CR3 base address as a tag, which would somewhat relax the need to flush the entire TLB on CR3 updates.

The direct runtime overhead is switching CR3, but that's it. Indirectly, we potentially increase the number of TLB entries required at one level or another of the TLB. For TLB-intensive tasks this might lead to more significant performance penalties.

I'm sure the overhead on older systems is larger than on recent systems.
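
To make the trade-off above concrete, here is a minimal toy sketch in Python. It is not taken from the patch and does not model any real CPU: the class `ToyTLB`, the function `simulate`, the two CR3 values, the page numbers, and the fully associative LRU replacement policy are all assumptions made up for illustration. The CR3-tagged mode only encodes the conjecture from the mail, not confirmed hardware behaviour. The sketch counts TLB misses for a loop of user work followed by a kernel entry, once with a full flush on every CR3 write and once with entries tagged by CR3.

```python
from collections import OrderedDict

# Hypothetical CR3 values for the two page-table sets (illustration only).
USER_CR3 = 0x1000
KERNEL_CR3 = 0x2000


class ToyTLB:
    """Toy fully associative LRU TLB; an assumption, not a real CPU model."""

    def __init__(self, capacity, tagged):
        self.capacity = capacity
        self.tagged = tagged          # True: entries are tagged with CR3
        self.entries = OrderedDict()  # (tag, virtual page number) -> True
        self.cr3 = None
        self.misses = 0

    def write_cr3(self, cr3):
        # Untagged model: every CR3 write flushes the whole TLB.
        if not self.tagged:
            self.entries.clear()
        self.cr3 = cr3

    def access(self, vpn):
        key = (self.cr3 if self.tagged else 0, vpn)
        if key in self.entries:
            self.entries.move_to_end(key)  # refresh LRU position on a hit
            return
        self.misses += 1
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


def simulate(tagged, capacity, syscalls=10):
    """Return the miss count for `syscalls` user -> kernel -> user round trips."""
    user_pages = list(range(8))
    # The kernel touches its own pages plus two user pages (think of a copy
    # from user space). With tagging these need their own kernel-tagged entries.
    kernel_pages = list(range(100, 104)) + [0, 1]

    tlb = ToyTLB(capacity, tagged)
    tlb.write_cr3(USER_CR3)
    for _ in range(syscalls):
        for vpn in user_pages:
            tlb.access(vpn)
        tlb.write_cr3(KERNEL_CR3)   # kernel entry
        for vpn in kernel_pages:
            tlb.access(vpn)
        tlb.write_cr3(USER_CR3)     # kernel exit
    return tlb.misses


if __name__ == "__main__":
    for tagged in (False, True):
        for capacity in (64, 8):
            print(f"tagged={tagged} capacity={capacity} "
                  f"misses={simulate(tagged, capacity)}")
```

In this toy model, tagging avoids the repeated refills only while the combined user and kernel working set fits in the TLB. Once it does not, the extra tagged entries compete for space and the benefit disappears, which is the indirect cost for TLB-intensive tasks mentioned above. The numbers it prints are properties of the toy model only, not measurements.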