From: Andy Lutomirski
Date: Tue, 8 Dec 2015 13:51:26 -0800
Subject: Re: [PATCH 07/12] x86/entry/64: Always run ptregs-using syscalls on the slow path
To: Ingo Molnar
Cc: Brian Gerst, Andy Lutomirski, "the arch/x86 maintainers", Linux Kernel Mailing List, Borislav Petkov, Frédéric Weisbecker, Denys Vlasenko, Linus Torvalds
In-Reply-To: <20151208185608.GA3004@gmail.com>
References: <2ff015fa6989c6a8907c73636f5f5cb99402f6c3.1449522077.git.luto@kernel.org> <20151208185608.GA3004@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 8, 2015 at 10:56 AM, Ingo Molnar wrote:
>
> * Brian Gerst wrote:
>
>> > We could adjust it a bit and check whether we're in C land (by checking rsp
>> > for ts) and jump into the slow path if we aren't, but I'm not sure this is a
>> > huge win. It does save some rodata space by avoiding duplicating the table.
>>
>> The syscall table is huge. 545*8 bytes, over a full page. Duplicating it for
>> just a few different entries is wasteful.
>
> Note that what matters more is cache footprint, not pure size: 1K of RAM overhead
> for something as fundamental as system calls is trivial cost.
>
> So the questions to ask are along these lines:
>
> - what is the typical locality of access (do syscall numbers cluster in time and
>   space)
>

I suspect that they do. Web servers will call send over and over, for example.

> - how frequently would the two tables be accessed (is one accessed less
>   frequently than the other?)

On setups that don't bail right away, the fast path table gets hit
most of the time. On setups that do bail right away (context tracking
on, for example), we exclusively use the slow path table.

>
> - subsequently how does the effective cache footprint change with the
>   duplication?

In the worst case (repeatedly forking, for example, but I doubt we
care about that case), the duplication adds one extra cacheline.

>
> it might still end up not being worth it - but it's not the RAM cost that is the
> main factor IMHO.

Agreed.

One option: borrow the high bit to indicate "needs ptregs". This adds
a branch to both the fast path and the slow path, but it avoids the
cache hit.

Brian's approach gets the best of all worlds except that, if I
understand it right, it's a bit fragile.

--Andy
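
For illustration only, the "borrow the high bit" option might be modelled in user-space C roughly as below. This is a sketch under stated assumptions, not the kernel's entry code and not any patch from this thread: every name (`struct regs`, `NEEDS_PTREGS`, `make_entry`, `dispatch_fast`, `dispatch_slow`, the two toy handlers) is hypothetical, and it assumes a 64-bit platform where user-space code addresses leave bit 63 clear and where a function pointer survives a round trip through `uintptr_t`. A single table is shared by both paths; the fast path pays one extra test, and the slow path strips the tag (a mask here, where the real assembly might use a branch).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pt_regs: just the six syscall arguments. */
struct regs { long args[6]; };

typedef long (*handler_t)(struct regs *);

/* Top bit of a 64-bit table entry marks "needs ptregs". */
#define NEEDS_PTREGS (UINT64_C(1) << 63)

/* Pack a handler address and the flag into one table entry. */
static uint64_t make_entry(handler_t fn, int needs_ptregs)
{
    uint64_t addr = (uint64_t)(uintptr_t)fn;

    /* Assumption: user-space code addresses never have bit 63 set. */
    assert((addr & NEEDS_PTREGS) == 0);
    return needs_ptregs ? (addr | NEEDS_PTREGS) : addr;
}

/* Recover the callable handler by masking the tag off. */
static handler_t entry_handler(uint64_t entry)
{
    return (handler_t)(uintptr_t)(entry & ~NEEDS_PTREGS);
}

/* Two toy handlers: one ordinary, one that "needs ptregs". */
static long sys_plain(struct regs *r)   { return r->args[0] + 1; }
static long sys_forkish(struct regs *r) { return r->args[0] + 100; }

enum { NR_PLAIN, NR_FORKISH, NR_MAX };

static uint64_t table[NR_MAX];   /* one table shared by both paths */
static int slow_path_calls;      /* counts detours, for demonstration */

static void init_table(void)
{
    table[NR_PLAIN]   = make_entry(sys_plain, 0);
    table[NR_FORKISH] = make_entry(sys_forkish, 1);
}

/* Slow path: full register state is assumed available; strip the tag. */
static long dispatch_slow(unsigned nr, struct regs *r)
{
    if (nr >= NR_MAX)
        return -1;
    slow_path_calls++;
    return entry_handler(table[nr])(r);
}

/* Fast path: one extra test on the entry, then call or bail out. */
static long dispatch_fast(unsigned nr, struct regs *r)
{
    if (nr >= NR_MAX)
        return -1;
    if (table[nr] & NEEDS_PTREGS)
        return dispatch_slow(nr, r);
    return entry_handler(table[nr])(r);
}
```

After `init_table()`, calling `dispatch_fast(NR_PLAIN, &r)` invokes the handler directly, while `dispatch_fast(NR_FORKISH, &r)` detours through `dispatch_slow`. The trade-off is the one described above: no second table to keep in cache, at the price of an extra check per syscall.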