From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1756753AbbLGWzt (ORCPT ); Mon, 7 Dec 2015 17:55:49 -0500
Received: from mail-ob0-f176.google.com ([209.85.214.176]:34567
 "EHLO mail-ob0-f176.google.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1756126AbbLGWzr (ORCPT );
 Mon, 7 Dec 2015 17:55:47 -0500
MIME-Version: 1.0
In-Reply-To:
References:
From: Andy Lutomirski
Date: Mon, 7 Dec 2015 14:55:27 -0800
Message-ID:
Subject: Re: [PATCH 00/12] x86: Rewrite 64-bit syscall code
To: Andy Lutomirski
Cc: X86 ML , "linux-kernel@vger.kernel.org" , Brian Gerst ,
 Borislav Petkov , =?UTF-8?B?RnLDqWTDqXJpYyBXZWlzYmVja2Vy?= ,
 Denys Vlasenko , Linus Torvalds
Content-Type: multipart/mixed; boundary=001a11c2b464fadd9f052656c1e8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

--001a11c2b464fadd9f052656c1e8
Content-Type: text/plain; charset=UTF-8

On Mon, Dec 7, 2015 at 1:51 PM, Andy Lutomirski wrote:
> This is kind of like the 32-bit and compat code, except that I
> preserved the fast path this time. I was unable to measure any
> significant performance change on my laptop in the fast path.
>
> What do you all think?

For completeness, if I zap the fast path entirely (see attached), I
lose 20 cycles (148 cycles vs 128 cycles) on Skylake. Switching
between movq and pushq for stack setup makes no difference whatsoever,
interestingly. I haven't tried to figure out exactly where those 20
cycles go.

--Andy --001a11c2b464fadd9f052656c1e8 Content-Type: text/x-patch; charset=US-ASCII; name="zap_fastpatch.patch" Content-Disposition: attachment; filename="zap_fastpatch.patch" Content-Transfer-Encoding: base64 X-Attachment-Id: f_ihwk0u290 ZGlmZiAtLWdpdCBhL2FyY2gveDg2L2VudHJ5L2VudHJ5XzY0LlMgYi9hcmNoL3g4Ni9lbnRyeS9l bnRyeV82NC5TCmluZGV4IDFhYjUzNjJmMjQxZC4uYTk3OTgxZjBkOWNlIDEwMDY0NAotLS0gYS9h cmNoL3g4Ni9lbnRyeS9lbnRyeV82NC5TCisrKyBiL2FyY2gveDg2L2VudHJ5L2VudHJ5XzY0LlMK QEAgLTE2Myw3MSArMTYzLDE3IEBAIEdMT0JBTChlbnRyeV9TWVNDQUxMXzY0X2FmdGVyX3N3YXBn cykKIAlwdXNocQklcjkJCQkJLyogcHRfcmVncy0+cjkgKi8KIAlwdXNocQklcjEwCQkJCS8qIHB0 X3JlZ3MtPnIxMCAqLwogCXB1c2hxCSVyMTEJCQkJLyogcHRfcmVncy0+cjExICovCi0Jc3ViCSQo Nio4KSwgJXJzcAkJCS8qIHB0X3JlZ3MtPmJwLCBieCwgcjEyLTE1IG5vdCBzYXZlZCAqLworCXB1 c2hxCSVyYngJCQkJLyogcHRfcmVncy0+cmJ4ICovCisJcHVzaHEJJXJicAkJCQkvKiBwdF9yZWdz LT5yYnAgKi8KKwlwdXNocQklcjEyCQkJCS8qIHB0X3JlZ3MtPnIxMiAqLworCXB1c2hxCSVyMTMJ CQkJLyogcHRfcmVncy0+cjEzICovCisJcHVzaHEJJXIxNAkJCQkvKiBwdF9yZWdzLT5yMTQgKi8K KwlwdXNocQklcjE1CQkJCS8qIHB0X3JlZ3MtPnIxNSAqLwogCi0JLyoKLQkgKiBJZiB3ZSBuZWVk IHRvIGRvIGVudHJ5IHdvcmsgb3IgaWYgd2UgZ3Vlc3Mgd2UnbGwgbmVlZCB0byBkbwotCSAqIGV4 aXQgd29yaywgZ28gc3RyYWlnaHQgdG8gdGhlIHNsb3cgcGF0aC4KLQkgKi8KLQl0ZXN0bAkkX1RJ Rl9XT1JLX1NZU0NBTExfRU5UUll8X1RJRl9BTExXT1JLX01BU0ssIEFTTV9USFJFQURfSU5GTyhU SV9mbGFncywgJXJzcCwgU0laRU9GX1BUUkVHUykKLQlqbnoJZW50cnlfU1lTQ0FMTDY0X3Nsb3df cGF0aAotCi1lbnRyeV9TWVNDQUxMXzY0X2Zhc3RwYXRoOgotCS8qCi0JICogRWFzeSBjYXNlOiBl bmFibGUgaW50ZXJydXB0cyBhbmQgaXNzdWUgdGhlIHN5c2NhbGwuICBJZiB0aGUgc3lzY2FsbAot CSAqIG5lZWRzIHB0X3JlZ3MsIHdlJ2xsIGNhbGwgYSBzdHViIHRoYXQgZGlzYWJsZXMgaW50ZXJy dXB0cyBhZ2FpbgotCSAqIGFuZCBqdW1wcyB0byB0aGUgc2xvdyBwYXRoLgotCSAqLwotCVRSQUNF X0lSUVNfT04KLQlFTkFCTEVfSU5URVJSVVBUUyhDTEJSX05PTkUpCi0jaWYgX19TWVNDQUxMX01B U0sgPT0gfjAKLQljbXBxCSRfX05SX3N5c2NhbGxfbWF4LCAlcmF4Ci0jZWxzZQotCWFuZGwJJF9f U1lTQ0FMTF9NQVNLLCAlZWF4Ci0JY21wbAkkX19OUl9zeXNjYWxsX21heCwgJWVheAotI2VuZGlm 
Ci0JamEJMWYJCQkJLyogcmV0dXJuIC1FTk9TWVMgKGFscmVhZHkgaW4gcHRfcmVncy0+YXgpICov Ci0JbW92cQklcjEwLCAlcmN4Ci0JY2FsbAkqc3lzX2NhbGxfdGFibGVfZmFzdHBhdGhfNjQoLCAl cmF4LCA4KQotCW1vdnEJJXJheCwgUkFYKCVyc3ApCi0xOgotCi0JLyoKLQkgKiBJZiB3ZSBnZXQg aGVyZSwgdGhlbiB3ZSBrbm93IHRoYXQgcHRfcmVncyBpcyBjbGVhbiBmb3IgU1lTUkVUNjQuCi0J ICogSWYgd2Ugc2VlIHRoYXQgbm8gZXhpdCB3b3JrIGlzIHJlcXVpcmVkICh3aGljaCB3ZSBhcmUg cmVxdWlyZWQKLQkgKiB0byBjaGVjayB3aXRoIElSUXMgb2ZmKSwgdGhlbiB3ZSBjYW4gZ28gc3Ry YWlnaHQgdG8gU1lTUkVUNjQuCi0JICovCi0JRElTQUJMRV9JTlRFUlJVUFRTKENMQlJfTk9ORSkK LQlUUkFDRV9JUlFTX09GRgotCXRlc3RsCSRfVElGX0FMTFdPUktfTUFTSywgQVNNX1RIUkVBRF9J TkZPKFRJX2ZsYWdzLCAlcnNwLCBTSVpFT0ZfUFRSRUdTKQotCWpuegkxZgotCi0JTE9DS0RFUF9T WVNfRVhJVAotCVRSQUNFX0lSUVNfT04JCS8qIHVzZXIgbW9kZSBpcyB0cmFjZWQgYXMgSVJRcyBv biAqLwotCVJFU1RPUkVfQ19SRUdTCi0JbW92cQlSU1AoJXJzcCksICVyc3AKLQlVU0VSR1NfU1lT UkVUNjQKLQotMToKLQkvKgotCSAqIFRoZSBmYXN0IHBhdGggbG9va2VkIGdvb2Qgd2hlbiB3ZSBz dGFydGVkLCBidXQgc29tZXRoaW5nIGNoYW5nZWQKLQkgKiBhbG9uZyB0aGUgd2F5IGFuZCB3ZSBu ZWVkIHRvIHN3aXRjaCB0byB0aGUgc2xvdyBwYXRoLiAgQ2FsbGluZwotCSAqIHJhaXNlKDMpIHdp bGwgdHJpZ2dlciB0aGlzLCBmb3IgZXhhbXBsZS4gIElSUXMgYXJlIG9mZi4KLQkgKi8KLQlUUkFD RV9JUlFTX09OCi0JRU5BQkxFX0lOVEVSUlVQVFMoQ0xCUl9OT05FKQotCVNBVkVfRVhUUkFfUkVH UwotCW1vdnEJJXJzcCwgJXJkaQotCWNhbGwJc3lzY2FsbF9yZXR1cm5fc2xvd3BhdGgJLyogcmV0 dXJucyB3aXRoIElSUXMgZGlzYWJsZWQgKi8KLQlqbXAJcmV0dXJuX2Zyb21fU1lTQ0FMTF82NAot Ci1lbnRyeV9TWVNDQUxMNjRfc2xvd19wYXRoOgogCS8qIElSUXMgYXJlIG9mZi4gKi8KLQlTQVZF X0VYVFJBX1JFR1MKIAltb3ZxCSVyc3AsICVyZGkKIAljYWxsCWRvX3N5c2NhbGxfNjQJCS8qIHJl dHVybnMgd2l0aCBJUlFzIGRpc2FibGVkICovCiAKLXJldHVybl9mcm9tX1NZU0NBTExfNjQ6CiAJ UkVTVE9SRV9FWFRSQV9SRUdTCiAJVFJBQ0VfSVJRU19JUkVUUQkJLyogd2UncmUgYWJvdXQgdG8g Y2hhbmdlIElGICovCiAKQEAgLTMwNSwxNCArMjUxLDcgQEAgb3Bwb3J0dW5pc3RpY19zeXNyZXRf ZmFpbGVkOgogRU5EKGVudHJ5X1NZU0NBTExfNjQpCiAKIEVOVFJZKHN0dWJfcHRyZWdzXzY0KQot CS8qCi0JICogU3lzY2FsbHMgbWFya2VkIGFzIG5lZWRpbmcgcHRyZWdzIHRoYXQgZ28gdGhyb3Vn 
aCB0aGUgZmFzdCBwYXRoCi0JICogbGFuZCBoZXJlLiAgV2UgdHJhbnNmZXIgdG8gdGhlIHNsb3cg cGF0aC4KLQkgKi8KLQlESVNBQkxFX0lOVEVSUlVQVFMoQ0xCUl9OT05FKQotCVRSQUNFX0lSUVNf T0ZGCi0JYWRkcQkkOCwgJXJzcAotCWptcAllbnRyeV9TWVNDQUxMNjRfc2xvd19wYXRoCisJdWQy CiBFTkQoc3R1Yl9wdHJlZ3NfNjQpCiAKIC8qCg== --001a11c2b464fadd9f052656c1e8--