From: Linus Torvalds
Date: Tue, 28 Jan 2020 11:51:56 -0800
Subject: Re: [GIT PULL] x86/asm changes for v5.6
To: Ingo Molnar, Tony Luck, Borislav Petkov
Cc: Linux Kernel Mailing List, Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Andrew Morton
References: <20200128165906.GA67781@gmail.com>
In-Reply-To: <20200128165906.GA67781@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="00000000000055d6b2059d388f19"

--00000000000055d6b2059d388f19
Content-Type: text/plain; charset="UTF-8"

On Tue, Jan 28, 2020 at 8:59 AM Ingo Molnar wrote:
>
>  - Add support for "Fast Short Rep Mov", which is available starting with
>    Ice Lake Intel CPUs - and make the x86 assembly version of memmove()
>    use REP MOV for all sizes when FSRM is available.

Pulled. However, this seems rather non-optimal:

        ALTERNATIVE "cmp $0x20, %rdx; jb 1f", "", X86_FEATURE_FSRM
        ALTERNATIVE "", "movq %rdx, %rcx; rep movsb; retq", X86_FEATURE_ERMS

in that it leaves unnecessary NOPs there as alternatives.

We have "ALTERNATIVE_2", so we can do the above in one thing, _and_
move the rep-movsq testing code into there too:

        ALTERNATIVE_2 \
                "cmp  $680, %rdx ; jb 3f ; cmpb %dil, %sil; je 4f", \
                "movq %rdx, %rcx ; rep movsb; retq", X86_FEATURE_FSRM, \
                "cmp $0x20, %rdx; jb 1f; movq %rdx, %rcx; rep movsb; retq", X86_FEATURE_ERMS

which avoids unnecessary nops.

I dunno. It doesn't much matter, but we _do_ have things to do for all
three cases, and it actually makes sense to move all the three "use
rep movs" cases into the ALTERNATIVE. No?

UNTESTED patch attached, but visually it seems to generate better code
and fewer unnecessary nops (I get just two bytes of nop with this for
the non-FSRM/ERMS case).

If somebody tests this out and commits it and writes a commit message,
they can claim authorship. Or add my sign-off.

               Linus

--00000000000055d6b2059d388f19
Content-Type: text/x-patch; charset="US-ASCII"; name="patch.diff"
Content-Disposition: attachment; filename="patch.diff"
Content-Transfer-Encoding: 7bit
X-Attachment-Id: f_k5yap2xj0

diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
index 7ff00ea64e4f..e42bf35b9b62 100644
--- a/arch/x86/lib/memmove_64.S
+++ b/arch/x86/lib/memmove_64.S
@@ -39,23 +39,19 @@ SYM_FUNC_START(__memmove)
 	cmp %rdi, %r8
 	jg 2f
 
-	/* FSRM implies ERMS => no length checks, do the copy directly */
+	/*
+	 * Three rep-string alternatives:
+	 *  - go to "movsq" for large regions where source and dest are
+	 *    mutually aligned (same in low 8 bits). "label 4"
+	 *  - plain rep-movsb for FSRM
+	 *  - rep-movs for > 32 byte for ERMS.
+	 */
 .Lmemmove_begin_forward:
-	ALTERNATIVE "cmp $0x20, %rdx; jb 1f", "", X86_FEATURE_FSRM
-	ALTERNATIVE "", "movq %rdx, %rcx; rep movsb; retq", X86_FEATURE_ERMS
+	ALTERNATIVE_2 \
+		"cmp  $680, %rdx ; jb 3f ; cmpb %dil, %sil; je 4f", \
+		"movq %rdx, %rcx ; rep movsb; retq", X86_FEATURE_FSRM, \
+		"cmp $0x20, %rdx; jb 1f; movq %rdx, %rcx; rep movsb; retq", X86_FEATURE_ERMS
 
-	/*
-	 * movsq instruction have many startup latency
-	 * so we handle small size by general register.
-	 */
-	cmp  $680, %rdx
-	jb	3f
-	/*
-	 * movsq instruction is only good for aligned case.
-	 */
-
-	cmpb %dil, %sil
-	je 4f
 3:
 	sub $0x20, %rdx
 	/*
--00000000000055d6b2059d388f19--
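
For readers who find the three-way split easier to follow outside of
assembler macros, here is a rough C model of the forward-copy dispatch
that the ALTERNATIVE_2 sequence in the message above describes. It is only
a sketch: the enum, the function name pick_forward_path() and its
parameters are hypothetical and do not exist in the kernel, and it assumes
that the FSRM variant is the one intended to be used whenever FSRM is
present. It makes no claim about how the kernel's alternatives patching
resolves the case where both feature bits are set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the outcomes of the three instruction sequences. */
enum fwd_path {
	PATH_LABEL_1,	/* "jb 1f": branch taken for copies below 0x20 bytes */
	PATH_LABEL_3,	/* "jb 3f" or fallthrough to "3:" */
	PATH_LABEL_4,	/* "je 4f": the "movsq" case named in the patch comment */
	PATH_REP_MOVSB	/* "movq %rdx, %rcx; rep movsb; retq" */
};

/*
 * Model of the intended selection, not of the patching mechanism.
 * Assumption: FSRM is checked first because the message describes it
 * as the "no length checks" case.
 */
static enum fwd_path pick_forward_path(bool has_fsrm, bool has_erms,
				       uintptr_t dst, uintptr_t src,
				       size_t len)
{
	if (has_fsrm)
		return PATH_REP_MOVSB;		/* all sizes */

	if (has_erms)				/* cmp $0x20, %rdx; jb 1f */
		return len < 0x20 ? PATH_LABEL_1 : PATH_REP_MOVSB;

	if (len < 680)				/* cmp $680, %rdx; jb 3f */
		return PATH_LABEL_3;

	if ((uint8_t)dst == (uint8_t)src)	/* cmpb %dil, %sil; je 4f */
		return PATH_LABEL_4;

	return PATH_LABEL_3;			/* falls through to "3:" */
}
```

Under these assumptions, a 680-byte copy on a CPU with neither feature
reaches label 4 only when the low 8 bits of both pointers match, and
reaches label 3 otherwise.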