From: David Laight
To: "'Juergen Gross'", Ingo Molnar, "Eric Biggers"
CC: "linux-crypto@vger.kernel.org", Herbert Xu, "David S . Miller", "Josh Poimboeuf", Jussi Kivilinna, "x86@kernel.org", "linux-kernel@vger.kernel.org", "syzkaller-bugs@googlegroups.com", Eric Biggers, "Peter Zijlstra"
Subject: RE: [PATCH] crypto: x86/twofish-3way - Fix %rbp usage
Date: Tue, 19 Dec 2017 14:37:07 +0000
In-Reply-To: <44b42058-c465-4d1e-7710-198754efabe4@suse.com>

From: Juergen Gross
> Sent: 19 December 2017 08:05
..
>
> Exchanging 2 registers can be done without memory access via:
>
> xor reg1, reg2
> xor reg2, reg1
> xor reg1, reg2

That'll generate horrid data dependencies.
ISTR that there are some optimisations for the stack, so even
'push reg1', 'mov reg2,reg1', 'pop reg2' might be faster than the above.

	David
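
For illustration only (this is not from the thread or the patch): a minimal C sketch of the two shapes being compared. The helper names xor_swap and tmp_swap are hypothetical. In the XOR form each statement consumes the result of the previous one, which is the dependency chain referred to above. The temporary-based form is only a rough analogue of the push/mov/pop sequence. Which one is faster on a given CPU is not something this sketch measures.

```c
#include <stdint.h>

/* Hypothetical helper: XOR swap. Every statement reads the value written
 * by the statement before it, so the three operations are strictly serial. */
static void xor_swap(uint64_t *a, uint64_t *b)
{
	if (a == b)	/* XOR-swapping a location with itself would zero it */
		return;
	*a ^= *b;
	*b ^= *a;
	*a ^= *b;
}

/* Hypothetical helper: swap through a temporary, loosely analogous to
 * 'push reg1', 'mov reg2,reg1', 'pop reg2'. Saving *a and reading *b do
 * not depend on each other. */
static void tmp_swap(uint64_t *a, uint64_t *b)
{
	uint64_t tmp = *a;

	*a = *b;
	*b = tmp;
}
```

Both helpers leave the two values exchanged. The difference is only in how the operations depend on one another, and a compiler is free to turn either into different instructions than the ones discussed above.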