From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934184AbcLIRc0 (ORCPT ); Fri, 9 Dec 2016 12:32:26 -0500
Received: from mail-db5eur01on0053.outbound.protection.outlook.com ([104.47.2.53]:8687
	"EHLO EUR01-DB5-obe.outbound.protection.outlook.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S932353AbcLIRcY (ORCPT ); Fri, 9 Dec 2016 12:32:24 -0500
Authentication-Results: spf=none (sender IP is )
	smtp.mailfrom=cmetcalf@mellanox.com;
Subject: Re: [patch 5/6] [RFD] timekeeping: Provide optional 128bit math
To: Peter Zijlstra, Thomas Gleixner
References: <20161208202623.883855034@linutronix.de>
	<20161208204229.005418487@linutronix.de>
	<20161209052638.GC3061@worktop.programming.kicks-ass.net>
	<20161209063847.GC15765@worktop.programming.kicks-ass.net>
	<20161209083011.GD15765@worktop.programming.kicks-ass.net>
CC: LKML, John Stultz, Ingo Molnar, David Gibson, Liav Rehana,
	Richard Cochran, Parit Bhargava, Laurent Vivier,
	"Christopher S. Hall"
From: Chris Metcalf
Message-ID: <486a28a2-b7de-67fd-f731-1487b141319b@mellanox.com>
Date: Fri, 9 Dec 2016 12:32:07 -0500
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101
	Thunderbird/45.5.1
MIME-Version: 1.0
In-Reply-To: <20161209083011.GD15765@worktop.programming.kicks-ass.net>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
X-Originating-IP: [12.216.194.146]
X-ClientProxiedBy: DM3PR14CA0035.namprd14.prod.outlook.com (10.164.193.173)
	To VI1PR0501MB2767.eurprd05.prod.outlook.com (10.172.11.17)
X-OriginatorOrg: Mellanox.com
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/9/2016 3:30 AM, Peter Zijlstra wrote:
> On Fri, Dec 09, 2016 at 07:38:47AM +0100, Peter Zijlstra wrote:
>> On Fri, Dec 09, 2016 at 06:26:38AM +0100, Peter Zijlstra wrote:
>>> Just for giggles, on tilegx the branch is actually slower than doing the
>>> mult unconditionally.
>>>
>>> The problem is that the two multiplies would otherwise completely
>>> pipeline, whereas with the conditional you serialize them.
>> On my Haswell laptop the unconditional version is faster too.
> Only when using x86_64 instructions, once I fixed the i386 variant it
> was slower, probably due to register pressure and the like.
>
>>> (came to light while talking about why the mul_u64_u32_shr() fallback
>>> didn't work right for them, which was a combination of the above issue
>>> and the fact that their compiler 'lost' the fact that these are
>>> 32x32->64 mults and did 64x64 ones instead).
>> Turns out using GCC-6.2.1 we have the same problem on i386, GCC doesn't
>> recognise the 32x32 mults and generates crap.
>>
>> This used to work :/
> Do we want something like so?
>
> ---
>  arch/tile/include/asm/Kbuild  |  1 -
>  arch/tile/include/asm/div64.h | 14 ++++++++++++++
>  arch/x86/include/asm/div64.h  | 10 ++++++++++
>  include/linux/math64.h        | 26 ++++++++++++++++++--------
>  4 files changed, 42 insertions(+), 9 deletions(-)

Untested, but I looked at it closely, and it seems like a decent idea.

Acked-by: Chris Metcalf [for tile]

Of course if this is pushed up, it will then probably be too tempting
for me not to add the tilegx-specific mul_u64_u32_shr() to take
advantage of pipelining the two 32x32->64 multiplies :-)

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com
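
For readers following the branch-versus-unconditional discussion quoted
above, here is a minimal userspace sketch of the two shapes a
mul_u64_u32_shr()-style helper can take. It is not the patch under
review and not the kernel source: the function names
mul_u64_u32_shr_branch() and mul_u64_u32_shr_nobranch() are
hypothetical, the types come from <stdint.h> rather than the kernel's
u64/u32, and the sketch assumes shift <= 32 and a result that fits in
64 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Branching form (hypothetical name): the second multiply is skipped
 * when the high 32 bits of 'a' are zero, so it depends on the test.
 * Assumption: shift <= 32, so (32 - shift) is a valid shift count.
 */
uint64_t mul_u64_u32_shr_branch(uint64_t a, uint32_t mul, unsigned int shift)
{
	uint32_t al = (uint32_t)a;         /* low half of a */
	uint32_t ah = (uint32_t)(a >> 32); /* high half of a */
	uint64_t ret;

	assert(shift <= 32);

	/*
	 * Both operands are 32-bit values widened to 64 bits, so a
	 * compiler may emit a single 32x32->64 multiply here.
	 */
	ret = ((uint64_t)al * mul) >> shift;
	if (ah)
		ret += ((uint64_t)ah * mul) << (32 - shift);

	return ret;
}

/*
 * Unconditional form (hypothetical name): both multiplies are issued
 * with no branch between them, so they are independent of each other.
 */
uint64_t mul_u64_u32_shr_nobranch(uint64_t a, uint32_t mul, unsigned int shift)
{
	uint32_t al = (uint32_t)a;
	uint32_t ah = (uint32_t)(a >> 32);
	uint64_t lo, hi;

	assert(shift <= 32);

	lo = (uint64_t)al * mul; /* 32x32->64 */
	hi = (uint64_t)ah * mul; /* 32x32->64 */

	return (lo >> shift) + (hi << (32 - shift));
}
```

Both forms return the same value for any input within the stated
assumptions; they differ only in whether the second multiply waits on
the test of the high half. Which one is faster is machine- and
compiler-dependent, as the quoted measurements suggest, so any choice
between them would need to be benchmarked on the target.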