From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [RESEND PATCH] arm64: v8.4: Support for new floating point multiplication variant
To: Dave Martin, gengdongjiu
Cc: Mark Rutland, guohanjun@huawei.com, linux-doc@vger.kernel.org, Catalin Marinas, corbet@lwn.net, Will Deacon, linux-kernel@vger.kernel.org, linuxarm@huawei.com, zhihui.gao@huawei.com, huangshaoyu@huawei.com, gregkh@linuxfoundation.org, arvind.yadav.cs@gmail.com, Robin Murphy, linux-arm-kernel@lists.infradead.org, zhanghaibin7@huawei.com, nd@arm.com
References: <1512833322-35503-1-git-send-email-gengdongjiu@huawei.com> <20171211115947.GS12608@e103592.cambridge.arm.com> <4c6d83f1-e8f3-46d7-f3cd-af2db77e3a9c@huawei.com> <20171211132914.GJ22781@e103592.cambridge.arm.com>
From: Suzuki K Poulose
Message-ID: <8ebb6a36-0e09-1ac1-7785-cc9d4e4147fb@arm.com>
Date: Mon, 11 Dec 2017 18:58:31 +0000
In-Reply-To: <20171211132914.GJ22781@e103592.cambridge.arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi gengdongjiu,

Sorry for the late response. I have a similar patch to add support for "FHM", which I was about to post this week.

On 11/12/17 13:29, Dave Martin wrote:
> On Mon, Dec 11, 2017 at 08:47:00PM +0800, gengdongjiu wrote:
>>
>> On 2017/12/11 19:59, Dave P Martin wrote:
>>> On Sat, Dec 09, 2017 at 03:28:42PM +0000, Dongjiu Geng wrote:
>>>> ARM v8.4 extensions include support for new floating point
>>>> multiplication variant instructions to the AArch64 SIMD
>>>
>>> Do we have any human-readable description of what the new instructions
>>> do?
>>>
>>> Since the v8.4 spec itself only describes these as "New Floating
>>> Point Multiplication Variant", I wonder what "FHM" actually stands
>>> for.
>> Thanks for the point out.
>> In fact, this feature only adds two instructions:
>> FP16 * FP16 + FP32
>> FP16 * FP16 - FP32
>>
>> The spec call this bit to ID_AA64ISAR0_EL1.FHM, I do not know why it
>> will call "FHM", I think call it "FMLXL" may be better, which can
>> stand for FMLAL/FMLSL instructions.
>
> Although "FHM" is cryptic, I think it makes sense to keep this as "FHM"
> to match the ISAR0 field name -- we've tended to follow this policy
> for other extension names unless there's a much better or more obvious
> name available.
>
> For "FMLXL", new instructions might be added in the future that match
> the same pattern, and then "FMLXL" could become ambiguous. So maybe
> this is not the best choice.

I think FHM stands for "FP Half precision Multiplication instructions".
I vote for keeping the feature bit in sync with the register bit
definition, i.e., FHM. However, my version of the patch names the HWCAP
bit "asimdfml", following the compiler name for the feature option,
"fp16fml", which is not perfect either. I think FHM is the safe option here.

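To make the naming question a bit more concrete, here is a rough sketch of how userspace might look for the feature string on the "Features" line of /proc/cpuinfo. Everything here is an assumption for illustration only: the helper names are made up, the sample text is invented, and the flag name passed in ("asimdfhm") is just a placeholder for whatever name we end up agreeing on.

```python
def cpu_features(cpuinfo_text):
    """Collect the flags listed on any 'Features' line of cpuinfo-style text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only lines of the form "Features : flag flag ..." are of interest.
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def has_feature(cpuinfo_text, name):
    """Return True if the given flag name appears among the CPU features."""
    return name in cpu_features(cpuinfo_text)


# Invented sample text, not captured from real hardware.
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "Features\t: fp asimd evtstrm fphp asimdhp asimdfhm\n"
)

if __name__ == "__main__":
    # "asimdfhm" is a placeholder flag name for this sketch.
    print(has_feature(SAMPLE_CPUINFO, "asimdfhm"))
```

On a real system the text would come from reading /proc/cpuinfo; the point is only that whatever string we pick becomes something userspace matches on, so it is worth choosing a name we will not regret.
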
>
>>> Maybe something like "widening half-precision floating-point multiply
>>> accumulate" is acceptable wording consistent with the existing
>>> architecture, but I just made that up, so it's not official ;)
>>
>> how about something like "performing a multiplication of each FP16
>> element of one vector with the corresponding FP16 element of a second
>> vector, and to add or subtract this without an intermediate rounding
>> to the corresponding FP32 element in a third vector."?
>
> We could have that, I guess.
>

I agree, and that matches the feature description.
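
As a rough illustration of that description, the per-lane behaviour can be modelled in a few lines of Python. This is only a sketch under stated assumptions, not the architectural pseudocode: the function names are made up, lanes are plain Python lists, and NaN/infinity handling, FPCR rounding modes and rare double-rounding corner cases are ignored. It relies on the product of two FP16 values being exactly representable in a Python float (a double).

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fml_model(acc, a, b, subtract=False):
    """Hypothetical per-lane model of the FMLAL/FMLSL-style operation.

    Each FP16 element of `a` is multiplied by the matching FP16 element of
    `b`; the product is kept unrounded (it is exact in a double) and is
    added to, or subtracted from, the FP32 element of `acc`. Only the final
    result is rounded to FP32.
    """
    result = []
    for c, x, y in zip(acc, a, b):
        product = to_fp16(x) * to_fp16(y)  # no intermediate rounding
        if subtract:
            product = -product
        result.append(to_fp32(to_fp32(c) + product))
    return result


if __name__ == "__main__":
    x = 1 + 2**-10  # exactly representable in FP16
    print(fml_model([0.0], [x], [x]))  # widening multiply-accumulate
    print(to_fp16(x * x))              # product rounded to FP16 instead
```

With x = 1 + 2^-10 the exact product is 1 + 2^-9 + 2^-20, which fits in FP32 but not in FP16, so the two printed values differ. That difference is what "without an intermediate rounding" buys.
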