From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751871AbdLLBon (ORCPT <rfc822;w@1wt.eu>);
        Mon, 11 Dec 2017 20:44:43 -0500
Received: from szxga04-in.huawei.com ([45.249.212.190]:11930 "EHLO
        szxga04-in.huawei.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751281AbdLLBoj (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Mon, 11 Dec 2017 20:44:39 -0500
Subject: Re: [RESEND PATCH] arm64: v8.4: Support for new floating point
 multiplication variant
To: Dave Martin <Dave.Martin@arm.com>
CC: Mark Rutland <Mark.Rutland@arm.com>,
        "guohanjun@huawei.com" <guohanjun@huawei.com>,
        "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
        Suzuki Poulose <Suzuki.Poulose@arm.com>,
        "Catalin Marinas" <Catalin.Marinas@arm.com>,
        "corbet@lwn.net" <corbet@lwn.net>, "Will Deacon" <Will.Deacon@arm.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "linuxarm@huawei.com" <linuxarm@huawei.com>,
        "zhihui.gao@huawei.com" <zhihui.gao@huawei.com>,
        "huangshaoyu@huawei.com" <huangshaoyu@huawei.com>,
        "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
        "arvind.yadav.cs@gmail.com" <arvind.yadav.cs@gmail.com>,
        Robin Murphy <Robin.Murphy@arm.com>,
        "linux-arm-kernel@lists.infradead.org" 
        <linux-arm-kernel@lists.infradead.org>,
        "zhanghaibin7@huawei.com" <zhanghaibin7@huawei.com>
References: <1512833322-35503-1-git-send-email-gengdongjiu@huawei.com>
 <20171211115947.GS12608@e103592.cambridge.arm.com>
 <4c6d83f1-e8f3-46d7-f3cd-af2db77e3a9c@huawei.com>
 <20171211132914.GJ22781@e103592.cambridge.arm.com>
From: gengdongjiu <gengdongjiu@huawei.com>
Message-ID: <d39500ed-0d17-a8b2-b7e2-9e48b5d8d43a@huawei.com>
Date: Tue, 12 Dec 2017 09:44:06 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.3.0
MIME-Version: 1.0
In-Reply-To: <20171211132914.GJ22781@e103592.cambridge.arm.com>
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-Originating-IP: [10.142.68.147]
X-CFilter-Loop: Reflected
X-Mirapoint-Virus-RAPID-Raw: score=unknown(0),
        refid=str=0001.0A020204.5A2F347B.007D,ss=1,re=0.000,recu=0.000,reip=0.000,cl=1,cld=1,fgs=0,
        ip=0.0.0.0,
        so=2014-11-16 11:51:01,
        dmn=2013-03-21 17:37:32
X-Mirapoint-Loop-Id: 0a02611e0fe5f3bcf0f3fbfbb8be2694
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org


On 2017/12/11 21:29, Dave Martin wrote:
>> Thanks for pointing that out.
>> In fact, this feature only adds two instructions:
>> FP16 * FP16 + FP32
>> FP16 * FP16 - FP32
>>
>> The spec calls this field ID_AA64ISAR0_EL1.FHM. I do not know why it
>> is named "FHM"; I think "FMLXL" may be a better name, since it can
>> stand for the FMLAL/FMLSL instructions.
> Although "FHM" is cryptic, I think it makes sense to keep this as "FHM"
> to match the ISAR0 field name -- we've tended to follow this policy
> for other extension names unless there's a much better or more obvious
> name available.
Agreed, I also think "FHM" is the better name.

> 
> For "FMLXL", new instructions might be added in the future that match
> the same pattern, and then "FMLXL" could become ambiguous.  So maybe
> this is not the best choice.
Ok.

> 
>>> Maybe something like "widening half-precision floating-point multiply
>>> accumulate" is acceptable wording consistent with the existing
>>> architecture, but I just made that up, so it's not official ;)
>> How about something like "multiplying each FP16 element of one vector
>> by the corresponding FP16 element of a second vector, and adding the
>> product to, or subtracting it from, the corresponding FP32 element in
>> a third vector, without an intermediate rounding."?
> We could have that, I guess.
Ok, thanks!
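
To make the wording above concrete, here is a rough scalar model in C of
what "FP16 * FP16 +/- FP32 without an intermediate rounding" means. This
is only a sketch: half_to_float() and fmlal_model() are names I made up
for illustration, and the model ignores the FPCR controls and how the
real instructions select vector lanes, so it is not the architectural
pseudocode. It relies on the fact that the product of two FP16 values is
exactly representable in single precision, so the only rounding happens
in the final add or subtract.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode an IEEE 754 binary16 bit pattern to float.
 * The conversion is exact, since every FP16 value is representable in FP32. */
static float half_to_float(uint16_t h)
{
    uint32_t sign = (uint32_t)(h >> 15) << 31;
    uint32_t exp = (h >> 10) & 0x1f;
    uint32_t frac = h & 0x3ff;
    uint32_t bits;
    float f;

    if (exp == 0x1f) {
        /* Infinity or NaN: keep the payload in the top fraction bits. */
        bits = sign | 0x7f800000u | (frac << 13);
    } else if (exp != 0) {
        /* Normal number: rebias the exponent from 15 to 127. */
        bits = sign | ((exp + 112) << 23) | (frac << 13);
    } else if (frac == 0) {
        /* Signed zero. */
        bits = sign;
    } else {
        /* FP16 subnormal: normalise it, it is a normal number in FP32. */
        int shift = 0;

        while (!(frac & 0x400)) {
            frac <<= 1;
            shift++;
        }
        frac &= 0x3ff;
        bits = sign | ((uint32_t)(113 - shift) << 23) | (frac << 13);
    }
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* Hypothetical model: acc[i] = acc[i] +/- a[i] * b[i], with a and b given
 * as FP16 bit patterns and acc as FP32. The FP16 * FP16 product is exact
 * in float, so the result is rounded only once, in the add or subtract. */
static void fmlal_model(float *acc, const uint16_t *a, const uint16_t *b,
                        int n, int subtract)
{
    for (int i = 0; i < n; i++) {
        float prod = half_to_float(a[i]) * half_to_float(b[i]);

        acc[i] = subtract ? acc[i] - prod : acc[i] + prod;
    }
}
```

Calling fmlal_model() with subtract == 0 corresponds to the "+" variant
and with subtract != 0 to the "-" variant listed earlier.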

> 
>>>> instruction set. Let userspace know about it via a
>>>> HWCAP bit and MRS emulation.
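
For the MRS emulation side, a userspace check could look roughly like
the sketch below. The helper names are mine, not from the patch. I am
assuming the FHM field sits in bits [51:48] of ID_AA64ISAR0_EL1 and that
a non-zero value means FMLAL/FMLSL are implemented; please check this
against the spec. The mrs read is only meaningful on an arm64 Linux
kernel that emulates ID register reads from EL0, so it is guarded, and
the decoding is kept as a pure function that can be exercised anywhere.

```c
#include <stdint.h>

/* Assumed field position: ID_AA64ISAR0_EL1.FHM in bits [51:48]. */
#define ISAR0_FHM_SHIFT 48
#define ISAR0_FHM_MASK  0xfULL

/* Hypothetical helper: extract the 4-bit FHM field from a register value. */
static unsigned int isar0_fhm_field(uint64_t isar0)
{
    return (unsigned int)((isar0 >> ISAR0_FHM_SHIFT) & ISAR0_FHM_MASK);
}

/* Hypothetical helper: non-zero FHM is taken to mean FMLAL/FMLSL exist. */
static int cpu_has_fhm(uint64_t isar0)
{
    return isar0_fhm_field(isar0) >= 1;
}

#if defined(__aarch64__) && defined(__linux__)
/* Read the ID register from userspace. This traps to the kernel, which
 * emulates the access; callers should first confirm that the kernel
 * advertises this emulation before relying on it. */
static uint64_t read_id_aa64isar0(void)
{
    uint64_t val;

    __asm__ volatile("mrs %0, ID_AA64ISAR0_EL1" : "=r"(val));
    return val;
}
#endif
```

On arm64, cpu_has_fhm(read_id_aa64isar0()) would then give the same
answer as testing the new HWCAP bit, which is still the simpler check
for most applications.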