* Re: [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions
2017-12-13 10:13 ` Dongjiu Geng
@ 2017-12-13 10:09 ` Suzuki K Poulose
-1 siblings, 0 replies; 16+ messages in thread
From: Suzuki K Poulose @ 2017-12-13 10:09 UTC (permalink / raw)
To: Dongjiu Geng, catalin.marinas, will.deacon, corbet, mark.rutland,
Dave.Martin, robin.murphy, gregkh, arvind.yadav.cs,
linux-arm-kernel, linux-doc, linux-kernel, linuxarm
Cc: huangshaoyu, guohanjun, zhanghaibin7, zhihui.gao
On 13/12/17 10:13, Dongjiu Geng wrote:
> ARM v8.4 extensions add new neon instructions for performing a
> multiplication of each FP16 element of one vector with the corresponding
> FP16 element of a second vector, and to add or subtract this without an
> intermediate rounding to the corresponding FP32 element in a third vector.
>
> This patch detects this feature and let the userspace know about it via a
> HWCAP bit and MRS emulation.
>
> Cc: Dave Martin <Dave.Martin@arm.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Looks good to me.
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
^ permalink raw reply [flat|nested] 16+ messages in thread
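The quoted description can be made concrete with a scalar model. This is only a sketch under stated assumptions, not code from the patch: the function name `fp16_mul_acc_fp32` is hypothetical, the FP16 inputs are held in `float` variables for portability, and float arithmetic is assumed to be evaluated in single precision. The Arm architecture documentation calls the instructions FMLAL and FMLSL; the thread itself does not name them. The product of two FP16 values is exactly representable in FP32 (22 significand bits fit in 24), so the only rounding in the model is the final add or subtract, which matches "without an intermediate rounding".

```c
#include <stddef.h>

/*
 * Scalar model of the widening multiply-accumulate described above:
 *   acc[i] = acc[i] + a[i] * b[i]   (subtract == 0)
 *   acc[i] = acc[i] - a[i] * b[i]   (subtract != 0)
 *
 * Assumption: every a[i] and b[i] holds a value that is exactly
 * representable in FP16. Under that assumption the float product is
 * exact, so the result is rounded only once, in the final add/subtract.
 */
void fp16_mul_acc_fp32(float *acc, const float *a, const float *b,
                       size_t n, int subtract)
{
    for (size_t i = 0; i < n; i++) {
        float prod = a[i] * b[i]; /* exact for FP16-representable inputs */

        acc[i] = subtract ? acc[i] - prod : acc[i] + prod;
    }
}
```

A real implementation would use the hardware instructions through compiler intrinsics or inline assembly, guarded by the HWCAP check the patch introduces.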
* [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions
@ 2017-12-13 10:13 ` Dongjiu Geng
0 siblings, 0 replies; 16+ messages in thread
From: Dongjiu Geng @ 2017-12-13 10:13 UTC (permalink / raw)
To: catalin.marinas, will.deacon, corbet, mark.rutland,
suzuki.poulose, Dave.Martin, robin.murphy, gregkh,
arvind.yadav.cs, linux-arm-kernel, linux-doc, linux-kernel,
linuxarm
Cc: huangshaoyu, guohanjun, zhanghaibin7, zhihui.gao, gengdongjiu
The ARMv8.4 extensions add new NEON instructions that multiply each FP16
element of one vector by the corresponding FP16 element of a second vector,
and add the product to, or subtract it from, the corresponding FP32 element
of a third vector without intermediate rounding.
This patch detects the feature and lets userspace know about it via a
HWCAP bit and MRS emulation.
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
---
Changes since v2:
1. Rename HWCAP_FHM to HWCAP_ASIMDFHM.
Changes since v1:
1. Update the commit message to address Dave's and Suzuki's comments.
2. Update Documentation/arm64/elf_hwcaps.txt to address Dave's comments.
---
Documentation/arm64/cpu-feature-registers.txt | 4 +++-
Documentation/arm64/elf_hwcaps.txt | 4 ++++
arch/arm64/include/asm/sysreg.h | 1 +
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/kernel/cpufeature.c | 2 ++
arch/arm64/kernel/cpuinfo.c | 1 +
6 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/Documentation/arm64/cpu-feature-registers.txt b/Documentation/arm64/cpu-feature-registers.txt
index bd9b3fa..a70090b 100644
--- a/Documentation/arm64/cpu-feature-registers.txt
+++ b/Documentation/arm64/cpu-feature-registers.txt
@@ -110,7 +110,9 @@ infrastructure:
x--------------------------------------------------x
| Name | bits | visible |
|--------------------------------------------------|
- | RES0 | [63-48] | n |
+ | RES0 | [63-52] | n |
+ |--------------------------------------------------|
+ | FHM | [51-48] | y |
|--------------------------------------------------|
| DP | [47-44] | y |
|--------------------------------------------------|
diff --git a/Documentation/arm64/elf_hwcaps.txt b/Documentation/arm64/elf_hwcaps.txt
index 89edba1..57324ee 100644
--- a/Documentation/arm64/elf_hwcaps.txt
+++ b/Documentation/arm64/elf_hwcaps.txt
@@ -158,3 +158,7 @@ HWCAP_SHA512
HWCAP_SVE
Functionality implied by ID_AA64PFR0_EL1.SVE == 0b0001.
+
+HWCAP_ASIMDFHM
+
+ Functionality implied by ID_AA64ISAR0_EL1.FHM == 0b0001.
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 08cc885..1818077 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -419,6 +419,7 @@
#define SCTLR_EL1_CP15BEN (1 << 5)
/* id_aa64isar0 */
+#define ID_AA64ISAR0_FHM_SHIFT 48
#define ID_AA64ISAR0_DP_SHIFT 44
#define ID_AA64ISAR0_SM4_SHIFT 40
#define ID_AA64ISAR0_SM3_SHIFT 36
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index cda76fa..f018c3d 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -43,5 +43,6 @@
#define HWCAP_ASIMDDP (1 << 20)
#define HWCAP_SHA512 (1 << 21)
#define HWCAP_SVE (1 << 22)
+#define HWCAP_ASIMDFHM (1 << 23)
#endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index c5ba009..bc7e707 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -123,6 +123,7 @@ static int __init register_cpu_hwcaps_dumper(void)
* sync with the documentation of the CPU feature register ABI.
*/
static const struct arm64_ftr_bits ftr_id_aa64isar0[] = {
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_FHM_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_DP_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SM4_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SM3_SHIFT, 4, 0),
@@ -991,6 +992,7 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM3_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_SM3),
HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM4_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_SM4),
HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_DP_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_ASIMDDP),
+ HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_FHM_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_ASIMDFHM),
HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, FTR_SIGNED, 0, CAP_HWCAP, HWCAP_FP),
HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, HWCAP_FPHP),
HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, FTR_SIGNED, 0, CAP_HWCAP, HWCAP_ASIMD),
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 1e25545..7f94623 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -76,6 +76,7 @@
"asimddp",
"sha512",
"sve",
+ "asimdfhm",
NULL
};
--
1.9.1
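From userspace, the new HWCAP bit would typically be tested through the auxiliary vector. The following is a minimal sketch, assuming a Linux system with glibc's `getauxval()`; the helper names `hwcap_has_asimdfhm` and `cpu_has_asimdfhm` are hypothetical, and the fallback `#define` only mirrors the value added to `hwcap.h` in the patch above for toolchains whose headers predate it.

```c
#include <stdbool.h>

/*
 * Assumption: fallback for headers that do not yet define the bit.
 * The value matches the HWCAP_ASIMDFHM definition in the patch.
 */
#ifndef HWCAP_ASIMDFHM
#define HWCAP_ASIMDFHM (1UL << 23)
#endif

/* Hypothetical helper: test the bit in an AT_HWCAP word. */
bool hwcap_has_asimdfhm(unsigned long hwcap)
{
    return (hwcap & HWCAP_ASIMDFHM) != 0;
}

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>

/*
 * Query the running kernel. Only meaningful on arm64 Linux, because
 * bit 23 of AT_HWCAP means something else on other architectures.
 */
bool cpu_has_asimdfhm(void)
{
    return hwcap_has_asimdfhm(getauxval(AT_HWCAP));
}
#endif
```

Keeping the bit test separate from the `getauxval()` call lets the check be exercised on any host, while the architecture guard prevents a false positive elsewhere.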
^ permalink raw reply related [flat|nested] 16+ messages in thread
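The patch also marks the FHM field, bits [51-48] of ID_AA64ISAR0_EL1, as visible, which is what "MRS emulation" refers to: a userspace `mrs` read of the register traps and the kernel returns the visible fields. A hedged sketch of decoding that field follows; the helper names `isar0_fhm_field` and `read_id_aa64isar0` are hypothetical, and the shift is copied from `ID_AA64ISAR0_FHM_SHIFT` in the patch.

```c
#include <stdint.h>

/* Taken from the patch: the FHM field occupies bits [51:48]. */
#define ISAR0_FHM_SHIFT  48    /* mirrors ID_AA64ISAR0_FHM_SHIFT */
#define ISAR0_FIELD_MASK 0xfULL

/* Hypothetical helper: extract the unsigned 4-bit FHM field. */
unsigned int isar0_fhm_field(uint64_t isar0)
{
    return (unsigned int)((isar0 >> ISAR0_FHM_SHIFT) & ISAR0_FIELD_MASK);
}

#if defined(__linux__) && defined(__aarch64__)
/*
 * Read the register from EL0. The access traps and is emulated by the
 * kernel. Callers should first confirm that HWCAP_CPUID is set in
 * AT_HWCAP, since older kernels do not emulate the access.
 */
uint64_t read_id_aa64isar0(void)
{
    uint64_t val;

    __asm__ volatile("mrs %0, ID_AA64ISAR0_EL1" : "=r"(val));
    return val;
}
#endif
```

Per the documentation hunk in the patch, a field value of 0b0001 implies the functionality advertised by HWCAP_ASIMDFHM.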
* Re: [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions
2017-12-13 10:09 ` Suzuki K Poulose
@ 2017-12-13 10:32 ` gengdongjiu
-1 siblings, 0 replies; 16+ messages in thread
From: gengdongjiu @ 2017-12-13 10:32 UTC (permalink / raw)
To: Suzuki K Poulose, catalin.marinas, will.deacon, corbet,
mark.rutland, Dave.Martin, robin.murphy, gregkh, arvind.yadav.cs,
linux-arm-kernel, linux-doc, linux-kernel, linuxarm
Cc: huangshaoyu, guohanjun, zhanghaibin7, zhihui.gao
On 2017/12/13 18:09, Suzuki K Poulose wrote:
>> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
>
> Looks good to me.
>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Thanks a lot for the review, Suzuki.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions
2017-12-13 10:09 ` Suzuki K Poulose
(?)
(?)
@ 2017-12-16 2:41 ` gengdongjiu
-1 siblings, 0 replies; 16+ messages in thread
From: gengdongjiu @ 2017-12-16 2:41 UTC (permalink / raw)
To: linux-arm-kernel
Hi Catalin/Will,
On 2017/12/13 18:09, Suzuki K Poulose wrote:
> On 13/12/17 10:13, Dongjiu Geng wrote:
>> ARM v8.4 extensions add new neon instructions for performing a
>> multiplication of each FP16 element of one vector with the corresponding
>> FP16 element of a second vector, and to add or subtract this without an
>> intermediate rounding to the corresponding FP32 element in a third vector.
>>
>> This patch detects this feature and let the userspace know about it via a
>> HWCAP bit and MRS emulation.
>>
>> Cc: Dave Martin <Dave.Martin@arm.com>
>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
>
> Looks good to me.
>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>
I hope this patch can be applied for the 4.15 kernel.
Thanks
>
> .
^ permalink raw reply [flat|nested] 16+ messages in thread
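The "detects this feature" step in the quoted commit message rests on two declarations in the patch: the field is registered as `FTR_LOWER_SAFE`, and the HWCAP entry uses an unsigned field with a minimum value of 1. The sketch below is a simplified model of that logic under stated assumptions, not the kernel's implementation; `id_field`, `system_wide_field` and `feature_advertised` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FHM_SHIFT 48U /* ID_AA64ISAR0_FHM_SHIFT in the patch */

/* Extract an unsigned 4-bit ID register field. */
unsigned int id_field(uint64_t reg, unsigned int shift)
{
    return (unsigned int)((reg >> shift) & 0xfU);
}

/*
 * Model of FTR_LOWER_SAFE: when CPUs disagree, the lowest field value
 * seen on any CPU is treated as the safe system-wide value.
 */
unsigned int system_wide_field(const uint64_t *per_cpu_reg, size_t ncpus,
                               unsigned int shift)
{
    unsigned int safe = 0xfU;

    if (ncpus == 0)
        return 0;

    for (size_t i = 0; i < ncpus; i++) {
        unsigned int f = id_field(per_cpu_reg[i], shift);

        if (f < safe)
            safe = f;
    }
    return safe;
}

/* Model of the HWCAP entry: advertise when the field reaches the minimum. */
bool feature_advertised(unsigned int field, unsigned int min_value)
{
    return field >= min_value;
}
```

In this model a single CPU without the instructions lowers the system-wide field to 0, so the HWCAP bit is not advertised to userspace.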
* Re: [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions
2017-12-13 10:09 ` Suzuki K Poulose
@ 2018-01-05 1:22 ` gengdongjiu
-1 siblings, 0 replies; 16+ messages in thread
From: gengdongjiu @ 2018-01-05 1:22 UTC (permalink / raw)
To: Suzuki K Poulose, catalin.marinas, will.deacon, corbet,
mark.rutland, Dave.Martin, robin.murphy, gregkh, arvind.yadav.cs,
linux-arm-kernel, linux-doc, linux-kernel, linuxarm
Cc: huangshaoyu, guohanjun, zhanghaibin7, zhihui.gao, Linuxarm
Hi Will/Catalin,
On 2017/12/13 18:09, Suzuki K Poulose wrote:
> On 13/12/17 10:13, Dongjiu Geng wrote:
>> ARM v8.4 extensions add new neon instructions for performing a
>> multiplication of each FP16 element of one vector with the corresponding
>> FP16 element of a second vector, and to add or subtract this without an
>> intermediate rounding to the corresponding FP32 element in a third vector.
>>
>> This patch detects this feature and let the userspace know about it via a
>> HWCAP bit and MRS emulation.
>>
>> Cc: Dave Martin <Dave.Martin@arm.com>
>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
>
> Looks good to me.
>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Sorry to disturb you. This is a gentle reminder: I hope this patch can be applied for Linux 4.15-rc7.
Thanks a lot in advance.
>
>
> .
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions
2018-01-05 1:22 ` gengdongjiu
@ 2018-01-05 7:57 ` Greg KH
-1 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2018-01-05 7:57 UTC (permalink / raw)
To: gengdongjiu
Cc: Suzuki K Poulose, catalin.marinas, will.deacon, corbet,
mark.rutland, Dave.Martin, robin.murphy, arvind.yadav.cs,
linux-arm-kernel, linux-doc, linux-kernel, linuxarm, huangshaoyu,
guohanjun, zhanghaibin7, zhihui.gao
On Fri, Jan 05, 2018 at 09:22:54AM +0800, gengdongjiu wrote:
> Hi will/catalin
>
> On 2017/12/13 18:09, Suzuki K Poulose wrote:
> > On 13/12/17 10:13, Dongjiu Geng wrote:
> >> ARM v8.4 extensions add new neon instructions for performing a
> >> multiplication of each FP16 element of one vector with the corresponding
> >> FP16 element of a second vector, and to add or subtract this without an
> >> intermediate rounding to the corresponding FP32 element in a third vector.
> >>
> >> This patch detects this feature and let the userspace know about it via a
> >> HWCAP bit and MRS emulation.
> >>
> >> Cc: Dave Martin <Dave.Martin@arm.com>
> >> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> >> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
> >> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
> >
> > Looks good to me.
> >
> > Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>
> sorry to disturb you. Reminder, hope this patch can be applied to Linux 4.15-rc7.
New features should not be going into 4.15-rc, that should be a 4.16-rc1
thing, right?
thanks,
greg k-h
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions
2018-01-05 7:57 ` Greg KH
@ 2018-01-05 8:22 ` gengdongjiu
-1 siblings, 0 replies; 16+ messages in thread
From: gengdongjiu @ 2018-01-05 8:22 UTC (permalink / raw)
To: Greg KH
Cc: Suzuki K Poulose, catalin.marinas, will.deacon, corbet,
mark.rutland, Dave.Martin, robin.murphy, arvind.yadav.cs,
linux-arm-kernel, linux-doc, linux-kernel, linuxarm, huangshaoyu,
guohanjun, zhanghaibin7, zhihui.gao
On 2018/1/5 15:57, Greg KH wrote:
> On Fri, Jan 05, 2018 at 09:22:54AM +0800, gengdongjiu wrote:
>> Hi will/catalin
>>
>> On 2017/12/13 18:09, Suzuki K Poulose wrote:
>>> On 13/12/17 10:13, Dongjiu Geng wrote:
>>>> ARM v8.4 extensions add new neon instructions for performing a
>>>> multiplication of each FP16 element of one vector with the corresponding
>>>> FP16 element of a second vector, and to add or subtract this without an
>>>> intermediate rounding to the corresponding FP32 element in a third vector.
>>>>
>>>> This patch detects this feature and let the userspace know about it via a
>>>> HWCAP bit and MRS emulation.
>>>>
>>>> Cc: Dave Martin <Dave.Martin@arm.com>
>>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
>>>> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
>>>
>>> Looks good to me.
>>>
>>> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>
>> sorry to disturb you. Reminder, hope this patch can be applied to Linux 4.15-rc7.
>
> New features should not be going into 4.15-rc, that should be a 4.16-rc1
> thing, right?
It would also be great if it could be applied for 4.16-rc1. Thanks a lot!
>
> thanks,
>
> greg k-h
>
> .
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions
2018-01-05 8:22 ` gengdongjiu
@ 2018-01-05 11:28 ` Catalin Marinas
-1 siblings, 0 replies; 16+ messages in thread
From: Catalin Marinas @ 2018-01-05 11:28 UTC (permalink / raw)
To: gengdongjiu
Cc: Greg KH, mark.rutland, linux-doc, corbet, Suzuki K Poulose,
will.deacon, linux-kernel, linuxarm, zhihui.gao, huangshaoyu,
guohanjun, arvind.yadav.cs, robin.murphy, Dave.Martin,
linux-arm-kernel, zhanghaibin7
On Fri, Jan 05, 2018 at 04:22:24PM +0800, gengdongjiu wrote:
> On 2018/1/5 15:57, Greg KH wrote:
> > On Fri, Jan 05, 2018 at 09:22:54AM +0800, gengdongjiu wrote:
> >> Hi will/catalin
> >>
> >> On 2017/12/13 18:09, Suzuki K Poulose wrote:
> >>> On 13/12/17 10:13, Dongjiu Geng wrote:
> >>>> ARM v8.4 extensions add new neon instructions for performing a
> >>>> multiplication of each FP16 element of one vector with the corresponding
> >>>> FP16 element of a second vector, and to add or subtract this without an
> >>>> intermediate rounding to the corresponding FP32 element in a third vector.
> >>>>
> >>>> This patch detects this feature and let the userspace know about it via a
> >>>> HWCAP bit and MRS emulation.
> >>>>
> >>>> Cc: Dave Martin <Dave.Martin@arm.com>
> >>>> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>>> Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
> >>>> Reviewed-by: Dave Martin <Dave.Martin@arm.com>
> >>>
> >>> Looks good to me.
> >>>
> >>> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> >>
> >> sorry to disturb you. Reminder, hope this patch can be applied to Linux 4.15-rc7.
> >
> > New features should not be going into 4.15-rc, that should be a 4.16-rc1
> > thing, right?
>
> It is also great if it can be applied to 4.16-rc1. Thanks a lot!
I will queue it for 4.16-rc1.
--
Catalin
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions
@ 2018-01-05 13:54 gengdongjiu
0 siblings, 0 replies; 16+ messages in thread
From: gengdongjiu @ 2018-01-05 13:54 UTC (permalink / raw)
To: Catalin Marinas
Cc: Greg KH, mark.rutland, linux-doc, corbet, Suzuki K Poulose,
will.deacon, linux-kernel, Linuxarm, Gaozhihui, Huangshaoyu,
Guohanjun (Hanjun Guo),
arvind.yadav.cs, robin.murphy, Dave.Martin, linux-arm-kernel,
Zhanghaibin (Euler)
> > > New features should not be going into 4.15-rc, that should be a
> > > 4.16-rc1 thing, right?
> >
> > It is also great if it can be applied to 4.16-rc1. Thanks a lot!
>
> I will queue it for 4.16-rc1.
Thanks very much, Catalin.
>
> --
> Catalin
^ permalink raw reply [flat|nested] 16+ messages in thread