linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] Enumerate and expose AVX512_FP16 feature
@ 2020-12-08  3:34 Kyung Min Park
  2020-12-08  3:34 ` [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag Kyung Min Park
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Kyung Min Park @ 2020-12-08  3:34 UTC (permalink / raw)
  To: x86, linux-kernel, kvm
  Cc: tglx, mingo, bp, hpa, pbonzini, sean.j.christopherson, jmattson,
	joro, vkuznets, wanpengli, kyung.min.park, cathy.zhang

Introduce AVX512_FP16 feature and expose it to KVM CPUID for processors
that support it. KVM reports this information and guests can make use
of it.

Detailed information on the instruction and CPUID feature flag can be found
in the latest "extensions" manual [1].

Reference:
[1]. https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Cathy Zhang (1):
  x86: Expose AVX512_FP16 for supported CPUID

Kyung Min Park (1):
  Enumerate AVX512 FP16 CPUID feature flag

 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 arch/x86/kvm/cpuid.c               | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

-- 
2.17.1



* [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag
  2020-12-08  3:34 [PATCH 0/2] Enumerate and expose AVX512_FP16 feature Kyung Min Park
@ 2020-12-08  3:34 ` Kyung Min Park
  2020-12-08  9:28   ` Borislav Petkov
  2020-12-08  3:34 ` [PATCH 2/2] x86: Expose AVX512_FP16 for supported CPUID Kyung Min Park
  2020-12-11 23:42 ` [PATCH 0/2] Enumerate and expose AVX512_FP16 feature Paolo Bonzini
  2 siblings, 1 reply; 7+ messages in thread
From: Kyung Min Park @ 2020-12-08  3:34 UTC (permalink / raw)
  To: x86, linux-kernel, kvm
  Cc: tglx, mingo, bp, hpa, pbonzini, sean.j.christopherson, jmattson,
	joro, vkuznets, wanpengli, kyung.min.park, cathy.zhang

Enumerate the AVX512 half-precision floating point (FP16) CPUID feature
flag. Compared with FP32, FP16 cuts the number of bits required for
storage in half, reducing the exponent from 8 bits to 5 and the mantissa
from 23 bits to 10. FP16 also lets developers train and run inference on
deep learning models faster when the full precision or magnitude of FP32
is not needed.
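
As an illustration of the 1 sign / 5 exponent / 10 mantissa layout
described above, here is a minimal user-space sketch that decodes an
IEEE 754 binary16 bit pattern into a float. It is not part of this
patch; the names fp16_split(), fp16_to_float() and scale_pow2() are
hypothetical helpers introduced only for this example.

```c
#include <math.h>    /* only for the NAN and INFINITY macros */
#include <stdint.h>

/* The three fields of a binary16 value: 1 + 5 + 10 bits. */
struct fp16_fields {
	unsigned int sign;      /* bit 15 */
	unsigned int exponent;  /* bits 14..10, bias 15 */
	unsigned int mantissa;  /* bits 9..0 */
};

static struct fp16_fields fp16_split(uint16_t bits)
{
	struct fp16_fields f = {
		.sign     = (bits >> 15) & 0x1,
		.exponent = (bits >> 10) & 0x1f,
		.mantissa = bits & 0x3ff,
	};
	return f;
}

/* Multiply v by 2^exp using exact power-of-two steps (avoids libm). */
static float scale_pow2(float v, int exp)
{
	while (exp > 0) { v *= 2.0f; exp--; }
	while (exp < 0) { v *= 0.5f; exp++; }
	return v;
}

static float fp16_to_float(uint16_t bits)
{
	struct fp16_fields f = fp16_split(bits);
	float sign = f.sign ? -1.0f : 1.0f;

	if (f.exponent == 0)     /* zero or subnormal: no implicit leading 1 */
		return sign * scale_pow2((float)f.mantissa, -24);
	if (f.exponent == 0x1f)  /* all-ones exponent: infinity or NaN */
		return f.mantissa ? NAN : sign * INFINITY;

	/* Normal: (1024 + mantissa) * 2^(exponent - 15 - 10) */
	return sign * scale_pow2((float)(f.mantissa | 0x400),
				 (int)f.exponent - 25);
}
```

With 5 exponent bits the largest finite value is 65504 (bit pattern
0x7bff), which is the "magnitude" limit the text refers to.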

A processor supports AVX512 FP16 if CPUID.(EAX=7,ECX=0):EDX[bit 23]
is set. AVX512 FP16 requires the AVX512BW feature to be implemented,
since the instructions for manipulating 32-bit masks are associated with
AVX512BW.
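
The effect of such a dependency entry can be sketched as follows. This
is a simplified user-space model, not the kernel's cpuid-deps.c code:
the enum, the deps[] table and clear_feature() are hypothetical names
used only for illustration. It assumes that clearing a feature also
clears every feature that depends on it, directly or transitively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature feature set for the example. */
enum feat { FEAT_AVX512F, FEAT_AVX512BW, FEAT_AVX512_FP16, FEAT_COUNT };

struct dep {
	enum feat feature;  /* this feature ...       */
	enum feat depends;  /* ... requires this one  */
};

static const struct dep deps[] = {
	{ FEAT_AVX512BW,    FEAT_AVX512F  },
	{ FEAT_AVX512_FP16, FEAT_AVX512BW },
};

/* Clear f, then keep clearing dependents until nothing changes. */
static void clear_feature(bool enabled[FEAT_COUNT], enum feat f)
{
	bool changed;

	assert(f < FEAT_COUNT);
	enabled[f] = false;

	do {
		changed = false;
		for (size_t i = 0; i < sizeof(deps) / sizeof(deps[0]); i++) {
			if (enabled[deps[i].feature] &&
			    !enabled[deps[i].depends]) {
				enabled[deps[i].feature] = false;
				changed = true;
			}
		}
	} while (changed);
}
```

In this model, disabling AVX512BW also disables AVX512_FP16 while
leaving AVX512F untouched.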

The only in-kernel use of this flag is KVM passthrough. The CPU feature
flag is shown as "avx512_fp16" in /proc/cpuinfo.
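
For user space, a check for the flag could look like the sketch below.
It is only an illustration: flags_line_has() is a hypothetical helper,
and it assumes the caller has already read a "flags" line from
/proc/cpuinfo into a string. Tokens are compared whole, so a longer
flag name that merely starts with "avx512_fp16" does not match.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `flag` appears as a whole whitespace-separated token
 * in `line`. Anything up to and including the first ':' (the
 * "flags\t\t:" prefix) is skipped if present.
 */
static bool flags_line_has(const char *line, const char *flag)
{
	size_t want = strlen(flag);
	const char *p = strchr(line, ':');

	if (want == 0)
		return false;

	p = p ? p + 1 : line;
	while (*p) {
		p += strspn(p, " \t\n");            /* skip separators */
		size_t len = strcspn(p, " \t\n");   /* length of next token */

		if (len == want && strncmp(p, flag, len) == 0)
			return true;
		p += len;
	}
	return false;
}
```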

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b6b9b3407c22..bec37ec7101e 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -375,6 +375,7 @@
 #define X86_FEATURE_TSXLDTRK		(18*32+16) /* TSX Suspend Load Address Tracking */
 #define X86_FEATURE_PCONFIG		(18*32+18) /* Intel PCONFIG */
 #define X86_FEATURE_ARCH_LBR		(18*32+19) /* Intel ARCH LBR */
+#define X86_FEATURE_AVX512_FP16		(18*32+23) /* AVX512 FP16 */
 #define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */
 #define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */
 #define X86_FEATURE_FLUSH_L1D		(18*32+28) /* Flush L1D cache */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index d502241995a3..42af31b64c2c 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_CQM_MBM_TOTAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
 	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
 	{ X86_FEATURE_PER_THREAD_MBA,		X86_FEATURE_MBA       },
 	{}
-- 
2.17.1



* [PATCH 2/2] x86: Expose AVX512_FP16 for supported CPUID
  2020-12-08  3:34 [PATCH 0/2] Enumerate and expose AVX512_FP16 feature Kyung Min Park
  2020-12-08  3:34 ` [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag Kyung Min Park
@ 2020-12-08  3:34 ` Kyung Min Park
  2020-12-11  1:03   ` Sean Christopherson
  2020-12-11 23:42 ` [PATCH 0/2] Enumerate and expose AVX512_FP16 feature Paolo Bonzini
  2 siblings, 1 reply; 7+ messages in thread
From: Kyung Min Park @ 2020-12-08  3:34 UTC (permalink / raw)
  To: x86, linux-kernel, kvm
  Cc: tglx, mingo, bp, hpa, pbonzini, sean.j.christopherson, jmattson,
	joro, vkuznets, wanpengli, kyung.min.park, cathy.zhang

From: Cathy Zhang <cathy.zhang@intel.com>

AVX512_FP16 is supported by Intel processors such as Sapphire Rapids.
It can deliver better performance than FP32 when FP16 still meets the
precision or magnitude requirements. Its availability is indicated by
CPUID.(EAX=7,ECX=0):EDX[bit 23].

Expose it in KVM's supported CPUID so that guests can make use of it.
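
The idea can be modelled in a few lines of user-space C. This is a
simplified sketch, not KVM's implementation: FEATURE(), feature_bit()
and guest_visible_edx() are hypothetical names, and the model assumes a
bit is reported to the guest only when the host CPU has it and KVM
lists it in its mask for that CPUID register.

```c
#include <assert.h>
#include <stdint.h>

/* Same word*32+bit encoding as the (18*32+23) define in patch 1. */
#define FEATURE(word, bit)   ((word) * 32 + (bit))
#define WORD_CPUID_7_EDX     18
#define FEAT_AVX512_FP16     FEATURE(WORD_CPUID_7_EDX, 23)

/* Bit mask of a feature within its 32-bit CPUID register. */
static uint32_t feature_bit(unsigned int feature)
{
	return UINT32_C(1) << (feature % 32);
}

/* Guest sees only bits present on the host and allowed by the mask. */
static uint32_t guest_visible_edx(uint32_t host_edx, uint32_t kvm_mask)
{
	return host_edx & kvm_mask;
}
```

Under this model, adding the AVX512_FP16 bit to the mask changes
nothing on hosts that lack the feature; it only stops KVM from hiding
the bit on hosts that have it.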

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e83bfe2daf82..d7707cfc9401 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -416,7 +416,7 @@ void kvm_set_cpu_caps(void)
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
 		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
 		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
-		F(SERIALIZE) | F(TSXLDTRK)
+		F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16)
 	);
 
 	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
-- 
2.17.1



* Re: [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag
  2020-12-08  3:34 ` [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag Kyung Min Park
@ 2020-12-08  9:28   ` Borislav Petkov
  0 siblings, 0 replies; 7+ messages in thread
From: Borislav Petkov @ 2020-12-08  9:28 UTC (permalink / raw)
  To: Kyung Min Park
  Cc: x86, linux-kernel, kvm, tglx, mingo, hpa, pbonzini,
	sean.j.christopherson, jmattson, joro, vkuznets, wanpengli,
	cathy.zhang

On Mon, Dec 07, 2020 at 07:34:40PM -0800, Kyung Min Park wrote:
> Enumerate the AVX512 half-precision floating point (FP16) CPUID feature
> flag. Compared with FP32, FP16 cuts the number of bits required for
> storage in half, reducing the exponent from 8 bits to 5 and the mantissa
> from 23 bits to 10. FP16 also lets developers train and run inference on
> deep learning models faster when the full precision or magnitude of FP32
> is not needed.
> 
> A processor supports AVX512 FP16 if CPUID.(EAX=7,ECX=0):EDX[bit 23]
> is set. AVX512 FP16 requires the AVX512BW feature to be implemented,
> since the instructions for manipulating 32-bit masks are associated with
> AVX512BW.
> 
> The only in-kernel use of this flag is KVM passthrough. The CPU feature
> flag is shown as "avx512_fp16" in /proc/cpuinfo.
> 
> Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
> Acked-by: Dave Hansen <dave.hansen@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index b6b9b3407c22..bec37ec7101e 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -375,6 +375,7 @@
>  #define X86_FEATURE_TSXLDTRK		(18*32+16) /* TSX Suspend Load Address Tracking */
>  #define X86_FEATURE_PCONFIG		(18*32+18) /* Intel PCONFIG */
>  #define X86_FEATURE_ARCH_LBR		(18*32+19) /* Intel ARCH LBR */
> +#define X86_FEATURE_AVX512_FP16		(18*32+23) /* AVX512 FP16 */
>  #define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */
>  #define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */
>  #define X86_FEATURE_FLUSH_L1D		(18*32+28) /* Flush L1D cache */
> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> index d502241995a3..42af31b64c2c 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
>  	{ X86_FEATURE_CQM_MBM_TOTAL,		X86_FEATURE_CQM_LLC   },
>  	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
>  	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
> +	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
>  	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
>  	{ X86_FEATURE_PER_THREAD_MBA,		X86_FEATURE_MBA       },
>  	{}
> -- 

Acked-by: Borislav Petkov <bp@suse.de>

Paolo, you can pick those up if you prefer.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [PATCH 2/2] x86: Expose AVX512_FP16 for supported CPUID
  2020-12-08  3:34 ` [PATCH 2/2] x86: Expose AVX512_FP16 for supported CPUID Kyung Min Park
@ 2020-12-11  1:03   ` Sean Christopherson
  0 siblings, 0 replies; 7+ messages in thread
From: Sean Christopherson @ 2020-12-11  1:03 UTC (permalink / raw)
  To: Kyung Min Park
  Cc: x86, linux-kernel, kvm, tglx, mingo, bp, hpa, pbonzini,
	sean.j.christopherson, jmattson, joro, vkuznets, wanpengli,
	cathy.zhang

Shortlog should use "KVM: x86: ...", and probably s/for/in.  It currently reads
like the kernel is exposing the flag to KVM for KVM's supported CPUID, e.g.:

  KVM: x86: Expose AVX512_FP16 in supported CPUID

On Mon, Dec 07, 2020, Kyung Min Park wrote:
> From: Cathy Zhang <cathy.zhang@intel.com>
> 
> AVX512_FP16 is supported by Intel processors such as Sapphire Rapids.
> It can deliver better performance than FP32 when FP16 still meets the
> precision or magnitude requirements. Its availability is indicated by
> CPUID.(EAX=7,ECX=0):EDX[bit 23].
> 
> Expose it in KVM's supported CPUID so that guests can make use of it.

For new features like this that don't require additional KVM enabling, it would
be nice to explicitly state as much in the changelog, along with a brief
explanation of why additional KVM enabling is not necessary.  It doesn't have to
be much, just something to help people that aren't already familiar with FP16
understand what this patch actually exposes to the guest.  E.g. I assume there
are new instructions that are available with FP16?

> Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
> Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
> Acked-by: Dave Hansen <dave.hansen@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kvm/cpuid.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index e83bfe2daf82..d7707cfc9401 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -416,7 +416,7 @@ void kvm_set_cpu_caps(void)
>  		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
>  		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
>  		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
> -		F(SERIALIZE) | F(TSXLDTRK)
> +		F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16)
>  	);
>  
>  	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
> -- 
> 2.17.1
> 


* Re: [PATCH 0/2] Enumerate and expose AVX512_FP16 feature
  2020-12-08  3:34 [PATCH 0/2] Enumerate and expose AVX512_FP16 feature Kyung Min Park
  2020-12-08  3:34 ` [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag Kyung Min Park
  2020-12-08  3:34 ` [PATCH 2/2] x86: Expose AVX512_FP16 for supported CPUID Kyung Min Park
@ 2020-12-11 23:42 ` Paolo Bonzini
  2020-12-15  1:43   ` Zhang, Cathy
  2 siblings, 1 reply; 7+ messages in thread
From: Paolo Bonzini @ 2020-12-11 23:42 UTC (permalink / raw)
  To: Kyung Min Park, x86, linux-kernel, kvm
  Cc: tglx, mingo, bp, hpa, sean.j.christopherson, jmattson, joro,
	vkuznets, wanpengli, cathy.zhang

On 08/12/20 04:34, Kyung Min Park wrote:
> Introduce AVX512_FP16 feature and expose it to KVM CPUID for processors
> that support it. KVM reports this information and guests can make use
> of it.
> 
> Detailed information on the instruction and CPUID feature flag can be found
> in the latest "extensions" manual [1].
> 
> Reference:
> [1]. https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
> 
> Cathy Zhang (1):
>    x86: Expose AVX512_FP16 for supported CPUID
> 
> Kyung Min Park (1):
>    Enumerate AVX512 FP16 CPUID feature flag
> 
>   arch/x86/include/asm/cpufeatures.h | 1 +
>   arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>   arch/x86/kvm/cpuid.c               | 2 +-
>   3 files changed, 3 insertions(+), 1 deletion(-)
> 

Queued, with adjusted commit message according to Sean's review.

Paolo



* Re: [PATCH 0/2] Enumerate and expose AVX512_FP16 feature
  2020-12-11 23:42 ` [PATCH 0/2] Enumerate and expose AVX512_FP16 feature Paolo Bonzini
@ 2020-12-15  1:43   ` Zhang, Cathy
  0 siblings, 0 replies; 7+ messages in thread
From: Zhang, Cathy @ 2020-12-15  1:43 UTC (permalink / raw)
  To: Paolo Bonzini, sean.j.christopherson, Kyung Min Park, x86,
	linux-kernel, kvm
  Cc: tglx, mingo, bp, hpa, jmattson, joro, vkuznets, wanpengli

Thanks Paolo and Sean! Sorry for the delayed response. I'm back from
vacation and just saw Sean's comment, and I see Paolo has already made
the changes. Thanks a bunch!

On 12/12/2020 7:42 AM, Paolo Bonzini wrote:
> On 08/12/20 04:34, Kyung Min Park wrote:
>> Introduce AVX512_FP16 feature and expose it to KVM CPUID for processors
>> that support it. KVM reports this information and guests can make use
>> of it.
>>
>> Detailed information on the instruction and CPUID feature flag can be found
>> in the latest "extensions" manual [1].
>>
>> Reference:
>> [1]. https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
>>
>> Cathy Zhang (1):
>>    x86: Expose AVX512_FP16 for supported CPUID
>>
>> Kyung Min Park (1):
>>    Enumerate AVX512 FP16 CPUID feature flag
>>
>>   arch/x86/include/asm/cpufeatures.h | 1 +
>>   arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>>   arch/x86/kvm/cpuid.c               | 2 +-
>>   3 files changed, 3 insertions(+), 1 deletion(-)
>>
>
> Queued, with adjusted commit message according to Sean's review.
>
> Paolo
>

