* [PATCH 0/2] Enumerate and expose AVX512_FP16 feature
From: Kyung Min Park @ 2020-12-08 3:34 UTC
To: x86, linux-kernel, kvm
Cc: tglx, mingo, bp, hpa, pbonzini, sean.j.christopherson, jmattson, joro, vkuznets, wanpengli, kyung.min.park, cathy.zhang

Introduce AVX512_FP16 feature and expose it to KVM CPUID for processors
that support it. KVM reports this information and guests can make use
of it.

Detailed information on the instruction and CPUID feature flag can be found
in the latest "extensions" manual [1].

Reference:
[1]. https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Cathy Zhang (1):
  x86: Expose AVX512_FP16 for supported CPUID

Kyung Min Park (1):
  Enumerate AVX512 FP16 CPUID feature flag

 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 arch/x86/kvm/cpuid.c               | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

--
2.17.1
* [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag
From: Kyung Min Park @ 2020-12-08 3:34 UTC
To: x86, linux-kernel, kvm
Cc: tglx, mingo, bp, hpa, pbonzini, sean.j.christopherson, jmattson, joro, vkuznets, wanpengli, kyung.min.park, cathy.zhang

Enumerate AVX512 Half-precision floating point (FP16) CPUID feature
flag. Compared with using FP32, using FP16 cuts the number of bits
required for storage in half, reducing the exponent from 8 bits to 5,
and the mantissa from 23 bits to 10. Using FP16 also enables developers
to train and run inference on deep learning models faster when the full
precision or magnitude of FP32 is not needed.

A processor supports AVX512 FP16 if CPUID.(EAX=7,ECX=0):EDX[bit 23]
is set. AVX512 FP16 requires the AVX512BW feature to be implemented,
since the instructions for manipulating 32-bit masks are associated with
AVX512BW.

The only in-kernel usage of this is KVM passthrough. The CPU feature
flag is shown as "avx512_fp16" in /proc/cpuinfo.

Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b6b9b3407c22..bec37ec7101e 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -375,6 +375,7 @@
 #define X86_FEATURE_TSXLDTRK (18*32+16) /* TSX Suspend Load Address Tracking */
 #define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
 #define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */
+#define X86_FEATURE_AVX512_FP16 (18*32+23) /* AVX512 FP16 */
 #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
 #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
 #define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index d502241995a3..42af31b64c2c 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC },
 	{ X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
 	{ X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
+	{ X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
 	{ X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES },
 	{ X86_FEATURE_PER_THREAD_MBA, X86_FEATURE_MBA },
 	{}
--
2.17.1
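The changelog of patch 1/2 describes FP16 as having a 5-bit exponent and a 10-bit mantissa, against 8 and 23 bits for FP32. The following user-space sketch is not part of the patch; it only illustrates that layout by widening a raw 16-bit pattern to a float. The helper name fp16_bits_to_float is hypothetical, and the code assumes that float is an IEEE 754 binary32 value on the build target.

```c
#include <assert.h>
#include <math.h>     /* INFINITY and NAN macros only; no libm calls */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: decode an FP16 bit pattern
 * (1 sign bit, 5 exponent bits with bias 15, 10 mantissa bits).
 */
static float fp16_bits_to_float(uint16_t h)
{
	uint32_t sign = (h >> 15) & 0x1;
	uint32_t exp  = (h >> 10) & 0x1f;
	uint32_t frac = h & 0x3ff;
	float value;

	if (exp == 0x1f) {
		/* All-ones exponent: infinity (frac == 0) or NaN. */
		value = frac ? NAN : INFINITY;
	} else if (exp == 0) {
		/* Zero or subnormal: no implicit leading 1, value is frac * 2^-24. */
		value = (float)frac / 16777216.0f;
	} else {
		/*
		 * Normal number: rebias the exponent from 15 to 127 (add 112)
		 * and left-align the 10 mantissa bits in the 23-bit FP32 field.
		 */
		uint32_t bits = ((exp + 112u) << 23) | (frac << 13);

		assert(sizeof(value) == sizeof(bits));
		memcpy(&value, &bits, sizeof(value));
	}

	return sign ? -value : value;
}
```

Because every FP16 value fits exactly in FP32, this widening is lossless; the reverse direction is where the reduced precision and magnitude mentioned in the changelog come into play.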
* Re: [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag
From: Borislav Petkov @ 2020-12-08 9:28 UTC
To: Kyung Min Park
Cc: x86, linux-kernel, kvm, tglx, mingo, hpa, pbonzini, sean.j.christopherson, jmattson, joro, vkuznets, wanpengli, cathy.zhang

On Mon, Dec 07, 2020 at 07:34:40PM -0800, Kyung Min Park wrote:
> Enumerate AVX512 Half-precision floating point (FP16) CPUID feature
> flag. Compared with using FP32, using FP16 cuts the number of bits
> required for storage in half, reducing the exponent from 8 bits to 5,
> and the mantissa from 23 bits to 10. Using FP16 also enables developers
> to train and run inference on deep learning models faster when the full
> precision or magnitude of FP32 is not needed.
>
> A processor supports AVX512 FP16 if CPUID.(EAX=7,ECX=0):EDX[bit 23]
> is set. AVX512 FP16 requires the AVX512BW feature to be implemented,
> since the instructions for manipulating 32-bit masks are associated with
> AVX512BW.
>
> The only in-kernel usage of this is KVM passthrough. The CPU feature
> flag is shown as "avx512_fp16" in /proc/cpuinfo.
>
> Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
> Acked-by: Dave Hansen <dave.hansen@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index b6b9b3407c22..bec37ec7101e 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -375,6 +375,7 @@
>  #define X86_FEATURE_TSXLDTRK (18*32+16) /* TSX Suspend Load Address Tracking */
>  #define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
>  #define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */
> +#define X86_FEATURE_AVX512_FP16 (18*32+23) /* AVX512 FP16 */
>  #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
>  #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
>  #define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */
> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> index d502241995a3..42af31b64c2c 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
>  	{ X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC },
>  	{ X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
>  	{ X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
> +	{ X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
>  	{ X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES },
>  	{ X86_FEATURE_PER_THREAD_MBA, X86_FEATURE_MBA },
>  	{}
> --

Acked-by: Borislav Petkov <bp@suse.de>

Paolo, you can pick those up if you prefer.

Thx.

--
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
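The cpuid-deps.c hunk quoted above records that AVX512_FP16 depends on AVX512BW. The toy model below sketches what such a dependency table is used for: clearing a feature also clears everything that depends on it. This is a simplified user-space illustration and not the kernel's implementation; the enum, the table contents other than the FP16 row, and the clear_feature() helper are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative feature IDs; these are not the kernel's X86_FEATURE_* values. */
enum feature { FEAT_AVX512F, FEAT_AVX512BW, FEAT_AVX512_FP16, FEAT_COUNT };

struct dep {
	enum feature feature;   /* this feature ...            */
	enum feature depends;   /* ... requires this one to be set */
};

static const struct dep deps[] = {
	{ FEAT_AVX512BW,    FEAT_AVX512F  },  /* assumed row, for illustration */
	{ FEAT_AVX512_FP16, FEAT_AVX512BW },  /* mirrors the row added by patch 1/2 */
};

/* Clear 'feature' and, recursively, every feature that depends on it. */
static void clear_feature(bool enabled[FEAT_COUNT], enum feature feature)
{
	assert(feature < FEAT_COUNT);
	enabled[feature] = false;

	for (size_t i = 0; i < sizeof(deps) / sizeof(deps[0]); i++) {
		if (deps[i].depends == feature)
			clear_feature(enabled, deps[i].feature);
	}
}
```

In this model, clearing FEAT_AVX512BW leaves FEAT_AVX512F alone but takes FEAT_AVX512_FP16 down with it, which is the behaviour the new table row asks for.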
* [PATCH 2/2] x86: Expose AVX512_FP16 for supported CPUID
From: Kyung Min Park @ 2020-12-08 3:34 UTC
To: x86, linux-kernel, kvm
Cc: tglx, mingo, bp, hpa, pbonzini, sean.j.christopherson, jmattson, joro, vkuznets, wanpengli, kyung.min.park, cathy.zhang

From: Cathy Zhang <cathy.zhang@intel.com>

AVX512_FP16 is supported by Intel processors such as Sapphire Rapids.
It can improve performance because it is faster than FP32 when it
still meets the precision or magnitude requirement. Its availability
is indicated by CPUID.(EAX=7,ECX=0):EDX[bit 23].

Expose it in KVM supported CPUID so that guests can make use of it.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
---
 arch/x86/kvm/cpuid.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e83bfe2daf82..d7707cfc9401 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -416,7 +416,7 @@ void kvm_set_cpu_caps(void)
 		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
 		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
 		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
-		F(SERIALIZE) | F(TSXLDTRK)
+		F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16)
 	);

 	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
--
2.17.1
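Both changelogs state that availability is indicated by CPUID.(EAX=7,ECX=0):EDX[bit 23]. As a rough user-space sketch of that check (not code from this series), the following uses the __get_cpuid_count() helper from the GCC/Clang <cpuid.h> header. The function names are hypothetical, and the non-x86 fallback that simply returns false is an assumption added so the example builds everywhere.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang: __get_cpuid_count() */
#endif

/* Test one bit of a 32-bit CPUID output register. */
static bool cpuid_reg_bit(unsigned int reg, unsigned int bit)
{
	assert(bit < 32);
	return (reg >> bit) & 1u;
}

/* EDX bit 23 of leaf 7, subleaf 0, as given in the changelog. */
static bool edx_has_avx512_fp16(unsigned int edx)
{
	return cpuid_reg_bit(edx, 23);
}

/* Ask the running CPU; false if leaf 7 is unavailable or on non-x86 builds. */
static bool cpu_has_avx512_fp16(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

	if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return edx_has_avx512_fp16(edx);
#endif
	return false;
}
```

Run inside a guest, the same check reflects whatever CPUID the VMM chose to expose. Note that the CPUID bit alone does not show that the operating system has enabled AVX-512 register state, so real code that wants to execute these instructions would need further checks.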
* Re: [PATCH 2/2] x86: Expose AVX512_FP16 for supported CPUID
From: Sean Christopherson @ 2020-12-11 1:03 UTC
To: Kyung Min Park
Cc: x86, linux-kernel, kvm, tglx, mingo, bp, hpa, pbonzini, sean.j.christopherson, jmattson, joro, vkuznets, wanpengli, cathy.zhang

Shortlog should use "KVM: x86: ...", and probably s/for/in. It currently
reads like the kernel is exposing the flag to KVM for KVM's supported CPUID,
e.g.:

  KVM: x86: Expose AVX512_FP16 in supported CPUID

On Mon, Dec 07, 2020, Kyung Min Park wrote:
> From: Cathy Zhang <cathy.zhang@intel.com>
>
> AVX512_FP16 is supported by Intel processors such as Sapphire Rapids.
> It can improve performance because it is faster than FP32 when it
> still meets the precision or magnitude requirement. Its availability
> is indicated by CPUID.(EAX=7,ECX=0):EDX[bit 23].
>
> Expose it in KVM supported CPUID so that guests can make use of it.

For new features like this that don't require additional KVM enabling, it would
be nice to explicitly state as much in the changelog, along with a brief
explanation of why additional KVM enabling is not necessary. It doesn't have to
be much, just something to help people that aren't already familiar with FP16
understand what this patch actually exposes to the guest. E.g. I assume there
are new instructions that are available with FP16?

> Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
> Signed-off-by: Kyung Min Park <kyung.min.park@intel.com>
> Acked-by: Dave Hansen <dave.hansen@intel.com>
> Reviewed-by: Tony Luck <tony.luck@intel.com>
> ---
>  arch/x86/kvm/cpuid.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index e83bfe2daf82..d7707cfc9401 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -416,7 +416,7 @@ void kvm_set_cpu_caps(void)
>  		F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
>  		F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
>  		F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
> -		F(SERIALIZE) | F(TSXLDTRK)
> +		F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16)
>  	);
>
>  	/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
> --
> 2.17.1
>
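The quoted hunk adds F(AVX512_FP16) to a mask of feature bits, and patch 1/2 defines the feature as (18*32+23). The sketch below is a simplified user-space model of that encoding, not the kernel's F() macro: the feature number divided by 32 selects a 32-bit word, and the remainder selects the bit inside it. The helper names are hypothetical, and treating word 18 as the CPUID.(EAX=7,ECX=0):EDX word is inferred from the neighbouring (18*32+N) definitions together with the changelog's EDX[bit 23].

```c
#include <assert.h>
#include <stdint.h>

/* Value copied from the cpufeatures.h hunk in patch 1/2. */
#define X86_FEATURE_AVX512_FP16 (18 * 32 + 23)

/* Assumed for this sketch: word 18 holds the leaf 7, subleaf 0 EDX bits. */
#define WORD_CPUID_7_EDX 18u

/* Which 32-bit capability word a feature number falls in. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the feature inside its word. */
static unsigned int feature_bit_index(unsigned int feature)
{
	return feature % 32;
}

/* Mask for 'feature' within 'word'; asserts the feature really lives there. */
static uint32_t feature_mask_in_word(unsigned int feature, unsigned int word)
{
	assert(feature_word(feature) == word);
	return UINT32_C(1) << feature_bit_index(feature);
}
```

With these helpers, X86_FEATURE_AVX512_FP16 maps to word 18, bit 23, so its mask is 0x00800000; OR-ing that into the word's mask is the model's equivalent of appending F(AVX512_FP16) in the hunk.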
* Re: [PATCH 0/2] Enumerate and expose AVX512_FP16 feature
From: Paolo Bonzini @ 2020-12-11 23:42 UTC
To: Kyung Min Park, x86, linux-kernel, kvm
Cc: tglx, mingo, bp, hpa, sean.j.christopherson, jmattson, joro, vkuznets, wanpengli, cathy.zhang

On 08/12/20 04:34, Kyung Min Park wrote:
> Introduce AVX512_FP16 feature and expose it to KVM CPUID for processors
> that support it. KVM reports this information and guests can make use
> of it.
>
> Detailed information on the instruction and CPUID feature flag can be found
> in the latest "extensions" manual [1].
>
> Reference:
> [1]. https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
>
> Cathy Zhang (1):
>   x86: Expose AVX512_FP16 for supported CPUID
>
> Kyung Min Park (1):
>   Enumerate AVX512 FP16 CPUID feature flag
>
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>  arch/x86/kvm/cpuid.c               | 2 +-
>  3 files changed, 3 insertions(+), 1 deletion(-)
>

Queued, with adjusted commit message according to Sean's review.

Paolo
* Re: [PATCH 0/2] Enumerate and expose AVX512_FP16 feature
From: Zhang, Cathy @ 2020-12-15 1:43 UTC
To: Paolo Bonzini, sean.j.christopherson, Kyung Min Park, x86, linux-kernel, kvm
Cc: tglx, mingo, bp, hpa, jmattson, joro, vkuznets, wanpengli

Thanks Paolo and Sean! Sorry for the delayed response; I'm back from
vacation and just saw Sean's comment, and I see Paolo has made the
changes. Thanks a bunch!

On 12/12/2020 7:42 AM, Paolo Bonzini wrote:
> On 08/12/20 04:34, Kyung Min Park wrote:
>> Introduce AVX512_FP16 feature and expose it to KVM CPUID for processors
>> that support it. KVM reports this information and guests can make use
>> of it.
>>
>> Detailed information on the instruction and CPUID feature flag can be
>> found
>> in the latest "extensions" manual [1].
>>
>> Reference:
>> [1].
>> https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
>>
>> Cathy Zhang (1):
>>   x86: Expose AVX512_FP16 for supported CPUID
>>
>> Kyung Min Park (1):
>>   Enumerate AVX512 FP16 CPUID feature flag
>>
>>  arch/x86/include/asm/cpufeatures.h | 1 +
>>  arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>>  arch/x86/kvm/cpuid.c               | 2 +-
>>  3 files changed, 3 insertions(+), 1 deletion(-)
>>
>
> Queued, with adjusted commit message according to Sean's review.
>
> Paolo
>