All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check
@ 2021-09-20  5:39 Noah Goldstein
  2021-09-27 18:02 ` Noah Goldstein
                   ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: Noah Goldstein @ 2021-09-20  5:39 UTC (permalink / raw)
  Cc: goldstein.w.n, tglx, mingo, bp, x86, hpa, luto, linux-kernel

This commit creates a new mask, XFEATURE_MASK_ZMM, to test against
xfeatures for conditionally updating the axx512_timestamp.

Based on the comments, the avx512 state is meant to track when the
state would cause frequencey throttling. The opmasks (k0-k7) do not
cause frequency throttling, so they don't make sense to include.

The current implementation, as well as the old, still does have a
false positive on ymm16-ymm31 and xmm16-31 because
XFEATURE_MASK_Hi16_ZMM includes them.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
Issue is reproducible with the following code on x86_64:

```
	.global	_start
	.text
_start:
	korq	%k0, %k0, %k0

loop:
	jmp	loop


	movl	$60, %eax
	xorl	%edi, %edi
	syscall
```

Pretending run as executable named "foo":

$> cat /proc/$(pidof foo)/arch_status


This should yield -1 as no frequency changing AVX512 instructions
where used but instead tracks the process.

Note there still is a false positive with ymm16-ymm31 and xmm16-xmm31
but since there is no state to distinguish between there use and
zmm16-31 that seems inevitable.

    
 arch/x86/include/asm/fpu/types.h | 2 ++
 arch/x86/kernel/fpu/core.c       | 4 +++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f5a38a5f3ae1..a4816fa7d541 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -141,6 +141,8 @@ enum xfeature {
 #define XFEATURE_MASK_AVX512		(XFEATURE_MASK_OPMASK \
 					 | XFEATURE_MASK_ZMM_Hi256 \
 					 | XFEATURE_MASK_Hi16_ZMM)
+#define XFEATURE_MASK_ZMM		(XFEATURE_MASK_ZMM_Hi256 \
+					 | XFEATURE_MASK_Hi16_ZMM)
 
 #define FIRST_EXTENDED_XFEATURE	XFEATURE_YMM
 
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7ada7bd03a32..342620a2e8ef 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -104,8 +104,10 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
 		/*
 		 * AVX512 state is tracked here because its use is
 		 * known to slow the max clock speed of the core.
+		 * Note: This has a false positive on Hi16 ymm and
+		 * xmm registers.
 		 */
-		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
+		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_ZMM)
 			fpu->avx512_timestamp = jiffies;
 		return;
 	}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check
  2021-09-20  5:39 [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check Noah Goldstein
@ 2021-09-27 18:02 ` Noah Goldstein
  2021-10-13 22:36   ` Noah Goldstein
  2021-10-15 20:47 ` [PATCH v2 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-09-27 18:02 UTC (permalink / raw)
  Cc: tglx, mingo, bp, X86 ML, hpa, Andy Lutomirski, open list

On Mon, Sep 20, 2021 at 12:40 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> This commit creates a new mask, XFEATURE_MASK_ZMM, to test against
> xfeatures for conditionally updating the axx512_timestamp.
>
> Based on the comments, the avx512 state is meant to track when the
> state would cause frequencey throttling. The opmasks (k0-k7) do not
> cause frequency throttling, so they don't make sense to include.
>
> The current implementation, as well as the old, still does have a
> false positive on ymm16-ymm31 and xmm16-31 because
> XFEATURE_MASK_Hi16_ZMM includes them.
>
> Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
> ---
> Issue is reproducible with the following code on x86_64:
>
> ```
>         .global _start
>         .text
> _start:
>         korq    %k0, %k0, %k0
>
> loop:
>         jmp     loop
>
>
>         movl    $60, %eax
>         xorl    %edi, %edi
>         syscall
> ```
>
> Pretending run as executable named "foo":
>
> $> cat /proc/$(pidof foo)/arch_status
>
>
> This should yield -1 as no frequency changing AVX512 instructions
> where used but instead tracks the process.
>
> Note there still is a false positive with ymm16-ymm31 and xmm16-xmm31
> but since there is no state to distinguish between there use and
> zmm16-31 that seems inevitable.
>
>
>  arch/x86/include/asm/fpu/types.h | 2 ++
>  arch/x86/kernel/fpu/core.c       | 4 +++-
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
> index f5a38a5f3ae1..a4816fa7d541 100644
> --- a/arch/x86/include/asm/fpu/types.h
> +++ b/arch/x86/include/asm/fpu/types.h
> @@ -141,6 +141,8 @@ enum xfeature {
>  #define XFEATURE_MASK_AVX512           (XFEATURE_MASK_OPMASK \
>                                          | XFEATURE_MASK_ZMM_Hi256 \
>                                          | XFEATURE_MASK_Hi16_ZMM)
> +#define XFEATURE_MASK_ZMM              (XFEATURE_MASK_ZMM_Hi256 \
> +                                        | XFEATURE_MASK_Hi16_ZMM)
>
>  #define FIRST_EXTENDED_XFEATURE        XFEATURE_YMM
>
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 7ada7bd03a32..342620a2e8ef 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -104,8 +104,10 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
>                 /*
>                  * AVX512 state is tracked here because its use is
>                  * known to slow the max clock speed of the core.
> +                * Note: This has a false positive on Hi16 ymm and
> +                * xmm registers.
>                  */
> -               if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
> +               if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_ZMM)
>                         fpu->avx512_timestamp = jiffies;
>                 return;
>         }
> --
> 2.25.1
>
Ping

(sorry if this has shown up multiple times for anyone,
was accidentally including HTML earlier)

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check
  2021-09-27 18:02 ` Noah Goldstein
@ 2021-10-13 22:36   ` Noah Goldstein
  2021-10-14  8:28     ` Borislav Petkov
  0 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-13 22:36 UTC (permalink / raw)
  Cc: tglx, mingo, bp, X86 ML, hpa, Andy Lutomirski, open list

On Mon, Sep 27, 2021 at 1:02 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Mon, Sep 20, 2021 at 12:40 AM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
> >
> > This commit creates a new mask, XFEATURE_MASK_ZMM, to test against
> > xfeatures for conditionally updating the axx512_timestamp.
> >
> > Based on the comments, the avx512 state is meant to track when the
> > state would cause frequencey throttling. The opmasks (k0-k7) do not
> > cause frequency throttling, so they don't make sense to include.
> >
> > The current implementation, as well as the old, still does have a
> > false positive on ymm16-ymm31 and xmm16-31 because
> > XFEATURE_MASK_Hi16_ZMM includes them.
> >
> > Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
> > ---
> > Issue is reproducible with the following code on x86_64:
> >
> > ```
> >         .global _start
> >         .text
> > _start:
> >         korq    %k0, %k0, %k0
> >
> > loop:
> >         jmp     loop
> >
> >
> >         movl    $60, %eax
> >         xorl    %edi, %edi
> >         syscall
> > ```
> >
> > Pretending run as executable named "foo":
> >
> > $> cat /proc/$(pidof foo)/arch_status
> >
> >
> > This should yield -1 as no frequency changing AVX512 instructions
> > where used but instead tracks the process.
> >
> > Note there still is a false positive with ymm16-ymm31 and xmm16-xmm31
> > but since there is no state to distinguish between there use and
> > zmm16-31 that seems inevitable.
> >
> >
> >  arch/x86/include/asm/fpu/types.h | 2 ++
> >  arch/x86/kernel/fpu/core.c       | 4 +++-
> >  2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
> > index f5a38a5f3ae1..a4816fa7d541 100644
> > --- a/arch/x86/include/asm/fpu/types.h
> > +++ b/arch/x86/include/asm/fpu/types.h
> > @@ -141,6 +141,8 @@ enum xfeature {
> >  #define XFEATURE_MASK_AVX512           (XFEATURE_MASK_OPMASK \
> >                                          | XFEATURE_MASK_ZMM_Hi256 \
> >                                          | XFEATURE_MASK_Hi16_ZMM)
> > +#define XFEATURE_MASK_ZMM              (XFEATURE_MASK_ZMM_Hi256 \
> > +                                        | XFEATURE_MASK_Hi16_ZMM)
> >
> >  #define FIRST_EXTENDED_XFEATURE        XFEATURE_YMM
> >
> > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> > index 7ada7bd03a32..342620a2e8ef 100644
> > --- a/arch/x86/kernel/fpu/core.c
> > +++ b/arch/x86/kernel/fpu/core.c
> > @@ -104,8 +104,10 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
> >                 /*
> >                  * AVX512 state is tracked here because its use is
> >                  * known to slow the max clock speed of the core.
> > +                * Note: This has a false positive on Hi16 ymm and
> > +                * xmm registers.
> >                  */
> > -               if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
> > +               if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_ZMM)
> >                         fpu->avx512_timestamp = jiffies;
> >                 return;
> >         }
> > --
> > 2.25.1
> >
> Ping
>
> (sorry if this has shown up multiple times for anyone,
> was accidentally including HTML earlier)

Ping2

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check
  2021-10-13 22:36   ` Noah Goldstein
@ 2021-10-14  8:28     ` Borislav Petkov
  2021-10-14 15:49       ` Noah Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2021-10-14  8:28 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Oct 13, 2021 at 05:36:14PM -0500, Noah Goldstein wrote:
> Ping2

Why?

The original patch which added this abomination:

2f7726f95557 ("x86/fpu: Track AVX-512 usage of tasks")

says already:

 the tracking mechanism is imprecise and can theoretically miss AVX-512
 usage during context switch. But it has been measured to be precise
 enough to be useful under real-world workloads like tensorflow and
 linpack.

 If higher precision is required, suggest user space tools to use the
 PMU-based mechanisms in combination.

and as you've noticed, the high 16 regs would cause a false positive
too.

So what is the actual real-life use case for this?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check
  2021-10-14  8:28     ` Borislav Petkov
@ 2021-10-14 15:49       ` Noah Goldstein
  2021-10-15 14:40         ` Borislav Petkov
  2021-10-15 15:25         ` Dave Hansen
  0 siblings, 2 replies; 31+ messages in thread
From: Noah Goldstein @ 2021-10-14 15:49 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Thu, Oct 14, 2021 at 3:28 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Oct 13, 2021 at 05:36:14PM -0500, Noah Goldstein wrote:
> > Ping2
>
> Why?


This left me confused and annoyed for several hours because
the avx512 usage information on my applications didn't make
sense. Figured worth a patch for posterity.

>
> The original patch which added this abomination:
>
> 2f7726f95557 ("x86/fpu: Track AVX-512 usage of tasks")
>
> says already:
>
>  the tracking mechanism is imprecise and can theoretically miss AVX-512
>  usage during context switch. But it has been measured to be precise
>  enough to be useful under real-world workloads like tensorflow and
>  linpack.
>
>  If higher precision is required, suggest user space tools to use the
>  PMU-based mechanisms in combination.

This patch isn't about higher precision / false negatives.
It's about false positives.

>
> and as you've noticed, the high 16 regs would cause a false positive
> too.
>
> So what is the actual real-life use case for this?

No big project. There is a case for avoiding the extended avx512
registers (reg16..31) for various reasons (context switches,
code size, instruction limitations) that don't apply to mask registers.

Irrelevant of the still existing flaws, it makes the output more accurate.

Is there a cost to the change I am not seeing?

If not it seems to be a tradeoff between more accurate vs less accurate.
So why not take the former?

>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check
  2021-10-14 15:49       ` Noah Goldstein
@ 2021-10-15 14:40         ` Borislav Petkov
  2021-10-15 15:25         ` Dave Hansen
  1 sibling, 0 replies; 31+ messages in thread
From: Borislav Petkov @ 2021-10-15 14:40 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Thu, Oct 14, 2021 at 10:49:25AM -0500, Noah Goldstein wrote:
> Is there a cost to the change I am not seeing?

The cost is having yet another macro in our crazy FPU code because
apparently everything but the kitchen sink is being put into xstates.

So I'd suggest you carve out that timestamp update into a separate
function, i.e., something like fpu_update_avx_timestamp() or so - and
add the macro in it along with the comment explaining what and why
is being tracked.

So that that functionality is separated out of the main flow,
save_fpregs_to_fpstate() simply calls it and it doesn't get in the way
of the next rewrite of the FPU code to accomodate new crap^W features.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check
  2021-10-14 15:49       ` Noah Goldstein
  2021-10-15 14:40         ` Borislav Petkov
@ 2021-10-15 15:25         ` Dave Hansen
  2021-10-15 17:30           ` Noah Goldstein
  1 sibling, 1 reply; 31+ messages in thread
From: Dave Hansen @ 2021-10-15 15:25 UTC (permalink / raw)
  To: Noah Goldstein, Borislav Petkov
  Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list, aubrey,
	Chen, Tim C, Van De Ven, Arjan

On 10/14/21 8:49 AM, Noah Goldstein wrote:
> Irrelevant of the still existing flaws, it makes the output more accurate.
> 
> Is there a cost to the change I am not seeing?

We'd want to make sure that this doesn't break anything.  It probably
won't, but it theoretically could.

For instance, if someone was doing:

	avx512_foo();
	xsave->xstate_bv &= ~XFEATURE_MASK_ZMMS;
	XRSTOR(xsave, -1);

That would leave the opmask in place, but would lead to the ZMM
registers tracked as being in their init state.

This would be *very* unlikely, but it would be great if Aubrey (the
original avx512_timestamp patch author) could make sure that it doesn't
break anything.

Also, there's the side issue of AVX-256 use.  AVX-256 uses the ZMM
registers which are a part of XFEATURE_MASK_AVX512, but does not incur
the same frequency penalties of the full 512-bit-wide instructions.
Since ZMM_Hi256 is the *only* ZMM state which is truly 512-bit-only, we
could argue that it's the only one we should consider.

Noah, thanks for bringing this up.  I'm not opposed to your patch, but
let's just make sure that it doesn't break anything and also that we
shouldn't do a bit more at the same time (ignore Hi16_ZMM for
avx512_timestamp).

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check
  2021-10-15 15:25         ` Dave Hansen
@ 2021-10-15 17:30           ` Noah Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Noah Goldstein @ 2021-10-15 17:30 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Borislav Petkov, tglx, mingo, X86 ML, hpa, Andy Lutomirski,
	open list, aubrey, Chen, Tim C, Van De Ven, Arjan

On Fri, Oct 15, 2021 at 10:25 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 10/14/21 8:49 AM, Noah Goldstein wrote:
> > Irrelevant of the still existing flaws, it makes the output more accurate.
> >
> > Is there a cost to the change I am not seeing?
>
> We'd want to make sure that this doesn't break anything.  It probably
> won't, but it theoretically could.
>
> For instance, if someone was doing:
>
>         avx512_foo();
>         xsave->xstate_bv &= ~XFEATURE_MASK_ZMMS;
>         XRSTOR(xsave, -1);
>
> That would leave the opmask in place, but would lead to the ZMM
> registers tracked as being in their init state.

The 'XFEATURE_MASK_ZMMS' is new to this patch so I don't think
this patch could be adding that issue.

>
> This would be *very* unlikely, but it would be great if Aubrey (the
> original avx512_timestamp patch author) could make sure that it doesn't
> break anything.
>
> Also, there's the side issue of AVX-256 use.  AVX-256 uses the ZMM
> registers which are a part of XFEATURE_MASK_AVX512, but does not incur
> the same frequency penalties of the full 512-bit-wide instructions.
> Since ZMM_Hi256 is the *only* ZMM state which is truly 512-bit-only, we
> could argue that it's the only one we should consider.
>
> Noah, thanks for bringing this up.  I'm not opposed to your patch, but
> let's just make sure that it doesn't break anything and also that we
> shouldn't do a bit more at the same time (ignore Hi16_ZMM for
> avx512_timestamp).

I think that may make sense. Or outputting separate timestamps for
both. Especially because in GLIBC we have moved to preferring
EVEX implementings for all x86_64 string functions because
vzeroupper aborts RTM transactions:
https://sourceware.org/bugzilla/show_bug.cgi?id=27457

So if an application is using GLIBC on an avx512 machine most likely
the avx512 indicator will be permanently set.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v2 1/2] x86/fpu: Add helper function for tracking AVX512 status
  2021-09-20  5:39 [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check Noah Goldstein
  2021-09-27 18:02 ` Noah Goldstein
@ 2021-10-15 20:47 ` Noah Goldstein
  2021-10-26 23:15   ` Noah Goldstein
  2021-10-15 20:47 ` [PATCH v2 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-15 20:47 UTC (permalink / raw)
  Cc: goldstein.w.n, tglx, mingo, bp, x86, hpa, luto, linux-kernel

This commit adds a new helper function 'fpu_update_avx_timestamp' to
perform the logic from tracking AVX512 feature use.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
Since Borislav Petkov is concerned about adding more macro masks and
inlining the AVX512 status this patch adds a new helper function. This
patch does not change the behavior of the current AVX512 status.

 arch/x86/include/asm/fpu/xstate.h |  1 +
 arch/x86/kernel/fpu/core.c        |  6 ++----
 arch/x86/kernel/fpu/xstate.c      | 13 +++++++++++++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 109dfcc75299..fe84ac5fb039 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -134,6 +134,7 @@ extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 extern void __init update_regset_xstate_info(unsigned int size,
 					     u64 xstate_mask);
 
+void fpu_update_avx_timestamp(struct fpu *fpu);
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7ada7bd03a32..6dbf3ee642f9 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -102,11 +102,9 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
 		os_xsave(&fpu->state.xsave);
 
 		/*
-		 * AVX512 state is tracked here because its use is
-		 * known to slow the max clock speed of the core.
+		 * Track of the state of desired avx related xfeatures.
 		 */
-		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-			fpu->avx512_timestamp = jiffies;
+		fpu_update_avx_timestamp(fpu);
 		return;
 	}
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c8def1b7f8fb..00b495914be2 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1245,6 +1245,19 @@ void xrstors(struct xregs_state *xstate, u64 mask)
 	WARN_ON_ONCE(err);
 }
 
+/*
+ * Track of the state of desired avx architecture features.
+ */
+void fpu_update_avx_timestamp(struct fpu *fpu)
+{
+	/*
+	 * AVX512 state is tracked here because its use is known to slow
+	 * the max clock speed of the core.
+	 */
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
+		fpu->avx512_timestamp = jiffies;
+}
+
 #ifdef CONFIG_PROC_PID_ARCH_STATUS
 /*
  * Report the amount of time elapsed in millisecond since last AVX512
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v2 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-09-20  5:39 [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check Noah Goldstein
  2021-09-27 18:02 ` Noah Goldstein
  2021-10-15 20:47 ` [PATCH v2 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
@ 2021-10-15 20:47 ` Noah Goldstein
  2021-10-27 11:07   ` Borislav Petkov
  2021-10-27 16:26 ` [PATCH v3 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-15 20:47 UTC (permalink / raw)
  Cc: goldstein.w.n, tglx, mingo, bp, x86, hpa, luto, linux-kernel

This patch splits the timestamps for tracking the AVX512 status into
'avx512_ZMM_Hi256_timestamp' and 'avx512_Hi16_ZMM_timestamp'. They are
used for tracking XFEATURE_ZMM_Hi256 and XFEATURE_Hi16_ZMM use
respectively.

The purpose of tracking the AVX512 status is to convey information
about possible frequency throttling. The current implementation has
false positives on XFEATURE_OPMASK use and any usage of the hi16 xmm
and ymm registers which are included in the XFEATURE_Hi16_ZMM set as
neither will cause frequency throttling.

This patches implementation avoids to add more clarity to the
output. The 'avx512_ZMM_Hi256_timestamp' will not have false positives
so its value will at least be indicative frequency throttling. Since
'avx512_Hi16_ZMM_timestamp' can still indicate frequency throttling
from zmm16...zmm31 use though had false positives it is separated.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
Because (Hi16_ZMM | ZMM_Hi256) will likely be full of false positives
on any machine that support avx512 I split the avx512_timestamp.

One to track Hi16_ZMM and one for ZMM_Hi256. My though is it's not
good to add more holes where the status doesn't report, but also have
fields that will be useful and not be burdened by false positives.

This might be overdoing it. If so we can either use a both of both or
just ZMM_Hi256.

 arch/x86/include/asm/fpu/types.h | 16 +++++++++---
 arch/x86/kernel/fpu/xstate.c     | 45 ++++++++++++++++++++++++++------
 2 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f5a38a5f3ae1..cb10909fa3da 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -330,11 +330,21 @@ struct fpu {
 	unsigned int			last_cpu;
 
 	/*
-	 * @avx512_timestamp:
+	 * @avx512_ZMM_Hi256_timestamp:
 	 *
-	 * Records the timestamp of AVX512 use during last context switch.
+	 * Records the timestamp of AVX512 use in the ZMM_Hi256 xfeature
+	 * set. This include zmm0...zmm15.
 	 */
-	unsigned long			avx512_timestamp;
+	unsigned long			avx512_ZMM_Hi256_timestamp;
+
+	/*
+	 * @avx512_Hi16_ZMM_timestamp:
+	 *
+	 * Records the timestamp of AVX512 use in the Hi16_ZMM xfeature
+	 * set. This includes usage of any of the hi16 xmm, ymm, or zmm
+	 * registers.
+	 */
+	unsigned long			avx512_Hi16_ZMM_timestamp;
 
 	/*
 	 * @state:
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 00b495914be2..5b0ff609af2f 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1246,7 +1246,7 @@ void xrstors(struct xregs_state *xstate, u64 mask)
 }
 
 /*
- * Track of the state of desired avx architecture features.
+ * Track of the state of desired avx related xfeatures.
  */
 void fpu_update_avx_timestamp(struct fpu *fpu)
 {
@@ -1254,18 +1254,28 @@ void fpu_update_avx_timestamp(struct fpu *fpu)
 	 * AVX512 state is tracked here because its use is known to slow
 	 * the max clock speed of the core.
 	 */
-	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-		fpu->avx512_timestamp = jiffies;
+
+	/*
+	 * Store a separate state for ZMM_Hi256 and Hi16_ZMM xfeature use.
+	 * If ZMM_Hi256 is used the machine has certainly used a zmm
+	 * register.  Hi16_ZMM, however, has false positives on usage of
+	 * hi16 xmm and ymm registers.
+	 */
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_ZMM_Hi256)
+		fpu->avx512_ZMM_Hi256_timestamp = jiffies;
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_Hi16_ZMM)
+		fpu->avx512_Hi16_ZMM_timestamp = jiffies;
 }
 
+
 #ifdef CONFIG_PROC_PID_ARCH_STATUS
+
 /*
- * Report the amount of time elapsed in millisecond since last AVX512
- * use in the task.
+ * Helper function for computing proper output for avx512_status
+ * timestamp.
  */
-static void avx512_status(struct seq_file *m, struct task_struct *task)
+static long avx_status_compute_delta(unsigned long timestamp)
 {
-	unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp);
 	long delta;
 
 	if (!timestamp) {
@@ -1282,8 +1292,27 @@ static void avx512_status(struct seq_file *m, struct task_struct *task)
 			delta = LONG_MAX;
 		delta = jiffies_to_msecs(delta);
 	}
+	return delta;
+}
 
-	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);
+/*
+ * Report the amount of time elapsed in millisecond since last AVX512
+ * use in the task.
+ */
+static void avx512_status(struct seq_file *m, struct task_struct *task)
+{
+	unsigned long timestamp;
+	long delta_ZMM_Hi256, delta_Hi16_ZMM;
+
+	timestamp = READ_ONCE(task->thread.fpu.avx512_ZMM_Hi256_timestamp);
+	delta_ZMM_Hi256 = avx_status_compute_delta(timestamp);
+
+	timestamp = READ_ONCE(task->thread.fpu.avx512_Hi16_ZMM_timestamp);
+	delta_Hi16_ZMM = avx_status_compute_delta(timestamp);
+
+	seq_put_decimal_ll(m, "AVX512_ZMM_Hi256_elapsed_ms:\t", delta_ZMM_Hi256);
+	seq_putc(m, '\n');
+	seq_put_decimal_ll(m, "AVX512_Hi16_ZMM_elapsed_ms:\t", delta_Hi16_ZMM);
 	seq_putc(m, '\n');
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 1/2] x86/fpu: Add helper function for tracking AVX512 status
  2021-10-15 20:47 ` [PATCH v2 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
@ 2021-10-26 23:15   ` Noah Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Noah Goldstein @ 2021-10-26 23:15 UTC (permalink / raw)
  Cc: tglx, mingo, Borislav Petkov, X86 ML, hpa, Andy Lutomirski, open list

On Fri, Oct 15, 2021 at 3:57 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> This commit adds a new helper function 'fpu_update_avx_timestamp' to
> perform the logic from tracking AVX512 feature use.
>
> Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
> ---
> Since Borislav Petkov is concerned about adding more macro masks and
> inlining the AVX512 status this patch adds a new helper function. This
> patch does not change the behavior of the current AVX512 status.
>
>  arch/x86/include/asm/fpu/xstate.h |  1 +
>  arch/x86/kernel/fpu/core.c        |  6 ++----
>  arch/x86/kernel/fpu/xstate.c      | 13 +++++++++++++
>  3 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
> index 109dfcc75299..fe84ac5fb039 100644
> --- a/arch/x86/include/asm/fpu/xstate.h
> +++ b/arch/x86/include/asm/fpu/xstate.h
> @@ -134,6 +134,7 @@ extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
>  extern void __init update_regset_xstate_info(unsigned int size,
>                                              u64 xstate_mask);
>
> +void fpu_update_avx_timestamp(struct fpu *fpu);
>  void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
>  int xfeature_size(int xfeature_nr);
>  int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 7ada7bd03a32..6dbf3ee642f9 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -102,11 +102,9 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
>                 os_xsave(&fpu->state.xsave);
>
>                 /*
> -                * AVX512 state is tracked here because its use is
> -                * known to slow the max clock speed of the core.
> +                * Track of the state of desired avx related xfeatures.
>                  */
> -               if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
> -                       fpu->avx512_timestamp = jiffies;
> +               fpu_update_avx_timestamp(fpu);
>                 return;
>         }
>
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index c8def1b7f8fb..00b495914be2 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -1245,6 +1245,19 @@ void xrstors(struct xregs_state *xstate, u64 mask)
>         WARN_ON_ONCE(err);
>  }
>
> +/*
> + * Track of the state of desired avx architecture features.
> + */
> +void fpu_update_avx_timestamp(struct fpu *fpu)
> +{
> +       /*
> +        * AVX512 state is tracked here because its use is known to slow
> +        * the max clock speed of the core.
> +        */
> +       if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
> +               fpu->avx512_timestamp = jiffies;
> +}
> +
>  #ifdef CONFIG_PROC_PID_ARCH_STATUS
>  /*
>   * Report the amount of time elapsed in millisecond since last AVX512
> --
> 2.25.1
>

Ping.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-15 20:47 ` [PATCH v2 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
@ 2021-10-27 11:07   ` Borislav Petkov
  2021-10-27 16:28     ` Noah Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2021-10-27 11:07 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: tglx, mingo, x86, hpa, luto, linux-kernel

On Fri, Oct 15, 2021 at 03:47:17PM -0500, Noah Goldstein wrote:
> This patch splits the timestamps for tracking the AVX512 status into

For future submissions, please avoid having "This patch" or "This
commit" in the commit message. It is tautologically useless.

...

> @@ -1282,8 +1292,27 @@ static void avx512_status(struct seq_file *m, struct task_struct *task)
>  			delta = LONG_MAX;
>  		delta = jiffies_to_msecs(delta);
>  	}
> +	return delta;
> +}
>  
> -	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);

Regardless of whether I think this exposing of AVX512 usage is silly
or not, we do not break userspace by changing those strings which some
script or tool already probably relies upon.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v3 1/2] x86/fpu: Add helper function for tracking AVX512 status
  2021-09-20  5:39 [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check Noah Goldstein
                   ` (2 preceding siblings ...)
  2021-10-15 20:47 ` [PATCH v2 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
@ 2021-10-27 16:26 ` Noah Goldstein
  2021-10-27 16:26   ` [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
  2021-10-27 17:17 ` [PATCH v4 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 16:26 UTC (permalink / raw)
  Cc: goldstein.w.n, tglx, mingo, bp, x86, hpa, luto, linux-kernel

This commit adds a new helper function 'fpu_update_avx_timestamp' to
perform the logic from tracking AVX512 feature use.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
 arch/x86/include/asm/fpu/xstate.h |  1 +
 arch/x86/kernel/fpu/core.c        |  6 ++----
 arch/x86/kernel/fpu/xstate.c      | 13 +++++++++++++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 109dfcc75299..fe84ac5fb039 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -134,6 +134,7 @@ extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 extern void __init update_regset_xstate_info(unsigned int size,
 					     u64 xstate_mask);
 
+void fpu_update_avx_timestamp(struct fpu *fpu);
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7ada7bd03a32..6dbf3ee642f9 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -102,11 +102,9 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
 		os_xsave(&fpu->state.xsave);
 
 		/*
-		 * AVX512 state is tracked here because its use is
-		 * known to slow the max clock speed of the core.
+		 * Track of the state of desired avx related xfeatures.
 		 */
-		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-			fpu->avx512_timestamp = jiffies;
+		fpu_update_avx_timestamp(fpu);
 		return;
 	}
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c8def1b7f8fb..00b495914be2 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1245,6 +1245,19 @@ void xrstors(struct xregs_state *xstate, u64 mask)
 	WARN_ON_ONCE(err);
 }
 
+/*
+ * Track of the state of desired avx architecture features.
+ */
+void fpu_update_avx_timestamp(struct fpu *fpu)
+{
+	/*
+	 * AVX512 state is tracked here because its use is known to slow
+	 * the max clock speed of the core.
+	 */
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
+		fpu->avx512_timestamp = jiffies;
+}
+
 #ifdef CONFIG_PROC_PID_ARCH_STATUS
 /*
  * Report the amount of time elapsed in millisecond since last AVX512
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 16:26 ` [PATCH v3 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
@ 2021-10-27 16:26   ` Noah Goldstein
  2021-10-27 17:11     ` Borislav Petkov
  0 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 16:26 UTC (permalink / raw)
  Cc: goldstein.w.n, tglx, mingo, bp, x86, hpa, luto, linux-kernel

This patch splits the timestamps for tracking the AVX512 status into
'avx512_ZMM_Hi256_timestamp' and 'avx512_Hi16_ZMM_timestamp'. They are
used for tracking XFEATURE_ZMM_Hi256 and XFEATURE_Hi16_ZMM use
respectively.

The purpose of tracking the AVX512 status is to convey information
about possible frequency throttling. The current implementation has
false positives on XFEATURE_OPMASK use and any usage of the hi16 xmm
and ymm registers which are included in the XFEATURE_Hi16_ZMM set as
neither will cause frequency throttling.

This patches implementation avoids to add more clarity to the
output. The 'avx512_ZMM_Hi256_timestamp' will not have false positives
so its value will at least be indicative frequency throttling. Since
'avx512_Hi16_ZMM_timestamp' can still indicate frequency throttling
from zmm16...zmm31 use though had false positives it is separated.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
 arch/x86/include/asm/fpu/types.h | 16 ++++++++--
 arch/x86/kernel/fpu/xstate.c     | 53 +++++++++++++++++++++++++++-----
 2 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f5a38a5f3ae1..cb10909fa3da 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -330,11 +330,21 @@ struct fpu {
 	unsigned int			last_cpu;
 
 	/*
-	 * @avx512_timestamp:
+	 * @avx512_ZMM_Hi256_timestamp:
 	 *
-	 * Records the timestamp of AVX512 use during last context switch.
+	 * Records the timestamp of AVX512 use in the ZMM_Hi256 xfeature
+	 * set. This include zmm0...zmm15.
 	 */
-	unsigned long			avx512_timestamp;
+	unsigned long			avx512_ZMM_Hi256_timestamp;
+
+	/*
+	 * @avx512_Hi16_ZMM_timestamp:
+	 *
+	 * Records the timestamp of AVX512 use in the Hi16_ZMM xfeature
+	 * set. This includes usage of any of the hi16 xmm, ymm, or zmm
+	 * registers.
+	 */
+	unsigned long			avx512_Hi16_ZMM_timestamp;
 
 	/*
 	 * @state:
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 00b495914be2..3bb1a425ce56 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1246,7 +1246,7 @@ void xrstors(struct xregs_state *xstate, u64 mask)
 }
 
 /*
- * Track of the state of desired avx architecture features.
+ * Track of the state of desired avx related xfeatures.
  */
 void fpu_update_avx_timestamp(struct fpu *fpu)
 {
@@ -1254,18 +1254,28 @@ void fpu_update_avx_timestamp(struct fpu *fpu)
 	 * AVX512 state is tracked here because its use is known to slow
 	 * the max clock speed of the core.
 	 */
-	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-		fpu->avx512_timestamp = jiffies;
+
+	/*
+	 * Store a separate state for ZMM_Hi256 and Hi16_ZMM xfeature use.
+	 * If ZMM_Hi256 is used the machine has certainly used a zmm
+	 * register.  Hi16_ZMM, however, has false positives on usage of
+	 * hi16 xmm and ymm registers.
+	 */
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_ZMM_Hi256)
+		fpu->avx512_ZMM_Hi256_timestamp = jiffies;
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_Hi16_ZMM)
+		fpu->avx512_Hi16_ZMM_timestamp = jiffies;
 }
 
+
 #ifdef CONFIG_PROC_PID_ARCH_STATUS
+
 /*
- * Report the amount of time elapsed in millisecond since last AVX512
- * use in the task.
+ * Helper function for computing proper output for avx512_status
+ * timestamp.
  */
-static void avx512_status(struct seq_file *m, struct task_struct *task)
+static long avx_status_compute_delta(unsigned long timestamp)
 {
-	unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp);
 	long delta;
 
 	if (!timestamp) {
@@ -1282,8 +1292,35 @@ static void avx512_status(struct seq_file *m, struct task_struct *task)
 			delta = LONG_MAX;
 		delta = jiffies_to_msecs(delta);
 	}
+	return delta;
+}
 
-	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);
+/*
+ * Report the amount of time elapsed in millisecond since last AVX512
+ * use in the task.
+ */
+static void avx512_status(struct seq_file *m, struct task_struct *task)
+{
+	unsigned long timestamp;
+	long delta_ZMM_Hi256, delta_Hi16_ZMM, delta_unified;
+
+	timestamp = READ_ONCE(task->thread.fpu.avx512_ZMM_Hi256_timestamp);
+	delta_ZMM_Hi256 = avx_status_compute_delta(timestamp);
+
+	timestamp = READ_ONCE(task->thread.fpu.avx512_Hi16_ZMM_timestamp);
+	delta_Hi16_ZMM = avx_status_compute_delta(timestamp);
+
+	/*
+	 * Report unified delta of most recent AVX512 usage from either
+	 * Hi16_ZMM or ZMM_Hi256 xfeature sets.
+	 */
+	delta_unified = timestamp ? delta_Hi16_ZMM : delta_ZMM_Hi256;
+
+	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta_unified);
+	seq_putc(m, '\n');
+	seq_put_decimal_ll(m, "AVX512_ZMM_Hi256_elapsed_ms:\t", delta_ZMM_Hi256);
+	seq_putc(m, '\n');
+	seq_put_decimal_ll(m, "AVX512_Hi16_ZMM_elapsed_ms:\t", delta_Hi16_ZMM);
 	seq_putc(m, '\n');
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 11:07   ` Borislav Petkov
@ 2021-10-27 16:28     ` Noah Goldstein
  2021-10-27 16:58       ` Borislav Petkov
  0 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 16:28 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Oct 27, 2021 at 6:07 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Oct 15, 2021 at 03:47:17PM -0500, Noah Goldstein wrote:
> > This patch splits the timestamps for tracking the AVX512 status into
>
> For future submissions, please avoid having "This patch" or "This
> commit" in the commit message. It is tautologically useless.

Got it. Thanks.
>
> ...
>
> > @@ -1282,8 +1292,27 @@ static void avx512_status(struct seq_file *m, struct task_struct *task)
> >                       delta = LONG_MAX;
> >               delta = jiffies_to_msecs(delta);
> >       }
> > +     return delta;
> > +}
> >
> > -     seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);
>
> Regardless of whether I think this exposing of AVX512 usage is silly
> or not, we do not break userspace by changing those strings which some
> script or tool already probably relies upon.

Fixed. Added 'delta_unified' which will report the most recent usage of
either.

>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 16:28     ` Noah Goldstein
@ 2021-10-27 16:58       ` Borislav Petkov
  2021-10-27 17:18         ` Noah Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2021-10-27 16:58 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Oct 27, 2021 at 11:28:07AM -0500, Noah Goldstein wrote:
> Got it. Thanks.

Did ya?

:-)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 16:26   ` [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
@ 2021-10-27 17:11     ` Borislav Petkov
  2021-10-27 17:37       ` Noah Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2021-10-27 17:11 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: tglx, mingo, x86, hpa, luto, linux-kernel

On Wed, Oct 27, 2021 at 11:26:15AM -0500, Noah Goldstein wrote:
> diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
> index f5a38a5f3ae1..cb10909fa3da 100644
> --- a/arch/x86/include/asm/fpu/types.h
> +++ b/arch/x86/include/asm/fpu/types.h
> @@ -330,11 +330,21 @@ struct fpu {
>  	unsigned int			last_cpu;
>  
>  	/*
> -	 * @avx512_timestamp:
> +	 * @avx512_ZMM_Hi256_timestamp:
>  	 *
> -	 * Records the timestamp of AVX512 use during last context switch.
> +	 * Records the timestamp of AVX512 use in the ZMM_Hi256 xfeature
> +	 * set. This include zmm0...zmm15.
>  	 */
> -	unsigned long			avx512_timestamp;
> +	unsigned long			avx512_ZMM_Hi256_timestamp;
> +
> +	/*
> +	 * @avx512_Hi16_ZMM_timestamp:
> +	 *
> +	 * Records the timestamp of AVX512 use in the Hi16_ZMM xfeature
> +	 * set. This includes usage of any of the hi16 xmm, ymm, or zmm
> +	 * registers.
> +	 */
> +	unsigned long			avx512_Hi16_ZMM_timestamp;

No, not more of this but less.

That was a bad idea to begin with as exposing this to userspace would
cause exactly this: but but, I need to track my special use case more
precisely.

But the feature mask can't give you that precision so it'll be only an
approximation no matter what you do.

And I'm being told future cores won't have this "problem" so on them
that file becomes actively misleading.

If you really wanna track performance drop precisely or AVX use or
whatnot, there's performance counters for that which can give you
exactly what you wanna know.

So I'll take a simple patch carving out that into a function and which
removes the opmask and otherwise let that thing die. And on future cores
which are not affected, that thing will report only 0 anyway.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v4 1/2] x86/fpu: Add helper function for tracking AVX512 status
  2021-09-20  5:39 [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check Noah Goldstein
                   ` (3 preceding siblings ...)
  2021-10-27 16:26 ` [PATCH v3 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
@ 2021-10-27 17:17 ` Noah Goldstein
  2021-10-27 17:17   ` [PATCH v4 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
  2021-10-27 18:21 ` [PATCH v5 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
  2021-11-17 18:29 ` [tip: x86/fpu] " tip-bot2 for Noah Goldstein
  6 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 17:17 UTC (permalink / raw)
  Cc: goldstein.w.n, tglx, mingo, bp, x86, hpa, luto, linux-kernel

Add a new helper function 'fpu_update_avx_timestamp' to perform
the logic from tracking AVX512 feature use.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
 arch/x86/include/asm/fpu/xstate.h |  1 +
 arch/x86/kernel/fpu/core.c        |  6 ++----
 arch/x86/kernel/fpu/xstate.c      | 13 +++++++++++++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 109dfcc75299..fe84ac5fb039 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -134,6 +134,7 @@ extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 extern void __init update_regset_xstate_info(unsigned int size,
 					     u64 xstate_mask);
 
+void fpu_update_avx_timestamp(struct fpu *fpu);
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7ada7bd03a32..6dbf3ee642f9 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -102,11 +102,9 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
 		os_xsave(&fpu->state.xsave);
 
 		/*
-		 * AVX512 state is tracked here because its use is
-		 * known to slow the max clock speed of the core.
+		 * Track of the state of desired avx related xfeatures.
 		 */
-		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-			fpu->avx512_timestamp = jiffies;
+		fpu_update_avx_timestamp(fpu);
 		return;
 	}
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c8def1b7f8fb..00b495914be2 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1245,6 +1245,19 @@ void xrstors(struct xregs_state *xstate, u64 mask)
 	WARN_ON_ONCE(err);
 }
 
+/*
+ * Track of the state of desired avx architecture features.
+ */
+void fpu_update_avx_timestamp(struct fpu *fpu)
+{
+	/*
+	 * AVX512 state is tracked here because its use is known to slow
+	 * the max clock speed of the core.
+	 */
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
+		fpu->avx512_timestamp = jiffies;
+}
+
 #ifdef CONFIG_PROC_PID_ARCH_STATUS
 /*
  * Report the amount of time elapsed in millisecond since last AVX512
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v4 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 17:17 ` [PATCH v4 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
@ 2021-10-27 17:17   ` Noah Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 17:17 UTC (permalink / raw)
  Cc: goldstein.w.n, tglx, mingo, bp, x86, hpa, luto, linux-kernel

Split the timestamps for tracking the AVX512 status into
'avx512_ZMM_Hi256_timestamp' and 'avx512_Hi16_ZMM_timestamp'. They are
used for tracking XFEATURE_ZMM_Hi256 and XFEATURE_Hi16_ZMM use
respectively.

The purpose of tracking the AVX512 status is to convey information
about possible frequency throttling. The current implementation has
false positives on XFEATURE_OPMASK use and any usage of the hi16 xmm
and ymm registers which are included in the XFEATURE_Hi16_ZMM set as
neither will cause frequency throttling.

The implementation splits the relevant xfeature sets to add more
clarity to the output. The 'avx512_ZMM_Hi256_timestamp' will not have
false positives so its value will at least be indicative frequency
throttling. Since 'avx512_Hi16_ZMM_timestamp' can still indicate
frequency throttling from zmm16...zmm31 use though had false positives
it is separated. As well since existing code may be relying on
"AVX512_elapsed_ms" as a catchall output both xfeature sets are use to
compute its output (taking whichever, if any, of the two are in use).

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
 arch/x86/include/asm/fpu/types.h | 16 ++++++++--
 arch/x86/kernel/fpu/xstate.c     | 53 +++++++++++++++++++++++++++-----
 2 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f5a38a5f3ae1..cb10909fa3da 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -330,11 +330,21 @@ struct fpu {
 	unsigned int			last_cpu;
 
 	/*
-	 * @avx512_timestamp:
+	 * @avx512_ZMM_Hi256_timestamp:
 	 *
-	 * Records the timestamp of AVX512 use during last context switch.
+	 * Records the timestamp of AVX512 use in the ZMM_Hi256 xfeature
+	 * set. This include zmm0...zmm15.
 	 */
-	unsigned long			avx512_timestamp;
+	unsigned long			avx512_ZMM_Hi256_timestamp;
+
+	/*
+	 * @avx512_Hi16_ZMM_timestamp:
+	 *
+	 * Records the timestamp of AVX512 use in the Hi16_ZMM xfeature
+	 * set. This includes usage of any of the hi16 xmm, ymm, or zmm
+	 * registers.
+	 */
+	unsigned long			avx512_Hi16_ZMM_timestamp;
 
 	/*
 	 * @state:
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 00b495914be2..3bb1a425ce56 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1246,7 +1246,7 @@ void xrstors(struct xregs_state *xstate, u64 mask)
 }
 
 /*
- * Track of the state of desired avx architecture features.
+ * Track of the state of desired avx related xfeatures.
  */
 void fpu_update_avx_timestamp(struct fpu *fpu)
 {
@@ -1254,18 +1254,28 @@ void fpu_update_avx_timestamp(struct fpu *fpu)
 	 * AVX512 state is tracked here because its use is known to slow
 	 * the max clock speed of the core.
 	 */
-	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-		fpu->avx512_timestamp = jiffies;
+
+	/*
+	 * Store a separate state for ZMM_Hi256 and Hi16_ZMM xfeature use.
+	 * If ZMM_Hi256 is used the machine has certainly used a zmm
+	 * register.  Hi16_ZMM, however, has false positives on usage of
+	 * hi16 xmm and ymm registers.
+	 */
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_ZMM_Hi256)
+		fpu->avx512_ZMM_Hi256_timestamp = jiffies;
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_Hi16_ZMM)
+		fpu->avx512_Hi16_ZMM_timestamp = jiffies;
 }
 
+
 #ifdef CONFIG_PROC_PID_ARCH_STATUS
+
 /*
- * Report the amount of time elapsed in millisecond since last AVX512
- * use in the task.
+ * Helper function for computing proper output for avx512_status
+ * timestamp.
  */
-static void avx512_status(struct seq_file *m, struct task_struct *task)
+static long avx_status_compute_delta(unsigned long timestamp)
 {
-	unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp);
 	long delta;
 
 	if (!timestamp) {
@@ -1282,8 +1292,35 @@ static void avx512_status(struct seq_file *m, struct task_struct *task)
 			delta = LONG_MAX;
 		delta = jiffies_to_msecs(delta);
 	}
+	return delta;
+}
 
-	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);
+/*
+ * Report the amount of time elapsed in millisecond since last AVX512
+ * use in the task.
+ */
+static void avx512_status(struct seq_file *m, struct task_struct *task)
+{
+	unsigned long timestamp;
+	long delta_ZMM_Hi256, delta_Hi16_ZMM, delta_unified;
+
+	timestamp = READ_ONCE(task->thread.fpu.avx512_ZMM_Hi256_timestamp);
+	delta_ZMM_Hi256 = avx_status_compute_delta(timestamp);
+
+	timestamp = READ_ONCE(task->thread.fpu.avx512_Hi16_ZMM_timestamp);
+	delta_Hi16_ZMM = avx_status_compute_delta(timestamp);
+
+	/*
+	 * Report unified delta of most recent AVX512 usage from either
+	 * Hi16_ZMM or ZMM_Hi256 xfeature sets.
+	 */
+	delta_unified = timestamp ? delta_Hi16_ZMM : delta_ZMM_Hi256;
+
+	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta_unified);
+	seq_putc(m, '\n');
+	seq_put_decimal_ll(m, "AVX512_ZMM_Hi256_elapsed_ms:\t", delta_ZMM_Hi256);
+	seq_putc(m, '\n');
+	seq_put_decimal_ll(m, "AVX512_Hi16_ZMM_elapsed_ms:\t", delta_Hi16_ZMM);
 	seq_putc(m, '\n');
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v2 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 16:58       ` Borislav Petkov
@ 2021-10-27 17:18         ` Noah Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 17:18 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Oct 27, 2021 at 11:58 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Oct 27, 2021 at 11:28:07AM -0500, Noah Goldstein wrote:
> > Got it. Thanks.
>
> Did ya?
>
> :-)

Oof :/


Fixed.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 17:11     ` Borislav Petkov
@ 2021-10-27 17:37       ` Noah Goldstein
  2021-10-27 17:45         ` Borislav Petkov
  0 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 17:37 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Oct 27, 2021 at 12:11 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Oct 27, 2021 at 11:26:15AM -0500, Noah Goldstein wrote:
> > diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
> > index f5a38a5f3ae1..cb10909fa3da 100644
> > --- a/arch/x86/include/asm/fpu/types.h
> > +++ b/arch/x86/include/asm/fpu/types.h
> > @@ -330,11 +330,21 @@ struct fpu {
> >       unsigned int                    last_cpu;
> >
> >       /*
> > -      * @avx512_timestamp:
> > +      * @avx512_ZMM_Hi256_timestamp:
> >        *
> > -      * Records the timestamp of AVX512 use during last context switch.
> > +      * Records the timestamp of AVX512 use in the ZMM_Hi256 xfeature
> > +      * set. This include zmm0...zmm15.
> >        */
> > -     unsigned long                   avx512_timestamp;
> > +     unsigned long                   avx512_ZMM_Hi256_timestamp;
> > +
> > +     /*
> > +      * @avx512_Hi16_ZMM_timestamp:
> > +      *
> > +      * Records the timestamp of AVX512 use in the Hi16_ZMM xfeature
> > +      * set. This includes usage of any of the hi16 xmm, ymm, or zmm
> > +      * registers.
> > +      */
> > +     unsigned long                   avx512_Hi16_ZMM_timestamp;
>
> No, not more of this but less.
>
> That was a bad idea to begin with as exposing this to userspace would
> cause exactly this: but but, I need to track my special use case more
> precisely.
>
> But the feature mask can't give you that precision so it'll be only an
> approximation no matter what you do.
>
> And I'm being told future cores won't have this "problem" so on them
> that file becomes actively misleading.

What I've heard is it's a lot better on Rocket Lake (only extra downclocking
on multicore usage) and TBD for Saphire Rapids.
>
> If you really wanna track performance drop precisely or AVX use or
> whatnot, there's performance counters for that which can give you
> exactly what you wanna know.
>
> So I'll take a simple patch carving out that into a function and which
> removes the opmask and otherwise let that thing die. And on future cores
> which are not affected, that thing will report only 0 anyway.

What about just splitting off a field for 'AVX512_ZMM_Hi256'? That field
seems like it can give meaningful information.

I think mostly like 'AVX512_Hi16_ZMM' will almost always be set because
GLIBC's ifunc uses EVEX implementations for all string/memory functions
on CPUs with avx512.

But if you insist will do.
>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 17:37       ` Noah Goldstein
@ 2021-10-27 17:45         ` Borislav Petkov
  2021-10-27 18:21           ` Noah Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2021-10-27 17:45 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Oct 27, 2021 at 12:37:16PM -0500, Noah Goldstein wrote:
> What about just splitting off a field for 'AVX512_ZMM_Hi256'?

Let's please not perpetuate this into what it cannot become anyway. The
more stuff we add to it, the more people will start using it and we're
stuck with it forever. And soon that thing will be worthless anyway.

So what's the point...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v5 1/2] x86/fpu: Add helper function for tracking AVX512 status
  2021-09-20  5:39 [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check Noah Goldstein
                   ` (4 preceding siblings ...)
  2021-10-27 17:17 ` [PATCH v4 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
@ 2021-10-27 18:21 ` Noah Goldstein
  2021-10-27 18:21   ` [PATCH v5 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
  2021-11-17 18:29 ` [tip: x86/fpu] " tip-bot2 for Noah Goldstein
  6 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 18:21 UTC (permalink / raw)
  Cc: goldstein.w.n, tglx, mingo, bp, x86, hpa, luto, linux-kernel

Add a new helper function 'fpu_update_avx_timestamp' to perform
the logic from tracking AVX512 feature use.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
 arch/x86/include/asm/fpu/xstate.h |  1 +
 arch/x86/kernel/fpu/core.c        |  6 ++----
 arch/x86/kernel/fpu/xstate.c      | 13 +++++++++++++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 109dfcc75299..fe84ac5fb039 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -134,6 +134,7 @@ extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
 extern void __init update_regset_xstate_info(unsigned int size,
 					     u64 xstate_mask);
 
+void fpu_update_avx_timestamp(struct fpu *fpu);
 void *get_xsave_addr(struct xregs_state *xsave, int xfeature_nr);
 int xfeature_size(int xfeature_nr);
 int copy_uabi_from_kernel_to_xstate(struct xregs_state *xsave, const void *kbuf);
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 7ada7bd03a32..6dbf3ee642f9 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -102,11 +102,9 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
 		os_xsave(&fpu->state.xsave);
 
 		/*
-		 * AVX512 state is tracked here because its use is
-		 * known to slow the max clock speed of the core.
+		 * Track of the state of desired avx related xfeatures.
 		 */
-		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-			fpu->avx512_timestamp = jiffies;
+		fpu_update_avx_timestamp(fpu);
 		return;
 	}
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c8def1b7f8fb..00b495914be2 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1245,6 +1245,19 @@ void xrstors(struct xregs_state *xstate, u64 mask)
 	WARN_ON_ONCE(err);
 }
 
+/*
+ * Track of the state of desired avx architecture features.
+ */
+void fpu_update_avx_timestamp(struct fpu *fpu)
+{
+	/*
+	 * AVX512 state is tracked here because its use is known to slow
+	 * the max clock speed of the core.
+	 */
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
+		fpu->avx512_timestamp = jiffies;
+}
+
 #ifdef CONFIG_PROC_PID_ARCH_STATUS
 /*
  * Report the amount of time elapsed in millisecond since last AVX512
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH v5 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 18:21 ` [PATCH v5 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
@ 2021-10-27 18:21   ` Noah Goldstein
  2021-11-16 16:21     ` [PATCH] x86/fpu: Correct AVX512 state tracking Borislav Petkov
  0 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 18:21 UTC (permalink / raw)
  Cc: goldstein.w.n, tglx, mingo, bp, x86, hpa, luto, linux-kernel

Update the mask for tracking AVX512 usage to not include the opmask
xfeature set.

The purpose of tracking the AVX512 status is to convey information
about possible frequency throttling. Opmask usage does not cause
frequency throttling so it is a completely unnecessary false positive.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
---
 arch/x86/include/asm/fpu/types.h | 4 ++++
 arch/x86/kernel/fpu/xstate.c     | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f5a38a5f3ae1..e9a48f75159e 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -142,6 +142,10 @@ enum xfeature {
 					 | XFEATURE_MASK_ZMM_Hi256 \
 					 | XFEATURE_MASK_Hi16_ZMM)
 
+#define XFEATURE_MASK_AVX512_THROTTLE		(XFEATURE_MASK_ZMM_Hi256 \
+					 | XFEATURE_MASK_Hi16_ZMM)
+
+
 #define FIRST_EXTENDED_XFEATURE	XFEATURE_YMM
 
 struct reg_128_bit {
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 00b495914be2..e129fae48792 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1254,7 +1254,7 @@ void fpu_update_avx_timestamp(struct fpu *fpu)
 	 * AVX512 state is tracked here because its use is known to slow
 	 * the max clock speed of the core.
 	 */
-	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512_THROTTLE)
 		fpu->avx512_timestamp = jiffies;
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 17:45         ` Borislav Petkov
@ 2021-10-27 18:21           ` Noah Goldstein
  2021-11-03 20:22             ` Noah Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-10-27 18:21 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Oct 27, 2021 at 12:45 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Oct 27, 2021 at 12:37:16PM -0500, Noah Goldstein wrote:
> > What about just splitting off a field for 'AVX512_ZMM_Hi256'?
>
> Let's please not perpetuate this into what it cannot become anyway. The
> more stuff we add to it, the more people will start using it and we're
> stuck with it forever. And soon that thing will be worthless anyway.
>
> So what's the point...

Got it and done in V5.

>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-10-27 18:21           ` Noah Goldstein
@ 2021-11-03 20:22             ` Noah Goldstein
  2021-11-03 20:48               ` Borislav Petkov
  0 siblings, 1 reply; 31+ messages in thread
From: Noah Goldstein @ 2021-11-03 20:22 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Oct 27, 2021 at 1:21 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> On Wed, Oct 27, 2021 at 12:45 PM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Wed, Oct 27, 2021 at 12:37:16PM -0500, Noah Goldstein wrote:
> > > What about just splitting off a field for 'AVX512_ZMM_Hi256'?
> >
> > Let's please not perpetuate this into what it cannot become anyway. The
> > more stuff we add to it, the more people will start using it and we're
> > stuck with it forever. And soon that thing will be worthless anyway.
> >
> > So what's the point...
>
> Got it and done in V5.

Ping.

>
> >
> > --
> > Regards/Gruss,
> >     Boris.
> >
> > https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-11-03 20:22             ` Noah Goldstein
@ 2021-11-03 20:48               ` Borislav Petkov
  2021-11-03 20:56                 ` Noah Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2021-11-03 20:48 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Nov 03, 2021 at 03:22:38PM -0500, Noah Goldstein wrote:
> Ping.

Dear Noah,

in case you don't know, we have a merge window now. See below for a
lenghty explanation what that means about new patches.

Also, there's no need to constantly ping people - your patch has landed
on my TODO list and will be dealt with when its time comes.

If it is a serious bug fix, it will be handled with a higher priority
but that is not the case, as far as I see it.

So please sit back and be patient - it'll happen.

Thx.

"Merge window
^^^^^^^^^^^^

Please do not expect large patch series to be handled during the merge
window or even during the week before.  Such patches should be submitted in
mergeable state *at* *least* a week before the merge window opens.
Exceptions are made for bug fixes and *sometimes* for small standalone
drivers for new hardware or minimally invasive patches for hardware
enablement.

During the merge window, the maintainers instead focus on following the
upstream changes, fixing merge window fallout, collecting bug fixes, and
allowing themselves a breath. Please respect that.

The release candidate -rc1 is the starting point for new patches to be
applied which are targeted for the next merge window."

from Documentation/process/maintainer-tip.rst

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate
  2021-11-03 20:48               ` Borislav Petkov
@ 2021-11-03 20:56                 ` Noah Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Noah Goldstein @ 2021-11-03 20:56 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Wed, Nov 3, 2021 at 3:48 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Nov 03, 2021 at 03:22:38PM -0500, Noah Goldstein wrote:
> > Ping.
>
> Dear Noah,
>
> in case you don't know, we have a merge window now. See below for a
> lenghty explanation what that means about new patches.
>
> Also, there's no need to constantly ping people - your patch has landed
> on my TODO list and will be dealt with when its time comes.


Sorry. Don't mean to pressure. Just didn't know the state it had landed.

Thank you for taking care of it :)

>
> If it is a serious bug fix, it will be handled with a higher priority
> but that is not the case, as far as I see it.
>
> So please sit back and be patient - it'll happen.
>
> Thx.
>
> "Merge window
> ^^^^^^^^^^^^
>
> Please do not expect large patch series to be handled during the merge
> window or even during the week before.  Such patches should be submitted in
> mergeable state *at* *least* a week before the merge window opens.
> Exceptions are made for bug fixes and *sometimes* for small standalone
> drivers for new hardware or minimally invasive patches for hardware
> enablement.
>
> During the merge window, the maintainers instead focus on following the
> upstream changes, fixing merge window fallout, collecting bug fixes, and
> allowing themselves a breath. Please respect that.
>
> The release candidate -rc1 is the starting point for new patches to be
> applied which are targeted for the next merge window."
>
> from Documentation/process/maintainer-tip.rst
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH] x86/fpu: Correct AVX512 state tracking
  2021-10-27 18:21   ` [PATCH v5 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
@ 2021-11-16 16:21     ` Borislav Petkov
  2021-11-16 16:49       ` Noah Goldstein
  0 siblings, 1 reply; 31+ messages in thread
From: Borislav Petkov @ 2021-11-16 16:21 UTC (permalink / raw)
  To: Noah Goldstein; +Cc: tglx, mingo, x86, hpa, luto, linux-kernel

Does that still work with your test case? (Ontop of -rc1)

---
From: Noah Goldstein <goldstein.w.n@gmail.com>
Date: Tue, 16 Nov 2021 17:14:21 +0100
Subject: [PATCH] x86/fpu: Correct AVX512 state tracking

Add a separate, local mask for tracking AVX512 usage which does not
include the opmask xfeature set. Opmask registers usage does not cause
frequency throttling so it is a completely unnecessary false positive.

While at it, carve it out into a separate function to keep that
abomination extracted out.

 [ bp: Rediff and cleanup ontop of 5.16-rc1. ]

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20210920053951.4093668-1-goldstein.w.n@gmail.com
---
 arch/x86/kernel/fpu/core.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8ea306b1bf8e..dd3777ac0443 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -98,6 +98,19 @@ bool irq_fpu_usable(void)
 }
 EXPORT_SYMBOL(irq_fpu_usable);
 
+/*
+ * Track AVX512 state use because it is known to slow the max clock
+ * speed of the core.
+ */
+static void update_avx_timestamp(struct fpu *fpu)
+{
+
+#define AVX512_TRACKING_MASK	(XFEATURE_MASK_ZMM_Hi256 | XFEATURE_MASK_Hi16_ZMM)
+
+	if (fpu->fpstate->regs.xsave.header.xfeatures & AVX512_TRACKING_MASK)
+		fpu->avx512_timestamp = jiffies;
+}
+
 /*
  * Save the FPU register state in fpu->fpstate->regs. The register state is
  * preserved.
@@ -116,13 +129,7 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		os_xsave(fpu->fpstate);
-
-		/*
-		 * AVX512 state is tracked here because its use is
-		 * known to slow the max clock speed of the core.
-		 */
-		if (fpu->fpstate->regs.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-			fpu->avx512_timestamp = jiffies;
+		update_avx_timestamp(fpu);
 		return;
 	}
 
-- 
2.29.2


-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH] x86/fpu: Correct AVX512 state tracking
  2021-11-16 16:21     ` [PATCH] x86/fpu: Correct AVX512 state tracking Borislav Petkov
@ 2021-11-16 16:49       ` Noah Goldstein
  0 siblings, 0 replies; 31+ messages in thread
From: Noah Goldstein @ 2021-11-16 16:49 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, X86 ML, hpa, Andy Lutomirski, open list

On Tue, Nov 16, 2021 at 10:21 AM Borislav Petkov <bp@alien8.de> wrote:
>
> Does that still work with your test case? (Ontop of -rc1)

This should work.

>
> ---
> From: Noah Goldstein <goldstein.w.n@gmail.com>
> Date: Tue, 16 Nov 2021 17:14:21 +0100
> Subject: [PATCH] x86/fpu: Correct AVX512 state tracking
>
> Add a separate, local mask for tracking AVX512 usage which does not
> include the opmask xfeature set. Opmask registers usage does not cause
> frequency throttling so it is a completely unnecessary false positive.
>
> While at it, carve it out into a separate function to keep that
> abomination extracted out.
>
>  [ bp: Rediff and cleanup ontop of 5.16-rc1. ]
>
> Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Link: https://lore.kernel.org/r/20210920053951.4093668-1-goldstein.w.n@gmail.com
> ---
>  arch/x86/kernel/fpu/core.c | 21 ++++++++++++++-------
>  1 file changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 8ea306b1bf8e..dd3777ac0443 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -98,6 +98,19 @@ bool irq_fpu_usable(void)
>  }
>  EXPORT_SYMBOL(irq_fpu_usable);
>
> +/*
> + * Track AVX512 state use because it is known to slow the max clock
> + * speed of the core.
> + */
> +static void update_avx_timestamp(struct fpu *fpu)
> +{
> +
> +#define AVX512_TRACKING_MASK   (XFEATURE_MASK_ZMM_Hi256 | XFEATURE_MASK_Hi16_ZMM)
> +
> +       if (fpu->fpstate->regs.xsave.header.xfeatures & AVX512_TRACKING_MASK)
> +               fpu->avx512_timestamp = jiffies;
> +}
> +
>  /*
>   * Save the FPU register state in fpu->fpstate->regs. The register state is
>   * preserved.
> @@ -116,13 +129,7 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
>  {
>         if (likely(use_xsave())) {
>                 os_xsave(fpu->fpstate);
> -
> -               /*
> -                * AVX512 state is tracked here because its use is
> -                * known to slow the max clock speed of the core.
> -                */
> -               if (fpu->fpstate->regs.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
> -                       fpu->avx512_timestamp = jiffies;
> +               update_avx_timestamp(fpu);
>                 return;
>         }
>
> --
> 2.29.2
>
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [tip: x86/fpu] x86/fpu: Correct AVX512 state tracking
  2021-09-20  5:39 [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check Noah Goldstein
                   ` (5 preceding siblings ...)
  2021-10-27 18:21 ` [PATCH v5 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
@ 2021-11-17 18:29 ` tip-bot2 for Noah Goldstein
  6 siblings, 0 replies; 31+ messages in thread
From: tip-bot2 for Noah Goldstein @ 2021-11-17 18:29 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Noah Goldstein, Borislav Petkov, x86, linux-kernel

The following commit has been merged into the x86/fpu branch of tip:

Commit-ID:     0fe4ff885f8a50082d9dc241b657472894caba16
Gitweb:        https://git.kernel.org/tip/0fe4ff885f8a50082d9dc241b657472894caba16
Author:        Noah Goldstein <goldstein.w.n@gmail.com>
AuthorDate:    Tue, 16 Nov 2021 17:14:21 +01:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Tue, 16 Nov 2021 17:19:41 +01:00

x86/fpu: Correct AVX512 state tracking

Add a separate, local mask for tracking AVX512 usage which does not
include the opmask xfeature set. Opmask registers usage does not cause
frequency throttling so it is a completely unnecessary false positive.

While at it, carve it out into a separate function to keep that
abomination extracted out.

 [ bp: Rediff and cleanup ontop of 5.16-rc1. ]

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20210920053951.4093668-1-goldstein.w.n@gmail.com
---
 arch/x86/kernel/fpu/core.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 8ea306b..dd3777a 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -99,6 +99,19 @@ bool irq_fpu_usable(void)
 EXPORT_SYMBOL(irq_fpu_usable);
 
 /*
+ * Track AVX512 state use because it is known to slow the max clock
+ * speed of the core.
+ */
+static void update_avx_timestamp(struct fpu *fpu)
+{
+
+#define AVX512_TRACKING_MASK	(XFEATURE_MASK_ZMM_Hi256 | XFEATURE_MASK_Hi16_ZMM)
+
+	if (fpu->fpstate->regs.xsave.header.xfeatures & AVX512_TRACKING_MASK)
+		fpu->avx512_timestamp = jiffies;
+}
+
+/*
  * Save the FPU register state in fpu->fpstate->regs. The register state is
  * preserved.
  *
@@ -116,13 +129,7 @@ void save_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		os_xsave(fpu->fpstate);
-
-		/*
-		 * AVX512 state is tracked here because its use is
-		 * known to slow the max clock speed of the core.
-		 */
-		if (fpu->fpstate->regs.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-			fpu->avx512_timestamp = jiffies;
+		update_avx_timestamp(fpu);
 		return;
 	}
 

^ permalink raw reply related	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2021-11-17 18:29 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-20  5:39 [PATCH v1] x86/fpu: Remove opmask state from avx512_timestamp check Noah Goldstein
2021-09-27 18:02 ` Noah Goldstein
2021-10-13 22:36   ` Noah Goldstein
2021-10-14  8:28     ` Borislav Petkov
2021-10-14 15:49       ` Noah Goldstein
2021-10-15 14:40         ` Borislav Petkov
2021-10-15 15:25         ` Dave Hansen
2021-10-15 17:30           ` Noah Goldstein
2021-10-15 20:47 ` [PATCH v2 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
2021-10-26 23:15   ` Noah Goldstein
2021-10-15 20:47 ` [PATCH v2 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
2021-10-27 11:07   ` Borislav Petkov
2021-10-27 16:28     ` Noah Goldstein
2021-10-27 16:58       ` Borislav Petkov
2021-10-27 17:18         ` Noah Goldstein
2021-10-27 16:26 ` [PATCH v3 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
2021-10-27 16:26   ` [PATCH v3 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
2021-10-27 17:11     ` Borislav Petkov
2021-10-27 17:37       ` Noah Goldstein
2021-10-27 17:45         ` Borislav Petkov
2021-10-27 18:21           ` Noah Goldstein
2021-11-03 20:22             ` Noah Goldstein
2021-11-03 20:48               ` Borislav Petkov
2021-11-03 20:56                 ` Noah Goldstein
2021-10-27 17:17 ` [PATCH v4 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
2021-10-27 17:17   ` [PATCH v4 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
2021-10-27 18:21 ` [PATCH v5 1/2] x86/fpu: Add helper function for tracking AVX512 status Noah Goldstein
2021-10-27 18:21   ` [PATCH v5 2/2] x86/xstate: Make AVX512 status tracking more accurate Noah Goldstein
2021-11-16 16:21     ` [PATCH] x86/fpu: Correct AVX512 state tracking Borislav Petkov
2021-11-16 16:49       ` Noah Goldstein
2021-11-17 18:29 ` [tip: x86/fpu] " tip-bot2 for Noah Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.