linux-toolchains.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
       [not found]   ` <CAKwvOdnRvd5KK01awAyeyt5S36TPPW4_8Z6YL1r4gB-pBrHTbg@mail.gmail.com>
@ 2023-03-07 11:35     ` Peter Zijlstra
  2023-03-07 17:48       ` Linus Torvalds
  2023-03-07 18:43       ` Segher Boessenkool
  0 siblings, 2 replies; 18+ messages in thread
From: Peter Zijlstra @ 2023-03-07 11:35 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Stephane Eranian, linux-kernel, tony.luck, reinette.chatre,
	fenghua.yu, peternewman, james.morse, babu.moger, ananth.narayan,
	vschneid, Nathan Chancellor, clang-built-linux, Borislav Petkov,
	Linus Torvalds, H. Peter Anvin, linux-toolchains

On Mon, Mar 06, 2023 at 04:16:52PM -0800, Nick Desaulniers wrote:
> Start of Lore thread:
> https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@google.com/
> 
> On Mon, Mar 6, 2023 at 4:01 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Fri, Mar 03, 2023 at 03:11:33PM -0800, Stephane Eranian wrote:
> >
> > > The problem is located in the __resctrl_sched_in() routine which rewrites
> > > the active closid via the PQR_ASSOC register. Because this is an expensive
> > > operation, the kernel only does it when the context switch involves tasks
> > > with different CLOSID. And to check that, it needs to access the current
> > > task's closid field using current->closid. current is actually a macro
> > > that reads the per-cpu variable pcpu_hot.current_task.
> > >
> > > After an investigation by compiler experts, the problem has been tracked down
> > > to the usage of the get_current() macro in the __resctrl_sched_in() code and
> > > in particular the per-cpu macro:
> > >
> > > static __always_inline struct task_struct *get_current(void)
> > > {
> > >         return this_cpu_read_stable(pcpu_hot.current_task);
> > > }
> > >
> > > And as per percpu.h:
> > >
> > > /*
> > >  * this_cpu_read() makes gcc load the percpu variable every time it is
> > >  * accessed while this_cpu_read_stable() allows the value to be cached.
> > >  * this_cpu_read_stable() is more efficient and can be used if its value
> > >  * is guaranteed to be valid across cpus.  The current users include
> > >  * get_current() and get_thread_info() both of which are actually
> > >  * per-thread variables implemented as per-cpu variables and thus
> > >  * stable for the duration of the respective task.
> > >  */
> > >
> > > The _stable version of the macro allows the value to be cached, meaning it
> > > does not force a reload.
> >
> > Right, so afaict the difference between this_cpu_read() and
> > this_cpu_read_stable() is the volatile qualifier.
> >
> > this_cpu_read() is asm volatile(), while this_cpu_read_stable() and
> > raw_cpu_read() are both an unqualified asm().
> >
> > Now, afaiu we're inlining all of this into __switch_to(), which has
> > raw_cpu_write(pcpu_hot.current_task, next_p).
> >
> > And I suppose what the compiler is doing is lifting the 'current' load
> > over that store, but how is it allowed that? I thought C was supposed to
> > have PO consistency, That raw_cpu_write() should be seen as a store to
> > to pcpu_hot.current_task, why can it lift a load over the store?
> >
> > Specifically, percpu_to_op() has a "+m" output constaint while
> > percpu_stable_op() has a "p" input constraint on the same address.
> 
> I definitely think the issue is specific to "p" constraints.
> https://godbolt.org/z/34YeG6WbY is the test case I reduced which I
> think demonstrates the issue.
> 
> https://reviews.llvm.org/D145416
> -> click "Show Older Changes" for the ongoing discussion.

So per that summary, I'm going to nit-pick and state we very much want
CSE. CSE good. What we don't want it violating store-load ordering.

> I don't have a satisfactory answer yet, but am looking into this.

Oh, geez, what a twisty tale that... So Linus knew back in '09 that "p"
was icky, but it sorta was the only thing and it 'worked' -- until now
:/

Is there a way to explicitly order these things? barrier() obviously
isn't going to help here.

So ideally we'd get something that respects the whole store-load
ordering but still allows agressive CSE. And works for both toolchains.
Small ask, I know :-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 11:35     ` [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in Peter Zijlstra
@ 2023-03-07 17:48       ` Linus Torvalds
  2023-03-07 18:43       ` Segher Boessenkool
  1 sibling, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2023-03-07 17:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Stephane Eranian, linux-kernel, tony.luck,
	reinette.chatre, fenghua.yu, peternewman, james.morse,
	babu.moger, ananth.narayan, vschneid, Nathan Chancellor,
	clang-built-linux, Borislav Petkov, H. Peter Anvin,
	linux-toolchains

On Tue, Mar 7, 2023 at 3:36 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> > I don't have a satisfactory answer yet, but am looking into this.
>
> Oh, geez, what a twisty tale that... So Linus knew back in '09 that "p"
> was icky, but it sorta was the only thing and it 'worked' -- until now
> :/

Yeah, so 'p' definitely is about the pointer, and I do worry that it
is only a dependency on exactly that - not the memory behind it.

I have this dim memory about us having talked about this with some gcc
person, and coming to the conclusion that it was all fine, but I
suspect it was in some very specific case where it might have been
fine for other reasons.

> Is there a way to explicitly order these things? barrier() obviously
> isn't going to help here.

So one "asm volatile" should always be ordered wrt another "asm volatile".

I have this other dim memory of it not even being clear whether "asm"
and "asm volatile" are ordered. I don't think they necessarily are
(with the obvious caveat that an asm without any arguments - a
so-called "basic asm" - is always volatile whether the "volatile" is
there or not).

I have a lot of dim memories, in other words. Should that worry me?

And then there's the "memory" clobber, of course.

But both of those are also going to disable CSE.

I do think that percpu_stable_op can use "p", but only when the value
is *truly* stable and there is no question about it being moved around
things that might modify it.

                         Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 11:35     ` [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in Peter Zijlstra
  2023-03-07 17:48       ` Linus Torvalds
@ 2023-03-07 18:43       ` Segher Boessenkool
  2023-03-07 20:43         ` Jakub Jelinek
  1 sibling, 1 reply; 18+ messages in thread
From: Segher Boessenkool @ 2023-03-07 18:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Stephane Eranian, linux-kernel, tony.luck,
	reinette.chatre, fenghua.yu, peternewman, james.morse,
	babu.moger, ananth.narayan, vschneid, Nathan Chancellor,
	clang-built-linux, Borislav Petkov, Linus Torvalds,
	H. Peter Anvin, linux-toolchains

Hi!

On Tue, Mar 07, 2023 at 12:35:45PM +0100, Peter Zijlstra wrote:
> So per that summary, I'm going to nit-pick and state we very much want
> CSE. CSE good. What we don't want it violating store-load ordering.

So you need to describe exactly what you *do* want.  There is no way to
forbid most otherwise valid things.  But you can express pretty much all
dependencies.

> Oh, geez, what a twisty tale that... So Linus knew back in '09 that "p"
> was icky, but it sorta was the only thing and it 'worked' -- until now
> :/

The "p" constraint is just like any other address_constraint, in most
aspects.  Since this is very specific to "p", that limits what is going
on to really just one thing.

For "p", after reload, strict_memory_address_addr_space_p is used.  That
is, targetm.addr_space.legitimate_address_p with strict set to true.  On
x86 that limits what registers can be used?  So I guess that made things
accidentally work before?

> So ideally we'd get something that respects the whole store-load
> ordering but still allows agressive CSE. And works for both toolchains.
> Small ask, I know :-)

Well, what is the ordering you need respected, *exactly*?


Segher

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 18:43       ` Segher Boessenkool
@ 2023-03-07 20:43         ` Jakub Jelinek
  2023-03-07 20:54           ` Linus Torvalds
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Jelinek @ 2023-03-07 20:43 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Peter Zijlstra, Nick Desaulniers, Stephane Eranian, linux-kernel,
	tony.luck, reinette.chatre, fenghua.yu, peternewman, james.morse,
	babu.moger, ananth.narayan, vschneid, Nathan Chancellor,
	clang-built-linux, Borislav Petkov, Linus Torvalds,
	H. Peter Anvin, linux-toolchains

On Tue, Mar 07, 2023 at 12:43:16PM -0600, Segher Boessenkool wrote:
> On Tue, Mar 07, 2023 at 12:35:45PM +0100, Peter Zijlstra wrote:
> > So per that summary, I'm going to nit-pick and state we very much want
> > CSE. CSE good. What we don't want it violating store-load ordering.
> 
> So you need to describe exactly what you *do* want.  There is no way to
> forbid most otherwise valid things.  But you can express pretty much all
> dependencies.
> 
> > Oh, geez, what a twisty tale that... So Linus knew back in '09 that "p"
> > was icky, but it sorta was the only thing and it 'worked' -- until now
> > :/
> 
> The "p" constraint is just like any other address_constraint, in most
> aspects.  Since this is very specific to "p", that limits what is going
> on to really just one thing.

Are we actually talking here about "p" constraint or about p/P (x86) modifiers
(asm ("%p0" : : "i" (42));)?

	Jakub


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 20:43         ` Jakub Jelinek
@ 2023-03-07 20:54           ` Linus Torvalds
  2023-03-07 21:06             ` Linus Torvalds
  2023-03-07 21:11             ` Luck, Tony
  0 siblings, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2023-03-07 20:54 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Segher Boessenkool, Peter Zijlstra, Nick Desaulniers,
	Stephane Eranian, linux-kernel, tony.luck, reinette.chatre,
	fenghua.yu, peternewman, james.morse, babu.moger, ananth.narayan,
	vschneid, Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains

On Tue, Mar 7, 2023 at 12:43 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Are we actually talking here about "p" constraint or about p/P (x86) modifiers
> (asm ("%p0" : : "i" (42));)?

In this case it's actually the "p" constraint.

And the "percpu_stable_op()" thing uses it exactly because it thinks
it wants the "I don't care about data dependencies in memory, because
this memory location doesn't change".

Of course, that "this memory location doesn't change" is then always
technically a lie. It's exactly things like "current" that obviously
*does* change in the big picture, but from the context of a particular
_thread_, "current" is a fixed value.

Which then causes problems when you use that "percpu_stable_op()"
thing from within the scheduler (generally without being *aware* of
this issue at all, since the use is hidden a few macro expansions
down).

I think the problem is that the <asm/resctrl.h> code is disgusting and
horrible in multiple ways:

 (a) it shouldn't define and declare a static function in a header file

 (b) the resctrl_sched_in() inline function is midesigned to begin with

basically, the actual scheduling functions should *never* look at
"current" at all. They are - byu definition - changing it, and they
already *know* what both the "incoming current" (aka "prev_p") and
"new current" (aka "next_p") are.

So any scheduling function that uses "current" is hot garbage.

In this case, that hot garbage is resctrl_sched_in().

I do not believe this is a compiler issue. This is literally just a
kernel bug, and that resctrl code being very very wrong.

              Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 20:54           ` Linus Torvalds
@ 2023-03-07 21:06             ` Linus Torvalds
  2023-03-07 21:35               ` Luck, Tony
                                 ` (2 more replies)
  2023-03-07 21:11             ` Luck, Tony
  1 sibling, 3 replies; 18+ messages in thread
From: Linus Torvalds @ 2023-03-07 21:06 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Segher Boessenkool, Peter Zijlstra, Nick Desaulniers,
	Stephane Eranian, linux-kernel, tony.luck, reinette.chatre,
	fenghua.yu, peternewman, james.morse, babu.moger, ananth.narayan,
	vschneid, Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains

[-- Attachment #1: Type: text/plain, Size: 721 bytes --]

On Tue, Mar 7, 2023 at 12:54 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I think the problem is that the <asm/resctrl.h> code is disgusting and
> horrible in multiple ways:
>
>  (a) it shouldn't define and declare a static function in a header file
>
>  (b) the resctrl_sched_in() inline function is misdesigned to begin with

Ok, so here's a *ttoally* untested and mindless patch to maybe fix
what I dislike about that resctl code.

Does it fix the code generation issue? I have no idea. But this is
what I would suggest is the right answer, without actually knowing the
code any better, and just going on a mindless rampage.

It seems to compile for me, fwiw.

             Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 3368 bytes --]

 arch/x86/include/asm/resctrl.h         | 12 ++++++------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  4 ++--
 arch/x86/kernel/process_32.c           |  2 +-
 arch/x86/kernel/process_64.c           |  2 +-
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 52788f79786f..255a78d9d906 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -49,7 +49,7 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
  *   simple as possible.
  * Must be called with preemption disabled.
  */
-static void __resctrl_sched_in(void)
+static inline void __resctrl_sched_in(struct task_struct *tsk)
 {
 	struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
 	u32 closid = state->default_closid;
@@ -61,13 +61,13 @@ static void __resctrl_sched_in(void)
 	 * Else use the closid/rmid assigned to this cpu.
 	 */
 	if (static_branch_likely(&rdt_alloc_enable_key)) {
-		tmp = READ_ONCE(current->closid);
+		tmp = READ_ONCE(tsk->closid);
 		if (tmp)
 			closid = tmp;
 	}
 
 	if (static_branch_likely(&rdt_mon_enable_key)) {
-		tmp = READ_ONCE(current->rmid);
+		tmp = READ_ONCE(tsk->rmid);
 		if (tmp)
 			rmid = tmp;
 	}
@@ -88,17 +88,17 @@ static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
 	return val * scale;
 }
 
-static inline void resctrl_sched_in(void)
+static inline void resctrl_sched_in(struct task_struct *tsk)
 {
 	if (static_branch_likely(&rdt_enable_key))
-		__resctrl_sched_in();
+		__resctrl_sched_in(tsk);
 }
 
 void resctrl_cpu_detect(struct cpuinfo_x86 *c);
 
 #else
 
-static inline void resctrl_sched_in(void) {}
+static inline void resctrl_sched_in(struct task_struct *tsk) {}
 static inline void resctrl_cpu_detect(struct cpuinfo_x86 *c) {}
 
 #endif /* CONFIG_X86_CPU_RESCTRL */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e2c1599d1b37..884b6e9a7e31 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -314,7 +314,7 @@ static void update_cpu_closid_rmid(void *info)
 	 * executing task might have its own closid selected. Just reuse
 	 * the context switch code.
 	 */
-	resctrl_sched_in();
+	resctrl_sched_in(current);
 }
 
 /*
@@ -530,7 +530,7 @@ static void _update_task_closid_rmid(void *task)
 	 * Otherwise, the MSR is updated when the task is scheduled in.
 	 */
 	if (task == current)
-		resctrl_sched_in();
+		resctrl_sched_in(task);
 }
 
 static void update_task_closid_rmid(struct task_struct *t)
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 470c128759ea..708c87b88cc1 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -212,7 +212,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	switch_fpu_finish();
 
 	/* Load the Intel cache allocation PQR MSR. */
-	resctrl_sched_in();
+	resctrl_sched_in(next_p);
 
 	return prev_p;
 }
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4e34b3b68ebd..bb65a68b4b49 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -656,7 +656,7 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	}
 
 	/* Load the Intel cache allocation PQR MSR. */
-	resctrl_sched_in();
+	resctrl_sched_in(next_p);
 
 	return prev_p;
 }

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* RE: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 20:54           ` Linus Torvalds
  2023-03-07 21:06             ` Linus Torvalds
@ 2023-03-07 21:11             ` Luck, Tony
  2023-03-07 21:14               ` Linus Torvalds
  2023-03-07 21:16               ` Nick Desaulniers
  1 sibling, 2 replies; 18+ messages in thread
From: Luck, Tony @ 2023-03-07 21:11 UTC (permalink / raw)
  To: Torvalds, Linus, Jakub Jelinek
  Cc: Segher Boessenkool, Peter Zijlstra, Nick Desaulniers, Eranian,
	Stephane, linux-kernel, Chatre, Reinette, Yu, Fenghua,
	peternewman, james.morse, babu.moger, ananth.narayan, vschneid,
	Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains

> (a) it shouldn't define and declare a static function in a header file
>
>  (b) the resctrl_sched_in() inline function is midesigned to begin with

Fixing "b" would seem to be to just pass "next_p" to the function to
use instead of "current".

Can you expand about part "a" ... Linux has zillions of static inline functions
in header files to handle CONFIG options. One version is the real McCoy
while the other is just a stub for the CONFIG=n case.

What's different about this one?

-Tony

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:11             ` Luck, Tony
@ 2023-03-07 21:14               ` Linus Torvalds
  2023-03-07 21:23                 ` Luck, Tony
  2023-03-07 21:16               ` Nick Desaulniers
  1 sibling, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2023-03-07 21:14 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Jakub Jelinek, Segher Boessenkool, Peter Zijlstra,
	Nick Desaulniers, Eranian, Stephane, linux-kernel, Chatre,
	Reinette, Yu, Fenghua, peternewman, james.morse, babu.moger,
	ananth.narayan, vschneid, Nathan Chancellor, clang-built-linux,
	Borislav Petkov, H. Peter Anvin, linux-toolchains

On Tue, Mar 7, 2023 at 1:11 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> Can you expand about part "a" ... Linux has zillions of static inline functions
> in header files to handle CONFIG options. One version is the real McCoy
> while the other is just a stub for the CONFIG=n case.
>
> What's different about this one?

See the patch I just sent out.

Linux has a lot of "static inline" functions. But that's not at all
what that function was. It was literally just

  static void __resctrl_sched_in(..)

which is disgusting and very wrong.

I hope that compilers then just ignored it ("It's static and not used,
so I'm not generating that code"), and that header file isn't included
in very many places, but it's still very wrong.

               Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:11             ` Luck, Tony
  2023-03-07 21:14               ` Linus Torvalds
@ 2023-03-07 21:16               ` Nick Desaulniers
  2023-03-07 21:19                 ` Linus Torvalds
  1 sibling, 1 reply; 18+ messages in thread
From: Nick Desaulniers @ 2023-03-07 21:16 UTC (permalink / raw)
  To: Luck, Tony, Torvalds, Linus
  Cc: Jakub Jelinek, Segher Boessenkool, Peter Zijlstra, Eranian,
	Stephane, linux-kernel, Chatre, Reinette, Yu, Fenghua,
	peternewman, james.morse, babu.moger, ananth.narayan, vschneid,
	Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains

On Tue, Mar 7, 2023 at 1:11 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> > (a) it shouldn't define and declare a static function in a header file
> >
> >  (b) the resctrl_sched_in() inline function is midesigned to begin with
>
> Fixing "b" would seem to be to just pass "next_p" to the function to
> use instead of "current".

```
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 52788f79786f..f46c0b97334d 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -49,7 +49,7 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
  *   simple as possible.
  * Must be called with preemption disabled.
  */
-static void __resctrl_sched_in(void)
+static void __resctrl_sched_in(struct task_struct *next)
 {
  struct resctrl_pqr_state *state = this_cpu_ptr(&pqr_state);
  u32 closid = state->default_closid;
@@ -61,13 +61,13 @@ static void __resctrl_sched_in(void)
  * Else use the closid/rmid assigned to this cpu.
  */
  if (static_branch_likely(&rdt_alloc_enable_key)) {
- tmp = READ_ONCE(current->closid);
+ tmp = READ_ONCE(next->closid);
  if (tmp)
  closid = tmp;
  }

  if (static_branch_likely(&rdt_mon_enable_key)) {
- tmp = READ_ONCE(current->rmid);
+ tmp = READ_ONCE(next->rmid);
  if (tmp)
  rmid = tmp;
  }
@@ -88,17 +88,17 @@ static inline unsigned int
resctrl_arch_round_mon_val(unsigned int val)
  return val * scale;
 }

-static inline void resctrl_sched_in(void)
+static inline void resctrl_sched_in(struct task_struct *next)
 {
  if (static_branch_likely(&rdt_enable_key))
- __resctrl_sched_in();
+ __resctrl_sched_in(next);
 }

 void resctrl_cpu_detect(struct cpuinfo_x86 *c);

 #else

-static inline void resctrl_sched_in(void) {}
+static inline void resctrl_sched_in(struct task_struct *next) {}
 static inline void resctrl_cpu_detect(struct cpuinfo_x86 *c) {}

 #endif /* CONFIG_X86_CPU_RESCTRL */
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e2c1599d1b37..d970347838a4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -314,7 +314,7 @@ static void update_cpu_closid_rmid(void *info)
  * executing task might have its own closid selected. Just reuse
  * the context switch code.
  */
- resctrl_sched_in();
+ resctrl_sched_in(current);
 }

 /*
@@ -530,7 +530,7 @@ static void _update_task_closid_rmid(void *task)
  * Otherwise, the MSR is updated when the task is scheduled in.
  */
  if (task == current)
- resctrl_sched_in();
+ resctrl_sched_in(current);
 }

 static void update_task_closid_rmid(struct task_struct *t)
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 470c128759ea..708c87b88cc1 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -212,7 +212,7 @@ __switch_to(struct task_struct *prev_p, struct
task_struct *next_p)
  switch_fpu_finish();

  /* Load the Intel cache allocation PQR MSR. */
- resctrl_sched_in();
+ resctrl_sched_in(next_p);

  return prev_p;
 }
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4e34b3b68ebd..bb65a68b4b49 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -656,7 +656,7 @@ __switch_to(struct task_struct *prev_p, struct
task_struct *next_p)
  }

  /* Load the Intel cache allocation PQR MSR. */
- resctrl_sched_in();
+ resctrl_sched_in(next_p);

  return prev_p;
 }

```
?

>
> Can you expand about part "a" ... Linux has zillions of static inline functions
> in header files to handle CONFIG options. One version is the real McCoy
> while the other is just a stub for the CONFIG=n case.

Right, I had the same question.

Perhaps it's more so that no one calls __resctrl_sched_in, only
resctrl_sched_in, therefor they should be folded into one function?

>
> What's different about this one?
>
> -Tony



-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:16               ` Nick Desaulniers
@ 2023-03-07 21:19                 ` Linus Torvalds
  2023-03-07 21:22                   ` Nick Desaulniers
  0 siblings, 1 reply; 18+ messages in thread
From: Linus Torvalds @ 2023-03-07 21:19 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Luck, Tony, Jakub Jelinek, Segher Boessenkool, Peter Zijlstra,
	Eranian, Stephane, linux-kernel, Chatre, Reinette, Yu, Fenghua,
	peternewman, james.morse, babu.moger, ananth.narayan, vschneid,
	Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains

On Tue, Mar 7, 2023 at 1:16 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> >
> > Can you expand about part "a" ... Linux has zillions of static inline functions
> > in header files to handle CONFIG options. One version is the real McCoy
> > while the other is just a stub for the CONFIG=n case.
>
> Right, I had the same question.
>
> Perhaps it's more so that no one calls __resctrl_sched_in, only
> resctrl_sched_in, therefor they should be folded into one function?

If you think it should be inlined, it should be marked as such.

And if you *don't* think it should be inlined - quite possibly the
right answer here - the definition of that function damn well
shouldn't be in a header file.

So either way, that __resctrl_sched_in() was wrong, wrong, wrong.

           Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:19                 ` Linus Torvalds
@ 2023-03-07 21:22                   ` Nick Desaulniers
  0 siblings, 0 replies; 18+ messages in thread
From: Nick Desaulniers @ 2023-03-07 21:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Luck, Tony, Jakub Jelinek, Segher Boessenkool, Peter Zijlstra,
	Eranian, Stephane, linux-kernel, Chatre, Reinette, Yu, Fenghua,
	peternewman, james.morse, babu.moger, ananth.narayan, vschneid,
	Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains

On Tue, Mar 7, 2023 at 1:19 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Tue, Mar 7, 2023 at 1:16 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> >
> > >
> > > Can you expand about part "a" ... Linux has zillions of static inline functions
> > > in header files to handle CONFIG options. One version is the real McCoy
> > > while the other is just a stub for the CONFIG=n case.
> >
> > Right, I had the same question.
> >
> > Perhaps it's more so that no one calls __resctrl_sched_in, only
> > resctrl_sched_in, therefor they should be folded into one function?
>
> If you think it should be inlined, it should be marked as such.

Yep, sorry, I missed that `inline` was missing from that definition!

>
> And if you *don't* think it should be inlined - quite possibly the
> right answer here - the definition of that function damn well
> shouldn't be in a header file.
>
> So either way, that __resctrl_sched_in() was wrong, wrong, wrong.
>
>            Linus
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:14               ` Linus Torvalds
@ 2023-03-07 21:23                 ` Luck, Tony
  2023-03-08  0:36                   ` Luck, Tony
  0 siblings, 1 reply; 18+ messages in thread
From: Luck, Tony @ 2023-03-07 21:23 UTC (permalink / raw)
  To: Torvalds, Linus
  Cc: Jakub Jelinek, Segher Boessenkool, Peter Zijlstra,
	Nick Desaulniers, Eranian, Stephane, linux-kernel, Chatre,
	Reinette, Yu, Fenghua, peternewman, james.morse, babu.moger,
	ananth.narayan, vschneid, Nathan Chancellor, clang-built-linux,
	Borislav Petkov, H. Peter Anvin, linux-toolchains

> Linux has a lot of "static inline" functions. But that's not at all
> what that function was. It was literally just
>
>   static void __resctrl_sched_in(..)
>
> which is disgusting and very wrong.

Ah. I was looking at:

static inline void resctrl_sched_in(void)
{
        if (static_branch_likely(&rdt_enable_key))
                __resctrl_sched_in();
}

which does have the "inline" attribute.

I wonder if checkpatch could catch missing "inline" on static
function definitions in header files?

-Tony

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:06             ` Linus Torvalds
@ 2023-03-07 21:35               ` Luck, Tony
  2023-03-07 21:58                 ` Nick Desaulniers
  2023-03-08  6:13               ` Stephane Eranian
  2023-03-08 16:02               ` Moger, Babu
  2 siblings, 1 reply; 18+ messages in thread
From: Luck, Tony @ 2023-03-07 21:35 UTC (permalink / raw)
  To: Torvalds, Linus, Jakub Jelinek
  Cc: Segher Boessenkool, Peter Zijlstra, Nick Desaulniers, Eranian,
	Stephane, linux-kernel, Chatre, Reinette, Yu, Fenghua,
	peternewman, james.morse, babu.moger, ananth.narayan, vschneid,
	Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains

> Ok, so here's a *ttoally* untested and mindless patch to maybe fix
> what I dislike about that resctl code.
>
> Does it fix the code generation issue? I have no idea. But this is
> what I would suggest is the right answer, without actually knowing the
> code any better, and just going on a mindless rampage.
>
> It seems to compile for me, fwiw.

Beyond compiling it boots and passes the tools/testing/selftests/resctrl test suite.

Tested-by: Tony Luck <tony.luck@intel.com>

-Tony

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:35               ` Luck, Tony
@ 2023-03-07 21:58                 ` Nick Desaulniers
  0 siblings, 0 replies; 18+ messages in thread
From: Nick Desaulniers @ 2023-03-07 21:58 UTC (permalink / raw)
  To: Torvalds, Linus
  Cc: Jakub Jelinek, Segher Boessenkool, Peter Zijlstra, Eranian,
	Stephane, linux-kernel, Chatre, Reinette, Yu, Fenghua,
	peternewman, james.morse, babu.moger, ananth.narayan, vschneid,
	Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains, Luck, Tony

On Tue, Mar 7, 2023 at 1:35 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> > Ok, so here's a *ttoally* untested and mindless patch to maybe fix
> > what I dislike about that resctl code.
> >
> > Does it fix the code generation issue? I have no idea. But this is
> > what I would suggest is the right answer, without actually knowing the
> > code any better, and just going on a mindless rampage.
> >
> > It seems to compile for me, fwiw.
>
> Beyond compiling it boots and passes the tools/testing/selftests/resctrl test suite.
>
> Tested-by: Tony Luck <tony.luck@intel.com>

LGTM; reloading of current becomes irrelevant now that we're reusing
the existing variable `next_p`.

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

Might be nice to tag this for stable.

Cc: <stable@vger.kernel.org>

And credit Stephane who did a nice job tracking this down and having a
concise reproducer.

Reported-by: Stephane Eranian <eranian@google.com>

Perhaps relevant links

Link: https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@google.com/
Link: https://lore.kernel.org/lkml/alpine.LFD.2.01.0908011214330.3304@localhost.localdomain/

Consider reusing parts of Stephane's message from the initial Link above?

Thanks for the patch.

---

While this issue was specific to the usage of `current` (implemented
as a macro that uses `this_cpu_read_stable`, I wonder if we might hit
issues again in the future (at least on x86 using the "p" constraint)
in code that:
1. uses this_cpu_read_stable to access a per cpu variable
2. updates that per cpu variable
3. uses this_cpu_read_stable to access the variable hoping to get the
new value rather than the old.

I guess that this_cpu_read should be used rather than
this_cpu_read_stable? Maybe we can beef up the comment in
arch/x86/include/asm/percpu.h to warn about this? The sentence about
get_thread_info() being a user of this_cpu_read_stable() seems
outdated/due for a refresh?

Is __switch_to the only place that should be updating current?  Are
there other variables other than current that might be afflicted by
that 1,2,3 pattern I mention above?

current_top_of_stack() uses this_cpu_read_stable() for instance.
Perhaps there could be a caller that measures the stack depth, grows
the stack, then rereads the wrong value?
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:23                 ` Luck, Tony
@ 2023-03-08  0:36                   ` Luck, Tony
  0 siblings, 0 replies; 18+ messages in thread
From: Luck, Tony @ 2023-03-08  0:36 UTC (permalink / raw)
  To: Torvalds, Linus
  Cc: Jakub Jelinek, Segher Boessenkool, Peter Zijlstra,
	Nick Desaulniers, Eranian, Stephane, linux-kernel, Chatre,
	Reinette, Yu, Fenghua, peternewman, james.morse, babu.moger,
	ananth.narayan, vschneid, Nathan Chancellor, clang-built-linux,
	Borislav Petkov, H. Peter Anvin, linux-toolchains

> I wonder if checkpatch could catch missing "inline" on static
> function definitions in header files?

Some casual use of grep shows that resctrl isn't the only offender.

There are many non-inline static functions in header files.

A few hundred scattered across core kernel code, drivers and most
architectures. E.g. a dozen in arch/x86/include/asm/floppy.h.

static irqreturn_t floppy_hardint(int irq, void *dev_id)
static void fd_disable_dma(void)
static int vdma_request_dma(unsigned int dmanr, const char *device_id)
static void vdma_nop(unsigned int dummy)
static int vdma_get_dma_residue(unsigned int dummy)
static int fd_request_irq(void)
static unsigned long dma_mem_alloc(unsigned long size)
static unsigned long vdma_mem_alloc(unsigned long size)
static void _fd_dma_mem_free(unsigned long addr, unsigned long size)
static void _fd_chose_dma_mode(char *addr, unsigned long size)
static int vdma_dma_setup(char *addr, unsigned long size, int mode, int io)
static int hard_dma_setup(char *addr, unsigned long size, int mode, int io)

-Tony

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:06             ` Linus Torvalds
  2023-03-07 21:35               ` Luck, Tony
@ 2023-03-08  6:13               ` Stephane Eranian
  2023-03-08 23:25                 ` Linus Torvalds
  2023-03-08 16:02               ` Moger, Babu
  2 siblings, 1 reply; 18+ messages in thread
From: Stephane Eranian @ 2023-03-08  6:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jakub Jelinek, Segher Boessenkool, Peter Zijlstra,
	Nick Desaulniers, linux-kernel, tony.luck, reinette.chatre,
	fenghua.yu, peternewman, james.morse, babu.moger, ananth.narayan,
	vschneid, Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains

On Tue, Mar 7, 2023 at 1:06 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Tue, Mar 7, 2023 at 12:54 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > I think the problem is that the <asm/resctrl.h> code is disgusting and
> > horrible in multiple ways:
> >
> >  (a) it shouldn't define and declare a static function in a header file
> >
> >  (b) the resctrl_sched_in() inline function is misdesigned to begin with
>
> Ok, so here's a *ttoally* untested and mindless patch to maybe fix
> what I dislike about that resctl code.
>
> Does it fix the code generation issue? I have no idea. But this is
> what I would suggest is the right answer, without actually knowing the
> code any better, and just going on a mindless rampage.
>
> It seems to compile for me, fwiw.
>

On Tue, Mar 7, 2023 at 3:01 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Tue, Mar 7, 2023 at 2:56 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Tue, Mar 7, 2023 at 2:03 PM Nick Desaulniers <ndesaulniers@google.com> wrote:
> > >
> > > Sounds like Stephane is going to re-run the internal tests he used to
> > > discover the issue with your diff applied, if you don't mind holding
> > > out for another Tested-by tag. EOM
> >
> > Ack. I am in no hurry.
> >
> > In fact, I'd prefer to just get the patch sent back to me with a
> > commit message too, if somebody has the energy. I don't need the
> > credit for a trivial thing like that.
>
> Sure, then maybe Stephane you can supply a v2 with updated commit message and a
>
> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
>
I verified Linus' patch on my test case on AMD Zen3 and it works as
 expected, i.e., the limit is enforced. I had tried a similar approach myself
as well and it worked.

I think passing the task pointer is the proper approach because we are
in a sched_in routine
and I would expect the task scheduled in to be passed as argument
instead of having the function
retrieve it from the current pointer.

Thanks.

Tested-by: Stephane Eranian <eranian@google.com>

>              Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-07 21:06             ` Linus Torvalds
  2023-03-07 21:35               ` Luck, Tony
  2023-03-08  6:13               ` Stephane Eranian
@ 2023-03-08 16:02               ` Moger, Babu
  2 siblings, 0 replies; 18+ messages in thread
From: Moger, Babu @ 2023-03-08 16:02 UTC (permalink / raw)
  To: Linus Torvalds, Jakub Jelinek
  Cc: Segher Boessenkool, Peter Zijlstra, Nick Desaulniers,
	Stephane Eranian, linux-kernel, tony.luck, reinette.chatre,
	fenghua.yu, peternewman, james.morse, ananth.narayan, vschneid,
	Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains



On 3/7/23 15:06, Linus Torvalds wrote:
> On Tue, Mar 7, 2023 at 12:54 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> I think the problem is that the <asm/resctrl.h> code is disgusting and
>> horrible in multiple ways:
>>
>>  (a) it shouldn't define and declare a static function in a header file
>>
>>  (b) the resctrl_sched_in() inline function is misdesigned to begin with
> 
> Ok, so here's a *ttoally* untested and mindless patch to maybe fix
> what I dislike about that resctl code.
> 
> Does it fix the code generation issue? I have no idea. But this is
> what I would suggest is the right answer, without actually knowing the
> code any better, and just going on a mindless rampage.
> 
> It seems to compile for me, fwiw.
> 
>              Linus

Tested both on GCC and CLANG. Test passes and resctrl limits are working
fine. Thanks for the patch.

Tested-by: Babu Moger <babu.moger@amd.com>

-- 
Thanks
Babu Moger

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in
  2023-03-08  6:13               ` Stephane Eranian
@ 2023-03-08 23:25                 ` Linus Torvalds
  0 siblings, 0 replies; 18+ messages in thread
From: Linus Torvalds @ 2023-03-08 23:25 UTC (permalink / raw)
  To: Stephane Eranian
  Cc: Jakub Jelinek, Segher Boessenkool, Peter Zijlstra,
	Nick Desaulniers, linux-kernel, tony.luck, reinette.chatre,
	fenghua.yu, peternewman, james.morse, babu.moger, ananth.narayan,
	vschneid, Nathan Chancellor, clang-built-linux, Borislav Petkov,
	H. Peter Anvin, linux-toolchains

On Tue, Mar 7, 2023 at 10:13 PM Stephane Eranian <eranian@google.com> wrote:
>
> Tested-by: Stephane Eranian <eranian@google.com>

It's out as commit 7fef09970252 ("x86/resctl: fix scheduler confusion
with 'current'") if anybody cares.

           Linus

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2023-03-08 23:33 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20230303231133.1486085-1-eranian@google.com>
     [not found] ` <20230306120106.GE1267364@hirez.programming.kicks-ass.net>
     [not found]   ` <CAKwvOdnRvd5KK01awAyeyt5S36TPPW4_8Z6YL1r4gB-pBrHTbg@mail.gmail.com>
2023-03-07 11:35     ` [PATCH] x86/resctrl: avoid compiler optimization in __resctrl_sched_in Peter Zijlstra
2023-03-07 17:48       ` Linus Torvalds
2023-03-07 18:43       ` Segher Boessenkool
2023-03-07 20:43         ` Jakub Jelinek
2023-03-07 20:54           ` Linus Torvalds
2023-03-07 21:06             ` Linus Torvalds
2023-03-07 21:35               ` Luck, Tony
2023-03-07 21:58                 ` Nick Desaulniers
2023-03-08  6:13               ` Stephane Eranian
2023-03-08 23:25                 ` Linus Torvalds
2023-03-08 16:02               ` Moger, Babu
2023-03-07 21:11             ` Luck, Tony
2023-03-07 21:14               ` Linus Torvalds
2023-03-07 21:23                 ` Luck, Tony
2023-03-08  0:36                   ` Luck, Tony
2023-03-07 21:16               ` Nick Desaulniers
2023-03-07 21:19                 ` Linus Torvalds
2023-03-07 21:22                   ` Nick Desaulniers

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).