linux-kernel.vger.kernel.org archive mirror
* Loongson (and other $ARCHs?) idle VS timer enqueue
@ 2023-04-21 12:36 Frederic Weisbecker
  2023-04-21 14:24 ` Peter Zijlstra
  2023-04-21 15:24 ` Thomas Gleixner
  0 siblings, 2 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2023-04-21 12:36 UTC (permalink / raw)
  To: Huacai Chen, WANG Xuerui
  Cc: Peter Zijlstra, Thomas Gleixner, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

Hi,

I'm looking at the __arch_cpu_idle() implementation in Loongarch
and I'm wondering about the rollback code. I don't understand well
that code but with the help from PeterZ I might have seen something,
so tell me if I'm wrong: when an interrupt happens within
__arch_cpu_idle(), handle_vint() rolls back the return value to the
beginning of __arch_cpu_idle(), so that TIF_NEED_RESCHED is checked
again. Is that correct?

Because if an interrupt fires while in __arch_cpu_idle(), that IRQ
might enqueue a new timer and that new timer needs to be reprogrammed
from the main idle loop and just checking TIF_NEED_RESCHED doesn't
tell about that information.

More generally IRQs must _not_ be re-enabled between cpuidle_select()
(or just tick_nohz_idle_stop_tick() if no cpuidle support) and the
last halting ASM instruction. If that happens there must be
a mechanism to cope with that and make sure we return to the main
idle loop.

If that mechanism has to go through rollback (I wish your arch allowed
you to find a simpler and less error-prone mechanism, though), then the
rollback must actually fast-forward to after the halting instruction
so that the main idle loop re-checks the timers. But then
__arch_cpu_idle() alone is not enough to be part of the fast-forward
section, it has to start before the raw_local_irq_enable() in
arch_cpu_idle().

Another way to cope with this would be to have:

#define TIF_IDLE_TIMER	 ...
#define TIF_IDLE_EXIT	 (TIF_NEED_RESCHED | TIF_IDLE_TIMER)
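
For instance, a minimal sketch (the helper names and the hook placement
are made up for illustration; the checks would of course use the _TIF_
masks):

#define _TIF_IDLE_TIMER		(1 << TIF_IDLE_TIMER)
#define _TIF_IDLE_EXIT		(_TIF_NEED_RESCHED | _TIF_IDLE_TIMER)

/* called from the timer enqueue path when the target CPU is idle */
static void idle_timer_enqueued(int cpu)
{
	set_tsk_thread_flag(idle_task(cpu), TIF_IDLE_TIMER);
}

/* the arch idle code would then check both bits before halting */
static bool idle_must_exit(void)
{
	return !!(read_thread_flags() & _TIF_IDLE_EXIT);
}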

And set that from the timer enqueue in idle time and check TIF_IDLE_EXIT
on idle callback. It depends how many architectures are concerned by this.
All I know so far is:

* mwait-based mechanisms should be fine if called with IRQs disabled (or
sti is called right before) but then we must be sure that IRQs have never
been re-enabled, even temporarily, between cpuidle_select() and mwait. How
to detect that kind of mistake?

* wfi-based mechanisms look fine, but again we must make sure IRQs have
never been re-enabled.

* sti;hlt should be fine but again...

* Need to check all other archs

I'm trying to find an automated way to debug this kind of issue but it's not
easy...

Thanks.


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-21 12:36 Loongson (and other $ARCHs?) idle VS timer enqueue Frederic Weisbecker
@ 2023-04-21 14:24 ` Peter Zijlstra
  2023-04-21 16:47   ` Frederic Weisbecker
  2023-04-21 15:24 ` Thomas Gleixner
  1 sibling, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2023-04-21 14:24 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Huacai Chen, WANG Xuerui, Thomas Gleixner, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Fri, Apr 21, 2023 at 02:36:52PM +0200, Frederic Weisbecker wrote:
> Hi,
> 
> I'm looking at the __arch_cpu_idle() implementation in Loongarch
> and I'm wondering about the rollback code. I don't understand well
> that code but with the help from PeterZ I might have seen something,
> so tell me if I'm wrong: when an interrupt happens within
> __arch_cpu_idle(), handle_vint() rolls back the return value to the
> beginning of __arch_cpu_idle(), so that TIF_NEED_RESCHED is checked
> again. Is that correct?

Loongson copied this crap from MIPS, so they are directly affected too.

> Because if an interrupt fires while in __arch_cpu_idle(), that IRQ
> might enqueue a new timer and that new timer needs to be reprogrammed
> from the main idle loop and just checking TIF_NEED_RESCHED doesn't
> tell about that information.

Notably; this is only relevant to NOHZ, right?

> 
> More generally IRQs must _not_ be re-enabled between cpuidle_select()
> (or just tick_nohz_idle_stop_tick() if no cpuidle support) and the
> last halting ASM instruction. If that happens there must be
> a mechanism to cope with that and make sure we return to the main
> idle loop.
> 
> If that mechanism has to go through rollback (I wish your arch allowed
> you to find a simpler and less error-prone mechanism, though), then the
> rollback must actually fast-forward to after the halting instruction
> so that the main idle loop re-checks the timers. But then
> __arch_cpu_idle() alone is not enough to be part of the fast-forward
> section, it has to start before the raw_local_irq_enable() in
> arch_cpu_idle().
> 
> Another way to cope with this would be to have:
> 
> #define TIF_IDLE_TIMER	 ...
> #define TIF_IDLE_EXIT	 (TIF_NEED_RESCHED | TIF_IDLE_TIMER)
> 
> And set that from the timer enqueue in idle time and check TIF_IDLE_EXIT
> on idle callback. It depends how many architectures are concerned by this.
> All I know so far is:

The alternative is changing kernel/entry/common.c:irqentry_exit() to add
a nohz callback next to ct_irq_exit(), and have that reprogram the timer
if/when we're in NOHZ mode.

> * mwait-based mechanisms should be fine if called with IRQs disabled (or
> sti is called right before) but then we must be sure that IRQs have never
> been re-enabled, even temporarily, between cpuidle_select() and mwait. How
> to detect that kind of mistake?

mwait_idle_with_hints() is unaffected since it will idle with IRQs
disabled and wait on IRQ-pending (without taking the interrupt).

*HOWEVER*

intel_idle_irq() is affected -- except that we only (normally) use that
for very shallow idle states and it won't interact with NOHZ (because we
only disable the tick for deep idle states).

sti_mwait() relies on the STI shadow, just like hlt below.

> * wfi-based mechanisms look fine, but again we must make sure IRQs have
> never been re-enabled.

Yes, arm64 WFI has interrupts disabled and will wake on IRQ-pending.

> * sti;hlt should be fine but again...

Indeed, STI has a shadow where the effect of enabling interrupts lags
one instruction, so the HLT being in that shadow actually still runs
with IRQs disabled. Any interrupt will then hit while HLT is in effect
and wake us up.
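
FWIW that's essentially what the kernel's native_safe_halt() boils down
to; a minimal sketch:

static __always_inline void safe_halt_sketch(void)
{
	/*
	 * HLT sits in the one-instruction STI shadow: IRQs are still
	 * masked when HLT starts executing, so a pending interrupt can
	 * only be taken once the CPU has actually halted -- at which
	 * point it also wakes the CPU. No lost-wakeup window between
	 * the idle checks and the halt.
	 */
	asm volatile("sti; hlt" ::: "memory");
}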

> * Need to check all other archs
> 
> I'm trying to find an automated way to debug this kind of issue but it's not
> easy...

Yeah, too much arch code :/ Easiest might be to check if our idea of
where the timer should be and the hardware agree on IRQ entry or so --
*any* IRQ. That will miss a lot of cases, but at least it's something.
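
Something like this entirely hypothetical check on the IRQ entry path,
say -- expected_next_event() is invented, it would have to come from
the nohz code:

	/* hypothetical debug check, run on every IRQ entry */
	if (is_idle_task(current) && tick_nohz_tick_stopped()) {
		struct clock_event_device *evt =
			this_cpu_ptr(&tick_cpu_device)->evtdev;

		/* expected_next_event() is a made-up helper */
		WARN_ONCE(evt->next_event != expected_next_event(),
			  "idle: timer hardware not programmed as expected");
	}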


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-21 12:36 Loongson (and other $ARCHs?) idle VS timer enqueue Frederic Weisbecker
  2023-04-21 14:24 ` Peter Zijlstra
@ 2023-04-21 15:24 ` Thomas Gleixner
  2023-04-21 16:55   ` Frederic Weisbecker
  2023-04-22  8:17   ` Peter Zijlstra
  1 sibling, 2 replies; 23+ messages in thread
From: Thomas Gleixner @ 2023-04-21 15:24 UTC (permalink / raw)
  To: Frederic Weisbecker, Huacai Chen, WANG Xuerui
  Cc: Peter Zijlstra, Rafael J. Wysocki, Anna-Maria Behnsen, LKML

On Fri, Apr 21 2023 at 14:36, Frederic Weisbecker wrote:
> I'm looking at the __arch_cpu_idle() implementation in Loongarch
> and I'm wondering about the rollback code. I don't understand well
> that code but with the help from PeterZ I might have seen something,
> so tell me if I'm wrong: when an interrupt happens within
> __arch_cpu_idle(), handle_vint() rolls back the return value to the
> beginning of __arch_cpu_idle(), so that TIF_NEED_RESCHED is checked
> again. Is that correct?
>
> Because if an interrupt fires while in __arch_cpu_idle(), that IRQ
> might enqueue a new timer and that new timer needs to be reprogrammed
> from the main idle loop and just checking TIF_NEED_RESCHED doesn't
> tell about that information.

The check for TIF_NEED_RESCHED as a loop termination condition is simply
wrong. The architecture is not supposed to loop in arch_cpu_idle().

That loop is from Linux 0.9 days. Seems MIPS::__r4k_wait() and
loongarch, which copied that muck, are still stuck in the 1990s.

It has to return when an interrupt brings it out of the "idle wait"
instruction.
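
I.e. the generic code expects roughly this (a simplified sketch of the
do_idle()/default_idle_call() path, not the literal code):

	while (!need_resched()) {
		local_irq_disable();
		/*
		 * Halts until the next interrupt, and must return after
		 * that single wakeup, with IRQs disabled again, so the
		 * generic loop can re-evaluate TIF flags and timers.
		 */
		arch_cpu_idle();
	}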

The special cases are mwait()-like mechanisms which also return when a
monitored cacheline is written to. x86 uses that to spare the resched
IPI as MWAIT will return when TIF_NEED_RESCHED is set by a remote CPU.

> More generally IRQs must _not_ be re-enabled between cpuidle_select()
> (or just tick_nohz_idle_stop_tick() if no cpuidle support) and the
> last halting ASM instruction. If that happens there must be
> a mechanism to cope with that and make sure we return to the main
> idle loop.

No. arch_cpu_idle() can safely re-enable interrupts when the "wait"
instruction requires that. It then has to disable interrupts before
returning.

x86 default_idle() does: STI; HLT; CLI; That's perfectly fine because
the effect of STI is delayed to the HLT instruction boundary.

> Another way to cope with this would be to have:
>
> #define TIF_IDLE_TIMER	 ...
> #define TIF_IDLE_EXIT	 (TIF_NEED_RESCHED | TIF_IDLE_TIMER)

There is absolutely no need for this. arch_cpu_idle() has to return
after an interrupt, period. If MIPS/loongarch cannot do that then they
can have their private interrupt counter in that magic rollback ASM to
check for. But we really don't need a TIF flag which makes the (hr)timer
enqueue path more complex.
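
(Conceptually something like this -- a made-up C sketch of the counter
idea, even though the real thing would live in the rollback ASM:)

	/* hypothetical per-CPU counter, bumped from the arch IRQ entry */
	DEFINE_PER_CPU(unsigned long, idle_irq_count);

	/* @snap: counter value sampled by the caller, IRQs still disabled */
	void __arch_cpu_idle(unsigned long snap)
	{
		arch_local_irq_enable();
		/*
		 * Start of rollback region: an IRQ taken anywhere in here
		 * restarts execution at the check below, which then fails.
		 */
		if (__this_cpu_read(idle_irq_count) != snap)
			return;			/* an IRQ already happened */
		idle();				/* the halting instruction */
		/* end of rollback region */
	}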

> I'm trying to find an automated way to debug this kind of issue but
> it's not easy...

It's far from trivial because you'd need correlation between the
interrupt entry and the entry to and return from arch_cpu_idle().

I fear manual inspection is the main tool here :(

Thanks,

        tglx


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-21 14:24 ` Peter Zijlstra
@ 2023-04-21 16:47   ` Frederic Weisbecker
  2023-04-22  8:08     ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Frederic Weisbecker @ 2023-04-21 16:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Huacai Chen, WANG Xuerui, Thomas Gleixner, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Fri, Apr 21, 2023 at 04:24:46PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 21, 2023 at 02:36:52PM +0200, Frederic Weisbecker wrote:
> > Hi,
> > 
> > I'm looking at the __arch_cpu_idle() implementation in Loongarch
> > and I'm wondering about the rollback code. I don't understand well
> > that code but with the help from PeterZ I might have seen something,
> > so tell me if I'm wrong: when an interrupt happens within
> > __arch_cpu_idle(), handle_vint() rolls back the return value to the
> > beginning of __arch_cpu_idle(), so that TIF_NEED_RESCHED is checked
> > again. Is that correct?
> 
> Loongson copied this crap from MIPS, so they are direct affected too.

Right.

> 
> > Because if an interrupt fires while in __arch_cpu_idle(), that IRQ
> > might enqueue a new timer and that new timer needs to be reprogrammed
> > from the main idle loop and just checking TIF_NEED_RESCHED doesn't
> > tell about that information.
> 
> Notably; this is only relevant to NOHZ, right?

Indeed.

> > And set that from the timer enqueue in idle time and check TIF_IDLE_EXIT
> > on idle callback. It depends how many architectures are concerned by this.
> > All I know so far is:
> 
> The alternative is changing kernel/entry/common.c:irqentry_exit() to add
> a nohz callback next to ct_irq_exit(), and have that reprogram the timer
> if/when we're in NOHZ mode.

We used to do that but Rafael rewrote the thing a few years ago in order for
the cpuidle governor to know about the next timer event, as a heuristic to
predict the best C-state and actually decide if it's worth stopping the
tick.

So cpuidle_select() eventually calls tick_nohz_get_sleep_length() at the
beginning of the idle loop to know the next timer event (without stopping the
tick yet); based on that and other information, the tick is stopped or not
(cf. the stop_tick argument in cpuidle_select()).

If an IRQ wakes up the CPU and queues a timer, we need to go through that
whole process again, otherwise we shortcut the cpuidle C-state update.
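
Roughly, simplified from kernel/sched/idle.c:cpuidle_idle_call():

	bool stop_tick = true;

	/*
	 * The governor calls tick_nohz_get_sleep_length() in here to
	 * predict the next event and pick a C-state.
	 */
	next_state = cpuidle_select(drv, dev, &stop_tick);

	if (stop_tick)
		tick_nohz_idle_stop_tick();
	else
		tick_nohz_idle_retain_tick();

	/*
	 * An IRQ queueing a timer after cpuidle_select() invalidates
	 * both the C-state choice and the stop_tick decision.
	 */
	entered_state = call_cpuidle(drv, dev, next_state);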

> *HOWEVER*
> 
> intel_idle_irq() is affected -- except that we only (normally) use that
> for very shallow idle states and it won't interact with NOHZ (because we
> only disable the tick for deep idle states).

Well I don't know, that doesn't look comfortable... :)

Also why does it need to enable IRQs if ecx=1?

> > * Need to check all other archs
> > 
> > I'm trying to find an automated way to debug this kind of issue but it's not
> > easy...
> 
> Yeah, too much arch code :/ Easiest might be to check if our idea of
> where the timer should be and the hardware agree on IRQ entry or so --
> *any* IRQ. That will miss a lot of cases, but at least it's something.

Hmm, not sure I understand what you're suggesting...

Thanks.


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-21 15:24 ` Thomas Gleixner
@ 2023-04-21 16:55   ` Frederic Weisbecker
  2023-04-21 20:28     ` Thomas Gleixner
  2023-04-22  8:17   ` Peter Zijlstra
  1 sibling, 1 reply; 23+ messages in thread
From: Frederic Weisbecker @ 2023-04-21 16:55 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Huacai Chen, WANG Xuerui, Peter Zijlstra, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Fri, Apr 21, 2023 at 05:24:36PM +0200, Thomas Gleixner wrote:
> On Fri, Apr 21 2023 at 14:36, Frederic Weisbecker wrote:
> The check for TIF_NEED_RESCHED as a loop termination condition is simply
> wrong. The architecture is not supposed to loop in arch_cpu_idle().
> 
> That loop is from Linux 0.9 days. Seems MIPS::__r4k_wait() and
> loongarch, which copied that muck, are still stuck in the 1990s.
> 
> It has to return when an interrupt brings it out of the "idle wait"
> instruction.
> 
> The special cases are mwait()-like mechanisms which also return when a
> monitored cacheline is written to. x86 uses that to spare the resched
> IPI as MWAIT will return when TIF_NEED_RESCHED is set by a remote CPU.

Right.

> 
> > More generally IRQs must _not_ be re-enabled between cpuidle_select()
> > (or just tick_nohz_idle_stop_tick() if no cpuidle support) and the
> > last halting ASM instruction. If that happens there must be
> > a mechanism to cope with that and make sure we return to the main
> > idle loop.
> 
> No. arch_cpu_idle() can safely re-enable interrupts when the "wait"
> instruction requires that. It then has to disable interrupts before
> returning.
> 
> x86 default_idle() does: STI; HLT; CLI; That's perfectly fine because
> the effect of STI is delayed to the HLT instruction boundary.

Right, I implicitly included sti;mwait and sti;hlt.
The point is that if interrupts are enabled too early, before the
idling instruction, then we are screwed.

> 
> > Another way to cope with this would be to have:
> >
> > #define TIF_IDLE_TIMER	 ...
> > #define TIF_IDLE_EXIT	 (TIF_NEED_RESCHED | TIF_IDLE_TIMER)
> 
> There is absolutely no need for this. arch_cpu_idle() has to return
> after an interrupt, period. If MIPS/loongarch cannot do that then they
> can have their private interrupt counter in that magic rollback ASM to
> check for. But we really don't need a TIF flag which makes the (hr)timer
> enqueue path more complex.

Then I'm relieved :) (well, sort of, given the risk of an accident somewhere
on an arch or a cpuidle driver I may have overlooked).

> 
> > I'm trying to find an automated way to debug this kind of issue but
> > it's not easy...
> 
> It's far from trivial because you'd need correlation between the
> interrupt entry and the entry to and return from arch_cpu_idle().
> 
> I fear manual inspection is the main tool here :(

I thought so :)

I'm already halfway through the architectures, then will come the cpuidle drivers...

Thanks.


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-21 16:55   ` Frederic Weisbecker
@ 2023-04-21 20:28     ` Thomas Gleixner
  0 siblings, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2023-04-21 20:28 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Huacai Chen, WANG Xuerui, Peter Zijlstra, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Fri, Apr 21 2023 at 18:55, Frederic Weisbecker wrote:
> On Fri, Apr 21, 2023 at 05:24:36PM +0200, Thomas Gleixner wrote:
>> It's far from trivial because you'd need correlation between the
>> interrupt entry and the entry to and return from arch_cpu_idle().
>> 
>> I fear manual inspection is the main tool here :(
>
> I thought so :)
>
> I'm already halfway through the architectures, then will come the cpuidle drivers...

For objtool-covered architectures you might come up with some annotation
which allows objtool to yell when it discovers a local_irq_enable() or
such in the wrong place.

Thanks,

        tglx


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-21 16:47   ` Frederic Weisbecker
@ 2023-04-22  8:08     ` Peter Zijlstra
  2023-04-22 11:38       ` Peter Zijlstra
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2023-04-22  8:08 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Huacai Chen, WANG Xuerui, Thomas Gleixner, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Fri, Apr 21, 2023 at 06:47:29PM +0200, Frederic Weisbecker wrote:

> > *HOWEVER*
> > 
> > intel_idle_irq() is affected -- except that we only (normally) use that
> > for very shallow idle states and it won't interact with NOHZ (because we
> > only disable the tick for deep idle states).
> 
> Well I don't know, that doesn't look comfortable... :)
> 
> Also why does it need to enable IRQs if ecx=1?

Supposedly this is some interrupt latency hack. See commit:

  c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")



* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-21 15:24 ` Thomas Gleixner
  2023-04-21 16:55   ` Frederic Weisbecker
@ 2023-04-22  8:17   ` Peter Zijlstra
  2023-04-22  8:22     ` Peter Zijlstra
  2023-04-22 14:21     ` Frederic Weisbecker
  1 sibling, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2023-04-22  8:17 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Frederic Weisbecker, Huacai Chen, WANG Xuerui, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Fri, Apr 21, 2023 at 05:24:36PM +0200, Thomas Gleixner wrote:
> On Fri, Apr 21 2023 at 14:36, Frederic Weisbecker wrote:
> > I'm looking at the __arch_cpu_idle() implementation in Loongarch
> > and I'm wondering about the rollback code. I don't understand well
> > that code but with the help from PeterZ I might have seen something,
> > so tell me if I'm wrong: when an interrupt happens within
> > __arch_cpu_idle(), handle_vint() rolls back the return value to the
> > beginning of __arch_cpu_idle(), so that TIF_NEED_RESCHED is checked
> > again. Is that correct?
> >
> > Because if an interrupt fires while in __arch_cpu_idle(), that IRQ
> > might enqueue a new timer and that new timer needs to be reprogrammed
> > from the main idle loop and just checking TIF_NEED_RESCHED doesn't
> > tell about that information.
> 
> The check for TIF_NEED_RESCHED as a loop termination condition is simply
> wrong. The architecture is not supposed to loop in arch_cpu_idle().
> 
> That loop is from Linux 0.9 days. Seems MIPS::__r4k_wait() and
> loongarch, which copied that muck, are still stuck in the 1990s.
> 
> It has to return when an interrupt brings it out of the "idle wait"
> instruction.

So I think the below is enough for these two...

diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
index 44ff1ff64260..5a102ff80de0 100644
--- a/arch/loongarch/kernel/genex.S
+++ b/arch/loongarch/kernel/genex.S
@@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
 	ori	t0, t0, 0x1f
 	xori	t0, t0, 0x1f
 	bne	t0, t1, 1f
+	addi.d	t0, t0, 0x20
 	LONG_S	t0, sp, PT_ERA
 1:	move	a0, sp
 	move	a1, sp
diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index b6de8e88c1bd..cd6aae441ad9 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -140,6 +140,7 @@ LEAF(__r4k_wait)
 	ori	k0, 0x1f	/* 32 byte rollback region */
 	xori	k0, 0x1f
 	bne	k0, k1, \handler
+	addiu	k0, 0x20
 	MTC0	k0, CP0_EPC
 	.set pop
 	.endm


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-22  8:17   ` Peter Zijlstra
@ 2023-04-22  8:22     ` Peter Zijlstra
  2023-04-22 14:21     ` Frederic Weisbecker
  1 sibling, 0 replies; 23+ messages in thread
From: Peter Zijlstra @ 2023-04-22  8:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Frederic Weisbecker, Huacai Chen, WANG Xuerui, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML, tsbogend


+ MIPS Thomas

On Sat, Apr 22, 2023 at 10:17:00AM +0200, Peter Zijlstra wrote:
> On Fri, Apr 21, 2023 at 05:24:36PM +0200, Thomas Gleixner wrote:
> > On Fri, Apr 21 2023 at 14:36, Frederic Weisbecker wrote:
> > > I'm looking at the __arch_cpu_idle() implementation in Loongarch
> > > and I'm wondering about the rollback code. I don't understand well
> > > that code but with the help from PeterZ I might have seen something,
> > > so tell me if I'm wrong: when an interrupt happens within
> > > __arch_cpu_idle(), handle_vint() rolls back the return value to the
> > > beginning of __arch_cpu_idle(), so that TIF_NEED_RESCHED is checked
> > > again. Is that correct?
> > >
> > > Because if an interrupt fires while in __arch_cpu_idle(), that IRQ
> > > might enqueue a new timer and that new timer needs to be reprogrammed
> > > from the main idle loop and just checking TIF_NEED_RESCHED doesn't
> > > tell about that information.
> > 
> > The check for TIF_NEED_RESCHED as a loop termination condition is simply
> > wrong. The architecture is not supposed to loop in arch_cpu_idle().
> > 
> > That loop is from Linux 0.9 days. Seems MIPS::__r4k_wait() and
> > loongarch, which copied that muck, are still stuck in the 1990s.
> > 
> > It has to return when an interrupt brings it out of the "idle wait"
> > instruction.
> 
> So I think the below is enough for these two...
> 
> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> index 44ff1ff64260..5a102ff80de0 100644
> --- a/arch/loongarch/kernel/genex.S
> +++ b/arch/loongarch/kernel/genex.S
> @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
>  	ori	t0, t0, 0x1f
>  	xori	t0, t0, 0x1f
>  	bne	t0, t1, 1f
> +	addi.d	t0, t0, 0x20
>  	LONG_S	t0, sp, PT_ERA
>  1:	move	a0, sp
>  	move	a1, sp
> diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
> index b6de8e88c1bd..cd6aae441ad9 100644
> --- a/arch/mips/kernel/genex.S
> +++ b/arch/mips/kernel/genex.S
> @@ -140,6 +140,7 @@ LEAF(__r4k_wait)
>  	ori	k0, 0x1f	/* 32 byte rollback region */
>  	xori	k0, 0x1f
>  	bne	k0, k1, \handler
> +	addiu	k0, 0x20
>  	MTC0	k0, CP0_EPC
>  	.set pop
>  	.endm


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-22  8:08     ` Peter Zijlstra
@ 2023-04-22 11:38       ` Peter Zijlstra
  2023-04-22 14:48         ` Frederic Weisbecker
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2023-04-22 11:38 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Huacai Chen, WANG Xuerui, Thomas Gleixner, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Sat, Apr 22, 2023 at 10:08:14AM +0200, Peter Zijlstra wrote:
> On Fri, Apr 21, 2023 at 06:47:29PM +0200, Frederic Weisbecker wrote:
> 
> > > *HOWEVER*
> > > 
> > > intel_idle_irq() is affected -- except that we only (normally) use that
> > > for very shallow idle states and it won't interact with NOHZ (because we
> > > only disable the tick for deep idle states).
> > 
> > Well I don't know, that doesn't look comfortable... :)
> > 
> > Also why does it need to enable IRQs if ecx=1?
> 
> Supposedly this is some interrupt latency hack. See commit:
> 
>   c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")

Something like so perhaps...

diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index 778df05f8539..07a4072c43de 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -115,8 +115,14 @@ static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned lo
 		}
 
 		__monitor((void *)&current_thread_info()->flags, 0, 0);
-		if (!need_resched())
-			__mwait(eax, ecx);
+		if (!need_resched()) {
+			if (ecx & 1) {
+				__mwait(eax, ecx);
+			} else {
+				__sti_mwait(eax, ecx);
+				raw_local_irq_disable();
+			}
+		}
 	}
 	current_clr_polling();
 }
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 938c17f25d94..4a823bd0f5e0 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -130,11 +130,12 @@ static unsigned int mwait_substates __initdata;
 #define MWAIT2flg(eax) ((eax & 0xFF) << 24)
 
 static __always_inline int __intel_idle(struct cpuidle_device *dev,
-					struct cpuidle_driver *drv, int index)
+					struct cpuidle_driver *drv,
+					int index, bool irqoff)
 {
 	struct cpuidle_state *state = &drv->states[index];
 	unsigned long eax = flg2MWAIT(state->flags);
-	unsigned long ecx = 1; /* break on interrupt flag */
+	unsigned long ecx = 1*irqoff; /* break on interrupt flag */
 
 	mwait_idle_with_hints(eax, ecx);
 
@@ -158,19 +159,13 @@ static __always_inline int __intel_idle(struct cpuidle_device *dev,
 static __cpuidle int intel_idle(struct cpuidle_device *dev,
 				struct cpuidle_driver *drv, int index)
 {
-	return __intel_idle(dev, drv, index);
+	return __intel_idle(dev, drv, index, true);
 }
 
 static __cpuidle int intel_idle_irq(struct cpuidle_device *dev,
 				    struct cpuidle_driver *drv, int index)
 {
-	int ret;
-
-	raw_local_irq_enable();
-	ret = __intel_idle(dev, drv, index);
-	raw_local_irq_disable();
-
-	return ret;
+	return __intel_idle(dev, drv, index, false);
 }
 
 static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
@@ -183,7 +178,7 @@ static __cpuidle int intel_idle_ibrs(struct cpuidle_device *dev,
 	if (smt_active)
 		native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
 
-	ret = __intel_idle(dev, drv, index);
+	ret = __intel_idle(dev, drv, index, true);
 
 	if (smt_active)
 		native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
@@ -195,7 +190,7 @@ static __cpuidle int intel_idle_xstate(struct cpuidle_device *dev,
 				       struct cpuidle_driver *drv, int index)
 {
 	fpu_idle_fpregs();
-	return __intel_idle(dev, drv, index);
+	return __intel_idle(dev, drv, index, true);
 }
 
 /**


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-22  8:17   ` Peter Zijlstra
  2023-04-22  8:22     ` Peter Zijlstra
@ 2023-04-22 14:21     ` Frederic Weisbecker
  2023-04-22 15:04       ` Peter Zijlstra
  1 sibling, 1 reply; 23+ messages in thread
From: Frederic Weisbecker @ 2023-04-22 14:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Huacai Chen, WANG Xuerui, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Sat, Apr 22, 2023 at 10:17:00AM +0200, Peter Zijlstra wrote:
> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> index 44ff1ff64260..5a102ff80de0 100644
> --- a/arch/loongarch/kernel/genex.S
> +++ b/arch/loongarch/kernel/genex.S
> @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
>  	ori	t0, t0, 0x1f
>  	xori	t0, t0, 0x1f
>  	bne	t0, t1, 1f
> +	addi.d	t0, t0, 0x20
>  	LONG_S	t0, sp, PT_ERA
>  1:	move	a0, sp
>  	move	a1, sp

But the interrupts are enabled in C from arch_cpu_idle(), which
only then calls the ASM __arch_cpu_idle(). So if the interrupt happens
somewhere between the enable and the idle instruction, the rollback
(or fast-forward now) doesn't apply.

I guess interrupts need to be re-enabled from ASM at the beginning
of __arch_cpu_idle() so that it's part of the fast-forward region.

> diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
> index b6de8e88c1bd..cd6aae441ad9 100644
> --- a/arch/mips/kernel/genex.S
> +++ b/arch/mips/kernel/genex.S
> @@ -140,6 +140,7 @@ LEAF(__r4k_wait)
>  	ori	k0, 0x1f	/* 32 byte rollback region */
>  	xori	k0, 0x1f
>  	bne	k0, k1, \handler
> +	addiu	k0, 0x20
>  	MTC0	k0, CP0_EPC
>  	.set pop
>  	.endm

The same seems to apply with interrupts being re-enabled by r4k_wait().

Thanks.


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-22 11:38       ` Peter Zijlstra
@ 2023-04-22 14:48         ` Frederic Weisbecker
  0 siblings, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2023-04-22 14:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Huacai Chen, WANG Xuerui, Thomas Gleixner, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Sat, Apr 22, 2023 at 01:38:11PM +0200, Peter Zijlstra wrote:
> On Sat, Apr 22, 2023 at 10:08:14AM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 21, 2023 at 06:47:29PM +0200, Frederic Weisbecker wrote:
> > 
> > > > *HOWEVER*
> > > > 
> > > > intel_idle_irq() is affected -- except that we only (normally) use that
> > > > for very shallow idle states and it won't interact with NOHZ (because we
> > > > only disable the tick for deep idle states).
> > > 
> > > Well I don't know, that doesn't look comfortable... :)
> > > 
> > > Also why does it need to enable IRQs if ecx=1?
> > 
> > Supposedly this is some interrupt latency hack. See commit:
> > 
> >   c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> 
> Something like so perhaps...
> 
> diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
> index 778df05f8539..07a4072c43de 100644
> --- a/arch/x86/include/asm/mwait.h
> +++ b/arch/x86/include/asm/mwait.h
> @@ -115,8 +115,14 @@ static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned lo
>  		}
>  
>  		__monitor((void *)&current_thread_info()->flags, 0, 0);
> -		if (!need_resched())
> -			__mwait(eax, ecx);
> +		if (!need_resched()) {
> +			if (ecx & 1) {
> +				__mwait(eax, ecx);
> +			} else {
> +				__sti_mwait(eax, ecx);
> +				raw_local_irq_disable();
> +			}
> +		}

Yep that looks good!

Thanks!


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-22 14:21     ` Frederic Weisbecker
@ 2023-04-22 15:04       ` Peter Zijlstra
  2023-04-23 13:52         ` bibo, mao
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Zijlstra @ 2023-04-22 15:04 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Thomas Gleixner, Huacai Chen, WANG Xuerui, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML

On Sat, Apr 22, 2023 at 04:21:45PM +0200, Frederic Weisbecker wrote:
> On Sat, Apr 22, 2023 at 10:17:00AM +0200, Peter Zijlstra wrote:
> > diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> > index 44ff1ff64260..5a102ff80de0 100644
> > --- a/arch/loongarch/kernel/genex.S
> > +++ b/arch/loongarch/kernel/genex.S
> > @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
> >  	ori	t0, t0, 0x1f
> >  	xori	t0, t0, 0x1f
> >  	bne	t0, t1, 1f
> > +	addi.d	t0, t0, 0x20
> >  	LONG_S	t0, sp, PT_ERA
> >  1:	move	a0, sp
> >  	move	a1, sp
> 
> But the interrupts are enabled in C from arch_cpu_idle(), which
> only then calls the ASM __arch_cpu_idle(). So if the interrupt happens
> somewhere between the enable and the idle instruction, the rollback
> (or fast-forward now) doesn't apply.
> 
> I guess interrupts need to be re-enabled from ASM at the beginning
> of __arch_cpu_idle() so that it's part of the fast-forward region.

Right; something like so I suppose, but at this point I'm really just
guessing... A Loongarch person will have to do it.

diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
index 44ff1ff64260..4814ac5334ef 100644
--- a/arch/loongarch/kernel/genex.S
+++ b/arch/loongarch/kernel/genex.S
@@ -19,13 +19,13 @@
 	.align	5
 SYM_FUNC_START(__arch_cpu_idle)
 	/* start of rollback region */
+	move	t0, CSR_CRMD_IE
+	csrxchg	t0, t0, LOONGARCH_CSR_CRMD
 	LONG_L	t0, tp, TI_FLAGS
 	nop
 	andi	t0, t0, _TIF_NEED_RESCHED
 	bnez	t0, 1f
 	nop
-	nop
-	nop
 	idle	0
 	/* end of rollback region */
 1:	jr	ra
@@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
 	ori	t0, t0, 0x1f
 	xori	t0, t0, 0x1f
 	bne	t0, t1, 1f
+	addi.d	t0, t0, 0x20
 	LONG_S	t0, sp, PT_ERA
 1:	move	a0, sp
 	move	a1, sp
diff --git a/arch/loongarch/kernel/idle.c b/arch/loongarch/kernel/idle.c
index 0b5dd2faeb90..5ba72d229920 100644
--- a/arch/loongarch/kernel/idle.c
+++ b/arch/loongarch/kernel/idle.c
@@ -11,7 +11,6 @@
 
 void __cpuidle arch_cpu_idle(void)
 {
-	raw_local_irq_enable();
 	__arch_cpu_idle(); /* idle instruction needs irq enabled */
 	raw_local_irq_disable();
 }


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-22 15:04       ` Peter Zijlstra
@ 2023-04-23 13:52         ` bibo, mao
  2023-04-24  8:26           ` Frederic Weisbecker
  2023-04-25 11:49           ` Peter Zijlstra
  0 siblings, 2 replies; 23+ messages in thread
From: bibo, mao @ 2023-04-23 13:52 UTC (permalink / raw)
  To: Peter Zijlstra, Frederic Weisbecker
  Cc: Thomas Gleixner, Huacai Chen, WANG Xuerui, Rafael J. Wysocki,
	Anna-Maria Behnsen, LKML



On 2023/4/22 23:04, Peter Zijlstra wrote:
> On Sat, Apr 22, 2023 at 04:21:45PM +0200, Frederic Weisbecker wrote:
>> On Sat, Apr 22, 2023 at 10:17:00AM +0200, Peter Zijlstra wrote:
>>> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
>>> index 44ff1ff64260..5a102ff80de0 100644
>>> --- a/arch/loongarch/kernel/genex.S
>>> +++ b/arch/loongarch/kernel/genex.S
>>> @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
>>>   	ori	t0, t0, 0x1f
>>>   	xori	t0, t0, 0x1f
>>>   	bne	t0, t1, 1f
>>> +	addi.d	t0, t0, 0x20
>>>   	LONG_S	t0, sp, PT_ERA
>>>   1:	move	a0, sp
>>>   	move	a1, sp
>>
>> But the interrupts are enabled in C from arch_cpu_idle(), which
>> only then calls the ASM __arch_cpu_idle(). So if the interrupt happens
>> somewhere between the enable and the idle instruction, the rollback
>> (or fast-forward now) doesn't apply.
I do not know many details about the scheduler and timer. If the interrupt
happens between the calls, will the flag _TIF_NEED_RESCHED be set? If it is
set, the rollback will still apply.


>>
>> I guess interrupts need to be re-enabled from ASM in the beginning
>> of __arch_cpu_idle() so that it's part of the fast-forward region.
> 
> Right; something like so I suppose, but at this point I'm really just
> guessing... A Loongarch person will have to do it.
> 
> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> index 44ff1ff64260..4814ac5334ef 100644
> --- a/arch/loongarch/kernel/genex.S
> +++ b/arch/loongarch/kernel/genex.S
> @@ -19,13 +19,13 @@
>   	.align	5
>   SYM_FUNC_START(__arch_cpu_idle)
>   	/* start of rollback region */
> +	move	t0, CSR_CRMD_IE
> +	csrxchg	t0, t0, LOONGARCH_CSR_CRMD
>   	LONG_L	t0, tp, TI_FLAGS
>   	nop
>   	andi	t0, t0, _TIF_NEED_RESCHED
>   	bnez	t0, 1f
>   	nop
> -	nop
> -	nop
>   	idle	0
>   	/* end of rollback region */
>   1:	jr	ra
> @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
>   	ori	t0, t0, 0x1f
>   	xori	t0, t0, 0x1f
>   	bne	t0, t1, 1f
> +	addi.d	t0, t0, 0x20
It is more reasonable with this patch; it will jump out of the idle
function directly after the interrupt returns. If so, can we remove the
check for _TIF_NEED_RESCHED in the idle ASM function?

 > +	move	t0, CSR_CRMD_IE
 > +	csrxchg	t0, t0, LOONGARCH_CSR_CRMD
-   	LONG_L	t0, tp, TI_FLAGS
+	nop
 >   	nop
-	andi	t0, t0, _TIF_NEED_RESCHED
-	bnez	t0, 1f
+	nop
+	nop
 >   	nop
 > -	nop
 > -	nop
 >   	idle	0

Regards
Bibo, Mao
>   	LONG_S	t0, sp, PT_ERA
>   1:	move	a0, sp
>   	move	a1, sp
> diff --git a/arch/loongarch/kernel/idle.c b/arch/loongarch/kernel/idle.c
> index 0b5dd2faeb90..5ba72d229920 100644
> --- a/arch/loongarch/kernel/idle.c
> +++ b/arch/loongarch/kernel/idle.c
> @@ -11,7 +11,6 @@
>   
>   void __cpuidle arch_cpu_idle(void)
>   {
> -	raw_local_irq_enable();
>   	__arch_cpu_idle(); /* idle instruction needs irq enabled */
>   	raw_local_irq_disable();
>   }



* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-23 13:52         ` bibo, mao
@ 2023-04-24  8:26           ` Frederic Weisbecker
  2023-04-24 11:23             ` maobibo
  2023-04-25 11:49           ` Peter Zijlstra
  1 sibling, 1 reply; 23+ messages in thread
From: Frederic Weisbecker @ 2023-04-24  8:26 UTC (permalink / raw)
  To: bibo, mao
  Cc: Peter Zijlstra, Thomas Gleixner, Huacai Chen, WANG Xuerui,
	Rafael J. Wysocki, Anna-Maria Behnsen, LKML

On Sun, Apr 23, 2023 at 09:52:49PM +0800, bibo, mao wrote:
> 
> 
> On 2023/4/22 23:04, Peter Zijlstra wrote:
> > On Sat, Apr 22, 2023 at 04:21:45PM +0200, Frederic Weisbecker wrote:
> > > On Sat, Apr 22, 2023 at 10:17:00AM +0200, Peter Zijlstra wrote:
> > > > diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> > > > index 44ff1ff64260..5a102ff80de0 100644
> > > > --- a/arch/loongarch/kernel/genex.S
> > > > +++ b/arch/loongarch/kernel/genex.S
> > > > @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
> > > >   	ori	t0, t0, 0x1f
> > > >   	xori	t0, t0, 0x1f
> > > >   	bne	t0, t1, 1f
> > > > +	addi.d	t0, t0, 0x20
> > > >   	LONG_S	t0, sp, PT_ERA
> > > >   1:	move	a0, sp
> > > >   	move	a1, sp
> > > 
> > > But the interrupts are enabled in C from arch_cpu_idle(), which
> > > only then calls the ASM __arch_cpu_idle(). So if the interrupt happens
> > > somewhere between the enable and the idle instruction, the rollback
> > > (or fast-forward now) doesn't apply.
> I do not know many details about the scheduler and timer. If the interrupt
> happens between the calls, will the flag _TIF_NEED_RESCHED be set? If it is
> set, the rollback will still apply.

Nope, TIF_NEED_RESCHED is set only if a task is ready to run after the interrupt,
not if the interrupt only modified/added a timer.

> > @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
> >   	ori	t0, t0, 0x1f
> >   	xori	t0, t0, 0x1f
> >   	bne	t0, t1, 1f
> > +	addi.d	t0, t0, 0x20
> It is more reasonable with this patch; it will jump out of the idle
> function directly after the interrupt returns. If so, can we remove the
> check for _TIF_NEED_RESCHED in the idle ASM function?

Indeed we can remove the check for TIF_NEED_RESCHED!

Thanks!


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-24  8:26           ` Frederic Weisbecker
@ 2023-04-24 11:23             ` maobibo
  0 siblings, 0 replies; 23+ messages in thread
From: maobibo @ 2023-04-24 11:23 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Thomas Gleixner, Huacai Chen, WANG Xuerui,
	Rafael J. Wysocki, Anna-Maria Behnsen, LKML



On 2023/4/24 16:26, Frederic Weisbecker wrote:
> On Sun, Apr 23, 2023 at 09:52:49PM +0800, bibo, mao wrote:
>>
>>
>> On 2023/4/22 23:04, Peter Zijlstra wrote:
>>> On Sat, Apr 22, 2023 at 04:21:45PM +0200, Frederic Weisbecker wrote:
>>>> On Sat, Apr 22, 2023 at 10:17:00AM +0200, Peter Zijlstra wrote:
>>>>> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
>>>>> index 44ff1ff64260..5a102ff80de0 100644
>>>>> --- a/arch/loongarch/kernel/genex.S
>>>>> +++ b/arch/loongarch/kernel/genex.S
>>>>> @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
>>>>>   	ori	t0, t0, 0x1f
>>>>>   	xori	t0, t0, 0x1f
>>>>>   	bne	t0, t1, 1f
>>>>> +	addi.d	t0, t0, 0x20
>>>>>   	LONG_S	t0, sp, PT_ERA
>>>>>   1:	move	a0, sp
>>>>>   	move	a1, sp
>>>>
>>>> But the interrupts are enabled in C from arch_cpu_idle(), which
>>>> only then calls the ASM __arch_cpu_idle(). So if the interrupt happens
>>>> somewhere between the enable and the idle instruction, the rollback
>>>> (or fast-forward now) doesn't apply.
>> I do not know many details about the scheduler and timer. If the interrupt
>> happens between the calls, will the flag _TIF_NEED_RESCHED be set? If it is
>> set, the rollback will still apply.
> 
> Nope, TIF_NEED_RESCHED is set only if a task is ready to run after the interrupt,
> not if the interrupt only modified/added a timer.
Got it, thanks for your explanation. It is indeed an issue in the LoongArch
ASM code __arch_cpu_idle().

Regards
Bibo, Mao

> 
>>> @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
>>>   	ori	t0, t0, 0x1f
>>>   	xori	t0, t0, 0x1f
>>>   	bne	t0, t1, 1f
>>> +	addi.d	t0, t0, 0x20
>> It is more reasonable with this patch; it will jump out of the idle
>> function directly after the interrupt returns. If so, can we remove the
>> check for _TIF_NEED_RESCHED in the idle ASM function?
> 
> Indeed we can remove the check for TIF_NEED_RESCHED!
> 
> Thanks!



* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-23 13:52         ` bibo, mao
  2023-04-24  8:26           ` Frederic Weisbecker
@ 2023-04-25 11:49           ` Peter Zijlstra
  2023-04-25 13:25             ` maobibo
  2023-06-06 22:07             ` Frederic Weisbecker
  1 sibling, 2 replies; 23+ messages in thread
From: Peter Zijlstra @ 2023-04-25 11:49 UTC (permalink / raw)
  To: bibo, mao
  Cc: Frederic Weisbecker, Thomas Gleixner, Huacai Chen, WANG Xuerui,
	Rafael J. Wysocki, Anna-Maria Behnsen, LKML

On Sun, Apr 23, 2023 at 09:52:49PM +0800, bibo, mao wrote:

> > @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
> >   	ori	t0, t0, 0x1f
> >   	xori	t0, t0, 0x1f
> >   	bne	t0, t1, 1f
> > +	addi.d	t0, t0, 0x20
> It is more reasonable with this patch; it will jump out of the idle
> function directly after the interrupt returns. If so, can we remove the
> check for _TIF_NEED_RESCHED in the idle ASM function?
> 
> > +	move	t0, CSR_CRMD_IE
> > +	csrxchg	t0, t0, LOONGARCH_CSR_CRMD
> -   	LONG_L	t0, tp, TI_FLAGS
> +	nop
> >   	nop
> -	andi	t0, t0, _TIF_NEED_RESCHED
> -	bnez	t0, 1f
> +	nop
> +	nop
> >   	nop
> > -	nop
> > -	nop
> >   	idle	0

Would not something like the below be a more compact form?
That is; afaict there is no reason to keep it 32 bytes, we can easily go
16 and drop 4 nops.

Additionally, instead of truncating to the start, increase to the end by
doing:

	ip |= 0xf;
	ip++;
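
E.g. with the region at a made-up address 0x9230:

	era = 0x9234;	/* IRQ hit mid-region */
	era |= 0xf;	/* -> 0x923f */
	era += 1;	/* -> 0x9240 == __arch_cpu_idle_exit */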

Also; I added a wee comment.


diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
index 44ff1ff64260..3c8a6bab98fe 100644
--- a/arch/loongarch/kernel/genex.S
+++ b/arch/loongarch/kernel/genex.S
@@ -18,27 +18,31 @@
 
 	.align	5
 SYM_FUNC_START(__arch_cpu_idle)
-	/* start of rollback region */
-	LONG_L	t0, tp, TI_FLAGS
-	nop
-	andi	t0, t0, _TIF_NEED_RESCHED
-	bnez	t0, 1f
-	nop
-	nop
-	nop
+	/* start of idle interrupt region */
+	move	t0, CSR_CRMD_IE
+	csrxchg	t0, t0, LOONGARCH_CSR_CRMD
+	/*
+	 * If an interrupt lands here; between enabling interrupts above and
+	 * going idle on the next instruction, we must *NOT* go idle since the
+	 * interrupt could have set TIF_NEED_RESCHED or caused a timer to need
+	 * reprogramming. Fall through -- see handle_vint() below -- and have
+	 * the idle loop take care of things.
+	 */
 	idle	0
-	/* end of rollback region */
-1:	jr	ra
+	nop
+	/* end of idle interrupt region */
+SYM_INNER_LBEL(__arch_cpu_idle_exit, SYM_L_LOCAL)
+	jr	ra
 SYM_FUNC_END(__arch_cpu_idle)
 
 SYM_FUNC_START(handle_vint)
 	BACKUP_T0T1
 	SAVE_ALL
-	la_abs	t1, __arch_cpu_idle
+	la_abs	t1, __arch_cpu_idle_exit
 	LONG_L	t0, sp, PT_ERA
-	/* 32 byte rollback region */
-	ori	t0, t0, 0x1f
-	xori	t0, t0, 0x1f
+	/* 16 byte idle interrupt region */
+	ori	t0, t0, 0x0f
+	addi.d	t0, t0, 1
 	bne	t0, t1, 1f
 	LONG_S	t0, sp, PT_ERA
 1:	move	a0, sp
diff --git a/arch/loongarch/kernel/idle.c b/arch/loongarch/kernel/idle.c
index 0b5dd2faeb90..5ba72d229920 100644
--- a/arch/loongarch/kernel/idle.c
+++ b/arch/loongarch/kernel/idle.c
@@ -11,7 +11,6 @@
 
 void __cpuidle arch_cpu_idle(void)
 {
-	raw_local_irq_enable();
 	__arch_cpu_idle(); /* idle instruction needs irq enabled */
 	raw_local_irq_disable();
 }


* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-25 11:49           ` Peter Zijlstra
@ 2023-04-25 13:25             ` maobibo
  2023-04-25 13:28               ` WANG Xuerui
  2023-06-06 22:07             ` Frederic Weisbecker
  1 sibling, 1 reply; 23+ messages in thread
From: maobibo @ 2023-04-25 13:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Thomas Gleixner, Huacai Chen, WANG Xuerui,
	Rafael J. Wysocki, Anna-Maria Behnsen, LKML



On 2023/4/25 19:49, Peter Zijlstra wrote:
> On Sun, Apr 23, 2023 at 09:52:49PM +0800, bibo, mao wrote:
> 
>>> @@ -40,6 +40,7 @@ SYM_FUNC_START(handle_vint)
>>>   	ori	t0, t0, 0x1f
>>>   	xori	t0, t0, 0x1f
>>>   	bne	t0, t1, 1f
>>> +	addi.d	t0, t0, 0x20
>> It is more reasonable with this patch; it will jump out of the idle
>> function directly after the interrupt returns. If so, can we remove the
>> check for _TIF_NEED_RESCHED in the idle ASM function?
>>
>>> +	move	t0, CSR_CRMD_IE
>>> +	csrxchg	t0, t0, LOONGARCH_CSR_CRMD
>> -   	LONG_L	t0, tp, TI_FLAGS
>> +	nop
>>>   	nop
>> -	andi	t0, t0, _TIF_NEED_RESCHED
>> -	bnez	t0, 1f
>> +	nop
>> +	nop
>>>   	nop
>>> -	nop
>>> -	nop
>>>   	idle	0
> 
> Would not something like the below be a more compact form?
> That is; afaict there is no reason to keep it 32 bytes, we can easily go
> 16 and drop 4 nops.
> 
> Additionally, instead of truncating to the start, increase to the end by
> doing:
> 
> 	ip |= 0xf;
> 	ip++;
> 
> Also; I added a wee comment.
Excellent, you are so smart. I tested the patch, and it works.

> 
> 
> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> index 44ff1ff64260..3c8a6bab98fe 100644
> --- a/arch/loongarch/kernel/genex.S
> +++ b/arch/loongarch/kernel/genex.S
> @@ -18,27 +18,31 @@
>  
>  	.align	5
>  SYM_FUNC_START(__arch_cpu_idle)
> -	/* start of rollback region */
> -	LONG_L	t0, tp, TI_FLAGS
> -	nop
> -	andi	t0, t0, _TIF_NEED_RESCHED
> -	bnez	t0, 1f
> -	nop
> -	nop
> -	nop
> +	/* start of idle interrupt region */
> +	move	t0, CSR_CRMD_IE
addi.d  t0, zero, CSR_CRMD_IE can be used here; move is for reg-to-reg moves.

> +	csrxchg	t0, t0, LOONGARCH_CSR_CRMD
> +	/*
> +	 * If an interrupt lands here; between enabling interrupts above and
> +	 * going idle on the next instruction, we must *NOT* go idle since the
> +	 * interrupt could have set TIF_NEED_RESCHED or caused a timer to need
> +	 * reprogramming. Fall through -- see handle_vint() below -- and have
> +	 * the idle loop take care of things.
> +	 */
>  	idle	0
> -	/* end of rollback region */
> -1:	jr	ra
> +	nop
> +	/* end of idle interrupt region */
> +SYM_INNER_LBEL(__arch_cpu_idle_exit, SYM_L_LOCAL)
typo SYM_INNER_LABEL

otherwise LGTM

Tested-by: Bibo, Mao <maobibo@loongson.cn>

> +	jr	ra
>  SYM_FUNC_END(__arch_cpu_idle)
>  
>  SYM_FUNC_START(handle_vint)
>  	BACKUP_T0T1
>  	SAVE_ALL
> -	la_abs	t1, __arch_cpu_idle
> +	la_abs	t1, __arch_cpu_idle_exit
>  	LONG_L	t0, sp, PT_ERA
> -	/* 32 byte rollback region */
> -	ori	t0, t0, 0x1f
> -	xori	t0, t0, 0x1f
> +	/* 16 byte idle interrupt region */
> +	ori	t0, t0, 0x0f
> +	addi.d	t0, t0, 1
>  	bne	t0, t1, 1f
>  	LONG_S	t0, sp, PT_ERA
>  1:	move	a0, sp
> diff --git a/arch/loongarch/kernel/idle.c b/arch/loongarch/kernel/idle.c
> index 0b5dd2faeb90..5ba72d229920 100644
> --- a/arch/loongarch/kernel/idle.c
> +++ b/arch/loongarch/kernel/idle.c
> @@ -11,7 +11,6 @@
>  
>  void __cpuidle arch_cpu_idle(void)
>  {
> -	raw_local_irq_enable();
>  	__arch_cpu_idle(); /* idle instruction needs irq enabled */
>  	raw_local_irq_disable();
>  }



* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-25 13:25             ` maobibo
@ 2023-04-25 13:28               ` WANG Xuerui
  2023-04-26  0:46                 ` maobibo
  0 siblings, 1 reply; 23+ messages in thread
From: WANG Xuerui @ 2023-04-25 13:28 UTC (permalink / raw)
  To: maobibo, Peter Zijlstra
  Cc: Frederic Weisbecker, Thomas Gleixner, Huacai Chen,
	Rafael J. Wysocki, Anna-Maria Behnsen, LKML

On 2023/4/25 21:25, maobibo wrote:
> 
> 
> On 2023/4/25 19:49, Peter Zijlstra wrote:

<snip>

>>
>> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
>> index 44ff1ff64260..3c8a6bab98fe 100644
>> --- a/arch/loongarch/kernel/genex.S
>> +++ b/arch/loongarch/kernel/genex.S
>> @@ -18,27 +18,31 @@
>>   
>>   	.align	5
>>   SYM_FUNC_START(__arch_cpu_idle)
>> -	/* start of rollback region */
>> -	LONG_L	t0, tp, TI_FLAGS
>> -	nop
>> -	andi	t0, t0, _TIF_NEED_RESCHED
>> -	bnez	t0, 1f
>> -	nop
>> -	nop
>> -	nop
>> +	/* start of idle interrupt region */
>> +	move	t0, CSR_CRMD_IE
> addi.d  t0, zero, CSR_CRMD_IE can be used here; move is for reg-to-reg moves.

Or better: li.d t0, CSR_CRMD_IE (prefer pseudo-instructions over concrete
ones whenever it helps readability). We don't need to support ancient
in-house toolchains without support for even li. ;-)

-- 
WANG "xen0n" Xuerui

Linux/LoongArch mailing list: https://lore.kernel.org/loongarch/



* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-25 13:28               ` WANG Xuerui
@ 2023-04-26  0:46                 ` maobibo
  2023-04-26  2:10                   ` WANG Xuerui
  0 siblings, 1 reply; 23+ messages in thread
From: maobibo @ 2023-04-26  0:46 UTC (permalink / raw)
  To: WANG Xuerui, Peter Zijlstra
  Cc: Frederic Weisbecker, Thomas Gleixner, Huacai Chen,
	Rafael J. Wysocki, Anna-Maria Behnsen, LKML



On 2023/4/25 21:28, WANG Xuerui wrote:
> On 2023/4/25 21:25, maobibo wrote:
>>
>>
>> On 2023/4/25 19:49, Peter Zijlstra wrote:
> 
> <snip>
> 
>>>
>>> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
>>> index 44ff1ff64260..3c8a6bab98fe 100644
>>> --- a/arch/loongarch/kernel/genex.S
>>> +++ b/arch/loongarch/kernel/genex.S
>>> @@ -18,27 +18,31 @@
>>>         .align    5
>>>   SYM_FUNC_START(__arch_cpu_idle)
>>> -    /* start of rollback region */
>>> -    LONG_L    t0, tp, TI_FLAGS
>>> -    nop
>>> -    andi    t0, t0, _TIF_NEED_RESCHED
>>> -    bnez    t0, 1f
>>> -    nop
>>> -    nop
>>> -    nop
>>> +    /* start of idle interrupt region */
>>> +    move    t0, CSR_CRMD_IE
>> addi.d  t0, zero, CSR_CRMD_IE can be used here; move is for reg-to-reg moves.
> 
> Or better: li.d t0, CSR_CRMD_IE (prefer pseudo-instructions over concrete ones whenever it helps readability). We don't need to support ancient in-house toolchains without support for even li. ;-)
  
I am not familiar with the compiler :( -- how many actual instructions does
the pseudo-instruction li.d take? It will be OK if it uses only one
instruction, else there will be a problem.

Regards
Bibo, Mao

> 



* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-26  0:46                 ` maobibo
@ 2023-04-26  2:10                   ` WANG Xuerui
  2023-04-26  2:23                     ` maobibo
  0 siblings, 1 reply; 23+ messages in thread
From: WANG Xuerui @ 2023-04-26  2:10 UTC (permalink / raw)
  To: maobibo, Peter Zijlstra
  Cc: Frederic Weisbecker, Thomas Gleixner, Huacai Chen,
	Rafael J. Wysocki, Anna-Maria Behnsen, LKML

On 2023/4/26 08:46, maobibo wrote:
> 
> 
> On 2023/4/25 21:28, WANG Xuerui wrote:
>> On 2023/4/25 21:25, maobibo wrote:
>>>
>>>
>> On 2023/4/25 19:49, Peter Zijlstra wrote:
>>
>> <snip>
>>
>>>>
>>>> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
>>>> index 44ff1ff64260..3c8a6bab98fe 100644
>>>> --- a/arch/loongarch/kernel/genex.S
>>>> +++ b/arch/loongarch/kernel/genex.S
>>>> @@ -18,27 +18,31 @@
>>>>          .align    5
>>>>    SYM_FUNC_START(__arch_cpu_idle)
>>>> -    /* start of rollback region */
>>>> -    LONG_L    t0, tp, TI_FLAGS
>>>> -    nop
>>>> -    andi    t0, t0, _TIF_NEED_RESCHED
>>>> -    bnez    t0, 1f
>>>> -    nop
>>>> -    nop
>>>> -    nop
>>>> +    /* start of idle interrupt region */
>>>> +    move    t0, CSR_CRMD_IE
>>> addi.d  t0, zero, CSR_CRMD_IE can be used here; move is for reg-to-reg moves.
>>
>> Or better: li.d t0, CSR_CRMD_IE (prefer pseudo-instructions over concrete ones whenever it helps readability). We don't need to support ancient in-house toolchains without support for even li. ;-)
>    
> I am not familiar with the compiler :( -- how many actual instructions does
> the pseudo-instruction li.d take? It will be OK if it uses only one
> instruction, else there will be a problem.

It's just `ori $t0, $zero, 4` no matter which of li.w or li.d is used. 
It only matters when the immediate to load is bigger. Given CSR_CRMD_IE 
is just 4 (1<<2) you can definitely say `li.w` if you want to be extra 
cautious and it won't hurt. ;-)
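
For illustration, my understanding of how GNU as expands it (worth
double-checking against your binutils):

	li.w	$t0, 4		# one insn:  ori     $t0, $zero, 4
	li.w	$t0, 0x12345	# two insns: lu12i.w $t0, 0x12
				#            ori     $t0, $t0, 0x345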

-- 
WANG "xen0n" Xuerui

Linux/LoongArch mailing list: https://lore.kernel.org/loongarch/



* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-26  2:10                   ` WANG Xuerui
@ 2023-04-26  2:23                     ` maobibo
  0 siblings, 0 replies; 23+ messages in thread
From: maobibo @ 2023-04-26  2:23 UTC (permalink / raw)
  To: WANG Xuerui, Peter Zijlstra
  Cc: Frederic Weisbecker, Thomas Gleixner, Huacai Chen,
	Rafael J. Wysocki, Anna-Maria Behnsen, LKML



On 2023/4/26 10:10, WANG Xuerui wrote:
> On 2023/4/26 08:46, maobibo wrote:
>>
>>
>> On 2023/4/25 21:28, WANG Xuerui wrote:
>>> On 2023/4/25 21:25, maobibo wrote:
>>>>
>>>>
>>>> On 2023/4/25 19:49, Peter Zijlstra wrote:
>>>
>>> <snip>
>>>
>>>>>
>>>>> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
>>>>> index 44ff1ff64260..3c8a6bab98fe 100644
>>>>> --- a/arch/loongarch/kernel/genex.S
>>>>> +++ b/arch/loongarch/kernel/genex.S
>>>>> @@ -18,27 +18,31 @@
>>>>>          .align    5
>>>>>    SYM_FUNC_START(__arch_cpu_idle)
>>>>> -    /* start of rollback region */
>>>>> -    LONG_L    t0, tp, TI_FLAGS
>>>>> -    nop
>>>>> -    andi    t0, t0, _TIF_NEED_RESCHED
>>>>> -    bnez    t0, 1f
>>>>> -    nop
>>>>> -    nop
>>>>> -    nop
>>>>> +    /* start of idle interrupt region */
>>>>> +    move    t0, CSR_CRMD_IE
>>>> addi.d  t0, zero, CSR_CRMD_IE can be used here; move is for reg-to-reg moves.
>>>
>>> Or better: li.d t0, CSR_CRMD_IE (prefer pseudo-instructions over concrete ones whenever it helps readability). We don't need to support ancient in-house toolchains without support for even li. ;-)
>> I am not familiar with the compiler :( -- how many actual instructions does
>> the pseudo-instruction li.d take? It will be OK if it uses only one
>> instruction, else there will be a problem.
> 
> It's just `ori $t0, $zero, 4` no matter which of li.w or li.d is used. It only matters when the immediate to load is bigger. Given CSR_CRMD_IE is just 4 (1<<2) you can definitely say `li.w` if you want to be extra cautious and it won't hurt. ;-)

That is good to me, you are the compiler expert :)

> 



* Re: Loongson (and other $ARCHs?) idle VS timer enqueue
  2023-04-25 11:49           ` Peter Zijlstra
  2023-04-25 13:25             ` maobibo
@ 2023-06-06 22:07             ` Frederic Weisbecker
  1 sibling, 0 replies; 23+ messages in thread
From: Frederic Weisbecker @ 2023-06-06 22:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: bibo, mao, Thomas Gleixner, Huacai Chen, WANG Xuerui,
	Rafael J. Wysocki, Anna-Maria Behnsen, LKML

On Tue, Apr 25, 2023 at 01:49:37PM +0200, Peter Zijlstra wrote:
> Would not something like the below be a more compact form?
> That is; afaict there is no reason to keep it 32 bytes, we can easily go
> 16 and drop 4 nops.
> 
> Additionally, instead of truncating to the start, increase to the end by
> doing:
> 
> 	ip |= 0xf;
> 	ip++;
> 
> Also; I added a wee comment.
> 
> 
> diff --git a/arch/loongarch/kernel/genex.S b/arch/loongarch/kernel/genex.S
> index 44ff1ff64260..3c8a6bab98fe 100644
> --- a/arch/loongarch/kernel/genex.S
> +++ b/arch/loongarch/kernel/genex.S
> @@ -18,27 +18,31 @@
>  
>  	.align	5
>  SYM_FUNC_START(__arch_cpu_idle)
> -	/* start of rollback region */
> -	LONG_L	t0, tp, TI_FLAGS
> -	nop
> -	andi	t0, t0, _TIF_NEED_RESCHED
> -	bnez	t0, 1f
> -	nop
> -	nop
> -	nop
> +	/* start of idle interrupt region */
> +	move	t0, CSR_CRMD_IE
> +	csrxchg	t0, t0, LOONGARCH_CSR_CRMD
> +	/*
> +	 * If an interrupt lands here; between enabling interrupts above and
> +	 * going idle on the next instruction, we must *NOT* go idle since the
> +	 * interrupt could have set TIF_NEED_RESCHED or caused a timer to need
> +	 * reprogramming. Fall through -- see handle_vint() below -- and have
> +	 * the idle loop take care of things.
> +	 */
>  	idle	0
> -	/* end of rollback region */
> -1:	jr	ra
> +	nop
> +	/* end of idle interrupt region */
> +SYM_INNER_LBEL(__arch_cpu_idle_exit, SYM_L_LOCAL)
> +	jr	ra
>  SYM_FUNC_END(__arch_cpu_idle)
>  
>  SYM_FUNC_START(handle_vint)
>  	BACKUP_T0T1
>  	SAVE_ALL
> -	la_abs	t1, __arch_cpu_idle
> +	la_abs	t1, __arch_cpu_idle_exit
>  	LONG_L	t0, sp, PT_ERA
> -	/* 32 byte rollback region */
> -	ori	t0, t0, 0x1f
> -	xori	t0, t0, 0x1f
> +	/* 16 byte idle interrupt region */
> +	ori	t0, t0, 0x0f
> +	addi.d	t0, t0, 1
>  	bne	t0, t1, 1f
>  	LONG_S	t0, sp, PT_ERA
>  1:	move	a0, sp
> diff --git a/arch/loongarch/kernel/idle.c b/arch/loongarch/kernel/idle.c
> index 0b5dd2faeb90..5ba72d229920 100644
> --- a/arch/loongarch/kernel/idle.c
> +++ b/arch/loongarch/kernel/idle.c
> @@ -11,7 +11,6 @@
>  
>  void __cpuidle arch_cpu_idle(void)
>  {
> -	raw_local_irq_enable();
>  	__arch_cpu_idle(); /* idle instruction needs irq enabled */
>  	raw_local_irq_disable();
>  }

What is the status of this? Is someone from loongarch going to take it?

Thanks.


Thread overview: 23+ messages
2023-04-21 12:36 Loongson (and other $ARCHs?) idle VS timer enqueue Frederic Weisbecker
2023-04-21 14:24 ` Peter Zijlstra
2023-04-21 16:47   ` Frederic Weisbecker
2023-04-22  8:08     ` Peter Zijlstra
2023-04-22 11:38       ` Peter Zijlstra
2023-04-22 14:48         ` Frederic Weisbecker
2023-04-21 15:24 ` Thomas Gleixner
2023-04-21 16:55   ` Frederic Weisbecker
2023-04-21 20:28     ` Thomas Gleixner
2023-04-22  8:17   ` Peter Zijlstra
2023-04-22  8:22     ` Peter Zijlstra
2023-04-22 14:21     ` Frederic Weisbecker
2023-04-22 15:04       ` Peter Zijlstra
2023-04-23 13:52         ` bibo, mao
2023-04-24  8:26           ` Frederic Weisbecker
2023-04-24 11:23             ` maobibo
2023-04-25 11:49           ` Peter Zijlstra
2023-04-25 13:25             ` maobibo
2023-04-25 13:28               ` WANG Xuerui
2023-04-26  0:46                 ` maobibo
2023-04-26  2:10                   ` WANG Xuerui
2023-04-26  2:23                     ` maobibo
2023-06-06 22:07             ` Frederic Weisbecker
