All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC patch] Use IPI_shortcut for lapic timer broadcast
@ 2009-06-29  6:47 Luming Yu
  2009-06-29  7:20 ` Ingo Molnar
  2009-06-29 20:34 ` Pallipadi, Venkatesh
  0 siblings, 2 replies; 13+ messages in thread
From: Luming Yu @ 2009-06-29  6:47 UTC (permalink / raw)
  To: LKML; +Cc: suresh.b.siddha, venkatesh.pallipadi

[-- Attachment #1: Type: text/plain, Size: 1669 bytes --]

Hello,

We need to use IPI shortcut to send lapic timer broadcast
to avoid the latency of sending IPI one bye one on systems with many
logical processors when NO_HZ is disabled.
Without this patch,I have seen upstream kernel with RHEL 5 kernel
config boot hang .

The patch also changes physflat_send_IPI_all to IPI shortcut mode.

Please review, and apply.

**The patch is enclosed in text attachment*
**Using web client to send the patch* *
**below is for review, please apply attached  patch*/

Thanks,
Luming


Signed-off-by: Yu Luming <luming.yu@intel.com>

 apic.c         |    4 +++-
 apic_flat_64.c |    7 ++++++-
 2 files changed, 9 insertions(+), 2 deletions(-)

--- linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c.0	2009-06-28
20:22:55.000000000 -0600
+++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c	2009-06-29
00:21:44.000000000 -0600
@@ -419,7 +419,9 @@
 static void lapic_timer_broadcast(const struct cpumask *mask)
 {
 #ifdef CONFIG_SMP
-	apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
+	if (cpus_empty(*mask))
+		return;
+	apic->send_IPI_all(LOCAL_TIMER_VECTOR);
 #endif
 }

--- linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c.0	2009-06-29
00:13:26.000000000 -0600
+++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c	2009-06-29
00:11:23.000000000 -0600
@@ -274,7 +274,12 @@

 static void physflat_send_IPI_all(int vector)
 {
-	physflat_send_IPI_mask(cpu_online_mask, vector);
+	if (vector == NMI_VECTOR) {
+		physflat_send_IPI_mask(cpu_online_mask, vector);
+	} else {
+		__default_send_IPI_shortcut(APIC_DEST_ALLINC,
+					    vector, apic->dest_logical);
+	}
 }

 static unsigned int physflat_cpu_mask_to_apicid(const struct cpumask *cpumask)

[-- Attachment #2: 1.patch --]
[-- Type: application/octet-stream, Size: 999 bytes --]

--- linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c.0	2009-06-28 20:22:55.000000000 -0600
+++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c	2009-06-29 00:21:44.000000000 -0600
@@ -419,7 +419,9 @@
 static void lapic_timer_broadcast(const struct cpumask *mask)
 {
 #ifdef CONFIG_SMP
-	apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
+	if (cpus_empty(*mask))
+		return;
+	apic->send_IPI_all(LOCAL_TIMER_VECTOR);
 #endif
 }
 
--- linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c.0	2009-06-29 00:13:26.000000000 -0600
+++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c	2009-06-29 00:11:23.000000000 -0600
@@ -274,7 +274,12 @@
 
 static void physflat_send_IPI_all(int vector)
 {
-	physflat_send_IPI_mask(cpu_online_mask, vector);
+	if (vector == NMI_VECTOR) {
+		physflat_send_IPI_mask(cpu_online_mask, vector);
+	} else {
+		__default_send_IPI_shortcut(APIC_DEST_ALLINC,
+					    vector, apic->dest_logical);
+	}
 }
 
 static unsigned int physflat_cpu_mask_to_apicid(const struct cpumask *cpumask)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29  6:47 [RFC patch] Use IPI_shortcut for lapic timer broadcast Luming Yu
@ 2009-06-29  7:20 ` Ingo Molnar
  2009-06-29  8:04   ` Luming Yu
  2009-06-29 20:34 ` Pallipadi, Venkatesh
  1 sibling, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2009-06-29  7:20 UTC (permalink / raw)
  To: Luming Yu
  Cc: LKML, suresh.b.siddha, venkatesh.pallipadi, Thomas Gleixner,
	H. Peter Anvin


* Luming Yu <luming.yu@gmail.com> wrote:

> Hello,
> 
> We need to use IPI shortcut to send lapic timer broadcast
> to avoid the latency of sending IPI one bye one on systems with many
> logical processors when NO_HZ is disabled.
> Without this patch,I have seen upstream kernel with RHEL 5 kernel
> config boot hang .

hm, that might be a valid optimization - but why does the lack of 
this optimization result in a hang?

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29  7:20 ` Ingo Molnar
@ 2009-06-29  8:04   ` Luming Yu
  2009-06-29  8:16     ` Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: Luming Yu @ 2009-06-29  8:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, suresh.b.siddha, venkatesh.pallipadi, Thomas Gleixner,
	H. Peter Anvin

On Mon, Jun 29, 2009 at 3:20 PM, Ingo Molnar<mingo@elte.hu> wrote:
>
> * Luming Yu <luming.yu@gmail.com> wrote:
>
>> Hello,
>>
>> We need to use IPI shortcut to send lapic timer broadcast
>> to avoid the latency of sending IPI one bye one on systems with many
>> logical processors when NO_HZ is disabled.
>> Without this patch,I have seen upstream kernel with RHEL 5 kernel
>> config boot hang .
>
> hm, that might be a valid optimization - but why does the lack of
> this optimization result in a hang?

It is hang caused by kernel code for work around lapic-timer-stop issue.
With HZ=1000, and a lot of cpus (eg. 64 logical cpus), cpu 0 will be
busy working on send TIMER IPI instead of making progress in boot
(right after deep-C-state has been used).




>
>        Ingo
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29  8:04   ` Luming Yu
@ 2009-06-29  8:16     ` Ingo Molnar
  2009-06-29  8:21       ` Luming Yu
  0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2009-06-29  8:16 UTC (permalink / raw)
  To: Luming Yu
  Cc: LKML, suresh.b.siddha, venkatesh.pallipadi, Thomas Gleixner,
	H. Peter Anvin


* Luming Yu <luming.yu@gmail.com> wrote:

> On Mon, Jun 29, 2009 at 3:20 PM, Ingo Molnar<mingo@elte.hu> wrote:
> >
> > * Luming Yu <luming.yu@gmail.com> wrote:
> >
> >> Hello,
> >>
> >> We need to use IPI shortcut to send lapic timer broadcast
> >> to avoid the latency of sending IPI one bye one on systems with many
> >> logical processors when NO_HZ is disabled.
> >> Without this patch,I have seen upstream kernel with RHEL 5 kernel
> >> config boot hang .
> >
> > hm, that might be a valid optimization - but why does the lack of
> > this optimization result in a hang?
> 
> It is hang caused by kernel code for work around lapic-timer-stop 
> issue. With HZ=1000, and a lot of cpus (eg. 64 logical cpus), cpu 
> 0 will be busy working on send TIMER IPI instead of making 
> progress in boot (right after deep-C-state has been used).

that's a bit weird. With HZ=1000 we have 1000 usecs between each 
timer tick. Assuming a CPU sends to a lot of CPUs (64 logical CPUs) 
that means that each IPI takes more than ~15 microseconds to 
process. On what hardware/platform can this happen realistically?

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29  8:16     ` Ingo Molnar
@ 2009-06-29  8:21       ` Luming Yu
  2009-06-29  8:30         ` Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: Luming Yu @ 2009-06-29  8:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, suresh.b.siddha, venkatesh.pallipadi, Thomas Gleixner,
	H. Peter Anvin

On Mon, Jun 29, 2009 at 4:16 PM, Ingo Molnar<mingo@elte.hu> wrote:
>
> * Luming Yu <luming.yu@gmail.com> wrote:
>
>> On Mon, Jun 29, 2009 at 3:20 PM, Ingo Molnar<mingo@elte.hu> wrote:
>> >
>> > * Luming Yu <luming.yu@gmail.com> wrote:
>> >
>> >> Hello,
>> >>
>> >> We need to use IPI shortcut to send lapic timer broadcast
>> >> to avoid the latency of sending IPI one bye one on systems with many
>> >> logical processors when NO_HZ is disabled.
>> >> Without this patch,I have seen upstream kernel with RHEL 5 kernel
>> >> config boot hang .
>> >
>> > hm, that might be a valid optimization - but why does the lack of
>> > this optimization result in a hang?
>>
>> It is hang caused by kernel code for work around lapic-timer-stop
>> issue. With HZ=1000, and a lot of cpus (eg. 64 logical cpus), cpu
>> 0 will be busy working on send TIMER IPI instead of making
>> progress in boot (right after deep-C-state has been used).
>
> that's a bit weird. With HZ=1000 we have 1000 usecs between each
> timer tick. Assuming a CPU sends to a lot of CPUs (64 logical CPUs)
> that means that each IPI takes more than ~15 microseconds to
> process. On what hardware/platform can this happen realistically?

https://bugzilla.redhat.com/show_bug.cgi?id=499271

Someone has measured that  it needs 50-100us latency to send one IPI

>
>        Ingo
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29  8:21       ` Luming Yu
@ 2009-06-29  8:30         ` Ingo Molnar
  2009-06-29  8:43           ` Luming Yu
  0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2009-06-29  8:30 UTC (permalink / raw)
  To: Luming Yu, Arjan van de Ven
  Cc: LKML, suresh.b.siddha, venkatesh.pallipadi, Thomas Gleixner,
	H. Peter Anvin


* Luming Yu <luming.yu@gmail.com> wrote:

> On Mon, Jun 29, 2009 at 4:16 PM, Ingo Molnar<mingo@elte.hu> wrote:
> >
> > * Luming Yu <luming.yu@gmail.com> wrote:
> >
> >> On Mon, Jun 29, 2009 at 3:20 PM, Ingo Molnar<mingo@elte.hu> wrote:
> >> >
> >> > * Luming Yu <luming.yu@gmail.com> wrote:
> >> >
> >> >> Hello,
> >> >>
> >> >> We need to use IPI shortcut to send lapic timer broadcast
> >> >> to avoid the latency of sending IPI one bye one on systems with many
> >> >> logical processors when NO_HZ is disabled.
> >> >> Without this patch,I have seen upstream kernel with RHEL 5 kernel
> >> >> config boot hang .
> >> >
> >> > hm, that might be a valid optimization - but why does the lack of
> >> > this optimization result in a hang?
> >>
> >> It is hang caused by kernel code for work around lapic-timer-stop
> >> issue. With HZ=1000, and a lot of cpus (eg. 64 logical cpus), cpu
> >> 0 will be busy working on send TIMER IPI instead of making
> >> progress in boot (right after deep-C-state has been used).
> >
> > that's a bit weird. With HZ=1000 we have 1000 usecs between each
> > timer tick. Assuming a CPU sends to a lot of CPUs (64 logical CPUs)
> > that means that each IPI takes more than ~15 microseconds to
> > process. On what hardware/platform can this happen realistically?
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=499271
> 
> Someone has measured that it needs 50-100us latency to send one 
> IPI

Ugh. What platform is it that takes this much time to pass an IPI? 

IPIs are the lifeline of process messaging under Linux. TLB flushes 
in threaded apps rely on it (heavily), the scheduler relies on it 
for wakeups (heavily) and a lot of other code relies on IPIs as 
well.

Even a Pentium-5 100 MHz dual box was able to do cross-CPU IPIs 
within 10-20 microseconds more than a decade ago - so 50-100 usecs 
latency on a modern platform is totally out of this planet and will 
hurt Linux performance big time. And the worst thing about it is 
that none of the usual performance metrics will really show _why_ 
performance is tanking ...

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29  8:30         ` Ingo Molnar
@ 2009-06-29  8:43           ` Luming Yu
  2009-06-29  9:15             ` Ingo Molnar
  2009-06-29 14:01             ` Arjan van de Ven
  0 siblings, 2 replies; 13+ messages in thread
From: Luming Yu @ 2009-06-29  8:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Arjan van de Ven, LKML, suresh.b.siddha, venkatesh.pallipadi,
	Thomas Gleixner, H. Peter Anvin

On Mon, Jun 29, 2009 at 4:30 PM, Ingo Molnar<mingo@elte.hu> wrote:
>
> * Luming Yu <luming.yu@gmail.com> wrote:
>
>> On Mon, Jun 29, 2009 at 4:16 PM, Ingo Molnar<mingo@elte.hu> wrote:
>> >
>> > * Luming Yu <luming.yu@gmail.com> wrote:
>> >
>> >> On Mon, Jun 29, 2009 at 3:20 PM, Ingo Molnar<mingo@elte.hu> wrote:
>> >> >
>> >> > * Luming Yu <luming.yu@gmail.com> wrote:
>> >> >
>> >> >> Hello,
>> >> >>
>> >> >> We need to use IPI shortcut to send lapic timer broadcast
>> >> >> to avoid the latency of sending IPI one bye one on systems with many
>> >> >> logical processors when NO_HZ is disabled.
>> >> >> Without this patch,I have seen upstream kernel with RHEL 5 kernel
>> >> >> config boot hang .
>> >> >
>> >> > hm, that might be a valid optimization - but why does the lack of
>> >> > this optimization result in a hang?
>> >>
>> >> It is hang caused by kernel code for work around lapic-timer-stop
>> >> issue. With HZ=1000, and a lot of cpus (eg. 64 logical cpus), cpu
>> >> 0 will be busy working on send TIMER IPI instead of making
>> >> progress in boot (right after deep-C-state has been used).
>> >
>> > that's a bit weird. With HZ=1000 we have 1000 usecs between each
>> > timer tick. Assuming a CPU sends to a lot of CPUs (64 logical CPUs)
>> > that means that each IPI takes more than ~15 microseconds to
>> > process. On what hardware/platform can this happen realistically?
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=499271
>>
>> Someone has measured that it needs 50-100us latency to send one
>> IPI
>
> Ugh. What platform is it that takes this much time to pass an IPI?
>
> IPIs are the lifeline of process messaging under Linux. TLB flushes
> in threaded apps rely on it (heavily), the scheduler relies on it
> for wakeups (heavily) and a lot of other code relies on IPIs as
> well.
>
> Even a Pentium-5 100 MHz dual box was able to do cross-CPU IPIs
> within 10-20 microseconds more than a decade ago - so 50-100 usecs
> latency on a modern platform is totally out of this planet and will
> hurt Linux performance big time. And the worst thing about it is
> that none of the usual performance metrics will really show _why_
> performance is tanking ...
>


Please note this is deep-C-state related.
C state does add extra latency.. but I don't know how much...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29  8:43           ` Luming Yu
@ 2009-06-29  9:15             ` Ingo Molnar
  2009-06-29 14:01             ` Arjan van de Ven
  1 sibling, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2009-06-29  9:15 UTC (permalink / raw)
  To: Luming Yu
  Cc: Arjan van de Ven, LKML, suresh.b.siddha, venkatesh.pallipadi,
	Thomas Gleixner, H. Peter Anvin


* Luming Yu <luming.yu@gmail.com> wrote:

> Please note this is deep-C-state related. C state does add extra 
> latency.. but I don't know how much...

ok, so if each of the CPUs has to be brought out of deep-C and with 
each one 50-100 usecs a pop one could see the total time taking more 
than 1000 usecs. That would explain the hang.

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29  8:43           ` Luming Yu
  2009-06-29  9:15             ` Ingo Molnar
@ 2009-06-29 14:01             ` Arjan van de Ven
  1 sibling, 0 replies; 13+ messages in thread
From: Arjan van de Ven @ 2009-06-29 14:01 UTC (permalink / raw)
  To: Luming Yu
  Cc: Ingo Molnar, LKML, suresh.b.siddha, venkatesh.pallipadi,
	Thomas Gleixner, H. Peter Anvin

On Mon, 29 Jun 2009 16:43:16 +0800
Luming Yu <luming.yu@gmail.com> wrote:

> > Even a Pentium-5 100 MHz dual box was able to do cross-CPU IPIs
> > within 10-20 microseconds more than a decade ago - so 50-100 usecs
> > latency on a modern platform is totally out of this planet and will
> > hurt Linux performance big time. And the worst thing about it is
> > that none of the usual performance metrics will really show _why_
> > performance is tanking ...
> >
> 
> 
> Please note this is deep-C-state related.
> C state does add extra latency.. but I don't know how much...

C states normally only add on the "wait for" side, not on the "send"
side.

(RHEL5 is rather old, so it may have done things a bit different)

Maybe it is time for mainline to not allow a !NO_HZ config....


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29  6:47 [RFC patch] Use IPI_shortcut for lapic timer broadcast Luming Yu
  2009-06-29  7:20 ` Ingo Molnar
@ 2009-06-29 20:34 ` Pallipadi, Venkatesh
  2009-06-30  7:01   ` Luming Yu
  1 sibling, 1 reply; 13+ messages in thread
From: Pallipadi, Venkatesh @ 2009-06-29 20:34 UTC (permalink / raw)
  To: Luming Yu
  Cc: LKML, Siddha, Suresh B, Arjan van de Ven, Thomas Gleixner,
	H. Peter Anvin, Ingo Molnar

On Sun, 2009-06-28 at 23:47 -0700, Luming Yu wrote:
> Hello,
> 
> We need to use IPI shortcut to send lapic timer broadcast
> to avoid the latency of sending IPI one bye one on systems with many
> logical processors when NO_HZ is disabled.
> Without this patch,I have seen upstream kernel with RHEL 5 kernel
> config boot hang .
> 
> The patch also changes physflat_send_IPI_all to IPI shortcut mode.
> 
> Please review, and apply.
> 
> **The patch is enclosed in text attachment*
> **Using web client to send the patch* *
> **below is for review, please apply attached  patch*/
> 
> Thanks,
> Luming
> 
> 
> Signed-off-by: Yu Luming <luming.yu@intel.com>
> 
>  apic.c         |    4 +++-
>  apic_flat_64.c |    7 ++++++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> --- linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c.0	2009-06-28
> 20:22:55.000000000 -0600
> +++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c	2009-06-29
> 00:21:44.000000000 -0600
> @@ -419,7 +419,9 @@
>  static void lapic_timer_broadcast(const struct cpumask *mask)
>  {
>  #ifdef CONFIG_SMP
> -	apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
> +	if (cpus_empty(*mask))
> +		return;
> +	apic->send_IPI_all(LOCAL_TIMER_VECTOR);
>  #endif
>  }
> 

Hmm.. this change looks wrong. This will cause unnecessary timer
broadcasts when nohz is actually enabled. I mean, when nohz is enabled,
we want to send this interrupt broadcast to one or some subset of CPUs
and this change sends the interrupt to all CPUs.

On a 16 logical CPUs with nohz enabled, powertop without this patch I
see
Wakeups-from-idle per second : 11.6     interval: 15.0s
and with this patch I see
Wakeups-from-idle per second : 24.7     interval: 15.0s

Also, cat /proc/interrupts | grep LOC shows this problem where all CPUs
seem to get roughly same number of local APIC interrupts over a period
of time with this patch.

We should only do the allbutself when nohz is disabled with a big
unlikely() around that code.

Thanks,
Venki

> --- linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c.0	2009-06-29
> 00:13:26.000000000 -0600
> +++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c	2009-06-29
> 00:11:23.000000000 -0600
> @@ -274,7 +274,12 @@
> 
>  static void physflat_send_IPI_all(int vector)
>  {
> -	physflat_send_IPI_mask(cpu_online_mask, vector);
> +	if (vector == NMI_VECTOR) {
> +		physflat_send_IPI_mask(cpu_online_mask, vector);
> +	} else {
> +		__default_send_IPI_shortcut(APIC_DEST_ALLINC,
> +					    vector, apic->dest_logical);
> +	}
>  }
> 
>  static unsigned int physflat_cpu_mask_to_apicid(const struct cpumask *cpumask)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-29 20:34 ` Pallipadi, Venkatesh
@ 2009-06-30  7:01   ` Luming Yu
  2009-07-03  0:23     ` Pallipadi, Venkatesh
  0 siblings, 1 reply; 13+ messages in thread
From: Luming Yu @ 2009-06-30  7:01 UTC (permalink / raw)
  To: Pallipadi, Venkatesh
  Cc: LKML, Siddha, Suresh B, Arjan van de Ven, Thomas Gleixner,
	H. Peter Anvin, Ingo Molnar

[-- Attachment #1: Type: text/plain, Size: 1434 bytes --]

Thanks for review. How about the following patch?

/**The patch is enclosed in text attachment**
**Using web client to send the patch**
**below is for review, please apply attached  patch*/


Signed-off-by: Yu Luming <luming.yu@intel.com>

 apic.c         |    5 ++++-
 apic_flat_64.c |    7 ++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)


--- linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c.0	2009-06-29
23:45:05.000000000 -0600
+++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c	2009-06-30
00:37:56.000000000 -0600
@@ -419,7 +419,10 @@
 static void lapic_timer_broadcast(const struct cpumask *mask)
 {
 #ifdef CONFIG_SMP
-	apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
+	if (unlikely(cpumask_weight(mask) == num_online_cpus() -1))
+		apic->send_IPI_all(LOCAL_TIMER_VECTOR);
+	else
+		apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
 #endif
 }

--- linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c.0	2009-06-29
00:13:26.000000000 -0600
+++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c	2009-06-29
00:11:23.000000000 -0600
@@ -274,7 +274,12 @@

 static void physflat_send_IPI_all(int vector)
 {
-	physflat_send_IPI_mask(cpu_online_mask, vector);
+	if (vector == NMI_VECTOR) {
+		physflat_send_IPI_mask(cpu_online_mask, vector);
+	} else {
+		__default_send_IPI_shortcut(APIC_DEST_ALLINC,
+					    vector, apic->dest_logical);
+	}
 }

 static unsigned int physflat_cpu_mask_to_apicid(const struct cpumask *cpumask)

[-- Attachment #2: 1.patch --]
[-- Type: application/octet-stream, Size: 1084 bytes --]

--- linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c.0	2009-06-29 23:45:05.000000000 -0600
+++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c	2009-06-30 00:37:56.000000000 -0600
@@ -419,7 +419,10 @@
 static void lapic_timer_broadcast(const struct cpumask *mask)
 {
 #ifdef CONFIG_SMP
-	apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
+	if (unlikely(cpumask_weight(mask) == num_online_cpus() -1))
+		apic->send_IPI_all(LOCAL_TIMER_VECTOR);
+	else
+		apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
 #endif
 }
 
--- linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c.0	2009-06-29 00:13:26.000000000 -0600
+++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c	2009-06-29 00:11:23.000000000 -0600
@@ -274,7 +274,12 @@
 
 static void physflat_send_IPI_all(int vector)
 {
-	physflat_send_IPI_mask(cpu_online_mask, vector);
+	if (vector == NMI_VECTOR) {
+		physflat_send_IPI_mask(cpu_online_mask, vector);
+	} else {
+		__default_send_IPI_shortcut(APIC_DEST_ALLINC,
+					    vector, apic->dest_logical);
+	}
 }
 
 static unsigned int physflat_cpu_mask_to_apicid(const struct cpumask *cpumask)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-06-30  7:01   ` Luming Yu
@ 2009-07-03  0:23     ` Pallipadi, Venkatesh
  2009-07-03  2:04       ` Luming Yu
  0 siblings, 1 reply; 13+ messages in thread
From: Pallipadi, Venkatesh @ 2009-07-03  0:23 UTC (permalink / raw)
  To: Luming Yu
  Cc: LKML, Siddha, Suresh B, Arjan van de Ven, Thomas Gleixner,
	H. Peter Anvin, Ingo Molnar

On Tue, 2009-06-30 at 00:01 -0700, Luming Yu wrote:
> Thanks for review. How about the following patch?
> 
> /**The patch is enclosed in text attachment**
> **Using web client to send the patch**
> **below is for review, please apply attached  patch*/
> 
> 
> Signed-off-by: Yu Luming <luming.yu@intel.com>
> 
>  apic.c         |    5 ++++-
>  apic_flat_64.c |    7 ++++++-
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> 
> --- linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c.0	2009-06-29
> 23:45:05.000000000 -0600
> +++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c	2009-06-30
> 00:37:56.000000000 -0600
> @@ -419,7 +419,10 @@
>  static void lapic_timer_broadcast(const struct cpumask *mask)
>  {
>  #ifdef CONFIG_SMP
> -	apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
> +	if (unlikely(cpumask_weight(mask) == num_online_cpus() -1))
> +		apic->send_IPI_all(LOCAL_TIMER_VECTOR);
> +	else
> +		apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
>  #endif
>  }

I dont think it is a good idea to pay the penalty for cpumask_weight and
num_online_cpus, for the more common tickless case to fix this problem
with less common no tickeless case.

We should be able to add an interface to get tick_nohz_enabled from
tick-sched.c and use that instead.

Thanks,
Venki

> 
> --- linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c.0	2009-06-29
> 00:13:26.000000000 -0600
> +++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic_flat_64.c	2009-06-29
> 00:11:23.000000000 -0600
> @@ -274,7 +274,12 @@
> 
>  static void physflat_send_IPI_all(int vector)
>  {
> -	physflat_send_IPI_mask(cpu_online_mask, vector);
> +	if (vector == NMI_VECTOR) {
> +		physflat_send_IPI_mask(cpu_online_mask, vector);
> +	} else {
> +		__default_send_IPI_shortcut(APIC_DEST_ALLINC,
> +					    vector, apic->dest_logical);
> +	}
>  }
> 
>  static unsigned int physflat_cpu_mask_to_apicid(const struct cpumask *cpumask)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC patch] Use IPI_shortcut for lapic timer broadcast
  2009-07-03  0:23     ` Pallipadi, Venkatesh
@ 2009-07-03  2:04       ` Luming Yu
  0 siblings, 0 replies; 13+ messages in thread
From: Luming Yu @ 2009-07-03  2:04 UTC (permalink / raw)
  To: Pallipadi, Venkatesh
  Cc: LKML, Siddha, Suresh B, Arjan van de Ven, Thomas Gleixner,
	H. Peter Anvin, Ingo Molnar

On Fri, Jul 3, 2009 at 8:23 AM, Pallipadi,
Venkatesh<venkatesh.pallipadi@intel.com> wrote:
> On Tue, 2009-06-30 at 00:01 -0700, Luming Yu wrote:
>> Thanks for review. How about the following patch?
>>
>> /**The patch is enclosed in text attachment**
>> **Using web client to send the patch**
>> **below is for review, please apply attached  patch*/
>>
>>
>> Signed-off-by: Yu Luming <luming.yu@intel.com>
>>
>>  apic.c         |    5 ++++-
>>  apic_flat_64.c |    7 ++++++-
>>  2 files changed, 10 insertions(+), 2 deletions(-)
>>
>>
>> --- linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c.0    2009-06-29
>> 23:45:05.000000000 -0600
>> +++ linux-2.6.30-rc6/arch/x86/kernel/apic/apic.c      2009-06-30
>> 00:37:56.000000000 -0600
>> @@ -419,7 +419,10 @@
>>  static void lapic_timer_broadcast(const struct cpumask *mask)
>>  {
>>  #ifdef CONFIG_SMP
>> -     apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
>> +     if (unlikely(cpumask_weight(mask) == num_online_cpus() -1))
>> +             apic->send_IPI_all(LOCAL_TIMER_VECTOR);
>> +     else
>> +             apic->send_IPI_mask(mask, LOCAL_TIMER_VECTOR);
>>  #endif
>>  }
>
> I dont think it is a good idea to pay the penalty for cpumask_weight and
> num_online_cpus, for the more common tickless case to fix this problem
> with less common no tickeless case.

In NO_HZ, the lapic_timer_broadcast is rare thing too...

>
> We should be able to add an interface to get tick_nohz_enabled from
> tick-sched.c and use that instead.

If we use send_IPI_all just for NO_HZ is disabled, we will leave a corner case
here that lapic_timer_broadcast could be really really being actively used even
NO_HZ enabled... if this will never happen, I would agree with you..

Thanks,
LUming

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2009-07-03  2:04 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-29  6:47 [RFC patch] Use IPI_shortcut for lapic timer broadcast Luming Yu
2009-06-29  7:20 ` Ingo Molnar
2009-06-29  8:04   ` Luming Yu
2009-06-29  8:16     ` Ingo Molnar
2009-06-29  8:21       ` Luming Yu
2009-06-29  8:30         ` Ingo Molnar
2009-06-29  8:43           ` Luming Yu
2009-06-29  9:15             ` Ingo Molnar
2009-06-29 14:01             ` Arjan van de Ven
2009-06-29 20:34 ` Pallipadi, Venkatesh
2009-06-30  7:01   ` Luming Yu
2009-07-03  0:23     ` Pallipadi, Venkatesh
2009-07-03  2:04       ` Luming Yu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.