linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
@ 2008-09-05 21:40 Mike Travis
  2008-09-05 21:40 ` [PATCH 1/3] " Mike Travis
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Mike Travis @ 2008-09-05 21:40 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton
  Cc: Jack Steiner, Jes Sorensen, David Miller, Thomas Gleixner, linux-kernel


  * Cleanup cpumask_t usages in smp_call_function_mask function chain
    to prevent stack overflow problem when NR_CPUS=4096.

  * Reduce the number of passed cpumask_t variables in the following
    call chain for x86_64:

	smp_call_function_mask -->
	    arch_send_call_function_ipi->
		    smp_ops.send_call_func_ipi -->
			    genapic->send_IPI_mask

    Since the smp_call_function_mask() is an EXPORTED function, we
    cannot change its calling interface for a patch to 2.6.27.

    The smp_ops.send_call_func_ipi interface is internal only and
    has two arch provided functions:

	arch/x86/kernel/smp.c:  .send_call_func_ipi = native_send_call_func_ipi
	arch/x86/xen/smp.c:     .send_call_func_ipi = xen_smp_send_call_function_ipi
	arch/x86/mach-voyager/voyager_smp.c:    (uses native_send_call_func_ipi)

    Therefore modifying the internal interface to use a cpumask_t pointer
    is straight-forward.

    The changes to genapic are much more extensive and are affected by the
    recent additions of the x2apic modes, so they will be done for 2.6.28 only.
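
To make the stack cost concrete, here is a minimal userspace sketch (not
kernel code).  The demo_* names are made up for illustration; the real
cpumask_t lives in include/linux/cpumask.h.  It assumes a mask holding one
bit per possible CPU, so NR_CPUS=4096 gives 4096/8 = 512 bytes per mask,
and it contrasts passing that mask by value with passing a pointer:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Illustrative stand-ins only; these are not the kernel's definitions. */
#define DEMO_NR_CPUS		4096
#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_MASK_LONGS \
	((DEMO_NR_CPUS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* One bit per possible CPU, wrapped in a struct so it can be copied. */
typedef struct {
	unsigned long bits[DEMO_MASK_LONGS];
} demo_cpumask_t;

static void demo_set(size_t cpu, demo_cpumask_t *mask)
{
	assert(cpu < DEMO_NR_CPUS);
	mask->bits[cpu / DEMO_BITS_PER_LONG] |= 1UL << (cpu % DEMO_BITS_PER_LONG);
}

/* By value: the caller copies the whole mask for every call. */
static int demo_test_by_value(size_t cpu, demo_cpumask_t mask)
{
	assert(cpu < DEMO_NR_CPUS);
	return (mask.bits[cpu / DEMO_BITS_PER_LONG] >>
		(cpu % DEMO_BITS_PER_LONG)) & 1UL;
}

/* By pointer: only an address is passed; the mask is not copied. */
static int demo_test_by_pointer(size_t cpu, const demo_cpumask_t *mask)
{
	assert(cpu < DEMO_NR_CPUS);
	return (mask->bits[cpu / DEMO_BITS_PER_LONG] >>
		(cpu % DEMO_BITS_PER_LONG)) & 1UL;
}
```

With this layout sizeof(demo_cpumask_t) is 512, so each by-value argument
or local copy along a call chain costs another 512 bytes of stack, while
the pointer variant costs one pointer per call.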

Based on 2.6.27-rc5-git6.

Applies to linux-2.6.tip/master (with FUZZ).

Signed-off-by: Mike Travis <travis@sgi.com>
---



* [PATCH 1/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-05 21:40 [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Mike Travis
@ 2008-09-05 21:40 ` Mike Travis
  2008-09-05 21:40 ` [PATCH 2/3] x86: reduce stack requirements for send_call_func_ipi Mike Travis
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Mike Travis @ 2008-09-05 21:40 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton
  Cc: Jack Steiner, Jes Sorensen, David Miller, Thomas Gleixner, linux-kernel

[-- Attachment #1: smp_call_function_mask --]
[-- Type: text/plain, Size: 1817 bytes --]

  * Clean up cpumask_t usage in smp_call_function_mask to remove the
    stack overflow problem when NR_CPUS=4096.  This removes over 1000
    bytes from the stack in that configuration.
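
As a rough illustration of where the savings would come from, the
userspace sketch below mirrors the two shapes of the target-mask
computation in the diff.  The t_* names are hypothetical stand-ins for
cpus_and() and cpu_clear().  Assuming 512 bytes per mask at NR_CPUS=4096,
dropping the allbutself local and the by-value argument to the quiesce
helper would account for about 2 * 512 = 1024 bytes, which is consistent
with the figure above, though that breakdown is my assumption:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Illustrative stand-ins only; these are not the kernel's definitions. */
#define T_NR_CPUS	4096
#define T_LONG_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define T_LONGS		((T_NR_CPUS + T_LONG_BITS - 1) / T_LONG_BITS)

typedef struct {
	unsigned long bits[T_LONGS];
} t_cpumask_t;

/* dst = a & b, word by word (dst may alias a or b). */
static void t_cpus_and(t_cpumask_t *dst, const t_cpumask_t *a,
		       const t_cpumask_t *b)
{
	size_t i;

	for (i = 0; i < T_LONGS; i++)
		dst->bits[i] = a->bits[i] & b->bits[i];
}

static void t_cpu_clear(size_t cpu, t_cpumask_t *m)
{
	assert(cpu < T_NR_CPUS);
	m->bits[cpu / T_LONG_BITS] &= ~(1UL << (cpu % T_LONG_BITS));
}

/* Old shape: build an "all but self" temporary on the stack first. */
static void t_targets_old(t_cpumask_t *mask, const t_cpumask_t *online,
			  size_t self)
{
	t_cpumask_t allbutself = *online;	/* extra mask-sized local */

	t_cpu_clear(self, &allbutself);
	t_cpus_and(mask, mask, &allbutself);
}

/* New shape: AND with the online map, then clear our own bit in place. */
static void t_targets_new(t_cpumask_t *mask, const t_cpumask_t *online,
			  size_t self)
{
	t_cpus_and(mask, mask, online);
	t_cpu_clear(self, mask);
}
```

Both shapes compute mask & online & ~self, so the result is the same;
only the temporary disappears.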

Based on 2.6.27-rc5-git6.

Applies to linux-2.6.tip/master (with FUZZ).

Signed-off-by: Mike Travis <travis@sgi.com>
---
 kernel/smp.c |   12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

--- linux-2.6.orig/kernel/smp.c
+++ linux-2.6/kernel/smp.c
@@ -303,7 +303,7 @@ static int check_stack_overflow(void)
  * If a faster scheme can be made, we could go back to preferring stack based
  * data -- the data allocation/free is non-zero cost.
  */
-static void smp_call_function_mask_quiesce_stack(cpumask_t mask)
+static void smp_call_function_mask_quiesce_stack(const cpumask_t *mask)
 {
 	struct call_single_data data;
 	int cpu;
@@ -311,7 +311,7 @@ static void smp_call_function_mask_quies
 	data.func = quiesce_dummy;
 	data.info = NULL;
 
-	for_each_cpu_mask(cpu, mask) {
+	for_each_cpu_mask_nr(cpu, *mask) {
 		data.flags = CSD_FLAG_WAIT;
 		generic_exec_single(cpu, &data);
 	}
@@ -339,7 +339,6 @@ int smp_call_function_mask(cpumask_t mas
 {
 	struct call_function_data d;
 	struct call_function_data *data = NULL;
-	cpumask_t allbutself;
 	unsigned long flags;
 	int cpu, num_cpus;
 	int slowpath = 0;
@@ -353,9 +352,8 @@ dump_stack();
 	WARN_ON(irqs_disabled());
 
 	cpu = smp_processor_id();
-	allbutself = cpu_online_map;
-	cpu_clear(cpu, allbutself);
-	cpus_and(mask, mask, allbutself);
+	cpus_and(mask, mask, cpu_online_map);
+	cpu_clear(cpu, mask);
 	num_cpus = cpus_weight(mask);
 
 	/*
@@ -398,7 +396,7 @@ dump_stack();
 	if (wait) {
 		csd_flag_wait(&data->csd);
 		if (unlikely(slowpath))
-			smp_call_function_mask_quiesce_stack(mask);
+			smp_call_function_mask_quiesce_stack(&mask);
 	}
 
 	return 0;



* [PATCH 2/3] x86: reduce stack requirements for send_call_func_ipi
  2008-09-05 21:40 [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Mike Travis
  2008-09-05 21:40 ` [PATCH 1/3] " Mike Travis
@ 2008-09-05 21:40 ` Mike Travis
  2008-09-05 21:40 ` [PATCH 3/3] x86: restore 4096 limit for NR_CPUS Mike Travis
  2008-09-06 13:29 ` [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Ingo Molnar
  3 siblings, 0 replies; 14+ messages in thread
From: Mike Travis @ 2008-09-05 21:40 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton
  Cc: Jack Steiner, Jes Sorensen, David Miller, Thomas Gleixner, linux-kernel

[-- Attachment #1: smp_ops --]
[-- Type: text/plain, Size: 2883 bytes --]

  * By converting the internal x86 smp_ops function send_call_func_ipi
    to pass a pointer to the cpumask_t variable, we greatly reduce the
    stack space required when NR_CPUS=4096.

    Further reduction will be realized when the send_IPI_mask interface
    is changed in 2.6.28.
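
The shape of this change can be sketched in userspace as follows.  This
is only an illustration: the ex_* names are hypothetical, and the "send"
routine just counts target CPUs instead of raising IPIs.  The point is
that the exported-style wrapper still takes the mask by value, while the
internal ops hook receives a pointer and makes no further copy:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Illustrative stand-ins only; these are not the kernel's definitions. */
#define EX_NR_CPUS	4096
#define EX_LONG_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define EX_LONGS	((EX_NR_CPUS + EX_LONG_BITS - 1) / EX_LONG_BITS)

typedef struct {
	unsigned long bits[EX_LONGS];
} ex_cpumask_t;

/* Internal ops table: the hook takes a pointer, not a copy. */
struct ex_smp_ops {
	void (*send_call_func_ipi)(const ex_cpumask_t *mask);
};

static int ex_ipis_sent;	/* counts CPUs we would have signalled */

static void ex_native_send_call_func_ipi(const ex_cpumask_t *mask)
{
	size_t cpu;

	assert(mask != NULL);
	for (cpu = 0; cpu < EX_NR_CPUS; cpu++)
		if (mask->bits[cpu / EX_LONG_BITS] &
		    (1UL << (cpu % EX_LONG_BITS)))
			ex_ipis_sent++;
}

static struct ex_smp_ops ex_smp_ops = {
	.send_call_func_ipi = ex_native_send_call_func_ipi,
};

/* Wrapper keeps its by-value signature and hands down the address. */
static void ex_arch_send_call_function_ipi(ex_cpumask_t mask)
{
	ex_smp_ops.send_call_func_ipi(&mask);
}
```

One copy is still made at the wrapper boundary, which matches the note
above that further reduction has to wait for the later interface change.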

Based on 2.6.27-rc5-git6.

Applies to linux-2.6.tip/master (with FUZZ).

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/kernel/smp.c |    6 +++---
 arch/x86/xen/smp.c    |    6 +++---
 include/asm-x86/smp.h |    6 +++---
 3 files changed, 9 insertions(+), 9 deletions(-)

--- linux-2.6.orig/arch/x86/kernel/smp.c
+++ linux-2.6/arch/x86/kernel/smp.c
@@ -126,18 +126,18 @@ void native_send_call_func_single_ipi(in
 	send_IPI_mask(cpumask_of_cpu(cpu), CALL_FUNCTION_SINGLE_VECTOR);
 }
 
-void native_send_call_func_ipi(cpumask_t mask)
+void native_send_call_func_ipi(const cpumask_t *mask)
 {
 	cpumask_t allbutself;
 
 	allbutself = cpu_online_map;
 	cpu_clear(smp_processor_id(), allbutself);
 
-	if (cpus_equal(mask, allbutself) &&
+	if (cpus_equal(*mask, allbutself) &&
 	    cpus_equal(cpu_online_map, cpu_callout_map))
 		send_IPI_allbutself(CALL_FUNCTION_VECTOR);
 	else
-		send_IPI_mask(mask, CALL_FUNCTION_VECTOR);
+		send_IPI_mask(*mask, CALL_FUNCTION_VECTOR);
 }
 
 static void stop_this_cpu(void *dummy)
--- linux-2.6.orig/arch/x86/xen/smp.c
+++ linux-2.6/arch/x86/xen/smp.c
@@ -371,14 +371,14 @@ static void xen_send_IPI_mask(cpumask_t 
 		xen_send_IPI_one(cpu, vector);
 }
 
-static void xen_smp_send_call_function_ipi(cpumask_t mask)
+static void xen_smp_send_call_function_ipi(const cpumask_t *mask)
 {
 	int cpu;
 
-	xen_send_IPI_mask(mask, XEN_CALL_FUNCTION_VECTOR);
+	xen_send_IPI_mask(*mask, XEN_CALL_FUNCTION_VECTOR);
 
 	/* Make sure other vcpus get a chance to run if they need to. */
-	for_each_cpu_mask_nr(cpu, mask) {
+	for_each_cpu_mask_nr(cpu, *mask) {
 		if (xen_vcpu_stolen(cpu)) {
 			HYPERVISOR_sched_op(SCHEDOP_yield, 0);
 			break;
--- linux-2.6.orig/include/asm-x86/smp.h
+++ linux-2.6/include/asm-x86/smp.h
@@ -53,7 +53,7 @@ struct smp_ops {
 	void (*smp_send_stop)(void);
 	void (*smp_send_reschedule)(int cpu);
 
-	void (*send_call_func_ipi)(cpumask_t mask);
+	void (*send_call_func_ipi)(const cpumask_t *mask);
 	void (*send_call_func_single_ipi)(int cpu);
 };
 
@@ -103,14 +103,14 @@ static inline void arch_send_call_functi
 
 static inline void arch_send_call_function_ipi(cpumask_t mask)
 {
-	smp_ops.send_call_func_ipi(mask);
+	smp_ops.send_call_func_ipi(&mask);
 }
 
 void native_smp_prepare_boot_cpu(void);
 void native_smp_prepare_cpus(unsigned int max_cpus);
 void native_smp_cpus_done(unsigned int max_cpus);
 int native_cpu_up(unsigned int cpunum);
-void native_send_call_func_ipi(cpumask_t mask);
+void native_send_call_func_ipi(const cpumask_t *mask);
 void native_send_call_func_single_ipi(int cpu);
 
 extern int __cpu_disable(void);



* [PATCH 3/3] x86: restore 4096 limit for NR_CPUS
  2008-09-05 21:40 [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Mike Travis
  2008-09-05 21:40 ` [PATCH 1/3] " Mike Travis
  2008-09-05 21:40 ` [PATCH 2/3] x86: reduce stack requirements for send_call_func_ipi Mike Travis
@ 2008-09-05 21:40 ` Mike Travis
  2008-09-06 13:29 ` [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Ingo Molnar
  3 siblings, 0 replies; 14+ messages in thread
From: Mike Travis @ 2008-09-05 21:40 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton
  Cc: Jack Steiner, Jes Sorensen, David Miller, Thomas Gleixner, linux-kernel

[-- Attachment #1: Kconfig --]
[-- Type: text/plain, Size: 975 bytes --]

  * With the previous cleanups, NR_CPUS=4096 can now be enabled again.

Based on 2.6.27-rc5-git6.

Applies to linux-2.6.tip/master (with FUZZ).

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/Kconfig |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- linux-2.6.orig/arch/x86/Kconfig
+++ linux-2.6/arch/x86/Kconfig
@@ -587,8 +587,8 @@ config MAXSMP
 	  If unsure, say N.
 
 config NR_CPUS
-	int "Maximum number of CPUs (2-512)" if !MAXSMP
-	range 2 512
+	int "Maximum number of CPUs (2-4096)" if !MAXSMP
+	range 2 4096
 	depends on SMP
 	default "4096" if MAXSMP
 	default "32" if X86_NUMAQ || X86_SUMMIT || X86_BIGSMP || X86_ES7000
@@ -599,7 +599,7 @@ config NR_CPUS
 	  minimum value which makes sense is 2.
 
 	  This is purely to save memory - each supported CPU adds
-	  approximately eight kilobytes to the kernel image.
+	  approximately one kilobyte to the kernel image.
 
 config SCHED_SMT
 	bool "SMT (Hyperthreading) scheduler support"



* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-05 21:40 [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Mike Travis
                   ` (2 preceding siblings ...)
  2008-09-05 21:40 ` [PATCH 3/3] x86: restore 4096 limit for NR_CPUS Mike Travis
@ 2008-09-06 13:29 ` Ingo Molnar
  2008-09-06 18:12   ` Mike Travis
  2008-09-08  9:48   ` Jes Sorensen
  3 siblings, 2 replies; 14+ messages in thread
From: Ingo Molnar @ 2008-09-06 13:29 UTC (permalink / raw)
  To: Mike Travis
  Cc: Andrew Morton, Jack Steiner, Jes Sorensen, David Miller,
	Thomas Gleixner, linux-kernel


* Mike Travis <travis@sgi.com> wrote:

>   * Cleanup cpumask_t usages in smp_call_function_mask function chain
>     to prevent stack overflow problem when NR_CPUS=4096.
> 
>   * Reduce the number of passed cpumask_t variables in the following
>     call chain for x86_64:
> 
> 	smp_call_function_mask -->
> 	    arch_send_call_function_ipi->
> 		    smp_ops.send_call_func_ipi -->
> 			    genapic->send_IPI_mask
> 
>     Since the smp_call_function_mask() is an EXPORTED function, we
>     cannot change its calling interface for a patch to 2.6.27.
> 
>     The smp_ops.send_call_func_ipi interface is internal only and
>     has two arch provided functions:
> 
> 	arch/x86/kernel/smp.c:  .send_call_func_ipi = native_send_call_func_ipi
> 	arch/x86/xen/smp.c:     .send_call_func_ipi = xen_smp_send_call_function_ipi
> 	arch/x86/mach-voyager/voyager_smp.c:    (uses native_send_call_func_ipi)
> 
>     Therefore modifying the internal interface to use a cpumask_t pointer
>     is straight-forward.
> 
>     The changes to genapic are much more extensive and are affected by the
>     recent additions of the x2apic modes, so they will be done for 2.6.28 only.
> 
> Based on 2.6.27-rc5-git6.
> 
> Applies to linux-2.6.tip/master (with FUZZ).

applied to tip/cpus4096, thanks Mike.

I'm still wondering whether we should get rid of non-reference based 
cpumask_t altogether ...

Did you have a chance to look at the ftrace/stacktrace tracer in latest 
tip/master, which will show the maximum stack footprint that can occur?

Also, i've applied the patch below as well to restore MAXSMP in a muted 
form - with big warning signs added as well.

	Ingo

-------------->
From 363a5e3d7b4b69371f21bcafd7fc76e68c73733a Mon Sep 17 00:00:00 2001
From: Ingo Molnar <mingo@elte.hu>
Date: Sat, 6 Sep 2008 15:24:52 +0200
Subject: [PATCH] x86: add MAXSMP

restore MAXSMP, it's a nice debugging helper to trigger various crashes
and problems with maximum sized x86 systems.

Make it depend on EXPERIMENTAL and DEBUG_KERNEL, and inform the user
about the effects (stacksize, overhead, memory usage) of this flag.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86/Kconfig |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ed97f2b..91212c1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -580,10 +580,15 @@ config IOMMU_HELPER
 
 config MAXSMP
 	bool "Configure Maximum number of SMP Processors and NUMA Nodes"
-	depends on X86_64 && SMP && BROKEN
-	default n
+	depends on X86_64 && SMP && DEBUG_KERNEL && EXPERIMENTAL
 	help
-	  Configure maximum number of CPUS and NUMA Nodes for this architecture.
+	  Configure maximum number of CPUS and NUMA Nodes for this
+	  architecture (up to 4096!).
+
+	  This can increase memory usage, bigger stack footprint and can
+	  add some runtime overhead as well so unless you want a generic
+	  distro kernel you likely want to say N.
+
 	  If unsure, say N.
 
 config NR_CPUS


* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-06 13:29 ` [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Ingo Molnar
@ 2008-09-06 18:12   ` Mike Travis
  2008-09-06 18:21     ` Ingo Molnar
  2008-09-08 10:30     ` Nick Piggin
  2008-09-08  9:48   ` Jes Sorensen
  1 sibling, 2 replies; 14+ messages in thread
From: Mike Travis @ 2008-09-06 18:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Jack Steiner, Jes Sorensen, David Miller,
	Thomas Gleixner, linux-kernel

Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
> 
>>   * Cleanup cpumask_t usages in smp_call_function_mask function chain
>>     to prevent stack overflow problem when NR_CPUS=4096.
>>
>>   * Reduce the number of passed cpumask_t variables in the following
>>     call chain for x86_64:
>>
>> 	smp_call_function_mask -->
>> 	    arch_send_call_function_ipi->
>> 		    smp_ops.send_call_func_ipi -->
>> 			    genapic->send_IPI_mask
>>
>>     Since the smp_call_function_mask() is an EXPORTED function, we
>>     cannot change its calling interface for a patch to 2.6.27.
>>
>>     The smp_ops.send_call_func_ipi interface is internal only and
>>     has two arch provided functions:
>>
>> 	arch/x86/kernel/smp.c:  .send_call_func_ipi = native_send_call_func_ipi
>> 	arch/x86/xen/smp.c:     .send_call_func_ipi = xen_smp_send_call_function_ipi
>> 	arch/x86/mach-voyager/voyager_smp.c:    (uses native_send_call_func_ipi)
>>
>>     Therefore modifying the internal interface to use a cpumask_t pointer
>>     is straight-forward.
>>
>>     The changes to genapic are much more extensive and are affected by the
>>     recent additions of the x2apic modes, so they will be done for 2.6.28 only.
>>
>> Based on 2.6.27-rc5-git6.
>>
>> Applies to linux-2.6.tip/master (with FUZZ).
> 
> applied to tip/cpus4096, thanks Mike.

Thanks Ingo!  Could you send me the git id for the merge?

> 
> I'm still wondering whether we should get rid of non-reference based 
> cpumask_t altogether ...

I've got a whole slew of "get-ready-to-remove-cpumask_t's" coming soon.
There are two phases, one completely within the x86 arch and the 2nd hits
the generic smp_call_function_mask ABI (won't be doable as a back-ported
patch to 2.6.27.)

> 
> Did you have a chance to look at the ftrace/stacktrace tracer in latest 
> tip/master, which will show the maximum stack footprint that can occur?

Hmm, no.  I'm using a default config right now as I can boot that pretty
easily.  I'll turn on the ftrace thing and check it out.

> 
> Also, i've applied the patch below as well to restore MAXSMP in a muted 
> form - with big warning signs added as well.

The main thing is to allow the distros to set it manually for their QA
testing of 2.6.27.  I'm sure I'll get back bugs because of just that.

(Is there a way to have them know to assign bugzilla's to me if NR_CPUS=4k
is the root of the problem?  This is an extremely serious issue for SGI
and I'd like to avoid any delays in me finding out about problems.)

Thanks again,
Mike

> 
> 	Ingo
> 
> -------------->
> From 363a5e3d7b4b69371f21bcafd7fc76e68c73733a Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@elte.hu>
> Date: Sat, 6 Sep 2008 15:24:52 +0200
> Subject: [PATCH] x86: add MAXSMP
> 
> restore MAXSMP, it's a nice debugging helper to trigger various crashes
> and problems with maximum sized x86 systems.
> 
> Make it depend on EXPERIMENTAL and DEBUG_KERNEL, and inform the user
> about the effects (stacksize, overhead, memory usage) of this flag.
> 
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  arch/x86/Kconfig |   11 ++++++++---
>  1 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index ed97f2b..91212c1 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -580,10 +580,15 @@ config IOMMU_HELPER
>  
>  config MAXSMP
>  	bool "Configure Maximum number of SMP Processors and NUMA Nodes"
> -	depends on X86_64 && SMP && BROKEN
> -	default n
> +	depends on X86_64 && SMP && DEBUG_KERNEL && EXPERIMENTAL
>  	help
> -	  Configure maximum number of CPUS and NUMA Nodes for this architecture.
> +	  Configure maximum number of CPUS and NUMA Nodes for this
> +	  architecture (up to 4096!).
> +
> +	  This can increase memory usage, bigger stack footprint and can
> +	  add some runtime overhead as well so unless you want a generic
> +	  distro kernel you likely want to say N.
> +
>  	  If unsure, say N.
>  
>  config NR_CPUS



* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-06 18:12   ` Mike Travis
@ 2008-09-06 18:21     ` Ingo Molnar
  2008-09-08 10:30     ` Nick Piggin
  1 sibling, 0 replies; 14+ messages in thread
From: Ingo Molnar @ 2008-09-06 18:21 UTC (permalink / raw)
  To: Mike Travis
  Cc: Andrew Morton, Jack Steiner, Jes Sorensen, David Miller,
	Thomas Gleixner, linux-kernel


* Mike Travis <travis@sgi.com> wrote:

> Ingo Molnar wrote:
> > * Mike Travis <travis@sgi.com> wrote:
> > 
> >>   * Cleanup cpumask_t usages in smp_call_function_mask function chain
> >>     to prevent stack overflow problem when NR_CPUS=4096.
> >>
> >>   * Reduce the number of passed cpumask_t variables in the following
> >>     call chain for x86_64:
> >>
> >> 	smp_call_function_mask -->
> >> 	    arch_send_call_function_ipi->
> >> 		    smp_ops.send_call_func_ipi -->
> >> 			    genapic->send_IPI_mask
> >>
> >>     Since the smp_call_function_mask() is an EXPORTED function, we
> >>     cannot change its calling interface for a patch to 2.6.27.
> >>
> >>     The smp_ops.send_call_func_ipi interface is internal only and
> >>     has two arch provided functions:
> >>
> >> 	arch/x86/kernel/smp.c:  .send_call_func_ipi = native_send_call_func_ipi
> >> 	arch/x86/xen/smp.c:     .send_call_func_ipi = xen_smp_send_call_function_ipi
> >> 	arch/x86/mach-voyager/voyager_smp.c:    (uses native_send_call_func_ipi)
> >>
> >>     Therefore modifying the internal interface to use a cpumask_t pointer
> >>     is straight-forward.
> >>
> >>     The changes to genapic are much more extensive and are affected by the
> >>     recent additions of the x2apic modes, so they will be done for 2.6.28 only.
> >>
> >> Based on 2.6.27-rc5-git6.
> >>
> >> Applies to linux-2.6.tip/master (with FUZZ).
> > 
> > applied to tip/cpus4096, thanks Mike.
> 
> Thanks Ingo!  Could you send me the git id for the merge?

the commits are:

363a5e3: x86: add MAXSMP
01f569c: x86: restore 4096 limit for NR_CPUS
ae74da3: x86: reduce stack requirements for send_call_func_ipi
562d8c2: smp: reduce stack requirements for smp_call_function_mask

the merge into tip/master is:

| commit 7f5d26f9425851e20ca9774acbd13d0e3b96d9dd
| Merge: da5e209... 363a5e3...
| Author: Ingo Molnar <mingo@elte.hu>
| Date:   Sat Sep 6 15:29:18 2008 +0200
|
|     Merge branch 'cpus4096'

That merge commit will go away on the next integration run though.

your changes seem to be largely problem-free so far - with two dozen 
MAXSMP=y random bootups already.

> > I'm still wondering whether we should get rid of non-reference based 
> > cpumask_t altogether ...
> 
> I've got a whole slew of "get-ready-to-remove-cpumask_t's" coming 
> soon. There are two phases, one completely within the x86 arch and the 
> 2nd hits the generic smp_call_function_mask ABI (won't be doable as a 
> back-ported patch to 2.6.27.)

ok. None of this can go into v2.6.27 obviously - the stack corruptions 
were rather nasty. But it's looking good for v2.6.28 - especially if you 
are removing cpumask_t.

> > Did you have a chance to look at the ftrace/stacktrace tracer in 
> > latest tip/master, which will show the maximum stack footprint that 
> > can occur?
> 
> Hmm, no.  I'm using a default config right now as I can boot that 
> pretty easily.  I'll turn on the ftrace thing and check it out.

it's CONFIG_STACK_TRACER=y and rather nifty.

> > Also, i've applied the patch below as well to restore MAXSMP in a 
> > muted form - with big warning signs added as well.
> 
> The main thing is to allow the distros to set it manually for their QA 
> testing of 2.6.27.  I'm sure I'll get back bugs because of just that.
> 
> (Is there a way to have them know to assign bugzilla's to me if 
> NR_CPUS=4k is the root of the problem?  This is an extremely serious 
> issue for SGI and I'd like to avoid any delays in me finding out about 
> problems.)

i don't think there's any easy mapping.

	Ingo


* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-06 13:29 ` [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Ingo Molnar
  2008-09-06 18:12   ` Mike Travis
@ 2008-09-08  9:48   ` Jes Sorensen
  2008-09-08 15:41     ` Mike Travis
  1 sibling, 1 reply; 14+ messages in thread
From: Jes Sorensen @ 2008-09-08  9:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mike Travis, Andrew Morton, Jack Steiner, David Miller,
	Thomas Gleixner, linux-kernel

Ingo Molnar wrote:
>> Applies to linux-2.6.tip/master (with FUZZ).
> 
> applied to tip/cpus4096, thanks Mike.
> 
> I'm still wondering whether we should get rid of non-reference based 
> cpumask_t altogether ...

Cool,

I think we should, it's like a ticking bomb waiting to explode on us
eventually. IMHO it was a big mistake to allow cpumask_t being passed
by value in the first place.

Cheers,
Jes


* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-06 18:12   ` Mike Travis
  2008-09-06 18:21     ` Ingo Molnar
@ 2008-09-08 10:30     ` Nick Piggin
  2008-09-08 15:47       ` Mike Travis
  2008-09-08 19:51       ` David Miller
  1 sibling, 2 replies; 14+ messages in thread
From: Nick Piggin @ 2008-09-08 10:30 UTC (permalink / raw)
  To: Mike Travis
  Cc: Ingo Molnar, Andrew Morton, Jack Steiner, Jes Sorensen,
	David Miller, Thomas Gleixner, linux-kernel

On Sunday 07 September 2008 04:12, Mike Travis wrote:
> Ingo Molnar wrote:
> > * Mike Travis <travis@sgi.com> wrote:
> >>   * Cleanup cpumask_t usages in smp_call_function_mask function chain
> >>     to prevent stack overflow problem when NR_CPUS=4096.
> >>
> >>   * Reduce the number of passed cpumask_t variables in the following
> >>     call chain for x86_64:
> >>
> >> 	smp_call_function_mask -->
> >> 	    arch_send_call_function_ipi->
> >> 		    smp_ops.send_call_func_ipi -->
> >> 			    genapic->send_IPI_mask
> >>
> >>     Since the smp_call_function_mask() is an EXPORTED function, we
> >>     cannot change its calling interface for a patch to 2.6.27.
> >>
> >>     The smp_ops.send_call_func_ipi interface is internal only and
> >>     has two arch provided functions:
> >>
> >> 	arch/x86/kernel/smp.c:  .send_call_func_ipi = native_send_call_func_ipi
> >> 	arch/x86/xen/smp.c:     .send_call_func_ipi = xen_smp_send_call_function_ipi
> >> 	arch/x86/mach-voyager/voyager_smp.c:    (uses native_send_call_func_ipi)
> >>
> >>     Therefore modifying the internal interface to use a cpumask_t pointer
> >>     is straight-forward.
> >>
> >>     The changes to genapic are much more extensive and are affected by the
> >>     recent additions of the x2apic modes, so they will be done for 2.6.28 only.
> >>
> >> Based on 2.6.27-rc5-git6.
> >>
> >> Applies to linux-2.6.tip/master (with FUZZ).
> >
> > applied to tip/cpus4096, thanks Mike.
>
> Thanks Ingo!  Could you send me the git id for the merge?
>
> > I'm still wondering whether we should get rid of non-reference based
> > cpumask_t altogether ...
>
> I've got a whole slew of "get-ready-to-remove-cpumask_t's" coming soon.
> There are two phases, one completely within the x86 arch and the 2nd hits
> the generic smp_call_function_mask ABI (won't be doable as a back-ported
> patch to 2.6.27.)
>
> > Did you have a chance to look at the ftrace/stacktrace tracer in latest
> > tip/master, which will show the maximum stack footprint that can occur?
>
> Hmm, no.  I'm using a default config right now as I can boot that pretty
> easily.  I'll turn on the ftrace thing and check it out.
>
> > Also, i've applied the patch below as well to restore MAXSMP in a muted
> > form - with big warning signs added as well.
>
> The main thing is to allow the distros to set it manually for their QA
> testing of 2.6.27.  I'm sure I'll get back bugs because of just that.
>
> (Is there a way to have them know to assign bugzilla's to me if NR_CPUS=4k
> is the root of the problem?  This is an extremely serious issue for SGI
> and I'd like to avoid any delays in me finding out about problems.)

Considering that, unless I'm mistaken, you want to run production systems
with 4096 CPUs at some point, then I would say you should really consider
increasing NR_CPUS _further_ than that in QA efforts, so that we might be
a bit more confident of running production kernels with 4096.

Is that being tried? Setting it to 8192 or even higher during QA seems
like a good idea to me.


* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-08  9:48   ` Jes Sorensen
@ 2008-09-08 15:41     ` Mike Travis
  0 siblings, 0 replies; 14+ messages in thread
From: Mike Travis @ 2008-09-08 15:41 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Ingo Molnar, Andrew Morton, Jack Steiner, David Miller,
	Thomas Gleixner, linux-kernel

Jes Sorensen wrote:
> Ingo Molnar wrote:
>>> Applies to linux-2.6.tip/master (with FUZZ).
>>
>> applied to tip/cpus4096, thanks Mike.
>>
>> I'm still wondering whether we should get rid of non-reference based
>> cpumask_t altogether ...
> 
> Cool,
> 
> I think we should, it's like a ticking bomb waiting to explode on us
> eventually. IMHO it was a big mistake to allow cpumask_t being passed
> by value in the first place.
> 
> Cheers,
> Jes

Linus's idea of defining cpumask_t to be a simple long[1] or a pointer to
a cpumask is a good one.  Unfortunately, the amount (and breadth) of the
code changes required is daunting, to say the least.  In my source tree
there are 892 references to cpumask_t.

But I'll start looking into it asap.  I don't know however if "NR_CPUS >
BITS_PER_LONG" is the correct metric to decide when to use pointers.  There
must be a better "pain" indicator... ;-)
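
For what it's worth, the "long[1] or a pointer" idea could look roughly
like the userspace sketch below.  Everything here is hypothetical (the
sk_* names, the alloc/free helpers), and it uses the NR_CPUS >
BITS_PER_LONG threshold purely as a placeholder for whatever metric ends
up being chosen.  The trick is that an array of one long decays to a
pointer when passed to a function, so callers look the same in both
configurations:

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Illustrative stand-ins only; these are not the kernel's definitions. */
#define SK_NR_CPUS	4096
#if ULONG_MAX > 0xffffffffUL
#define SK_BITS_PER_LONG	64
#else
#define SK_BITS_PER_LONG	32
#endif
#define SK_LONGS ((SK_NR_CPUS + SK_BITS_PER_LONG - 1) / SK_BITS_PER_LONG)

#if SK_NR_CPUS <= SK_BITS_PER_LONG
/* Small case: the mask is a single long living wherever it is declared. */
typedef unsigned long sk_cpumask_t[1];
static int sk_alloc(sk_cpumask_t *m) { (*m)[0] = 0; return 0; }
static void sk_free(sk_cpumask_t *m) { (void)m; }
#else
/* Large case: the variable is only a pointer; storage is off-stack. */
typedef unsigned long *sk_cpumask_t;
static int sk_alloc(sk_cpumask_t *m)
{
	*m = calloc(SK_LONGS, sizeof(unsigned long));
	return *m ? 0 : -1;
}
static void sk_free(sk_cpumask_t *m) { free(*m); *m = NULL; }
#endif

/* Accessors take a plain pointer, which both typedefs convert to. */
static void sk_set_cpu(int cpu, unsigned long *bits)
{
	assert(cpu >= 0 && cpu < SK_NR_CPUS);
	bits[cpu / SK_BITS_PER_LONG] |= 1UL << (cpu % SK_BITS_PER_LONG);
}

static int sk_test_cpu(int cpu, const unsigned long *bits)
{
	assert(cpu >= 0 && cpu < SK_NR_CPUS);
	return (bits[cpu / SK_BITS_PER_LONG] >> (cpu % SK_BITS_PER_LONG)) & 1UL;
}
```

The cost of this shape is that every user has to pair an alloc with a
free and handle allocation failure, which is part of why the change
touches so much code.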

Thanks,
Mike


* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-08 10:30     ` Nick Piggin
@ 2008-09-08 15:47       ` Mike Travis
  2008-09-08 19:51       ` David Miller
  1 sibling, 0 replies; 14+ messages in thread
From: Mike Travis @ 2008-09-08 15:47 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ingo Molnar, Andrew Morton, Jack Steiner, Jes Sorensen,
	David Miller, Thomas Gleixner, linux-kernel

Nick Piggin wrote:
> On Sunday 07 September 2008 04:12, Mike Travis wrote:
>> Ingo Molnar wrote:
>>> * Mike Travis <travis@sgi.com> wrote:
>>>>   * Cleanup cpumask_t usages in smp_call_function_mask function chain
>>>>     to prevent stack overflow problem when NR_CPUS=4096.
>>>>
>>>>   * Reduce the number of passed cpumask_t variables in the following
>>>>     call chain for x86_64:
>>>>
>>>> 	smp_call_function_mask -->
>>>> 	    arch_send_call_function_ipi->
>>>> 		    smp_ops.send_call_func_ipi -->
>>>> 			    genapic->send_IPI_mask
>>>>
>>>>     Since the smp_call_function_mask() is an EXPORTED function, we
>>>>     cannot change its calling interface for a patch to 2.6.27.
>>>>
>>>>     The smp_ops.send_call_func_ipi interface is internal only and
>>>>     has two arch provided functions:
>>>>
>>>> 	arch/x86/kernel/smp.c:  .send_call_func_ipi = native_send_call_func_ipi
>>>> 	arch/x86/xen/smp.c:     .send_call_func_ipi = xen_smp_send_call_function_ipi
>>>> 	arch/x86/mach-voyager/voyager_smp.c:    (uses native_send_call_func_ipi)
>>>>
>>>>     Therefore modifying the internal interface to use a cpumask_t pointer
>>>>     is straight-forward.
>>>>
>>>>     The changes to genapic are much more extensive and are affected by the
>>>>     recent additions of the x2apic modes, so they will be done for 2.6.28 only.
>>>>
>>>> Based on 2.6.27-rc5-git6.
>>>>
>>>> Applies to linux-2.6.tip/master (with FUZZ).
>>> applied to tip/cpus4096, thanks Mike.
>> Thanks Ingo!  Could you send me the git id for the merge?
>>
>>> I'm still wondering whether we should get rid of non-reference based
>>> cpumask_t altogether ...
>> I've got a whole slew of "get-ready-to-remove-cpumask_t's" coming soon.
>> There are two phases, one completely within the x86 arch and the 2nd hits
>> the generic smp_call_function_mask ABI (won't be doable as a back-ported
>> patch to 2.6.27.)
>>
>>> Did you have a chance to look at the ftrace/stacktrace tracer in latest
>>> tip/master, which will show the maximum stack footprint that can occur?
>> Hmm, no.  I'm using a default config right now as I can boot that pretty
>> easily.  I'll turn on the ftrace thing and check it out.
>>
>>> Also, i've applied the patch below as well to restore MAXSMP in a muted
>>> form - with big warning signs added as well.
>> The main thing is to allow the distros to set it manually for their QA
>> testing of 2.6.27.  I'm sure I'll get back bugs because of just that.
>>
>> (Is there a way to have them know to assign bugzilla's to me if NR_CPUS=4k
>> is the root of the problem?  This is an extremely serious issue for SGI
>> and I'd like to avoid any delays in me finding out about problems.)
> 
> Considering that, unless I'm mistaken, you want to run production systems
> with 4096 CPUs at some point, then I would say you should really consider
> increasing NR_CPUS _further_ than that in QA efforts, so that we might be
> a bit more confident of running production kernels with 4096.
> 
> Is that being tried? Setting it to 8192 or even higher during QA seems
> like a good idea to me.


That's a good idea.  I do occasionally set it to 16k (and 64k) for experimental
reasons (and to really highlight where cpumask_t space hogs reside), but I
hadn't thought to do it in the QA environment.

Thanks,
Mike


* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-08 10:30     ` Nick Piggin
  2008-09-08 15:47       ` Mike Travis
@ 2008-09-08 19:51       ` David Miller
  2008-09-08 20:11         ` Mike Travis
  1 sibling, 1 reply; 14+ messages in thread
From: David Miller @ 2008-09-08 19:51 UTC (permalink / raw)
  To: nickpiggin; +Cc: travis, mingo, akpm, steiner, jes, tglx, linux-kernel

From: Nick Piggin <nickpiggin@yahoo.com.au>
Date: Mon, 8 Sep 2008 20:30:41 +1000

> Is that being tried? Setting it to 8192 or even higher during QA seems
> like a good idea to me.

This is a great idea, especially since it will make it even more
painfully obvious that essentially any function-local cpumask_t
variable is a bug.
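
A rough sketch of the arithmetic behind this point, assuming cpumask_t is a fixed-size bitmap with one bit per possible CPU, rounded up to whole longs. The helper name cpumask_bytes() and the BITS_PER_LONG definition below are illustrative and written for userspace; they are not taken from the patches in this thread.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bits in one unsigned long on the build host (64 on x86_64). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: bytes taken by one by-value CPU mask when the
 * kernel is built for nr_cpus CPUs, i.e. a bitmap rounded up to longs.
 */
static size_t cpumask_bytes(size_t nr_cpus)
{
	size_t longs;

	assert(nr_cpus > 0);	/* a mask for zero CPUs is meaningless */
	longs = (nr_cpus + BITS_PER_LONG - 1) / BITS_PER_LONG;
	return longs * sizeof(unsigned long);
}
```

Under that assumption a single local mask costs 512 bytes at NR_CPUS=4096, 1 KB at 8192, 2 KB at 16k and 8 KB at 64k, and every additional by-value copy in a call chain adds the same amount again.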

Really, it seems sensible to do something like:

1) Make cpumask_t a pointer.

2) Add cpumask_data_t which is what cpumask_t is now.  This gets
   used for the actual storage, and will only get applied to
   data structures that are dynamically allocated.  For example, for
   the cpu_vm_mask in mm_struct.

3) Type make and fix build failures until they are all gone.
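
The three steps above could look roughly like the following userspace sketch. Only cpumask_t, cpumask_data_t, cpu_vm_mask and NR_CPUS come from the thread; struct mm_sketch, mm_sketch_alloc(), mask_set_cpu() and mask_test_cpu() are hypothetical names made up for illustration, and this is not the code that was eventually merged.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define NR_CPUS 4096
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Step 2: the storage type, i.e. what cpumask_t is today. */
typedef struct {
	unsigned long bits[MASK_LONGS];
} cpumask_data_t;

/* Step 1: cpumask_t itself becomes a pointer to that storage. */
typedef cpumask_data_t *cpumask_t;

/*
 * Hypothetical stand-in for a dynamically allocated structure that
 * embeds the storage, in the way mm_struct holds cpu_vm_mask.
 */
struct mm_sketch {
	cpumask_data_t cpu_vm_mask;
};

/* Heap allocation keeps the 512-byte bitmap off the stack. */
static struct mm_sketch *mm_sketch_alloc(void)
{
	return calloc(1, sizeof(struct mm_sketch));
}

/* Callers now pass a pointer-sized cpumask_t, not a bitmap copy. */
static void mask_set_cpu(unsigned int cpu, cpumask_t mask)
{
	assert(cpu < NR_CPUS);
	mask->bits[cpu / BITS_PER_LONG] |= 1UL << (cpu % BITS_PER_LONG);
}

static int mask_test_cpu(unsigned int cpu, cpumask_t mask)
{
	assert(cpu < NR_CPUS);
	return (mask->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL;
}
```

With these typedefs, a leftover function-local "cpumask_t tmp;" is only an uninitialized pointer, so code that still treats it as storage stops compiling or becomes easy to spot, which is what step 3 relies on.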


* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-08 19:51       ` David Miller
@ 2008-09-08 20:11         ` Mike Travis
  2008-09-08 20:48           ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Mike Travis @ 2008-09-08 20:11 UTC (permalink / raw)
  To: David Miller; +Cc: nickpiggin, mingo, akpm, steiner, jes, tglx, linux-kernel

David Miller wrote:
> From: Nick Piggin <nickpiggin@yahoo.com.au>
> Date: Mon, 8 Sep 2008 20:30:41 +1000
> 
>> Is that being tried? Setting it to 8192 or even higher during QA seems
>> like a good idea to me.
> 
> This is a great idea, especially since it will make it even more
> painfully obvious that essentially any function-local cpumask_t
> variable is a bug.

Yes, that's what I have done in the past ... but putting it into the QA
testing would really trigger those stack overflow problems... ;-)

> 
> Really, it seems sensible to do something like:
> 
> 1) Make cpumask_t a pointer.
> 
> 2) Add cpumask_data_t which is what cpumask_t is now.  This gets
>    used for the actual storage, and will only get applied to
>    data structures that are dynamically allocated.  For example, for
>    the cpu_vm_mask in mm_struct.
> 
> 3) Type make and fix build failures until they are all gone.

I was wondering if we'd need to be able to default a cpumask_t pointer
argument to be a const and then use a different method for those cases
where it shouldn't be?  This would strengthen the compiler type checking
of function calls.

For example:

	proto(cpumask_t mask)

would imply that *mask is a const, whereas

	proto(cpumask_var mask)

would indicate it to be non-const?

But then we couldn't use "cpumask_t" as a local declarator... So perhaps
we need something completely different for declaring cpumask arguments?
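
One way to picture the const/non-const split described here, as a userspace sketch. The typedef names cpumask_t and cpumask_var follow the wording above and cpumask_data_t follows the earlier proposal in the thread; mask_weight() and mask_or() are hypothetical helpers, not existing kernel functions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define NR_CPUS 4096
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Storage type for the bitmap itself. */
typedef struct {
	unsigned long bits[MASK_LONGS];
} cpumask_data_t;

/* Read-only argument: the pointed-to mask may not be modified. */
typedef const cpumask_data_t *cpumask_t;

/* Writable argument: the callee is allowed to change the mask. */
typedef cpumask_data_t *cpumask_var;

/* Reader: takes cpumask_t, so a store through 'mask' will not compile. */
static unsigned int mask_weight(cpumask_t mask)
{
	unsigned int count = 0;

	for (size_t i = 0; i < MASK_LONGS; i++) {
		unsigned long w = mask->bits[i];

		while (w) {		/* clear the lowest set bit each pass */
			w &= w - 1;
			count++;
		}
	}
	return count;
}

/* Writer: the destination is cpumask_var, the sources stay const. */
static void mask_or(cpumask_var dst, cpumask_t a, cpumask_t b)
{
	assert(dst != NULL && a != NULL && b != NULL);
	for (size_t i = 0; i < MASK_LONGS; i++)
		dst->bits[i] = a->bits[i] | b->bits[i];
}
```

The prototype alone then tells the compiler and the reader which arguments a function may modify. It also shows the problem raised above: with these typedefs neither cpumask_t nor cpumask_var declares storage, so locals and struct members need the separate data type.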

(I'm trying to figure out how to structure this with the least amount of
source editing.)

Thanks!
Mike




* Re: [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask
  2008-09-08 20:11         ` Mike Travis
@ 2008-09-08 20:48           ` David Miller
  0 siblings, 0 replies; 14+ messages in thread
From: David Miller @ 2008-09-08 20:48 UTC (permalink / raw)
  To: travis; +Cc: nickpiggin, mingo, akpm, steiner, jes, tglx, linux-kernel

From: Mike Travis <travis@sgi.com>
Date: Mon, 08 Sep 2008 13:11:59 -0700

> I was wondering if we'd need to be able to default a cpumask_t pointer
> argument to be a const and then use a different method for those cases
> where it shouldn't be?  This would strengthen the compiler type checking
> of function calls.

Yes, of course, the pointer should be const.


end of thread, other threads:[~2008-09-08 20:48 UTC | newest]

Thread overview: 14+ messages
2008-09-05 21:40 [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Mike Travis
2008-09-05 21:40 ` [PATCH 1/3] " Mike Travis
2008-09-05 21:40 ` [PATCH 2/3] x86: reduce stack requirements for send_call_func_ipi Mike Travis
2008-09-05 21:40 ` [PATCH 3/3] x86: restore 4096 limit for NR_CPUS Mike Travis
2008-09-06 13:29 ` [PATCH 0/3] smp: reduce stack requirements for smp_call_function_mask Ingo Molnar
2008-09-06 18:12   ` Mike Travis
2008-09-06 18:21     ` Ingo Molnar
2008-09-08 10:30     ` Nick Piggin
2008-09-08 15:47       ` Mike Travis
2008-09-08 19:51       ` David Miller
2008-09-08 20:11         ` Mike Travis
2008-09-08 20:48           ` David Miller
2008-09-08  9:48   ` Jes Sorensen
2008-09-08 15:41     ` Mike Travis
