* [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays
@ 2009-01-10 22:38 Mike Travis
2009-01-10 22:38 ` [PATCH 1/4] irq: change references from NR_IRQS to nr_irqs Mike Travis
` (4 more replies)
0 siblings, 5 replies; 28+ messages in thread
From: Mike Travis @ 2009-01-10 22:38 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
The following patches change irq_desc and kstat_irq_legacy into
variable sized arrays based on nr_cpu_ids when CONFIG_SPARSE_IRQS=y.
irq: change references from NR_IRQS to nr_irqs
irq: allocate irq_desc_ptrs array based on nr_irqs
irq: initialize nr_irqs based on nr_cpu_ids
kstat: modify kstat_irqs_legacy to be variable sized
Based on: tip/cpus4096 @ v2.6.28-6140-g36c401a
(Ingo - I will push these to your tip/cpus4096 branch via my cpus4096-for-ingo
git tree.)
Signed-off-by: Mike Travis <travis@sgi.com>
---
Affects of the SPARSE changes on NR_CPUS values.
1 - 128-defconfig (non-SPARSE)
2 - 4k-defconfig (non-SPARSE)
3 - 4k-defconfig (SPARSE)
====== Data
.1. .2. .3. ..final..
1114112 . -1114112 . -100% irq_desc(.data.cacheline_aligned)
208896 -69632 -138752 512 -99% irq_cfgx(.data)
34816 . -34816 . -100% irq_timer_state(.bss)
17480 . -17480 . -100% per_cpu__kstat(.data.percpu)
0 . +4096 4096 . irq_desc_legacy(.data.cacheline_aligned)
====== Sections
.1. .2. .3. ..final..
1140032 +459264 -1110016 489280 -57% .data.cacheline_aligned
34408 +6768 +16 41192 +19% .data.read_mostly
====== PerCPU ()
.1. .2. .3. ..final..
18432 -2048 -16384 . -100% kstat
10240 . -2048 8192 -20% init_tss
====== MemInfo ()
Static memory available at boot time:
.1. .2. .3. ..final..
8069795840 -10207232 +10575872 8070164480(+368640) MemFree
8263630848 -9310208 +10260480 8264581120(+950272) MemTotal
--
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 1/4] irq: change references from NR_IRQS to nr_irqs
2009-01-10 22:38 [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Mike Travis
@ 2009-01-10 22:38 ` Mike Travis
2009-01-10 22:45 ` Ingo Molnar
2009-01-10 22:38 ` [PATCH 2/4] irq: allocate irq_desc_ptrs array based on nr_irqs Mike Travis
` (3 subsequent siblings)
4 siblings, 1 reply; 28+ messages in thread
From: Mike Travis @ 2009-01-10 22:38 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
[-- Attachment #1: irq-prep --]
[-- Type: text/plain, Size: 1913 bytes --]
Impact: preparation, cleanup
Modify references from NR_IRQS to nr_irqs as the later will become
variable-sized based on nr_cpu_ids when CONFIG_SPARSE_IRQS=y.
Signed-off-by: Mike Travis <travis@sgi.com>
---
arch/x86/kernel/io_apic.c | 2 +-
kernel/irq/handle.c | 12 +++++++-----
2 files changed, 8 insertions(+), 6 deletions(-)
--- linux-2.6-for-ingo.orig/arch/x86/kernel/io_apic.c
+++ linux-2.6-for-ingo/arch/x86/kernel/io_apic.c
@@ -3185,7 +3185,7 @@ unsigned int create_irq_nr(unsigned int
irq = 0;
spin_lock_irqsave(&vector_lock, flags);
- for (new = irq_want; new < NR_IRQS; new++) {
+ for (new = irq_want; new < nr_irqs; new++) {
if (platform_legacy_irq(new))
continue;
--- linux-2.6-for-ingo.orig/kernel/irq/handle.c
+++ linux-2.6-for-ingo/kernel/irq/handle.c
@@ -132,6 +132,8 @@ int __init early_irq_init(void)
int legacy_count;
int i;
+ printk(KERN_INFO "NR_IRQS:%d nr_irqs:%d\n", NR_IRQS, nr_irqs);
+
desc = irq_desc_legacy;
legacy_count = ARRAY_SIZE(irq_desc_legacy);
@@ -143,7 +145,7 @@ int __init early_irq_init(void)
irq_desc_ptrs[i] = desc + i;
}
- for (i = legacy_count; i < NR_IRQS; i++)
+ for (i = legacy_count; i < nr_irqs; i++)
irq_desc_ptrs[i] = NULL;
return arch_early_irq_init();
@@ -151,7 +153,7 @@ int __init early_irq_init(void)
struct irq_desc *irq_to_desc(unsigned int irq)
{
- return (irq < NR_IRQS) ? irq_desc_ptrs[irq] : NULL;
+ return (irq < nr_irqs) ? irq_desc_ptrs[irq] : NULL;
}
struct irq_desc *irq_to_desc_alloc_cpu(unsigned int irq, int cpu)
@@ -160,9 +162,9 @@ struct irq_desc *irq_to_desc_alloc_cpu(u
unsigned long flags;
int node;
- if (irq >= NR_IRQS) {
- printk(KERN_WARNING "irq >= NR_IRQS in irq_to_desc_alloc: %d %d\n",
- irq, NR_IRQS);
+ if (irq >= nr_irqs) {
+ printk(KERN_WARNING "irq >= nr_irqs in irq_to_desc_alloc: %d %d\n",
+ irq, nr_irqs);
WARN_ON(1);
return NULL;
}
--
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 2/4] irq: allocate irq_desc_ptrs array based on nr_irqs
2009-01-10 22:38 [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Mike Travis
2009-01-10 22:38 ` [PATCH 1/4] irq: change references from NR_IRQS to nr_irqs Mike Travis
@ 2009-01-10 22:38 ` Mike Travis
2009-01-10 22:47 ` Ingo Molnar
2009-01-10 22:38 ` [PATCH 3/4] irq: initialize nr_irqs based on nr_cpu_ids Mike Travis
` (2 subsequent siblings)
4 siblings, 1 reply; 28+ messages in thread
From: Mike Travis @ 2009-01-10 22:38 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
[-- Attachment #1: irq-alloc-irq_ptrs --]
[-- Type: text/plain, Size: 2351 bytes --]
Impact: preparation for making irq_desc_ptrs variable-sized.
This addresses this memory usage bump when NR_CPUS bumped from 128 to 4096:
34816 +229376 264192 +658% irq_desc_ptrs(.data.read_mostly)
The patch is split into two part, the first simply allocates the
irq_desc_ptrs array. Then next will deal with making it variable.
This is only when CONFIG_SPARSE_IRQS=y.
Signed-off-by: Mike Travis <travis@sgi.com>
---
kernel/irq/handle.c | 11 +++++++++--
kernel/irq/internals.h | 4 ++++
2 files changed, 13 insertions(+), 2 deletions(-)
--- linux-2.6-for-ingo.orig/kernel/irq/handle.c
+++ linux-2.6-for-ingo/kernel/irq/handle.c
@@ -17,6 +17,7 @@
#include <linux/kernel_stat.h>
#include <linux/rculist.h>
#include <linux/hash.h>
+#include <linux/bootmem.h>
#include "internals.h"
@@ -110,7 +111,7 @@ static void init_one_irq_desc(int irq, s
*/
DEFINE_SPINLOCK(sparse_irq_lock);
-struct irq_desc *irq_desc_ptrs[NR_IRQS] __read_mostly;
+struct irq_desc **irq_desc_ptrs __read_mostly;
static struct irq_desc irq_desc_legacy[NR_IRQS_LEGACY] __cacheline_aligned_in_smp = {
[0 ... NR_IRQS_LEGACY-1] = {
@@ -137,6 +138,9 @@ int __init early_irq_init(void)
desc = irq_desc_legacy;
legacy_count = ARRAY_SIZE(irq_desc_legacy);
+ /* allocate irq_desc_ptrs array based on nr_irqs */
+ irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
+
for (i = 0; i < legacy_count; i++) {
desc[i].irq = i;
desc[i].kstat_irqs = kstat_irqs_legacy[i];
@@ -153,7 +157,10 @@ int __init early_irq_init(void)
struct irq_desc *irq_to_desc(unsigned int irq)
{
- return (irq < nr_irqs) ? irq_desc_ptrs[irq] : NULL;
+ if (irq_desc_ptrs && irq < nr_irqs)
+ return irq_desc_ptrs[irq];
+
+ return NULL;
}
struct irq_desc *irq_to_desc_alloc_cpu(unsigned int irq, int cpu)
--- linux-2.6-for-ingo.orig/kernel/irq/internals.h
+++ linux-2.6-for-ingo/kernel/irq/internals.h
@@ -16,7 +16,11 @@ extern int __irq_set_trigger(struct irq_
extern struct lock_class_key irq_desc_lock_class;
extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr);
extern spinlock_t sparse_irq_lock;
+#ifdef CONFIG_SPARSE_IRQ
+extern struct irq_desc **irq_desc_ptrs;
+#else
extern struct irq_desc *irq_desc_ptrs[NR_IRQS];
+#endif
#ifdef CONFIG_PROC_FS
extern void register_irq_proc(unsigned int irq, struct irq_desc *desc);
--
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 3/4] irq: initialize nr_irqs based on nr_cpu_ids
2009-01-10 22:38 [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Mike Travis
2009-01-10 22:38 ` [PATCH 1/4] irq: change references from NR_IRQS to nr_irqs Mike Travis
2009-01-10 22:38 ` [PATCH 2/4] irq: allocate irq_desc_ptrs array based on nr_irqs Mike Travis
@ 2009-01-10 22:38 ` Mike Travis
2009-01-10 22:50 ` Ingo Molnar
2009-01-10 22:38 ` [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized Mike Travis
2009-01-10 22:43 ` [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Ingo Molnar
4 siblings, 1 reply; 28+ messages in thread
From: Mike Travis @ 2009-01-10 22:38 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
[-- Attachment #1: irq-variable-nr_irqs --]
[-- Type: text/plain, Size: 950 bytes --]
Impact: Reduce memory usage.
This is the second half of the changes to make the irq_desc_ptrs be
variable sized based on nr_cpu_ids. The algorithm is the same as the
setting of NR_IRQS except use nr_cpu_ids instead of NR_CPUS. This is
only when CONFIG_SPARSE_IRQS=y.
Signed-off-by: Mike Travis <travis@sgi.com>
---
kernel/irq/handle.c | 5 +++++
1 file changed, 5 insertions(+)
Files linux-2.6-for-ingo.orig/kernel/irq/.handle.c.swp and linux-2.6-for-ingo/kernel/irq/.handle.c.swp differ
--- linux-2.6-for-ingo.orig/kernel/irq/handle.c
+++ linux-2.6-for-ingo/kernel/irq/handle.c
@@ -133,6 +133,11 @@ int __init early_irq_init(void)
int legacy_count;
int i;
+ /* initialize nr_irqs based on nr_cpu_ids */
+ nr_irqs = (8 * nr_cpu_ids) > (32 * MAX_IO_APICS) ?
+ NR_VECTORS + (8 * nr_cpu_ids) :
+ NR_VECTORS + (32 * MAX_IO_APICS);
+
printk(KERN_INFO "NR_IRQS:%d nr_irqs:%d\n", NR_IRQS, nr_irqs);
desc = irq_desc_legacy;
--
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-10 22:38 [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Mike Travis
` (2 preceding siblings ...)
2009-01-10 22:38 ` [PATCH 3/4] irq: initialize nr_irqs based on nr_cpu_ids Mike Travis
@ 2009-01-10 22:38 ` Mike Travis
2009-01-10 22:52 ` Ingo Molnar
2009-01-11 7:01 ` Yinghai Lu
2009-01-10 22:43 ` [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Ingo Molnar
4 siblings, 2 replies; 28+ messages in thread
From: Mike Travis @ 2009-01-10 22:38 UTC (permalink / raw)
To: Ingo Molnar; +Cc: Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
[-- Attachment #1: var-kstat-irqs --]
[-- Type: text/plain, Size: 1338 bytes --]
Impact: reduce memory usage.
Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
memory usage bump when NR_CPUS bumped from 128 to 4096:
8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
This is only when CONFIG_SPARSE_IRQS=y.
Signed-off-by: Mike Travis <travis@sgi.com>
---
kernel/irq/handle.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
--- linux-2.6-for-ingo.orig/kernel/irq/handle.c
+++ linux-2.6-for-ingo/kernel/irq/handle.c
@@ -124,8 +124,7 @@ static struct irq_desc irq_desc_legacy[N
}
};
-/* FIXME: use bootmem alloc ...*/
-static unsigned int kstat_irqs_legacy[NR_IRQS_LEGACY][NR_CPUS];
+static unsigned int *kstat_irqs_legacy;
int __init early_irq_init(void)
{
@@ -146,9 +145,13 @@ int __init early_irq_init(void)
/* allocate irq_desc_ptrs array based on nr_irqs */
irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
+ /* allocate based on nr_cpu_ids */
+ kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
+ sizeof(int));
+
for (i = 0; i < legacy_count; i++) {
desc[i].irq = i;
- desc[i].kstat_irqs = kstat_irqs_legacy[i];
+ desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
lockdep_set_class(&desc[i].lock, &irq_desc_lock_class);
init_alloc_desc_masks(&desc[i], 0, true);
irq_desc_ptrs[i] = desc + i;
--
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays
2009-01-10 22:38 [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Mike Travis
` (3 preceding siblings ...)
2009-01-10 22:38 ` [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized Mike Travis
@ 2009-01-10 22:43 ` Ingo Molnar
2009-01-11 0:15 ` Mike Travis
4 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2009-01-10 22:43 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
>
> The following patches change irq_desc and kstat_irq_legacy into
> variable sized arrays based on nr_cpu_ids when CONFIG_SPARSE_IRQS=y.
>
> irq: change references from NR_IRQS to nr_irqs
> irq: allocate irq_desc_ptrs array based on nr_irqs
> irq: initialize nr_irqs based on nr_cpu_ids
> kstat: modify kstat_irqs_legacy to be variable sized
>
> Based on: tip/cpus4096 @ v2.6.28-6140-g36c401a
>
> (Ingo - I will push these to your tip/cpus4096 branch via my cpus4096-for-ingo
> git tree.)
Thanks - please send a pull request when you feel good about them.
The changes look good to me.
> Affects of the SPARSE changes on NR_CPUS values.
>
> 1 - 128-defconfig (non-SPARSE)
> 2 - 4k-defconfig (non-SPARSE)
> 3 - 4k-defconfig (SPARSE)
>
> ====== Data
>
> .1. .2. .3. ..final..
> 1114112 . -1114112 . -100% irq_desc(.data.cacheline_aligned)
> 208896 -69632 -138752 512 -99% irq_cfgx(.data)
> 34816 . -34816 . -100% irq_timer_state(.bss)
> 17480 . -17480 . -100% per_cpu__kstat(.data.percpu)
> 0 . +4096 4096 . irq_desc_legacy(.data.cacheline_aligned)
Impressive!
If i remember your previous figures correctly we've got about +4MB of
memory bloat at 4K CPUs, that is now down to 3MB total?
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/4] irq: change references from NR_IRQS to nr_irqs
2009-01-10 22:38 ` [PATCH 1/4] irq: change references from NR_IRQS to nr_irqs Mike Travis
@ 2009-01-10 22:45 ` Ingo Molnar
2009-01-10 23:10 ` Mike Travis
0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2009-01-10 22:45 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
> Impact: preparation, cleanup
s/Impact: preparation, cleanup/Impact: preparation, cleanup, add printk
> @@ -160,9 +162,9 @@ struct irq_desc *irq_to_desc_alloc_cpu(u
> unsigned long flags;
> int node;
>
> - if (irq >= NR_IRQS) {
> - printk(KERN_WARNING "irq >= NR_IRQS in irq_to_desc_alloc: %d %d\n",
> - irq, NR_IRQS);
> + if (irq >= nr_irqs) {
> + printk(KERN_WARNING "irq >= nr_irqs in irq_to_desc_alloc: %d %d\n",
> + irq, nr_irqs);
> WARN_ON(1);
> return NULL;
While at it, could you please also convert this to a WARN() construct
instead? (in a separate commit)
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/4] irq: allocate irq_desc_ptrs array based on nr_irqs
2009-01-10 22:38 ` [PATCH 2/4] irq: allocate irq_desc_ptrs array based on nr_irqs Mike Travis
@ 2009-01-10 22:47 ` Ingo Molnar
2009-01-10 23:03 ` Mike Travis
0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2009-01-10 22:47 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
> +++ linux-2.6-for-ingo/kernel/irq/internals.h
> @@ -16,7 +16,11 @@ extern int __irq_set_trigger(struct irq_
> extern struct lock_class_key irq_desc_lock_class;
> extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr);
> extern spinlock_t sparse_irq_lock;
> +#ifdef CONFIG_SPARSE_IRQ
> +extern struct irq_desc **irq_desc_ptrs;
> +#else
> extern struct irq_desc *irq_desc_ptrs[NR_IRQS];
> +#endif
why the #ifdef? irq_desc_ptrs is only defined and used by sparseirq.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 3/4] irq: initialize nr_irqs based on nr_cpu_ids
2009-01-10 22:38 ` [PATCH 3/4] irq: initialize nr_irqs based on nr_cpu_ids Mike Travis
@ 2009-01-10 22:50 ` Ingo Molnar
2009-01-10 23:20 ` Mike Travis
2009-01-11 2:00 ` Mike Travis
0 siblings, 2 replies; 28+ messages in thread
From: Ingo Molnar @ 2009-01-10 22:50 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
> Impact: Reduce memory usage.
>
> This is the second half of the changes to make the irq_desc_ptrs be
> variable sized based on nr_cpu_ids. The algorithm is the same as the
> setting of NR_IRQS except use nr_cpu_ids instead of NR_CPUS. This is
> only when CONFIG_SPARSE_IRQS=y.
>
> Signed-off-by: Mike Travis <travis@sgi.com>
> ---
> kernel/irq/handle.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> Files linux-2.6-for-ingo.orig/kernel/irq/.handle.c.swp and linux-2.6-for-ingo/kernel/irq/.handle.c.swp differ
> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
> +++ linux-2.6-for-ingo/kernel/irq/handle.c
> @@ -133,6 +133,11 @@ int __init early_irq_init(void)
> int legacy_count;
> int i;
>
> + /* initialize nr_irqs based on nr_cpu_ids */
> + nr_irqs = (8 * nr_cpu_ids) > (32 * MAX_IO_APICS) ?
> + NR_VECTORS + (8 * nr_cpu_ids) :
> + NR_VECTORS + (32 * MAX_IO_APICS);
> +
this will break non-x86. Please move this to the x86 early-irq-init
function instead, and also sync it up with the current NR_IRQS sizing
macro.
I.e. do not duplicate the numbers but introduce some sort of
max_nr_irqs(nr_cpus) define that is used to initialize NR_IRQS and also
used later on to narrow down nr_irqs later on.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-10 22:38 ` [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized Mike Travis
@ 2009-01-10 22:52 ` Ingo Molnar
2009-01-10 23:08 ` Mike Travis
2009-01-11 7:01 ` Yinghai Lu
1 sibling, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2009-01-10 22:52 UTC (permalink / raw)
To: Mike Travis, Andrew Morton
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
> Impact: reduce memory usage.
>
> Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
> memory usage bump when NR_CPUS bumped from 128 to 4096:
>
> 8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
>
> This is only when CONFIG_SPARSE_IRQS=y.
>
> Signed-off-by: Mike Travis <travis@sgi.com>
> ---
> kernel/irq/handle.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
> +++ linux-2.6-for-ingo/kernel/irq/handle.c
> @@ -124,8 +124,7 @@ static struct irq_desc irq_desc_legacy[N
> }
> };
>
> -/* FIXME: use bootmem alloc ...*/
> -static unsigned int kstat_irqs_legacy[NR_IRQS_LEGACY][NR_CPUS];
> +static unsigned int *kstat_irqs_legacy;
>
> int __init early_irq_init(void)
> {
> @@ -146,9 +145,13 @@ int __init early_irq_init(void)
> /* allocate irq_desc_ptrs array based on nr_irqs */
> irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
>
> + /* allocate based on nr_cpu_ids */
> + kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
> + sizeof(int));
> +
> for (i = 0; i < legacy_count; i++) {
> desc[i].irq = i;
> - desc[i].kstat_irqs = kstat_irqs_legacy[i];
> + desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
> lockdep_set_class(&desc[i].lock, &irq_desc_lock_class);
> init_alloc_desc_masks(&desc[i], 0, true);
> irq_desc_ptrs[i] = desc + i;
btw., while at it - dont we want to upgrade this to a 'long' (in a
separate commit)? Having more than 4 billion irqs after bootup is easily
possible.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/4] irq: allocate irq_desc_ptrs array based on nr_irqs
2009-01-10 22:47 ` Ingo Molnar
@ 2009-01-10 23:03 ` Mike Travis
2009-01-11 1:06 ` Ingo Molnar
0 siblings, 1 reply; 28+ messages in thread
From: Mike Travis @ 2009-01-10 23:03 UTC (permalink / raw)
To: Ingo Molnar
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> +++ linux-2.6-for-ingo/kernel/irq/internals.h
>> @@ -16,7 +16,11 @@ extern int __irq_set_trigger(struct irq_
>> extern struct lock_class_key irq_desc_lock_class;
>> extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr);
>> extern spinlock_t sparse_irq_lock;
>> +#ifdef CONFIG_SPARSE_IRQ
>> +extern struct irq_desc **irq_desc_ptrs;
>> +#else
>> extern struct irq_desc *irq_desc_ptrs[NR_IRQS];
>> +#endif
>
> why the #ifdef? irq_desc_ptrs is only defined and used by sparseirq.
>
> Ingo
Hmm, good point. I just modified as it was.
Should it only be:
#ifdef CONFIG_SPARSE_IRQ
extern struct irq_desc **irq_desc_ptrs;
#endif
Or do we normally declare something even though it won't be defined anywhere?
Thanks,
Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-10 22:52 ` Ingo Molnar
@ 2009-01-10 23:08 ` Mike Travis
2009-01-11 1:08 ` Ingo Molnar
0 siblings, 1 reply; 28+ messages in thread
From: Mike Travis @ 2009-01-10 23:08 UTC (permalink / raw)
To: Ingo Molnar
Cc: Andrew Morton, Ingo Molnar, Rusty Russell, Yinghai Lu,
Jack Steiner, linux-kernel
Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> Impact: reduce memory usage.
>>
>> Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
>> memory usage bump when NR_CPUS bumped from 128 to 4096:
>>
>> 8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
>>
>> This is only when CONFIG_SPARSE_IRQS=y.
>>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>> kernel/irq/handle.c | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
>> +++ linux-2.6-for-ingo/kernel/irq/handle.c
>> @@ -124,8 +124,7 @@ static struct irq_desc irq_desc_legacy[N
>> }
>> };
>>
>> -/* FIXME: use bootmem alloc ...*/
>> -static unsigned int kstat_irqs_legacy[NR_IRQS_LEGACY][NR_CPUS];
>> +static unsigned int *kstat_irqs_legacy;
>>
>> int __init early_irq_init(void)
>> {
>> @@ -146,9 +145,13 @@ int __init early_irq_init(void)
>> /* allocate irq_desc_ptrs array based on nr_irqs */
>> irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
>>
>> + /* allocate based on nr_cpu_ids */
>> + kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
>> + sizeof(int));
>> +
>> for (i = 0; i < legacy_count; i++) {
>> desc[i].irq = i;
>> - desc[i].kstat_irqs = kstat_irqs_legacy[i];
>> + desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
>> lockdep_set_class(&desc[i].lock, &irq_desc_lock_class);
>> init_alloc_desc_masks(&desc[i], 0, true);
>> irq_desc_ptrs[i] = desc + i;
>
> btw., while at it - dont we want to upgrade this to a 'long' (in a
> separate commit)? Having more than 4 billion irqs after bootup is easily
> possible.
>
> Ingo
Yes, I can do that. There are two places where it allocated, both for
legacy and dynamically added desc's.
You want basically this, yes?
--- linux-2.6-for-ingo.orig/include/linux/irq.h
+++ linux-2.6-for-ingo/include/linux/irq.h
@@ -162,7 +162,7 @@ struct irq_desc {
unsigned int irq;
#ifdef CONFIG_SPARSE_IRQ
struct timer_rand_state *timer_rand_state;
- unsigned int *kstat_irqs;
+ unsigned long *kstat_irqs;
# ifdef CONFIG_INTR_REMAP
struct irq_2_iommu *irq_2_iommu;
# endif
Thanks,
Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/4] irq: change references from NR_IRQS to nr_irqs
2009-01-10 22:45 ` Ingo Molnar
@ 2009-01-10 23:10 ` Mike Travis
2009-01-11 1:06 ` Ingo Molnar
0 siblings, 1 reply; 28+ messages in thread
From: Mike Travis @ 2009-01-10 23:10 UTC (permalink / raw)
To: Ingo Molnar
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> Impact: preparation, cleanup
>
> s/Impact: preparation, cleanup/Impact: preparation, cleanup, add printk
Hmm, should I attempt the 'git rebase -i' to change this or just fix up
the patch, and repush (getting new commit ids)?
>
>> @@ -160,9 +162,9 @@ struct irq_desc *irq_to_desc_alloc_cpu(u
>> unsigned long flags;
>> int node;
>>
>> - if (irq >= NR_IRQS) {
>> - printk(KERN_WARNING "irq >= NR_IRQS in irq_to_desc_alloc: %d %d\n",
>> - irq, NR_IRQS);
>> + if (irq >= nr_irqs) {
>> + printk(KERN_WARNING "irq >= nr_irqs in irq_to_desc_alloc: %d %d\n",
>> + irq, nr_irqs);
>> WARN_ON(1);
>> return NULL;
>
> While at it, could you please also convert this to a WARN() construct
> instead? (in a separate commit)
>
> Ingo
Yes, I will.
Thanks,
Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 3/4] irq: initialize nr_irqs based on nr_cpu_ids
2009-01-10 22:50 ` Ingo Molnar
@ 2009-01-10 23:20 ` Mike Travis
2009-01-11 1:07 ` Ingo Molnar
2009-01-11 2:00 ` Mike Travis
1 sibling, 1 reply; 28+ messages in thread
From: Mike Travis @ 2009-01-10 23:20 UTC (permalink / raw)
To: Ingo Molnar
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> Impact: Reduce memory usage.
>>
>> This is the second half of the changes to make the irq_desc_ptrs be
>> variable sized based on nr_cpu_ids. The algorithm is the same as the
>> setting of NR_IRQS except use nr_cpu_ids instead of NR_CPUS. This is
>> only when CONFIG_SPARSE_IRQS=y.
>>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>> kernel/irq/handle.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> Files linux-2.6-for-ingo.orig/kernel/irq/.handle.c.swp and linux-2.6-for-ingo/kernel/irq/.handle.c.swp differ
>> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
>> +++ linux-2.6-for-ingo/kernel/irq/handle.c
>> @@ -133,6 +133,11 @@ int __init early_irq_init(void)
>> int legacy_count;
>> int i;
>>
>> + /* initialize nr_irqs based on nr_cpu_ids */
>> + nr_irqs = (8 * nr_cpu_ids) > (32 * MAX_IO_APICS) ?
>> + NR_VECTORS + (8 * nr_cpu_ids) :
>> + NR_VECTORS + (32 * MAX_IO_APICS);
>> +
>
> this will break non-x86. Please move this to the x86 early-irq-init
Ahh, I did wonder about that but the way it was defined in irq_vectors and
only this code is only for architectures that define CONFIG_SPARSE_IRQ=y,
I thought this would duplicate that.
> function instead, and also sync it up with the current NR_IRQS sizing
> macro.
If mine is not the latest, then I think I'm a victim if git-remote update
not updating again as I did that before applying the changes and then
copied it in.
>
> I.e. do not duplicate the numbers but introduce some sort of
> max_nr_irqs(nr_cpus) define that is used to initialize NR_IRQS and also
> used later on to narrow down nr_irqs later on.
Ok, let me think about that some.
>
> Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays
2009-01-10 22:43 ` [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Ingo Molnar
@ 2009-01-11 0:15 ` Mike Travis
2009-01-11 1:10 ` Ingo Molnar
0 siblings, 1 reply; 28+ messages in thread
From: Mike Travis @ 2009-01-11 0:15 UTC (permalink / raw)
To: Ingo Molnar
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> The following patches change irq_desc and kstat_irq_legacy into
>> variable sized arrays based on nr_cpu_ids when CONFIG_SPARSE_IRQS=y.
>>
>> irq: change references from NR_IRQS to nr_irqs
>> irq: allocate irq_desc_ptrs array based on nr_irqs
>> irq: initialize nr_irqs based on nr_cpu_ids
>> kstat: modify kstat_irqs_legacy to be variable sized
>>
>> Based on: tip/cpus4096 @ v2.6.28-6140-g36c401a
>>
>> (Ingo - I will push these to your tip/cpus4096 branch via my cpus4096-for-ingo
>> git tree.)
>
> Thanks - please send a pull request when you feel good about them.
>
> The changes look good to me.
>
>> Affects of the SPARSE changes on NR_CPUS values.
>>
>> 1 - 128-defconfig (non-SPARSE)
>> 2 - 4k-defconfig (non-SPARSE)
>> 3 - 4k-defconfig (SPARSE)
>>
>> ====== Data
>>
>> .1. .2. .3. ..final..
>> 1114112 . -1114112 . -100% irq_desc(.data.cacheline_aligned)
>> 208896 -69632 -138752 512 -99% irq_cfgx(.data)
>> 34816 . -34816 . -100% irq_timer_state(.bss)
>> 17480 . -17480 . -100% per_cpu__kstat(.data.percpu)
>> 0 . +4096 4096 . irq_desc_legacy(.data.cacheline_aligned)
>
> Impressive!
>
> If i remember your previous figures correctly we've got about +4MB of
> memory bloat at 4K CPUs, that is now down to 3MB total?
>
> Ingo
Here's my latest stats:
====== Text/Data ()
1 - 128-defconfig
2 - 4k-defconfig
.1. .2. ..final..
5935104 -6144 5928960 -0.10% TextSize
3833856 +43008 3876864 +1.12% DataSize
9734144 -219136 9515008 -2.25% BssSize
2449408 +790528 3239936 +32% InitSize
1884160 +6144 1890304 +0.33% PerCPU
110592 +462848 573440 +418% OtherSize
====== Sections (-l 500)
1 - 128-defconfig
2 - 4k-defconfig
.1. .2. ..final..
118246380 +1109486 119355866 +0.94% Total
74327135 +46083 74373218 +0.06% .debug_info
10590059 -14760 10575299 -0.14% .debug_loc
9734104 -219904 9514200 -2.26% .bss
5934961 -5744 5929217 -0.10% .text
4387075 -6047 4381028 -0.14% .debug_line
2369795 +25072 2394867 +1.06% .rodata
1938985 +21033 1960018 +1.08% .debug_abbrev
1883808 +5760 1889568 +0.31% .data.percpu
1691216 -7952 1683264 -0.47% .debug_ranges
1481593 -920 1480673 -0.06% .debug_str
1316320 -1864 1314456 -0.14% .debug_frame
1195280 +20408 1215688 +1.71% .data
34440 +6784 41224 +19% .data.read_mostly
30016 +459264 489280 +1530% .data.cacheline_aligned
23352 -1692 21660 -7.25% __bug_table
13760 +1312 15072 +9.53% __param
====== Data (-l 500)
1 - 128-defconfig
2 - 4k-defconfig
.1. .2. ..final..
1179648 -1179648 . -100% obj_hash(.bss)
23816 +459264 483080 +1928% kmalloc_caches(.data.cacheline_aligned)
20480 -20480 . -100% obj_static_pool(.bss)
16384 +507904 524288 +3100% boot_pageset(.bss)
9576 +15032 24608 +156% def_root_domain(.bss)
6404 +26880 33284 +419% e820_saved(.bss)
6404 +26880 33284 +419% e820(.bss)
5216 +31736 36952 +608% iter(.bss)
5120 +35840 40960 +700% node_devices(.bss)
4976 +552 5528 +11% init_task(.data)
3776 +25088 28864 +664% hstates(.bss)
3072 +95232 98304 +3100% ucode_cpu_info(.bss)
1145 -1145 . -100% centrino_target(.text)
1064 +31744 32808 +2983% max_tr(.bss)
1064 +31744 32808 +2983% global_trace(.bss)
1040 +32240 33280 +3100% cpu_bit_bitmap(.rodata)
1024 +1024 2048 +100% pxm_to_node_map(.data)
1024 +7168 8192 +700% nodes_add(.bss)
1020 +130052 131072 +12750% apic_version(.bss)
945 -945 . -100% debug_objects_mem_init(.init.text)
835 -835 . -100% speedstep_get_processor_frequency(.text)
752 -752 . -100% __debug_object_init(.text)
541 -541 . -100% cpufreq_p4_target(.text)
525 -525 . -100% speedstep_detect_processor(.text)
514 -514 . -100% page_mkclean(.text)
512 -512 . -100% speedstep_get_freqs(.text)
512 +3584 4096 +700% node_data(.data.read_mostly)
512 +15872 16384 +3100% last_pid(.data)
512 -512 . -100% has_N44_O17_errata(.bss)
====== Stack (-l 500)
1 - 128-defconfig
2 - 4k-defconfig
(Note the cpuset bumps have been addressed by Li Zefan's changes.)
.1. .2. ..final..
0 +808 808 . cpuset_write_resmask
0 +736 736 . update_flag
0 +640 640 . cpuset_attach
0 +600 600 . powernowk8_target
0 +584 584 . shmem_getpage
0 +584 584 . __percpu_alloc_mask
0 +568 568 . powernowk8_cpu_init
0 +552 552 . smp_call_function_many
0 +536 536 . pci_device_probe
0 +536 536 . cpuset_common_file_read
0 +528 528 . check_supported_cpu
0 +520 520 . cpuset_can_attach
0 +512 512 . powernowk8_get
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 1/4] irq: change references from NR_IRQS to nr_irqs
2009-01-10 23:10 ` Mike Travis
@ 2009-01-11 1:06 ` Ingo Molnar
0 siblings, 0 replies; 28+ messages in thread
From: Ingo Molnar @ 2009-01-11 1:06 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
> Ingo Molnar wrote:
> > * Mike Travis <travis@sgi.com> wrote:
> >
> >> Impact: preparation, cleanup
> >
> > s/Impact: preparation, cleanup/Impact: preparation, cleanup, add printk
>
> Hmm, should I attempt the 'git rebase -i' to change this or just fix up
> the patch, and repush (getting new commit ids)?
Your choice really - it's the same end result: different commit IDs. I'd
suggest the method that is easier to you.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 2/4] irq: allocate irq_desc_ptrs array based on nr_irqs
2009-01-10 23:03 ` Mike Travis
@ 2009-01-11 1:06 ` Ingo Molnar
0 siblings, 0 replies; 28+ messages in thread
From: Ingo Molnar @ 2009-01-11 1:06 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
> Ingo Molnar wrote:
> > * Mike Travis <travis@sgi.com> wrote:
> >
> >> +++ linux-2.6-for-ingo/kernel/irq/internals.h
> >> @@ -16,7 +16,11 @@ extern int __irq_set_trigger(struct irq_
> >> extern struct lock_class_key irq_desc_lock_class;
> >> extern void init_kstat_irqs(struct irq_desc *desc, int cpu, int nr);
> >> extern spinlock_t sparse_irq_lock;
> >> +#ifdef CONFIG_SPARSE_IRQ
> >> +extern struct irq_desc **irq_desc_ptrs;
> >> +#else
> >> extern struct irq_desc *irq_desc_ptrs[NR_IRQS];
> >> +#endif
> >
> > why the #ifdef? irq_desc_ptrs is only defined and used by sparseirq.
> >
> > Ingo
>
> Hmm, good point. I just modified as it was.
>
> Should it only be:
>
> #ifdef CONFIG_SPARSE_IRQ
> extern struct irq_desc **irq_desc_ptrs;
> #endif
>
> Or do we normally declare something even though it won't be defined anywhere?
yeah, we tend to skip the #ifdefs whenever we can do so unpunished.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 3/4] irq: initialize nr_irqs based on nr_cpu_ids
2009-01-10 23:20 ` Mike Travis
@ 2009-01-11 1:07 ` Ingo Molnar
0 siblings, 0 replies; 28+ messages in thread
From: Ingo Molnar @ 2009-01-11 1:07 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
> > function instead, and also sync it up with the current NR_IRQS sizing
> > macro.
>
> If mine is not the latest, then I think I'm a victim if git-remote
> update not updating again as I did that before applying the changes and
> then copied it in.
i meant something else: 'sync up' in the sense of not duplicating the same
rules. Should someone else change it - it's easy to miss the 'other' site
where we size. So my suggestion is to unify them into a single helper
inline or macro.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-10 23:08 ` Mike Travis
@ 2009-01-11 1:08 ` Ingo Molnar
2009-01-11 4:32 ` Mike Travis
0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2009-01-11 1:08 UTC (permalink / raw)
To: Mike Travis
Cc: Andrew Morton, Ingo Molnar, Rusty Russell, Yinghai Lu,
Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
> Ingo Molnar wrote:
> > * Mike Travis <travis@sgi.com> wrote:
> >
> >> Impact: reduce memory usage.
> >>
> >> Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
> >> memory usage bump when NR_CPUS bumped from 128 to 4096:
> >>
> >> 8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
> >>
> >> This is only when CONFIG_SPARSE_IRQS=y.
> >>
> >> Signed-off-by: Mike Travis <travis@sgi.com>
> >> ---
> >> kernel/irq/handle.c | 9 ++++++---
> >> 1 file changed, 6 insertions(+), 3 deletions(-)
> >>
> >> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
> >> +++ linux-2.6-for-ingo/kernel/irq/handle.c
> >> @@ -124,8 +124,7 @@ static struct irq_desc irq_desc_legacy[N
> >> }
> >> };
> >>
> >> -/* FIXME: use bootmem alloc ...*/
> >> -static unsigned int kstat_irqs_legacy[NR_IRQS_LEGACY][NR_CPUS];
> >> +static unsigned int *kstat_irqs_legacy;
> >>
> >> int __init early_irq_init(void)
> >> {
> >> @@ -146,9 +145,13 @@ int __init early_irq_init(void)
> >> /* allocate irq_desc_ptrs array based on nr_irqs */
> >> irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
> >>
> >> + /* allocate based on nr_cpu_ids */
> >> + kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
> >> + sizeof(int));
> >> +
> >> for (i = 0; i < legacy_count; i++) {
> >> desc[i].irq = i;
> >> - desc[i].kstat_irqs = kstat_irqs_legacy[i];
> >> + desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
> >> lockdep_set_class(&desc[i].lock, &irq_desc_lock_class);
> >> init_alloc_desc_masks(&desc[i], 0, true);
> >> irq_desc_ptrs[i] = desc + i;
> >
> > btw., while at it - dont we want to upgrade this to a 'long' (in a
> > separate commit)? Having more than 4 billion irqs after bootup is easily
> > possible.
> >
> > Ingo
>
> Yes, I can do that. There are two places where it allocated, both for
> legacy and dynamically added desc's.
>
> You want basically this, yes?
yes - but please also check all that places that use it.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays
2009-01-11 0:15 ` Mike Travis
@ 2009-01-11 1:10 ` Ingo Molnar
2009-01-11 1:19 ` Mike Travis
0 siblings, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2009-01-11 1:10 UTC (permalink / raw)
To: Mike Travis
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
* Mike Travis <travis@sgi.com> wrote:
> Here's my latest stats:
>
> ====== Text/Data ()
>
> 1 - 128-defconfig
> 2 - 4k-defconfig
small nit, shouldnt this:
> .1. .2. ..final..
be:
> .1. .2 .delta
as 'final' is .2 in essence, just in absolute numbers, right? Confusing.
> 5935104 -6144 5928960 -0.10% TextSize
> 3833856 +43008 3876864 +1.12% DataSize
> 9734144 -219136 9515008 -2.25% BssSize
> 2449408 +790528 3239936 +32% InitSize
> 1884160 +6144 1890304 +0.33% PerCPU
> 110592 +462848 573440 +418% OtherSize
and flip around columns 2 and 3 ? It's the delta that matters, and the
percentage figure.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays
2009-01-11 1:10 ` Ingo Molnar
@ 2009-01-11 1:19 ` Mike Travis
0 siblings, 0 replies; 28+ messages in thread
From: Mike Travis @ 2009-01-11 1:19 UTC (permalink / raw)
To: Ingo Molnar
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> Here's my latest stats:
>>
>> ====== Text/Data ()
>>
>> 1 - 128-defconfig
>> 2 - 4k-defconfig
>
>
> small nit, shouldnt this:
>> .1. .2. ..final..
>
> be:
>> .1. .2 .delta
>
> as 'final' is .2 in essence, just in absolute numbers, right? Confusing.
>
>
>> 5935104 -6144 5928960 -0.10% TextSize
>> 3833856 +43008 3876864 +1.12% DataSize
>> 9734144 -219136 9515008 -2.25% BssSize
>> 2449408 +790528 3239936 +32% InitSize
>> 1884160 +6144 1890304 +0.33% PerCPU
>> 110592 +462848 573440 +418% OtherSize
>
> and flip around columns 2 and 3 ? It's the delta that matters, and the
> percentage figure.
>
> Ingo
Sure thing. This originally was built for many columns so the net effect
of each was fairly important. But with only two columns it does make
sense to do a direct comparision.
Thanks,
Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 3/4] irq: initialize nr_irqs based on nr_cpu_ids
2009-01-10 22:50 ` Ingo Molnar
2009-01-10 23:20 ` Mike Travis
@ 2009-01-11 2:00 ` Mike Travis
1 sibling, 0 replies; 28+ messages in thread
From: Mike Travis @ 2009-01-11 2:00 UTC (permalink / raw)
To: Ingo Molnar
Cc: Ingo Molnar, Rusty Russell, Yinghai Lu, Jack Steiner, linux-kernel
Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
>
>> Impact: Reduce memory usage.
>>
>> This is the second half of the changes to make the irq_desc_ptrs be
>> variable sized based on nr_cpu_ids. The algorithm is the same as the
>> setting of NR_IRQS except use nr_cpu_ids instead of NR_CPUS. This is
>> only when CONFIG_SPARSE_IRQS=y.
>>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>> kernel/irq/handle.c | 5 +++++
>> 1 file changed, 5 insertions(+)
>>
>> Files linux-2.6-for-ingo.orig/kernel/irq/.handle.c.swp and linux-2.6-for-ingo/kernel/irq/.handle.c.swp differ
>> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
>> +++ linux-2.6-for-ingo/kernel/irq/handle.c
>> @@ -133,6 +133,11 @@ int __init early_irq_init(void)
>> int legacy_count;
>> int i;
>>
>> + /* initialize nr_irqs based on nr_cpu_ids */
>> + nr_irqs = (8 * nr_cpu_ids) > (32 * MAX_IO_APICS) ?
>> + NR_VECTORS + (8 * nr_cpu_ids) :
>> + NR_VECTORS + (32 * MAX_IO_APICS);
>> +
>
> this will break non-x86. Please move this to the x86 early-irq-init
> function instead, and also sync it up with the current NR_IRQS sizing
> macro.
>
> I.e. do not duplicate the numbers but introduce some sort of
> max_nr_irqs(nr_cpus) define that is used to initialize NR_IRQS and also
> used later on to narrow down nr_irqs later on.
The way it is laid out is:
kernel/softirq.c:
int __init __weak early_irq_init(void)
{
return 0;
}
kernel/irq/handle.c:
#ifdef CONFIG_SPARSE_IRQ
:
struct irq_desc **irq_desc_ptrs __read_mostly;
:
int __init early_irq_init(void)
{
:
nr_irqs = ...
#else
:
struct irq_desc irq_desc[NR_IRQS] __cacheline_aligned_in_smp = {
:
int __init early_irq_init(void)
{
<don't set nr_irqs>
So I don't see an x86 specific early_irq_init.
Since NR_IRQS is defined in a lot of different ways (in different places),
I think I'll have to do something like;
arch/x86/include/asm/irq_vectors.h:
#define max_nr_irqs(nr_cpus) \
include/linux/irqnr.h:
#ifndef max_nr_irqs
#define max_nr_irqs(unused) NR_IRQS
#endif
Would that work for you?
Thanks,
Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-11 1:08 ` Ingo Molnar
@ 2009-01-11 4:32 ` Mike Travis
0 siblings, 0 replies; 28+ messages in thread
From: Mike Travis @ 2009-01-11 4:32 UTC (permalink / raw)
To: Ingo Molnar
Cc: Andrew Morton, Ingo Molnar, Rusty Russell, Yinghai Lu,
Jack Steiner, linux-kernel
Ingo Molnar wrote:
> * Mike Travis <travis@sgi.com> wrote:
...
>>>> Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
>>>> memory usage bump when NR_CPUS bumped from 128 to 4096:
>>>>
>>>> 8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
>>>>
...
>>>> + /* allocate based on nr_cpu_ids */
>>>> + kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
>>>> + sizeof(int));
...
>>> btw., while at it - dont we want to upgrade this to a 'long' (in a
>>> separate commit)? Having more than 4 billion irqs after bootup is easily
>>> possible.
Looking at this more closely, it seems it would be better to per_cpu_alloc
the legacy kstat_irqs as that would place the value being incremented on the
node of the cpu doing the incrementing.
This, of course, would need the early per_cpu_alloc (in bootmem) that
the new cpu_alloc changes from Christoph and Rusty is providing.
Thanks,
Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-10 22:38 ` [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized Mike Travis
2009-01-10 22:52 ` Ingo Molnar
@ 2009-01-11 7:01 ` Yinghai Lu
2009-01-11 12:17 ` Ingo Molnar
2009-01-11 17:40 ` Mike Travis
1 sibling, 2 replies; 28+ messages in thread
From: Yinghai Lu @ 2009-01-11 7:01 UTC (permalink / raw)
To: Mike Travis; +Cc: Ingo Molnar, Rusty Russell, Jack Steiner, linux-kernel
On Sat, Jan 10, 2009 at 2:38 PM, Mike Travis <travis@sgi.com> wrote:
> Impact: reduce memory usage.
>
> Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
> memory usage bump when NR_CPUS bumped from 128 to 4096:
>
> 8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
>
> This is only when CONFIG_SPARSE_IRQS=y.
>
> Signed-off-by: Mike Travis <travis@sgi.com>
> ---
> kernel/irq/handle.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
> +++ linux-2.6-for-ingo/kernel/irq/handle.c
> @@ -124,8 +124,7 @@ static struct irq_desc irq_desc_legacy[N
> }
> };
>
> -/* FIXME: use bootmem alloc ...*/
> -static unsigned int kstat_irqs_legacy[NR_IRQS_LEGACY][NR_CPUS];
> +static unsigned int *kstat_irqs_legacy;
>
> int __init early_irq_init(void)
> {
> @@ -146,9 +145,13 @@ int __init early_irq_init(void)
> /* allocate irq_desc_ptrs array based on nr_irqs */
> irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
>
> + /* allocate based on nr_cpu_ids */
> + kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
> + sizeof(int));
> +
> for (i = 0; i < legacy_count; i++) {
> desc[i].irq = i;
> - desc[i].kstat_irqs = kstat_irqs_legacy[i];
> + desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
here should be
+ desc[i].kstat_irqs = kstat_irqs_legacy + i * nr_cpu_ids;
YH
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-11 7:01 ` Yinghai Lu
@ 2009-01-11 12:17 ` Ingo Molnar
2009-01-11 17:50 ` Mike Travis
2009-01-11 17:40 ` Mike Travis
1 sibling, 1 reply; 28+ messages in thread
From: Ingo Molnar @ 2009-01-11 12:17 UTC (permalink / raw)
To: Yinghai Lu
Cc: Mike Travis, Ingo Molnar, Rusty Russell, Jack Steiner, linux-kernel
* Yinghai Lu <yinghai@kernel.org> wrote:
> On Sat, Jan 10, 2009 at 2:38 PM, Mike Travis <travis@sgi.com> wrote:
> > Impact: reduce memory usage.
> >
> > Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
> > memory usage bump when NR_CPUS bumped from 128 to 4096:
> >
> > 8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
> >
> > This is only when CONFIG_SPARSE_IRQS=y.
> >
> > Signed-off-by: Mike Travis <travis@sgi.com>
> > ---
> > kernel/irq/handle.c | 9 ++++++---
> > 1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
> > +++ linux-2.6-for-ingo/kernel/irq/handle.c
> > @@ -124,8 +124,7 @@ static struct irq_desc irq_desc_legacy[N
> > }
> > };
> >
> > -/* FIXME: use bootmem alloc ...*/
> > -static unsigned int kstat_irqs_legacy[NR_IRQS_LEGACY][NR_CPUS];
> > +static unsigned int *kstat_irqs_legacy;
> >
> > int __init early_irq_init(void)
> > {
> > @@ -146,9 +145,13 @@ int __init early_irq_init(void)
> > /* allocate irq_desc_ptrs array based on nr_irqs */
> > irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
> >
> > + /* allocate based on nr_cpu_ids */
> > + kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
> > + sizeof(int));
> > +
> > for (i = 0; i < legacy_count; i++) {
> > desc[i].irq = i;
> > - desc[i].kstat_irqs = kstat_irqs_legacy[i];
> > + desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
>
> here should be
>
> + desc[i].kstat_irqs = kstat_irqs_legacy + i * nr_cpu_ids;
indeed - i fixed this in tip/cpus4096.
Ingo
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-11 7:01 ` Yinghai Lu
2009-01-11 12:17 ` Ingo Molnar
@ 2009-01-11 17:40 ` Mike Travis
2009-01-11 17:48 ` Mike Travis
1 sibling, 1 reply; 28+ messages in thread
From: Mike Travis @ 2009-01-11 17:40 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Ingo Molnar, Rusty Russell, Jack Steiner, linux-kernel
Yinghai Lu wrote:
> On Sat, Jan 10, 2009 at 2:38 PM, Mike Travis <travis@sgi.com> wrote:
>> Impact: reduce memory usage.
>>
>> Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
>> memory usage bump when NR_CPUS bumped from 128 to 4096:
>>
>> 8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
>>
>> This is only when CONFIG_SPARSE_IRQS=y.
>>
>> Signed-off-by: Mike Travis <travis@sgi.com>
>> ---
>> kernel/irq/handle.c | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
>> +++ linux-2.6-for-ingo/kernel/irq/handle.c
>> @@ -124,8 +124,7 @@ static struct irq_desc irq_desc_legacy[N
>> }
>> };
>>
>> -/* FIXME: use bootmem alloc ...*/
>> -static unsigned int kstat_irqs_legacy[NR_IRQS_LEGACY][NR_CPUS];
>> +static unsigned int *kstat_irqs_legacy;
>>
>> int __init early_irq_init(void)
>> {
>> @@ -146,9 +145,13 @@ int __init early_irq_init(void)
>> /* allocate irq_desc_ptrs array based on nr_irqs */
>> irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
>>
>> + /* allocate based on nr_cpu_ids */
>> + kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
>> + sizeof(int));
>> +
>> for (i = 0; i < legacy_count; i++) {
>> desc[i].irq = i;
>> - desc[i].kstat_irqs = kstat_irqs_legacy[i];
>> + desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
>
> here should be
>
> + desc[i].kstat_irqs = kstat_irqs_legacy + i * nr_cpu_ids;
>
> YH
Hmm, thanks! (I wonder why I wanted it the other way? Oh well.)
I'll make an append patch to fix it.
Thanks,
Mike
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-11 17:40 ` Mike Travis
@ 2009-01-11 17:48 ` Mike Travis
0 siblings, 0 replies; 28+ messages in thread
From: Mike Travis @ 2009-01-11 17:48 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Ingo Molnar, Rusty Russell, Jack Steiner, linux-kernel
Mike Travis wrote:
> Yinghai Lu wrote:
>> On Sat, Jan 10, 2009 at 2:38 PM, Mike Travis <travis@sgi.com> wrote:
>>> Impact: reduce memory usage.
>>>
>>> Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
>>> memory usage bump when NR_CPUS bumped from 128 to 4096:
>>>
>>> 8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
>>>
>>> This is only when CONFIG_SPARSE_IRQS=y.
>>>
>>> Signed-off-by: Mike Travis <travis@sgi.com>
>>> ---
>>> kernel/irq/handle.c | 9 ++++++---
>>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
>>> +++ linux-2.6-for-ingo/kernel/irq/handle.c
>>> @@ -124,8 +124,7 @@ static struct irq_desc irq_desc_legacy[N
>>> }
>>> };
>>>
>>> -/* FIXME: use bootmem alloc ...*/
>>> -static unsigned int kstat_irqs_legacy[NR_IRQS_LEGACY][NR_CPUS];
>>> +static unsigned int *kstat_irqs_legacy;
>>>
>>> int __init early_irq_init(void)
>>> {
>>> @@ -146,9 +145,13 @@ int __init early_irq_init(void)
>>> /* allocate irq_desc_ptrs array based on nr_irqs */
>>> irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
>>>
>>> + /* allocate based on nr_cpu_ids */
>>> + kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
>>> + sizeof(int));
>>> +
>>> for (i = 0; i < legacy_count; i++) {
>>> desc[i].irq = i;
>>> - desc[i].kstat_irqs = kstat_irqs_legacy[i];
>>> + desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
>> here should be
>>
>> + desc[i].kstat_irqs = kstat_irqs_legacy + i * nr_cpu_ids;
>>
>> YH
>
> Hmm, thanks! (I wonder why I wanted it the other way? Oh well.)
>
> I'll make an append patch to fix it.
>
> Thanks,
> Mike
I'll push this to my cpus4096-for-ingo.git tree:
Subject: irq: fix initialization of legacy kstat irqs
Impact: fix bug with legacy kstat irqs
The sectioning of the allocated kstat_irqs_legacy should use nr_cpu_ids
instead of NR_IRQS_LEGACY.
Thanks to Yinghai for pointing this out.
Signed-off-by: Mike Travis <travis@sgi.com>
---
kernel/irq/handle.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- linux-2.6-for-ingo.orig/kernel/irq/handle.c
+++ linux-2.6-for-ingo/kernel/irq/handle.c
@@ -155,7 +155,7 @@ int __init early_irq_init(void)
for (i = 0; i < legacy_count; i++) {
desc[i].irq = i;
- desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
+ desc[i].kstat_irqs = kstat_irqs_legacy + i * nr_cpu_ids;
lockdep_set_class(&desc[i].lock, &irq_desc_lock_class);
init_alloc_desc_masks(&desc[i], 0, true);
irq_desc_ptrs[i] = desc + i;
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized
2009-01-11 12:17 ` Ingo Molnar
@ 2009-01-11 17:50 ` Mike Travis
0 siblings, 0 replies; 28+ messages in thread
From: Mike Travis @ 2009-01-11 17:50 UTC (permalink / raw)
To: Ingo Molnar
Cc: Yinghai Lu, Ingo Molnar, Rusty Russell, Jack Steiner, linux-kernel
Ingo Molnar wrote:
> * Yinghai Lu <yinghai@kernel.org> wrote:
>
>> On Sat, Jan 10, 2009 at 2:38 PM, Mike Travis <travis@sgi.com> wrote:
>>> Impact: reduce memory usage.
>>>
>>> Allocate kstat_irqs_legacy based on nr_cpu_ids to deal with this
>>> memory usage bump when NR_CPUS bumped from 128 to 4096:
>>>
>>> 8192 +253952 262144 +3100% kstat_irqs_legacy(.bss)
>>>
>>> This is only when CONFIG_SPARSE_IRQS=y.
>>>
>>> Signed-off-by: Mike Travis <travis@sgi.com>
>>> ---
>>> kernel/irq/handle.c | 9 ++++++---
>>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> --- linux-2.6-for-ingo.orig/kernel/irq/handle.c
>>> +++ linux-2.6-for-ingo/kernel/irq/handle.c
>>> @@ -124,8 +124,7 @@ static struct irq_desc irq_desc_legacy[N
>>> }
>>> };
>>>
>>> -/* FIXME: use bootmem alloc ...*/
>>> -static unsigned int kstat_irqs_legacy[NR_IRQS_LEGACY][NR_CPUS];
>>> +static unsigned int *kstat_irqs_legacy;
>>>
>>> int __init early_irq_init(void)
>>> {
>>> @@ -146,9 +145,13 @@ int __init early_irq_init(void)
>>> /* allocate irq_desc_ptrs array based on nr_irqs */
>>> irq_desc_ptrs = alloc_bootmem(nr_irqs * sizeof(void *));
>>>
>>> + /* allocate based on nr_cpu_ids */
>>> + kstat_irqs_legacy = alloc_bootmem(NR_IRQS_LEGACY * nr_cpu_ids *
>>> + sizeof(int));
>>> +
>>> for (i = 0; i < legacy_count; i++) {
>>> desc[i].irq = i;
>>> - desc[i].kstat_irqs = kstat_irqs_legacy[i];
>>> + desc[i].kstat_irqs = kstat_irqs_legacy + i * NR_IRQS_LEGACY;
>> here should be
>>
>> + desc[i].kstat_irqs = kstat_irqs_legacy + i * nr_cpu_ids;
>
> indeed - i fixed this in tip/cpus4096.
>
> Ingo
Ahh, I didn't see this. Thanks Ingo!
^ permalink raw reply [flat|nested] 28+ messages in thread
end of thread, other threads:[~2009-01-11 17:50 UTC | newest]
Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-01-10 22:38 [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Mike Travis
2009-01-10 22:38 ` [PATCH 1/4] irq: change references from NR_IRQS to nr_irqs Mike Travis
2009-01-10 22:45 ` Ingo Molnar
2009-01-10 23:10 ` Mike Travis
2009-01-11 1:06 ` Ingo Molnar
2009-01-10 22:38 ` [PATCH 2/4] irq: allocate irq_desc_ptrs array based on nr_irqs Mike Travis
2009-01-10 22:47 ` Ingo Molnar
2009-01-10 23:03 ` Mike Travis
2009-01-11 1:06 ` Ingo Molnar
2009-01-10 22:38 ` [PATCH 3/4] irq: initialize nr_irqs based on nr_cpu_ids Mike Travis
2009-01-10 22:50 ` Ingo Molnar
2009-01-10 23:20 ` Mike Travis
2009-01-11 1:07 ` Ingo Molnar
2009-01-11 2:00 ` Mike Travis
2009-01-10 22:38 ` [PATCH 4/4] kstat: modify kstat_irqs_legacy to be variable sized Mike Travis
2009-01-10 22:52 ` Ingo Molnar
2009-01-10 23:08 ` Mike Travis
2009-01-11 1:08 ` Ingo Molnar
2009-01-11 4:32 ` Mike Travis
2009-01-11 7:01 ` Yinghai Lu
2009-01-11 12:17 ` Ingo Molnar
2009-01-11 17:50 ` Mike Travis
2009-01-11 17:40 ` Mike Travis
2009-01-11 17:48 ` Mike Travis
2009-01-10 22:43 ` [PATCH 0/4] irq: change irq_desc and kstat_irq_legacy to variable sized arrays Ingo Molnar
2009-01-11 0:15 ` Mike Travis
2009-01-11 1:10 ` Ingo Molnar
2009-01-11 1:19 ` Mike Travis
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).