linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] garbage values in file /proc/net/sockstat
@ 2006-01-23 11:21 pravin shelar
  2006-01-23 11:24 ` Andi Kleen
  0 siblings, 1 reply; 13+ messages in thread
From: pravin shelar @ 2006-01-23 11:21 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ravikiran G Thirumalai, Shai Fultheim, linux-kernel, David S. Miller

	In 2.6.16-rc1-mm1, (for x86_64 arch) cpu_possible_map is not same 
as NR_CPUS (prefill_possible_map()). Therefore per cpu areas are allocated 
for cpu_possible cpus only (setup_per_cpu_areas()). This causes sockstat 
to return garbage value on x84_64 arch.

So these per_cpu accesses are geting relocated (RELOC_HIDE) using
boot_cpu_pda[]->data_offset which is not initialized.

There are other instances of same bug where per_cpu() macro is used
without cpu_possible() check. e.g. net/core/utils.c :: 
net_random_reseed(), net/core/dev.c :: net_dev_init(), etc.

This patch fixes these bugs.

Regards,
Pravin.

---

Signed-off by: Pravin B. Shelar <pravins@calsoftinc.com>
Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>

Index: linux-2.6.15.1/net/core/dev.c
===================================================================
--- linux-2.6.15.1.orig/net/core/dev.c	2006-01-23 02:31:13.000000000 -0800
+++ linux-2.6.15.1/net/core/dev.c	2006-01-23 02:32:12.000000000 -0800
@@ -3240,7 +3240,7 @@ static int __init net_dev_init(void)
 	 *	Initialise the packet receive queues.
 	 */
 
-	for (i = 0; i < NR_CPUS; i++) {
+	for_each_cpu (i) {
 		struct softnet_data *queue;
 
 		queue = &per_cpu(softnet_data, i);
Index: linux-2.6.15.1/net/core/utils.c
===================================================================
--- linux-2.6.15.1.orig/net/core/utils.c	2006-01-23 02:31:13.000000000 -0800
+++ linux-2.6.15.1/net/core/utils.c	2006-01-23 02:32:12.000000000 -0800
@@ -133,7 +133,7 @@ static int net_random_reseed(void)
 	unsigned long seed[NR_CPUS];
 
 	get_random_bytes(seed, sizeof(seed));
-	for (i = 0; i < NR_CPUS; i++) {
+	for_each_cpu(i) {
 		struct nrnd_state *state = &per_cpu(net_rand_state,i);
 		__net_srandom(state, seed[i]);
 	}
Index: linux-2.6.15.1/net/socket.c
===================================================================
--- linux-2.6.15.1.orig/net/socket.c	2006-01-23 02:31:13.000000000 -0800
+++ linux-2.6.15.1/net/socket.c	2006-01-23 02:32:12.000000000 -0800
@@ -2079,7 +2079,7 @@ void socket_seq_show(struct seq_file *se
 	int cpu;
 	int counter = 0;
 
-	for (cpu = 0; cpu < NR_CPUS; cpu++)
+	for_each_cpu (cpu) 
 		counter += per_cpu(sockets_in_use, cpu);
 
 	/* It can be negative, by the way. 8) */

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] garbage values in file /proc/net/sockstat
  2006-01-23 11:21 [PATCH] garbage values in file /proc/net/sockstat pravin shelar
@ 2006-01-23 11:24 ` Andi Kleen
  2006-01-23 13:28   ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2006-01-23 11:24 UTC (permalink / raw)
  To: pravin shelar
  Cc: Ravikiran G Thirumalai, Shai Fultheim, linux-kernel, David S. Miller

On Monday 23 January 2006 12:21, pravin shelar wrote:
> 	In 2.6.16-rc1-mm1, (for x86_64 arch) cpu_possible_map is not same 
> as NR_CPUS (prefill_possible_map()). Therefore per cpu areas are allocated 
> for cpu_possible cpus only (setup_per_cpu_areas()). This causes sockstat 
> to return garbage value on x84_64 arch.
> 
> So these per_cpu accesses are geting relocated (RELOC_HIDE) using
> boot_cpu_pda[]->data_offset which is not initialized.
> 
> There are other instances of same bug where per_cpu() macro is used
> without cpu_possible() check. e.g. net/core/utils.c :: 
> net_random_reseed(), net/core/dev.c :: net_dev_init(), etc.
> 
> This patch fixes these bugs.

Thanks. Patches Look good.  Dave, can you push them for 2.6.16 still please?

-Andi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] garbage values in file /proc/net/sockstat
  2006-01-23 11:24 ` Andi Kleen
@ 2006-01-23 13:28   ` Eric Dumazet
  2006-01-23 15:11     ` Andi Kleen
  2006-01-25 21:45     ` Red zones (was: [PATCH] garbage values in file /proc/net/sockstat) Bernd Eckenfels
  0 siblings, 2 replies; 13+ messages in thread
From: Eric Dumazet @ 2006-01-23 13:28 UTC (permalink / raw)
  To: Andi Kleen
  Cc: pravin shelar, Ravikiran G Thirumalai, Shai Fultheim,
	linux-kernel, David S. Miller

Andi Kleen a écrit :
> On Monday 23 January 2006 12:21, pravin shelar wrote:
>> 	In 2.6.16-rc1-mm1, (for x86_64 arch) cpu_possible_map is not same 
>> as NR_CPUS (prefill_possible_map()). Therefore per cpu areas are allocated 
>> for cpu_possible cpus only (setup_per_cpu_areas()). This causes sockstat 
>> to return garbage value on x84_64 arch.
>>
>> So these per_cpu accesses are geting relocated (RELOC_HIDE) using
>> boot_cpu_pda[]->data_offset which is not initialized.
>>
>> There are other instances of same bug where per_cpu() macro is used
>> without cpu_possible() check. e.g. net/core/utils.c :: 
>> net_random_reseed(), net/core/dev.c :: net_dev_init(), etc.
>>
>> This patch fixes these bugs.
> 
> Thanks. Patches Look good.  Dave, can you push them for 2.6.16 still please?
> 

Shouldnt we force a page fault for not possible cpus in cpu_data
to catch all access to per_cpu(some_object, some_not_possible_cpu) ?

We can use a red zone big enough to hold the whole per_cpu data.

Something like :

file include/asm-x86_64/pgtable.h

#define CPUDATA_RED_ZONE 0xffff808000000000UL  /* start of percpu catcher */

file arch/x86_64/kernel/setup64.c

setup_per_cpu_areas(void)
{
...
     for (i = 0 ; i < NR_CPUS ; i++) {
         if (!cpu_possible(cpu))
             cpu_pda(i)->data_offset = CPUDATA_RED_ZONE - __per_cpu_start ;
     }
}

Eric


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] garbage values in file /proc/net/sockstat
  2006-01-23 13:28   ` Eric Dumazet
@ 2006-01-23 15:11     ` Andi Kleen
  2006-01-23 16:28       ` Eric Dumazet
  2006-01-25 21:45     ` Red zones (was: [PATCH] garbage values in file /proc/net/sockstat) Bernd Eckenfels
  1 sibling, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2006-01-23 15:11 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: pravin shelar, Ravikiran G Thirumalai, Shai Fultheim,
	linux-kernel, David S. Miller

On Monday 23 January 2006 14:28, Eric Dumazet wrote:

> Shouldnt we force a page fault for not possible cpus in cpu_data
> to catch all access to per_cpu(some_object, some_not_possible_cpu) ?
>
> We can use a red zone big enough to hold the whole per_cpu data.

Good idea. Can you please send me a tested patch?

-Andi

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] garbage values in file /proc/net/sockstat
  2006-01-23 15:11     ` Andi Kleen
@ 2006-01-23 16:28       ` Eric Dumazet
  2006-01-23 16:46         ` Eric Dumazet
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2006-01-23 16:28 UTC (permalink / raw)
  To: Andi Kleen
  Cc: pravin shelar, Ravikiran G Thirumalai, Shai Fultheim,
	linux-kernel, David S. Miller

[-- Attachment #1: Type: text/plain, Size: 1205 bytes --]

Andi Kleen a écrit :
> On Monday 23 January 2006 14:28, Eric Dumazet wrote:
> 
>> Shouldnt we force a page fault for not possible cpus in cpu_data
>> to catch all access to per_cpu(some_object, some_not_possible_cpu) ?
>>
>> We can use a red zone big enough to hold the whole per_cpu data.
> 
> Good idea. Can you please send me a tested patch?
> 

I did a patch (on top of 2.6.16-rc1-mm2) , but the kernel crashes in 
sched_init(void)

for (i = 0; i < NR_CPUS; i++) {
	prio_array_t *array;
	rq = cpu_rq(i);
	spin_lock_init(&rq->lock);  <<-CRASH


In my config, NR_CPUS = 8, and I have one only one CPU inside my test box.

So should I send only the patch or all the corrections I have to do to avoid 
all possible crashes in my config ?

Thank you
Eric Dumazet

[PATCH] x86_64 : Use a special CPUDATA_RED_ZONE to catch accesses to 
per_cpu(some_object, some_not_possible_cpu)

Because cpu_data(cpu)->data_offset may contain garbage, some buggy code may do 
random things without notice. If we initialize data_offset so that the 
per_cpu() data sits in an unmapped memory area, we should get page faults and 
stack traces should help us find the bugs.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>



[-- Attachment #2: cpudata_red_zone.patch --]
[-- Type: text/plain, Size: 1733 bytes --]

--- linux-2.6.16-rc1/Documentation/x86_64/mm.txt	2006-01-17 08:44:47.000000000 +0100
+++ linux-2.6.16-rc1-mm2-ed/Documentation/x86_64/mm.txt	2006-01-23 16:54:46.000000000 +0100
@@ -5,7 +5,8 @@
 
 0000000000000000 - 00007fffffffffff (=47bits) user space, different per mm
 hole caused by [48:63] sign extension
-ffff800000000000 - ffff80ffffffffff (=40bits) guard hole
+ffff800000000000 - ffff807fffffffff (=39bits) guard hole
+ffff808000000000 - ffff80ffffffffff (=39bits) not possible cpus percpudata hole
 ffff810000000000 - ffffc0ffffffffff (=46bits) direct mapping of all phys. memory
 ffffc10000000000 - ffffc1ffffffffff (=40bits) hole
 ffffc20000000000 - ffffe1ffffffffff (=45bits) vmalloc/ioremap space
--- linux-2.6.16-rc1/include/asm-x86_64/pgtable.h	2006-01-17 08:44:47.000000000 +0100
+++ linux-2.6.16-rc1-mm2-ed/include/asm-x86_64/pgtable.h	2006-01-23 16:54:46.000000000 +0100
@@ -136,6 +136,7 @@
 
 #ifndef __ASSEMBLY__
 #define MAXMEM		 0x3fffffffffffUL
+#define CPUDATA_RED_ZONE 0xffff808000000000UL
 #define VMALLOC_START    0xffffc20000000000UL
 #define VMALLOC_END      0xffffe1ffffffffffUL
 #define MODULES_VADDR    0xffffffff88000000UL
--- linux-2.6.16-rc1/arch/x86_64/kernel/setup64.c	2006-01-23 16:36:38.000000000 +0100
+++ linux-2.6.16-rc1-mm2-ed/arch/x86_64/kernel/setup64.c	2006-01-23 16:58:30.000000000 +0100
@@ -99,9 +99,13 @@
 		size = PERCPU_ENOUGH_ROOM;
 #endif
 
-	for_each_cpu_mask (i, cpu_possible_map) {
+	for (i = 0 ; i < NR_CPUS ; i++) {
 		char *ptr;
 
+		cpu_pda(i)->data_offset = (char *)CPUDATA_RED_ZONE - __per_cpu_start;
+		if (!cpu_possible(i))
+			continue;
+
 		if (!NODE_DATA(cpu_to_node(i))) {
 			printk("cpu with no node %d, num_online_nodes %d\n",
 			       i, num_online_nodes());

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] garbage values in file /proc/net/sockstat
  2006-01-23 16:28       ` Eric Dumazet
@ 2006-01-23 16:46         ` Eric Dumazet
  2006-01-25 13:31           ` Andi Kleen
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2006-01-23 16:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andi Kleen, pravin shelar, Ravikiran G Thirumalai, Shai Fultheim,
	linux-kernel, David S. Miller

[-- Attachment #1: Type: text/plain, Size: 561 bytes --]


Sorry for the first version of he patch. I did one change, in order to do the 
initialization only for !possible cpu

[PATCH] x86_64 : Use a special CPUDATA_RED_ZONE to catch accesses to 
per_cpu(some_object, some_not_possible_cpu)

Because cpu_data(cpu)->data_offset may contain garbage, some buggy code may do 
random things without notice. If we initialize data_offset so that the 
per_cpu() data sits in an unmapped memory area, we should get page faults and 
stack traces should help us find the bugs.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>



[-- Attachment #2: cpudata_red_zone.patch --]
[-- Type: text/plain, Size: 1741 bytes --]

--- linux-2.6.16-rc1/Documentation/x86_64/mm.txt	2006-01-17 08:44:47.000000000 +0100
+++ linux-2.6.16-rc1-mm2-ed/Documentation/x86_64/mm.txt	2006-01-23 16:54:46.000000000 +0100
@@ -5,7 +5,8 @@
 
 0000000000000000 - 00007fffffffffff (=47bits) user space, different per mm
 hole caused by [48:63] sign extension
-ffff800000000000 - ffff80ffffffffff (=40bits) guard hole
+ffff800000000000 - ffff807fffffffff (=39bits) guard hole
+ffff808000000000 - ffff80ffffffffff (=39bits) not possible cpus percpudata hole
 ffff810000000000 - ffffc0ffffffffff (=46bits) direct mapping of all phys. memory
 ffffc10000000000 - ffffc1ffffffffff (=40bits) hole
 ffffc20000000000 - ffffe1ffffffffff (=45bits) vmalloc/ioremap space
--- linux-2.6.16-rc1/include/asm-x86_64/pgtable.h	2006-01-17 08:44:47.000000000 +0100
+++ linux-2.6.16-rc1-mm2-ed/include/asm-x86_64/pgtable.h	2006-01-23 16:54:46.000000000 +0100
@@ -136,6 +136,7 @@
 
 #ifndef __ASSEMBLY__
 #define MAXMEM		 0x3fffffffffffUL
+#define CPUDATA_RED_ZONE 0xffff808000000000UL
 #define VMALLOC_START    0xffffc20000000000UL
 #define VMALLOC_END      0xffffe1ffffffffffUL
 #define MODULES_VADDR    0xffffffff88000000UL
--- linux-2.6.16-rc1/arch/x86_64/kernel/setup64.c	2006-01-23 16:36:38.000000000 +0100
+++ linux-2.6.16-rc1-mm2-ed/arch/x86_64/kernel/setup64.c	2006-01-23 17:40:54.000000000 +0100
@@ -99,9 +99,14 @@
 		size = PERCPU_ENOUGH_ROOM;
 #endif
 
-	for_each_cpu_mask (i, cpu_possible_map) {
+	for (i = 0 ; i < NR_CPUS ; i++) {
 		char *ptr;
 
+		if (!cpu_possible(i)) {
+			cpu_pda(i)->data_offset = (char *)CPUDATA_RED_ZONE - __per_cpu_start;
+			continue;
+		}
+
 		if (!NODE_DATA(cpu_to_node(i))) {
 			printk("cpu with no node %d, num_online_nodes %d\n",
 			       i, num_online_nodes());

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] garbage values in file /proc/net/sockstat
  2006-01-23 16:46         ` Eric Dumazet
@ 2006-01-25 13:31           ` Andi Kleen
  2006-01-25 19:59             ` Ravikiran G Thirumalai
  0 siblings, 1 reply; 13+ messages in thread
From: Andi Kleen @ 2006-01-25 13:31 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: pravin shelar, Ravikiran G Thirumalai, Shai Fultheim,
	linux-kernel, David S. Miller

[-- Attachment #1: Type: text/plain, Size: 1237 bytes --]

On Monday 23 January 2006 17:46, Eric Dumazet wrote:
> 
> Sorry for the first version of he patch. I did one change, in order to do the 
> initialization only for !possible cpu
> 
> [PATCH] x86_64 : Use a special CPUDATA_RED_ZONE to catch accesses to 
> per_cpu(some_object, some_not_possible_cpu)
> 
> Because cpu_data(cpu)->data_offset may contain garbage, some buggy code may do 
> random things without notice. If we initialize data_offset so that the 
> per_cpu() data sits in an unmapped memory area, we should get page faults and 
> stack traces should help us find the bugs.
> 
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>

With that patch sched_init gives an early exception. I don't think
we can fix up all cases before 2.6.16, but it's dangerous
to let it reference freed memory.

I think the best course of action for this now for 2.6.16 is:

- mark percpu init data not __init
(this way it will still reference valid memory, although shared between
all impossible CPUs)
- keep the impossible CPUs per cpu data to point to the original reference  
version (== offset 0)

For 2.6.17 all the occurrences of NR_CPUS should be audited
and fixed up like you started in your earlier patch.

Like the attached patch.

-Andi


[-- Attachment #2: impossible-per-cpu-data-workaround --]
[-- Type: text/x-diff, Size: 2034 bytes --]

Let impossible CPUs point to reference per cpu data

Hack for 2.6.16. In 2.6.17 all code that uses NR_CPUs should
be audited and changed to only touch possible CPUs.

Don't mark the reference per cpu data init data (so it stays
around after boot) and point all impossible CPUs to it. This way
they reference some valid - although shared memory. Usually
this is only initialization like INIT_LIST_HEADs and there
won't be races because these CPUs never run. Still somewhat hackish.

Signed-off-by: Andi Kleen <ak@suse.de>

Index: linux/arch/x86_64/kernel/setup64.c
===================================================================
--- linux.orig/arch/x86_64/kernel/setup64.c
+++ linux/arch/x86_64/kernel/setup64.c
@@ -99,8 +99,15 @@ void __init setup_per_cpu_areas(void)
 		size = PERCPU_ENOUGH_ROOM;
 #endif
 
-	for_each_cpu_mask (i, cpu_possible_map) {
+	for (i = 0; i < NR_CPUS; i++) { 
 		char *ptr;
+		
+		/* Later set this to a unmapped area, but first 
+		   need to clean up NR_CPUS usage everywhere */
+		if (!cpu_possible(i)) {	
+			/* Point the the original reference data */
+			cpu_pda(i)->data_offset = 0;
+		}
 
 		if (!NODE_DATA(cpu_to_node(i))) {
 			printk("cpu with no node %d, num_online_nodes %d\n",
Index: linux/arch/x86_64/kernel/vmlinux.lds.S
===================================================================
--- linux.orig/arch/x86_64/kernel/vmlinux.lds.S
+++ linux/arch/x86_64/kernel/vmlinux.lds.S
@@ -172,13 +172,16 @@ SECTIONS
   . = ALIGN(4096);
   __initramfs_start = .;
   .init.ramfs : AT(ADDR(.init.ramfs) - LOAD_OFFSET) { *(.init.ramfs) }
-  __initramfs_end = .;	
+ /* temporary here to work around NR_CPUS. If you see this comment in 2.6.17+
+   complain */ 
+ __initramfs_end = .;	
+  . = ALIGN(4096);
+  __init_end = .;	
   . = ALIGN(32);
   __per_cpu_start = .;
   .data.percpu  : AT(ADDR(.data.percpu) - LOAD_OFFSET) { *(.data.percpu) }
   __per_cpu_end = .;
-  . = ALIGN(4096);
-  __init_end = .;
+  /* __init_end / ALIGN(4096) should be here */
 
   . = ALIGN(4096);
   __nosave_begin = .;

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] garbage values in file /proc/net/sockstat
  2006-01-25 13:31           ` Andi Kleen
@ 2006-01-25 19:59             ` Ravikiran G Thirumalai
  2006-01-25 20:47               ` Ravikiran G Thirumalai
  2006-01-26  0:32               ` Andi Kleen
  0 siblings, 2 replies; 13+ messages in thread
From: Ravikiran G Thirumalai @ 2006-01-25 19:59 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Eric Dumazet, pravin shelar, Shai Fultheim, linux-kernel,
	David S. Miller

On Wed, Jan 25, 2006 at 02:31:15PM +0100, Andi Kleen wrote:
> On Monday 23 January 2006 17:46, Eric Dumazet wrote:
> 
> I think the best course of action for this now for 2.6.16 is:
> 
> - mark percpu init data not __init
> (this way it will still reference valid memory, although shared between
> all impossible CPUs)
> - keep the impossible CPUs per cpu data to point to the original reference  
> version (== offset 0)
> 

How about doing the above using a debug config option? So that when the
config option is turned on, all per-cpu area references to not possible 
cpus crash? and leave that option default on on -mm :).  That way we can 
quickly catch all references.  We can probably change the arch independent 
setup_per_cpu_areas also to do allocations for cpu_possible cpus only while 
we are at it?

Kiran

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] garbage values in file /proc/net/sockstat
  2006-01-25 19:59             ` Ravikiran G Thirumalai
@ 2006-01-25 20:47               ` Ravikiran G Thirumalai
  2006-01-26  0:32               ` Andi Kleen
  1 sibling, 0 replies; 13+ messages in thread
From: Ravikiran G Thirumalai @ 2006-01-25 20:47 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Eric Dumazet, pravin shelar, Shai Fultheim, linux-kernel,
	David S. Miller

On Wed, Jan 25, 2006 at 11:59:46AM -0800, Ravikiran G Thirumalai wrote:
> 
> How about doing the above using a debug config option? So that when the
> config option is turned on, all per-cpu area references to not possible 
> cpus crash? and leave that option default on on -mm :).  That way we can 
> quickly catch all references.  We can probably change the arch independent 
> setup_per_cpu_areas also to do allocations for cpu_possible cpus only while 
> we are at it?

Ahh! just realised mm3 is already out with the Eric Dumazet's patch.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Red zones (was: [PATCH] garbage values in file /proc/net/sockstat)
  2006-01-23 13:28   ` Eric Dumazet
  2006-01-23 15:11     ` Andi Kleen
@ 2006-01-25 21:45     ` Bernd Eckenfels
  2006-01-26  5:28       ` Red zones Eric Dumazet
  1 sibling, 1 reply; 13+ messages in thread
From: Bernd Eckenfels @ 2006-01-25 21:45 UTC (permalink / raw)
  To: linux-kernel

Eric Dumazet <dada1@cosmosbay.com> wrote:
> We can use a red zone big enough to hold the whole per_cpu data.

I am trying to learn a bit here: why is it required to have a speciel red
zone for this case? Wouldnt it make more sence to have a single red zone
which can be used by all locations in the kernel for unused structures? That
would reduce the number of wasted segements in the page table, or?

Gruss
Bernd

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] garbage values in file /proc/net/sockstat
  2006-01-25 19:59             ` Ravikiran G Thirumalai
  2006-01-25 20:47               ` Ravikiran G Thirumalai
@ 2006-01-26  0:32               ` Andi Kleen
  1 sibling, 0 replies; 13+ messages in thread
From: Andi Kleen @ 2006-01-26  0:32 UTC (permalink / raw)
  To: Ravikiran G Thirumalai
  Cc: Eric Dumazet, pravin shelar, Shai Fultheim, linux-kernel,
	David S. Miller

On Wednesday 25 January 2006 20:59, Ravikiran G Thirumalai wrote:
> On Wed, Jan 25, 2006 at 02:31:15PM +0100, Andi Kleen wrote:
> > On Monday 23 January 2006 17:46, Eric Dumazet wrote:
> > 
> > I think the best course of action for this now for 2.6.16 is:
> > 
> > - mark percpu init data not __init
> > (this way it will still reference valid memory, although shared between
> > all impossible CPUs)
> > - keep the impossible CPUs per cpu data to point to the original reference  
> > version (== offset 0)
> > 
> 
> How about doing the above using a debug config option? So that when the
> config option is turned on, all per-cpu area references to not possible 
> cpus crash? and leave that option default on on -mm :)

In -mm* we could just apply Eric's patch and then someone should just
grep the tree for NR_CPUS and audit all users - that should
catch basically all occurrences. I can put it onto my todo list,
but I don't know when I'll get to it so it would be nice if someone
else could do this.

For 2.6.16 I think it's best to go forward with my hack.

> .  That way we can  
> quickly catch all references.  We can probably change the arch independent 
> setup_per_cpu_areas also to do allocations for cpu_possible cpus only while 
> we are at it?

Eric did that already.

-Andi


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Red zones
  2006-01-25 21:45     ` Red zones (was: [PATCH] garbage values in file /proc/net/sockstat) Bernd Eckenfels
@ 2006-01-26  5:28       ` Eric Dumazet
  2006-01-26 10:07         ` Bernd Eckenfels
  0 siblings, 1 reply; 13+ messages in thread
From: Eric Dumazet @ 2006-01-26  5:28 UTC (permalink / raw)
  To: Bernd Eckenfels; +Cc: linux-kernel

Bernd Eckenfels a écrit :
> Eric Dumazet <dada1@cosmosbay.com> wrote:
>> We can use a red zone big enough to hold the whole per_cpu data.
> 
> I am trying to learn a bit here: why is it required to have a speciel red
> zone for this case? Wouldnt it make more sence to have a single red zone
> which can be used by all locations in the kernel for unused structures? That
> would reduce the number of wasted segements in the page table, or?
>

On x86_64, available virtual space is huge, so having different red zones can 
spot the fault more easily : If the target of the fault is in the PER_CPU 
redzone given range, we can instantly knows there is still a per_cpu() user 
accessing a non possible cpu area. As the red zone is not mapped at all, no 
page table is setup.


On 32 bits platforms, this is completely different : space is scarse (typical 
User/Kernel split of 3GB/1GB), so we should avoid to reserve even a 32 KB 
redzone. We could do it in DEBUG mode for example. Current interim patch in 
2.6.16-rc1-mm3 is using NULL pointer but this is not a perfect solution since 
the underlying current user process can perfectly map something in this 'zone'.

Eric



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Red zones
  2006-01-26  5:28       ` Red zones Eric Dumazet
@ 2006-01-26 10:07         ` Bernd Eckenfels
  0 siblings, 0 replies; 13+ messages in thread
From: Bernd Eckenfels @ 2006-01-26 10:07 UTC (permalink / raw)
  To: linux-kernel

Eric Dumazet <dada1@cosmosbay.com> wrote:
> On x86_64, available virtual space is huge, so having different red zones can 
> spot the fault more easily : If the target of the fault is in the PER_CPU 
> redzone given range, we can instantly knows there is still a per_cpu() user 
> accessing a non possible cpu area. As the red zone is not mapped at all, no 
> page table is setup.

Ok, however you can also tell from the stack trace who accessed the red zone, right?

Gruss
Bernd

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2006-01-26 10:07 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-01-23 11:21 [PATCH] garbage values in file /proc/net/sockstat pravin shelar
2006-01-23 11:24 ` Andi Kleen
2006-01-23 13:28   ` Eric Dumazet
2006-01-23 15:11     ` Andi Kleen
2006-01-23 16:28       ` Eric Dumazet
2006-01-23 16:46         ` Eric Dumazet
2006-01-25 13:31           ` Andi Kleen
2006-01-25 19:59             ` Ravikiran G Thirumalai
2006-01-25 20:47               ` Ravikiran G Thirumalai
2006-01-26  0:32               ` Andi Kleen
2006-01-25 21:45     ` Red zones (was: [PATCH] garbage values in file /proc/net/sockstat) Bernd Eckenfels
2006-01-26  5:28       ` Red zones Eric Dumazet
2006-01-26 10:07         ` Bernd Eckenfels

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).