linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* panic on bootup due to __GFP_ZERO patch
@ 2005-01-08  9:06 Chris Wright
  2005-01-08 20:00 ` Dave Jones
  2005-01-09  9:45 ` Andrew Morton
  0 siblings, 2 replies; 11+ messages in thread
From: Chris Wright @ 2005-01-08  9:06 UTC (permalink / raw)
  To: clameter; +Cc: linux-kernel

I'm getting a panic during pidmap_init with a backtrace that looks
something like:

buffered_rmqueue
__alloc_pages
get_zeroed_page
pidmap_init
start_kernel

Reverting the __GFP_ZERO patch fixes the issue, haven't drilled down
any deeper yet to see what in the patch is causing the problem.  This is
x86 w/out HIGHMEM (and no NUMA).

Any ideas?

thanks,
-chris

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: panic on bootup due to __GFP_ZERO patch
  2005-01-08  9:06 panic on bootup due to __GFP_ZERO patch Chris Wright
@ 2005-01-08 20:00 ` Dave Jones
  2005-01-09  9:45 ` Andrew Morton
  1 sibling, 0 replies; 11+ messages in thread
From: Dave Jones @ 2005-01-08 20:00 UTC (permalink / raw)
  To: Chris Wright; +Cc: clameter, linux-kernel

On Sat, Jan 08, 2005 at 01:06:30AM -0800, Chris Wright wrote:
 > I'm getting a panic during pidmap_init with a backtrace that looks
 > something like:
 > 
 > buffered_rmqueue
 > __alloc_pages
 > get_zeroed_page
 > pidmap_init
 > start_kernel
 > 
 > Reverting the __GFP_ZERO patch fixes the issue, haven't drilled down
 > any deeper yet to see what in the patch is causing the problem.  This is
 > x86 w/out HIGHMEM (and no NUMA).

ACK, there has been a number of folks hit by this since I updated
the Fedora rawhide kernel to snapshots including this change.

https://bugzilla.redhat.com/beta/show_bug.cgi?id=144480

I've also hit in on my test box that has 256MB.
The pattern so far does seem to be 'no highmem', though
I've not actually tried a recent snapshot on my highmem box.

		Dave


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: panic on bootup due to __GFP_ZERO patch
  2005-01-08  9:06 panic on bootup due to __GFP_ZERO patch Chris Wright
  2005-01-08 20:00 ` Dave Jones
@ 2005-01-09  9:45 ` Andrew Morton
  2005-01-09 15:56   ` [PATCH] Fixes for prep_zero_page Zwane Mwaikambo
  1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2005-01-09  9:45 UTC (permalink / raw)
  To: Chris Wright; +Cc: clameter, linux-kernel

Chris Wright <chrisw@osdl.org> wrote:
>
> I'm getting a panic during pidmap_init with a backtrace that looks
> something like:
> 
> buffered_rmqueue
> __alloc_pages
> get_zeroed_page
> pidmap_init
> start_kernel
> 
> Reverting the __GFP_ZERO patch fixes the issue, haven't drilled down
> any deeper yet to see what in the patch is causing the problem.  This is
> x86 w/out HIGHMEM (and no NUMA).
> 

Well it's doing clear_highpage() before __alloc_pages() has called
kernel_map_pages(), so CONFIG_DEBUG_PAGEALLOC is quite kaput.

So the current __GFP_ZERO buglist is:

1: Breaks CONFIG_DEBUG_PAGEALLOC

2: Breaks the cache aliasing protection for anonymous pages

3: prep_zero_page() uses KM_USER0 so __GFP_ZERO from IRQ context will
   cause rare memory corruption.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] Fixes for prep_zero_page
  2005-01-09  9:45 ` Andrew Morton
@ 2005-01-09 15:56   ` Zwane Mwaikambo
  2005-01-09 20:52     ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Zwane Mwaikambo @ 2005-01-09 15:56 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Chris Wright, clameter, linux-kernel

On Sun, 9 Jan 2005, Andrew Morton wrote:

> Well it's doing clear_highpage() before __alloc_pages() has called
> kernel_map_pages(), so CONFIG_DEBUG_PAGEALLOC is quite kaput.
> 
> So the current __GFP_ZERO buglist is:
> 
> 1: Breaks CONFIG_DEBUG_PAGEALLOC
> 
> 2: Breaks the cache aliasing protection for anonymous pages
> 
> 3: prep_zero_page() uses KM_USER0 so __GFP_ZERO from IRQ context will
>    cause rare memory corruption.

The following should take care of 1 and 3. I opted to unmap the pages 
again after the clear page so that it remains isolated and we don't have 
to make additional checks to see if we should unmap the pages.

Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>

Index: linux-2.6.10-mm2/include/linux/highmem.h
===================================================================
RCS file: /home/cvsroot/linux-2.6.10-mm2/include/linux/highmem.h,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 highmem.h
--- linux-2.6.10-mm2/include/linux/highmem.h	9 Jan 2005 04:51:52 -0000	1.1.1.1
+++ linux-2.6.10-mm2/include/linux/highmem.h	9 Jan 2005 15:32:17 -0000
@@ -50,6 +50,18 @@ static inline void clear_highpage(struct
 	kunmap_atomic(kaddr, KM_USER0);
 }
 
+static inline void clear_irq_highpage(struct page *page)
+{
+	char *kaddr;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	kaddr = kmap_atomic(page, KM_IRQ0);
+	clear_page(kaddr);
+	kunmap_atomic(kaddr, KM_IRQ0);
+	local_irq_restore(flags);
+}
+
 /*
  * Same but also flushes aliased cache contents to RAM.
  */
Index: linux-2.6.10-mm2/mm/page_alloc.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.10-mm2/mm/page_alloc.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 page_alloc.c
--- linux-2.6.10-mm2/mm/page_alloc.c	9 Jan 2005 04:52:40 -0000	1.1.1.1
+++ linux-2.6.10-mm2/mm/page_alloc.c	9 Jan 2005 15:46:00 -0000
@@ -691,10 +691,17 @@ perthread_pages_alloc(void)
  */
 static inline void prep_zero_page(struct page *page, int order)
 {
-	int i;
+	int i, nr_pages = (1 << order);
+	int context = in_interrupt();
 
-	for(i = 0; i < (1 << order); i++)
-		clear_highpage(page + i);
+	kernel_map_pages(page, nr_pages, 1);
+	for(i = 0; i < nr_pages; i++) {
+		if (likely(!context))
+			clear_highpage(page + i);
+		else
+			clear_irq_highpage(page + i);
+	}
+	kernel_map_pages(page, nr_pages, 0);
 }
 
 static struct page *

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fixes for prep_zero_page
  2005-01-09 15:56   ` [PATCH] Fixes for prep_zero_page Zwane Mwaikambo
@ 2005-01-09 20:52     ` Andrew Morton
  2005-01-09 21:32       ` Zwane Mwaikambo
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2005-01-09 20:52 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: chrisw, clameter, linux-kernel

Zwane Mwaikambo <zwane@arm.linux.org.uk> wrote:
>
> On Sun, 9 Jan 2005, Andrew Morton wrote:
> 
> > Well it's doing clear_highpage() before __alloc_pages() has called
> > kernel_map_pages(), so CONFIG_DEBUG_PAGEALLOC is quite kaput.
> > 
> > So the current __GFP_ZERO buglist is:
> > 
> > 1: Breaks CONFIG_DEBUG_PAGEALLOC
> > 
> > 2: Breaks the cache aliasing protection for anonymous pages
> > 
> > 3: prep_zero_page() uses KM_USER0 so __GFP_ZERO from IRQ context will
> >    cause rare memory corruption.
> 
> The following should take care of 1 and 3. I opted to unmap the pages 
> again after the clear page so that it remains isolated and we don't have 
> to make additional checks to see if we should unmap the pages.
> 
> Signed-off-by: Zwane Mwaikambo <zwane@arm.linux.org.uk>
> 
> Index: linux-2.6.10-mm2/include/linux/highmem.h
> ===================================================================
> RCS file: /home/cvsroot/linux-2.6.10-mm2/include/linux/highmem.h,v
> retrieving revision 1.1.1.1
> diff -u -p -B -r1.1.1.1 highmem.h
> --- linux-2.6.10-mm2/include/linux/highmem.h	9 Jan 2005 04:51:52 -0000	1.1.1.1
> +++ linux-2.6.10-mm2/include/linux/highmem.h	9 Jan 2005 15:32:17 -0000
> @@ -50,6 +50,18 @@ static inline void clear_highpage(struct
>  	kunmap_atomic(kaddr, KM_USER0);
>  }
>  
> +static inline void clear_irq_highpage(struct page *page)
> +{
> +	char *kaddr;
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	kaddr = kmap_atomic(page, KM_IRQ0);
> +	clear_page(kaddr);
> +	kunmap_atomic(kaddr, KM_IRQ0);
> +	local_irq_restore(flags);
> +}
> +

This won't work right if someone tries to allocate memory while holding a
KM_IRQ0 atomic kmap.

It would be quite bizarre for anyone to be allocating highmem pages from
IRQ context anyway, but as a generic mechanism this really should work as
expected in all contexts.  That means a new kmap slot.

>  /*
>   * Same but also flushes aliased cache contents to RAM.
>   */
> Index: linux-2.6.10-mm2/mm/page_alloc.c
> ===================================================================
> RCS file: /home/cvsroot/linux-2.6.10-mm2/mm/page_alloc.c,v
> retrieving revision 1.1.1.1
> diff -u -p -B -r1.1.1.1 page_alloc.c
> --- linux-2.6.10-mm2/mm/page_alloc.c	9 Jan 2005 04:52:40 -0000	1.1.1.1
> +++ linux-2.6.10-mm2/mm/page_alloc.c	9 Jan 2005 15:46:00 -0000
> @@ -691,10 +691,17 @@ perthread_pages_alloc(void)
>   */
>  static inline void prep_zero_page(struct page *page, int order)
>  {
> -	int i;
> +	int i, nr_pages = (1 << order);
> +	int context = in_interrupt();
>  
> -	for(i = 0; i < (1 << order); i++)
> -		clear_highpage(page + i);
> +	kernel_map_pages(page, nr_pages, 1);
> +	for(i = 0; i < nr_pages; i++) {
> +		if (likely(!context))
> +			clear_highpage(page + i);
> +		else
> +			clear_irq_highpage(page + i);
> +	}
> +	kernel_map_pages(page, nr_pages, 0);
>  }

Can't we simply move the page zeroing to the very end of __alloc_pages()?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fixes for prep_zero_page
  2005-01-09 20:52     ` Andrew Morton
@ 2005-01-09 21:32       ` Zwane Mwaikambo
  2005-01-09 22:48         ` Chris Wright
  0 siblings, 1 reply; 11+ messages in thread
From: Zwane Mwaikambo @ 2005-01-09 21:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Chris Wright, clameter, Linux Kernel

On Sun, 9 Jan 2005, Andrew Morton wrote:

> This won't work right if someone tries to allocate memory while holding a
> KM_IRQ0 atomic kmap.
> 
> It would be quite bizarre for anyone to be allocating highmem pages from
> IRQ context anyway, but as a generic mechanism this really should work as
> expected in all contexts.  That means a new kmap slot.

I was trying to find users of nested KM_IRQ to verify against so i just 
went with the first slot. The problem with sharing a slot with irq context 
is that you have to exclude interrupts whilst doing the zeroing too, 
unless we maybe create two slots.

> Can't we simply move the page zeroing to the very end of __alloc_pages()?

Ok, i've changed that bit to something like;

Index: linux-2.6.10-mm2/mm/page_alloc.c
===================================================================
RCS file: /home/cvsroot/linux-2.6.10-mm2/mm/page_alloc.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 page_alloc.c
--- linux-2.6.10-mm2/mm/page_alloc.c	9 Jan 2005 04:52:40 -0000	1.1.1.1
+++ linux-2.6.10-mm2/mm/page_alloc.c	9 Jan 2005 21:23:31 -0000
@@ -732,9 +740,6 @@ buffered_rmqueue(struct zone *zone, int 
 		mod_page_state_zone(zone, pgalloc, 1 << order);
 		prep_new_page(page, order);
 
-		if (gfp_flags & __GFP_ZERO)
-			prep_zero_page(page, order);
-
 		if (order && (gfp_flags & __GFP_COMP))
 			prep_compound_page(page, order);
 	}
@@ -935,6 +940,8 @@ nopage:
 got_pg:
 	zone_statistics(zonelist, z);
 	kernel_map_pages(page, 1 << order, 1);
+	if (gfp_mask & __GFP_ZERO)
+		prep_zero_page(page, order);
 	return page;
 }
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fixes for prep_zero_page
  2005-01-09 21:32       ` Zwane Mwaikambo
@ 2005-01-09 22:48         ` Chris Wright
  2005-01-10  4:18           ` Zwane Mwaikambo
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wright @ 2005-01-09 22:48 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Andrew Morton, Chris Wright, clameter, Linux Kernel

* Zwane Mwaikambo (zwane@arm.linux.org.uk) wrote:
> On Sun, 9 Jan 2005, Andrew Morton wrote:
> > Can't we simply move the page zeroing to the very end of __alloc_pages()?
> 
> Ok, i've changed that bit to something like;

I did it the other way around, and moved kernel_map_pages to prep_new_page
so it's called before zeroing to keep that with the other prep bits
in buffered_rmqueue.  Made sense to me that kernel_map_pages is part of
prepping a new page, but this isn't my area, so I could be way off ;-)
It works for me with DEBUG_PAGEALLOC enabled.

===== mm/page_alloc.c 1.251 vs edited =====
--- 1.251/mm/page_alloc.c	2005-01-07 21:44:07 -08:00
+++ edited/mm/page_alloc.c	2005-01-09 14:36:38 -08:00
@@ -413,6 +413,7 @@ static void prep_new_page(struct page *p
 			1 << PG_checked | 1 << PG_mappedtodisk);
 	page->private = 0;
 	set_page_refs(page, order);
+	kernel_map_pages(page, 1 << order, 1);
 }
 
 /* 
@@ -823,7 +824,6 @@ nopage:
 	return NULL;
 got_pg:
 	zone_statistics(zonelist, z);
-	kernel_map_pages(page, 1 << order, 1);
 	return page;
 }
 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fixes for prep_zero_page
  2005-01-09 22:48         ` Chris Wright
@ 2005-01-10  4:18           ` Zwane Mwaikambo
  2005-01-13  5:05             ` Zwane Mwaikambo
  0 siblings, 1 reply; 11+ messages in thread
From: Zwane Mwaikambo @ 2005-01-10  4:18 UTC (permalink / raw)
  To: Chris Wright; +Cc: Andrew Morton, clameter, Linux Kernel

On Sun, 9 Jan 2005, Chris Wright wrote:

> * Zwane Mwaikambo (zwane@arm.linux.org.uk) wrote:
> > On Sun, 9 Jan 2005, Andrew Morton wrote:
> > > Can't we simply move the page zeroing to the very end of __alloc_pages()?
> > 
> > Ok, i've changed that bit to something like;
> 
> I did it the other way around, and moved kernel_map_pages to prep_new_page
> so it's called before zeroing to keep that with the other prep bits
> in buffered_rmqueue.  Made sense to me that kernel_map_pages is part of
> prepping a new page, but this isn't my area, so I could be way off ;-)
> It works for me with DEBUG_PAGEALLOC enabled.

A lot more digestible than my offering ;)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fixes for prep_zero_page
  2005-01-10  4:18           ` Zwane Mwaikambo
@ 2005-01-13  5:05             ` Zwane Mwaikambo
  2005-01-13 18:24               ` Chris Wright
  0 siblings, 1 reply; 11+ messages in thread
From: Zwane Mwaikambo @ 2005-01-13  5:05 UTC (permalink / raw)
  To: Chris Wright; +Cc: Andrew Morton, clameter, Linux Kernel

It looks like it's still not happy with CONFIG_DEBUG_PAGEALLOC under load.

Unable to handle kernel paging request at virtual address ec5d97f4
 printing eip:
c014a882
*pde = 0083e067
Oops: 0000 [#1]
PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
CPU:    0
EIP:    0060:[<c014a882>]    Not tainted VLI
EFLAGS: 00010002   (2.6.10-mm2)
EIP is at check_slabuse+0x52/0xf0
eax: ec5d97f4   ebx: c18dd180   ecx: ec5b685c   edx: ec5d9000
esi: c18dd180   edi: ec5d9000   ebp: c19d1efc   esp: c19d1ed4
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 6, threadinfo=c19d0000 task=c19abac0)
Stack: c19d1efc c0149c64 c19d1f0c c0149b0e c19d1ef0 00000000 ec5b685c ec5b685c
       c18dd180 c18dd1a0 c19d1f24 c014aa19 c18dd2a0 c19d1f24 c014acec 00000000
       c18dd180 c18dd2a0 c18dd180 00000005 c19d1f48 c014af91 c18dd2c8 c18dd2a0
Call Trace:
 [<c0103fda>] show_stack+0x7a/0x90
 [<c0104166>] show_registers+0x156/0x1c0
 [<c0104380>] die+0x100/0x190
 [<c0117159>] do_page_fault+0x369/0x65f
 [<c0103c6f>] error_code+0x2b/0x30
 [<c014aa19>] check_redzone+0xf9/0x140
 [<c014af91>] cache_reap+0x221/0x230
 [<c0130b2b>] worker_thread+0x17b/0x210
 [<c0135605>] kthread+0xa5/0xb0
 [<c0101415>] kernel_thread_helper+0x5/0x10
Code: 00 00 8d bc 27 00 00 00 00 83 c4 1c 5b 5e 5f 5d c3 8b 43 38 8b 7d ec 
8b 4d f0 0f af f8 8b 410c 01 c7 89 fa 89 d8 e8 ee d3 ff ff <8b> 30 89 fa 
89 d8 e8 13 d4 ff ff 81 fe 71 f0 2c 5a 8b 00 74 7b
 <3>Debug: sleeping function called from invalid context at 
include/linux/rwsem.h:43
in_atomic():1, irqs_disabled():0
 [<c0104007>] dump_stack+0x17/0x20
 [<c011cb7c>] __might_sleep+0xac/0xc0
 [<c0120753>] profile_task_exit+0x23/0x60
 [<c012297c>] do_exit+0x1c/0x440
 [<c0104408>] die+0x188/0x190
 [<c0117159>] do_page_fault+0x369/0x65f
 [<c0103c6f>] error_code+0x2b/0x30
 [<c014aa19>] check_redzone+0xf9/0x140
 [<c014af91>] cache_reap+0x221/0x230
 [<c0130b2b>] worker_thread+0x17b/0x210
 [<c0135605>] kthread+0xa5/0xb0
 [<c0101415>] kernel_thread_helper+0x5/0x10
note: events/0[6] exited with preempt_count 1

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fixes for prep_zero_page
  2005-01-13  5:05             ` Zwane Mwaikambo
@ 2005-01-13 18:24               ` Chris Wright
  2005-01-14  0:50                 ` Zwane Mwaikambo
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Wright @ 2005-01-13 18:24 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Chris Wright, Andrew Morton, clameter, Linux Kernel

* Zwane Mwaikambo (zwane@arm.linux.org.uk) wrote:
> It looks like it's still not happy with CONFIG_DEBUG_PAGEALLOC under load.
> 
> Unable to handle kernel paging request at virtual address ec5d97f4

Is that in vmalloc space?

>  printing eip:
> c014a882
> *pde = 0083e067
> Oops: 0000 [#1]
> PREEMPT SMP DEBUG_PAGEALLOC
> Modules linked in:
> CPU:    0
> EIP:    0060:[<c014a882>]    Not tainted VLI
> EFLAGS: 00010002   (2.6.10-mm2)
> EIP is at check_slabuse+0x52/0xf0

Hmm, isn't that from Manfred's patch to periodically scan?  Doesn't look
to me like it's related to the page prep fixup.  What kind of load, etc?

thanks,
-chris

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fixes for prep_zero_page
  2005-01-13 18:24               ` Chris Wright
@ 2005-01-14  0:50                 ` Zwane Mwaikambo
  0 siblings, 0 replies; 11+ messages in thread
From: Zwane Mwaikambo @ 2005-01-14  0:50 UTC (permalink / raw)
  To: Chris Wright; +Cc: Andrew Morton, clameter, Linux Kernel

On Thu, 13 Jan 2005, Chris Wright wrote:

> * Zwane Mwaikambo (zwane@arm.linux.org.uk) wrote:
> > It looks like it's still not happy with CONFIG_DEBUG_PAGEALLOC under load.
> > 
> > Unable to handle kernel paging request at virtual address ec5d97f4
> 
> Is that in vmalloc space?

Hmm it looks like.

> >  printing eip:
> > c014a882
> > *pde = 0083e067
> > Oops: 0000 [#1]
> > PREEMPT SMP DEBUG_PAGEALLOC
> > Modules linked in:
> > CPU:    0
> > EIP:    0060:[<c014a882>]    Not tainted VLI
> > EFLAGS: 00010002   (2.6.10-mm2)
> > EIP is at check_slabuse+0x52/0xf0
> 
> Hmm, isn't that from Manfred's patch to periodically scan?  Doesn't look
> to me like it's related to the page prep fixup.  What kind of load, etc?

I had an email exchange with Christopher and we came to the conclusion 
that it's breakage elsewhere with CONFIG_DEBUG_PAGEALLOC.


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-01-14  0:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-01-08  9:06 panic on bootup due to __GFP_ZERO patch Chris Wright
2005-01-08 20:00 ` Dave Jones
2005-01-09  9:45 ` Andrew Morton
2005-01-09 15:56   ` [PATCH] Fixes for prep_zero_page Zwane Mwaikambo
2005-01-09 20:52     ` Andrew Morton
2005-01-09 21:32       ` Zwane Mwaikambo
2005-01-09 22:48         ` Chris Wright
2005-01-10  4:18           ` Zwane Mwaikambo
2005-01-13  5:05             ` Zwane Mwaikambo
2005-01-13 18:24               ` Chris Wright
2005-01-14  0:50                 ` Zwane Mwaikambo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).