* [PATCH] Fix special PTE code for secondary hash bucket
@ 2007-08-03  8:58 Paul Mackerras
  2007-08-03 11:51 ` Benjamin Herrenschmidt
  2007-08-03 19:32 ` Page faults blowing up ... [was " Linas Vepstas
  0 siblings, 2 replies; 6+ messages in thread
From: Paul Mackerras @ 2007-08-03  8:58 UTC (permalink / raw)
  To: linuxppc-dev

The code for mapping special 4k pages on kernels using a 64kB base
page size was missing the code for doing the RPN (real page number)
manipulation when inserting the hardware PTE in the secondary hash
bucket.  It needs the same code as has already been added to the
code that inserts the HPTE in the primary hash bucket.  This adds it.

Spotted by Ben Herrenschmidt.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 4762ff7..35eabfb 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -472,10 +472,12 @@ _GLOBAL(htab_call_hpte_insert1)
 	/* Now try secondary slot */
 
 	/* real page number in r5, PTE RPN value + index */
-	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
+	andis.	r0,r31,_PAGE_4K_PFN@h
+	srdi	r5,r31,PTE_RPN_SHIFT
+	bne-	3f
 	sldi	r5,r5,PAGE_SHIFT-HW_PAGE_SHIFT
 	add	r5,r5,r25
-	sldi	r5,r5,HW_PAGE_SHIFT
+3:	sldi	r5,r5,HW_PAGE_SHIFT
 
 	/* Calculate secondary group hash */
 	andc	r0,r27,r28
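
For readers who do not read PowerPC assembly, the patched sequence can be sketched in C roughly as follows. This is an illustrative paraphrase, not kernel code: the function name hpte_real_address is hypothetical, PAGE_4K_PFN stands in for the kernel's _PAGE_4K_PFN, and the numeric values chosen for PTE_RPN_SHIFT and the flag bit are assumptions. Only the 64kB/4kB page sizes come from the patch description.

```c
#include <stdint.h>

/* From the patch description: 64kB base pages on top of 4kB hardware pages. */
#define PAGE_SHIFT     16
#define HW_PAGE_SHIFT  12

/* ASSUMED values, for illustration only; the real definitions live in the
 * kernel's page table headers. */
#define PTE_RPN_SHIFT  32
#define PAGE_4K_PFN    0x20000000ULL  /* stands in for _PAGE_4K_PFN */

/*
 * Hypothetical C equivalent of the patched assembly.
 * pte   - the Linux PTE value (r31 in the assembly)
 * index - which 4kB sub-page of the 64kB page is being mapped (r25)
 * Returns the real address handed to the HPTE insert (r5).
 */
uint64_t hpte_real_address(uint64_t pte, uint64_t index)
{
    uint64_t rpn = pte >> PTE_RPN_SHIFT;            /* srdi r5,r31,PTE_RPN_SHIFT */

    if (!(pte & PAGE_4K_PFN)) {                     /* andis. / bne- 3f */
        /* Normal 64kB page: scale the RPN to 4kB units, then add the index. */
        rpn <<= PAGE_SHIFT - HW_PAGE_SHIFT;         /* sldi r5,r5,PAGE_SHIFT-HW_PAGE_SHIFT */
        rpn += index;                               /* add r5,r5,r25 */
    }
    /* Special 4k page: the RPN is already a 4kB frame number, so it is
     * used as is and the index is not added. */
    return rpn << HW_PAGE_SHIFT;                    /* 3: sldi r5,r5,HW_PAGE_SHIFT */
}
```

Before the patch, the secondary-bucket path always took the scaling branch, so a special 4k PTE would presumably have been inserted with the wrong real address.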


* Re: [PATCH] Fix special PTE code for secondary hash bucket
  2007-08-03  8:58 [PATCH] Fix special PTE code for secondary hash bucket Paul Mackerras
@ 2007-08-03 11:51 ` Benjamin Herrenschmidt
  2007-08-03 19:32 ` Page faults blowing up ... [was " Linas Vepstas
  1 sibling, 0 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2007-08-03 11:51 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev

On Fri, 2007-08-03 at 18:58 +1000, Paul Mackerras wrote:
> The code for mapping special 4k pages on kernels using a 64kB base
> page size was missing the code for doing the RPN (real page number)
> manipulation when inserting the hardware PTE in the secondary hash
> bucket.  It needs the same code as has already been added to the
> code that inserts the HPTE in the primary hash bucket.  This adds it.
> 
> Spotted by Ben Herrenschmidt.
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

> ---
> diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
> index 4762ff7..35eabfb 100644
> --- a/arch/powerpc/mm/hash_low_64.S
> +++ b/arch/powerpc/mm/hash_low_64.S
> @@ -472,10 +472,12 @@ _GLOBAL(htab_call_hpte_insert1)
>  	/* Now try secondary slot */
>  
>  	/* real page number in r5, PTE RPN value + index */
> -	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
> +	andis.	r0,r31,_PAGE_4K_PFN@h
> +	srdi	r5,r31,PTE_RPN_SHIFT
> +	bne-	3f
>  	sldi	r5,r5,PAGE_SHIFT-HW_PAGE_SHIFT
>  	add	r5,r5,r25
> -	sldi	r5,r5,HW_PAGE_SHIFT
> +3:	sldi	r5,r5,HW_PAGE_SHIFT
>  
>  	/* Calculate secondary group hash */
>  	andc	r0,r27,r28


* Page faults blowing up ... [was Re: [PATCH] Fix special PTE code for secondary hash bucket
  2007-08-03  8:58 [PATCH] Fix special PTE code for secondary hash bucket Paul Mackerras
  2007-08-03 11:51 ` Benjamin Herrenschmidt
@ 2007-08-03 19:32 ` Linas Vepstas
  2007-08-03 21:54   ` Mike Strosaker
  2007-08-04  2:31   ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 6+ messages in thread
From: Linas Vepstas @ 2007-08-03 19:32 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev, benh

On Fri, Aug 03, 2007 at 06:58:51PM +1000, Paul Mackerras wrote:
> The code for mapping special 4k pages on kernels using a 64kB base
> page size was missing the code for doing the RPN (real page number)
> manipulation when inserting the hardware PTE in the secondary hash
> bucket.  It needs the same code as has already been added to the
> code that inserts the HPTE in the primary hash bucket.  This adds it.

So what are the symptoms of hitting this? Does this affect only 
recent kernels, or old ones too?

I'm hitting the craziest bug I've seen in a while. I get a
corrupted value in a register: 0x80000000077b21e0, which sure looks
like an address with 0x8... instead of 0xc... and, what is even
stranger, I find that 0xc0000000077b21e0 is pointing at the data
that I *should have had* in the register!  And there's some other
oddball stuff hinting that a page fault handler ran and blew up:

3:mon> d c0000000077b21e0
c0000000077b21e0 e00000008004b224 0674100900000080  |.......$.t......|

Well, howdy doody, there's the value that should have been in r3 ....

c0000000077b21f0 c4008e0000000000 0000000049424d00  |............IBM.|

IBM ???

c0000000077b2200 5048003006000000 0000000000000000  |PH.0............|
c0000000077b2210 0000000000000000 4800000300000000  |........H.......|
c0000000077b2220 0000000000000000 0000000000000000  |................|
c0000000077b2230 5548001806000000 1000400000000000  |UH........@.....|
c0000000077b2240 0000200000000000 4d43002806000000  |.. .....MC.(....|
c0000000077b2250 0000000000000001 00c3000000000000  |................|
c0000000077b2260 e00000008004b224 0000000000000000  |.......$........|
c0000000077b2270 d0000000000d32c0 8000000000101032  |......2........2|

hey .. wait .. d0000000000d32c0 is the faulting address; what's it doing here ???
... and 8000000000101032 is the value of the MSR ... why is that here ??

c0000000077b2280 0000000000000000 0000000000000000  |................|
c0000000077b2290 0000000000000000 0000000000000000  |................|


Any hints or tips appreciated ... btw, I should mention
I'm seeing this exact same bug on both 2.6.9 (RHEL4) and 
on 2.6.16 (SLES10) so... wtf ??? why now ? 

--linas


* Re: Page faults blowing up ... [was Re: [PATCH] Fix special PTE code for secondary hash bucket
  2007-08-03 19:32 ` Page faults blowing up ... [was " Linas Vepstas
@ 2007-08-03 21:54   ` Mike Strosaker
  2007-08-04  2:31   ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 6+ messages in thread
From: Mike Strosaker @ 2007-08-03 21:54 UTC (permalink / raw)
  To: Linas Vepstas; +Cc: linuxppc-dev, Paul Mackerras, benh

Linas Vepstas wrote:
> 3:mon> d c0000000077b21e0
> c0000000077b21e0 e00000008004b224 0674100900000080  |.......$.t......|
> 
> Well, howdy doody, there's the value that should have been in r3 ....
> 
> c0000000077b21f0 c4008e0000000000 0000000049424d00  |............IBM.|
> 
> IBM ???
> 
> c0000000077b2200 5048003006000000 0000000000000000  |PH.0............|
> c0000000077b2210 0000000000000000 4800000300000000  |........H.......|
> c0000000077b2220 0000000000000000 0000000000000000  |................|
> c0000000077b2230 5548001806000000 1000400000000000  |UH........@.....|
> c0000000077b2240 0000200000000000 4d43002806000000  |.. .....MC.(....|
> c0000000077b2250 0000000000000001 00c3000000000000  |................|
> c0000000077b2260 e00000008004b224 0000000000000000  |.......$........|
> c0000000077b2270 d0000000000d32c0 8000000000101032  |......2........2|
> 
> hey .. wait .. d0000000000d32c0 is the faulting address; what's it doing here ???
> ... and 8000000000101032 is the value of the MSR ... why is that here ??

That looks like part of an RTAS event.  PH indicates a "Main A" section, UH a 
"Main B" section, and, probably of most interest to you, MC indicates a "Failing 
Memory Address" section.  The "Error and Event Notification" chapter of the PAPR 
will be useful here.  You can use rtas_dump (in either powerpc-utils or 
ppc64-utils, depending on the distro) to decode the event in its entirety.  A 
quick hand-decode of the MC section yields (might be wrong, you'll want to 
double-check):

Unrecoverable memory error (UE); transient UE, 64-bit effective address provided 
by the log (located at c0000000077b2260), 64-bit logical address of logical page 
is not provided by the log; error detected by load/store unit of the processor.

Mike
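
As a rough illustration of the hand-decode above, the section boundaries can be recovered mechanically from the dump. The sketch below assumes that each section starts with a two-character ASCII ID followed by a 16-bit big-endian length covering the whole section. That layout is inferred from the bytes in the dump (PH, UH, MC followed by 0030, 0018, 0028) and should be checked against the PAPR before relying on it. The names section_info, walk_sections and event_bytes are hypothetical, and the byte array is transcribed from the xmon output starting at c0000000077b2200. It does not decode the section contents; rtas_dump remains the tool for that.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes c0000000077b2200..c0000000077b226f, transcribed from the xmon dump. */
static const uint8_t event_bytes[] = {
    0x50,0x48,0x00,0x30,0x06,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, 0x48,0x00,0x00,0x03,0x00,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
    0x55,0x48,0x00,0x18,0x06,0x00,0x00,0x00, 0x10,0x00,0x40,0x00,0x00,0x00,0x00,0x00,
    0x00,0x00,0x20,0x00,0x00,0x00,0x00,0x00, 0x4d,0x43,0x00,0x28,0x06,0x00,0x00,0x00,
    0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01, 0x00,0xc3,0x00,0x00,0x00,0x00,0x00,0x00,
    0xe0,0x00,0x00,0x00,0x80,0x04,0xb2,0x24, 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,
};

struct section_info {
    char     id[3];    /* two ASCII characters plus a terminating NUL */
    size_t   offset;   /* offset of the section within the buffer */
    uint16_t length;   /* section length in bytes, header included (assumed) */
};

/*
 * Walk the sections in buf, recording at most max of them in out.
 * Stops at the end of the buffer or at the first implausible length.
 * Returns the number of sections recorded.
 */
size_t walk_sections(const uint8_t *buf, size_t len,
                     struct section_info *out, size_t max)
{
    size_t off = 0, n = 0;

    while (n < max && off + 4 <= len) {
        /* The 16-bit length is stored big-endian in bytes 2 and 3. */
        uint16_t seclen = (uint16_t)((buf[off + 2] << 8) | buf[off + 3]);

        if (seclen < 4 || off + seclen > len)
            break;  /* not a plausible section header */

        out[n].id[0]  = (char)buf[off];
        out[n].id[1]  = (char)buf[off + 1];
        out[n].id[2]  = '\0';
        out[n].offset = off;
        out[n].length = seclen;
        n++;
        off += seclen;
    }
    return n;
}
```

Under these assumptions the MC section spans offsets 0x48 to 0x6f, which puts the effective address e00000008004b224 at offset 0x60, that is c0000000077b2260, matching the location given above.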


* Re: Page faults blowing up ... [was Re: [PATCH] Fix special PTE code for secondary hash bucket
  2007-08-03 19:32 ` Page faults blowing up ... [was " Linas Vepstas
  2007-08-03 21:54   ` Mike Strosaker
@ 2007-08-04  2:31   ` Benjamin Herrenschmidt
  2007-08-06 19:19     ` Linas Vepstas
  1 sibling, 1 reply; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2007-08-04  2:31 UTC (permalink / raw)
  To: Linas Vepstas; +Cc: linuxppc-dev, Paul Mackerras, benh

On Fri, 2007-08-03 at 14:32 -0500, Linas Vepstas wrote:
> On Fri, Aug 03, 2007 at 06:58:51PM +1000, Paul Mackerras wrote:
> > The code for mapping special 4k pages on kernels using a 64kB base
> > page size was missing the code for doing the RPN (real page number)
> > manipulation when inserting the hardware PTE in the secondary hash
> > bucket.  It needs the same code as has already been added to the
> > code that inserts the HPTE in the primary hash bucket.  This adds it.
> 
> So what are the symptoms of hitting this? Does this affect only 
> recent kernels, or old ones too?

Paulus' stuff is likely to be unrelated to your bug. Also, whatever blurb
you pasted in this email is totally incomprehensible due to total lack
of context.

Ben.


* Re: Page faults blowing up ... [was Re: [PATCH] Fix special PTE code for secondary hash bucket
  2007-08-04  2:31   ` Benjamin Herrenschmidt
@ 2007-08-06 19:19     ` Linas Vepstas
  0 siblings, 0 replies; 6+ messages in thread
From: Linas Vepstas @ 2007-08-06 19:19 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Paul Mackerras, benh

On Sat, Aug 04, 2007 at 12:31:28PM +1000, Benjamin Herrenschmidt wrote:
> 
> Paulus' stuff is likely to be unrelated to your bug. Also, whatever blurb
> you pasted in this email is totally incomprehensible due to total lack
> of context.

Sorry. Mike Strosaker nailed it; it's a nutty hypervisor bug.

--linas
