linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
@ 2009-10-05 12:16 Joakim Tjernlund
  2009-10-05 12:16 ` [PATCH 1/6] 8xx: DTLB Error must check for more errors Joakim Tjernlund
                   ` (2 more replies)
  0 siblings, 3 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 12:16 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, linuxppc-dev, Benjamin Herrenschmidt

Here are my latest code to fixup 8xx's TLB code.
This code needs some serious testing before it
can be commited.
The "8xx, fault: Add some debug code to do_page_fault()" is
purely a debug check and will be removed/disabled when
this series appear stable.

Scott and Rex, please disregard other patches from me and
try these out instead.

Joakim Tjernlund (6):
  8xx: DTLB Error must check for more errors.
  8xx, fault: Add some debug code to do_page_fault()
  8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  8xx: Tag DAR with 0x00f0 to catch buggy instructions.
  8xx: Fixup DAR from buggy dcbX instructions.
  8xx: start using dcbX instructions in various copy routines

 arch/powerpc/include/asm/pte-8xx.h |    9 +-
 arch/powerpc/kernel/head_8xx.S     |  324 ++++++++++++++++++++++++++++++-----
 arch/powerpc/kernel/misc_32.S      |   18 --
 arch/powerpc/lib/copy_32.S         |   24 ---
 arch/powerpc/mm/fault.c            |   82 +++++++++
 5 files changed, 363 insertions(+), 94 deletions(-)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* [PATCH 1/6] 8xx: DTLB Error must check for more errors.
  2009-10-05 12:16 [PATCH 0/6] PowerPc 8xx TLB/MMU fixes Joakim Tjernlund
@ 2009-10-05 12:16 ` Joakim Tjernlund
  2009-10-05 12:16   ` [PATCH 2/6] 8xx, fault: Add some debug code to do_page_fault() Joakim Tjernlund
  2009-10-05 18:12 ` [PATCH 0/6] PowerPc 8xx TLB/MMU fixes Scott Wood
  2009-10-05 22:04 ` Rex Feany
  2 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 12:16 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, linuxppc-dev, Benjamin Herrenschmidt

DataTLBError currently does:
 if ((err & 0x02000000) == 0)
    DSI();
This won't handle a store with no valid translation.
Change this to
 if ((err & 0x48000000) != 0)
    DSI();
that is, branch to DSI if either !permission or
!translation.
---
 arch/powerpc/kernel/head_8xx.S |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 52ff8c5..118bb05 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -472,8 +472,8 @@ DataTLBError:
 	/* First, make sure this was a store operation.
 	*/
 	mfspr	r10, SPRN_DSISR
-	andis.	r11, r10, 0x0200	/* If set, indicates store op */
-	beq	2f
+	andis.	r11, r10, 0x4800 /* no translation, no permission. */
+	bne	2f	/* branch if either is set */
 
 	/* The EA of a data TLB miss is automatically stored in the MD_EPN
 	 * register.  The EA of a data TLB error is automatically stored in
-- 
1.6.4.4

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 2/6] 8xx, fault: Add some debug code to do_page_fault()
  2009-10-05 12:16 ` [PATCH 1/6] 8xx: DTLB Error must check for more errors Joakim Tjernlund
@ 2009-10-05 12:16   ` Joakim Tjernlund
  2009-10-05 12:16     ` [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU Joakim Tjernlund
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 12:16 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, linuxppc-dev, Benjamin Herrenschmidt

---
 arch/powerpc/mm/fault.c |   82 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 82 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 7699394..c33c6de 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -139,6 +139,88 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 #else
 	is_write = error_code & ESR_DST;
 #endif /* CONFIG_4xx || CONFIG_BOOKE */
+#if 1 /* defined(CONFIG_8xx)*/
+#define DEBUG_DCBX
+/*
+ Work around DTLB Miss/Error, as these do not update
+ DAR for dcbf, dcbi, dcbst, dcbz and icbi instructions
+ This relies on every exception tagging DAR with 0xf0
+ before returning (rfi)
+ DAR is passed as 'address' to this function.
+ */
+	{
+		unsigned long ra, rb, dar, insn;
+#ifdef DEBUG_DCBX
+		const char *istr = NULL;
+
+		insn = *((unsigned long *)regs->nip);
+		if (((insn >> (31-5)) & 0x3f) == 31) {
+			if (((insn >> 1) & 0x3ff) == 1014) /* dcbz ? 0x3f6 */
+				istr = "dcbz";
+			if (((insn >> 1) & 0x3ff) == 86) /* dcbf ? 0x56 */
+				istr = "dcbf";
+			if (((insn >> 1) & 0x3ff) == 470) /* dcbi ? 0x1d6 */
+				istr = "dcbi";
+			if (((insn >> 1) & 0x3ff) == 54) /* dcbst ? 0x36 */
+				istr = "dcbst";
+			if (((insn >> 1) & 0x3ff) == 982) /* icbi ? 0x3d6 */
+				istr = "icbi";
+			if (istr) {
+				ra = (insn >> (31-15)) & 0x1f; /* Reg RA */
+				rb = (insn >> (31-20)) & 0x1f; /* Reg RB */
+				dar = regs->gpr[rb];
+				if (ra)
+					dar += regs->gpr[ra];
+				if (dar != address && address != 0x00f0 && trap == 0x300)
+					printk(KERN_CRIT "%s: address:%lx, dar:%lx!\n", istr, address, dar);
+				if (!strcmp(istr, "dcbst") && is_write) {
+					printk(KERN_CRIT "dcbst R%ld,R%ld = %lx as a store, fixing!\n",
+					       ra, rb, dar);
+					is_write = 0;
+				}
+
+				if (trap == 0x300 && address != dar) {
+					__asm__ ("mtdar %0" : : "r" (dar));
+					return 0;
+				}
+			}
+		}
+#endif
+		if (address == 0x00f0 && trap == 0x300) {
+			pte_t *ptep;
+
+			/* This is from a dcbX or icbi insn gone bad, these
+			 * insn do not set DAR so we have to do it here instead */
+			insn = *((unsigned long *)regs->nip);
+
+			ra = (insn >> (31-15)) & 0x1f; /* Reg RA */
+			rb = (insn >> (31-20)) & 0x1f; /* Reg RB */
+			dar = regs->gpr[rb];
+			if (ra)
+				dar += regs->gpr[ra];
+			/* Set DAR to correct address for the DTLB Miss/Error handler
+			 * to redo the TLB exception. This time with correct address */
+			__asm__ ("mtdar %0" : : "r" (dar));
+#ifdef DEBUG_DCBX
+			printk(KERN_CRIT "trap:%x address:%lx, dar:%lx,err:%lx %s\n",
+			       trap, address, dar, error_code, istr);
+#endif
+			address = dar;
+#if 1
+			if (is_write && get_pteptr(mm, dar, &ptep, NULL)) {
+				pte_t my_pte = *ptep;
+
+				if (pte_present(my_pte) && pte_write(my_pte)) {
+					pte_val(my_pte) |= _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE;
+					set_pte_at(mm, dar, ptep, my_pte);
+				}
+			}
+#else
+			return 0;
+#endif
+		}
+	}
+#endif
 
 	if (notify_page_fault(regs))
 		return 0;
-- 
1.6.4.4

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 12:16   ` [PATCH 2/6] 8xx, fault: Add some debug code to do_page_fault() Joakim Tjernlund
@ 2009-10-05 12:16     ` Joakim Tjernlund
  2009-10-05 12:16       ` [PATCH 4/6] 8xx: Tag DAR with 0x00f0 to catch buggy instructions Joakim Tjernlund
  2009-10-05 20:17       ` [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU Benjamin Herrenschmidt
  0 siblings, 2 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 12:16 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, linuxppc-dev, Benjamin Herrenschmidt

Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
Pros:
 - I/D TLB Miss never needs to write to the linux pte.
 - _PAGE_ACCESSED is only set on I/D TLB Error fixing accounting
 - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
    when a page has been made dirty.
 - Proper RO/RW mapping of user space.
 - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
Cons:
 - 4 more instructions in I/D TLB Miss, but the since the linux pte is
   not written anymore, it should still be a win.
---
 arch/powerpc/include/asm/pte-8xx.h |    9 +-
 arch/powerpc/kernel/head_8xx.S     |  163 ++++++++++++++++++++++++++----------
 2 files changed, 122 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/pte-8xx.h
index 8c6e312..af541a2 100644
--- a/arch/powerpc/include/asm/pte-8xx.h
+++ b/arch/powerpc/include/asm/pte-8xx.h
@@ -32,22 +32,21 @@
 #define _PAGE_FILE	0x0002	/* when !present: nonlinear file mapping */
 #define _PAGE_NO_CACHE	0x0002	/* I: cache inhibit */
 #define _PAGE_SHARED	0x0004	/* No ASID (context) compare */
+#define _PAGE_DIRTY	0x0100	/* C: page changed */
 
 /* These five software bits must be masked out when the entry is loaded
  * into the TLB.
  */
 #define _PAGE_EXEC	0x0008	/* software: i-cache coherency required */
 #define _PAGE_GUARDED	0x0010	/* software: guarded access */
-#define _PAGE_DIRTY	0x0020	/* software: page changed */
-#define _PAGE_RW	0x0040	/* software: user write access allowed */
-#define _PAGE_ACCESSED	0x0080	/* software: page referenced */
+#define _PAGE_USER	0x0020	/* software: User space access */
 
 /* Setting any bits in the nibble with the follow two controls will
  * require a TLB exception handler change.  It is assumed unused bits
  * are always zero.
  */
-#define _PAGE_HWWRITE	0x0100	/* h/w write enable: never set in Linux PTE */
-#define _PAGE_USER	0x0800	/* One of the PP bits, the other is USER&~RW */
+#define _PAGE_RW	0x0400	/* lsb PP bits */
+#define _PAGE_ACCESSED	0x0800	/* msb PP bits */
 
 #define _PMD_PRESENT	0x0001
 #define _PMD_BAD	0x0ff0
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 118bb05..b1f72d9 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -333,21 +333,15 @@ InstructionTLBMiss:
 	mfspr	r11, SPRN_MD_TWC	/* ....and get the pte address */
 	lwz	r10, 0(r11)	/* Get the pte */
 
-#ifdef CONFIG_SWAP
-	/* do not set the _PAGE_ACCESSED bit of a non-present page */
-	andi.	r11, r10, _PAGE_PRESENT
-	beq	4f
-	ori	r10, r10, _PAGE_ACCESSED
-	mfspr	r11, SPRN_MD_TWC	/* get the pte address again */
-	stw	r10, 0(r11)
-4:
-#else
-	ori	r10, r10, _PAGE_ACCESSED
-	stw	r10, 0(r11)
-#endif
-
+	andi.	r11, r10, _PAGE_USER | _PAGE_ACCESSED
+	cmpwi	cr0, r11, _PAGE_USER | _PAGE_ACCESSED
+	beq+	cr0, 5f	/* branch if access allowed */
+	rlwinm	r10, r10, 0, 22, 19 /* r10 &= ~(_PAGE_ACCESSED | _PAGE_RW) */
+	b	6f
+5:	xori	r10, r10, _PAGE_RW /* invert RW bit */
+6:
 	/* The Linux PTE won't go exactly into the MMU TLB.
-	 * Software indicator bits 21, 22 and 28 must be clear.
+	 * Software indicator bit 28 must be clear.
 	 * Software indicator bits 24, 25, 26, and 27 must be
 	 * set.  All other Linux PTE bits control the behavior
 	 * of the MMU.
@@ -409,21 +403,15 @@ DataStoreTLBMiss:
 	DO_8xx_CPU6(0x3b80, r3)
 	mtspr	SPRN_MD_TWC, r11
 
-#ifdef CONFIG_SWAP
-	/* do not set the _PAGE_ACCESSED bit of a non-present page */
-	andi.	r11, r10, _PAGE_PRESENT
-	beq	4f
-	ori	r10, r10, _PAGE_ACCESSED
-4:
-	/* and update pte in table */
-#else
-	ori	r10, r10, _PAGE_ACCESSED
-#endif
-	mfspr	r11, SPRN_MD_TWC	/* get the pte address again */
-	stw	r10, 0(r11)
-
+	andi.	r11, r10, _PAGE_USER | _PAGE_ACCESSED
+	cmpwi	cr0, r11, _PAGE_USER | _PAGE_ACCESSED
+	beq+	cr0, 5f	/* branch if access allowed */
+	rlwinm	r10, r10, 0, 22, 19 /* r10 &= ~(_PAGE_ACCESSED | _PAGE_RW) */
+	b	6f
+5:	xori	r10, r10, _PAGE_RW /* invert RW bit */
+6:
 	/* The Linux PTE won't go exactly into the MMU TLB.
-	 * Software indicator bits 21, 22 and 28 must be clear.
+	 * Software indicator bit 28 must be clear.
 	 * Software indicator bits 24, 25, 26, and 27 must be
 	 * set.  All other Linux PTE bits control the behavior
 	 * of the MMU.
@@ -448,6 +436,91 @@ DataStoreTLBMiss:
  */
 	. = 0x1300
 InstructionTLBError:
+#ifdef CONFIG_8xx_CPU6
+	stw	r3, 8(r0)
+#endif
+	DO_8xx_CPU6(0x3f80, r3)
+	mtspr	SPRN_M_TW, r10	/* Save a couple of working registers */
+	mfcr	r10
+	stw	r10, 0(r0)
+	stw	r11, 4(r0)
+
+	mfspr	r11, SPRN_SRR1
+	andis.	r11, r11, 0x5000	/* no translation, guarded */
+	bne	2f
+
+	mfspr	r10, SPRN_SRR0	/* Get effective address of fault */
+#ifdef CONFIG_8xx_CPU15
+	addi	r11, r10, 0x1000
+	tlbie	r11
+	addi	r11, r10, -0x1000
+	tlbie	r11
+#endif
+	DO_8xx_CPU6(0x3780, r3)
+	mtspr	SPRN_MD_EPN, r10	/* Have to use MD_EPN for walk, MI_EPN can't */
+	mfspr	r10, SPRN_M_TWB	/* Get level 1 table entry address */
+
+	/* If we are faulting a kernel address, we have to use the
+	 * kernel page tables.
+	 */
+	andi.	r11, r10, 0x0800	/* Address >= 0x80000000 */
+	beq	3f
+	lis	r11, swapper_pg_dir@h
+	ori	r11, r11, swapper_pg_dir@l
+	rlwimi	r10, r11, 0, 2, 19
+3:
+	lwz	r11, 0(r10)	/* Get the level 1 entry */
+	rlwinm.	r10, r11,0,0,19	/* Extract page descriptor page address */
+	beq	2f		/* If zero, don't try to find a pte */
+
+	/* We have a pte table, so load the MI_TWC with the attributes
+	 * for this "segment."
+	 */
+	ori	r11,r11,1		/* Set valid bit */
+	DO_8xx_CPU6(0x2b80, r3)
+	mtspr	SPRN_MI_TWC, r11	/* Set segment attributes */
+	DO_8xx_CPU6(0x3b80, r3)
+	mtspr	SPRN_MD_TWC, r11	/* Load pte table base address */
+
+	mfspr	r11, SPRN_SRR1
+	andi.	r11, r11, 0x4000 /* MSR[PR] */
+	mfspr	r11, SPRN_MD_TWC	/* ....and get the pte address */
+	lwz	r10, 0(r11)	/* Get the pte */
+	beq	5f /* Kernel access always OK */
+	andi.	r11,r10, _PAGE_USER
+	beq	2f
+5:	ori	r10, r10, _PAGE_ACCESSED
+	mfspr	r21, SPRN_MD_TWC	/* ....and get the pte address */
+	stw	r10, 0(r11)
+	xori	r10, r10, _PAGE_RW /* invert RW bit */
+
+	/* The Linux PTE won't go exactly into the MMU TLB.
+	 * Software indicator bit 28 must be clear.
+	 * Software indicator bits 24, 25, 26, and 27 must be
+	 * set.  All other Linux PTE bits control the behavior
+	 * of the MMU.
+	 */
+	li	r11, 0x00f0
+	rlwimi	r10, r11, 0, 24, 28	/* Set 24-27, clear 28 */
+	DO_8xx_CPU6(0x2d80, r3)
+	mtspr	SPRN_MI_RPN, r10	/* Update TLB entry */
+
+	mfspr	r10, SPRN_M_TW	/* Restore registers */
+	lwz	r11, 0(r0)
+	mtcr	r11
+	lwz	r11, 4(r0)
+#ifdef CONFIG_8xx_CPU6
+	lwz	r3, 8(r0)
+#endif
+	rfi
+
+2:	mfspr	r10, SPRN_M_TW	/* Restore registers */
+	lwz	r11, 0(r0)
+	mtcr	r11
+	lwz	r11, 4(r0)
+#ifdef CONFIG_8xx_CPU6
+	lwz	r3, 8(r0)
+#endif
 	b	InstructionAccess
 
 /* This is the data TLB error on the MPC8xx.  This could be due to
@@ -472,8 +545,8 @@ DataTLBError:
 	/* First, make sure this was a store operation.
 	*/
 	mfspr	r10, SPRN_DSISR
-	andis.	r11, r10, 0x4800 /* no translation, no permission. */
-	bne	2f	/* branch if either is set */
+	andis.	r11, r10, 0x4000 /* no translation */
+	bne	2f	/* branch if set */
 
 	/* The EA of a data TLB miss is automatically stored in the MD_EPN
 	 * register.  The EA of a data TLB error is automatically stored in
@@ -522,26 +595,26 @@ DataTLBError:
 	mfspr	r11, SPRN_MD_TWC		/* ....and get the pte address */
 	lwz	r10, 0(r11)		/* Get the pte */
 
-	andi.	r11, r10, _PAGE_RW	/* Is it writeable? */
-	beq	2f			/* Bail out if not */
+	mfspr	r11, SPRN_SRR1
+	andi.	r11, r11, 0x4000 /* MSR[PR] */
+	beq	5f /* Kernel access always OK */
+	andi.	r11,r10, _PAGE_USER
+	beq	2f
+5:	mfspr	r11, SPRN_DSISR
+	andis.	r11, r11, 0x0200	/* store */
+	beq	6f
+	andi.	r11, r10, _PAGE_RW	/* writeable? */
+	beq	2f /* branch if not */
+	ori	r10, r10, _PAGE_DIRTY | _PAGE_HWWRITE
+6:	ori	r10, r10, _PAGE_ACCESSED
 
-	/* Update 'changed', among others.
-	*/
-#ifdef CONFIG_SWAP
-	ori	r10, r10, _PAGE_DIRTY|_PAGE_HWWRITE
-	/* do not set the _PAGE_ACCESSED bit of a non-present page */
-	andi.	r11, r10, _PAGE_PRESENT
-	beq	4f
-	ori	r10, r10, _PAGE_ACCESSED
-4:
-#else
-	ori	r10, r10, _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE
-#endif
 	mfspr	r11, SPRN_MD_TWC		/* Get pte address again */
 	stw	r10, 0(r11)		/* and update pte in table */
 
+	xori	r10, r10, _PAGE_RW	/* Invert RW bit */
+
 	/* The Linux PTE won't go exactly into the MMU TLB.
-	 * Software indicator bits 21, 22 and 28 must be clear.
+	 * Software indicator bit 28 must be clear.
 	 * Software indicator bits 24, 25, 26, and 27 must be
 	 * set.  All other Linux PTE bits control the behavior
 	 * of the MMU.
-- 
1.6.4.4

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 4/6] 8xx: Tag DAR with 0x00f0 to catch buggy instructions.
  2009-10-05 12:16     ` [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU Joakim Tjernlund
@ 2009-10-05 12:16       ` Joakim Tjernlund
  2009-10-05 12:16         ` [PATCH 5/6] 8xx: Fixup DAR from buggy dcbX instructions Joakim Tjernlund
  2009-10-05 20:17       ` [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU Benjamin Herrenschmidt
  1 sibling, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 12:16 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, linuxppc-dev, Benjamin Herrenschmidt

dcbz, dcbf, dcbi, dcbst and icbi do not set DAR when they
cause a DTLB Error. Dectect this by tagging DAR with 0x00f0
at every exception exit that modifies DAR.
Test for DAR=0x00f0 in DataTLBError and bail
to handle_page_fault().
---
 arch/powerpc/kernel/head_8xx.S |   19 ++++++++++++++++---
 1 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index b1f72d9..b93e32f 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -206,6 +206,8 @@ MachineCheck:
 	EXCEPTION_PROLOG
 	mfspr r4,SPRN_DAR
 	stw r4,_DAR(r11)
+	li r5,0x00f0
+	mtspr SPRN_DAR,r5	/* Tag DAR, to be used in DTLB Error */
 	mfspr r5,SPRN_DSISR
 	stw r5,_DSISR(r11)
 	addi r3,r1,STACK_FRAME_OVERHEAD
@@ -222,6 +224,8 @@ DataAccess:
 	stw	r10,_DSISR(r11)
 	mr	r5,r10
 	mfspr	r4,SPRN_DAR
+	li	r10,0x00f0
+	mtspr	SPRN_DAR,r10	/* Tag DAR, to be used in DTLB Error */
 	EXC_XFER_EE_LITE(0x300, handle_page_fault)
 
 /* Instruction access exception.
@@ -244,6 +248,8 @@ Alignment:
 	EXCEPTION_PROLOG
 	mfspr	r4,SPRN_DAR
 	stw	r4,_DAR(r11)
+	li	r5,0x00f0
+	mtspr	SPRN_DAR,r5	/* Tag DAR, to be used in DTLB Error */
 	mfspr	r5,SPRN_DSISR
 	stw	r5,_DSISR(r11)
 	addi	r3,r1,STACK_FRAME_OVERHEAD
@@ -417,6 +423,7 @@ DataStoreTLBMiss:
 	 * of the MMU.
 	 */
 2:	li	r11, 0x00f0
+	mtspr	SPRN_DAR,r11	/* Tag DAR */
 	rlwimi	r10, r11, 0, 24, 28	/* Set 24-27, clear 28 */
 	DO_8xx_CPU6(0x3d80, r3)
 	mtspr	SPRN_MD_RPN, r10	/* Update TLB entry */
@@ -542,10 +549,14 @@ DataTLBError:
 	stw	r10, 0(r0)
 	stw	r11, 4(r0)
 
+	mfspr	r10, SPRN_DAR
+	cmpwi	cr0, r10, 0x00f0
+	beq-	2f	/* must be a buggy dcbX, icbi insn. */
+
 	/* First, make sure this was a store operation.
 	*/
-	mfspr	r10, SPRN_DSISR
-	andis.	r11, r10, 0x4000 /* no translation */
+	mfspr	r11, SPRN_DSISR
+	andis.	r11, r11, 0x4000 /* no translation */
 	bne	2f	/* branch if set */
 
 	/* The EA of a data TLB miss is automatically stored in the MD_EPN
@@ -564,7 +575,8 @@ DataTLBError:
 	 * are initialized in mapin_ram().  This will avoid the problem,
 	 * assuming we only use the dcbi instruction on kernel addresses.
 	 */
-	mfspr	r10, SPRN_DAR
+
+	/* DAR is in r10 already */
 	rlwinm	r11, r10, 0, 0, 19
 	ori	r11, r11, MD_EVALID
 	mfspr	r10, SPRN_M_CASID
@@ -620,6 +632,7 @@ DataTLBError:
 	 * of the MMU.
 	 */
 	li	r11, 0x00f0
+	mtspr	SPRN_DAR,r11	/* Tag DAR */
 	rlwimi	r10, r11, 0, 24, 28	/* Set 24-27, clear 28 */
 	DO_8xx_CPU6(0x3d80, r3)
 	mtspr	SPRN_MD_RPN, r10	/* Update TLB entry */
-- 
1.6.4.4

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 5/6] 8xx: Fixup DAR from buggy dcbX instructions.
  2009-10-05 12:16       ` [PATCH 4/6] 8xx: Tag DAR with 0x00f0 to catch buggy instructions Joakim Tjernlund
@ 2009-10-05 12:16         ` Joakim Tjernlund
  2009-10-05 12:16           ` [PATCH 6/6] 8xx: start using dcbX instructions in various copy routines Joakim Tjernlund
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 12:16 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, linuxppc-dev, Benjamin Herrenschmidt

This is an assembler version to fixup DAR not being set
by dcbX, icbi instructions. There are two versions, one
uses selfmodifing code(default), the other uses
jump table but is much bigger.
---
 arch/powerpc/kernel/head_8xx.S |  146 +++++++++++++++++++++++++++++++++++++++-
 1 files changed, 145 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index b93e32f..bddaf26 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -551,7 +551,8 @@ DataTLBError:
 
 	mfspr	r10, SPRN_DAR
 	cmpwi	cr0, r10, 0x00f0
-	beq-	2f	/* must be a buggy dcbX, icbi insn. */
+	beq-	FixDAR	/* must be a buggy dcbX, icbi insn. */
+DARFix:	/* Return from dcbx instruction bug workaround, r10 holds value of DAR */
 
 	/* First, make sure this was a store operation.
 	*/
@@ -674,6 +675,149 @@ DataTLBError:
 
 	. = 0x2000
 
+/* This is the procedure to calculate the data EA for buggy dcbx,dcbi instructions
+ * by decoding the registers used by the dcbx instruction and adding them.
+ * DAR is set to the calculated address and r10 also holds the EA on exit.
+ */
+//#define NO_SELF_MODIFYING_CODE /* define if you don't want to use self modifying code */
+	nop	/* A few nops to make the modified_instr: space below cache line aligned */
+	nop
+139:	/* fetch instruction from userspace memory */
+	DO_8xx_CPU6(0x3780, r3)
+	mtspr	SPRN_MD_EPN, r10
+	mfspr	r11, SPRN_M_TWB	/* Get level 1 table entry address */
+	lwz	r11, 0(r11)	/* Get the level 1 entry */
+	tophys  (r11, r11)
+	DO_8xx_CPU6(0x3b80, r3)
+	mtspr	SPRN_MD_TWC, r11	/* Load pte table base address */
+	mfspr	r11, SPRN_MD_TWC	/* ....and get the pte address */
+	lwz	r11, 0(r11)	/* Get the pte */
+	/* concat physical page address(r11) and page offset(r10) */
+	rlwimi	r11, r10, 0, 20, 31
+	b	140f
+FixDAR:	/* Entry point for dcbx workaround. */
+	/* fetch instruction from memory. */
+	mfspr	r10, SPRN_SRR0
+	andis.	r11, r10, 0x8000
+	tophys  (r11, r10)
+	beq-	139b		/* Branch if user space address */
+140:	lwz	r11,0(r11)
+#ifdef CONFIG_8xx_CPU6
+	lwz	r3, 8(r0)	/* restore r3 from memory */
+#endif
+#ifndef NO_SELF_MODIFYING_CODE
+	andis.	r10,r11,0x1f	/* test if reg RA is r0 */
+	li	r10,modified_instr@l
+	dcbtst	r0,r10		/* touch for store */
+	rlwinm	r11,r11,0,0,20	/* Zero lower 10 bits */
+	oris	r11,r11,640	/* Transform instr. to a "add r10,RA,RB" */
+	ori	r11,r11,532
+	stw	r11,0(r10)	/* store add/and instruction */
+	dcbf	0,r10		/* flush new instr. to memory. */
+	icbi	0,r10		/* invalidate instr. cache line */
+	lwz	r11, 4(r0)	/* restore r11 from memory */
+	mfspr	r10, SPRN_M_TW	/* restore r10 from M_TW */
+	isync			/* Wait until new instr is loaded from memory */
+modified_instr:
+	.space	4		/* this is where the add/and instr. is stored */
+	bne+	143f
+	subf	r10,r0,r10	/* r10=r10-r0, only if reg RA is r0 */
+143:	mtdar	r10		/* store faulting EA in DAR */
+	b	DARFix		/* Go back to normal TLB handling */
+#else
+	mfctr	r10
+	mtdar	r10			/* save ctr reg in DAR */
+	rlwinm	r10, r11, 24, 24, 28	/* offset into jump table for reg RB */
+	addi	r10, r10, 150f@l	/* add start of table */
+	mtctr	r10			/* load ctr with jump address */
+	xor	r10, r10, r10		/* sum starts at zero */
+	bctr				/* jump into table */
+150:
+	add	r10, r10, r0
+	b	151f
+	add	r10, r10, r1
+	b	151f
+	add	r10, r10, r2
+	b	151f
+	add	r10, r10, r3
+	b	151f
+	add	r10, r10, r4
+	b	151f
+	add	r10, r10, r5
+	b	151f
+	add	r10, r10, r6
+	b	151f
+	add	r10, r10, r7
+	b	151f
+	add	r10, r10, r8
+	b	151f
+	add	r10, r10, r9
+	b	151f
+	add	r10, r10, r10
+	b	151f
+	add	r10, r10, r11
+	b	151f
+	add	r10, r10, r12
+	b	151f
+	add	r10, r10, r13
+	b	151f
+	add	r10, r10, r14
+	b	151f
+	add	r10, r10, r15
+	b	151f
+	add	r10, r10, r16
+	b	151f
+	add	r10, r10, r17
+	b	151f
+	add	r10, r10, r18
+	b	151f
+	add	r10, r10, r19
+	b	151f
+	mtctr	r11	/* r10 needs special handling */
+	b	154f
+	mtctr	r11	/* r11 needs special handling */
+	b	153f
+	add	r10, r10, r22
+	b	151f
+	add	r10, r10, r23
+	b	151f
+	add	r10, r10, r24
+	b	151f
+	add	r10, r10, r25
+	b	151f
+	add	r10, r10, r25
+	b	151f
+	add	r10, r10, r27
+	b	151f
+	add	r10, r10, r28
+	b	151f
+	add	r10, r10, r29
+	b	151f
+	add	r10, r10, r30
+	b	151f
+	add	r10, r10, r31
+151:
+	rlwinm. r11,r11,19,24,28	/* offset into jump table for reg RA */
+	beq	152f			/* if reg RA is zero, don't add it */ 
+	addi	r11, r11, 150b@l	/* add start of table */
+	mtctr	r11			/* load ctr with jump address */
+	rlwinm	r11,r11,0,16,10		/* make sure we don't execute this more than once */
+	bctr				/* jump into table */
+152:
+	mfdar	r11
+	mtctr	r11			/* restore ctr reg from DAR */
+	mtdar	r10			/* save fault EA to DAR */
+	b	DARFix			/* Go back to normal TLB handling */
+
+	/* special handling for r10,r11 since these are modified already */
+153:	lwz	r11, 4(r0)	/* load r11 from memory */
+	b	155f
+154:	mfspr	r11, SPRN_M_TW	/* load r10 from M_TW */
+155:	add	r10, r10, r11	/* add it */
+	mfctr	r11		/* restore r11 */
+	b	151b
+#endif
+
 	.globl	giveup_fpu
 giveup_fpu:
 	blr
-- 
1.6.4.4

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH 6/6] 8xx: start using dcbX instructions in various copy routines
  2009-10-05 12:16         ` [PATCH 5/6] 8xx: Fixup DAR from buggy dcbX instructions Joakim Tjernlund
@ 2009-10-05 12:16           ` Joakim Tjernlund
  0 siblings, 0 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 12:16 UTC (permalink / raw)
  To: Scott Wood, Rex Feany, linuxppc-dev, Benjamin Herrenschmidt

Now that 8xx can fixup dcbX instructions, start using them
where possible like every other PowerPc arch do.
---
 arch/powerpc/kernel/misc_32.S |   18 ------------------
 arch/powerpc/lib/copy_32.S    |   24 ------------------------
 2 files changed, 0 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/kernel/misc_32.S b/arch/powerpc/kernel/misc_32.S
index 15f28e0..b92095e 100644
--- a/arch/powerpc/kernel/misc_32.S
+++ b/arch/powerpc/kernel/misc_32.S
@@ -495,15 +495,7 @@ _GLOBAL(clear_pages)
 	li	r0,PAGE_SIZE/L1_CACHE_BYTES
 	slw	r0,r0,r4
 	mtctr	r0
-#ifdef CONFIG_8xx
-	li	r4, 0
-1:	stw	r4, 0(r3)
-	stw	r4, 4(r3)
-	stw	r4, 8(r3)
-	stw	r4, 12(r3)
-#else
 1:	dcbz	0,r3
-#endif
 	addi	r3,r3,L1_CACHE_BYTES
 	bdnz	1b
 	blr
@@ -528,15 +520,6 @@ _GLOBAL(copy_page)
 	addi	r3,r3,-4
 	addi	r4,r4,-4
 
-#ifdef CONFIG_8xx
-	/* don't use prefetch on 8xx */
-    	li	r0,4096/L1_CACHE_BYTES
-	mtctr	r0
-1:	COPY_16_BYTES
-	bdnz	1b
-	blr
-
-#else	/* not 8xx, we can prefetch */
 	li	r5,4
 
 #if MAX_COPY_PREFETCH > 1
@@ -577,7 +560,6 @@ _GLOBAL(copy_page)
 	li	r0,MAX_COPY_PREFETCH
 	li	r11,4
 	b	2b
-#endif	/* CONFIG_8xx */
 
 /*
  * void atomic_clear_mask(atomic_t mask, atomic_t *addr)
diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S
index c657de5..74a7f41 100644
--- a/arch/powerpc/lib/copy_32.S
+++ b/arch/powerpc/lib/copy_32.S
@@ -98,20 +98,7 @@ _GLOBAL(cacheable_memzero)
 	bdnz	4b
 3:	mtctr	r9
 	li	r7,4
-#if !defined(CONFIG_8xx)
 10:	dcbz	r7,r6
-#else
-10:	stw	r4, 4(r6)
-	stw	r4, 8(r6)
-	stw	r4, 12(r6)
-	stw	r4, 16(r6)
-#if CACHE_LINE_SIZE >= 32
-	stw	r4, 20(r6)
-	stw	r4, 24(r6)
-	stw	r4, 28(r6)
-	stw	r4, 32(r6)
-#endif /* CACHE_LINE_SIZE */
-#endif
 	addi	r6,r6,CACHELINE_BYTES
 	bdnz	10b
 	clrlwi	r5,r8,32-LG_CACHELINE_BYTES
@@ -200,9 +187,7 @@ _GLOBAL(cacheable_memcpy)
 	mtctr	r0
 	beq	63f
 53:
-#if !defined(CONFIG_8xx)
 	dcbz	r11,r6
-#endif
 	COPY_16_BYTES
 #if L1_CACHE_BYTES >= 32
 	COPY_16_BYTES
@@ -356,14 +341,6 @@ _GLOBAL(__copy_tofrom_user)
 	li	r11,4
 	beq	63f
 
-#ifdef CONFIG_8xx
-	/* Don't use prefetch on 8xx */
-	mtctr	r0
-	li	r0,0
-53:	COPY_16_BYTES_WITHEX(0)
-	bdnz	53b
-
-#else /* not CONFIG_8xx */
 	/* Here we decide how far ahead to prefetch the source */
 	li	r3,4
 	cmpwi	r0,1
@@ -416,7 +393,6 @@ _GLOBAL(__copy_tofrom_user)
 	li	r3,4
 	li	r7,0
 	bne	114b
-#endif /* CONFIG_8xx */
 
 63:	srwi.	r0,r5,2
 	mtctr	r0
-- 
1.6.4.4

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 12:16 [PATCH 0/6] PowerPc 8xx TLB/MMU fixes Joakim Tjernlund
  2009-10-05 12:16 ` [PATCH 1/6] 8xx: DTLB Error must check for more errors Joakim Tjernlund
@ 2009-10-05 18:12 ` Scott Wood
  2009-10-05 18:27   ` Joakim Tjernlund
  2009-10-05 22:04 ` Rex Feany
  2 siblings, 1 reply; 49+ messages in thread
From: Scott Wood @ 2009-10-05 18:12 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: linuxppc-dev, Rex Feany

On Mon, Oct 05, 2009 at 02:16:33PM +0200, Joakim Tjernlund wrote:
> Here are my latest code to fixup 8xx's TLB code.
> This code needs some serious testing before it
> can be commited.
> The "8xx, fault: Add some debug code to do_page_fault()" is
> purely a debug check and will be removed/disabled when
> this series appear stable.
> 
> Scott and Rex, please disregard other patches from me and
> try these out instead.

I'm not sure which tree you based this on, but I got some conflicts
in pte-8xx.h using Ben's merge branch.

After resolving the conflict, without adding tlbil_va in do_page_fault(), I
get the same stuck behavior as before.  With tlbil_va, I get this (not sure
if it's related to using the wrong base tree):

INIT: version 2.85 booting
Mounting /proc and /sys
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   173 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   174 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   175 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   176 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   177 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   178 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   179 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   180 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   181 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 24:   182 Segmentation fault      /etc/rc.d/init.d/$i $mode
dcbz: address:1, dar:30020000!
/etc/rc.d/rcS: line 37:   183 Segmentation fault      /etc/rc.d/rc.local $mode
INIT: Entering runlevel: 1
dcbz: address:1, dar:30020000!
dcbz: address:1, dar:30020000!
dcbz: address:1, dar:30020000!
dcbz: address:1, dar:30020000!
dcbz: address:1, dar:30020000!
dcbz: address:1, dar:30020000!
dcbz: address:1, dar:30020000!
dcbz: address:1, dar:30020000!
dcbz: address:1, dar:30020000!
dcbz: address:1, dar:30020000!
INIT: Id "l1" respawning too fast: disabled for 5 minutes
INIT: no more processes left in this runlevel

-Scott

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 18:12 ` [PATCH 0/6] PowerPc 8xx TLB/MMU fixes Scott Wood
@ 2009-10-05 18:27   ` Joakim Tjernlund
  2009-10-05 20:09     ` Scott Wood
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 18:27 UTC (permalink / raw)
  To: Scott Wood; +Cc: Rex Feany, linuxppc-dev



Scott Wood <scottwood@freescale.com> wrote on 05/10/2009 20:12:34:

>
> On Mon, Oct 05, 2009 at 02:16:33PM +0200, Joakim Tjernlund wrote:
> > Here are my latest code to fixup 8xx's TLB code.
> > This code needs some serious testing before it
> > can be commited.
> > The "8xx, fault: Add some debug code to do_page_fault()" is
> > purely a debug check and will be removed/disabled when
> > this series appear stable.
> >
> > Scott and Rex, please disregard other patches from me and
> > try these out instead.
>
> I'm not sure which tree you based this on, but I got some conflicts
> in pte-8xx.h using Ben's merge branch.

It is a somewhat older tree, didn't think it would clash, sorry

>
> After resolving the conflict, without adding tlbil_va in do_page_fault(), I
> get the same stuck behavior as before.

Expected, I havn't not tried to fix the missing tlbil_va(). That is
different problem that you and Ben needs to sort out.

>  With tlbil_va, I get this (not sure
> if it's related to using the wrong base tree):

Could be the debug code doing something bad or the
"Fixup DAR from buggy dcbX instructions". Could you back that one out?
and if that does not help, backout:
"start using dcbX instructions in various copy routines" too.

>
> INIT: version 2.85 booting
> Mounting /proc and /sys
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   173 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   174 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   175 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   176 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   177 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   178 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   179 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   180 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   181 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 24:   182 Segmentation fault      /etc/rc.d/init.d/$i $mode
> dcbz: address:1, dar:30020000!
> /etc/rc.d/rcS: line 37:   183 Segmentation fault      /etc/rc.d/rc.local $mode
> INIT: Entering runlevel: 1
> dcbz: address:1, dar:30020000!
> dcbz: address:1, dar:30020000!
> dcbz: address:1, dar:30020000!
> dcbz: address:1, dar:30020000!
> dcbz: address:1, dar:30020000!
> dcbz: address:1, dar:30020000!
> dcbz: address:1, dar:30020000!
> dcbz: address:1, dar:30020000!
> dcbz: address:1, dar:30020000!
> dcbz: address:1, dar:30020000!
> INIT: Id "l1" respawning too fast: disabled for 5 minutes
> INIT: no more processes left in this runlevel
>
> -Scott
>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 18:27   ` Joakim Tjernlund
@ 2009-10-05 20:09     ` Scott Wood
  2009-10-05 21:04       ` Joakim Tjernlund
  0 siblings, 1 reply; 49+ messages in thread
From: Scott Wood @ 2009-10-05 20:09 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Rex Feany, linuxppc-dev

On Mon, Oct 05, 2009 at 08:27:39PM +0200, Joakim Tjernlund wrote:
> > After resolving the conflict, without adding tlbil_va in do_page_fault(), I
> > get the same stuck behavior as before.
> 
> Expected, I havn't not tried to fix the missing tlbil_va(). That is
> different problem that you and Ben needs to sort out.

Right, I just wanted to be clear about which results were from which
modifications.

> >  With tlbil_va, I get this (not sure
> > if it's related to using the wrong base tree):
> 
> Could be the debug code doing something bad or the
> "Fixup DAR from buggy dcbX instructions". Could you back that one out?

That gets rid of the segfaults.  The bootup gets stuck running udev, though
-- the system is either idle, or it's stuck doing some syscall.

I see it sometimes (but less reliably) without this patchset, so it may just
be changing the circumstances to expose the issue more consistently.  Or
maybe I'm seeing the dcbX bug?  There do seem to be dcbX instructions in the
dynamic linker I'm using (including a disturbing section that appears to be
assuming 32 byte cache lines, even though this is supposed to be an 8xx
RFS).

I'll look into it.

> and if that does not help, backout:
> "start using dcbX instructions in various copy routines" too.

No difference.

-Scott

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 12:16     ` [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU Joakim Tjernlund
  2009-10-05 12:16       ` [PATCH 4/6] 8xx: Tag DAR with 0x00f0 to catch buggy instructions Joakim Tjernlund
@ 2009-10-05 20:17       ` Benjamin Herrenschmidt
  2009-10-05 21:25         ` Joakim Tjernlund
  1 sibling, 1 reply; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-05 20:17 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

On Mon, 2009-10-05 at 14:16 +0200, Joakim Tjernlund wrote:
> Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
> Pros:
>  - I/D TLB Miss never needs to write to the linux pte.
>  - _PAGE_ACCESSED is only set on I/D TLB Error fixing accounting
>  - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
>     when a page has been made dirty.

Not sure here. You seem to still set those from asm. Is that necessary ?

The approach I take on BookE is to simply not set these from asm, -and-
(this is important) not set write permission if dirty is not set in
the TLB miss and set no access permission at all when accessed is not
set. This is important or we'll miss dirty updates which can
be fatal.

The C code in handle_pte_fault() will fixup missing access and dirty
if necessary and flush.

Also look at the 440 code, I think you could simplify your
implementation using similar techniques, such as andc of PTE against
requested bits etc... and thus maybe save a couple of insns.

Cheers,
Ben.

>  - Proper RO/RW mapping of user space.
>  - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
> Cons:
>  - 4 more instructions in I/D TLB Miss, but the since the linux pte is
>    not written anymore, it should still be a win.
> ---
>  arch/powerpc/include/asm/pte-8xx.h |    9 +-
>  arch/powerpc/kernel/head_8xx.S     |  163 ++++++++++++++++++++++++++----------
>  2 files changed, 122 insertions(+), 50 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/pte-8xx.h
> index 8c6e312..af541a2 100644
> --- a/arch/powerpc/include/asm/pte-8xx.h
> +++ b/arch/powerpc/include/asm/pte-8xx.h
> @@ -32,22 +32,21 @@
>  #define _PAGE_FILE	0x0002	/* when !present: nonlinear file mapping */
>  #define _PAGE_NO_CACHE	0x0002	/* I: cache inhibit */
>  #define _PAGE_SHARED	0x0004	/* No ASID (context) compare */
> +#define _PAGE_DIRTY	0x0100	/* C: page changed */
>  
>  /* These five software bits must be masked out when the entry is loaded
>   * into the TLB.
>   */
>  #define _PAGE_EXEC	0x0008	/* software: i-cache coherency required */
>  #define _PAGE_GUARDED	0x0010	/* software: guarded access */
> -#define _PAGE_DIRTY	0x0020	/* software: page changed */
> -#define _PAGE_RW	0x0040	/* software: user write access allowed */
> -#define _PAGE_ACCESSED	0x0080	/* software: page referenced */
> +#define _PAGE_USER	0x0020	/* software: User space access */
>  
>  /* Setting any bits in the nibble with the follow two controls will
>   * require a TLB exception handler change.  It is assumed unused bits
>   * are always zero.
>   */
> -#define _PAGE_HWWRITE	0x0100	/* h/w write enable: never set in Linux PTE */
> -#define _PAGE_USER	0x0800	/* One of the PP bits, the other is USER&~RW */
> +#define _PAGE_RW	0x0400	/* lsb PP bits */
> +#define _PAGE_ACCESSED	0x0800	/* msb PP bits */
>  
>  #define _PMD_PRESENT	0x0001
>  #define _PMD_BAD	0x0ff0
> diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> index 118bb05..b1f72d9 100644
> --- a/arch/powerpc/kernel/head_8xx.S
> +++ b/arch/powerpc/kernel/head_8xx.S
> @@ -333,21 +333,15 @@ InstructionTLBMiss:
>  	mfspr	r11, SPRN_MD_TWC	/* ....and get the pte address */
>  	lwz	r10, 0(r11)	/* Get the pte */
>  
> -#ifdef CONFIG_SWAP
> -	/* do not set the _PAGE_ACCESSED bit of a non-present page */
> -	andi.	r11, r10, _PAGE_PRESENT
> -	beq	4f
> -	ori	r10, r10, _PAGE_ACCESSED
> -	mfspr	r11, SPRN_MD_TWC	/* get the pte address again */
> -	stw	r10, 0(r11)
> -4:
> -#else
> -	ori	r10, r10, _PAGE_ACCESSED
> -	stw	r10, 0(r11)
> -#endif
> -
> +	andi.	r11, r10, _PAGE_USER | _PAGE_ACCESSED
> +	cmpwi	cr0, r11, _PAGE_USER | _PAGE_ACCESSED
> +	beq+	cr0, 5f	/* branch if access allowed */
> +	rlwinm	r10, r10, 0, 22, 19 /* r10 &= ~(_PAGE_ACCESSED | _PAGE_RW) */
> +	b	6f
> +5:	xori	r10, r10, _PAGE_RW /* invert RW bit */
> +6:
>  	/* The Linux PTE won't go exactly into the MMU TLB.
> -	 * Software indicator bits 21, 22 and 28 must be clear.
> +	 * Software indicator bit 28 must be clear.
>  	 * Software indicator bits 24, 25, 26, and 27 must be
>  	 * set.  All other Linux PTE bits control the behavior
>  	 * of the MMU.
> @@ -409,21 +403,15 @@ DataStoreTLBMiss:
>  	DO_8xx_CPU6(0x3b80, r3)
>  	mtspr	SPRN_MD_TWC, r11
>  
> -#ifdef CONFIG_SWAP
> -	/* do not set the _PAGE_ACCESSED bit of a non-present page */
> -	andi.	r11, r10, _PAGE_PRESENT
> -	beq	4f
> -	ori	r10, r10, _PAGE_ACCESSED
> -4:
> -	/* and update pte in table */
> -#else
> -	ori	r10, r10, _PAGE_ACCESSED
> -#endif
> -	mfspr	r11, SPRN_MD_TWC	/* get the pte address again */
> -	stw	r10, 0(r11)
> -
> +	andi.	r11, r10, _PAGE_USER | _PAGE_ACCESSED
> +	cmpwi	cr0, r11, _PAGE_USER | _PAGE_ACCESSED
> +	beq+	cr0, 5f	/* branch if access allowed */
> +	rlwinm	r10, r10, 0, 22, 19 /* r10 &= ~(_PAGE_ACCESSED | _PAGE_RW) */
> +	b	6f
> +5:	xori	r10, r10, _PAGE_RW /* invert RW bit */
> +6:
>  	/* The Linux PTE won't go exactly into the MMU TLB.
> -	 * Software indicator bits 21, 22 and 28 must be clear.
> +	 * Software indicator bit 28 must be clear.
>  	 * Software indicator bits 24, 25, 26, and 27 must be
>  	 * set.  All other Linux PTE bits control the behavior
>  	 * of the MMU.
> @@ -448,6 +436,91 @@ DataStoreTLBMiss:
>   */
>  	. = 0x1300
>  InstructionTLBError:
> +#ifdef CONFIG_8xx_CPU6
> +	stw	r3, 8(r0)
> +#endif
> +	DO_8xx_CPU6(0x3f80, r3)
> +	mtspr	SPRN_M_TW, r10	/* Save a couple of working registers */
> +	mfcr	r10
> +	stw	r10, 0(r0)
> +	stw	r11, 4(r0)
> +
> +	mfspr	r11, SPRN_SRR1
> +	andis.	r11, r11, 0x5000	/* no translation, guarded */
> +	bne	2f
> +
> +	mfspr	r10, SPRN_SRR0	/* Get effective address of fault */
> +#ifdef CONFIG_8xx_CPU15
> +	addi	r11, r10, 0x1000
> +	tlbie	r11
> +	addi	r11, r10, -0x1000
> +	tlbie	r11
> +#endif
> +	DO_8xx_CPU6(0x3780, r3)
> +	mtspr	SPRN_MD_EPN, r10	/* Have to use MD_EPN for walk, MI_EPN can't */
> +	mfspr	r10, SPRN_M_TWB	/* Get level 1 table entry address */
> +
> +	/* If we are faulting a kernel address, we have to use the
> +	 * kernel page tables.
> +	 */
> +	andi.	r11, r10, 0x0800	/* Address >= 0x80000000 */
> +	beq	3f
> +	lis	r11, swapper_pg_dir@h
> +	ori	r11, r11, swapper_pg_dir@l
> +	rlwimi	r10, r11, 0, 2, 19
> +3:
> +	lwz	r11, 0(r10)	/* Get the level 1 entry */
> +	rlwinm.	r10, r11,0,0,19	/* Extract page descriptor page address */
> +	beq	2f		/* If zero, don't try to find a pte */
> +
> +	/* We have a pte table, so load the MI_TWC with the attributes
> +	 * for this "segment."
> +	 */
> +	ori	r11,r11,1		/* Set valid bit */
> +	DO_8xx_CPU6(0x2b80, r3)
> +	mtspr	SPRN_MI_TWC, r11	/* Set segment attributes */
> +	DO_8xx_CPU6(0x3b80, r3)
> +	mtspr	SPRN_MD_TWC, r11	/* Load pte table base address */
> +
> +	mfspr	r11, SPRN_SRR1
> +	andi.	r11, r11, 0x4000 /* MSR[PR] */
> +	mfspr	r11, SPRN_MD_TWC	/* ....and get the pte address */
> +	lwz	r10, 0(r11)	/* Get the pte */
> +	beq	5f /* Kernel access always OK */
> +	andi.	r11,r10, _PAGE_USER
> +	beq	2f
> +5:	ori	r10, r10, _PAGE_ACCESSED
> +	mfspr	r21, SPRN_MD_TWC	/* ....and get the pte address */
> +	stw	r10, 0(r11)
> +	xori	r10, r10, _PAGE_RW /* invert RW bit */
> +
> +	/* The Linux PTE won't go exactly into the MMU TLB.
> +	 * Software indicator bit 28 must be clear.
> +	 * Software indicator bits 24, 25, 26, and 27 must be
> +	 * set.  All other Linux PTE bits control the behavior
> +	 * of the MMU.
> +	 */
> +	li	r11, 0x00f0
> +	rlwimi	r10, r11, 0, 24, 28	/* Set 24-27, clear 28 */
> +	DO_8xx_CPU6(0x2d80, r3)
> +	mtspr	SPRN_MI_RPN, r10	/* Update TLB entry */
> +
> +	mfspr	r10, SPRN_M_TW	/* Restore registers */
> +	lwz	r11, 0(r0)
> +	mtcr	r11
> +	lwz	r11, 4(r0)
> +#ifdef CONFIG_8xx_CPU6
> +	lwz	r3, 8(r0)
> +#endif
> +	rfi
> +
> +2:	mfspr	r10, SPRN_M_TW	/* Restore registers */
> +	lwz	r11, 0(r0)
> +	mtcr	r11
> +	lwz	r11, 4(r0)
> +#ifdef CONFIG_8xx_CPU6
> +	lwz	r3, 8(r0)
> +#endif
>  	b	InstructionAccess
>  
>  /* This is the data TLB error on the MPC8xx.  This could be due to
> @@ -472,8 +545,8 @@ DataTLBError:
>  	/* First, make sure this was a store operation.
>  	*/
>  	mfspr	r10, SPRN_DSISR
> -	andis.	r11, r10, 0x4800 /* no translation, no permission. */
> -	bne	2f	/* branch if either is set */
> +	andis.	r11, r10, 0x4000 /* no translation */
> +	bne	2f	/* branch if set */
>  
>  	/* The EA of a data TLB miss is automatically stored in the MD_EPN
>  	 * register.  The EA of a data TLB error is automatically stored in
> @@ -522,26 +595,26 @@ DataTLBError:
>  	mfspr	r11, SPRN_MD_TWC		/* ....and get the pte address */
>  	lwz	r10, 0(r11)		/* Get the pte */
>  
> -	andi.	r11, r10, _PAGE_RW	/* Is it writeable? */
> -	beq	2f			/* Bail out if not */
> +	mfspr	r11, SPRN_SRR1
> +	andi.	r11, r11, 0x4000 /* MSR[PR] */
> +	beq	5f /* Kernel access always OK */
> +	andi.	r11,r10, _PAGE_USER
> +	beq	2f
> +5:	mfspr	r11, SPRN_DSISR
> +	andis.	r11, r11, 0x0200	/* store */
> +	beq	6f
> +	andi.	r11, r10, _PAGE_RW	/* writeable? */
> +	beq	2f /* branch if not */
> +	ori	r10, r10, _PAGE_DIRTY | _PAGE_HWWRITE
> +6:	ori	r10, r10, _PAGE_ACCESSED
>  
> -	/* Update 'changed', among others.
> -	*/
> -#ifdef CONFIG_SWAP
> -	ori	r10, r10, _PAGE_DIRTY|_PAGE_HWWRITE
> -	/* do not set the _PAGE_ACCESSED bit of a non-present page */
> -	andi.	r11, r10, _PAGE_PRESENT
> -	beq	4f
> -	ori	r10, r10, _PAGE_ACCESSED
> -4:
> -#else
> -	ori	r10, r10, _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE
> -#endif
>  	mfspr	r11, SPRN_MD_TWC		/* Get pte address again */
>  	stw	r10, 0(r11)		/* and update pte in table */
>  
> +	xori	r10, r10, _PAGE_RW	/* Invert RW bit */
> +
>  	/* The Linux PTE won't go exactly into the MMU TLB.
> -	 * Software indicator bits 21, 22 and 28 must be clear.
> +	 * Software indicator bit 28 must be clear.
>  	 * Software indicator bits 24, 25, 26, and 27 must be
>  	 * set.  All other Linux PTE bits control the behavior
>  	 * of the MMU.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 20:09     ` Scott Wood
@ 2009-10-05 21:04       ` Joakim Tjernlund
  2009-10-05 21:31         ` Benjamin Herrenschmidt
  2009-10-05 21:31         ` Scott Wood
  0 siblings, 2 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 21:04 UTC (permalink / raw)
  To: Scott Wood; +Cc: Rex Feany, linuxppc-dev

Scott Wood <scottwood@freescale.com> wrote on 05/10/2009 22:09:41:
>
> On Mon, Oct 05, 2009 at 08:27:39PM +0200, Joakim Tjernlund wrote:
> > > After resolving the conflict, without adding tlbil_va in do_page_fault(), I
> > > get the same stuck behavior as before.
> >
> > Expected, I havn't not tried to fix the missing tlbil_va(). That is
> > different problem that you and Ben needs to sort out.
>
> Right, I just wanted to be clear about which results were from which
> modifications.
>
> > >  With tlbil_va, I get this (not sure
> > > if it's related to using the wrong base tree):
> >
> > Could be the debug code doing something bad or the
> > "Fixup DAR from buggy dcbX instructions". Could you back that one out?
>
> That gets rid of the segfaults.  The bootup gets stuck running udev, though
> -- the system is either idle, or it's stuck doing some syscall.

Ok, that just means that my asm version of fixing up dcbX insn does not work.

>
> I see it sometimes (but less reliably) without this patchset, so it may just
> be changing the circumstances to expose the issue more consistently.  Or
> maybe I'm seeing the dcbX bug?  There do seem to be dcbX instructions in the
> dynamic linker I'm using (including a disturbing section that appears to be
> assuming 32 byte cache lines, even though this is supposed to be an 8xx
> RFS).

Yes, every ld.so uses dcbX and icbi insn when relocatin code.
Maybe you see some version of the dcbX bug, but my fault.c should
fix them up. My bet would be the 32 byte cache line, it will miss
out every second line and so the results are unreliable.

>
> I'll look into it.
>
> > and if that does not help, backout:
> > "start using dcbX instructions in various copy routines" too.
>
> No difference.

Good, it seems possible to uses the dcbX insn now. I will
have to fixup fault a bit more though.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 20:17       ` [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU Benjamin Herrenschmidt
@ 2009-10-05 21:25         ` Joakim Tjernlund
  2009-10-05 21:37           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 21:25 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany



Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 22:17:04:
>
> On Mon, 2009-10-05 at 14:16 +0200, Joakim Tjernlund wrote:
> > Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
> > Pros:
> >  - I/D TLB Miss never needs to write to the linux pte.
> >  - _PAGE_ACCESSED is only set on I/D TLB Error fixing accounting
> >  - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
> >     when a page has been made dirty.
>
> Not sure here. You seem to still set those from asm. Is that necessary ?

Well, yes. head_32.S also sets ACCESSED and DIRTY from asm so why not me?
The basic method I use is:
TLB gets mapped with NoAccess, then the first access will trap
into TLB error, where ACCESSED will be set if permission to access
the page is OK (and RW too). On first store 8xx will trap
into DTLB error and permissions is OK, DIRTY will be set too.
Is this wrong?
Do I have to trap to C for first store?

>
> The approach I take on BookE is to simply not set these from asm, -and-
> (this is important) not set write permission if dirty is not set in
> the TLB miss and set no access permission at all when accessed is not
> set. This is important or we'll miss dirty updates which can
> be fatal.

not sure, but this seems similar to what I do. DIRTY will be set,
in asm, on first store.
ACCESSED will only be set iff (USER && VALID)

>
> The C code in handle_pte_fault() will fixup missing access and dirty
> if necessary and flush.
>
> Also look at the 440 code, I think you could simplify your
> implementation using similar techniques, such as andc of PTE against
> requested bits etc... and thus maybe save a couple of insns.

Great, but first I want to make sure I doing it right :)

So is there some golden rule I have to follow?

 Jocke

>
> Cheers,
> Ben.
>
> >  - Proper RO/RW mapping of user space.
> >  - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
> > Cons:
> >  - 4 more instructions in I/D TLB Miss, but the since the linux pte is
> >    not written anymore, it should still be a win.
> > ---
> >  arch/powerpc/include/asm/pte-8xx.h |    9 +-
> >  arch/powerpc/kernel/head_8xx.S     |  163 ++++++++++++++++++++++++++----------
> >  2 files changed, 122 insertions(+), 50 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/pte-8xx.h
> > index 8c6e312..af541a2 100644
> > --- a/arch/powerpc/include/asm/pte-8xx.h
> > +++ b/arch/powerpc/include/asm/pte-8xx.h
> > @@ -32,22 +32,21 @@
> >  #define _PAGE_FILE   0x0002   /* when !present: nonlinear file mapping */
> >  #define _PAGE_NO_CACHE   0x0002   /* I: cache inhibit */
> >  #define _PAGE_SHARED   0x0004   /* No ASID (context) compare */
> > +#define _PAGE_DIRTY   0x0100   /* C: page changed */
> >
> >  /* These five software bits must be masked out when the entry is loaded
> >   * into the TLB.
> >   */
> >  #define _PAGE_EXEC   0x0008   /* software: i-cache coherency required */
> >  #define _PAGE_GUARDED   0x0010   /* software: guarded access */
> > -#define _PAGE_DIRTY   0x0020   /* software: page changed */
> > -#define _PAGE_RW   0x0040   /* software: user write access allowed */
> > -#define _PAGE_ACCESSED   0x0080   /* software: page referenced */
> > +#define _PAGE_USER   0x0020   /* software: User space access */
> >
> >  /* Setting any bits in the nibble with the follow two controls will
> >   * require a TLB exception handler change.  It is assumed unused bits
> >   * are always zero.
> >   */
> > -#define _PAGE_HWWRITE   0x0100   /* h/w write enable: never set in Linux PTE */
> > -#define _PAGE_USER   0x0800   /* One of the PP bits, the other is USER&~RW */
> > +#define _PAGE_RW   0x0400   /* lsb PP bits */
> > +#define _PAGE_ACCESSED   0x0800   /* msb PP bits */
> >
> >  #define _PMD_PRESENT   0x0001
> >  #define _PMD_BAD   0x0ff0
> > diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> > index 118bb05..b1f72d9 100644
> > --- a/arch/powerpc/kernel/head_8xx.S
> > +++ b/arch/powerpc/kernel/head_8xx.S
> > @@ -333,21 +333,15 @@ InstructionTLBMiss:
> >     mfspr   r11, SPRN_MD_TWC   /* ....and get the pte address */
> >     lwz   r10, 0(r11)   /* Get the pte */
> >
> > -#ifdef CONFIG_SWAP
> > -   /* do not set the _PAGE_ACCESSED bit of a non-present page */
> > -   andi.   r11, r10, _PAGE_PRESENT
> > -   beq   4f
> > -   ori   r10, r10, _PAGE_ACCESSED
> > -   mfspr   r11, SPRN_MD_TWC   /* get the pte address again */
> > -   stw   r10, 0(r11)
> > -4:
> > -#else
> > -   ori   r10, r10, _PAGE_ACCESSED
> > -   stw   r10, 0(r11)
> > -#endif
> > -
> > +   andi.   r11, r10, _PAGE_USER | _PAGE_ACCESSED
> > +   cmpwi   cr0, r11, _PAGE_USER | _PAGE_ACCESSED
> > +   beq+   cr0, 5f   /* branch if access allowed */
> > +   rlwinm   r10, r10, 0, 22, 19 /* r10 &= ~(_PAGE_ACCESSED | _PAGE_RW) */
> > +   b   6f
> > +5:   xori   r10, r10, _PAGE_RW /* invert RW bit */
> > +6:
> >     /* The Linux PTE won't go exactly into the MMU TLB.
> > -    * Software indicator bits 21, 22 and 28 must be clear.
> > +    * Software indicator bit 28 must be clear.
> >      * Software indicator bits 24, 25, 26, and 27 must be
> >      * set.  All other Linux PTE bits control the behavior
> >      * of the MMU.
> > @@ -409,21 +403,15 @@ DataStoreTLBMiss:
> >     DO_8xx_CPU6(0x3b80, r3)
> >     mtspr   SPRN_MD_TWC, r11
> >
> > -#ifdef CONFIG_SWAP
> > -   /* do not set the _PAGE_ACCESSED bit of a non-present page */
> > -   andi.   r11, r10, _PAGE_PRESENT
> > -   beq   4f
> > -   ori   r10, r10, _PAGE_ACCESSED
> > -4:
> > -   /* and update pte in table */
> > -#else
> > -   ori   r10, r10, _PAGE_ACCESSED
> > -#endif
> > -   mfspr   r11, SPRN_MD_TWC   /* get the pte address again */
> > -   stw   r10, 0(r11)
> > -
> > +   andi.   r11, r10, _PAGE_USER | _PAGE_ACCESSED
> > +   cmpwi   cr0, r11, _PAGE_USER | _PAGE_ACCESSED
> > +   beq+   cr0, 5f   /* branch if access allowed */
> > +   rlwinm   r10, r10, 0, 22, 19 /* r10 &= ~(_PAGE_ACCESSED | _PAGE_RW) */
> > +   b   6f
> > +5:   xori   r10, r10, _PAGE_RW /* invert RW bit */
> > +6:
> >     /* The Linux PTE won't go exactly into the MMU TLB.
> > -    * Software indicator bits 21, 22 and 28 must be clear.
> > +    * Software indicator bit 28 must be clear.
> >      * Software indicator bits 24, 25, 26, and 27 must be
> >      * set.  All other Linux PTE bits control the behavior
> >      * of the MMU.
> > @@ -448,6 +436,91 @@ DataStoreTLBMiss:
> >   */
> >     . = 0x1300
> >  InstructionTLBError:
> > +#ifdef CONFIG_8xx_CPU6
> > +   stw   r3, 8(r0)
> > +#endif
> > +   DO_8xx_CPU6(0x3f80, r3)
> > +   mtspr   SPRN_M_TW, r10   /* Save a couple of working registers */
> > +   mfcr   r10
> > +   stw   r10, 0(r0)
> > +   stw   r11, 4(r0)
> > +
> > +   mfspr   r11, SPRN_SRR1
> > +   andis.   r11, r11, 0x5000   /* no translation, guarded */
> > +   bne   2f
> > +
> > +   mfspr   r10, SPRN_SRR0   /* Get effective address of fault */
> > +#ifdef CONFIG_8xx_CPU15
> > +   addi   r11, r10, 0x1000
> > +   tlbie   r11
> > +   addi   r11, r10, -0x1000
> > +   tlbie   r11
> > +#endif
> > +   DO_8xx_CPU6(0x3780, r3)
> > +   mtspr   SPRN_MD_EPN, r10   /* Have to use MD_EPN for walk, MI_EPN can't */
> > +   mfspr   r10, SPRN_M_TWB   /* Get level 1 table entry address */
> > +
> > +   /* If we are faulting a kernel address, we have to use the
> > +    * kernel page tables.
> > +    */
> > +   andi.   r11, r10, 0x0800   /* Address >= 0x80000000 */
> > +   beq   3f
> > +   lis   r11, swapper_pg_dir@h
> > +   ori   r11, r11, swapper_pg_dir@l
> > +   rlwimi   r10, r11, 0, 2, 19
> > +3:
> > +   lwz   r11, 0(r10)   /* Get the level 1 entry */
> > +   rlwinm.   r10, r11,0,0,19   /* Extract page descriptor page address */
> > +   beq   2f      /* If zero, don't try to find a pte */
> > +
> > +   /* We have a pte table, so load the MI_TWC with the attributes
> > +    * for this "segment."
> > +    */
> > +   ori   r11,r11,1      /* Set valid bit */
> > +   DO_8xx_CPU6(0x2b80, r3)
> > +   mtspr   SPRN_MI_TWC, r11   /* Set segment attributes */
> > +   DO_8xx_CPU6(0x3b80, r3)
> > +   mtspr   SPRN_MD_TWC, r11   /* Load pte table base address */
> > +
> > +   mfspr   r11, SPRN_SRR1
> > +   andi.   r11, r11, 0x4000 /* MSR[PR] */
> > +   mfspr   r11, SPRN_MD_TWC   /* ....and get the pte address */
> > +   lwz   r10, 0(r11)   /* Get the pte */
> > +   beq   5f /* Kernel access always OK */
> > +   andi.   r11,r10, _PAGE_USER
> > +   beq   2f
> > +5:   ori   r10, r10, _PAGE_ACCESSED
> > +   mfspr   r21, SPRN_MD_TWC   /* ....and get the pte address */
> > +   stw   r10, 0(r11)
> > +   xori   r10, r10, _PAGE_RW /* invert RW bit */
> > +
> > +   /* The Linux PTE won't go exactly into the MMU TLB.
> > +    * Software indicator bit 28 must be clear.
> > +    * Software indicator bits 24, 25, 26, and 27 must be
> > +    * set.  All other Linux PTE bits control the behavior
> > +    * of the MMU.
> > +    */
> > +   li   r11, 0x00f0
> > +   rlwimi   r10, r11, 0, 24, 28   /* Set 24-27, clear 28 */
> > +   DO_8xx_CPU6(0x2d80, r3)
> > +   mtspr   SPRN_MI_RPN, r10   /* Update TLB entry */
> > +
> > +   mfspr   r10, SPRN_M_TW   /* Restore registers */
> > +   lwz   r11, 0(r0)
> > +   mtcr   r11
> > +   lwz   r11, 4(r0)
> > +#ifdef CONFIG_8xx_CPU6
> > +   lwz   r3, 8(r0)
> > +#endif
> > +   rfi
> > +
> > +2:   mfspr   r10, SPRN_M_TW   /* Restore registers */
> > +   lwz   r11, 0(r0)
> > +   mtcr   r11
> > +   lwz   r11, 4(r0)
> > +#ifdef CONFIG_8xx_CPU6
> > +   lwz   r3, 8(r0)
> > +#endif
> >     b   InstructionAccess
> >
> >  /* This is the data TLB error on the MPC8xx.  This could be due to
> > @@ -472,8 +545,8 @@ DataTLBError:
> >     /* First, make sure this was a store operation.
> >     */
> >     mfspr   r10, SPRN_DSISR
> > -   andis.   r11, r10, 0x4800 /* no translation, no permission. */
> > -   bne   2f   /* branch if either is set */
> > +   andis.   r11, r10, 0x4000 /* no translation */
> > +   bne   2f   /* branch if set */
> >
> >     /* The EA of a data TLB miss is automatically stored in the MD_EPN
> >      * register.  The EA of a data TLB error is automatically stored in
> > @@ -522,26 +595,26 @@ DataTLBError:
> >     mfspr   r11, SPRN_MD_TWC      /* ....and get the pte address */
> >     lwz   r10, 0(r11)      /* Get the pte */
> >
> > -   andi.   r11, r10, _PAGE_RW   /* Is it writeable? */
> > -   beq   2f         /* Bail out if not */
> > +   mfspr   r11, SPRN_SRR1
> > +   andi.   r11, r11, 0x4000 /* MSR[PR] */
> > +   beq   5f /* Kernel access always OK */
> > +   andi.   r11,r10, _PAGE_USER
> > +   beq   2f
> > +5:   mfspr   r11, SPRN_DSISR
> > +   andis.   r11, r11, 0x0200   /* store */
> > +   beq   6f
> > +   andi.   r11, r10, _PAGE_RW   /* writeable? */
> > +   beq   2f /* branch if not */
> > +   ori   r10, r10, _PAGE_DIRTY | _PAGE_HWWRITE
> > +6:   ori   r10, r10, _PAGE_ACCESSED
> >
> > -   /* Update 'changed', among others.
> > -   */
> > -#ifdef CONFIG_SWAP
> > -   ori   r10, r10, _PAGE_DIRTY|_PAGE_HWWRITE
> > -   /* do not set the _PAGE_ACCESSED bit of a non-present page */
> > -   andi.   r11, r10, _PAGE_PRESENT
> > -   beq   4f
> > -   ori   r10, r10, _PAGE_ACCESSED
> > -4:
> > -#else
> > -   ori   r10, r10, _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE
> > -#endif
> >     mfspr   r11, SPRN_MD_TWC      /* Get pte address again */
> >     stw   r10, 0(r11)      /* and update pte in table */
> >
> > +   xori   r10, r10, _PAGE_RW   /* Invert RW bit */
> > +
> >     /* The Linux PTE won't go exactly into the MMU TLB.
> > -    * Software indicator bits 21, 22 and 28 must be clear.
> > +    * Software indicator bit 28 must be clear.
> >      * Software indicator bits 24, 25, 26, and 27 must be
> >      * set.  All other Linux PTE bits control the behavior
> >      * of the MMU.
>
>
>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 21:04       ` Joakim Tjernlund
@ 2009-10-05 21:31         ` Benjamin Herrenschmidt
  2009-10-05 21:41           ` Joakim Tjernlund
  2009-10-05 21:31         ` Scott Wood
  1 sibling, 1 reply; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-05 21:31 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

On Mon, 2009-10-05 at 23:04 +0200, Joakim Tjernlund wrote:
> 
> Yes, every ld.so uses dcbX and icbi insn when relocatin code.

Even with -msecure-plt ?

> Maybe you see some version of the dcbX bug, but my fault.c should
> fix them up. My bet would be the 32 byte cache line, it will miss
> out every second line and so the results are unreliable.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 21:04       ` Joakim Tjernlund
  2009-10-05 21:31         ` Benjamin Herrenschmidt
@ 2009-10-05 21:31         ` Scott Wood
  1 sibling, 0 replies; 49+ messages in thread
From: Scott Wood @ 2009-10-05 21:31 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Rex Feany, linuxppc-dev

Joakim Tjernlund wrote:
> Yes, every ld.so uses dcbX and icbi insn when relocatin code.
> Maybe you see some version of the dcbX bug, but my fault.c should
> fix them up. My bet would be the 32 byte cache line, it will miss
> out every second line and so the results are unreliable.

I found the 32-byte stuff from an objdump; it looks like there's a 
runtime check for the cache line size that avoids running that code if 
it's different.

-Scott

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 21:25         ` Joakim Tjernlund
@ 2009-10-05 21:37           ` Benjamin Herrenschmidt
  2009-10-05 22:00             ` Joakim Tjernlund
  0 siblings, 1 reply; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-05 21:37 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

On Mon, 2009-10-05 at 23:25 +0200, Joakim Tjernlund wrote:
> 
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 22:17:04:
> >
> > On Mon, 2009-10-05 at 14:16 +0200, Joakim Tjernlund wrote:
> > > Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
> > > Pros:
> > >  - I/D TLB Miss never needs to write to the linux pte.
> > >  - _PAGE_ACCESSED is only set on I/D TLB Error fixing accounting
> > >  - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
> > >     when a page has been made dirty.
> >
> > Not sure here. You seem to still set those from asm. Is that necessary ?
> 
> Well, yes. head_32.S also sets ACCESSED and DIRTY from asm so why not me?

Don't look at the hash 32 code :-)

> The basic method I use is:
> TLB gets mapped with NoAccess, then the first access will trap
> into TLB error, where ACCESSED will be set if permission to access
> the page is OK (and RW too). On first store 8xx will trap
> into DTLB error and permissions is OK, DIRTY will be set too.
> Is this wrong?

>From your explanation it looks ok. IE. as long as !DIRTY -> not
writeable (will fault on store) and !ACCESSED means not accessible (will
fault on any access) then you are ok.

> Do I have to trap to C for first store?

Most of the time you do anyways since the PTE isn't populated at all. At
which point, Linux will populate a PTE with DIRTY and ACCESSED already
set. It should be reasonably rare to actually fault because DIRTY and/or
ACCESSED are missing.

> > The approach I take on BookE is to simply not set these from asm, -and-
> > (this is important) not set write permission if dirty is not set in
> > the TLB miss and set no access permission at all when accessed is not
> > set. This is important or we'll miss dirty updates which can
> > be fatal.
> 
> not sure, but this seems similar to what I do. DIRTY will be set,
> in asm, on first store.

Don't set DIRTY if !VALID

> ACCESSED will only be set iff (USER && VALID)

My point is that you should be able to simplify the code here, have only
the TLB miss look at the PTE, not the data access and instruction
access, and have the later be a boring exception going straight to C.

> >
> > The C code in handle_pte_fault() will fixup missing access and dirty
> > if necessary and flush.
> >
> > Also look at the 440 code, I think you could simplify your
> > implementation using similar techniques, such as andc of PTE against
> > requested bits etc... and thus maybe save a couple of insns.
> 
> Great, but first I want to make sure I doing it right :)
> 
> So is there some golden rule I have to follow?

Mostly only !ACCESSED -> no access permitted and !DIRTY -> no store
permitted and don't write anything back if !VALID.

Cheers,
Ben.

>  Jocke
> 
> >
> > Cheers,
> > Ben.
> >
> > >  - Proper RO/RW mapping of user space.
> > >  - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
> > > Cons:
> > >  - 4 more instructions in I/D TLB Miss, but the since the linux pte is
> > >    not written anymore, it should still be a win.
> > > ---
> > >  arch/powerpc/include/asm/pte-8xx.h |    9 +-
> > >  arch/powerpc/kernel/head_8xx.S     |  163 ++++++++++++++++++++++++++----------
> > >  2 files changed, 122 insertions(+), 50 deletions(-)
> > >
> > > diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/pte-8xx.h
> > > index 8c6e312..af541a2 100644
> > > --- a/arch/powerpc/include/asm/pte-8xx.h
> > > +++ b/arch/powerpc/include/asm/pte-8xx.h
> > > @@ -32,22 +32,21 @@
> > >  #define _PAGE_FILE   0x0002   /* when !present: nonlinear file mapping */
> > >  #define _PAGE_NO_CACHE   0x0002   /* I: cache inhibit */
> > >  #define _PAGE_SHARED   0x0004   /* No ASID (context) compare */
> > > +#define _PAGE_DIRTY   0x0100   /* C: page changed */
> > >
> > >  /* These five software bits must be masked out when the entry is loaded
> > >   * into the TLB.
> > >   */
> > >  #define _PAGE_EXEC   0x0008   /* software: i-cache coherency required */
> > >  #define _PAGE_GUARDED   0x0010   /* software: guarded access */
> > > -#define _PAGE_DIRTY   0x0020   /* software: page changed */
> > > -#define _PAGE_RW   0x0040   /* software: user write access allowed */
> > > -#define _PAGE_ACCESSED   0x0080   /* software: page referenced */
> > > +#define _PAGE_USER   0x0020   /* software: User space access */
> > >
> > >  /* Setting any bits in the nibble with the follow two controls will
> > >   * require a TLB exception handler change.  It is assumed unused bits
> > >   * are always zero.
> > >   */
> > > -#define _PAGE_HWWRITE   0x0100   /* h/w write enable: never set in Linux PTE */
> > > -#define _PAGE_USER   0x0800   /* One of the PP bits, the other is USER&~RW */
> > > +#define _PAGE_RW   0x0400   /* lsb PP bits */
> > > +#define _PAGE_ACCESSED   0x0800   /* msb PP bits */
> > >
> > >  #define _PMD_PRESENT   0x0001
> > >  #define _PMD_BAD   0x0ff0
> > > diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
> > > index 118bb05..b1f72d9 100644
> > > --- a/arch/powerpc/kernel/head_8xx.S
> > > +++ b/arch/powerpc/kernel/head_8xx.S
> > > @@ -333,21 +333,15 @@ InstructionTLBMiss:
> > >     mfspr   r11, SPRN_MD_TWC   /* ....and get the pte address */
> > >     lwz   r10, 0(r11)   /* Get the pte */
> > >
> > > -#ifdef CONFIG_SWAP
> > > -   /* do not set the _PAGE_ACCESSED bit of a non-present page */
> > > -   andi.   r11, r10, _PAGE_PRESENT
> > > -   beq   4f
> > > -   ori   r10, r10, _PAGE_ACCESSED
> > > -   mfspr   r11, SPRN_MD_TWC   /* get the pte address again */
> > > -   stw   r10, 0(r11)
> > > -4:
> > > -#else
> > > -   ori   r10, r10, _PAGE_ACCESSED
> > > -   stw   r10, 0(r11)
> > > -#endif
> > > -
> > > +   andi.   r11, r10, _PAGE_USER | _PAGE_ACCESSED
> > > +   cmpwi   cr0, r11, _PAGE_USER | _PAGE_ACCESSED
> > > +   beq+   cr0, 5f   /* branch if access allowed */
> > > +   rlwinm   r10, r10, 0, 22, 19 /* r10 &= ~(_PAGE_ACCESSED | _PAGE_RW) */
> > > +   b   6f
> > > +5:   xori   r10, r10, _PAGE_RW /* invert RW bit */
> > > +6:
> > >     /* The Linux PTE won't go exactly into the MMU TLB.
> > > -    * Software indicator bits 21, 22 and 28 must be clear.
> > > +    * Software indicator bit 28 must be clear.
> > >      * Software indicator bits 24, 25, 26, and 27 must be
> > >      * set.  All other Linux PTE bits control the behavior
> > >      * of the MMU.
> > > @@ -409,21 +403,15 @@ DataStoreTLBMiss:
> > >     DO_8xx_CPU6(0x3b80, r3)
> > >     mtspr   SPRN_MD_TWC, r11
> > >
> > > -#ifdef CONFIG_SWAP
> > > -   /* do not set the _PAGE_ACCESSED bit of a non-present page */
> > > -   andi.   r11, r10, _PAGE_PRESENT
> > > -   beq   4f
> > > -   ori   r10, r10, _PAGE_ACCESSED
> > > -4:
> > > -   /* and update pte in table */
> > > -#else
> > > -   ori   r10, r10, _PAGE_ACCESSED
> > > -#endif
> > > -   mfspr   r11, SPRN_MD_TWC   /* get the pte address again */
> > > -   stw   r10, 0(r11)
> > > -
> > > +   andi.   r11, r10, _PAGE_USER | _PAGE_ACCESSED
> > > +   cmpwi   cr0, r11, _PAGE_USER | _PAGE_ACCESSED
> > > +   beq+   cr0, 5f   /* branch if access allowed */
> > > +   rlwinm   r10, r10, 0, 22, 19 /* r10 &= ~(_PAGE_ACCESSED | _PAGE_RW) */
> > > +   b   6f
> > > +5:   xori   r10, r10, _PAGE_RW /* invert RW bit */
> > > +6:
> > >     /* The Linux PTE won't go exactly into the MMU TLB.
> > > -    * Software indicator bits 21, 22 and 28 must be clear.
> > > +    * Software indicator bit 28 must be clear.
> > >      * Software indicator bits 24, 25, 26, and 27 must be
> > >      * set.  All other Linux PTE bits control the behavior
> > >      * of the MMU.
> > > @@ -448,6 +436,91 @@ DataStoreTLBMiss:
> > >   */
> > >     . = 0x1300
> > >  InstructionTLBError:
> > > +#ifdef CONFIG_8xx_CPU6
> > > +   stw   r3, 8(r0)
> > > +#endif
> > > +   DO_8xx_CPU6(0x3f80, r3)
> > > +   mtspr   SPRN_M_TW, r10   /* Save a couple of working registers */
> > > +   mfcr   r10
> > > +   stw   r10, 0(r0)
> > > +   stw   r11, 4(r0)
> > > +
> > > +   mfspr   r11, SPRN_SRR1
> > > +   andis.   r11, r11, 0x5000   /* no translation, guarded */
> > > +   bne   2f
> > > +
> > > +   mfspr   r10, SPRN_SRR0   /* Get effective address of fault */
> > > +#ifdef CONFIG_8xx_CPU15
> > > +   addi   r11, r10, 0x1000
> > > +   tlbie   r11
> > > +   addi   r11, r10, -0x1000
> > > +   tlbie   r11
> > > +#endif
> > > +   DO_8xx_CPU6(0x3780, r3)
> > > +   mtspr   SPRN_MD_EPN, r10   /* Have to use MD_EPN for walk, MI_EPN can't */
> > > +   mfspr   r10, SPRN_M_TWB   /* Get level 1 table entry address */
> > > +
> > > +   /* If we are faulting a kernel address, we have to use the
> > > +    * kernel page tables.
> > > +    */
> > > +   andi.   r11, r10, 0x0800   /* Address >= 0x80000000 */
> > > +   beq   3f
> > > +   lis   r11, swapper_pg_dir@h
> > > +   ori   r11, r11, swapper_pg_dir@l
> > > +   rlwimi   r10, r11, 0, 2, 19
> > > +3:
> > > +   lwz   r11, 0(r10)   /* Get the level 1 entry */
> > > +   rlwinm.   r10, r11,0,0,19   /* Extract page descriptor page address */
> > > +   beq   2f      /* If zero, don't try to find a pte */
> > > +
> > > +   /* We have a pte table, so load the MI_TWC with the attributes
> > > +    * for this "segment."
> > > +    */
> > > +   ori   r11,r11,1      /* Set valid bit */
> > > +   DO_8xx_CPU6(0x2b80, r3)
> > > +   mtspr   SPRN_MI_TWC, r11   /* Set segment attributes */
> > > +   DO_8xx_CPU6(0x3b80, r3)
> > > +   mtspr   SPRN_MD_TWC, r11   /* Load pte table base address */
> > > +
> > > +   mfspr   r11, SPRN_SRR1
> > > +   andi.   r11, r11, 0x4000 /* MSR[PR] */
> > > +   mfspr   r11, SPRN_MD_TWC   /* ....and get the pte address */
> > > +   lwz   r10, 0(r11)   /* Get the pte */
> > > +   beq   5f /* Kernel access always OK */
> > > +   andi.   r11,r10, _PAGE_USER
> > > +   beq   2f
> > > +5:   ori   r10, r10, _PAGE_ACCESSED
> > > +   mfspr   r21, SPRN_MD_TWC   /* ....and get the pte address */
> > > +   stw   r10, 0(r11)
> > > +   xori   r10, r10, _PAGE_RW /* invert RW bit */
> > > +
> > > +   /* The Linux PTE won't go exactly into the MMU TLB.
> > > +    * Software indicator bit 28 must be clear.
> > > +    * Software indicator bits 24, 25, 26, and 27 must be
> > > +    * set.  All other Linux PTE bits control the behavior
> > > +    * of the MMU.
> > > +    */
> > > +   li   r11, 0x00f0
> > > +   rlwimi   r10, r11, 0, 24, 28   /* Set 24-27, clear 28 */
> > > +   DO_8xx_CPU6(0x2d80, r3)
> > > +   mtspr   SPRN_MI_RPN, r10   /* Update TLB entry */
> > > +
> > > +   mfspr   r10, SPRN_M_TW   /* Restore registers */
> > > +   lwz   r11, 0(r0)
> > > +   mtcr   r11
> > > +   lwz   r11, 4(r0)
> > > +#ifdef CONFIG_8xx_CPU6
> > > +   lwz   r3, 8(r0)
> > > +#endif
> > > +   rfi
> > > +
> > > +2:   mfspr   r10, SPRN_M_TW   /* Restore registers */
> > > +   lwz   r11, 0(r0)
> > > +   mtcr   r11
> > > +   lwz   r11, 4(r0)
> > > +#ifdef CONFIG_8xx_CPU6
> > > +   lwz   r3, 8(r0)
> > > +#endif
> > >     b   InstructionAccess
> > >
> > >  /* This is the data TLB error on the MPC8xx.  This could be due to
> > > @@ -472,8 +545,8 @@ DataTLBError:
> > >     /* First, make sure this was a store operation.
> > >     */
> > >     mfspr   r10, SPRN_DSISR
> > > -   andis.   r11, r10, 0x4800 /* no translation, no permission. */
> > > -   bne   2f   /* branch if either is set */
> > > +   andis.   r11, r10, 0x4000 /* no translation */
> > > +   bne   2f   /* branch if set */
> > >
> > >     /* The EA of a data TLB miss is automatically stored in the MD_EPN
> > >      * register.  The EA of a data TLB error is automatically stored in
> > > @@ -522,26 +595,26 @@ DataTLBError:
> > >     mfspr   r11, SPRN_MD_TWC      /* ....and get the pte address */
> > >     lwz   r10, 0(r11)      /* Get the pte */
> > >
> > > -   andi.   r11, r10, _PAGE_RW   /* Is it writeable? */
> > > -   beq   2f         /* Bail out if not */
> > > +   mfspr   r11, SPRN_SRR1
> > > +   andi.   r11, r11, 0x4000 /* MSR[PR] */
> > > +   beq   5f /* Kernel access always OK */
> > > +   andi.   r11,r10, _PAGE_USER
> > > +   beq   2f
> > > +5:   mfspr   r11, SPRN_DSISR
> > > +   andis.   r11, r11, 0x0200   /* store */
> > > +   beq   6f
> > > +   andi.   r11, r10, _PAGE_RW   /* writeable? */
> > > +   beq   2f /* branch if not */
> > > +   ori   r10, r10, _PAGE_DIRTY | _PAGE_HWWRITE
> > > +6:   ori   r10, r10, _PAGE_ACCESSED
> > >
> > > -   /* Update 'changed', among others.
> > > -   */
> > > -#ifdef CONFIG_SWAP
> > > -   ori   r10, r10, _PAGE_DIRTY|_PAGE_HWWRITE
> > > -   /* do not set the _PAGE_ACCESSED bit of a non-present page */
> > > -   andi.   r11, r10, _PAGE_PRESENT
> > > -   beq   4f
> > > -   ori   r10, r10, _PAGE_ACCESSED
> > > -4:
> > > -#else
> > > -   ori   r10, r10, _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE
> > > -#endif
> > >     mfspr   r11, SPRN_MD_TWC      /* Get pte address again */
> > >     stw   r10, 0(r11)      /* and update pte in table */
> > >
> > > +   xori   r10, r10, _PAGE_RW   /* Invert RW bit */
> > > +
> > >     /* The Linux PTE won't go exactly into the MMU TLB.
> > > -    * Software indicator bits 21, 22 and 28 must be clear.
> > > +    * Software indicator bit 28 must be clear.
> > >      * Software indicator bits 24, 25, 26, and 27 must be
> > >      * set.  All other Linux PTE bits control the behavior
> > >      * of the MMU.
> >
> >
> >
> >

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 21:31         ` Benjamin Herrenschmidt
@ 2009-10-05 21:41           ` Joakim Tjernlund
  2009-10-05 21:46             ` Scott Wood
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 21:41 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 23:31:39:

> >
> > Yes, every ld.so uses dcbX and icbi insn when relocatin code.
>
> Even with -msecure-plt ?

hmm, maybe not. Can't remember now. But perhaps LAZY relocs still
need dcbX? Easiest is to check in uClibc. I impl. it, but was to long
time ago for me to remember. Scott are you using uClibc? I don't
think so because of the dynamic check of the cache line size

>
> > Maybe you see some version of the dcbX bug, but my fault.c should
> > fix them up. My bet would be the 32 byte cache line, it will miss
> > out every second line and so the results are unreliable.
>
> Cheers,
> Ben.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 21:41           ` Joakim Tjernlund
@ 2009-10-05 21:46             ` Scott Wood
  0 siblings, 0 replies; 49+ messages in thread
From: Scott Wood @ 2009-10-05 21:46 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Rex Feany, linuxppc-dev

Joakim Tjernlund wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 23:31:39:
> 
>>> Yes, every ld.so uses dcbX and icbi insn when relocatin code.
>> Even with -msecure-plt ?
> 
> hmm, maybe not. Can't remember now. But perhaps LAZY relocs still
> need dcbX? Easiest is to check in uClibc. I impl. it, but was to long
> time ago for me to remember. Scott are you using uClibc? I don't
> think so because of the dynamic check of the cache line size

No, this is glibc.

-Scott

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 21:37           ` Benjamin Herrenschmidt
@ 2009-10-05 22:00             ` Joakim Tjernlund
  2009-10-05 22:09               ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 22:00 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 23:37:23:
>
> On Mon, 2009-10-05 at 23:25 +0200, Joakim Tjernlund wrote:
> >
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 22:17:04:
> > >
> > > On Mon, 2009-10-05 at 14:16 +0200, Joakim Tjernlund wrote:
> > > > Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
> > > > Pros:
> > > >  - I/D TLB Miss never needs to write to the linux pte.
> > > >  - _PAGE_ACCESSED is only set on I/D TLB Error fixing accounting
> > > >  - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
> > > >     when a page has been made dirty.
> > >
> > > Not sure here. You seem to still set those from asm. Is that necessary ?
> >
> > Well, yes. head_32.S also sets ACCESSED and DIRTY from asm so why not me?
>
> Don't look at the hash 32 code :-)

So what to look at then? :)

>
> > The basic method I use is:
> > TLB gets mapped with NoAccess, then the first access will trap
> > into TLB error, where ACCESSED will be set if permission to access
> > the page is OK (and RW too). On first store 8xx will trap
> > into DTLB error and permissions is OK, DIRTY will be set too.
> > Is this wrong?
>
> >From your explanation it looks ok. IE. as long as !DIRTY -> not
> writeable (will fault on store) and !ACCESSED means not accessible (will
> fault on any access) then you are ok.

Yes, that is what I have (tried) to do.

>
> > Do I have to trap to C for first store?
>
> Most of the time you do anyways since the PTE isn't populated at all. At
> which point, Linux will populate a PTE with DIRTY and ACCESSED already
> set. It should be reasonably rare to actually fault because DIRTY and/or
> ACCESSED are missing.

I tried to unconditionally trap to C in DTLB error but it just hung if I did
that so the asm has to do something.

>
> > > The approach I take on BookE is to simply not set these from asm, -and-
> > > (this is important) not set write permission if dirty is not set in

Did you really mean "if dirty is not" ? I test RW for write permission.
Dirty is just set when the first write happens after the permission check.

> > > the TLB miss and set no access permission at all when accessed is not
> > > set. This is important or we'll miss dirty updates which can
> > > be fatal.
> >
> > not sure, but this seems similar to what I do. DIRTY will be set,
> > in asm, on first store.
>
> Don't set DIRTY if !VALID

I don't. !VALID trap to C

>
> > ACCESSED will only be set iff (USER && VALID)
>
> My point is that you should be able to simplify the code here, have only
> the TLB miss look at the PTE, not the data access and instruction
> access, and have the later be a boring exception going straight to C.

I do this, TLB Miss only looks at TLB and then transforms it into a HW pte.
The HW pte do not map 1:1 so I need to some magic to fit the linux pte
into a HW pte.

>
> > >
> > > The C code in handle_pte_fault() will fixup missing access and dirty
> > > if necessary and flush.
> > >
> > > Also look at the 440 code, I think you could simplify your
> > > implementation using similar techniques, such as andc of PTE against
> > > requested bits etc... and thus maybe save a couple of insns.
> >
> > Great, but first I want to make sure I doing it right :)
> >
> > So is there some golden rule I have to follow?
>
> Mostly only !ACCESSED -> no access permitted and !DIRTY -> no store
> permitted and don't write anything back if !VALID.

That should be !RW and not !DIRTY I hope? Then trap
first store and set DIRTY (without trapping to C)

>
> Cheers,
> Ben.
>
> >  Jocke
> >
> > >
> > > Cheers,
> > > Ben.
> > >
> > > >  - Proper RO/RW mapping of user space.
> > > >  - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
> > > > Cons:
> > > >  - 4 more instructions in I/D TLB Miss, but the since the linux pte is
> > > >    not written anymore, it should still be a win.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 12:16 [PATCH 0/6] PowerPc 8xx TLB/MMU fixes Joakim Tjernlund
  2009-10-05 12:16 ` [PATCH 1/6] 8xx: DTLB Error must check for more errors Joakim Tjernlund
  2009-10-05 18:12 ` [PATCH 0/6] PowerPc 8xx TLB/MMU fixes Scott Wood
@ 2009-10-05 22:04 ` Rex Feany
  2009-10-05 22:31   ` Joakim Tjernlund
  2 siblings, 1 reply; 49+ messages in thread
From: Rex Feany @ 2009-10-05 22:04 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev

Thus spake Joakim Tjernlund (Joakim.Tjernlund@transmode.se):

> Scott and Rex, please disregard other patches from me and
> try these out instead.

I have results similar to Scott's. I tried both with and without patch 
5 & 6, and I also need the tlbia_va in ptep_set_access_flags().

I got this oops:

Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000e110
Oops: Kernel access of bad area, sig: 11 [#1]
MRV NM2
NIP: c000e110 LR: c000d520 CTR: 1006bf40
REGS: c2203dd0 TRAP: 0300   Not tainted  (2.6.32-rc3-00014-gcea49b0-dirty)
MSR: 00009032 <EE,ME,IR,DR>  CR: 44022422  XER: 20000000
DAR: 00000000, DSISR: c0000000
TASK = c21c18c0[69] 'rc.sysinit' THREAD: c2202000
GPR00: 00000401 c2203e80 c21c18c0 c2203f50 00000000 4000d032 00000000 00000000
GPR08: 10093eb4 00000000 00009032 c000d514 021c1ad0 10095c50 03ffb000 10006500
GPR16: 10006508 00000000 00000000 00000000 100a2fc8 021c4238 00000000 00000400
GPR24: c21a2700 00000002 40000000 00000000 00000000 c2203f50 00000000 00000011
NIP [c000e110] do_page_fault+0x44/0x5fc
LR [c000d520] handle_page_fault+0xc/0x80
Call Trace:
[c2203e80] [c000e58c] do_page_fault+0x4c0/0x5fc (unreliable)
[c2203f40] [c000d520] handle_page_fault+0xc/0x80
Instruction dump:
800300a0 7cba2b78 54170036 2f970400 7c9c2378 830200e0 54b6018c 40be000c
74ba4820 3ac00000 813d0080 3bc00000 <80a90000> 54a036be 2f80001f 40be0100
---[ end trace 99a4d88f7e2f1b60 ]---

this happens in do_page_fault becaose regs->nip is null, so

                insn = *((unsigned long *)regs->nip);
c000e110:       80 a9 00 00     lwz     r5,0(r9)

fails.

take care!
/rex.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 22:00             ` Joakim Tjernlund
@ 2009-10-05 22:09               ` Benjamin Herrenschmidt
  2009-10-05 22:55                 ` Joakim Tjernlund
  0 siblings, 1 reply; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-05 22:09 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

On Tue, 2009-10-06 at 00:00 +0200, Joakim Tjernlund wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 23:37:23:
> >
> > On Mon, 2009-10-05 at 23:25 +0200, Joakim Tjernlund wrote:
> > >
> > > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 22:17:04:
> > > >
> > > > On Mon, 2009-10-05 at 14:16 +0200, Joakim Tjernlund wrote:
> > > > > Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
> > > > > Pros:
> > > > >  - I/D TLB Miss never needs to write to the linux pte.
> > > > >  - _PAGE_ACCESSED is only set on I/D TLB Error fixing accounting
> > > > >  - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
> > > > >     when a page has been made dirty.
> > > >
> > > > Not sure here. You seem to still set those from asm. Is that necessary ?
> > >
> > > Well, yes. head_32.S also sets ACCESSED and DIRTY from asm so why not me?
> >
> > Don't look at the hash 32 code :-)
> 
> So what to look at then? :)

head_440.S on a recent 2.6

> > Most of the time you do anyways since the PTE isn't populated at all. At
> > which point, Linux will populate a PTE with DIRTY and ACCESSED already
> > set. It should be reasonably rare to actually fault because DIRTY and/or
> > ACCESSED are missing.
> 
> I tried to unconditionally trap to C in DTLB error but it just hung if I did
> that so the asm has to do something.

Sure, the question is what and how can we get away without it ? :-)

> > > > The approach I take on BookE is to simply not set these from asm, -and-
> > > > (this is important) not set write permission if dirty is not set in
> 
> Did you really mean "if dirty is not" ? I test RW for write permission.
> Dirty is just set when the first write happens after the permission check.

Sure but if dirty is cleared by the kernel, then you also need to remove
write permission in the TLB or it will miss setting dirty on the next
store to the page.

So dirty basically acts as a filter on RW, and accessed as a filter on
valid if you want.

> > Mostly only !ACCESSED -> no access permitted and !DIRTY -> no store
> > permitted and don't write anything back if !VALID.
> 
> That should be !RW and not !DIRTY I hope? Then trap
> first store and set DIRTY (without trapping to C)

No it's really !(RW _AND_ DIRTY) -> no store permitted, and
!(PRESENT _AND_ ACCESSED) -> no access permitted.

Cheers,
Ben.

> >
> > Cheers,
> > Ben.
> >
> > >  Jocke
> > >
> > > >
> > > > Cheers,
> > > > Ben.
> > > >
> > > > >  - Proper RO/RW mapping of user space.
> > > > >  - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
> > > > > Cons:
> > > > >  - 4 more instructions in I/D TLB Miss, but the since the linux pte is
> > > > >    not written anymore, it should still be a win.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 22:04 ` Rex Feany
@ 2009-10-05 22:31   ` Joakim Tjernlund
  2009-10-05 22:37     ` Benjamin Herrenschmidt
  2009-10-05 22:42     ` Rex Feany
  0 siblings, 2 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 22:31 UTC (permalink / raw)
  To: Rex Feany; +Cc: Scott Wood, linuxppc-dev

Rex Feany <RFeany@mrv.com> wrote on 06/10/2009 00:04:20:
>
> Thus spake Joakim Tjernlund (Joakim.Tjernlund@transmode.se):
>
> > Scott and Rex, please disregard other patches from me and
> > try these out instead.
>
> I have results similar to Scott's. I tried both with and without patch
> 5 & 6, and I also need the tlbia_va in ptep_set_access_flags().
>
> I got this oops:
>
> Unable to handle kernel paging request for data at address 0x00000000
> Faulting instruction address: 0xc000e110
> Oops: Kernel access of bad area, sig: 11 [#1]
> MRV NM2
> NIP: c000e110 LR: c000d520 CTR: 1006bf40
> REGS: c2203dd0 TRAP: 0300   Not tainted  (2.6.32-rc3-00014-gcea49b0-dirty)
> MSR: 00009032 <EE,ME,IR,DR>  CR: 44022422  XER: 20000000
> DAR: 00000000, DSISR: c0000000
> TASK = c21c18c0[69] 'rc.sysinit' THREAD: c2202000
> GPR00: 00000401 c2203e80 c21c18c0 c2203f50 00000000 4000d032 00000000 00000000
> GPR08: 10093eb4 00000000 00009032 c000d514 021c1ad0 10095c50 03ffb000 10006500
> GPR16: 10006508 00000000 00000000 00000000 100a2fc8 021c4238 00000000 00000400
> GPR24: c21a2700 00000002 40000000 00000000 00000000 c2203f50 00000000 00000011
> NIP [c000e110] do_page_fault+0x44/0x5fc
> LR [c000d520] handle_page_fault+0xc/0x80
> Call Trace:
> [c2203e80] [c000e58c] do_page_fault+0x4c0/0x5fc (unreliable)
> [c2203f40] [c000d520] handle_page_fault+0xc/0x80
> Instruction dump:
> 800300a0 7cba2b78 54170036 2f970400 7c9c2378 830200e0 54b6018c 40be000c
> 74ba4820 3ac00000 813d0080 3bc00000 <80a90000> 54a036be 2f80001f 40be0100
> ---[ end trace 99a4d88f7e2f1b60 ]---
>
> this happens in do_page_fault becaose regs->nip is null, so

regs or regs->nip is NULL? Either one does not make sense
In any case it might be a secondary problem as DAR is NULL already when you
enter the page fault.
>
>                 insn = *((unsigned long *)regs->nip);
> c000e110:       80 a9 00 00     lwz     r5,0(r9)
>
> fails.

hmm, I wonder if you managed to invalidate the a kernel TLB?
Are you using pinned kernel TLBs?

   Jocke

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 22:31   ` Joakim Tjernlund
@ 2009-10-05 22:37     ` Benjamin Herrenschmidt
  2009-10-05 22:58       ` Joakim Tjernlund
  2009-10-05 23:49       ` Joakim Tjernlund
  2009-10-05 22:42     ` Rex Feany
  1 sibling, 2 replies; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-05 22:37 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

On Tue, 2009-10-06 at 00:31 +0200, Joakim Tjernlund wrote:
> 
> regs or regs->nip is NULL? Either one does not make sense
> In any case it might be a secondary problem as DAR is NULL already
> when you
> enter the page fault.
> >
> >                 insn = *((unsigned long *)regs->nip);
> > c000e110:       80 a9 00 00     lwz     r5,0(r9)
> >
> > fails.
> 
> hmm, I wonder if you managed to invalidate the a kernel TLB?
> Are you using pinned kernel TLBs? 

You should not dereference a user address like that. Use get_user !

Obviously you got 0 in SRR0 for some reason (somebody tried to jump
to 0, either intentionally or as a result of some other problem) and
the above will crash the kernel when it happens.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 22:31   ` Joakim Tjernlund
  2009-10-05 22:37     ` Benjamin Herrenschmidt
@ 2009-10-05 22:42     ` Rex Feany
  2009-10-05 23:00       ` Joakim Tjernlund
  2009-10-06  6:25       ` Joakim Tjernlund
  1 sibling, 2 replies; 49+ messages in thread
From: Rex Feany @ 2009-10-05 22:42 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev

Thus spake Joakim Tjernlund (joakim.tjernlund@transmode.se):

> > I got this oops:
> >
> > Unable to handle kernel paging request for data at address 0x00000000
> > Faulting instruction address: 0xc000e110
> > Oops: Kernel access of bad area, sig: 11 [#1]
> > MRV NM2
> > NIP: c000e110 LR: c000d520 CTR: 1006bf40
> > REGS: c2203dd0 TRAP: 0300   Not tainted  (2.6.32-rc3-00014-gcea49b0-dirty)
> > MSR: 00009032 <EE,ME,IR,DR>  CR: 44022422  XER: 20000000
> > DAR: 00000000, DSISR: c0000000
> > TASK = c21c18c0[69] 'rc.sysinit' THREAD: c2202000
> > GPR00: 00000401 c2203e80 c21c18c0 c2203f50 00000000 4000d032 00000000 00000000
> > GPR08: 10093eb4 00000000 00009032 c000d514 021c1ad0 10095c50 03ffb000 10006500
> > GPR16: 10006508 00000000 00000000 00000000 100a2fc8 021c4238 00000000 00000400
> > GPR24: c21a2700 00000002 40000000 00000000 00000000 c2203f50 00000000 00000011
> > NIP [c000e110] do_page_fault+0x44/0x5fc
> > LR [c000d520] handle_page_fault+0xc/0x80
> > Call Trace:
> > [c2203e80] [c000e58c] do_page_fault+0x4c0/0x5fc (unreliable)
> > [c2203f40] [c000d520] handle_page_fault+0xc/0x80
> > Instruction dump:
> > 800300a0 7cba2b78 54170036 2f970400 7c9c2378 830200e0 54b6018c 40be000c
> > 74ba4820 3ac00000 813d0080 3bc00000 <80a90000> 54a036be 2f80001f 40be0100
> > ---[ end trace 99a4d88f7e2f1b60 ]---
> >
> > this happens in do_page_fault becaose regs->nip is null, so
> 
> regs or regs->nip is NULL? Either one does not make sense
> In any case it might be a secondary problem as DAR is NULL already when you
> enter the page fault.

I assumed it was NIP because ... I'm not sure why. TRAP() above
dereferences regs, and that didn't fail, but I didn't see that until
now.

> >                 insn = *((unsigned long *)regs->nip);
> > c000e110:       80 a9 00 00     lwz     r5,0(r9)
> 
> hmm, I wonder if you managed to invalidate the a kernel TLB?
> Are you using pinned kernel TLBs?

I'm not using pinned kernel TLBs, should I be?

take care!
/rex.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 22:09               ` Benjamin Herrenschmidt
@ 2009-10-05 22:55                 ` Joakim Tjernlund
  2009-10-05 23:15                   ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 22:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 00:09:16:
>
> On Tue, 2009-10-06 at 00:00 +0200, Joakim Tjernlund wrote:
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 23:37:23:
> > >
> > > On Mon, 2009-10-05 at 23:25 +0200, Joakim Tjernlund wrote:
> > > >
> > > > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/10/2009 22:17:04:
> > > > >
> > > > > On Mon, 2009-10-05 at 14:16 +0200, Joakim Tjernlund wrote:
> > > > > > Update the TLB asm to make proper use of _PAGE_DIRY and _PAGE_ACCESSED.
> > > > > > Pros:
> > > > > >  - I/D TLB Miss never needs to write to the linux pte.
> > > > > >  - _PAGE_ACCESSED is only set on I/D TLB Error fixing accounting
> > > > > >  - _PAGE_DIRTY is mapped to 0x100, the changed bit, and is set directly
> > > > > >     when a page has been made dirty.
> > > > >
> > > > > Not sure here. You seem to still set those from asm. Is that necessary ?
> > > >
> > > > Well, yes. head_32.S also sets ACCESSED and DIRTY from asm so why not me?
> > >
> > > Don't look at the hash 32 code :-)
> >
> > So what to look at then? :)
>
> head_440.S on a recent 2.6

OK, will look

>
> > > Most of the time you do anyways since the PTE isn't populated at all. At
> > > which point, Linux will populate a PTE with DIRTY and ACCESSED already
> > > set. It should be reasonably rare to actually fault because DIRTY and/or
> > > ACCESSED are missing.
> >
> > I tried to unconditionally trap to C in DTLB error but it just hung if I did
> > that so the asm has to do something.
>
> Sure, the question is what and how can we get away without it ? :-)

You tell me :)

>
> > > > > The approach I take on BookE is to simply not set these from asm, -and-
> > > > > (this is important) not set write permission if dirty is not set in
> >
> > Did you really mean "if dirty is not" ? I test RW for write permission.
> > Dirty is just set when the first write happens after the permission check.
>
> Sure but if dirty is cleared by the kernel, then you also need to remove
> write permission in the TLB or it will miss setting dirty on the next
> store to the page.

Dirty map to HW changed bit so if kernel clears it, the next
store will trap to DTLB error. There I will check RW and clear it again
without trapping to C. Is that OK? Not if I read you correcly below ..

>
> So dirty basically acts as a filter on RW, and accessed as a filter on
> valid if you want.
>
> > > Mostly only !ACCESSED -> no access permitted and !DIRTY -> no store
> > > permitted and don't write anything back if !VALID.
> >
> > That should be !RW and not !DIRTY I hope? Then trap
> > first store and set DIRTY (without trapping to C)
>
> No it's really !(RW _AND_ DIRTY) -> no store permitted, and

..  hmm, I don't do that. I should do a
    if ( store && !(pte & (RW | DIRTY))
       goto DSI
 in DTLB error?

> !(PRESENT _AND_ ACCESSED) -> no access permitted.

Yes, how about those pinned kernel ptes. Do they have ACCESSED from start?

>
> Cheers,
> Ben.
>
> > >
> > > Cheers,
> > > Ben.
> > >
> > > >  Jocke
> > > >
> > > > >
> > > > > Cheers,
> > > > > Ben.
> > > > >
> > > > > >  - Proper RO/RW mapping of user space.
> > > > > >  - Free up 2 SW TLB bits in the linux pte(add back _PAGE_WRITETHRU ?)
> > > > > > Cons:
> > > > > >  - 4 more instructions in I/D TLB Miss, but the since the linux pte is
> > > > > >    not written anymore, it should still be a win.
>
>
>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 22:37     ` Benjamin Herrenschmidt
@ 2009-10-05 22:58       ` Joakim Tjernlund
  2009-10-05 23:49       ` Joakim Tjernlund
  1 sibling, 0 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 22:58 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 00:37:28:
>
> On Tue, 2009-10-06 at 00:31 +0200, Joakim Tjernlund wrote:
> >
> > regs or regs->nip is NULL? Either one does not make sense
> > In any case it might be a secondary problem as DAR is NULL already
> > when you
> > enter the page fault.
> > >
> > >                 insn = *((unsigned long *)regs->nip);
> > > c000e110:       80 a9 00 00     lwz     r5,0(r9)
> > >
> > > fails.
> >
> > hmm, I wonder if you managed to invalidate the a kernel TLB?
> > Are you using pinned kernel TLBs?
>
> You should not dereference a user address like that. Use get_user !

Ah, forgot about that. Will change

>
> Obviously you got 0 in SRR0 for some reason (somebody tried to jump
> to 0, either intentionally or as a result of some other problem) and
> the above will crash the kernel when it happens.
>
> Cheers,
> Ben.
>
>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 22:42     ` Rex Feany
@ 2009-10-05 23:00       ` Joakim Tjernlund
  2009-10-06  6:25       ` Joakim Tjernlund
  1 sibling, 0 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 23:00 UTC (permalink / raw)
  To: Rex Feany; +Cc: Scott Wood, linuxppc-dev

Rex Feany <RFeany@mrv.com> wrote on 06/10/2009 00:42:18:
>
> Thus spake Joakim Tjernlund (joakim.tjernlund@transmode.se):
>
> > > I got this oops:
> > >
> > > Unable to handle kernel paging request for data at address 0x00000000
> > > Faulting instruction address: 0xc000e110
> > > Oops: Kernel access of bad area, sig: 11 [#1]
> > > MRV NM2
> > > NIP: c000e110 LR: c000d520 CTR: 1006bf40
> > > REGS: c2203dd0 TRAP: 0300   Not tainted  (2.6.32-rc3-00014-gcea49b0-dirty)
> > > MSR: 00009032 <EE,ME,IR,DR>  CR: 44022422  XER: 20000000
> > > DAR: 00000000, DSISR: c0000000
> > > TASK = c21c18c0[69] 'rc.sysinit' THREAD: c2202000
> > > GPR00: 00000401 c2203e80 c21c18c0 c2203f50 00000000 4000d032 00000000 00000000
> > > GPR08: 10093eb4 00000000 00009032 c000d514 021c1ad0 10095c50 03ffb000 10006500
> > > GPR16: 10006508 00000000 00000000 00000000 100a2fc8 021c4238 00000000 00000400
> > > GPR24: c21a2700 00000002 40000000 00000000 00000000 c2203f50 00000000 00000011
> > > NIP [c000e110] do_page_fault+0x44/0x5fc
> > > LR [c000d520] handle_page_fault+0xc/0x80
> > > Call Trace:
> > > [c2203e80] [c000e58c] do_page_fault+0x4c0/0x5fc (unreliable)
> > > [c2203f40] [c000d520] handle_page_fault+0xc/0x80
> > > Instruction dump:
> > > 800300a0 7cba2b78 54170036 2f970400 7c9c2378 830200e0 54b6018c 40be000c
> > > 74ba4820 3ac00000 813d0080 3bc00000 <80a90000> 54a036be 2f80001f 40be0100
> > > ---[ end trace 99a4d88f7e2f1b60 ]---
> > >
> > > this happens in do_page_fault becaose regs->nip is null, so
> >
> > regs or regs->nip is NULL? Either one does not make sense
> > In any case it might be a secondary problem as DAR is NULL already when you
> > enter the page fault.
>
> I assumed it was NIP because ... I'm not sure why. TRAP() above
> dereferences regs, and that didn't fail, but I didn't see that until
> now.

It probably was and I need to use get_user() instead.

>
> > >                 insn = *((unsigned long *)regs->nip);
> > > c000e110:       80 a9 00 00     lwz     r5,0(r9)
> >
> > hmm, I wonder if you managed to invalidate the a kernel TLB?
> > Are you using pinned kernel TLBs?
>
> I'm not using pinned kernel TLBs, should I be?

Pinned TLBs is usally faster unless you know better or so I think.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 22:55                 ` Joakim Tjernlund
@ 2009-10-05 23:15                   ` Benjamin Herrenschmidt
  2009-10-05 23:35                     ` Joakim Tjernlund
  0 siblings, 1 reply; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-05 23:15 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

On Tue, 2009-10-06 at 00:55 +0200, Joakim Tjernlund wrote:
> > Sure but if dirty is cleared by the kernel, then you also need to
> remove
> > write permission in the TLB or it will miss setting dirty on the
> next
> > store to the page.
> 
> Dirty map to HW changed bit so if kernel clears it, the next
> store will trap to DTLB error. There I will check RW and clear it
> again
> without trapping to C. Is that OK? Not if I read you correcly below ..

Well, if the HW has the ability to enforce trap when store with !DIRTY,
then that's fine, just map it that way, but you shouldn't have to handle
it in the DTLB error neither, the kernel will fix it up for you in
handle_pte_fault().

I think I need to get myself an 8xx manual and figure out what that damn
MMU actually does :-)

> No it's really !(RW _AND_ DIRTY) -> no store permitted, and
> 
> ..  hmm, I don't do that. I should do a
>     if ( store && !(pte & (RW | DIRTY))
>        goto DSI
>  in DTLB error?

Well, if the HW does the test of DIRTY, then it's fine, as you seem to
suggest above. _something_ needs to do it, either SW or HW, ie, we need
to catch any attempt to store to a non-dirty page to set dirty.

> > !(PRESENT _AND_ ACCESSED) -> no access permitted.
> 
> Yes, how about those pinned kernel ptes. Do they have ACCESSED from
> start?

They should.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 23:15                   ` Benjamin Herrenschmidt
@ 2009-10-05 23:35                     ` Joakim Tjernlund
  2009-10-06  0:34                       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 23:35 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 01:15:39:
>
> On Tue, 2009-10-06 at 00:55 +0200, Joakim Tjernlund wrote:
> > > Sure but if dirty is cleared by the kernel, then you also need to
> > remove
> > > write permission in the TLB or it will miss setting dirty on the
> > next
> > > store to the page.
> >
> > Dirty map to HW changed bit so if kernel clears it, the next
> > store will trap to DTLB error. There I will check RW and clear it
> > again
> > without trapping to C. Is that OK? Not if I read you correcly below ..
>
> Well, if the HW has the ability to enforce trap when store with !DIRTY,

Yes, provided that the kernel invalidates the TLB too so the next access
will provoke a TLB Miss, which will then provoke a TLB error. The TLB
error routine checks VALID, RW and USER(if not a kernel access), then sets
ACCESSED & DIRTY and writes the TLB(RPN reg).

Perhaps the missing invalidate is haunting us here?

> then that's fine, just map it that way, but you shouldn't have to handle
> it in the DTLB error neither, the kernel will fix it up for you in
> handle_pte_fault().

Does not all ppc have the Changed bit?

>
> I think I need to get myself an 8xx manual and figure out what that damn
> MMU actually does :-)

Please do, get the mpc862 users manual :)

>
> > No it's really !(RW _AND_ DIRTY) -> no store permitted, and
> >
> > ..  hmm, I don't do that. I should do a
> >     if ( store && !(pte & (RW | DIRTY))
> >        goto DSI
> >  in DTLB error?
>
> Well, if the HW does the test of DIRTY, then it's fine, as you seem to
> suggest above. _something_ needs to do it, either SW or HW, ie, we need
> to catch any attempt to store to a non-dirty page to set dirty.

Yes, I think I do this.

>
> > > !(PRESENT _AND_ ACCESSED) -> no access permitted.
> >
> > Yes, how about those pinned kernel ptes. Do they have ACCESSED from
> > start?
>
> They should.
>
> Cheers,
> Ben.
>
>
>
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 22:37     ` Benjamin Herrenschmidt
  2009-10-05 22:58       ` Joakim Tjernlund
@ 2009-10-05 23:49       ` Joakim Tjernlund
  2009-10-06  1:52         ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-05 23:49 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 00:37:28:
>
> On Tue, 2009-10-06 at 00:31 +0200, Joakim Tjernlund wrote:
> >
> > regs or regs->nip is NULL? Either one does not make sense
> > In any case it might be a secondary problem as DAR is NULL already
> > when you
> > enter the page fault.
> > >
> > >                 insn = *((unsigned long *)regs->nip);
> > > c000e110:       80 a9 00 00     lwz     r5,0(r9)
> > >
> > > fails.
> >
> > hmm, I wonder if you managed to invalidate the a kernel TLB?
> > Are you using pinned kernel TLBs?
>
> You should not dereference a user address like that. Use get_user !

So how does this look? Does it change anything?
It should as the previous way was way off :(

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index c33c6de..08a392f 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -153,7 +153,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 #ifdef DEBUG_DCBX
 		const char *istr = NULL;

-		insn = *((unsigned long *)regs->nip);
+		__get_user(insn, (unsigned long __user *)regs->nip);
 		if (((insn >> (31-5)) & 0x3f) == 31) {
 			if (((insn >> 1) & 0x3ff) == 1014) /* dcbz ? 0x3f6 */
 				istr = "dcbz";
@@ -178,11 +178,12 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 					       ra, rb, dar);
 					is_write = 0;
 				}
-
+#if 0
 				if (trap == 0x300 && address != dar) {
 					__asm__ ("mtdar %0" : : "r" (dar));
 					return 0;
 				}
+#endif
 			}
 		}
 #endif
@@ -191,7 +192,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,

 			/* This is from a dcbX or icbi insn gone bad, these
 			 * insn do not set DAR so we have to do it here instead */
-			insn = *((unsigned long *)regs->nip);
+			__get_user(insn, (unsigned long __user *)regs->nip);

 			ra = (insn >> (31-15)) & 0x1f; /* Reg RA */
 			rb = (insn >> (31-20)) & 0x1f; /* Reg RB */

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-05 23:35                     ` Joakim Tjernlund
@ 2009-10-06  0:34                       ` Benjamin Herrenschmidt
  2009-10-06  6:15                         ` Joakim Tjernlund
  2009-10-06 22:05                         ` Joakim Tjernlund
  0 siblings, 2 replies; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-06  0:34 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

On Tue, 2009-10-06 at 01:35 +0200, Joakim Tjernlund wrote:
> 
> > Well, if the HW has the ability to enforce trap when store with !
> DIRTY,
> 
> Yes, provided that the kernel invalidates the TLB too so the next
> access
> will provoke a TLB Miss, which will then provoke a TLB error. The TLB
> error routine checks VALID, RW and USER(if not a kernel access), then
> sets
> ACCESSED & DIRTY and writes the TLB(RPN reg).
> 
> Perhaps the missing invalidate is haunting us here?

No, the kernel will invalidate when clearing dirty or accessed, I don't
think that's our problem.

This is still all inefficient, we end up basically with two traps.

8xx provides backup GPRs when doing TLB misses ? What does it cost to
jump out of a TLB miss back into "normal" context ?

IE. What I do on 440 is I set a mask of required bits, basically
_PAGE_PRESENT | _PAGE_ACCESSED is the base. The DTLB miss also sticks
in _PAGE_RW | _PAGE_DIRTY when it's a store fault.

Then, I andc. the PTE value off that mask, and if the result is non-0
(which means one of the required bits is missing), I get out of the TLB
miss immediately and go to the data (or instruction) access interrupt.

Once you've done that, you should be able to have data and instruction
access go straight to C. Missing _PAGE_ACCESSED and _PAGE_DIRTY are
going to be fixed up by generic code.

> > then that's fine, just map it that way, but you shouldn't have to
> handle
> > it in the DTLB error neither, the kernel will fix it up for you in
> > handle_pte_fault().
> 
> Does not all ppc have the Changed bit?

No. BookE doesn't.

> Please do, get the mpc862 users manual :)

Ok :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 23:49       ` Joakim Tjernlund
@ 2009-10-06  1:52         ` Benjamin Herrenschmidt
  2009-10-06  8:06           ` Joakim Tjernlund
  0 siblings, 1 reply; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-06  1:52 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

\
> So how does this look? Does it change anything?
> It should as the previous way was way off :(
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index c33c6de..08a392f 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -153,7 +153,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
>  #ifdef DEBUG_DCBX
>  		const char *istr = NULL;
> 
> -		insn = *((unsigned long *)regs->nip);
> +		__get_user(insn, (unsigned long __user *)regs->nip);

No, use get_user() not __get_user() or if you use the later, also use
access_ok(), and test the result in case it errors (if it does, you
probably want to just goto bad access and SEGV).

Cheers,
Ben.

>  		if (((insn >> (31-5)) & 0x3f) == 31) {
>  			if (((insn >> 1) & 0x3ff) == 1014) /* dcbz ? 0x3f6 */
>  				istr = "dcbz";
> @@ -178,11 +178,12 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
>  					       ra, rb, dar);
>  					is_write = 0;
>  				}
> -
> +#if 0
>  				if (trap == 0x300 && address != dar) {
>  					__asm__ ("mtdar %0" : : "r" (dar));
>  					return 0;
>  				}
> +#endif
>  			}
>  		}
>  #endif
> @@ -191,7 +192,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
> 
>  			/* This is from a dcbX or icbi insn gone bad, these
>  			 * insn do not set DAR so we have to do it here instead */
> -			insn = *((unsigned long *)regs->nip);
> +			__get_user(insn, (unsigned long __user *)regs->nip);
> 
>  			ra = (insn >> (31-15)) & 0x1f; /* Reg RA */
>  			rb = (insn >> (31-20)) & 0x1f; /* Reg RB */
> 
> 

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-06  0:34                       ` Benjamin Herrenschmidt
@ 2009-10-06  6:15                         ` Joakim Tjernlund
  2009-10-06  6:45                           ` Benjamin Herrenschmidt
  2009-10-06 22:05                         ` Joakim Tjernlund
  1 sibling, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06  6:15 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 02:34:15:
>
> On Tue, 2009-10-06 at 01:35 +0200, Joakim Tjernlund wrote:
> >
> > > Well, if the HW has the ability to enforce trap when store with !
> > DIRTY,
> >
> > Yes, provided that the kernel invalidates the TLB too so the next
> > access
> > will provoke a TLB Miss, which will then provoke a TLB error. The TLB
> > error routine checks VALID, RW and USER(if not a kernel access), then
> > sets
> > ACCESSED & DIRTY and writes the TLB(RPN reg).
> >
> > Perhaps the missing invalidate is haunting us here?
>
> No, the kernel will invalidate when clearing dirty or accessed, I don't
> think that's our problem.
>
> This is still all inefficient, we end up basically with two traps.

Yes, but once the 2 traps is over, it gets much cheaper plus I don't
get a choice, see below.

>
> 8xx provides backup GPRs when doing TLB misses ? What does it cost to
> jump out of a TLB miss back into "normal" context ?

Nope, there is just one TLB scratch register. I have been meaning to
ask you about SPRG2, it seems unused?
There is a leftover from 2.4 that inits G2 to something but the
it appears unused otherwise.

>
> IE. What I do on 440 is I set a mask of required bits, basically
> _PAGE_PRESENT | _PAGE_ACCESSED is the base. The DTLB miss also sticks
> in _PAGE_RW | _PAGE_DIRTY when it's a store fault.

Yes, I would too but TLB Miss knows nothing about load/store, protection etc.
because DSISR isn't set. So I cannot see any other way than the TLB Error way.

>
> Then, I andc. the PTE value off that mask, and if the result is non-0
> (which means one of the required bits is missing), I get out of the TLB
> miss immediately and go to the data (or instruction) access interrupt.
>
> Once you've done that, you should be able to have data and instruction
> access go straight to C. Missing _PAGE_ACCESSED and _PAGE_DIRTY are
> going to be fixed up by generic code.
>
> > > then that's fine, just map it that way, but you shouldn't have to
> > handle
> > > it in the DTLB error neither, the kernel will fix it up for you in
> > > handle_pte_fault().
> >
> > Does not all ppc have the Changed bit?
>
> No. BookE doesn't.

But I guess BookE knows if it is a load or store in TLB Miss?
Then it can emulate changed bit I guess.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-05 22:42     ` Rex Feany
  2009-10-05 23:00       ` Joakim Tjernlund
@ 2009-10-06  6:25       ` Joakim Tjernlund
  2009-10-06  6:44         ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06  6:25 UTC (permalink / raw)
  To: Rex Feany; +Cc: Scott Wood, linuxppc-dev

Rex Feany <RFeany@mrv.com> wrote on 06/10/2009 00:42:18:
>
> Thus spake Joakim Tjernlund (joakim.tjernlund@transmode.se):
>
> > > I got this oops:
> > >
> > > Unable to handle kernel paging request for data at address 0x00000000
> > > Faulting instruction address: 0xc000e110
> > > Oops: Kernel access of bad area, sig: 11 [#1]
> > > MRV NM2
> > > NIP: c000e110 LR: c000d520 CTR: 1006bf40
> > > REGS: c2203dd0 TRAP: 0300   Not tainted  (2.6.32-rc3-00014-gcea49b0-dirty)
> > > MSR: 00009032 <EE,ME,IR,DR>  CR: 44022422  XER: 20000000
> > > DAR: 00000000, DSISR: c0000000
> > > TASK = c21c18c0[69] 'rc.sysinit' THREAD: c2202000
> > > GPR00: 00000401 c2203e80 c21c18c0 c2203f50 00000000 4000d032 00000000 00000000
> > > GPR08: 10093eb4 00000000 00009032 c000d514 021c1ad0 10095c50 03ffb000 10006500
> > > GPR16: 10006508 00000000 00000000 00000000 100a2fc8 021c4238 00000000 00000400
> > > GPR24: c21a2700 00000002 40000000 00000000 00000000 c2203f50 00000000 00000011
> > > NIP [c000e110] do_page_fault+0x44/0x5fc
> > > LR [c000d520] handle_page_fault+0xc/0x80
> > > Call Trace:
> > > [c2203e80] [c000e58c] do_page_fault+0x4c0/0x5fc (unreliable)
> > > [c2203f40] [c000d520] handle_page_fault+0xc/0x80
> > > Instruction dump:
> > > 800300a0 7cba2b78 54170036 2f970400 7c9c2378 830200e0 54b6018c 40be000c
> > > 74ba4820 3ac00000 813d0080 3bc00000 <80a90000> 54a036be 2f80001f 40be0100
> > > ---[ end trace 99a4d88f7e2f1b60 ]---
> > >
> > > this happens in do_page_fault becaose regs->nip is null, so
> >
> > regs or regs->nip is NULL? Either one does not make sense
> > In any case it might be a secondary problem as DAR is NULL already when you
> > enter the page fault.
>
> I assumed it was NIP because ... I'm not sure why. TRAP() above
> dereferences regs, and that didn't fail, but I didn't see that until
> now.

Yes, it is a bit strange though that the kernel isn't allowed to read from
NULL. Is that expected?

 Jocke

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-06  6:25       ` Joakim Tjernlund
@ 2009-10-06  6:44         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-06  6:44 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany


> > I assumed it was NIP because ... I'm not sure why. TRAP() above
> > dereferences regs, and that didn't fail, but I didn't see that until
> > now.
> 
> Yes, it is a bit strange though that the kernel isn't allowed to read from
> NULL. Is that expected?

Yes, that's absolutely expected :-) You really want any NULL deref to
blow up asap.

0 is actually part of the address space assigned to user processes. By
default they don't have anything there neither though, but if the
current process do have something mapped there, then the kernel would
read that when doing a NULL deref.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-06  6:15                         ` Joakim Tjernlund
@ 2009-10-06  6:45                           ` Benjamin Herrenschmidt
  2009-10-06  7:54                             ` Joakim Tjernlund
  2009-10-06 15:40                             ` Joakim Tjernlund
  0 siblings, 2 replies; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-06  6:45 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

On Tue, 2009-10-06 at 08:15 +0200, Joakim Tjernlund wrote:

> Yes, I would too but TLB Miss knows nothing about load/store, protection etc.
> because DSISR isn't set. So I cannot see any other way than the TLB Error way.

Hrm... that MMU really sucks more than I thought :-(

I'll go read the manual and think about that a bit more.

> But I guess BookE knows if it is a load or store in TLB Miss?
> Then it can emulate changed bit I guess.

Yes, it knows.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-06  6:45                           ` Benjamin Herrenschmidt
@ 2009-10-06  7:54                             ` Joakim Tjernlund
  2009-10-06 15:40                             ` Joakim Tjernlund
  1 sibling, 0 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06  7:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 08:45:47:
>
> On Tue, 2009-10-06 at 08:15 +0200, Joakim Tjernlund wrote:
>
> > Yes, I would too but TLB Miss knows nothing about load/store, protection etc.
> > because DSISR isn't set. So I cannot see any other way than the TLB Error way.
>
> Hrm... that MMU really sucks more than I thought :-(

Yeah, but at least it got a Changed bit as opposed to some other
arch mention in this mail thread :)

>
> I'll go read the manual and think about that a bit more.
Thanks

I wonder if not much of the problems Scott and Rex are seeing are from
dcbX insn and that my fixup isn't quite working yet. I am tempted
to get the asm version working again as then I don't have to worry about
permission and get_user() et. all :)

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-06  1:52         ` Benjamin Herrenschmidt
@ 2009-10-06  8:06           ` Joakim Tjernlund
  2009-10-06  8:32             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06  8:06 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 03:52:15:
>
> \
> > So how does this look? Does it change anything?
> > It should as the previous way was way off :(
> >
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index c33c6de..08a392f 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -153,7 +153,7 @@ int __kprobes do_page_fault(struct pt_regs *regs,
> unsigned long address,
> >  #ifdef DEBUG_DCBX
> >        const char *istr = NULL;
> >
> > -      insn = *((unsigned long *)regs->nip);
> > +      __get_user(insn, (unsigned long __user *)regs->nip);
>
> No, use get_user() not __get_user() or if you use the later, also use
> access_ok(), and test the result in case it errors (if it does, you
> probably want to just goto bad access and SEGV).

OK, lets see what this gives us:

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index c33c6de..1bf91d3 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -153,7 +153,8 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 #ifdef DEBUG_DCBX
 		const char *istr = NULL;

-		insn = *((unsigned long *)regs->nip);
+		insn = 0;
+		__get_user(insn, (unsigned long __user *)regs->nip);
 		if (((insn >> (31-5)) & 0x3f) == 31) {
 			if (((insn >> 1) & 0x3ff) == 1014) /* dcbz ? 0x3f6 */
 				istr = "dcbz";
@@ -171,27 +172,32 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 				dar = regs->gpr[rb];
 				if (ra)
 					dar += regs->gpr[ra];
-				if (dar != address && address != 0x00f0 && trap == 0x300)
+				if (dar != address && trap == 0x300)
 					printk(KERN_CRIT "%s: address:%lx, dar:%lx!\n", istr, address, dar);
 				if (!strcmp(istr, "dcbst") && is_write) {
 					printk(KERN_CRIT "dcbst R%ld,R%ld = %lx as a store, fixing!\n",
 					       ra, rb, dar);
 					is_write = 0;
 				}
-
+#if 0
 				if (trap == 0x300 && address != dar) {
 					__asm__ ("mtdar %0" : : "r" (dar));
 					return 0;
 				}
+#endif
 			}
 		}
 #endif
 		if (address == 0x00f0 && trap == 0x300) {
-			pte_t *ptep;
+			//pte_t *ptep;

 			/* This is from a dcbX or icbi insn gone bad, these
 			 * insn do not set DAR so we have to do it here instead */
-			insn = *((unsigned long *)regs->nip);
+			if (get_user(insn, (unsigned long __user *)regs->nip)) {
+				printk(KERN_CRIT "get_user failed, NIP:%lx\n",
+				       regs->nip);
+				goto bad_area_nosemaphore;
+			}

 			ra = (insn >> (31-15)) & 0x1f; /* Reg RA */
 			rb = (insn >> (31-20)) & 0x1f; /* Reg RB */
@@ -206,7 +212,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 			       trap, address, dar, error_code, istr);
 #endif
 			address = dar;
-#if 1
+#if 0
 			if (is_write && get_pteptr(mm, dar, &ptep, NULL)) {
 				pte_t my_pte = *ptep;

@@ -216,7 +222,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 				}
 			}
 #else
-			return 0;
+			//return 0;
 #endif
 		}
 	}

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-06  8:06           ` Joakim Tjernlund
@ 2009-10-06  8:32             ` Benjamin Herrenschmidt
  2009-10-06 10:58               ` Joakim Tjernlund
  0 siblings, 1 reply; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-06  8:32 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany


> > No, use get_user() not __get_user() or if you use the later, also use
> > access_ok(), and test the result in case it errors (if it does, you
> > probably want to just goto bad access and SEGV).
> 
> OK, lets see what this gives us:

Hrm... did you change anything ? :-)

Ben.

> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index c33c6de..1bf91d3 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -153,7 +153,8 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
>  #ifdef DEBUG_DCBX
>  		const char *istr = NULL;
> 
> -		insn = *((unsigned long *)regs->nip);
> +		insn = 0;
> +		__get_user(insn, (unsigned long __user *)regs->nip);
>  		if (((insn >> (31-5)) & 0x3f) == 31) {
>  			if (((insn >> 1) & 0x3ff) == 1014) /* dcbz ? 0x3f6 */
>  				istr = "dcbz";
> @@ -171,27 +172,32 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
>  				dar = regs->gpr[rb];
>  				if (ra)
>  					dar += regs->gpr[ra];
> -				if (dar != address && address != 0x00f0 && trap == 0x300)
> +				if (dar != address && trap == 0x300)
>  					printk(KERN_CRIT "%s: address:%lx, dar:%lx!\n", istr, address, dar);
>  				if (!strcmp(istr, "dcbst") && is_write) {
>  					printk(KERN_CRIT "dcbst R%ld,R%ld = %lx as a store, fixing!\n",
>  					       ra, rb, dar);
>  					is_write = 0;
>  				}
> -
> +#if 0
>  				if (trap == 0x300 && address != dar) {
>  					__asm__ ("mtdar %0" : : "r" (dar));
>  					return 0;
>  				}
> +#endif
>  			}
>  		}
>  #endif
>  		if (address == 0x00f0 && trap == 0x300) {
> -			pte_t *ptep;
> +			//pte_t *ptep;
> 
>  			/* This is from a dcbX or icbi insn gone bad, these
>  			 * insn do not set DAR so we have to do it here instead */
> -			insn = *((unsigned long *)regs->nip);
> +			if (get_user(insn, (unsigned long __user *)regs->nip)) {
> +				printk(KERN_CRIT "get_user failed, NIP:%lx\n",
> +				       regs->nip);
> +				goto bad_area_nosemaphore;
> +			}
> 
>  			ra = (insn >> (31-15)) & 0x1f; /* Reg RA */
>  			rb = (insn >> (31-20)) & 0x1f; /* Reg RB */
> @@ -206,7 +212,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
>  			       trap, address, dar, error_code, istr);
>  #endif
>  			address = dar;
> -#if 1
> +#if 0
>  			if (is_write && get_pteptr(mm, dar, &ptep, NULL)) {
>  				pte_t my_pte = *ptep;
> 
> @@ -216,7 +222,7 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
>  				}
>  			}
>  #else
> -			return 0;
> +			//return 0;
>  #endif
>  		}
>  	}

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-06  8:32             ` Benjamin Herrenschmidt
@ 2009-10-06 10:58               ` Joakim Tjernlund
  2009-10-06 11:06                 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06 10:58 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

>
>
> > > No, use get_user() not __get_user() or if you use the later, also use
> > > access_ok(), and test the result in case it errors (if it does, you
> > > probably want to just goto bad access and SEGV).
> >
> > OK, lets see what this gives us:
>
> Hrm... did you change anything ? :-)

Yes, see below

>
> Ben.
>
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index c33c6de..1bf91d3 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -153,7 +153,8 @@ int __kprobes do_page_fault(struct pt_regs *regs,
> unsigned long address,
> >  #ifdef DEBUG_DCBX
> >        const char *istr = NULL;
> >
> > -      insn = *((unsigned long *)regs->nip);
> > +      insn = 0;
> > +      __get_user(insn, (unsigned long __user *)regs->nip);

Here I don't care if err. insn will be 0 if it fails and the following
if will be false

> >        if (((insn >> (31-5)) & 0x3f) == 31) {
> >           if (((insn >> 1) & 0x3ff) == 1014) /* dcbz ? 0x3f6 */
> >              istr = "dcbz";
> > @@ -171,27 +172,32 @@ int __kprobes do_page_fault(struct pt_regs *regs,
> unsigned long address,
> >              dar = regs->gpr[rb];
> >              if (ra)
> >                 dar += regs->gpr[ra];
> > -            if (dar != address && address != 0x00f0 && trap == 0x300)
> > +            if (dar != address && trap == 0x300)
> >                 printk(KERN_CRIT "%s: address:%lx, dar:%lx!\n", istr, address, dar);
> >              if (!strcmp(istr, "dcbst") && is_write) {
> >                 printk(KERN_CRIT "dcbst R%ld,R%ld = %lx as a store, fixing!\n",
> >                        ra, rb, dar);
> >                 is_write = 0;
> >              }
> > -
> > +#if 0
> >              if (trap == 0x300 && address != dar) {
> >                 __asm__ ("mtdar %0" : : "r" (dar));
> >                 return 0;
> >              }
> > +#endif
> >           }
> >        }
> >  #endif
> >        if (address == 0x00f0 && trap == 0x300) {
> > -         pte_t *ptep;
> > +         //pte_t *ptep;
> >
> >           /* This is from a dcbX or icbi insn gone bad, these
> >            * insn do not set DAR so we have to do it here instead */
> > -         insn = *((unsigned long *)regs->nip);
> > +         if (get_user(insn, (unsigned long __user *)regs->nip)) {
> > +            printk(KERN_CRIT "get_user failed, NIP:%lx\n",
> > +                   regs->nip);
> > +            goto bad_area_nosemaphore;
> > +         }

and here I go to bad_area

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-06 10:58               ` Joakim Tjernlund
@ 2009-10-06 11:06                 ` Benjamin Herrenschmidt
  2009-10-06 11:39                   ` Joakim Tjernlund
  2009-10-06 13:18                   ` Joakim Tjernlund
  0 siblings, 2 replies; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-06 11:06 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

On Tue, 2009-10-06 at 12:58 +0200, Joakim Tjernlund wrote:

> Here I don't care if err. insn will be 0 if it fails and the following
> if will be false

I'd rather you use get_user() so it does access_ok().

Else, you can probably manufacture some code that will make the kernel
access some MMIO register for example, which could be nasty.

At this point, you may as well also check the result even if indeed a
fault isn't going to matter. Just makes the code cleaner and avoids some
random janitor coming up with a patch later on :-)

Cheers,
Ben.

> > >        if (((insn >> (31-5)) & 0x3f) == 31) {
> > >           if (((insn >> 1) & 0x3ff) == 1014) /* dcbz ? 0x3f6 */
> > >              istr = "dcbz";
> > > @@ -171,27 +172,32 @@ int __kprobes do_page_fault(struct pt_regs *regs,
> > unsigned long address,
> > >              dar = regs->gpr[rb];
> > >              if (ra)
> > >                 dar += regs->gpr[ra];
> > > -            if (dar != address && address != 0x00f0 && trap == 0x300)
> > > +            if (dar != address && trap == 0x300)
> > >                 printk(KERN_CRIT "%s: address:%lx, dar:%lx!\n", istr, address, dar);
> > >              if (!strcmp(istr, "dcbst") && is_write) {
> > >                 printk(KERN_CRIT "dcbst R%ld,R%ld = %lx as a store, fixing!\n",
> > >                        ra, rb, dar);
> > >                 is_write = 0;
> > >              }
> > > -
> > > +#if 0
> > >              if (trap == 0x300 && address != dar) {
> > >                 __asm__ ("mtdar %0" : : "r" (dar));
> > >                 return 0;
> > >              }
> > > +#endif
> > >           }
> > >        }
> > >  #endif
> > >        if (address == 0x00f0 && trap == 0x300) {
> > > -         pte_t *ptep;
> > > +         //pte_t *ptep;
> > >
> > >           /* This is from a dcbX or icbi insn gone bad, these
> > >            * insn do not set DAR so we have to do it here instead */
> > > -         insn = *((unsigned long *)regs->nip);
> > > +         if (get_user(insn, (unsigned long __user *)regs->nip)) {
> > > +            printk(KERN_CRIT "get_user failed, NIP:%lx\n",
> > > +                   regs->nip);
> > > +            goto bad_area_nosemaphore;
> > > +         }
> 
> and here I go to bad_area

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-06 11:06                 ` Benjamin Herrenschmidt
@ 2009-10-06 11:39                   ` Joakim Tjernlund
  2009-10-06 13:18                   ` Joakim Tjernlund
  1 sibling, 0 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06 11:39 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany



Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 13:06:26:

> From:
>
> Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
> To:
>
> Joakim Tjernlund <joakim.tjernlund@transmode.se>
>
> Cc:
>
> "linuxppc-dev@ozlabs.org" <linuxppc-dev@ozlabs.org>, Rex Feany
> <RFeany@mrv.com>, Scott Wood <scottwood@freescale.com>
>
> Date:
>
> 06/10/2009 13:06
>
> Subject:
>
> Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
>
> On Tue, 2009-10-06 at 12:58 +0200, Joakim Tjernlund wrote:
>
> > Here I don't care if err. insn will be 0 if it fails and the following
> > if will be false
>
> I'd rather you use get_user() so it does access_ok().

It is only debug code so it will go away
The real access is later.

I need do kernel space separately, is there a better way than:

if (regs->nip & 0xc0000000) /* kernel space ? */
	insn = *((unsigned long *)regs->nip);
else if (get_user(insn, (unsigned long *)regs->nip)) {
	printk(KERN_CRIT "get_user! NIP:%lx\n", regs->nip); /* DEBUG */
	goto bad_area_nosemaphore;
}

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 0/6] PowerPc 8xx TLB/MMU fixes
  2009-10-06 11:06                 ` Benjamin Herrenschmidt
  2009-10-06 11:39                   ` Joakim Tjernlund
@ 2009-10-06 13:18                   ` Joakim Tjernlund
  1 sibling, 0 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06 13:18 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 13:06:26:
>
> On Tue, 2009-10-06 at 12:58 +0200, Joakim Tjernlund wrote:
>
> > Here I don't care if err. insn will be 0 if it fails and the following
> > if will be false
>
> I'd rather you use get_user() so it does access_ok().
>
> Else, you can probably manufacture some code that will make the kernel
> access some MMIO register for example, which could be nasty.
>
> At this point, you may as well also check the result even if indeed a
> fault isn't going to matter. Just makes the code cleaner and avoids some
> random janitor coming up with a patch later on :-)
>
> Cheers,
> Ben.

So my user space access was bust. Try slapping this on top
It might be that my crappy user space handling also tripped the
asm version, would be great if you could try that one again too.

I suspect that you both, Rex and Scott, have dcbX/icbi insn's in user
space that trip DTLB errors, that would explain everything I think.

      Jocke

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index c33c6de..d757ec8 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -152,8 +152,16 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 		unsigned long ra, rb, dar, insn;
 #ifdef DEBUG_DCBX
 		const char *istr = NULL;
+		int ret;
+
+		insn = 0;
+		if (user_mode(regs)) {
+			if ((ret = get_user(insn, (unsigned long __user *)regs->nip)))
+				printk(KERN_CRIT "get_user:NIP:0x%08lx, err:%d\n",
+				       regs->nip, ret);
+		} else
+			insn = *((unsigned long *)regs->nip);

-		insn = *((unsigned long *)regs->nip);
 		if (((insn >> (31-5)) & 0x3f) == 31) {
 			if (((insn >> 1) & 0x3ff) == 1014) /* dcbz ? 0x3f6 */
 				istr = "dcbz";
@@ -178,20 +186,27 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 					       ra, rb, dar);
 					is_write = 0;
 				}
-
+#if 0
 				if (trap == 0x300 && address != dar) {
 					__asm__ ("mtdar %0" : : "r" (dar));
 					return 0;
 				}
+#endif
 			}
 		}
 #endif
 		if (address == 0x00f0 && trap == 0x300) {
-			pte_t *ptep;
-
+			int ret;
 			/* This is from a dcbX or icbi insn gone bad, these
 			 * insn do not set DAR so we have to do it here instead */
-			insn = *((unsigned long *)regs->nip);
+			if (user_mode(regs)) {
+				if ((ret = get_user(insn, (unsigned long __user *)regs->nip))) {
+					printk(KERN_CRIT "get_user:NIP:%lx, err:%d\n",
+					       regs->nip, ret);
+					goto bad_area_nosemaphore;
+				}
+			} else
+				insn = *((unsigned long *)regs->nip);

 			ra = (insn >> (31-15)) & 0x1f; /* Reg RA */
 			rb = (insn >> (31-20)) & 0x1f; /* Reg RB */
@@ -206,18 +221,6 @@ int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 			       trap, address, dar, error_code, istr);
 #endif
 			address = dar;
-#if 1
-			if (is_write && get_pteptr(mm, dar, &ptep, NULL)) {
-				pte_t my_pte = *ptep;
-
-				if (pte_present(my_pte) && pte_write(my_pte)) {
-					pte_val(my_pte) |= _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE;
-					set_pte_at(mm, dar, ptep, my_pte);
-				}
-			}
-#else
-			return 0;
-#endif
 		}
 	}
 #endif

^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-06  6:45                           ` Benjamin Herrenschmidt
  2009-10-06  7:54                             ` Joakim Tjernlund
@ 2009-10-06 15:40                             ` Joakim Tjernlund
  2009-10-06 17:28                               ` Joakim Tjernlund
  1 sibling, 1 reply; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06 15:40 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 08:45:47:
>
> On Tue, 2009-10-06 at 08:15 +0200, Joakim Tjernlund wrote:
>
> > Yes, I would too but TLB Miss knows nothing about load/store, protection etc.
> > because DSISR isn't set. So I cannot see any other way than the TLB Error way.
>
> Hrm... that MMU really sucks more than I thought :-(
>
> I'll go read the manual and think about that a bit more.

I realise now that for an ITLB Miss I can work out everything I need.
an ITLB must be a read en everything else is in the linux pte.
Just check present and user, add accessed if not already there.
I blame that on too little sleep :)

I don't think I will dwell on that ATM.

    Jocke

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-06 15:40                             ` Joakim Tjernlund
@ 2009-10-06 17:28                               ` Joakim Tjernlund
  0 siblings, 0 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06 17:28 UTC (permalink / raw)
  Cc: Scott Wood, Rex Feany, linuxppc-dev

>
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 08:45:47:
> >
> > On Tue, 2009-10-06 at 08:15 +0200, Joakim Tjernlund wrote:
> >
> > > Yes, I would too but TLB Miss knows nothing about load/store, protection etc.
> > > because DSISR isn't set. So I cannot see any other way than the TLB Error way.
> >
> > Hrm... that MMU really sucks more than I thought :-(
> >
> > I'll go read the manual and think about that a bit more.
>
> I realise now that for an ITLB Miss I can work out everything I need.
> an ITLB must be a read en everything else is in the linux pte.
> Just check present and user, add accessed if not already there.
> I blame that on too little sleep :)

bugger,  more bugs I think. Need to fix the TLB code some more

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-06  0:34                       ` Benjamin Herrenschmidt
  2009-10-06  6:15                         ` Joakim Tjernlund
@ 2009-10-06 22:05                         ` Joakim Tjernlund
  2009-10-06 23:25                           ` Benjamin Herrenschmidt
  2009-10-07  1:07                           ` Benjamin Herrenschmidt
  1 sibling, 2 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-06 22:05 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 06/10/2009 02:34:15:
>
> On Tue, 2009-10-06 at 01:35 +0200, Joakim Tjernlund wrote:
> >
> > > Well, if the HW has the ability to enforce trap when store with !
> > DIRTY,
> >
> > Yes, provided that the kernel invalidates the TLB too so the next
> > access
> > will provoke a TLB Miss, which will then provoke a TLB error. The TLB
> > error routine checks VALID, RW and USER(if not a kernel access), then
> > sets
> > ACCESSED & DIRTY and writes the TLB(RPN reg).
> >
> > Perhaps the missing invalidate is haunting us here?
>
> No, the kernel will invalidate when clearing dirty or accessed, I don't
> think that's our problem.
>
> This is still all inefficient, we end up basically with two traps.
>
> 8xx provides backup GPRs when doing TLB misses ? What does it cost to
> jump out of a TLB miss back into "normal" context ?
>
> IE. What I do on 440 is I set a mask of required bits, basically
> _PAGE_PRESENT | _PAGE_ACCESSED is the base. The DTLB miss also sticks
> in _PAGE_RW | _PAGE_DIRTY when it's a store fault.

After some more thinking I don't think I do TLB Miss/Error correctly yet.
The problem is ACCESSED. Since I don't know if load or store in TLB Miss
I must choose:
 - Assume load and do what you do above. That will incorrectly
   set ACCESSED on store ops when mapped as RO(plus whatever more I haven't thought about yet)

 - Trap to TLB Error and do the above. That will set ACCESSED correctly
   but won't trap kernel space so these remain what they are.
   Is anything depending on ACCESSED for kernel pages?
   if so, what if we set ACCESSED on all kernel pages when mapping them at boot?
   Will SWAP or some other service(accounting?) not like that?

Finally, why do you need to include DIRTY when a store OP?
Do you need to do COW before dirtying the page?
Seems to work for me to just set DIRTY in TLB Error if RW is set too
and not trap to C.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-06 22:05                         ` Joakim Tjernlund
@ 2009-10-06 23:25                           ` Benjamin Herrenschmidt
  2009-10-07  1:07                           ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-06 23:25 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany


> After some more thinking I don't think I do TLB Miss/Error correctly yet.
> The problem is ACCESSED. Since I don't know if load or store in TLB Miss
> I must choose:
>  - Assume load and do what you do above. That will incorrectly
>    set ACCESSED on store ops when mapped as RO(plus whatever more I haven't thought about yet)
> 
>  - Trap to TLB Error and do the above. That will set ACCESSED correctly
>    but won't trap kernel space so these remain what they are.
>    Is anything depending on ACCESSED for kernel pages?
>    if so, what if we set ACCESSED on all kernel pages when mapping them at boot?
>    Will SWAP or some other service(accounting?) not like that?

Kernel pages always have ACCESSED set and never clear it.

I think the only solution here is that if -anything- doesn't look right,
the TLB miss should create one of those "unpopulated" entries and we
need to make sure they are properly invalidated in the subsequent fault
path. I'll read the 8xx doco asap to make sure I get it right.

> Finally, why do you need to include DIRTY when a store OP?
> Do you need to do COW before dirtying the page?
> Seems to work for me to just set DIRTY in TLB Error if RW is set too
> and not trap to C.

You can, but what about a load ? That shouldn't set dirty. So if you set
dirty only on stores, then a load will bring in a page without dirty,
you need to make sure you get another TLB error when writing to it so
you get a chance to set dirty.

Ben.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-06 22:05                         ` Joakim Tjernlund
  2009-10-06 23:25                           ` Benjamin Herrenschmidt
@ 2009-10-07  1:07                           ` Benjamin Herrenschmidt
  2009-10-07  7:47                             ` Joakim Tjernlund
  1 sibling, 1 reply; 49+ messages in thread
From: Benjamin Herrenschmidt @ 2009-10-07  1:07 UTC (permalink / raw)
  To: Joakim Tjernlund; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Allright, did a bit of reading of doco and code..

Doco isn't totally clear though. At some stage, it -hints- that in case
of a TLB "error" (match on EA/ASID but incorrect
protection/valid/changed/...) the offending TLB entry is automatically
invalidated. Do you know if that is correct ?

I would hope so since we never do anything to remove the "invalid"
entries we write to the TLB when hitting non-present PTEs but then, that
may also explain some of our problems...

Now, a few comments from what I read in the code:

 - The whole writeback could be avoided in Instruction/Data TLB miss,
but for that, you need to make sure that the TLB entry we create has
valid set only if -both- present and accessed are set. That would save
in the case of CONFIG_SWAP, a store and a read back of TWC I suppose,
but you do need to find a way to do that ANDing of ACCESSED and PRESENT.

 - I think we can get rid of HWWRITE. We can make CHANGED be the current
HWWRITE value, I agree with you, which matches the HW changed bit. We
need to be a bit careful of how we setup the PP bits tho. At this stage,
I see several approaches:

   * One is to basically "generate" the right PP bits based on a
combination of _PAGE_USER and _PAGE_RW. That's the simpler approach but
probably uses more code in the TLB miss handler. It would look like
that, with MxCTR:PPCS=0

  _PAGE_USER    _PAGE_RW      PP (bits 20..27)
      0            0               011C1111    (C is _PAGE_DIRTY)
      0            1               000C1111
      1            0               110C1111
      1            1               100C1111

One easy way to do that is to have _PAGE_USER and _PAGE_RW sit next to
each other in bit position 28 and 29 (0xc). Load a GPR with something
like 0110000011001000 and rotate it left by PTE & 0xc, then move the
resulting 3 bits into position, or something along those lines. You can
also give up on kernel read-only support and go down to 2 PP bits and
never use the extended encoding.

  * Another one is to use MxCTR:PPCS=1 a mask of 100C1U00 (U is
_PAGE_USER) and or in ^_PAGE_RW (it could actually be made
reverse-polarity in the PTE but that would mean some changes to non-8xx
specific headers, so let's avoid it for now at least).

At least that's the best options I can see from my reading of the doco,
though it's not totally clear to me what really happens when doing the
PPCS trick, will it generate a TLB error on a non-match or will it try
to TLB miss, which could be bad.

  * Last but not least, it wouldn't be hard to use either of the above
encodings, and have the PTE actually contain the right bit combination
already. You don't need to have a _PAGE_RW, you don't need to have a
_PAGE_USER  :-) Look at how I do things for book3e, where I layout the
6 BookE protection bit directly in the PTE. That's a bit harder and maybe
will need subtle changes to pte-common.h and such to accomodate it, and
so probably something to experiment with as a second step, but it's the
most efficient approach in the long run for obvious reasons.

 - I think we could have more bits pre-set to the right values in the
PTE, look how we play with defining some of the constants in
pte-common.h, might be worth having a look, but -after- we have
something that works :-)

 - Possible bug: I'm very disturbed by the fact that DataTLBError sets
HWWRITE and DIRTY on a non-present PTE. It should not. Just like
ACCESSED. That's going to cause trouble and swap corruption, even more
as we move DIRTY around.

 - Maybe we can completely remove that mucking around with dirty,
setting of accessed etc... from DataTLBError. Just make it go to C code
just like InstructionAccess, as we discussed earlier, the generic code
will fix it up and we'll speed up page faults. It should be fairly rare
to take a fault due to a missing _PAGE_ACCESSED or _PAGE_DIRTY in any
case, so all that fixup in those exceptions is just overhead.

 - Later on, if it rocks your boat, you may want to look into removing
the mucking around with swapper_pg_dir in the TLB miss. Instead, what
you can do is lazily copy the kernel PMDs into the user PMDs. IE. On the
first kernel access from a new user context, you fault, and from the
do_page_fault() code, we can detect that and fixup the user PMD to point
to the kernel page tables, thus avoiding that whole bunch of code in the
TLB miss. When creating new user page tables, we can pre-fill the kernel
PMDs too. I think x86 does that. We could do that for BookE too, though
it's less of a win since we have to fixup the TID in MMUCR/MAS but for
8xx it would save a few instructions & conditionals in the TLB miss fast
path.

 - I still have problems with the comment next to the "workaround"
tlbil_va() we have in the icache/dcache flush. It doesn't make much
sense to me since dcbst is going to be done by the kernel on a kernel
address, not a user address, so I don't see how the thing we invalidate
relates to the flush we do just below.... So that raises a few
questions:

   * If this is to avoid write faults on kernel pages, then we may just
want to have a fixup in do_page_fault() to ignore them when coming from
flush_dcache_icache_* using the exception table mechanism instead ?

   * I don't see how it works around user faults but maybe you have a
clear scenario in mind

   * It appears still reasonably obvious that we need that tlbil_va
somewhere in that fault path, but I don't know what for, I'm not sure
it's really what the comment says it's about. That raises back the whole
question of when are those "invalid" TLB entries that we create when
_PAGE_PRESENT is not set are going to be removed. The doc hints that
they go a way on TLB error, but the doc really only says so for missing
changed bit... so I'm tempted to say we need to basically stick a
tlbil_va on 8xx on both set_pte_at() and ptep_set_access_flags() for any
non-kernel page. What do you think ? That would be the safest.

 - The "fake" DataAccess and InstructionAccess are useless... just a
spurious jump for nothing. We could implement those straight in
DataTLBError and InstructionTLBError, we just need to make sure that
the trap numbers we put in the exception frame are 0x300 and 0x400, but
that's just a matter of passing the right bits to EXC_XFER_EE_LITE().

 - Finally, another boat-rocking optimisation to do is preload. You
could hook in update_mmu_cache() for 8xx to go whack an entry in the TLB
as well, which would avoid a TLB miss right after a page fault. But
again, only when things work.

None of the above of course address your DAR-not-updated concern. I
think your approach of "clobbering" the DAR on every storage exception
after it's been snapshotted and then testing for the clobber and
emulating the instruction is the way to go, though your patch could use
some cleaning there.

I'll post comments on it separately.

At the end of the day, I can't help that much without HW access and/or a
simulator, as you can guess, and no, I'm not asking for people to send
me an 8xx board :-) (especially useless without a BDI imho). So I rely
on Scott and yourself to sort this out. But thanks for showing up, and
please, do port your board to 2.6 so you can test your stuff :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU.
  2009-10-07  1:07                           ` Benjamin Herrenschmidt
@ 2009-10-07  7:47                             ` Joakim Tjernlund
  0 siblings, 0 replies; 49+ messages in thread
From: Joakim Tjernlund @ 2009-10-07  7:47 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Scott Wood, linuxppc-dev, Rex Feany

Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 07/10/2009 03:07:35:
>
> Allright, did a bit of reading of doco and code..

hey, this is a super, thanks!

>
> Doco isn't totally clear though. At some stage, it -hints- that in case
> of a TLB "error" (match on EA/ASID but incorrect
> protection/valid/changed/...) the offending TLB entry is automatically
> invalidated. Do you know if that is correct ?

Nope, I don't know.

>
> I would hope so since we never do anything to remove the "invalid"
> entries we write to the TLB when hitting non-present PTEs but then, that
> may also explain some of our problems...

hmm, then a tlbil_va() in do_page_fault for "no translation" errors
would help I guess.

>
> Now, a few comments from what I read in the code:
>
>  - The whole writeback could be avoided in Instruction/Data TLB miss,
> but for that, you need to make sure that the TLB entry we create has
> valid set only if -both- present and accessed are set. That would save
> in the case of CONFIG_SWAP, a store and a read back of TWC I suppose,
> but you do need to find a way to do that ANDing of ACCESSED and PRESENT.

So far I have mapped !ACCESSED pages as No Access instead, maybe that
was wrong? I thought it was easier but I will have a look at this too.

>
>  - I think we can get rid of HWWRITE. We can make CHANGED be the current
> HWWRITE value, I agree with you, which matches the HW changed bit. We
> need to be a bit careful of how we setup the PP bits tho. At this stage,
> I see several approaches:
>
>    * One is to basically "generate" the right PP bits based on a
> combination of _PAGE_USER and _PAGE_RW. That's the simpler approach but
> probably uses more code in the TLB miss handler. It would look like
> that, with MxCTR:PPCS=0
>
>   _PAGE_USER    _PAGE_RW      PP (bits 20..27)
>       0            0               011C1111    (C is _PAGE_DIRTY)
>       0            1               000C1111
>       1            0               110C1111
>       1            1               100C1111
>
> One easy way to do that is to have _PAGE_USER and _PAGE_RW sit next to
> each other in bit position 28 and 29 (0xc). Load a GPR with something
> like 0110000011001000 and rotate it left by PTE & 0xc, then move the
> resulting 3 bits into position, or something along those lines. You can
> also give up on kernel read-only support and go down to 2 PP bits and
> never use the extended encoding.

yes, I do think the extended encoding is too much work and not worth it.
One concern I have is if a user RO mapping also can be a kernel RO
mapping? That is, do kernel require RW to a user page mapped RO?

>
>   * Another one is to use MxCTR:PPCS=1 a mask of 100C1U00 (U is
> _PAGE_USER) and or in ^_PAGE_RW (it could actually be made
> reverse-polarity in the PTE but that would mean some changes to non-8xx
> specific headers, so let's avoid it for now at least).

hmm, interesting.
Then I will also have set ACCESSED manually in TLB Miss I think.

>
> At least that's the best options I can see from my reading of the doco,
> though it's not totally clear to me what really happens when doing the
> PPCS trick, will it generate a TLB error on a non-match or will it try
> to TLB miss, which could be bad.

Yes, I don't think anybody has tested this.

>
>   * Last but not least, it wouldn't be hard to use either of the above
> encodings, and have the PTE actually contain the right bit combination
> already. You don't need to have a _PAGE_RW, you don't need to have a
> _PAGE_USER  :-) Look at how I do things for book3e, where I layout the
> 6 BookE protection bit directly in the PTE. That's a bit harder and maybe
> will need subtle changes to pte-common.h and such to accomodate it, and
> so probably something to experiment with as a second step, but it's the
> most efficient approach in the long run for obvious reasons.
>
>  - I think we could have more bits pre-set to the right values in the
> PTE, look how we play with defining some of the constants in
> pte-common.h, might be worth having a look, but -after- we have
> something that works :-)
>
>  - Possible bug: I'm very disturbed by the fact that DataTLBError sets
> HWWRITE and DIRTY on a non-present PTE. It should not. Just like
> ACCESSED. That's going to cause trouble and swap corruption, even more
> as we move DIRTY around.

Yes, this is what I fix with the first patch in my series:
   8xx: DTLB Error must check for more errors.

    DataTLBError currently does:
     if ((err & 0x02000000) == 0)
        DSI();
    This won't handle a store with no valid translation.
    Change this to
     if ((err & 0x48000000) != 0)
        DSI();
    that is, branch to DSI if either !permission or
    !translation.

>
>  - Maybe we can completely remove that mucking around with dirty,
> setting of accessed etc... from DataTLBError. Just make it go to C code
> just like InstructionAccess, as we discussed earlier, the generic code
> will fix it up and we'll speed up page faults. It should be fairly rare
> to take a fault due to a missing _PAGE_ACCESSED or _PAGE_DIRTY in any
> case, so all that fixup in those exceptions is just overhead.

That depends on if you set ACCESSED in TLB Miss or not I think.
if TLB Miss maps pages as NoAccess du to missing ACCESSED then the
first access is always going to trap to TLB Error and then to
C just to fix DIRTY and/or ACCESSED, feels a bit more expensive
fixing it in TLB error, no?

BTW, does 2.4 update ACCESSED and DIRTY in generic code the same way
as 2.6? It is still much easier for me to test thing in 2.4 and then
move it over to 2.6

>
>  - Later on, if it rocks your boat, you may want to look into removing
> the mucking around with swapper_pg_dir in the TLB miss. Instead, what
> you can do is lazily copy the kernel PMDs into the user PMDs. IE. On the
> first kernel access from a new user context, you fault, and from the
> do_page_fault() code, we can detect that and fixup the user PMD to point
> to the kernel page tables, thus avoiding that whole bunch of code in the
> TLB miss. When creating new user page tables, we can pre-fill the kernel
> PMDs too. I think x86 does that. We could do that for BookE too, though
> it's less of a win since we have to fixup the TID in MMUCR/MAS but for
> 8xx it would save a few instructions & conditionals in the TLB miss fast
> path.

That would be nice, I have had that idea but no clue as how to do it.

>
>  - I still have problems with the comment next to the "workaround"
> tlbil_va() we have in the icache/dcache flush. It doesn't make much
> sense to me since dcbst is going to be done by the kernel on a kernel
> address, not a user address, so I don't see how the thing we invalidate
> relates to the flush we do just below.... So that raises a few
> questions:

This workaround was added when we started to force TLB Errors in
TLB Miss for missing pmd entries so it may be some other bug in the
TLB code that makes this happen.
Here is a link:
 http://www.mail-archive.com/linuxppc-embedded@ozlabs.org/msg07885.html
Seems to boil down to a missing tlbie()

>
>    * If this is to avoid write faults on kernel pages, then we may just
> want to have a fixup in do_page_fault() to ignore them when coming from
> flush_dcache_icache_* using the exception table mechanism instead ?
>
>    * I don't see how it works around user faults but maybe you have a
> clear scenario in mind
>
>    * It appears still reasonably obvious that we need that tlbil_va
> somewhere in that fault path, but I don't know what for, I'm not sure
> it's really what the comment says it's about. That raises back the whole
> question of when are those "invalid" TLB entries that we create when
> _PAGE_PRESENT is not set are going to be removed. The doc hints that
> they go a way on TLB error, but the doc really only says so for missing
> changed bit... so I'm tempted to say we need to basically stick a
> tlbil_va on 8xx on both set_pte_at() and ptep_set_access_flags() for any
> non-kernel page. What do you think ? That would be the safest.

Possibly just doing a tlbil_va in do_page_fault() for no translation
errors comes to mind.

>
>  - The "fake" DataAccess and InstructionAccess are useless... just a
> spurious jump for nothing. We could implement those straight in
> DataTLBError and InstructionTLBError, we just need to make sure that
> the trap numbers we put in the exception frame are 0x300 and 0x400, but
> that's just a matter of passing the right bits to EXC_XFER_EE_LITE().
>
>  - Finally, another boat-rocking optimisation to do is preload. You
> could hook in update_mmu_cache() for 8xx to go whack an entry in the TLB
> as well, which would avoid a TLB miss right after a page fault. But
> again, only when things work.
>
> None of the above of course address your DAR-not-updated concern. I
> think your approach of "clobbering" the DAR on every storage exception
> after it's been snapshotted and then testing for the clobber and
> emulating the instruction is the way to go, though your patch could use
> some cleaning there.

Yes, it is full of debug cruft now.

>
> I'll post comments on it separately.

Thanks, I still like the asm version for two reasons:
- It contains the problem to 8xx files
- fixing it up in do_page_fault() may lead to a never ending loop.
  This is mainly due to the current TLB code not using DIRTY correctly.

In any case it would be nice to keep the asm version in the file, should
we need it some day. It was rather painful to get it working so saving it
somewhere would be nice.

>
> At the end of the day, I can't help that much without HW access and/or a
> simulator, as you can guess, and no, I'm not asking for people to send
> me an 8xx board :-) (especially useless without a BDI imho). So I rely
> on Scott and yourself to sort this out. But thanks for showing up, and
> please, do port your board to 2.6 so you can test your stuff :-)

 :)

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2009-10-07  7:49 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-05 12:16 [PATCH 0/6] PowerPc 8xx TLB/MMU fixes Joakim Tjernlund
2009-10-05 12:16 ` [PATCH 1/6] 8xx: DTLB Error must check for more errors Joakim Tjernlund
2009-10-05 12:16   ` [PATCH 2/6] 8xx, fault: Add some debug code to do_page_fault() Joakim Tjernlund
2009-10-05 12:16     ` [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU Joakim Tjernlund
2009-10-05 12:16       ` [PATCH 4/6] 8xx: Tag DAR with 0x00f0 to catch buggy instructions Joakim Tjernlund
2009-10-05 12:16         ` [PATCH 5/6] 8xx: Fixup DAR from buggy dcbX instructions Joakim Tjernlund
2009-10-05 12:16           ` [PATCH 6/6] 8xx: start using dcbX instructions in various copy routines Joakim Tjernlund
2009-10-05 20:17       ` [PATCH 3/6] 8xx: get rid of _PAGE_HWWRITE dependency in MMU Benjamin Herrenschmidt
2009-10-05 21:25         ` Joakim Tjernlund
2009-10-05 21:37           ` Benjamin Herrenschmidt
2009-10-05 22:00             ` Joakim Tjernlund
2009-10-05 22:09               ` Benjamin Herrenschmidt
2009-10-05 22:55                 ` Joakim Tjernlund
2009-10-05 23:15                   ` Benjamin Herrenschmidt
2009-10-05 23:35                     ` Joakim Tjernlund
2009-10-06  0:34                       ` Benjamin Herrenschmidt
2009-10-06  6:15                         ` Joakim Tjernlund
2009-10-06  6:45                           ` Benjamin Herrenschmidt
2009-10-06  7:54                             ` Joakim Tjernlund
2009-10-06 15:40                             ` Joakim Tjernlund
2009-10-06 17:28                               ` Joakim Tjernlund
2009-10-06 22:05                         ` Joakim Tjernlund
2009-10-06 23:25                           ` Benjamin Herrenschmidt
2009-10-07  1:07                           ` Benjamin Herrenschmidt
2009-10-07  7:47                             ` Joakim Tjernlund
2009-10-05 18:12 ` [PATCH 0/6] PowerPc 8xx TLB/MMU fixes Scott Wood
2009-10-05 18:27   ` Joakim Tjernlund
2009-10-05 20:09     ` Scott Wood
2009-10-05 21:04       ` Joakim Tjernlund
2009-10-05 21:31         ` Benjamin Herrenschmidt
2009-10-05 21:41           ` Joakim Tjernlund
2009-10-05 21:46             ` Scott Wood
2009-10-05 21:31         ` Scott Wood
2009-10-05 22:04 ` Rex Feany
2009-10-05 22:31   ` Joakim Tjernlund
2009-10-05 22:37     ` Benjamin Herrenschmidt
2009-10-05 22:58       ` Joakim Tjernlund
2009-10-05 23:49       ` Joakim Tjernlund
2009-10-06  1:52         ` Benjamin Herrenschmidt
2009-10-06  8:06           ` Joakim Tjernlund
2009-10-06  8:32             ` Benjamin Herrenschmidt
2009-10-06 10:58               ` Joakim Tjernlund
2009-10-06 11:06                 ` Benjamin Herrenschmidt
2009-10-06 11:39                   ` Joakim Tjernlund
2009-10-06 13:18                   ` Joakim Tjernlund
2009-10-05 22:42     ` Rex Feany
2009-10-05 23:00       ` Joakim Tjernlund
2009-10-06  6:25       ` Joakim Tjernlund
2009-10-06  6:44         ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).