linuxppc-dev.lists.ozlabs.org archive mirror
* [RFC PATCH v2 0/4] try secondary hash before BUG in kernel_map_linear_page()
@ 2013-04-12  2:16 Li Zhong
  2013-04-12  2:16 ` [RFC PATCH v2 1/4] powerpc: Move the setting of rflags out of loop in __hash_page_huge Li Zhong
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Li Zhong @ 2013-04-12  2:16 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: paulus, Li Zhong

Hi Michael, 

Here is the updated version; could you please help review it again?

As you suggested, this version doesn't copy the code, but splits
the logic into a helper function, so both kernel_map_linear_page() and
__hash_page_huge() can use it. 

Also, patch #1 moves some unnecessary code out of the repeating loop, so the 
split is easier. Patch #3 removes the HPTE_V_BOLTED flag in 
kernel_map_linear_page(); it seems not to be needed, based on my understanding. 

The changes are split into smaller ones, so each one does only one thing. 

Thanks, Zhong

Li Zhong (4):
  powerpc: Move the setting of rflags out of loop in __hash_page_huge
  powerpc: Split the code trying to insert hpte repeatedly as a helper
    function
  powerpc: Don't bolt the hpte in kernel_map_linear_page()
  powerpc: Try to insert the hptes repeatedly in
    kernel_map_linear_page()

 arch/powerpc/mm/hash_utils_64.c      |   45 +++++++++++++++++++++++++++++++---
 arch/powerpc/mm/hugetlbpage-hash64.c |   31 +++++------------------
 2 files changed, 47 insertions(+), 29 deletions(-)

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH v2 1/4] powerpc: Move the setting of rflags out of loop in __hash_page_huge
  2013-04-12  2:16 [RFC PATCH v2 0/4] try secondary hash before BUG in kernel_map_linear_page() Li Zhong
@ 2013-04-12  2:16 ` Li Zhong
  2013-04-15  6:32   ` Michael Ellerman
  2013-04-12  2:16 ` [RFC PATCH v2 2/4] powerpc: Split the code trying to insert hpte repeatedly as a helper function Li Zhong
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Li Zhong @ 2013-04-12  2:16 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: paulus, Li Zhong

It seems that rflags doesn't get changed in the repeating loop, so move
its setting out of the loop.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hugetlbpage-hash64.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index cecad34..edb4129 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -87,10 +87,6 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 
 		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
 
-repeat:
-		hpte_group = ((hash & htab_hash_mask) *
-			      HPTES_PER_GROUP) & ~0x7UL;
-
 		/* clear HPTE slot informations in new PTE */
 #ifdef CONFIG_PPC_64K_PAGES
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HPTE_SUB0;
@@ -101,6 +97,10 @@ repeat:
 		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
 				      _PAGE_COHERENT | _PAGE_GUARDED));
 
+repeat:
+		hpte_group = ((hash & htab_hash_mask) *
+			      HPTES_PER_GROUP) & ~0x7UL;
+
 		/* Insert into the hash table, primary slot */
 		slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
 					  mmu_psize, ssize);
-- 
1.7.9.5


* [RFC PATCH v2 2/4] powerpc: Split the code trying to insert hpte repeatedly as a helper function
  2013-04-12  2:16 [RFC PATCH v2 0/4] try secondary hash before BUG in kernel_map_linear_page() Li Zhong
  2013-04-12  2:16 ` [RFC PATCH v2 1/4] powerpc: Move the setting of rflags out of loop in __hash_page_huge Li Zhong
@ 2013-04-12  2:16 ` Li Zhong
  2013-04-12  2:16 ` [RFC PATCH v2 3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page() Li Zhong
  2013-04-12  2:17 ` [RFC PATCH v2 4/4] powerpc: Try to insert the hptes repeatedly " Li Zhong
  3 siblings, 0 replies; 14+ messages in thread
From: Li Zhong @ 2013-04-12  2:16 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: paulus, Li Zhong

Move the logic that tries to insert an hpte repeatedly in __hash_page_huge()
into a helper function, so it can also be used by others.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash_utils_64.c      |   35 ++++++++++++++++++++++++++++++++++
 arch/powerpc/mm/hugetlbpage-hash64.c |   31 ++++++------------------------
 2 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index f410c3e..716f42b 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1230,6 +1230,41 @@ void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
 		bad_page_fault(regs, address, SIGBUS);
 }
 
+long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
+			   unsigned long pa, unsigned long rflags,
+			   int psize, int ssize)
+{
+	unsigned long hpte_group;
+	long slot;
+
+repeat:
+	hpte_group = ((hash & htab_hash_mask) *
+		       HPTES_PER_GROUP) & ~0x7UL;
+
+	/* Insert into the hash table, primary slot */
+	slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
+				  psize, ssize);
+
+	/* Primary is full, try the secondary */
+	if (unlikely(slot == -1)) {
+		hpte_group = ((~hash & htab_hash_mask) *
+			      HPTES_PER_GROUP) & ~0x7UL;
+		slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags,
+					  HPTE_V_SECONDARY,
+					  psize, ssize);
+		if (slot == -1) {
+			if (mftb() & 0x1)
+				hpte_group = ((hash & htab_hash_mask) *
+					      HPTES_PER_GROUP)&~0x7UL;
+
+			ppc_md.hpte_remove(hpte_group);
+			goto repeat;
+		}
+	}
+
+	return slot;
+}
+
 #ifdef CONFIG_DEBUG_PAGEALLOC
 static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
 {
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index edb4129..bd7d38b 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -14,6 +14,10 @@
 #include <asm/cacheflush.h>
 #include <asm/machdep.h>
 
+extern long hpte_insert_repeating(unsigned long hash, unsigned long vpn,
 +			   unsigned long pa, unsigned long rflags,
+			   int psize, int ssize);
+
 int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		     pte_t *ptep, unsigned long trap, int local, int ssize,
 		     unsigned int shift, unsigned int mmu_psize)
@@ -83,7 +87,6 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 
 	if (likely(!(old_pte & _PAGE_HASHPTE))) {
 		unsigned long hash = hpt_hash(vpn, shift, ssize);
-		unsigned long hpte_group;
 
 		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
 
@@ -97,30 +100,8 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
 				      _PAGE_COHERENT | _PAGE_GUARDED));
 
-repeat:
-		hpte_group = ((hash & htab_hash_mask) *
-			      HPTES_PER_GROUP) & ~0x7UL;
-
-		/* Insert into the hash table, primary slot */
-		slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
-					  mmu_psize, ssize);
-
-		/* Primary is full, try the secondary */
-		if (unlikely(slot == -1)) {
-			hpte_group = ((~hash & htab_hash_mask) *
-				      HPTES_PER_GROUP) & ~0x7UL;
-			slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags,
-						  HPTE_V_SECONDARY,
-						  mmu_psize, ssize);
-			if (slot == -1) {
-				if (mftb() & 0x1)
-					hpte_group = ((hash & htab_hash_mask) *
-						      HPTES_PER_GROUP)&~0x7UL;
-
-				ppc_md.hpte_remove(hpte_group);
-				goto repeat;
-                        }
-		}
+		slot = hpte_insert_repeating(hash, vpn, pa,
+					     rflags, mmu_psize, ssize);
 
 		/*
 		 * Hypervisor failure. Restore old pte and return -1
-- 
1.7.9.5


* [RFC PATCH v2 3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page()
  2013-04-12  2:16 [RFC PATCH v2 0/4] try secondary hash before BUG in kernel_map_linear_page() Li Zhong
  2013-04-12  2:16 ` [RFC PATCH v2 1/4] powerpc: Move the setting of rflags out of loop in __hash_page_huge Li Zhong
  2013-04-12  2:16 ` [RFC PATCH v2 2/4] powerpc: Split the code trying to insert hpte repeatedly as a helper function Li Zhong
@ 2013-04-12  2:16 ` Li Zhong
  2013-04-15  3:50   ` Paul Mackerras
  2013-04-12  2:17 ` [RFC PATCH v2 4/4] powerpc: Try to insert the hptes repeatedly " Li Zhong
  3 siblings, 1 reply; 14+ messages in thread
From: Li Zhong @ 2013-04-12  2:16 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: paulus, Li Zhong

It seems that kernel_unmap_linear_page() only checks whether there is a
mapping recorded in the linear_map_hash_slots array, so we don't seem to
need to bolt the hpte.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash_utils_64.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 716f42b..a7f54f0 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1281,7 +1281,7 @@ static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
 	if (!vsid)
 		return;
 	ret = ppc_md.hpte_insert(hpteg, vpn, __pa(vaddr),
-				 mode, HPTE_V_BOLTED,
+				 mode, 0,
 				 mmu_linear_psize, mmu_kernel_ssize);
 	BUG_ON (ret < 0);
 	spin_lock(&linear_map_hash_lock);
-- 
1.7.9.5


* [RFC PATCH v2 4/4] powerpc: Try to insert the hptes repeatedly in kernel_map_linear_page()
  2013-04-12  2:16 [RFC PATCH v2 0/4] try secondary hash before BUG in kernel_map_linear_page() Li Zhong
                   ` (2 preceding siblings ...)
  2013-04-12  2:16 ` [RFC PATCH v2 3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page() Li Zhong
@ 2013-04-12  2:17 ` Li Zhong
  2013-04-15  6:36   ` Michael Ellerman
  3 siblings, 1 reply; 14+ messages in thread
From: Li Zhong @ 2013-04-12  2:17 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: paulus, Li Zhong

This patch tries to fix the following issue, seen when CONFIG_DEBUG_PAGEALLOC
is enabled:

[  543.075675] ------------[ cut here ]------------
[  543.075701] kernel BUG at arch/powerpc/mm/hash_utils_64.c:1239!
[  543.075714] Oops: Exception in kernel mode, sig: 5 [#1]
[  543.075722] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
[  543.075741] Modules linked in: binfmt_misc ehea
[  543.075759] NIP: c000000000036eb0 LR: c000000000036ea4 CTR: c00000000005a594
[  543.075771] REGS: c0000000a90832c0 TRAP: 0700   Not tainted  (3.8.0-next-20130222)
[  543.075781] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 22224482  XER: 00000000
[  543.075816] SOFTE: 0
[  543.075823] CFAR: c00000000004c200
[  543.075830] TASK = c0000000e506b750[23934] 'cc1' THREAD: c0000000a9080000 CPU: 1
GPR00: 0000000000000001 c0000000a9083540 c000000000c600a8 ffffffffffffffff
GPR04: 0000000000000050 fffffffffffffffa c0000000a90834e0 00000000004ff594
GPR08: 0000000000000001 0000000000000000 000000009592d4d8 c000000000c86854
GPR12: 0000000000000002 c000000006ead300 0000000000a51000 0000000000000001
GPR16: f000000003354380 ffffffffffffffff ffffffffffffff80 0000000000000000
GPR20: 0000000000000001 c000000000c600a8 0000000000000001 0000000000000001
GPR24: 0000000003354380 c000000000000000 0000000000000000 c000000000b65950
GPR28: 0000002000000000 00000000000cd50e 0000000000bf50d9 c000000000c7c230
[  543.076005] NIP [c000000000036eb0] .kernel_map_pages+0x1e0/0x3f8
[  543.076016] LR [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8
[  543.076025] Call Trace:
[  543.076033] [c0000000a9083540] [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8 (unreliable)
[  543.076053] [c0000000a9083640] [c000000000167638] .get_page_from_freelist+0x6cc/0x8dc
[  543.076067] [c0000000a9083800] [c000000000167a48] .__alloc_pages_nodemask+0x200/0x96c
[  543.076082] [c0000000a90839c0] [c0000000001ade44] .alloc_pages_vma+0x160/0x1e4
[  543.076098] [c0000000a9083a80] [c00000000018ce04] .handle_pte_fault+0x1b0/0x7e8
[  543.076113] [c0000000a9083b50] [c00000000018d5a8] .handle_mm_fault+0x16c/0x1a0
[  543.076129] [c0000000a9083c00] [c0000000007bf1dc] .do_page_fault+0x4d0/0x7a4
[  543.076144] [c0000000a9083e30] [c0000000000090e8] handle_page_fault+0x10/0x30
[  543.076155] Instruction dump:
[  543.076163] 7c630038 78631d88 e80a0000 f8410028 7c0903a6 e91f01de e96a0010 e84a0008
[  543.076192] 4e800421 e8410028 7c7107b4 7a200fe0 <0b000000> 7f63db78 48785781 60000000
[  543.076224] ---[ end trace bd5807e8d6ae186b ]---

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash_utils_64.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index a7f54f0..4b449a0 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1272,7 +1272,7 @@ static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
 	unsigned long vsid = get_kernel_vsid(vaddr, mmu_kernel_ssize);
 	unsigned long vpn = hpt_vpn(vaddr, vsid, mmu_kernel_ssize);
 	unsigned long mode = htab_convert_pte_flags(PAGE_KERNEL);
-	int ret;
+	long ret;
 
 	hash = hpt_hash(vpn, PAGE_SHIFT, mmu_kernel_ssize);
 	hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
@@ -1280,9 +1280,11 @@ static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
 	/* Don't create HPTE entries for bad address */
 	if (!vsid)
 		return;
-	ret = ppc_md.hpte_insert(hpteg, vpn, __pa(vaddr),
-				 mode, 0,
-				 mmu_linear_psize, mmu_kernel_ssize);
+
+	ret = hpte_insert_repeating(hash, vpn, __pa(vaddr),
+				    mode, mmu_linear_psize,
+				    mmu_kernel_ssize);
+
 	BUG_ON (ret < 0);
 	spin_lock(&linear_map_hash_lock);
 	BUG_ON(linear_map_hash_slots[lmi] & 0x80);
-- 
1.7.9.5


* Re: [RFC PATCH v2 3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page()
  2013-04-12  2:16 ` [RFC PATCH v2 3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page() Li Zhong
@ 2013-04-15  3:50   ` Paul Mackerras
  2013-04-15  6:56     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Paul Mackerras @ 2013-04-15  3:50 UTC (permalink / raw)
  To: Li Zhong; +Cc: linuxppc-dev

On Fri, Apr 12, 2013 at 10:16:59AM +0800, Li Zhong wrote:
> It seems that in kernel_unmap_linear_page(), it only checks whether there
> is a map in the linear_map_hash_slots array, so seems we don't need bolt
> the hpte.

I don't exactly understand your rationale here, but I don't think it's
safe not to have linear mapping pages bolted.  Basically, if a page
will be used in the process of calling hash_page to demand-fault an
HPTE into the hash table, then that page needs to be bolted, otherwise
we can get an infinite recursion of HPT misses.  That includes all
kernel stack pages, among other things, so I think we need to leave
the HPTE_V_BOLTED in there.

Paul.


* Re: [RFC PATCH v2 1/4] powerpc: Move the setting of rflags out of loop in __hash_page_huge
  2013-04-12  2:16 ` [RFC PATCH v2 1/4] powerpc: Move the setting of rflags out of loop in __hash_page_huge Li Zhong
@ 2013-04-15  6:32   ` Michael Ellerman
  2013-04-15  8:17     ` Li Zhong
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Ellerman @ 2013-04-15  6:32 UTC (permalink / raw)
  To: Li Zhong; +Cc: linuxppc-dev, paulus

On Fri, Apr 12, 2013 at 10:16:57AM +0800, Li Zhong wrote:
> It seems that rflags don't get changed in  the repeating loop, so move
> it out of the loop.

You've also changed the way new_pte is handled on repeat, but I think
that's OK too.

cheers

> diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
> index cecad34..edb4129 100644
> --- a/arch/powerpc/mm/hugetlbpage-hash64.c
> +++ b/arch/powerpc/mm/hugetlbpage-hash64.c
> @@ -87,10 +87,6 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
>  
>  		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
>  
> -repeat:
> -		hpte_group = ((hash & htab_hash_mask) *
> -			      HPTES_PER_GROUP) & ~0x7UL;
> -
>  		/* clear HPTE slot informations in new PTE */
>  #ifdef CONFIG_PPC_64K_PAGES
>  		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HPTE_SUB0;

ie. here new_pte was updated on repeat, but now it's not.

> @@ -101,6 +97,10 @@ repeat:
>  		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
>  				      _PAGE_COHERENT | _PAGE_GUARDED));
>  
> +repeat:
> +		hpte_group = ((hash & htab_hash_mask) *
> +			      HPTES_PER_GROUP) & ~0x7UL;
> +
>  		/* Insert into the hash table, primary slot */
>  		slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
>  					  mmu_psize, ssize);


* Re: [RFC PATCH v2 4/4] powerpc: Try to insert the hptes repeatedly in kernel_map_linear_page()
  2013-04-12  2:17 ` [RFC PATCH v2 4/4] powerpc: Try to insert the hptes repeatedly " Li Zhong
@ 2013-04-15  6:36   ` Michael Ellerman
  2013-04-15  8:21     ` Li Zhong
  0 siblings, 1 reply; 14+ messages in thread
From: Michael Ellerman @ 2013-04-15  6:36 UTC (permalink / raw)
  To: Li Zhong; +Cc: linuxppc-dev, paulus

On Fri, Apr 12, 2013 at 10:17:00AM +0800, Li Zhong wrote:
> This patch tries to fix following issue when CONFIG_DEBUG_PAGEALLOC
> is enabled:

Please include a better changelog.

This patch does fix (I hope?) the following oops, caused by xxx. Reproducible
by doing yyy.

cheers

> 
> [  543.075675] ------------[ cut here ]------------
> [  543.075701] kernel BUG at arch/powerpc/mm/hash_utils_64.c:1239!
> [  543.075714] Oops: Exception in kernel mode, sig: 5 [#1]
> [  543.075722] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
> [  543.075741] Modules linked in: binfmt_misc ehea
> [  543.075759] NIP: c000000000036eb0 LR: c000000000036ea4 CTR: c00000000005a594
> [  543.075771] REGS: c0000000a90832c0 TRAP: 0700   Not tainted  (3.8.0-next-20130222)
> [  543.075781] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 22224482  XER: 00000000
> [  543.075816] SOFTE: 0
> [  543.075823] CFAR: c00000000004c200
> [  543.075830] TASK = c0000000e506b750[23934] 'cc1' THREAD: c0000000a9080000 CPU: 1
> GPR00: 0000000000000001 c0000000a9083540 c000000000c600a8 ffffffffffffffff
> GPR04: 0000000000000050 fffffffffffffffa c0000000a90834e0 00000000004ff594
> GPR08: 0000000000000001 0000000000000000 000000009592d4d8 c000000000c86854
> GPR12: 0000000000000002 c000000006ead300 0000000000a51000 0000000000000001
> GPR16: f000000003354380 ffffffffffffffff ffffffffffffff80 0000000000000000
> GPR20: 0000000000000001 c000000000c600a8 0000000000000001 0000000000000001
> GPR24: 0000000003354380 c000000000000000 0000000000000000 c000000000b65950
> GPR28: 0000002000000000 00000000000cd50e 0000000000bf50d9 c000000000c7c230
> [  543.076005] NIP [c000000000036eb0] .kernel_map_pages+0x1e0/0x3f8
> [  543.076016] LR [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8
> [  543.076025] Call Trace:
> [  543.076033] [c0000000a9083540] [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8 (unreliable)
> [  543.076053] [c0000000a9083640] [c000000000167638] .get_page_from_freelist+0x6cc/0x8dc
> [  543.076067] [c0000000a9083800] [c000000000167a48] .__alloc_pages_nodemask+0x200/0x96c
> [  543.076082] [c0000000a90839c0] [c0000000001ade44] .alloc_pages_vma+0x160/0x1e4
> [  543.076098] [c0000000a9083a80] [c00000000018ce04] .handle_pte_fault+0x1b0/0x7e8
> [  543.076113] [c0000000a9083b50] [c00000000018d5a8] .handle_mm_fault+0x16c/0x1a0
> [  543.076129] [c0000000a9083c00] [c0000000007bf1dc] .do_page_fault+0x4d0/0x7a4
> [  543.076144] [c0000000a9083e30] [c0000000000090e8] handle_page_fault+0x10/0x30
> [  543.076155] Instruction dump:
> [  543.076163] 7c630038 78631d88 e80a0000 f8410028 7c0903a6 e91f01de e96a0010 e84a0008
> [  543.076192] 4e800421 e8410028 7c7107b4 7a200fe0 <0b000000> 7f63db78 48785781 60000000
> [  543.076224] ---[ end trace bd5807e8d6ae186b ]---
> 
> Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hash_utils_64.c |   10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index a7f54f0..4b449a0 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -1272,7 +1272,7 @@ static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
>  	unsigned long vsid = get_kernel_vsid(vaddr, mmu_kernel_ssize);
>  	unsigned long vpn = hpt_vpn(vaddr, vsid, mmu_kernel_ssize);
>  	unsigned long mode = htab_convert_pte_flags(PAGE_KERNEL);
> -	int ret;
> +	long ret;
>  
>  	hash = hpt_hash(vpn, PAGE_SHIFT, mmu_kernel_ssize);
>  	hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
> @@ -1280,9 +1280,11 @@ static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
>  	/* Don't create HPTE entries for bad address */
>  	if (!vsid)
>  		return;
> -	ret = ppc_md.hpte_insert(hpteg, vpn, __pa(vaddr),
> -				 mode, 0,
> -				 mmu_linear_psize, mmu_kernel_ssize);
> +
> +	ret = hpte_insert_repeating(hash, vpn, __pa(vaddr),
> +				    mode, mmu_linear_psize,
> +				    mmu_kernel_ssize);
> +
>  	BUG_ON (ret < 0);
>  	spin_lock(&linear_map_hash_lock);
>  	BUG_ON(linear_map_hash_slots[lmi] & 0x80);
> -- 
> 1.7.9.5
> 


* Re: [RFC PATCH v2 3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page()
  2013-04-15  3:50   ` Paul Mackerras
@ 2013-04-15  6:56     ` Benjamin Herrenschmidt
  2013-04-15  8:15       ` Li Zhong
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2013-04-15  6:56 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev, Li Zhong

On Mon, 2013-04-15 at 13:50 +1000, Paul Mackerras wrote:
> On Fri, Apr 12, 2013 at 10:16:59AM +0800, Li Zhong wrote:
> > It seems that in kernel_unmap_linear_page(), it only checks whether there
> > is a map in the linear_map_hash_slots array, so seems we don't need bolt
> > the hpte.
> 
> I don't exactly understand your rationale here, but I don't think it's
> safe not to have linear mapping pages bolted.  Basically, if a page
> will be used in the process of calling hash_page to demand-fault an
> HPTE into the hash table, then that page needs to be bolted, otherwise
> we can get an infinite recursion of HPT misses.  That includes all
> kernel stack pages, among other things, so I think we need to leave
> the HPTE_V_BOLTED in there.

I suspect Li's confusion comes from the fact that he doesn't realize
that we might evict random hash slots. If the linear mapping hash
entries could only be thrown out via kernel_unmap_linear_page(), then his
comment would make sense. However, this isn't the case.

Li: When faulting something in, if both the primary and secondary
buckets are full, we "somewhat randomly" evict the content of a slot and
replace it. However we only do that on non-bolted slots.

This is why the linear mapping (and the vmemmap) must be bolted.

Cheers,
Ben.


* Re: [RFC PATCH v2 3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page()
  2013-04-15  6:56     ` Benjamin Herrenschmidt
@ 2013-04-15  8:15       ` Li Zhong
  2013-04-15 11:27         ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 14+ messages in thread
From: Li Zhong @ 2013-04-15  8:15 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Paul Mackerras

On Mon, 2013-04-15 at 08:56 +0200, Benjamin Herrenschmidt wrote:
> On Mon, 2013-04-15 at 13:50 +1000, Paul Mackerras wrote:
> > On Fri, Apr 12, 2013 at 10:16:59AM +0800, Li Zhong wrote:
> > > It seems that in kernel_unmap_linear_page(), it only checks whether there
> > > is a map in the linear_map_hash_slots array, so seems we don't need bolt
> > > the hpte.
> > 

Hi Paul, Ben

Thank you both for the comments and detailed information. I'll keep it
bolted in the next version. If you have time, please help to check
whether my understanding below is correct.

Thanks, Zhong

> > I don't exactly understand your rationale here, but I don't think it's
> > safe not to have linear mapping pages bolted.  Basically, if a page
> > will be used in the process of calling hash_page to demand-fault an
> > HPTE into the hash table, then that page needs to be bolted, otherwise
> > we can get an infinite recursion of HPT misses.  

So the infinite recursion happens like below?

        fault for PAGE A
        
        hash_page for PAGE A 
        
        some page B needed by hash_page's processing is removed by others
        before the HPTE is inserted
        
        fault for PAGE B 
        
        hash_page for PAGE B, and the recursion goes on forever
        

> That includes all
> > kernel stack pages, among other things, so I think we need to leave
> > the HPTE_V_BOLTED in there.
> 
> I suspect Li's confusion comes from the fact that he doesn't realizes
> that we might evict random hash slots. If the linear mapping hash
> entries could only be thrown out via kernel_unmap_linear_page() then his
> comment would make sense. However this isn't the case.
> 
> Li: When faulting something in, if both the primary and secondary
> buckets are full, we "somewhat randomly" evict the content of a slot and
> replace it. However we only do that on non-bolted slots.

So this code is implemented in ppc_md.hpte_remove(), which may be called by
__hash_page_huge() and by the asm code via htab_call_hpte_remove?

> This is why the linear mapping (and the vmemmap) must be bolted.

If not, it would result in the infinite recursion like the one above?

> Cheers,
> Ben.
> 
> 


* Re: [RFC PATCH v2 1/4] powerpc: Move the setting of rflags out of loop in __hash_page_huge
  2013-04-15  6:32   ` Michael Ellerman
@ 2013-04-15  8:17     ` Li Zhong
  0 siblings, 0 replies; 14+ messages in thread
From: Li Zhong @ 2013-04-15  8:17 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, paulus

On Mon, 2013-04-15 at 16:32 +1000, Michael Ellerman wrote:
> On Fri, Apr 12, 2013 at 10:16:57AM +0800, Li Zhong wrote:
> > It seems that rflags don't get changed in  the repeating loop, so move
> > it out of the loop.

> You've also changed the way new_pte is handled on repeat, but I think
> that's OK too.

OK, I'll add it in the description :)

Thanks, Zhong

> cheers
> 
> > diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
> > index cecad34..edb4129 100644
> > --- a/arch/powerpc/mm/hugetlbpage-hash64.c
> > +++ b/arch/powerpc/mm/hugetlbpage-hash64.c
> > @@ -87,10 +87,6 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
> >  
> >  		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
> >  
> > -repeat:
> > -		hpte_group = ((hash & htab_hash_mask) *
> > -			      HPTES_PER_GROUP) & ~0x7UL;
> > -
> >  		/* clear HPTE slot informations in new PTE */
> >  #ifdef CONFIG_PPC_64K_PAGES
> >  		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HPTE_SUB0;
> 
> ie. here new_pte was updated on repeat, but now it's not.
> 
> > @@ -101,6 +97,10 @@ repeat:
> >  		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
> >  				      _PAGE_COHERENT | _PAGE_GUARDED));
> >  
> > +repeat:
> > +		hpte_group = ((hash & htab_hash_mask) *
> > +			      HPTES_PER_GROUP) & ~0x7UL;
> > +
> >  		/* Insert into the hash table, primary slot */
> >  		slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
> >  					  mmu_psize, ssize);
> 


* Re: [RFC PATCH v2 4/4] powerpc: Try to insert the hptes repeatedly in kernel_map_linear_page()
  2013-04-15  6:36   ` Michael Ellerman
@ 2013-04-15  8:21     ` Li Zhong
  0 siblings, 0 replies; 14+ messages in thread
From: Li Zhong @ 2013-04-15  8:21 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, paulus

On Mon, 2013-04-15 at 16:36 +1000, Michael Ellerman wrote:
> On Fri, Apr 12, 2013 at 10:17:00AM +0800, Li Zhong wrote:
> > This patch tries to fix following issue when CONFIG_DEBUG_PAGEALLOC
> > is enabled:
> 
> Please include a better changelog.

OK, I'll use the following as a template, thank you for the suggestion. 

> 
> This patch does fix (I hope?) the following oops, caused by xxx. Reproducible
> by doing yyy.
> 
> cheers
> 
> > 
> > [  543.075675] ------------[ cut here ]------------
> > [  543.075701] kernel BUG at arch/powerpc/mm/hash_utils_64.c:1239!
> > [  543.075714] Oops: Exception in kernel mode, sig: 5 [#1]
> > [  543.075722] PREEMPT SMP NR_CPUS=16 DEBUG_PAGEALLOC NUMA pSeries
> > [  543.075741] Modules linked in: binfmt_misc ehea
> > [  543.075759] NIP: c000000000036eb0 LR: c000000000036ea4 CTR: c00000000005a594
> > [  543.075771] REGS: c0000000a90832c0 TRAP: 0700   Not tainted  (3.8.0-next-20130222)
> > [  543.075781] MSR: 8000000000029032 <SF,EE,ME,IR,DR,RI>  CR: 22224482  XER: 00000000
> > [  543.075816] SOFTE: 0
> > [  543.075823] CFAR: c00000000004c200
> > [  543.075830] TASK = c0000000e506b750[23934] 'cc1' THREAD: c0000000a9080000 CPU: 1
> > GPR00: 0000000000000001 c0000000a9083540 c000000000c600a8 ffffffffffffffff
> > GPR04: 0000000000000050 fffffffffffffffa c0000000a90834e0 00000000004ff594
> > GPR08: 0000000000000001 0000000000000000 000000009592d4d8 c000000000c86854
> > GPR12: 0000000000000002 c000000006ead300 0000000000a51000 0000000000000001
> > GPR16: f000000003354380 ffffffffffffffff ffffffffffffff80 0000000000000000
> > GPR20: 0000000000000001 c000000000c600a8 0000000000000001 0000000000000001
> > GPR24: 0000000003354380 c000000000000000 0000000000000000 c000000000b65950
> > GPR28: 0000002000000000 00000000000cd50e 0000000000bf50d9 c000000000c7c230
> > [  543.076005] NIP [c000000000036eb0] .kernel_map_pages+0x1e0/0x3f8
> > [  543.076016] LR [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8
> > [  543.076025] Call Trace:
> > [  543.076033] [c0000000a9083540] [c000000000036ea4] .kernel_map_pages+0x1d4/0x3f8 (unreliable)
> > [  543.076053] [c0000000a9083640] [c000000000167638] .get_page_from_freelist+0x6cc/0x8dc
> > [  543.076067] [c0000000a9083800] [c000000000167a48] .__alloc_pages_nodemask+0x200/0x96c
> > [  543.076082] [c0000000a90839c0] [c0000000001ade44] .alloc_pages_vma+0x160/0x1e4
> > [  543.076098] [c0000000a9083a80] [c00000000018ce04] .handle_pte_fault+0x1b0/0x7e8
> > [  543.076113] [c0000000a9083b50] [c00000000018d5a8] .handle_mm_fault+0x16c/0x1a0
> > [  543.076129] [c0000000a9083c00] [c0000000007bf1dc] .do_page_fault+0x4d0/0x7a4
> > [  543.076144] [c0000000a9083e30] [c0000000000090e8] handle_page_fault+0x10/0x30
> > [  543.076155] Instruction dump:
> > [  543.076163] 7c630038 78631d88 e80a0000 f8410028 7c0903a6 e91f01de e96a0010 e84a0008
> > [  543.076192] 4e800421 e8410028 7c7107b4 7a200fe0 <0b000000> 7f63db78 48785781 60000000
> > [  543.076224] ---[ end trace bd5807e8d6ae186b ]---
> > 
> > Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/mm/hash_utils_64.c |   10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> > index a7f54f0..4b449a0 100644
> > --- a/arch/powerpc/mm/hash_utils_64.c
> > +++ b/arch/powerpc/mm/hash_utils_64.c
> > @@ -1272,7 +1272,7 @@ static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
> >  	unsigned long vsid = get_kernel_vsid(vaddr, mmu_kernel_ssize);
> >  	unsigned long vpn = hpt_vpn(vaddr, vsid, mmu_kernel_ssize);
> >  	unsigned long mode = htab_convert_pte_flags(PAGE_KERNEL);
> > -	int ret;
> > +	long ret;
> >  
> >  	hash = hpt_hash(vpn, PAGE_SHIFT, mmu_kernel_ssize);
> >  	hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
> > @@ -1280,9 +1280,11 @@ static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
> >  	/* Don't create HPTE entries for bad address */
> >  	if (!vsid)
> >  		return;
> > -	ret = ppc_md.hpte_insert(hpteg, vpn, __pa(vaddr),
> > -				 mode, 0,
> > -				 mmu_linear_psize, mmu_kernel_ssize);
> > +
> > +	ret = hpte_insert_repeating(hash, vpn, __pa(vaddr),
> > +				    mode, mmu_linear_psize,
> > +				    mmu_kernel_ssize);
> > +
> >  	BUG_ON (ret < 0);
> >  	spin_lock(&linear_map_hash_lock);
> >  	BUG_ON(linear_map_hash_slots[lmi] & 0x80);
> > -- 
> > 1.7.9.5
> > 
> 


* Re: [RFC PATCH v2 3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page()
  2013-04-15  8:15       ` Li Zhong
@ 2013-04-15 11:27         ` Benjamin Herrenschmidt
  2013-04-16  2:51           ` Li Zhong
  0 siblings, 1 reply; 14+ messages in thread
From: Benjamin Herrenschmidt @ 2013-04-15 11:27 UTC (permalink / raw)
  To: Li Zhong; +Cc: linuxppc-dev, Paul Mackerras

On Mon, 2013-04-15 at 16:15 +0800, Li Zhong wrote:

> So the code is implemented in ppc_md.hpte_remove(), may be called by
> __hash_huge_page(), and asm code htab_call_hpte_remove?
> 
> > This is why the linear mapping (and the vmemmap) must be bolted.
> 
> If not, it would result the infinite recursion like above?

Potentially, we don't expect to fault linear mapping or vmemmap entries
on demand. We aren't equipped to do it, and we occasionally have code
paths that access the linear mapping and cannot afford to have SRR0 and
SRR1 clobbered by a page fault.

Cheers,
Ben.


* Re: [RFC PATCH v2 3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page()
  2013-04-15 11:27         ` Benjamin Herrenschmidt
@ 2013-04-16  2:51           ` Li Zhong
  0 siblings, 0 replies; 14+ messages in thread
From: Li Zhong @ 2013-04-16  2:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Paul Mackerras

On Mon, 2013-04-15 at 13:27 +0200, Benjamin Herrenschmidt wrote:
> On Mon, 2013-04-15 at 16:15 +0800, Li Zhong wrote:
> 
> > So the code is implemented in ppc_md.hpte_remove(), may be called by
> > __hash_huge_page(), and asm code htab_call_hpte_remove?
> > 
> > > This is why the linear mapping (and the vmemmap) must be bolted.
> > 
> > If not, it would result the infinite recursion like above?
> 
> Potentially, we don't expect to fault linear mapping or vmemmap entries
> on demand. We aren't equipped to do it and we occasionally have code
> path that access the linear mapping and cannot afford to have SRR0 and
> SRR1 clobbered by a page fault.

Thank you for the education :)

Thanks, Zhong

> 
> Cheers,
> Ben.
> 
> 

