* [PATCH] powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hash
@ 2015-07-24  5:18 Michael Ellerman
From: Michael Ellerman @ 2015-07-24  5:18 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: cyrilbur, Michael Neuling, Benjamin Herrenschmidt, aneesh.kumar

The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K
PAGE_SIZE.

However, when built with a 4K PAGE_SIZE there is an additional config
option, PPC_HAS_HASH_64K, which can be enabled so that the kernel
also knows how to hash a 64K page even though the base PAGE_SIZE is 4K.

This is used in one obscure configuration, to support 64K pages for SPU
local store on the Cell processor when the rest of the kernel is using
4K pages.

In this configuration, pte_pagesize_index() is defined to just pass
through its arguments to get_slice_psize(). However pte_pagesize_index()
is called for both user and kernel addresses, whereas get_slice_psize()
only knows how to handle user addresses.

This has been broken forever; however, until recently it happened to
work. That was because in get_slice_psize() the large kernel address
would cause the right shift of the slice mask to return zero.

However, in commit 7aa0727f3302 "powerpc/mm: Increase the slice range to
64TB", the get_slice_psize() code was changed so that instead of a right
shift we do an array lookup based on the address. When passed a kernel
address this means we index way off the end of the slice array and
return random junk.

That is only fatal if we happen to hit something non-zero, but when we
do return a non-zero value we confuse the MMU code and eventually cause
a checkstop.

This fix is ugly, but simple. When we're called for a kernel address we
return 4K, which is always correct in this configuration, otherwise we
use the slice mask.

Fixes: 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB")
Reported-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/pgtable-ppc64.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 3bb7488bd24b..330ae1d81662 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -135,7 +135,15 @@
 #define pte_iterate_hashed_end() } while(0)
 
 #ifdef CONFIG_PPC_HAS_HASH_64K
-#define pte_pagesize_index(mm, addr, pte)	get_slice_psize(mm, addr)
+#define pte_pagesize_index(mm, addr, pte)			\
+	({							\
+		unsigned int psize;				\
+		if (is_kernel_addr(addr))			\
+			psize = MMU_PAGE_4K;			\
+		else						\
+			psize = get_slice_psize(mm, addr);	\
+		psize;						\
+	})
 #else
 #define pte_pagesize_index(mm, addr, pte)	MMU_PAGE_4K
 #endif
-- 
2.1.0


* Re: [PATCH] powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hash
From: Aneesh Kumar K.V @ 2015-07-24  6:45 UTC (permalink / raw)
  To: Michael Ellerman, linuxppc-dev; +Cc: Michael Neuling, cyrilbur

Michael Ellerman <mpe@ellerman.id.au> writes:

> The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K
> PAGE_SIZE.
...
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
> index 3bb7488bd24b..330ae1d81662 100644
> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
> @@ -135,7 +135,15 @@
>  #define pte_iterate_hashed_end() } while(0)
>  
>  #ifdef CONFIG_PPC_HAS_HASH_64K
> -#define pte_pagesize_index(mm, addr, pte)	get_slice_psize(mm, addr)
> +#define pte_pagesize_index(mm, addr, pte)			\
> +	({							\
> +		unsigned int psize;				\
> +		if (is_kernel_addr(addr))			\
> +			psize = MMU_PAGE_4K;			\
> +		else						\
> +			psize = get_slice_psize(mm, addr);	\
> +		psize;						\
> +	})
>  #else
>  #define pte_pagesize_index(mm, addr, pte)	MMU_PAGE_4K
>  #endif

That is confusing, because we enable PPC_HAS_HASH_64K for a 64K page size
too. Why not:

psize = mmu_virtual_psize;


But that leaves another question. What if a kernel address used a 16MB
mapping? Or are we only going to get calls to pte_pagesize_index() for
the vmalloc area of the kernel? In any case, this needs a comment
explaining the callers, and possibly a DEBUG_VM WARN_ON() to catch wrong
users.

-aneesh


* Re: [PATCH] powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hash
From: Michael Ellerman @ 2015-07-25  8:59 UTC (permalink / raw)
  To: Aneesh Kumar K.V, Benjamin Herrenschmidt
  Cc: linuxppc-dev, Michael Neuling, cyrilbur

On Fri, 2015-07-24 at 12:15 +0530, Aneesh Kumar K.V wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> 
> > The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K
> > PAGE_SIZE.
...
> > diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
> > index 3bb7488bd24b..330ae1d81662 100644
> > --- a/arch/powerpc/include/asm/pgtable-ppc64.h
> > +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
> > @@ -135,7 +135,15 @@
> >  #define pte_iterate_hashed_end() } while(0)
> >  
> >  #ifdef CONFIG_PPC_HAS_HASH_64K
> > -#define pte_pagesize_index(mm, addr, pte)	get_slice_psize(mm, addr)
> > +#define pte_pagesize_index(mm, addr, pte)			\
> > +	({							\
> > +		unsigned int psize;				\
> > +		if (is_kernel_addr(addr))			\
> > +			psize = MMU_PAGE_4K;			\
> > +		else						\
> > +			psize = get_slice_psize(mm, addr);	\
> > +		psize;						\
> > +	})
> >  #else
> >  #define pte_pagesize_index(mm, addr, pte)	MMU_PAGE_4K
> >  #endif
> 
> That is confusing, because we enable PPC_HAS_HASH_64K for a 64K page size
> too.

We do, but in that case we get the definition in pte-hash64-64k.h which is:

  #define pte_pagesize_index(mm, addr, pte)	\
	(((pte) & _PAGE_COMBO)? MMU_PAGE_4K: MMU_PAGE_64K)

> why not
> psize = mmu_virtual_psize;


Maybe. Though I think mmu_io_psize would actually be correct. But none of
the other versions of the macro use the mmu_xx_psize variables; they all
use the MMU_PAGE_xx #defines. So basically I just aped those.

Hopefully Ben can chime in, he wrote it originally.

> But that leaves another question. What if a kernel address used a 16MB
> mapping? Or are we only going to get calls to pte_pagesize_index() for
> the vmalloc area of the kernel?

Not sure. I can't see any guarantee of that. I guess we don't map/unmap the
linear mapping, so possibly we're just getting away with it? And it looks
like DEBUG_PAGEALLOC doesn't hit it.

> In any case, this needs a comment explaining the callers, and possibly
> a DEBUG_VM WARN_ON() to catch wrong users.

My plan is actually to drop support for 64K hash with 4K PAGE_SIZE as soon as
we've fixed this. I just didn't want to remove the code in a known broken state
when we knew how to fix it.

When we drop that support we'll just end up with two versions for 64K and 4K
respectively:

 #define pte_pagesize_index(mm, addr, pte)	\
	(((pte) & _PAGE_COMBO)? MMU_PAGE_4K: MMU_PAGE_64K)

 #define pte_pagesize_index(mm, addr, pte)	MMU_PAGE_4K


And given it's only used in one function, I'd be inclined to just open code
it, or at the very least move the macro into tlb_hash64.c.

cheers

