* [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
@ 2015-10-17 10:08 Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 01/31] powerpc/mm: move pte headers to book3s directory Aneesh Kumar K.V
                   ` (32 more replies)
  0 siblings, 33 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Hi All,

This patch series attempts to update the book3s 64 Linux page table format to
make it more flexible. Our current pte format is very restrictive and we
overload multiple pte bits. This is due to the non-availability of free bits
in pte_t: we currently use pte_t to track the validity of 4K subpages. This
patch series frees up 11 bits in pte_t by moving the 4K subpage tracking to
the lower half of the PTE page. The pte format is updated such that we have a
better method for identifying a pte entry at the pmd level. This will also
enable us to implement hugetlb migration (not yet done in this series).
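
As a rough sketch of the idea (hypothetical helper name and layout, not the
exact code in this series): the subpage hash-slot state lives in the PTE page
next to the pte_t entries, one byte per pte, so pte_t itself no longer needs
the subpage valid/hidx bits.

static inline unsigned char *example_subpage_track(pte_t *pte_page,
						   unsigned long idx)
{
	/* tracking bytes follow the PTRS_PER_PTE pte entries */
	unsigned char *track = (unsigned char *)(pte_page + PTRS_PER_PTE);

	return track + idx;	/* one tracking byte per pte */
}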

Before making the changes to the pte format, I am splitting the pte header
definitions so that we now have the below layout for the headers:

book3s
   32
     hash.h pgtable.h 
   64
     hash.h  pgtable.h hash-4k.h hash-64k.h
booke
  32
     pgtable.h pte-40x.h pte-44x.h pte-8xx.h pte-fsl-booke.h
  64
    pgtable-4k.h  pgtable-64k.h  pgtable.h

I have done the header split such that the booke headers are modified only
minimally, so as to avoid causing breakage in booke.

The patch series can also be found at
https://github.com/kvaneesh/linux.git book3s-pte-format 
https://github.com/kvaneesh/linux/commits/book3s-pte-format


Performance numbers with and without the patch series:

Path length __hash_page_4k
with patch: 196
without patch: 142

Path length __hash_page_64k
with patch: 219
without patch: 154

Even though we have a path length increase of around 50 instructions, we don't
see any impact when running workloads. I tried the kernel build test.

With THP enabled (which is the default) we see an improvement. I haven't fully
looked at the reason. This could be due to reduced contention on the ptl lock;
__hash_thp_page is already C code.

make -j64 vmlinux modules 
With fix:
---------
real    1m35.509s
user    56m8.565s
sys     4m34.973s

real    1m32.174s
user    57m2.336s
sys     4m39.142s

Without fix:
---------------
real    1m37.703s
user    58m50.783s
sys     7m52.440s

real    1m37.890s
user    57m55.445s
sys     7m50.501s

THP disabled:

make -j64 vmlinux modules 
With fix:
---------
real    1m37.197s
user    58m28.672s
sys     7m58.188s

real    1m44.638s
user    58m37.551s
sys     7m53.960s

Without fix:
------------
real    1m41.224s
user    58m46.944s
sys     7m49.714s

real    1m42.585s
user    59m14.019s
sys     7m52.714s


Changes from V3:
* Add missing #define pgprot_*
* Add Acked-by

Changes from V2:
* rebase to -next for powerpc tree

Changes from V1:
1) Build fix with STRICT_MM_TYPECHECKS enabled
2) pte_mkwrite fix for nohash
3) Rebase to the latest Linus tree.


Aneesh Kumar K.V (31):
  powerpc/mm: move pte headers to book3s directory
  powerpc/mm: move pte headers to book3s directory (part 2)
  powerpc/mm: make a separate copy for book3s
  powerpc/mm: make a separate copy for book3s (part 2)
  powerpc/mm: Move hash specific pte width and other defines to book3s
  powerpc/mm: Delete booke bits from book3s
  powerpc/mm: Don't have generic headers introduce functions touching
    pte bits
  powerpc/mm: Drop pte-common.h from BOOK3S 64
  powerpc/mm: Don't use pte_val as lvalue
  powerpc/mm: Don't use pmd_val,pud_val and pgd_val as lvalue
  powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h
  powerpc/mm: Move PTE bits from generic functions to hash64 functions.
  powerpc/booke: Move nohash headers (part 1)
  powerpc/booke: Move nohash headers (part 2)
  powerpc/booke: Move nohash headers (part 3)
  powerpc/booke: Move nohash headers (part 4)
  powerpc/booke: Move nohash headers (part 5)
  powerpc/mm: Increase the pte frag size.
  powerpc/mm: Convert 4k hash insert to C
  powerpc/mm: update __real_pte to take address as argument
  powerpc/mm: make pte page hash index slot 8 bits
  powerpc/mm: Don't track subpage valid bit in pte_t
  powerpc/mm: Increase the width of #define
  powerpc/mm: Convert __hash_page_64K to C
  powerpc/mm: Convert 4k insert from asm to C
  powerpc/mm: Remove the dependency on pte bit position in asm code
  powerpc/mm: Add helper for converting pte bit to hpte bits
  powerpc/mm: Move WIMG update to helper.
  powerpc/mm: Move hugetlb related headers
  powerpc/mm: Move THP headers around
  powerpc/mm: Add a _PAGE_PTE bit

 .../include/asm/{pte-hash32.h => book3s/32/hash.h} |    6 +-
 .../asm/{pgtable-ppc32.h => book3s/32/pgtable.h}   |  286 ++++--
 .../{pgtable-ppc64-4k.h => book3s/64/hash-4k.h}    |   58 +-
 arch/powerpc/include/asm/book3s/64/hash-64k.h      |  296 ++++++
 arch/powerpc/include/asm/book3s/64/hash.h          |  530 +++++++++++
 arch/powerpc/include/asm/book3s/64/pgtable.h       |  266 ++++++
 arch/powerpc/include/asm/book3s/pgtable.h          |   29 +
 arch/powerpc/include/asm/mmu-hash64.h              |    2 +-
 .../asm/{pgtable-ppc32.h => nohash/32/pgtable.h}   |   25 +-
 arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h |    6 +-
 arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h |    6 +-
 arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h |    6 +-
 .../include/asm/{ => nohash/32}/pte-fsl-booke.h    |    6 +-
 .../{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} |   12 +-
 .../64/pgtable-64k.h}                              |    6 +-
 .../asm/{pgtable-ppc64.h => nohash/64/pgtable.h}   |  307 +-----
 arch/powerpc/include/asm/{ => nohash}/pgtable.h    |  175 ++--
 arch/powerpc/include/asm/{ => nohash}/pte-book3e.h |    6 +-
 arch/powerpc/include/asm/page.h                    |   90 +-
 arch/powerpc/include/asm/pgalloc-32.h              |   34 +-
 arch/powerpc/include/asm/pgalloc-64.h              |   29 +-
 arch/powerpc/include/asm/pgtable.h                 |  200 +---
 arch/powerpc/include/asm/pte-common.h              |    5 +
 arch/powerpc/include/asm/pte-hash64-4k.h           |   17 -
 arch/powerpc/include/asm/pte-hash64-64k.h          |  102 --
 arch/powerpc/include/asm/pte-hash64.h              |   54 --
 arch/powerpc/kernel/exceptions-64s.S               |   16 +-
 arch/powerpc/mm/40x_mmu.c                          |   10 +-
 arch/powerpc/mm/Makefile                           |    9 +-
 arch/powerpc/mm/hash64_4k.c                        |  123 +++
 arch/powerpc/mm/hash64_64k.c                       |  313 ++++++
 arch/powerpc/mm/hash_low_64.S                      | 1003 --------------------
 arch/powerpc/mm/hash_native_64.c                   |   10 +
 arch/powerpc/mm/hash_utils_64.c                    |  105 +-
 arch/powerpc/mm/hugepage-hash64.c                  |   20 +-
 arch/powerpc/mm/hugetlbpage-hash64.c               |   15 +-
 arch/powerpc/mm/hugetlbpage.c                      |   58 +-
 arch/powerpc/mm/pgtable.c                          |    4 +
 arch/powerpc/mm/pgtable_64.c                       |   28 +-
 arch/powerpc/mm/tlb_hash64.c                       |    2 +-
 arch/powerpc/platforms/pseries/lpar.c              |   10 +
 41 files changed, 2184 insertions(+), 2101 deletions(-)
 rename arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} (93%)
 copy arch/powerpc/include/asm/{pgtable-ppc32.h => book3s/32/pgtable.h} (62%)
 copy arch/powerpc/include/asm/{pgtable-ppc64-4k.h => book3s/64/hash-4k.h} (71%)
 create mode 100644 arch/powerpc/include/asm/book3s/64/hash-64k.h
 create mode 100644 arch/powerpc/include/asm/book3s/64/hash.h
 create mode 100644 arch/powerpc/include/asm/book3s/64/pgtable.h
 create mode 100644 arch/powerpc/include/asm/book3s/pgtable.h
 rename arch/powerpc/include/asm/{pgtable-ppc32.h => nohash/32/pgtable.h} (96%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h (95%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h (96%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h (95%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-fsl-booke.h (88%)
 rename arch/powerpc/include/asm/{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} (92%)
 rename arch/powerpc/include/asm/{pgtable-ppc64-64k.h => nohash/64/pgtable-64k.h} (90%)
 rename arch/powerpc/include/asm/{pgtable-ppc64.h => nohash/64/pgtable.h} (56%)
 copy arch/powerpc/include/asm/{ => nohash}/pgtable.h (62%)
 rename arch/powerpc/include/asm/{ => nohash}/pte-book3e.h (95%)
 delete mode 100644 arch/powerpc/include/asm/pte-hash64-4k.h
 delete mode 100644 arch/powerpc/include/asm/pte-hash64-64k.h
 delete mode 100644 arch/powerpc/include/asm/pte-hash64.h
 create mode 100644 arch/powerpc/mm/hash64_4k.c
 create mode 100644 arch/powerpc/mm/hash64_64k.c
 delete mode 100644 arch/powerpc/mm/hash_low_64.S

-- 
2.5.0


* [PATCH V4 01/31] powerpc/mm: move pte headers to book3s directory
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 02/31] powerpc/mm: move pte headers to book3s directory (part 2) Aneesh Kumar K.V
                   ` (31 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} | 0
 arch/powerpc/include/asm/{pte-hash64.h => book3s/64/hash.h} | 0
 arch/powerpc/include/asm/pgtable-ppc32.h                    | 2 +-
 arch/powerpc/include/asm/pgtable-ppc64.h                    | 2 +-
 4 files changed, 2 insertions(+), 2 deletions(-)
 rename arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} (100%)
 rename arch/powerpc/include/asm/{pte-hash64.h => book3s/64/hash.h} (100%)

diff --git a/arch/powerpc/include/asm/pte-hash32.h b/arch/powerpc/include/asm/book3s/32/hash.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-hash32.h
rename to arch/powerpc/include/asm/book3s/32/hash.h
diff --git a/arch/powerpc/include/asm/pte-hash64.h b/arch/powerpc/include/asm/book3s/64/hash.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-hash64.h
rename to arch/powerpc/include/asm/book3s/64/hash.h
diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/pgtable-ppc32.h
index 9c326565d498..1a58a05be99c 100644
--- a/arch/powerpc/include/asm/pgtable-ppc32.h
+++ b/arch/powerpc/include/asm/pgtable-ppc32.h
@@ -116,7 +116,7 @@ extern int icache_44x_need_flush;
 #elif defined(CONFIG_8xx)
 #include <asm/pte-8xx.h>
 #else /* CONFIG_6xx */
-#include <asm/pte-hash32.h>
+#include <asm/book3s/32/hash.h>
 #endif
 
 /* And here we include common definitions */
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 3245f2d96d4f..b36a932abdfb 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -98,7 +98,7 @@
  * Include the PTE bits definitions
  */
 #ifdef CONFIG_PPC_BOOK3S
-#include <asm/pte-hash64.h>
+#include <asm/book3s/64/hash.h>
 #else
 #include <asm/pte-book3e.h>
 #endif
-- 
2.5.0


* [PATCH V4 02/31] powerpc/mm: move pte headers to book3s directory (part 2)
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 01/31] powerpc/mm: move pte headers to book3s directory Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 03/31] powerpc/mm: make a separate copy for book3s Aneesh Kumar K.V
                   ` (30 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Splitting this so that git's rename detection can track changes to the file.
Before merging we will fold this back in.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/hash.h                      |  6 +++---
 .../include/asm/{pte-hash64-4k.h => book3s/64/hash-4k.h}       |  1 -
 .../include/asm/{pte-hash64-64k.h => book3s/64/hash-64k.h}     |  0
 arch/powerpc/include/asm/book3s/64/hash.h                      | 10 +++++-----
 4 files changed, 8 insertions(+), 9 deletions(-)
 rename arch/powerpc/include/asm/{pte-hash64-4k.h => book3s/64/hash-4k.h} (99%)
 rename arch/powerpc/include/asm/{pte-hash64-64k.h => book3s/64/hash-64k.h} (100%)

diff --git a/arch/powerpc/include/asm/book3s/32/hash.h b/arch/powerpc/include/asm/book3s/32/hash.h
index 62cfb0c663bb..264b754d65b0 100644
--- a/arch/powerpc/include/asm/book3s/32/hash.h
+++ b/arch/powerpc/include/asm/book3s/32/hash.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_HASH32_H
-#define _ASM_POWERPC_PTE_HASH32_H
+#ifndef _ASM_POWERPC_BOOK3S_32_HASH_H
+#define _ASM_POWERPC_BOOK3S_32_HASH_H
 #ifdef __KERNEL__
 
 /*
@@ -43,4 +43,4 @@
 #define PTE_ATOMIC_UPDATES	1
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_HASH32_H */
+#endif /* _ASM_POWERPC_BOOK3S_32_HASH_H */
diff --git a/arch/powerpc/include/asm/pte-hash64-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
similarity index 99%
rename from arch/powerpc/include/asm/pte-hash64-4k.h
rename to arch/powerpc/include/asm/book3s/64/hash-4k.h
index c134e809aac3..79750fd3eeb8 100644
--- a/arch/powerpc/include/asm/pte-hash64-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -14,4 +14,3 @@
 
 /* shift to put page number into pte */
 #define PTE_RPN_SHIFT	(17)
-
diff --git a/arch/powerpc/include/asm/pte-hash64-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-hash64-64k.h
rename to arch/powerpc/include/asm/book3s/64/hash-64k.h
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index ef612c160da7..8e60d4fa434d 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_HASH64_H
-#define _ASM_POWERPC_PTE_HASH64_H
+#ifndef _ASM_POWERPC_BOOK3S_64_HASH_H
+#define _ASM_POWERPC_BOOK3S_64_HASH_H
 #ifdef __KERNEL__
 
 /*
@@ -45,10 +45,10 @@
 #define PTE_ATOMIC_UPDATES	1
 
 #ifdef CONFIG_PPC_64K_PAGES
-#include <asm/pte-hash64-64k.h>
+#include <asm/book3s/64/hash-64k.h>
 #else
-#include <asm/pte-hash64-4k.h>
+#include <asm/book3s/64/hash-4k.h>
 #endif
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_HASH64_H */
+#endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */
-- 
2.5.0


* [PATCH V4 03/31] powerpc/mm: make a separate copy for book3s
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 01/31] powerpc/mm: move pte headers to book3s directory Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 02/31] powerpc/mm: move pte headers to book3s directory (part 2) Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 04/31] powerpc/mm: make a separate copy for book3s (part 2) Aneesh Kumar K.V
                   ` (29 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

In this patch we do:
cp pgtable-ppc32.h book3s/32/pgtable.h
cp pgtable-ppc64.h book3s/64/pgtable.h

This enables us to make further changes to the hash-specific config.
We will change the page table format for 64-bit hash in later patches.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 .../include/asm/{pgtable-ppc32.h => book3s/32/pgtable.h}       |  0
 .../include/asm/{pgtable-ppc64.h => book3s/64/pgtable.h}       |  0
 arch/powerpc/include/asm/book3s/pgtable.h                      | 10 ++++++++++
 arch/powerpc/include/asm/mmu-hash64.h                          |  2 +-
 arch/powerpc/include/asm/pgtable.h                             |  4 ++++
 5 files changed, 15 insertions(+), 1 deletion(-)
 copy arch/powerpc/include/asm/{pgtable-ppc32.h => book3s/32/pgtable.h} (100%)
 copy arch/powerpc/include/asm/{pgtable-ppc64.h => book3s/64/pgtable.h} (100%)
 create mode 100644 arch/powerpc/include/asm/book3s/pgtable.h

diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
similarity index 100%
copy from arch/powerpc/include/asm/pgtable-ppc32.h
copy to arch/powerpc/include/asm/book3s/32/pgtable.h
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
similarity index 100%
copy from arch/powerpc/include/asm/pgtable-ppc64.h
copy to arch/powerpc/include/asm/book3s/64/pgtable.h
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
new file mode 100644
index 000000000000..a8d8e5152bd4
--- /dev/null
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -0,0 +1,10 @@
+#ifndef _ASM_POWERPC_BOOK3S_PGTABLE_H
+#define _ASM_POWERPC_BOOK3S_PGTABLE_H
+
+#ifdef CONFIG_PPC64
+#include <asm/book3s/64/pgtable.h>
+#else
+#include <asm/book3s/32/pgtable.h>
+#endif
+
+#endif
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index ba3342bbdbda..7352d3f212df 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -21,7 +21,7 @@
  * need for various slices related matters. Note that this isn't the
  * complete pgtable.h but only a portion of it.
  */
-#include <asm/pgtable-ppc64.h>
+#include <asm/book3s/64/pgtable.h>
 #include <asm/bug.h>
 #include <asm/processor.h>
 
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index b64b4212b71f..c304d0767919 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -13,11 +13,15 @@ struct mm_struct;
 
 #endif /* !__ASSEMBLY__ */
 
+#ifdef CONFIG_PPC_BOOK3S
+#include <asm/book3s/pgtable.h>
+#else
 #if defined(CONFIG_PPC64)
 #  include <asm/pgtable-ppc64.h>
 #else
 #  include <asm/pgtable-ppc32.h>
 #endif
+#endif /* !CONFIG_PPC_BOOK3S */
 
 /*
  * We save the slot number & secondary bit in the second half of the
-- 
2.5.0


* [PATCH V4 04/31] powerpc/mm: make a separate copy for book3s (part 2)
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (2 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 03/31] powerpc/mm: make a separate copy for book3s Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 05/31] powerpc/mm: Move hash specific pte width and other defines to book3s Aneesh Kumar K.V
                   ` (28 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Keep it separate to make rebasing easier

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 4 ++--
 arch/powerpc/include/asm/book3s/64/pgtable.h | 8 ++++----
 arch/powerpc/include/asm/pgtable-ppc32.h     | 2 --
 arch/powerpc/include/asm/pgtable-ppc64.h     | 4 ----
 4 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 1a58a05be99c..a7738dfbe7e5 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC32_H
-#define _ASM_POWERPC_PGTABLE_PPC32_H
+#ifndef _ASM_POWERPC_BOOK3S_32_PGTABLE_H
+#define _ASM_POWERPC_BOOK3S_32_PGTABLE_H
 
 #include <asm-generic/pgtable-nopmd.h>
 
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index b36a932abdfb..cdd5284d9eaa 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC64_H_
-#define _ASM_POWERPC_PGTABLE_PPC64_H_
+#ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
+#define _ASM_POWERPC_BOOK3S_64_PGTABLE_H_
 /*
  * This file contains the functions and defines necessary to modify and use
  * the ppc64 hashed page table.
@@ -18,7 +18,7 @@
  * Size of EA range mapped by our pagetables.
  */
 #define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
-                	    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
+			    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
 #define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -623,4 +623,4 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
 	return true;
 }
 #endif /* __ASSEMBLY__ */
-#endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
+#endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/pgtable-ppc32.h
index 1a58a05be99c..aac6547b0823 100644
--- a/arch/powerpc/include/asm/pgtable-ppc32.h
+++ b/arch/powerpc/include/asm/pgtable-ppc32.h
@@ -115,8 +115,6 @@ extern int icache_44x_need_flush;
 #include <asm/pte-fsl-booke.h>
 #elif defined(CONFIG_8xx)
 #include <asm/pte-8xx.h>
-#else /* CONFIG_6xx */
-#include <asm/book3s/32/hash.h>
 #endif
 
 /* And here we include common definitions */
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index b36a932abdfb..1ef0fea32e1e 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -97,11 +97,7 @@
 /*
  * Include the PTE bits definitions
  */
-#ifdef CONFIG_PPC_BOOK3S
-#include <asm/book3s/64/hash.h>
-#else
 #include <asm/pte-book3e.h>
-#endif
 #include <asm/pte-common.h>
 
 #ifdef CONFIG_PPC_MM_SLICES
-- 
2.5.0


* [PATCH V4 05/31] powerpc/mm: Move hash specific pte width and other defines to book3s
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (3 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 04/31] powerpc/mm: make a separate copy for book3s (part 2) Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 06/31] powerpc/mm: Delete booke bits from book3s Aneesh Kumar K.V
                   ` (27 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

This further makes a copy of the pte defines into book3s/64/hash*.h. This
removes the dependency on pgtable-ppc64-4k.h and pgtable-ppc64-64k.h.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 87 ++++++++++++++++++++++++++-
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 46 +++++++++++++-
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  6 +-
 3 files changed, 130 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 79750fd3eeb8..f2c51cd61f69 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -1,4 +1,51 @@
-/* To be include by pgtable-hash64.h only */
+#ifndef _ASM_POWERPC_BOOK3S_64_HASH_4K_H
+#define _ASM_POWERPC_BOOK3S_64_HASH_4K_H
+/*
+ * Entries per page directory level.  The PTE level must use a 64b record
+ * for each page table entry.  The PMD and PGD level use a 32b record for
+ * each entry by assuming that each entry is page aligned.
+ */
+#define PTE_INDEX_SIZE  9
+#define PMD_INDEX_SIZE  7
+#define PUD_INDEX_SIZE  9
+#define PGD_INDEX_SIZE  9
+
+#ifndef __ASSEMBLY__
+#define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_INDEX_SIZE)
+#define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
+#define PUD_TABLE_SIZE	(sizeof(pud_t) << PUD_INDEX_SIZE)
+#define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)
+#endif	/* __ASSEMBLY__ */
+
+#define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
+#define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
+#define PTRS_PER_PUD	(1 << PUD_INDEX_SIZE)
+#define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)
+
+/* PMD_SHIFT determines what a second-level page table entry can map */
+#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
+#define PMD_SIZE	(1UL << PMD_SHIFT)
+#define PMD_MASK	(~(PMD_SIZE-1))
+
+/* With 4k base page size, hugepage PTEs go at the PMD level */
+#define MIN_HUGEPTE_SHIFT	PMD_SHIFT
+
+/* PUD_SHIFT determines what a third-level page table entry can map */
+#define PUD_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)
+#define PUD_SIZE	(1UL << PUD_SHIFT)
+#define PUD_MASK	(~(PUD_SIZE-1))
+
+/* PGDIR_SHIFT determines what a fourth-level page table entry can map */
+#define PGDIR_SHIFT	(PUD_SHIFT + PUD_INDEX_SIZE)
+#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK	(~(PGDIR_SIZE-1))
+
+/* Bits to mask out from a PMD to get to the PTE page */
+#define PMD_MASKED_BITS		0
+/* Bits to mask out from a PUD to get to the PMD page */
+#define PUD_MASKED_BITS		0
+/* Bits to mask out from a PGD to get to the PUD page */
+#define PGD_MASKED_BITS		0
 
 /* PTE bits */
 #define _PAGE_HASHPTE	0x0400 /* software: pte has an associated HPTE */
@@ -14,3 +61,41 @@
 
 /* shift to put page number into pte */
 #define PTE_RPN_SHIFT	(17)
+
+#ifndef __ASSEMBLY__
+/*
+ * 4-level page tables related bits
+ */
+
+#define pgd_none(pgd)		(!pgd_val(pgd))
+#define pgd_bad(pgd)		(pgd_val(pgd) == 0)
+#define pgd_present(pgd)	(pgd_val(pgd) != 0)
+#define pgd_clear(pgdp)		(pgd_val(*(pgdp)) = 0)
+#define pgd_page_vaddr(pgd)	(pgd_val(pgd) & ~PGD_MASKED_BITS)
+
+static inline pte_t pgd_pte(pgd_t pgd)
+{
+	return __pte(pgd_val(pgd));
+}
+
+static inline pgd_t pte_pgd(pte_t pte)
+{
+	return __pgd(pte_val(pte));
+}
+extern struct page *pgd_page(pgd_t pgd);
+
+#define pud_offset(pgdp, addr)	\
+  (((pud_t *) pgd_page_vaddr(*(pgdp))) + \
+    (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1)))
+
+#define pud_ERROR(e) \
+	pr_err("%s:%d: bad pud %08lx.\n", __FILE__, __LINE__, pud_val(e))
+
+/*
+ * On all 4K setups, remap_4k_pfn() equates to remap_pfn_range() */
+#define remap_4k_pfn(vma, addr, pfn, prot)	\
+	remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot))
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* _ASM_POWERPC_BOOK3S_64_HASH_4K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 4f4ec2ab45c9..ee073822145d 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -1,4 +1,35 @@
-/* To be include by pgtable-hash64.h only */
+#ifndef _ASM_POWERPC_BOOK3S_64_HASH_64K_H
+#define _ASM_POWERPC_BOOK3S_64_HASH_64K_H
+
+#include <asm-generic/pgtable-nopud.h>
+
+#define PTE_INDEX_SIZE  8
+#define PMD_INDEX_SIZE  10
+#define PUD_INDEX_SIZE	0
+#define PGD_INDEX_SIZE  12
+
+#define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
+#define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
+#define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)
+
+/* With 4k base page size, hugepage PTEs go at the PMD level */
+#define MIN_HUGEPTE_SHIFT	PAGE_SHIFT
+
+/* PMD_SHIFT determines what a second-level page table entry can map */
+#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
+#define PMD_SIZE	(1UL << PMD_SHIFT)
+#define PMD_MASK	(~(PMD_SIZE-1))
+
+/* PGDIR_SHIFT determines what a third-level page table entry can map */
+#define PGDIR_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)
+#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
+#define PGDIR_MASK	(~(PGDIR_SIZE-1))
+
+/* Bits to mask out from a PMD to get to the PTE page */
+/* PMDs point to PTE table fragments which are 4K aligned.  */
+#define PMD_MASKED_BITS		0xfff
+/* Bits to mask out from a PGD/PUD to get to the PMD page */
+#define PUD_MASKED_BITS		0x1ff
 
 /* Additional PTE bits (don't change without checking asm in hash_low.S) */
 #define _PAGE_SPECIAL	0x00000400 /* software: special page */
@@ -74,8 +105,8 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 #define __rpte_to_pte(r)	((r).pte)
 #define __rpte_sub_valid(rpte, index) \
 	(pte_val(rpte.pte) & (_PAGE_HPTE_SUB0 >> (index)))
-
-/* Trick: we set __end to va + 64k, which happens works for
+/*
+ * Trick: we set __end to va + 64k, which happens works for
  * a 16M page as well as we want only one iteration
  */
 #define pte_iterate_hashed_subpages(rpte, psize, vpn, index, shift)	\
@@ -99,4 +130,13 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 		remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE,	\
 			__pgprot(pgprot_val((prot)) | _PAGE_4K_PFN)))
 
+#define PTE_TABLE_SIZE	(sizeof(real_pte_t) << PTE_INDEX_SIZE)
+#define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
+#define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)
+
+#define pgd_pte(pgd)	(pud_pte(((pud_t){ pgd })))
+#define pte_pgd(pte)	((pgd_t)pte_pud(pte))
+
 #endif	/* __ASSEMBLY__ */
+
+#endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index cdd5284d9eaa..2741ac6fbd3d 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -5,11 +5,7 @@
  * the ppc64 hashed page table.
  */
 
-#ifdef CONFIG_PPC_64K_PAGES
-#include <asm/pgtable-ppc64-64k.h>
-#else
-#include <asm/pgtable-ppc64-4k.h>
-#endif
+#include <asm/book3s/64/hash.h>
 #include <asm/barrier.h>
 
 #define FIRST_USER_ADDRESS	0UL
-- 
2.5.0


* [PATCH V4 06/31] powerpc/mm: Delete booke bits from book3s
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (4 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 05/31] powerpc/mm: Move hash specific pte width and other defines to book3s Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 07/31] powerpc/mm: Don't have generic headers introduce functions touching pte bits Aneesh Kumar K.V
                   ` (26 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

We also move __ASSEMBLY__ towards the end of the header. This avoids
having #ifndef __ASSEMBLY__ all over the header.
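
The resulting header layout looks roughly like this (illustrative names only,
not the actual header contents):

#ifndef _EXAMPLE_HEADER_H
#define _EXAMPLE_HEADER_H

/* plain defines first, visible to both C and assembly */
#define _EXAMPLE_PTE_BIT	0x0400

#ifndef __ASSEMBLY__
/* C-only parts grouped once, towards the end of the header */
static inline int example_pte_bit_set(unsigned long pte)
{
	return (pte & _EXAMPLE_PTE_BIT) != 0;
}
#endif /* __ASSEMBLY__ */

#endif /* _EXAMPLE_HEADER_H */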

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 93 +++++++---------------------
 arch/powerpc/include/asm/book3s/64/pgtable.h | 88 ++++++++------------------
 arch/powerpc/include/asm/book3s/pgtable.h    |  1 +
 3 files changed, 51 insertions(+), 131 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index a7738dfbe7e5..2afe5958c837 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -3,18 +3,10 @@
 
 #include <asm-generic/pgtable-nopmd.h>
 
-#ifndef __ASSEMBLY__
-#include <linux/sched.h>
-#include <linux/threads.h>
-#include <asm/io.h>			/* For sub-arch specific PPC_PIN_SIZE */
-
-extern unsigned long ioremap_bot;
-
-#ifdef CONFIG_44x
-extern int icache_44x_need_flush;
-#endif
+#include <asm/book3s/32/hash.h>
 
-#endif /* __ASSEMBLY__ */
+/* And here we include common definitions */
+#include <asm/pte-common.h>
 
 /*
  * The normal case is that PTEs are 32-bits and we have a 1-page
@@ -31,28 +23,11 @@ extern int icache_44x_need_flush;
 #define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
 #define PGDIR_MASK	(~(PGDIR_SIZE-1))
 
-/*
- * entries per page directory level: our page-table tree is two-level, so
- * we don't really have any PMD directory.
- */
-#ifndef __ASSEMBLY__
-#define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_SHIFT)
-#define PGD_TABLE_SIZE	(sizeof(pgd_t) << (32 - PGDIR_SHIFT))
-#endif	/* __ASSEMBLY__ */
-
 #define PTRS_PER_PTE	(1 << PTE_SHIFT)
 #define PTRS_PER_PMD	1
 #define PTRS_PER_PGD	(1 << (32 - PGDIR_SHIFT))
 
 #define USER_PTRS_PER_PGD	(TASK_SIZE / PGDIR_SIZE)
-#define FIRST_USER_ADDRESS	0UL
-
-#define pte_ERROR(e) \
-	pr_err("%s:%d: bad pte %llx.\n", __FILE__, __LINE__, \
-		(unsigned long long)pte_val(e))
-#define pgd_ERROR(e) \
-	pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
-
 /*
  * This is the bottom of the PKMAP area with HIGHMEM or an arbitrary
  * value (for now) on others, from where we can start layout kernel
@@ -100,30 +75,30 @@ extern int icache_44x_need_flush;
 #endif
 #define VMALLOC_END	ioremap_bot
 
+#ifndef __ASSEMBLY__
+#include <linux/sched.h>
+#include <linux/threads.h>
+#include <asm/io.h>			/* For sub-arch specific PPC_PIN_SIZE */
+
+extern unsigned long ioremap_bot;
+
+/*
+ * entries per page directory level: our page-table tree is two-level, so
+ * we don't really have any PMD directory.
+ */
+#define PTE_TABLE_SIZE	(sizeof(pte_t) << PTE_SHIFT)
+#define PGD_TABLE_SIZE	(sizeof(pgd_t) << (32 - PGDIR_SHIFT))
+
+#define pte_ERROR(e) \
+	pr_err("%s:%d: bad pte %llx.\n", __FILE__, __LINE__, \
+		(unsigned long long)pte_val(e))
+#define pgd_ERROR(e) \
+	pr_err("%s:%d: bad pgd %08lx.\n", __FILE__, __LINE__, pgd_val(e))
 /*
  * Bits in a linux-style PTE.  These match the bits in the
  * (hardware-defined) PowerPC PTE as closely as possible.
  */
 
-#if defined(CONFIG_40x)
-#include <asm/pte-40x.h>
-#elif defined(CONFIG_44x)
-#include <asm/pte-44x.h>
-#elif defined(CONFIG_FSL_BOOKE) && defined(CONFIG_PTE_64BIT)
-#include <asm/pte-book3e.h>
-#elif defined(CONFIG_FSL_BOOKE)
-#include <asm/pte-fsl-booke.h>
-#elif defined(CONFIG_8xx)
-#include <asm/pte-8xx.h>
-#else /* CONFIG_6xx */
-#include <asm/book3s/32/hash.h>
-#endif
-
-/* And here we include common definitions */
-#include <asm/pte-common.h>
-
-#ifndef __ASSEMBLY__
-
 #define pte_clear(mm, addr, ptep) \
 	do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)
 
@@ -167,7 +142,6 @@ static inline unsigned long pte_update(pte_t *p,
 				       unsigned long clr,
 				       unsigned long set)
 {
-#ifdef PTE_ATOMIC_UPDATES
 	unsigned long old, tmp;
 
 	__asm__ __volatile__("\
@@ -180,15 +154,7 @@ static inline unsigned long pte_update(pte_t *p,
 	: "=&r" (old), "=&r" (tmp), "=m" (*p)
 	: "r" (p), "r" (clr), "r" (set), "m" (*p)
 	: "cc" );
-#else /* PTE_ATOMIC_UPDATES */
-	unsigned long old = pte_val(*p);
-	*p = __pte((old & ~clr) | set);
-#endif /* !PTE_ATOMIC_UPDATES */
-
-#ifdef CONFIG_44x
-	if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
-		icache_44x_need_flush = 1;
-#endif
+
 	return old;
 }
 #else /* CONFIG_PTE_64BIT */
@@ -196,7 +162,6 @@ static inline unsigned long long pte_update(pte_t *p,
 					    unsigned long clr,
 					    unsigned long set)
 {
-#ifdef PTE_ATOMIC_UPDATES
 	unsigned long long old;
 	unsigned long tmp;
 
@@ -211,15 +176,7 @@ static inline unsigned long long pte_update(pte_t *p,
 	: "=&r" (old), "=&r" (tmp), "=m" (*p)
 	: "r" (p), "r" ((unsigned long)(p) + 4), "r" (clr), "r" (set), "m" (*p)
 	: "cc" );
-#else /* PTE_ATOMIC_UPDATES */
-	unsigned long long old = pte_val(*p);
-	*p = __pte((old & ~(unsigned long long)clr) | set);
-#endif /* !PTE_ATOMIC_UPDATES */
-
-#ifdef CONFIG_44x
-	if ((old & _PAGE_USER) && (old & _PAGE_EXEC))
-		icache_44x_need_flush = 1;
-#endif
+
 	return old;
 }
 #endif /* CONFIG_PTE_64BIT */
@@ -233,12 +190,10 @@ static inline int __ptep_test_and_clear_young(unsigned int context, unsigned lon
 {
 	unsigned long old;
 	old = pte_update(ptep, _PAGE_ACCESSED, 0);
-#if _PAGE_HASHPTE != 0
 	if (old & _PAGE_HASHPTE) {
 		unsigned long ptephys = __pa(ptep) & PAGE_MASK;
 		flush_hash_pages(context, addr, ptephys, 1);
 	}
-#endif
 	return (old & _PAGE_ACCESSED) != 0;
 }
 #define ptep_test_and_clear_young(__vma, __addr, __ptep) \
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 2741ac6fbd3d..cf4824395027 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -8,7 +8,6 @@
 #include <asm/book3s/64/hash.h>
 #include <asm/barrier.h>
 
-#define FIRST_USER_ADDRESS	0UL
 
 /*
  * Size of EA range mapped by our pagetables.
@@ -25,27 +24,16 @@
 /*
  * Define the address range of the kernel non-linear virtual area
  */
-
-#ifdef CONFIG_PPC_BOOK3E
-#define KERN_VIRT_START ASM_CONST(0x8000000000000000)
-#else
 #define KERN_VIRT_START ASM_CONST(0xD000000000000000)
-#endif
 #define KERN_VIRT_SIZE	ASM_CONST(0x0000100000000000)
-
 /*
  * The vmalloc space starts at the beginning of that region, and
  * occupies half of it on hash CPUs and a quarter of it on Book3E
  * (we keep a quarter for the virtual memmap)
  */
 #define VMALLOC_START	KERN_VIRT_START
-#ifdef CONFIG_PPC_BOOK3E
-#define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 2)
-#else
 #define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 1)
-#endif
 #define VMALLOC_END	(VMALLOC_START + VMALLOC_SIZE)
-
 /*
  * The second half of the kernel virtual space is used for IO mappings,
  * it's itself carved into the PIO region (ISA and PHB IO space) and
@@ -64,7 +52,6 @@
 #define IOREMAP_BASE	(PHB_IO_END)
 #define IOREMAP_END	(KERN_VIRT_START + KERN_VIRT_SIZE)
 
-
 /*
  * Region IDs
  */
@@ -79,32 +66,41 @@
 
 /*
  * Defines the address of the vmemap area, in its own region on
- * hash table CPUs and after the vmalloc space on Book3E
+ * hash table CPUs.
  */
-#ifdef CONFIG_PPC_BOOK3E
-#define VMEMMAP_BASE		VMALLOC_END
-#define VMEMMAP_END		KERN_IO_START
-#else
 #define VMEMMAP_BASE		(VMEMMAP_REGION_ID << REGION_SHIFT)
-#endif
 #define vmemmap			((struct page *)VMEMMAP_BASE)
 
 
-/*
- * Include the PTE bits definitions
- */
-#ifdef CONFIG_PPC_BOOK3S
-#include <asm/book3s/64/hash.h>
-#else
-#include <asm/pte-book3e.h>
-#endif
-#include <asm/pte-common.h>
-
 #ifdef CONFIG_PPC_MM_SLICES
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#else
+#error "Hash Config needs mm slice enabled"
 #endif /* CONFIG_PPC_MM_SLICES */
 
+/*
+ * THP pages can't be special. So use the _PAGE_SPECIAL
+ */
+#define _PAGE_SPLITTING _PAGE_SPECIAL
+
+/*
+ * We need to differentiate between explicit huge page and THP huge
+ * page, since THP huge page also need to track real subpage details
+ */
+#define _PAGE_THP_HUGE  _PAGE_4K_PFN
+
+/*
+ * set of bits not changed in pmd_modify.
+ */
+#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
+			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
+			 _PAGE_THP_HUGE)
+/*
+ * Default defines for things which we don't use.
+ * We should get this removed.
+ */
+#include <asm/pte-common.h>
 #ifndef __ASSEMBLY__
 
 /*
@@ -144,7 +140,7 @@
 #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
 #define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
 
-#define pmd_set(pmdp, pmdval) 	(pmd_val(*(pmdp)) = (pmdval))
+#define pmd_set(pmdp, pmdval)	(pmd_val(*(pmdp)) = (pmdval))
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
 				 || (pmd_val(pmd) & PMD_BAD_BITS))
@@ -206,7 +202,6 @@ static inline unsigned long pte_update(struct mm_struct *mm,
 				       unsigned long set,
 				       int huge)
 {
-#ifdef PTE_ATOMIC_UPDATES
 	unsigned long old, tmp;
 
 	__asm__ __volatile__(
@@ -220,18 +215,12 @@ static inline unsigned long pte_update(struct mm_struct *mm,
 	: "=&r" (old), "=&r" (tmp), "=m" (*ptep)
 	: "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
 	: "cc" );
-#else
-	unsigned long old = pte_val(*ptep);
-	*ptep = __pte((old & ~clr) | set);
-#endif
 	/* huge pages use the old page table lock */
 	if (!huge)
 		assert_pte_locked(mm, addr);
 
-#ifdef CONFIG_PPC_STD_MMU_64
 	if (old & _PAGE_HASHPTE)
 		hpte_need_flush(mm, addr, ptep, old, huge);
-#endif
 
 	return old;
 }
@@ -313,7 +302,6 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 	unsigned long bits = pte_val(entry) &
 		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
 
-#ifdef PTE_ATOMIC_UPDATES
 	unsigned long old, tmp;
 
 	__asm__ __volatile__(
@@ -326,10 +314,6 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 	:"=&r" (old), "=&r" (tmp), "=m" (*ptep)
 	:"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
 	:"cc");
-#else
-	unsigned long old = pte_val(*ptep);
-	*ptep = __pte(old | bits);
-#endif
 }
 
 #define __HAVE_ARCH_PTE_SAME
@@ -367,28 +351,8 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
-#endif /* __ASSEMBLY__ */
 
 /*
- * THP pages can't be special. So use the _PAGE_SPECIAL
- */
-#define _PAGE_SPLITTING _PAGE_SPECIAL
-
-/*
- * We need to differentiate between explicit huge page and THP huge
- * page, since THP huge page also need to track real subpage details
- */
-#define _PAGE_THP_HUGE  _PAGE_4K_PFN
-
-/*
- * set of bits not changed in pmd_modify.
- */
-#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
-			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
-
-#ifndef __ASSEMBLY__
-/*
  * The linux hugepage PMD now include the pmd entries followed by the address
  * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
  * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index a8d8e5152bd4..3818cc7bc9b7 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -7,4 +7,5 @@
 #include <asm/book3s/32/pgtable.h>
 #endif
 
+#define FIRST_USER_ADDRESS	0UL
 #endif
-- 
2.5.0


* [PATCH V4 07/31] powerpc/mm: Don't have generic headers introduce functions touching pte bits
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (5 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 06/31] powerpc/mm: Delete booke bits from book3s Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 08/31] powerpc/mm: Drop pte-common.h from BOOK3S 64 Aneesh Kumar K.V
                   ` (25 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

We are going to drop pte-common.h in a later patch. The idea is to allow the
hash code to not have to define all the PTE bits. Having the PTE bits defined
in pte-common.h made the code unnecessarily complex.
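
A simplified sketch of the direction (hypothetical accessor, not the actual
diff): each MMU family's header provides its own pte-bit accessors from the
bits it defines, so a generic header does not need to know about every
flavour's PTE bits.

/* book3s hash flavour, illustrative only */
static inline pte_t example_hash_pte_mkdirty(pte_t pte)
{
	return __pte(pte_val(pte) | _PAGE_DIRTY);
}

/*
 * A book3e/nohash header would define the same accessor from its own bit
 * layout, instead of both relying on pte-common.h.
 */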

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/pgtable.h          | 176 +++++++++++++++++++
 .../include/asm/{pgtable.h => pgtable-book3e.h}    |  96 +----------
 arch/powerpc/include/asm/pgtable.h                 | 192 +--------------------
 3 files changed, 185 insertions(+), 279 deletions(-)
 copy arch/powerpc/include/asm/{pgtable.h => pgtable-book3e.h} (70%)

diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index 3818cc7bc9b7..fa270cfcf30a 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -8,4 +8,180 @@
 #endif
 
 #define FIRST_USER_ADDRESS	0UL
+#ifndef __ASSEMBLY__
+
+/* Generic accessors to PTE bits */
+static inline int pte_write(pte_t pte)
+{
+	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO;
+}
+static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
+static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
+static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
+static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
+static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
+
+#ifdef CONFIG_NUMA_BALANCING
+/*
+ * These work without NUMA balancing but the kernel does not care. See the
+ * comment in include/asm-generic/pgtable.h . On powerpc, this will only
+ * work for user pages and always return true for kernel pages.
+ */
+static inline int pte_protnone(pte_t pte)
+{
+	return (pte_val(pte) &
+		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
+}
+
+static inline int pmd_protnone(pmd_t pmd)
+{
+	return pte_protnone(pmd_pte(pmd));
+}
+#endif /* CONFIG_NUMA_BALANCING */
+
+static inline int pte_present(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_PRESENT;
+}
+
+/* Conversion functions: convert a page and protection to a page entry,
+ * and a page entry and page directory to the page they refer to.
+ *
+ * Even if PTEs can be unsigned long long, a PFN is always an unsigned
+ * long for now.
+ */
+static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
+	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
+		     pgprot_val(pgprot)); }
+static inline unsigned long pte_pfn(pte_t pte)	{
+	return pte_val(pte) >> PTE_RPN_SHIFT; }
+
+/* Generic modifiers for PTE bits */
+static inline pte_t pte_wrprotect(pte_t pte) {
+	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
+	pte_val(pte) |= _PAGE_RO; return pte; }
+static inline pte_t pte_mkclean(pte_t pte) {
+	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
+static inline pte_t pte_mkold(pte_t pte) {
+	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
+static inline pte_t pte_mkwrite(pte_t pte) {
+	pte_val(pte) &= ~_PAGE_RO;
+	pte_val(pte) |= _PAGE_RW; return pte; }
+static inline pte_t pte_mkdirty(pte_t pte) {
+	pte_val(pte) |= _PAGE_DIRTY; return pte; }
+static inline pte_t pte_mkyoung(pte_t pte) {
+	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
+static inline pte_t pte_mkspecial(pte_t pte) {
+	pte_val(pte) |= _PAGE_SPECIAL; return pte; }
+static inline pte_t pte_mkhuge(pte_t pte) {
+	return pte; }
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
+	return pte;
+}
+
+
+/* Insert a PTE, top-level function is out of line. It uses an inline
+ * low level function in the respective pgtable-* files
+ */
+extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+		       pte_t pte);
+
+/* This low level function performs the actual PTE insertion
+ * Setting the PTE depends on the MMU type and other factors. It's
+ * an horrible mess that I'm not going to try to clean up now but
+ * I'm keeping it in one place rather than spread around
+ */
+static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte, int percpu)
+{
+#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
+	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
+	 * helper pte_update() which does an atomic update. We need to do that
+	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
+	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
+	 * the hash bits instead (ie, same as the non-SMP case)
+	 */
+	if (percpu)
+		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+			      | (pte_val(pte) & ~_PAGE_HASHPTE));
+	else
+		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
+
+#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
+	/* Second case is 32-bit with 64-bit PTE.  In this case, we
+	 * can just store as long as we do the two halves in the right order
+	 * with a barrier in between. This is possible because we take care,
+	 * in the hash code, to pre-invalidate if the PTE was already hashed,
+	 * which synchronizes us with any concurrent invalidation.
+	 * In the percpu case, we also fallback to the simple update preserving
+	 * the hash bits
+	 */
+	if (percpu) {
+		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+			      | (pte_val(pte) & ~_PAGE_HASHPTE));
+		return;
+	}
+	if (pte_val(*ptep) & _PAGE_HASHPTE)
+		flush_hash_entry(mm, ptep, addr);
+	__asm__ __volatile__("\
+		stw%U0%X0 %2,%0\n\
+		eieio\n\
+		stw%U0%X0 %L2,%1"
+	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
+	: "r" (pte) : "memory");
+
+#elif defined(CONFIG_PPC_STD_MMU_32)
+	/* Third case is 32-bit hash table in UP mode, we need to preserve
+	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
+	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
+	 * and see we need to keep track that this PTE needs invalidating
+	 */
+	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+		      | (pte_val(pte) & ~_PAGE_HASHPTE));
+
+#else
+	/* Anything else just stores the PTE normally. That covers all 64-bit
+	 * cases, and 32-bit non-hash with 32-bit PTEs.
+	 */
+	*ptep = pte;
+#endif
+}
+
+
+#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
+extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
+				 pte_t *ptep, pte_t entry, int dirty);
+
+/*
+ * Macro to mark a page protection value as "uncacheable".
+ */
+
+#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU)
+
+#define pgprot_noncached(prot)	  (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
+				            _PAGE_NO_CACHE | _PAGE_GUARDED))
+
+#define pgprot_noncached_wc(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
+				            _PAGE_NO_CACHE))
+
+#define pgprot_cached(prot)       (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
+				            _PAGE_COHERENT))
+
+#define pgprot_cached_wthru(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
+				            _PAGE_COHERENT | _PAGE_WRITETHRU))
+
+#define pgprot_cached_noncoherent(prot) \
+		(__pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL))
+
+#define pgprot_writecombine pgprot_noncached_wc
+
+struct file;
+extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
+				     unsigned long size, pgprot_t vma_prot);
+#define __HAVE_PHYS_MEM_ACCESS_PROT
+
+#endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable-book3e.h
similarity index 70%
copy from arch/powerpc/include/asm/pgtable.h
copy to arch/powerpc/include/asm/pgtable-book3e.h
index c304d0767919..a3221cff2e31 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable-book3e.h
@@ -1,41 +1,19 @@
-#ifndef _ASM_POWERPC_PGTABLE_H
-#define _ASM_POWERPC_PGTABLE_H
-#ifdef __KERNEL__
+#ifndef _ASM_POWERPC_PGTABLE_BOOK3E_H
+#define _ASM_POWERPC_PGTABLE_BOOK3E_H
 
-#ifndef __ASSEMBLY__
-#include <linux/mmdebug.h>
-#include <linux/mmzone.h>
-#include <asm/processor.h>		/* For TASK_SIZE */
-#include <asm/mmu.h>
-#include <asm/page.h>
-
-struct mm_struct;
-
-#endif /* !__ASSEMBLY__ */
-
-#ifdef CONFIG_PPC_BOOK3S
-#include <asm/book3s/pgtable.h>
-#else
 #if defined(CONFIG_PPC64)
-#  include <asm/pgtable-ppc64.h>
+#include <asm/pgtable-ppc64.h>
 #else
-#  include <asm/pgtable-ppc32.h>
+#include <asm/pgtable-ppc32.h>
 #endif
-#endif /* !CONFIG_PPC_BOOK3S */
-
-/*
- * We save the slot number & secondary bit in the second half of the
- * PTE page. We use the 8 bytes per each pte entry.
- */
-#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
 
 #ifndef __ASSEMBLY__
 
-#include <asm/tlbflush.h>
-
 /* Generic accessors to PTE bits */
 static inline int pte_write(pte_t pte)
-{	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO; }
+{
+	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO;
+}
 static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
 static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
 static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
@@ -77,10 +55,6 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
 static inline unsigned long pte_pfn(pte_t pte)	{
 	return pte_val(pte) >> PTE_RPN_SHIFT; }
 
-/* Keep these as a macros to avoid include dependency mess */
-#define pte_page(x)		pfn_to_page(pte_pfn(x))
-#define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
-
 /* Generic modifiers for PTE bits */
 static inline pte_t pte_wrprotect(pte_t pte) {
 	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
@@ -221,59 +195,5 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 				     unsigned long size, pgprot_t vma_prot);
 #define __HAVE_PHYS_MEM_ACCESS_PROT
 
-/*
- * ZERO_PAGE is a global shared page that is always zero: used
- * for zero-mapped memory areas etc..
- */
-extern unsigned long empty_zero_page[];
-#define ZERO_PAGE(vaddr) (virt_to_page(empty_zero_page))
-
-extern pgd_t swapper_pg_dir[];
-
-void limit_zone_pfn(enum zone_type zone, unsigned long max_pfn);
-int dma_pfn_limit_to_zone(u64 pfn_limit);
-extern void paging_init(void);
-
-/*
- * kern_addr_valid is intended to indicate whether an address is a valid
- * kernel address.  Most 32-bit archs define it as always true (like this)
- * but most 64-bit archs actually perform a test.  What should we do here?
- */
-#define kern_addr_valid(addr)	(1)
-
-#include <asm-generic/pgtable.h>
-
-
-/*
- * This gets called at the end of handling a page fault, when
- * the kernel has put a new PTE into the page table for the process.
- * We use it to ensure coherency between the i-cache and d-cache
- * for the page which has just been mapped in.
- * On machines which use an MMU hash table, we use this to put a
- * corresponding HPTE into the hash table ahead of time, instead of
- * waiting for the inevitable extra hash-table miss exception.
- */
-extern void update_mmu_cache(struct vm_area_struct *, unsigned long, pte_t *);
-
-extern int gup_hugepte(pte_t *ptep, unsigned long sz, unsigned long addr,
-		       unsigned long end, int write,
-		       struct page **pages, int *nr);
-#ifndef CONFIG_TRANSPARENT_HUGEPAGE
-#define pmd_large(pmd)		0
-#define has_transparent_hugepage() 0
-#endif
-pte_t *__find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
-				   bool *is_thp, unsigned *shift);
-static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
-					       bool *is_thp, unsigned *shift)
-{
-	if (!arch_irqs_disabled()) {
-		pr_info("%s called with irq enabled\n", __func__);
-		dump_stack();
-	}
-	return __find_linux_pte_or_hugepte(pgdir, ea, is_thp, shift);
-}
 #endif /* __ASSEMBLY__ */
-
-#endif /* __KERNEL__ */
-#endif /* _ASM_POWERPC_PGTABLE_H */
+#endif
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index c304d0767919..a27b8cef51d7 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -1,6 +1,5 @@
 #ifndef _ASM_POWERPC_PGTABLE_H
 #define _ASM_POWERPC_PGTABLE_H
-#ifdef __KERNEL__
 
 #ifndef __ASSEMBLY__
 #include <linux/mmdebug.h>
@@ -16,11 +15,7 @@ struct mm_struct;
 #ifdef CONFIG_PPC_BOOK3S
 #include <asm/book3s/pgtable.h>
 #else
-#if defined(CONFIG_PPC64)
-#  include <asm/pgtable-ppc64.h>
-#else
-#  include <asm/pgtable-ppc32.h>
-#endif
+#include <asm/pgtable-book3e.h>
 #endif /* !CONFIG_PPC_BOOK3S */
 
 /*
@@ -33,194 +28,10 @@ struct mm_struct;
 
 #include <asm/tlbflush.h>
 
-/* Generic accessors to PTE bits */
-static inline int pte_write(pte_t pte)
-{	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO; }
-static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
-static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
-static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
-static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
-
-#ifdef CONFIG_NUMA_BALANCING
-/*
- * These work without NUMA balancing but the kernel does not care. See the
- * comment in include/asm-generic/pgtable.h . On powerpc, this will only
- * work for user pages and always return true for kernel pages.
- */
-static inline int pte_protnone(pte_t pte)
-{
-	return (pte_val(pte) &
-		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
-}
-
-static inline int pmd_protnone(pmd_t pmd)
-{
-	return pte_protnone(pmd_pte(pmd));
-}
-#endif /* CONFIG_NUMA_BALANCING */
-
-static inline int pte_present(pte_t pte)
-{
-	return pte_val(pte) & _PAGE_PRESENT;
-}
-
-/* Conversion functions: convert a page and protection to a page entry,
- * and a page entry and page directory to the page they refer to.
- *
- * Even if PTEs can be unsigned long long, a PFN is always an unsigned
- * long for now.
- */
-static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
-	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
-		     pgprot_val(pgprot)); }
-static inline unsigned long pte_pfn(pte_t pte)	{
-	return pte_val(pte) >> PTE_RPN_SHIFT; }
-
 /* Keep these as a macros to avoid include dependency mess */
 #define pte_page(x)		pfn_to_page(pte_pfn(x))
 #define mk_pte(page, pgprot)	pfn_pte(page_to_pfn(page), (pgprot))
 
-/* Generic modifiers for PTE bits */
-static inline pte_t pte_wrprotect(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
-	pte_val(pte) |= _PAGE_RO; return pte; }
-static inline pte_t pte_mkclean(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
-static inline pte_t pte_mkold(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_RO;
-	pte_val(pte) |= _PAGE_RW; return pte; }
-static inline pte_t pte_mkdirty(pte_t pte) {
-	pte_val(pte) |= _PAGE_DIRTY; return pte; }
-static inline pte_t pte_mkyoung(pte_t pte) {
-	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkspecial(pte_t pte) {
-	pte_val(pte) |= _PAGE_SPECIAL; return pte; }
-static inline pte_t pte_mkhuge(pte_t pte) {
-	return pte; }
-static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
-{
-	pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
-	return pte;
-}
-
-
-/* Insert a PTE, top-level function is out of line. It uses an inline
- * low level function in the respective pgtable-* files
- */
-extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
-		       pte_t pte);
-
-/* This low level function performs the actual PTE insertion
- * Setting the PTE depends on the MMU type and other factors. It's
- * an horrible mess that I'm not going to try to clean up now but
- * I'm keeping it in one place rather than spread around
- */
-static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
-				pte_t *ptep, pte_t pte, int percpu)
-{
-#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
-	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
-	 * helper pte_update() which does an atomic update. We need to do that
-	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
-	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
-	 * the hash bits instead (ie, same as the non-SMP case)
-	 */
-	if (percpu)
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-	else
-		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
-
-#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
-	/* Second case is 32-bit with 64-bit PTE.  In this case, we
-	 * can just store as long as we do the two halves in the right order
-	 * with a barrier in between. This is possible because we take care,
-	 * in the hash code, to pre-invalidate if the PTE was already hashed,
-	 * which synchronizes us with any concurrent invalidation.
-	 * In the percpu case, we also fallback to the simple update preserving
-	 * the hash bits
-	 */
-	if (percpu) {
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-		return;
-	}
-#if _PAGE_HASHPTE != 0
-	if (pte_val(*ptep) & _PAGE_HASHPTE)
-		flush_hash_entry(mm, ptep, addr);
-#endif
-	__asm__ __volatile__("\
-		stw%U0%X0 %2,%0\n\
-		eieio\n\
-		stw%U0%X0 %L2,%1"
-	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
-	: "r" (pte) : "memory");
-
-#elif defined(CONFIG_PPC_STD_MMU_32)
-	/* Third case is 32-bit hash table in UP mode, we need to preserve
-	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
-	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
-	 * and see we need to keep track that this PTE needs invalidating
-	 */
-	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-		      | (pte_val(pte) & ~_PAGE_HASHPTE));
-
-#else
-	/* Anything else just stores the PTE normally. That covers all 64-bit
-	 * cases, and 32-bit non-hash with 32-bit PTEs.
-	 */
-	*ptep = pte;
-
-#ifdef CONFIG_PPC_BOOK3E_64
-	/*
-	 * With hardware tablewalk, a sync is needed to ensure that
-	 * subsequent accesses see the PTE we just wrote.  Unlike userspace
-	 * mappings, we can't tolerate spurious faults, so make sure
-	 * the new PTE will be seen the first time.
-	 */
-	if (is_kernel_addr(addr))
-		mb();
-#endif
-#endif
-}
-
-
-#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
-extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
-				 pte_t *ptep, pte_t entry, int dirty);
-
-/*
- * Macro to mark a page protection value as "uncacheable".
- */
-
-#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU)
-
-#define pgprot_noncached(prot)	  (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_NO_CACHE | _PAGE_GUARDED))
-
-#define pgprot_noncached_wc(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_NO_CACHE))
-
-#define pgprot_cached(prot)       (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_COHERENT))
-
-#define pgprot_cached_wthru(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_COHERENT | _PAGE_WRITETHRU))
-
-#define pgprot_cached_noncoherent(prot) \
-		(__pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL))
-
-#define pgprot_writecombine pgprot_noncached_wc
-
-struct file;
-extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
-				     unsigned long size, pgprot_t vma_prot);
-#define __HAVE_PHYS_MEM_ACCESS_PROT
-
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
  * for zero-mapped memory areas etc..
@@ -275,5 +86,4 @@ static inline pte_t *find_linux_pte_or_hugepte(pgd_t *pgdir, unsigned long ea,
 }
 #endif /* __ASSEMBLY__ */
 
-#endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_PGTABLE_H */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 08/31] powerpc/mm: Drop pte-common.h from BOOK3S 64
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (6 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 07/31] powerpc/mm: Don't have generic headers introduce functions touching pte bits Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 09/31] powerpc/mm: Don't use pte_val as lvalue Aneesh Kumar K.V
                   ` (24 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

We copy only the needed PTE bit defines from pte-common.h to the respective
hash-related headers. This should greatly simplify later patches in which we
are going to change the pte format for the hash config.
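
As a rough userspace sketch (not part of the patch) of what the consolidation
buys, the fragment below composes a few of the protection masks from the bit
defines being copied; the bit values marked "assumed" (_PAGE_PRESENT,
_PAGE_ACCESSED, _PAGE_RW, _PAGE_PSIZE) are not visible in the hunks below and
are stand-ins for illustration:

/*
 * Illustrative only: compose a few masks from the copied bit defines.
 * Values marked "assumed" are not taken from the hunks below.
 */
#include <stdio.h>

#define _PAGE_PRESENT   0x0001  /* assumed value */
#define _PAGE_GUARDED   0x0008
#define _PAGE_COHERENT  0x0     /* derived from _PAGE_NO_CACHE on hash64 */
#define _PAGE_NO_CACHE  0x0020
#define _PAGE_WRITETHRU 0x0040
#define _PAGE_DIRTY     0x0080
#define _PAGE_ACCESSED  0x0100  /* assumed value */
#define _PAGE_RW        0x0200  /* assumed value */
#define _PAGE_PSIZE     0x0     /* assumed value */

#define _PAGE_KERNEL_RW (_PAGE_RW | _PAGE_DIRTY)
#define _PAGE_SAO       (_PAGE_WRITETHRU | _PAGE_NO_CACHE | _PAGE_COHERENT)

#define _PAGE_BASE_NC   (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)
#define _PAGE_BASE      (_PAGE_BASE_NC | _PAGE_COHERENT)

#define PAGE_KERNEL     (_PAGE_BASE | _PAGE_KERNEL_RW)
#define PAGE_KERNEL_NCG (_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
                         _PAGE_NO_CACHE | _PAGE_GUARDED)

int main(void)
{
        printf("_PAGE_SAO       = 0x%04x\n", _PAGE_SAO);
        printf("PAGE_KERNEL     = 0x%04x\n", PAGE_KERNEL);
        printf("PAGE_KERNEL_NCG = 0x%04x\n", PAGE_KERNEL_NCG);
        return 0;
}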

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h |   1 +
 arch/powerpc/include/asm/book3s/64/hash.h    |   2 +
 arch/powerpc/include/asm/book3s/64/pgtable.h | 106 ++++++++++++++++++++++++++-
 arch/powerpc/include/asm/book3s/pgtable.h    |  16 ++--
 4 files changed, 113 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index f2c51cd61f69..15518b620f5a 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -62,6 +62,7 @@
 /* shift to put page number into pte */
 #define PTE_RPN_SHIFT	(17)
 
+#define _PAGE_4K_PFN		0
 #ifndef __ASSEMBLY__
 /*
  * 4-level page tables related bits
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 8e60d4fa434d..7deb5063ff8c 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -20,6 +20,7 @@
 #define _PAGE_EXEC		0x0004 /* No execute on POWER4 and newer (we invert) */
 #define _PAGE_GUARDED		0x0008
 /* We can derive Memory coherence from _PAGE_NO_CACHE */
+#define _PAGE_COHERENT		0x0
 #define _PAGE_NO_CACHE		0x0020 /* I: cache inhibit */
 #define _PAGE_WRITETHRU		0x0040 /* W: cache write-through */
 #define _PAGE_DIRTY		0x0080 /* C: page changed */
@@ -30,6 +31,7 @@
 /* No separate kernel read-only */
 #define _PAGE_KERNEL_RW		(_PAGE_RW | _PAGE_DIRTY) /* user access blocked by key */
 #define _PAGE_KERNEL_RO		 _PAGE_KERNEL_RW
+#define _PAGE_KERNEL_RWX	(_PAGE_DIRTY | _PAGE_RW | _PAGE_EXEC)
 
 /* Strong Access Ordering */
 #define _PAGE_SAO		(_PAGE_WRITETHRU | _PAGE_NO_CACHE | _PAGE_COHERENT)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index cf4824395027..c08e47511e26 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -96,11 +96,111 @@
 #define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
 			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
 			 _PAGE_THP_HUGE)
+#define _PTE_NONE_MASK	_PAGE_HPTEFLAGS
 /*
- * Default defines for things which we don't use.
- * We should get this removed.
+ * The mask covered by the RPN must be a ULL on 32-bit platforms with
+ * 64-bit PTEs
+ * FIXME!! double check the RPN_MAX May be not used
  */
-#include <asm/pte-common.h>
+//#define PTE_RPN_MAX	(1UL << (32 - PTE_RPN_SHIFT))
+#define PTE_RPN_MASK	(~((1UL << PTE_RPN_SHIFT) - 1))
+/*
+ * _PAGE_CHG_MASK masks of bits that are to be preserved across
+ * pgprot changes
+ */
+#define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
+			 _PAGE_ACCESSED | _PAGE_SPECIAL)
+/*
+ * Mask of bits returned by pte_pgprot()
+ */
+#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU | _PAGE_4K_PFN | \
+			 _PAGE_USER | _PAGE_ACCESSED |  \
+			 _PAGE_RW |  _PAGE_DIRTY | _PAGE_EXEC)
+/*
+ * We define 2 sets of base prot bits, one for basic pages (ie,
+ * cacheable kernel and user pages) and one for non cacheable
+ * pages. We always set _PAGE_COHERENT when SMP is enabled or
+ * the processor might need it for DMA coherency.
+ */
+#define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)
+#define _PAGE_BASE	(_PAGE_BASE_NC | _PAGE_COHERENT)
+
+/* Permission masks used to generate the __P and __S table,
+ *
+ * Note:__pgprot is defined in arch/powerpc/include/asm/page.h
+ *
+ * Write permissions imply read permissions for now (we could make write-only
+ * pages on BookE but we don't bother for now). Execute permission control is
+ * possible on platforms that define _PAGE_EXEC
+ *
+ * Note due to the way vm flags are laid out, the bits are XWR
+ */
+#define PAGE_NONE	__pgprot(_PAGE_BASE)
+#define PAGE_SHARED	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
+#define PAGE_SHARED_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | \
+				 _PAGE_EXEC)
+#define PAGE_COPY	__pgprot(_PAGE_BASE | _PAGE_USER )
+#define PAGE_COPY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+#define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_USER )
+#define PAGE_READONLY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+
+#define __P000	PAGE_NONE
+#define __P001	PAGE_READONLY
+#define __P010	PAGE_COPY
+#define __P011	PAGE_COPY
+#define __P100	PAGE_READONLY_X
+#define __P101	PAGE_READONLY_X
+#define __P110	PAGE_COPY_X
+#define __P111	PAGE_COPY_X
+
+#define __S000	PAGE_NONE
+#define __S001	PAGE_READONLY
+#define __S010	PAGE_SHARED
+#define __S011	PAGE_SHARED
+#define __S100	PAGE_READONLY_X
+#define __S101	PAGE_READONLY_X
+#define __S110	PAGE_SHARED_X
+#define __S111	PAGE_SHARED_X
+
+/* Permission masks used for kernel mappings */
+#define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
+#define PAGE_KERNEL_NC	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
+				 _PAGE_NO_CACHE)
+#define PAGE_KERNEL_NCG	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
+				 _PAGE_NO_CACHE | _PAGE_GUARDED)
+#define PAGE_KERNEL_X	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
+#define PAGE_KERNEL_RO	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
+#define PAGE_KERNEL_ROX	__pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
+
+/* Protection used for kernel text. We want the debuggers to be able to
+ * set breakpoints anywhere, so don't write protect the kernel text
+ * on platforms where such control is possible.
+ */
+#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) ||\
+	defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
+#define PAGE_KERNEL_TEXT	PAGE_KERNEL_X
+#else
+#define PAGE_KERNEL_TEXT	PAGE_KERNEL_ROX
+#endif
+
+/* Make modules code happy. We don't set RO yet */
+#define PAGE_KERNEL_EXEC	PAGE_KERNEL_X
+
+/*
+ * Don't just check for any non zero bits in __PAGE_USER, since for book3e
+ * and PTE_64BIT, PAGE_KERNEL_X contains _PAGE_BAP_SR which is also in
+ * _PAGE_USER.  Need to explicitly match _PAGE_BAP_UR bit in that case too.
+ */
+#define pte_user(val)		((val & _PAGE_USER) == _PAGE_USER)
+
+/* Advertise special mapping type for AGP */
+#define PAGE_AGP		(PAGE_KERNEL_NC)
+#define HAVE_PAGE_AGP
+
+/* Advertise support for _PAGE_SPECIAL */
+#define __HAVE_ARCH_PTE_SPECIAL
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index fa270cfcf30a..87333618af3b 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -11,10 +11,7 @@
 #ifndef __ASSEMBLY__
 
 /* Generic accessors to PTE bits */
-static inline int pte_write(pte_t pte)
-{
-	return (pte_val(pte) & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO;
-}
+static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
 static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
 static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
 static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
@@ -57,15 +54,16 @@ static inline unsigned long pte_pfn(pte_t pte)	{
 	return pte_val(pte) >> PTE_RPN_SHIFT; }
 
 /* Generic modifiers for PTE bits */
-static inline pte_t pte_wrprotect(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
-	pte_val(pte) |= _PAGE_RO; return pte; }
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+	pte_val(pte) &= ~_PAGE_RW;
+	return pte;
+}
 static inline pte_t pte_mkclean(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
+	pte_val(pte) &= ~_PAGE_DIRTY; return pte; }
 static inline pte_t pte_mkold(pte_t pte) {
 	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
 static inline pte_t pte_mkwrite(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_RO;
 	pte_val(pte) |= _PAGE_RW; return pte; }
 static inline pte_t pte_mkdirty(pte_t pte) {
 	pte_val(pte) |= _PAGE_DIRTY; return pte; }
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 09/31] powerpc/mm: Don't use pte_val as lvalue
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (7 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 08/31] powerpc/mm: Drop pte-common.h from BOOK3S 64 Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 10/31] powerpc/mm: Don't use pmd_val, pud_val and pgd_val " Aneesh Kumar K.V
                   ` (23 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

We also convert a few #defines to static inline functions in this patch for
better type checking.
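
The type-checking point is easiest to see outside the kernel tree. Below is a
minimal userspace sketch (types simplified, not kernel code) contrasting the
macro form of pte_val(), which lets callers assign through it, with the static
inline form, which rejects such assignments at compile time:

#include <stdio.h>

typedef unsigned long pte_basic_t;
typedef struct { pte_basic_t pte; } pte_t;

#define __pte(x)        ((pte_t) { (x) })

/* Old style: #define pte_val(x) ((x).pte) exposes the member as an lvalue. */
/* New style: a static inline returning the value by copy. */
static inline pte_basic_t pte_val(pte_t x)
{
        return x.pte;
}

int main(void)
{
        pte_t pte = __pte(0x8105UL);    /* made-up bit pattern */

        /* Reads work the same with either form. */
        printf("pte value: 0x%lx\n", pte_val(pte));

        /* With the macro form this would silently compile:
         *      pte_val(pte) |= 0x80;
         * With the static inline it is a compile error (the call result
         * is not an lvalue), so callers must construct a new pte_t:
         */
        pte = __pte(pte_val(pte) | 0x80);
        printf("updated:   0x%lx\n", pte_val(pte));

        return 0;
}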

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/pgtable.h | 118 +++++++++++++++++++++---------
 arch/powerpc/include/asm/page.h           |  10 ++-
 arch/powerpc/include/asm/pgtable-book3e.h |  68 ++++++++++++-----
 3 files changed, 139 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index 87333618af3b..ebd6677ea017 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -12,9 +12,9 @@
 
 /* Generic accessors to PTE bits */
 static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
-static inline int pte_dirty(pte_t pte)		{ return pte_val(pte) & _PAGE_DIRTY; }
-static inline int pte_young(pte_t pte)		{ return pte_val(pte) & _PAGE_ACCESSED; }
-static inline int pte_special(pte_t pte)	{ return pte_val(pte) & _PAGE_SPECIAL; }
+static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
+static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSED); }
+static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
 static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
 static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
 
@@ -47,36 +47,61 @@ static inline int pte_present(pte_t pte)
  * Even if PTEs can be unsigned long long, a PFN is always an unsigned
  * long for now.
  */
-static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
+static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
+{
 	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
-		     pgprot_val(pgprot)); }
-static inline unsigned long pte_pfn(pte_t pte)	{
-	return pte_val(pte) >> PTE_RPN_SHIFT; }
+		     pgprot_val(pgprot));
+}
+
+static inline unsigned long pte_pfn(pte_t pte)
+{
+	return pte_val(pte) >> PTE_RPN_SHIFT;
+}
 
 /* Generic modifiers for PTE bits */
 static inline pte_t pte_wrprotect(pte_t pte)
 {
-	pte_val(pte) &= ~_PAGE_RW;
+	return __pte(pte_val(pte) & ~_PAGE_RW);
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_RW);
+}
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SPECIAL);
+}
+
+static inline pte_t pte_mkhuge(pte_t pte)
+{
 	return pte;
 }
-static inline pte_t pte_mkclean(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_DIRTY; return pte; }
-static inline pte_t pte_mkold(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) {
-	pte_val(pte) |= _PAGE_RW; return pte; }
-static inline pte_t pte_mkdirty(pte_t pte) {
-	pte_val(pte) |= _PAGE_DIRTY; return pte; }
-static inline pte_t pte_mkyoung(pte_t pte) {
-	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkspecial(pte_t pte) {
-	pte_val(pte) |= _PAGE_SPECIAL; return pte; }
-static inline pte_t pte_mkhuge(pte_t pte) {
-	return pte; }
+
 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 {
-	pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
-	return pte;
+	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
 }
 
 
@@ -159,22 +184,45 @@ extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long addre
 #define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
 			 _PAGE_WRITETHRU)
 
-#define pgprot_noncached(prot)	  (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_NO_CACHE | _PAGE_GUARDED))
+#define pgprot_noncached pgprot_noncached
+static inline pgprot_t pgprot_noncached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE | _PAGE_GUARDED);
+}
 
-#define pgprot_noncached_wc(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_NO_CACHE))
+#define pgprot_noncached_wc pgprot_noncached_wc
+static inline pgprot_t pgprot_noncached_wc(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE);
+}
 
-#define pgprot_cached(prot)       (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_COHERENT))
+#define pgprot_cached pgprot_cached
+static inline pgprot_t pgprot_cached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT);
+}
 
-#define pgprot_cached_wthru(prot) (__pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) | \
-				            _PAGE_COHERENT | _PAGE_WRITETHRU))
+#define pgprot_cached_wthru pgprot_cached_wthru
+static inline pgprot_t pgprot_cached_wthru(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT | _PAGE_WRITETHRU);
+}
 
-#define pgprot_cached_noncoherent(prot) \
-		(__pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL))
+#define pgprot_cached_noncoherent pgprot_cached_noncoherent
+static inline pgprot_t pgprot_cached_noncoherent(pgprot_t prot)
+{
+	return __pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL);
+}
 
-#define pgprot_writecombine pgprot_noncached_wc
+#define pgprot_writecombine pgprot_writecombine
+static inline pgprot_t pgprot_writecombine(pgprot_t prot)
+{
+	return pgprot_noncached_wc(prot);
+}
 
 struct file;
 extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 96534b4e5a64..0e9d1d32053f 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -285,8 +285,11 @@ extern long long virt_phys_offset;
 
 /* PTE level */
 typedef struct { pte_basic_t pte; } pte_t;
-#define pte_val(x)	((x).pte)
 #define __pte(x)	((pte_t) { (x) })
+static inline pte_basic_t pte_val(pte_t x)
+{
+	return x.pte;
+}
 
 /* 64k pages additionally define a bigger "real PTE" type that gathers
  * the "second half" part of the PTE for pseudo 64k pages
@@ -328,8 +331,11 @@ typedef struct { unsigned long pgprot; } pgprot_t;
  */
 
 typedef pte_basic_t pte_t;
-#define pte_val(x)	(x)
 #define __pte(x)	(x)
+static inline pte_basic_t pte_val(pte_t pte)
+{
+	return pte;
+}
 
 #if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC_STD_MMU_64)
 typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
diff --git a/arch/powerpc/include/asm/pgtable-book3e.h b/arch/powerpc/include/asm/pgtable-book3e.h
index a3221cff2e31..91325997ba25 100644
--- a/arch/powerpc/include/asm/pgtable-book3e.h
+++ b/arch/powerpc/include/asm/pgtable-book3e.h
@@ -56,30 +56,58 @@ static inline unsigned long pte_pfn(pte_t pte)	{
 	return pte_val(pte) >> PTE_RPN_SHIFT; }
 
 /* Generic modifiers for PTE bits */
-static inline pte_t pte_wrprotect(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE);
-	pte_val(pte) |= _PAGE_RO; return pte; }
-static inline pte_t pte_mkclean(pte_t pte) {
-	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
-static inline pte_t pte_mkold(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkwrite(pte_t pte) {
-	pte_val(pte) &= ~_PAGE_RO;
-	pte_val(pte) |= _PAGE_RW; return pte; }
-static inline pte_t pte_mkdirty(pte_t pte) {
-	pte_val(pte) |= _PAGE_DIRTY; return pte; }
-static inline pte_t pte_mkyoung(pte_t pte) {
-	pte_val(pte) |= _PAGE_ACCESSED; return pte; }
-static inline pte_t pte_mkspecial(pte_t pte) {
-	pte_val(pte) |= _PAGE_SPECIAL; return pte; }
-static inline pte_t pte_mkhuge(pte_t pte) {
-	return pte; }
-static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+	pte_basic_t ptev;
+
+	ptev = pte_val(pte) & ~(_PAGE_RW | _PAGE_HWWRITE);
+	ptev |= _PAGE_RO;
+	return __pte(ptev);
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~(_PAGE_DIRTY | _PAGE_HWWRITE));
+}
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	pte_basic_t ptev;
+
+	ptev = pte_val(pte) & ~_PAGE_RO;
+	ptev |= _PAGE_RW;
+	return __pte(ptev);
+}
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SPECIAL);
+}
+
+static inline pte_t pte_mkhuge(pte_t pte)
 {
-	pte_val(pte) = (pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot);
 	return pte;
 }
 
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
+}
 
 /* Insert a PTE, top-level function is out of line. It uses an inline
  * low level function in the respective pgtable-* files
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 10/31] powerpc/mm: Don't use pmd_val, pud_val and pgd_val as lvalue
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (8 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 09/31] powerpc/mm: Don't use pte_val as lvalue Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 11/31] powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h Aneesh Kumar K.V
                   ` (22 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

We convert them to static inline functions here, as we did with pte_val in
the previous patch.
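
A minimal userspace sketch of the caller-side pattern (types simplified,
values made up): with pmd_val() as a read-only accessor, entries are updated
by building a new pmd_t with __pmd() and storing it, as the 40x_mmu.c and
pgtable_64.c hunks below do:

#include <stdio.h>

typedef struct { unsigned long pmd; } pmd_t;

#define __pmd(x)        ((pmd_t) { (x) })

static inline unsigned long pmd_val(pmd_t x)
{
        return x.pmd;
}

static inline void pmd_clear(pmd_t *pmdp)
{
        *pmdp = __pmd(0);
}

int main(void)
{
        pmd_t entry = __pmd(0);
        pmd_t *pmdp = &entry;
        /* made-up value: pretend frame address plus a "present" bit */
        unsigned long val = 0x10000UL | 0x1UL;

        /* Old style, no longer compiles: pmd_val(*pmdp) = val; */
        *pmdp = __pmd(val);
        printf("pmd:             0x%lx\n", pmd_val(*pmdp));

        pmd_clear(pmdp);
        printf("pmd after clear: 0x%lx\n", pmd_val(*pmdp));

        return 0;
}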

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h |  6 ++++-
 arch/powerpc/include/asm/book3s/64/hash-4k.h |  6 ++++-
 arch/powerpc/include/asm/book3s/64/pgtable.h | 36 +++++++++++++++++++++-------
 arch/powerpc/include/asm/page.h              | 34 +++++++++++++++++++-------
 arch/powerpc/include/asm/pgalloc-32.h        | 34 +++++++++++++++++++-------
 arch/powerpc/include/asm/pgalloc-64.h        | 17 +++++++++----
 arch/powerpc/include/asm/pgtable-ppc32.h     |  7 +++++-
 arch/powerpc/include/asm/pgtable-ppc64-4k.h  |  6 ++++-
 arch/powerpc/include/asm/pgtable-ppc64.h     | 36 +++++++++++++++++++++-------
 arch/powerpc/mm/40x_mmu.c                    | 10 ++++----
 arch/powerpc/mm/pgtable_64.c                 | 19 +++++++--------
 11 files changed, 154 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 2afe5958c837..9e47515b2e01 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -105,7 +105,11 @@ extern unsigned long ioremap_bot;
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(pmd_val(pmd) & _PMD_BAD)
 #define	pmd_present(pmd)	(pmd_val(pmd) & _PMD_PRESENT_MASK)
-#define	pmd_clear(pmdp)		do { pmd_val(*(pmdp)) = 0; } while (0)
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	*pmdp = __pmd(0);
+}
+
 
 /*
  * When flushing the tlb entry for a page, we also need to flush the hash
diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 15518b620f5a..537eacecf6e9 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -71,9 +71,13 @@
 #define pgd_none(pgd)		(!pgd_val(pgd))
 #define pgd_bad(pgd)		(pgd_val(pgd) == 0)
 #define pgd_present(pgd)	(pgd_val(pgd) != 0)
-#define pgd_clear(pgdp)		(pgd_val(*(pgdp)) = 0)
 #define pgd_page_vaddr(pgd)	(pgd_val(pgd) & ~PGD_MASKED_BITS)
 
+static inline void pgd_clear(pgd_t *pgdp)
+{
+	*pgdp = __pgd(0);
+}
+
 static inline pte_t pgd_pte(pgd_t pgd)
 {
 	return __pte(pgd_val(pgd));
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c08e47511e26..19ea3852514c 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -240,21 +240,38 @@
 #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
 #define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
 
-#define pmd_set(pmdp, pmdval)	(pmd_val(*(pmdp)) = (pmdval))
+static inline void pmd_set(pmd_t *pmdp, unsigned long val)
+{
+	*pmdp = __pmd(val);
+}
+
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	*pmdp = __pmd(0);
+}
+
+
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
 				 || (pmd_val(pmd) & PMD_BAD_BITS))
 #define	pmd_present(pmd)	(!pmd_none(pmd))
-#define	pmd_clear(pmdp)		(pmd_val(*(pmdp)) = 0)
 #define pmd_page_vaddr(pmd)	(pmd_val(pmd) & ~PMD_MASKED_BITS)
 extern struct page *pmd_page(pmd_t pmd);
 
-#define pud_set(pudp, pudval)	(pud_val(*(pudp)) = (pudval))
+static inline void pud_set(pud_t *pudp, unsigned long val)
+{
+	*pudp = __pud(val);
+}
+
+static inline void pud_clear(pud_t *pudp)
+{
+	*pudp = __pud(0);
+}
+
 #define pud_none(pud)		(!pud_val(pud))
 #define	pud_bad(pud)		(!is_kernel_addr(pud_val(pud)) \
 				 || (pud_val(pud) & PUD_BAD_BITS))
 #define pud_present(pud)	(pud_val(pud) != 0)
-#define pud_clear(pudp)		(pud_val(*(pudp)) = 0)
 #define pud_page_vaddr(pud)	(pud_val(pud) & ~PUD_MASKED_BITS)
 
 extern struct page *pud_page(pud_t pud);
@@ -269,8 +286,11 @@ static inline pud_t pte_pud(pte_t pte)
 	return __pud(pte_val(pte));
 }
 #define pud_write(pud)		pte_write(pud_pte(pud))
-#define pgd_set(pgdp, pudp)	({pgd_val(*(pgdp)) = (unsigned long)(pudp);})
 #define pgd_write(pgd)		pte_write(pgd_pte(pgd))
+static inline void pgd_set(pgd_t *pgdp, unsigned long val)
+{
+	*pgdp = __pgd(val);
+}
 
 /*
  * Find an entry in a page-table-directory.  We combine the address region
@@ -592,14 +612,12 @@ static inline pmd_t pmd_mkhuge(pmd_t pmd)
 
 static inline pmd_t pmd_mknotpresent(pmd_t pmd)
 {
-	pmd_val(pmd) &= ~_PAGE_PRESENT;
-	return pmd;
+	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
 }
 
 static inline pmd_t pmd_mksplitting(pmd_t pmd)
 {
-	pmd_val(pmd) |= _PAGE_SPLITTING;
-	return pmd;
+	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
 }
 
 #define __HAVE_ARCH_PMD_SAME
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 0e9d1d32053f..9d2f38e1b21d 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -303,21 +303,30 @@ typedef struct { pte_t pte; } real_pte_t;
 /* PMD level */
 #ifdef CONFIG_PPC64
 typedef struct { unsigned long pmd; } pmd_t;
-#define pmd_val(x)	((x).pmd)
 #define __pmd(x)	((pmd_t) { (x) })
+static inline unsigned long pmd_val(pmd_t x)
+{
+	return x.pmd;
+}
 
 /* PUD level exusts only on 4k pages */
 #ifndef CONFIG_PPC_64K_PAGES
 typedef struct { unsigned long pud; } pud_t;
-#define pud_val(x)	((x).pud)
 #define __pud(x)	((pud_t) { (x) })
+static inline unsigned long pud_val(pud_t x)
+{
+	return x.pud;
+}
 #endif /* !CONFIG_PPC_64K_PAGES */
 #endif /* CONFIG_PPC64 */
 
 /* PGD level */
 typedef struct { unsigned long pgd; } pgd_t;
-#define pgd_val(x)	((x).pgd)
 #define __pgd(x)	((pgd_t) { (x) })
+static inline unsigned long pgd_val(pgd_t x)
+{
+	return x.pgd;
+}
 
 /* Page protection bits */
 typedef struct { unsigned long pgprot; } pgprot_t;
@@ -346,22 +355,31 @@ typedef pte_t real_pte_t;
 
 #ifdef CONFIG_PPC64
 typedef unsigned long pmd_t;
-#define pmd_val(x)	(x)
 #define __pmd(x)	(x)
+static inline unsigned long pmd_val(pmd_t pmd)
+{
+	return pmd;
+}
 
 #ifndef CONFIG_PPC_64K_PAGES
 typedef unsigned long pud_t;
-#define pud_val(x)	(x)
 #define __pud(x)	(x)
+static inline unsigned long pud_val(pud_t pud)
+{
+	return pud;
+}
 #endif /* !CONFIG_PPC_64K_PAGES */
 #endif /* CONFIG_PPC64 */
 
 typedef unsigned long pgd_t;
-#define pgd_val(x)	(x)
-#define pgprot_val(x)	(x)
+#define __pgd(x)	(x)
+static inline unsigned long pgd_val(pgd_t pgd)
+{
+	return pgd;
+}
 
 typedef unsigned long pgprot_t;
-#define __pgd(x)	(x)
+#define pgprot_val(x)	(x)
 #define __pgprot(x)	(x)
 
 #endif
diff --git a/arch/powerpc/include/asm/pgalloc-32.h b/arch/powerpc/include/asm/pgalloc-32.h
index 842846c1b711..76d6b9e0c8a9 100644
--- a/arch/powerpc/include/asm/pgalloc-32.h
+++ b/arch/powerpc/include/asm/pgalloc-32.h
@@ -21,16 +21,34 @@ extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
 /* #define pgd_populate(mm, pmd, pte)      BUG() */
 
 #ifndef CONFIG_BOOKE
-#define pmd_populate_kernel(mm, pmd, pte)	\
-		(pmd_val(*(pmd)) = __pa(pte) | _PMD_PRESENT)
-#define pmd_populate(mm, pmd, pte)	\
-		(pmd_val(*(pmd)) = (page_to_pfn(pte) << PAGE_SHIFT) | _PMD_PRESENT)
+
+static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp,
+				       pte_t *pte)
+{
+	*pmdp = __pmd(__pa(pte) | _PMD_PRESENT);
+}
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
+				pgtable_t pte_page)
+{
+	*pmdp = __pmd((page_to_pfn(pte_page) << PAGE_SHIFT) | _PMD_PRESENT);
+}
+
 #define pmd_pgtable(pmd) pmd_page(pmd)
 #else
-#define pmd_populate_kernel(mm, pmd, pte)	\
-		(pmd_val(*(pmd)) = (unsigned long)pte | _PMD_PRESENT)
-#define pmd_populate(mm, pmd, pte)	\
-		(pmd_val(*(pmd)) = (unsigned long)lowmem_page_address(pte) | _PMD_PRESENT)
+
+static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp,
+				       pte_t *pte)
+{
+	*pmdp = __pmd((unsigned long)pte | _PMD_PRESENT);
+}
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
+				pgtable_t pte_page)
+{
+	*pmdp = __pmd((unsigned long)lowmem_page_address(pte_page) | _PMD_PRESENT);
+}
+
 #define pmd_pgtable(pmd) pmd_page(pmd)
 #endif
 
diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index 4b0be20fcbfd..d8cde71f6734 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -53,7 +53,7 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 
 #ifndef CONFIG_PPC_64K_PAGES
 
-#define pgd_populate(MM, PGD, PUD)	pgd_set(PGD, PUD)
+#define pgd_populate(MM, PGD, PUD)	pgd_set(PGD, (unsigned long)PUD)
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
@@ -71,9 +71,18 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 	pud_set(pud, (unsigned long)pmd);
 }
 
-#define pmd_populate(mm, pmd, pte_page) \
-	pmd_populate_kernel(mm, pmd, page_address(pte_page))
-#define pmd_populate_kernel(mm, pmd, pte) pmd_set(pmd, (unsigned long)(pte))
+static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
+				       pte_t *pte)
+{
+	pmd_set(pmd, (unsigned long)pte);
+}
+
+static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
+				pgtable_t pte_page)
+{
+	pmd_set(pmd, (unsigned long)page_address(pte_page));
+}
+
 #define pmd_pgtable(pmd) pmd_page(pmd)
 
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/pgtable-ppc32.h
index aac6547b0823..fbb23c54b998 100644
--- a/arch/powerpc/include/asm/pgtable-ppc32.h
+++ b/arch/powerpc/include/asm/pgtable-ppc32.h
@@ -128,7 +128,12 @@ extern int icache_44x_need_flush;
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(pmd_val(pmd) & _PMD_BAD)
 #define	pmd_present(pmd)	(pmd_val(pmd) & _PMD_PRESENT_MASK)
-#define	pmd_clear(pmdp)		do { pmd_val(*(pmdp)) = 0; } while (0)
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	*pmdp = __pmd(0);
+}
+
+
 
 /*
  * When flushing the tlb entry for a page, we also need to flush the hash
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-4k.h b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
index 132ee1d482c2..7bace25d6b62 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64-4k.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64-4k.h
@@ -55,11 +55,15 @@
 #define pgd_none(pgd)		(!pgd_val(pgd))
 #define pgd_bad(pgd)		(pgd_val(pgd) == 0)
 #define pgd_present(pgd)	(pgd_val(pgd) != 0)
-#define pgd_clear(pgdp)		(pgd_val(*(pgdp)) = 0)
 #define pgd_page_vaddr(pgd)	(pgd_val(pgd) & ~PGD_MASKED_BITS)
 
 #ifndef __ASSEMBLY__
 
+static inline void pgd_clear(pgd_t *pgdp)
+{
+	*pgdp = __pgd(0);
+}
+
 static inline pte_t pgd_pte(pgd_t pgd)
 {
 	return __pte(pgd_val(pgd));
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index 1ef0fea32e1e..6be203d43fd1 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -144,21 +144,37 @@
 #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
 #define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
 
-#define pmd_set(pmdp, pmdval) 	(pmd_val(*(pmdp)) = (pmdval))
+static inline void pmd_set(pmd_t *pmdp, unsigned long val)
+{
+	*pmdp = __pmd(val);
+}
+
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	*pmdp = __pmd(0);
+}
+
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
 				 || (pmd_val(pmd) & PMD_BAD_BITS))
 #define	pmd_present(pmd)	(!pmd_none(pmd))
-#define	pmd_clear(pmdp)		(pmd_val(*(pmdp)) = 0)
 #define pmd_page_vaddr(pmd)	(pmd_val(pmd) & ~PMD_MASKED_BITS)
 extern struct page *pmd_page(pmd_t pmd);
 
-#define pud_set(pudp, pudval)	(pud_val(*(pudp)) = (pudval))
+static inline void pud_set(pud_t *pudp, unsigned long val)
+{
+	*pudp = __pud(val);
+}
+
+static inline void pud_clear(pud_t *pudp)
+{
+	*pudp = __pud(0);
+}
+
 #define pud_none(pud)		(!pud_val(pud))
 #define	pud_bad(pud)		(!is_kernel_addr(pud_val(pud)) \
 				 || (pud_val(pud) & PUD_BAD_BITS))
 #define pud_present(pud)	(pud_val(pud) != 0)
-#define pud_clear(pudp)		(pud_val(*(pudp)) = 0)
 #define pud_page_vaddr(pud)	(pud_val(pud) & ~PUD_MASKED_BITS)
 
 extern struct page *pud_page(pud_t pud);
@@ -173,9 +189,13 @@ static inline pud_t pte_pud(pte_t pte)
 	return __pud(pte_val(pte));
 }
 #define pud_write(pud)		pte_write(pud_pte(pud))
-#define pgd_set(pgdp, pudp)	({pgd_val(*(pgdp)) = (unsigned long)(pudp);})
 #define pgd_write(pgd)		pte_write(pgd_pte(pgd))
 
+static inline void pgd_set(pgd_t *pgdp, unsigned long val)
+{
+	*pgdp = __pgd(val);
+}
+
 /*
  * Find an entry in a page-table-directory.  We combine the address region
  * (the high order N bits) and the pgd portion of the address.
@@ -528,14 +548,12 @@ static inline pmd_t pmd_mkhuge(pmd_t pmd)
 
 static inline pmd_t pmd_mknotpresent(pmd_t pmd)
 {
-	pmd_val(pmd) &= ~_PAGE_PRESENT;
-	return pmd;
+	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
 }
 
 static inline pmd_t pmd_mksplitting(pmd_t pmd)
 {
-	pmd_val(pmd) |= _PAGE_SPLITTING;
-	return pmd;
+	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
 }
 
 #define __HAVE_ARCH_PMD_SAME
diff --git a/arch/powerpc/mm/40x_mmu.c b/arch/powerpc/mm/40x_mmu.c
index 5810967511d4..31a5d42df8c9 100644
--- a/arch/powerpc/mm/40x_mmu.c
+++ b/arch/powerpc/mm/40x_mmu.c
@@ -110,10 +110,10 @@ unsigned long __init mmu_mapin_ram(unsigned long top)
 		unsigned long val = p | _PMD_SIZE_16M | _PAGE_EXEC | _PAGE_HWWRITE;
 
 		pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
-		pmd_val(*pmdp++) = val;
-		pmd_val(*pmdp++) = val;
-		pmd_val(*pmdp++) = val;
-		pmd_val(*pmdp++) = val;
+		*pmdp++ = __pmd(val);
+		*pmdp++ = __pmd(val);
+		*pmdp++ = __pmd(val);
+		*pmdp++ = __pmd(val);
 
 		v += LARGE_PAGE_SIZE_16M;
 		p += LARGE_PAGE_SIZE_16M;
@@ -125,7 +125,7 @@ unsigned long __init mmu_mapin_ram(unsigned long top)
 		unsigned long val = p | _PMD_SIZE_4M | _PAGE_EXEC | _PAGE_HWWRITE;
 
 		pmdp = pmd_offset(pud_offset(pgd_offset_k(v), v), v);
-		pmd_val(*pmdp) = val;
+		*pmdp = __pmd(val);
 
 		v += LARGE_PAGE_SIZE_4M;
 		p += LARGE_PAGE_SIZE_4M;
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index e92cb2146b18..d692ae31cfc7 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -759,22 +759,20 @@ void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
 
 static pmd_t pmd_set_protbits(pmd_t pmd, pgprot_t pgprot)
 {
-	pmd_val(pmd) |= pgprot_val(pgprot);
-	return pmd;
+	return __pmd(pmd_val(pmd) | pgprot_val(pgprot));
 }
 
 pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
 {
-	pmd_t pmd;
+	unsigned long pmdv;
 	/*
 	 * For a valid pte, we would have _PAGE_PRESENT always
 	 * set. We use this to check THP page at pmd level.
 	 * leaf pte for huge page, bottom two bits != 00
 	 */
-	pmd_val(pmd) = pfn << PTE_RPN_SHIFT;
-	pmd_val(pmd) |= _PAGE_THP_HUGE;
-	pmd = pmd_set_protbits(pmd, pgprot);
-	return pmd;
+	pmdv = pfn << PTE_RPN_SHIFT;
+	pmdv |= _PAGE_THP_HUGE;
+	return pmd_set_protbits(__pmd(pmdv), pgprot);
 }
 
 pmd_t mk_pmd(struct page *page, pgprot_t pgprot)
@@ -784,10 +782,11 @@ pmd_t mk_pmd(struct page *page, pgprot_t pgprot)
 
 pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
 {
+	unsigned long pmdv;
 
-	pmd_val(pmd) &= _HPAGE_CHG_MASK;
-	pmd = pmd_set_protbits(pmd, newprot);
-	return pmd;
+	pmdv = pmd_val(pmd);
+	pmdv &= _HPAGE_CHG_MASK;
+	return pmd_set_protbits(__pmd(pmdv), newprot);
 }
 
 /*
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 11/31] powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (9 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 10/31] powerpc/mm: Don't use pmd_val, pud_val and pgd_val " Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 12/31] powerpc/mm: Move PTE bits from generic functions to hash64 functions Aneesh Kumar K.V
                   ` (21 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

This enables us to keep the hash64-related bits together, and makes the code
easier to follow.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h    | 452 ++++++++++++++++++++++++++-
 arch/powerpc/include/asm/book3s/64/pgtable.h | 449 +-------------------------
 arch/powerpc/include/asm/pgtable.h           |   6 -
 3 files changed, 452 insertions(+), 455 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 7deb5063ff8c..954062beca9e 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -2,6 +2,63 @@
 #define _ASM_POWERPC_BOOK3S_64_HASH_H
 #ifdef __KERNEL__
 
+#ifdef CONFIG_PPC_64K_PAGES
+#include <asm/book3s/64/hash-64k.h>
+#else
+#include <asm/book3s/64/hash-4k.h>
+#endif
+
+/*
+ * Size of EA range mapped by our pagetables.
+ */
+#define PGTABLE_EADDR_SIZE	(PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
+				 PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
+#define PGTABLE_RANGE		(ASM_CONST(1) << PGTABLE_EADDR_SIZE)
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#define PMD_CACHE_INDEX	(PMD_INDEX_SIZE + 1)
+#else
+#define PMD_CACHE_INDEX	PMD_INDEX_SIZE
+#endif
+/*
+ * Define the address range of the kernel non-linear virtual area
+ */
+#define KERN_VIRT_START ASM_CONST(0xD000000000000000)
+#define KERN_VIRT_SIZE	ASM_CONST(0x0000100000000000)
+
+/*
+ * The vmalloc space starts at the beginning of that region, and
+ * occupies half of it on hash CPUs and a quarter of it on Book3E
+ * (we keep a quarter for the virtual memmap)
+ */
+#define VMALLOC_START	KERN_VIRT_START
+#define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 1)
+#define VMALLOC_END	(VMALLOC_START + VMALLOC_SIZE)
+
+/*
+ * Region IDs
+ */
+#define REGION_SHIFT		60UL
+#define REGION_MASK		(0xfUL << REGION_SHIFT)
+#define REGION_ID(ea)		(((unsigned long)(ea)) >> REGION_SHIFT)
+
+#define VMALLOC_REGION_ID	(REGION_ID(VMALLOC_START))
+#define KERNEL_REGION_ID	(REGION_ID(PAGE_OFFSET))
+#define VMEMMAP_REGION_ID	(0xfUL)	/* Server only */
+#define USER_REGION_ID		(0UL)
+
+/*
+ * Defines the address of the vmemap area, in its own region on
+ * hash table CPUs.
+ */
+#define VMEMMAP_BASE		(VMEMMAP_REGION_ID << REGION_SHIFT)
+
+#ifdef CONFIG_PPC_MM_SLICES
+#define HAVE_ARCH_UNMAPPED_AREA
+#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#else
+#error "Hash Config needs mm slice enabled"
+#endif /* CONFIG_PPC_MM_SLICES */
 /*
  * Common bits between 4K and 64K pages in a linux-style PTE.
  * These match the bits in the (hardware-defined) PowerPC PTE as closely
@@ -46,11 +103,400 @@
 /* Hash table based platforms need atomic updates of the linux PTE */
 #define PTE_ATOMIC_UPDATES	1
 
-#ifdef CONFIG_PPC_64K_PAGES
-#include <asm/book3s/64/hash-64k.h>
+/*
+ * THP pages can't be special. So use the _PAGE_SPECIAL
+ */
+#define _PAGE_SPLITTING _PAGE_SPECIAL
+
+/*
+ * We need to differentiate between explicit huge page and THP huge
+ * page, since THP huge page also need to track real subpage details
+ */
+#define _PAGE_THP_HUGE  _PAGE_4K_PFN
+
+/*
+ * set of bits not changed in pmd_modify.
+ */
+#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
+			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
+			 _PAGE_THP_HUGE)
+#define _PTE_NONE_MASK	_PAGE_HPTEFLAGS
+/*
+ * The mask covered by the RPN must be a ULL on 32-bit platforms with
+ * 64-bit PTEs
+ * FIXME!! double check the RPN_MAX May be not used
+ */
+//#define PTE_RPN_MAX	(1UL << (32 - PTE_RPN_SHIFT))
+#define PTE_RPN_MASK	(~((1UL << PTE_RPN_SHIFT) - 1))
+/*
+ * _PAGE_CHG_MASK masks of bits that are to be preserved across
+ * pgprot changes
+ */
+#define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
+			 _PAGE_ACCESSED | _PAGE_SPECIAL)
+/*
+ * Mask of bits returned by pte_pgprot()
+ */
+#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU | _PAGE_4K_PFN | \
+			 _PAGE_USER | _PAGE_ACCESSED |  \
+			 _PAGE_RW |  _PAGE_DIRTY | _PAGE_EXEC)
+/*
+ * We define 2 sets of base prot bits, one for basic pages (ie,
+ * cacheable kernel and user pages) and one for non cacheable
+ * pages. We always set _PAGE_COHERENT when SMP is enabled or
+ * the processor might need it for DMA coherency.
+ */
+#define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)
+#define _PAGE_BASE	(_PAGE_BASE_NC | _PAGE_COHERENT)
+
+/* Permission masks used to generate the __P and __S table,
+ *
+ * Note:__pgprot is defined in arch/powerpc/include/asm/page.h
+ *
+ * Write permissions imply read permissions for now (we could make write-only
+ * pages on BookE but we don't bother for now). Execute permission control is
+ * possible on platforms that define _PAGE_EXEC
+ *
+ * Note due to the way vm flags are laid out, the bits are XWR
+ */
+#define PAGE_NONE	__pgprot(_PAGE_BASE)
+#define PAGE_SHARED	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
+#define PAGE_SHARED_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | \
+				 _PAGE_EXEC)
+#define PAGE_COPY	__pgprot(_PAGE_BASE | _PAGE_USER )
+#define PAGE_COPY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+#define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_USER )
+#define PAGE_READONLY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
+
+#define __P000	PAGE_NONE
+#define __P001	PAGE_READONLY
+#define __P010	PAGE_COPY
+#define __P011	PAGE_COPY
+#define __P100	PAGE_READONLY_X
+#define __P101	PAGE_READONLY_X
+#define __P110	PAGE_COPY_X
+#define __P111	PAGE_COPY_X
+
+#define __S000	PAGE_NONE
+#define __S001	PAGE_READONLY
+#define __S010	PAGE_SHARED
+#define __S011	PAGE_SHARED
+#define __S100	PAGE_READONLY_X
+#define __S101	PAGE_READONLY_X
+#define __S110	PAGE_SHARED_X
+#define __S111	PAGE_SHARED_X
+
+/* Permission masks used for kernel mappings */
+#define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
+#define PAGE_KERNEL_NC	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
+				 _PAGE_NO_CACHE)
+#define PAGE_KERNEL_NCG	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
+				 _PAGE_NO_CACHE | _PAGE_GUARDED)
+#define PAGE_KERNEL_X	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
+#define PAGE_KERNEL_RO	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
+#define PAGE_KERNEL_ROX	__pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
+
+/* Protection used for kernel text. We want the debuggers to be able to
+ * set breakpoints anywhere, so don't write protect the kernel text
+ * on platforms where such control is possible.
+ */
+#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) ||\
+	defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
+#define PAGE_KERNEL_TEXT	PAGE_KERNEL_X
 #else
-#include <asm/book3s/64/hash-4k.h>
+#define PAGE_KERNEL_TEXT	PAGE_KERNEL_ROX
+#endif
+
+/* Make modules code happy. We don't set RO yet */
+#define PAGE_KERNEL_EXEC	PAGE_KERNEL_X
+#define PAGE_AGP		(PAGE_KERNEL_NC)
+
+#define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
+#define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
+/*
+ * We save the slot number & secondary bit in the second half of the
+ * PTE page. We use the 8 bytes per each pte entry.
+ */
+#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
+
+#ifndef __ASSEMBLY__
+#define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
+				 || (pmd_val(pmd) & PMD_BAD_BITS))
+#define pmd_page_vaddr(pmd)	(pmd_val(pmd) & ~PMD_MASKED_BITS)
+
+#define	pud_bad(pud)		(!is_kernel_addr(pud_val(pud)) \
+				 || (pud_val(pud) & PUD_BAD_BITS))
+#define pud_page_vaddr(pud)	(pud_val(pud) & ~PUD_MASKED_BITS)
+
+#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
+#define pmd_index(address) (((address) >> (PMD_SHIFT)) & (PTRS_PER_PMD - 1))
+#define pte_index(address) (((address) >> (PAGE_SHIFT)) & (PTRS_PER_PTE - 1))
+
+extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
+			    pte_t *ptep, unsigned long pte, int huge);
+extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
+					 unsigned long addr,
+					 pmd_t *pmdp,
+					 unsigned long clr,
+					 unsigned long set);
+/* Atomic PTE updates */
+static inline unsigned long pte_update(struct mm_struct *mm,
+				       unsigned long addr,
+				       pte_t *ptep, unsigned long clr,
+				       unsigned long set,
+				       int huge)
+{
+	unsigned long old, tmp;
+
+	__asm__ __volatile__(
+	"1:	ldarx	%0,0,%3		# pte_update\n\
+	andi.	%1,%0,%6\n\
+	bne-	1b \n\
+	andc	%1,%0,%4 \n\
+	or	%1,%1,%7\n\
+	stdcx.	%1,0,%3 \n\
+	bne-	1b"
+	: "=&r" (old), "=&r" (tmp), "=m" (*ptep)
+	: "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
+	: "cc" );
+	/* huge pages use the old page table lock */
+	if (!huge)
+		assert_pte_locked(mm, addr);
+
+	if (old & _PAGE_HASHPTE)
+		hpte_need_flush(mm, addr, ptep, old, huge);
+
+	return old;
+}
+
+static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
+					      unsigned long addr, pte_t *ptep)
+{
+	unsigned long old;
+
+	if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+		return 0;
+	old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
+	return (old & _PAGE_ACCESSED) != 0;
+}
+#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
+#define ptep_test_and_clear_young(__vma, __addr, __ptep)		   \
+({									   \
+	int __r;							   \
+	__r = __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep); \
+	__r;								   \
+})
+
+#define __HAVE_ARCH_PTEP_SET_WRPROTECT
+static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+				      pte_t *ptep)
+{
+
+	if ((pte_val(*ptep) & _PAGE_RW) == 0)
+		return;
+
+	pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
+}
+
+static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
+					   unsigned long addr, pte_t *ptep)
+{
+	if ((pte_val(*ptep) & _PAGE_RW) == 0)
+		return;
+
+	pte_update(mm, addr, ptep, _PAGE_RW, 0, 1);
+}
+
+/*
+ * We currently remove entries from the hashtable regardless of whether
+ * the entry was young or dirty. The generic routines only flush if the
+ * entry was young or dirty which is not good enough.
+ *
+ * We should be more intelligent about this but for the moment we override
+ * these functions and force a tlb flush unconditionally
+ */
+#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
+#define ptep_clear_flush_young(__vma, __address, __ptep)		\
+({									\
+	int __young = __ptep_test_and_clear_young((__vma)->vm_mm, __address, \
+						  __ptep);		\
+	__young;							\
+})
+
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
+				       unsigned long addr, pte_t *ptep)
+{
+	unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
+	return __pte(old);
+}
+
+static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
+			     pte_t * ptep)
+{
+	pte_update(mm, addr, ptep, ~0UL, 0, 0);
+}
+
+
+/* Set the dirty and/or accessed bits atomically in a linux PTE, this
+ * function doesn't need to flush the hash entry
+ */
+static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
+{
+	unsigned long bits = pte_val(entry) &
+		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
+
+	unsigned long old, tmp;
+
+	__asm__ __volatile__(
+	"1:	ldarx	%0,0,%4\n\
+		andi.	%1,%0,%6\n\
+		bne-	1b \n\
+		or	%0,%3,%0\n\
+		stdcx.	%0,0,%4\n\
+		bne-	1b"
+	:"=&r" (old), "=&r" (tmp), "=m" (*ptep)
+	:"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
+	:"cc");
+}
+
+#define __HAVE_ARCH_PTE_SAME
+#define pte_same(A,B)	(((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)
+
+static inline char *get_hpte_slot_array(pmd_t *pmdp)
+{
+	/*
+	 * The hpte hindex is stored in the pgtable whose address is in the
+	 * second half of the PMD
+	 *
+	 * Order this load with the test for pmd_trans_huge in the caller
+	 */
+	smp_rmb();
+	return *(char **)(pmdp + PTRS_PER_PMD);
+
+
+}
+/*
+ * The linux hugepage PMD now include the pmd entries followed by the address
+ * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
+ * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
+ * each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and
+ * with 4K HPTE we need 4096 entries. Both will fit in a 4K pgtable_t.
+ *
+ * The last three bits are intentionally left to zero. This memory location
+ * are also used as normal page PTE pointers. So if we have any pointers
+ * left around while we collapse a hugepage, we need to make sure
+ * _PAGE_PRESENT bit of that is zero when we look at them
+ */
+static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
+{
+	return (hpte_slot_array[index] >> 3) & 0x1;
+}
+
+static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
+					   int index)
+{
+	return hpte_slot_array[index] >> 4;
+}
+
+static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
+					unsigned int index, unsigned int hidx)
+{
+	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+/*
+ *
+ * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
+ * page. The hugetlbfs page table walking and mangling paths are totally
+ * separated from the core VM paths and they're differentiated by
+ *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
+ *
+ * pmd_trans_huge() is defined as false at build time if
+ * CONFIG_TRANSPARENT_HUGEPAGE=n to optimize away code blocks at build
+ * time in such case.
+ *
+ * For ppc64 we need to differentiate explicit hugepages from THP, because
+ * for THP we also track the subpage details at the pmd level. We don't do
+ * that for explicit huge pages.
+ *
+ */
+static inline int pmd_trans_huge(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
+}
+
+static inline int pmd_trans_splitting(pmd_t pmd)
+{
+	if (pmd_trans_huge(pmd))
+		return pmd_val(pmd) & _PAGE_SPLITTING;
+	return 0;
+}
+
 #endif
+static inline int pmd_large(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pmd_val(pmd) & 0x3) != 0x0);
+}
+
+static inline pmd_t pmd_mknotpresent(pmd_t pmd)
+{
+	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
+}
+
+static inline pmd_t pmd_mksplitting(pmd_t pmd)
+{
+	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
+}
+
+#define __HAVE_ARCH_PMD_SAME
+static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
+{
+	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
+}
+
+static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
+					      unsigned long addr, pmd_t *pmdp)
+{
+	unsigned long old;
+
+	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+		return 0;
+	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
+	return ((old & _PAGE_ACCESSED) != 0);
+}
+
+#define __HAVE_ARCH_PMDP_SET_WRPROTECT
+static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+				      pmd_t *pmdp)
+{
+
+	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
+		return;
+
+	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
+}
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
+				   pmd_t *pmdp, unsigned long old_pmd);
+#else
+static inline void hpte_do_hugepage_flush(struct mm_struct *mm,
+					  unsigned long addr, pmd_t *pmdp,
+					  unsigned long old_pmd)
+{
+	WARN(1, "%s called with THP disabled\n", __func__);
+}
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+#endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */
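
[Editor's note: the slot-array encoding introduced above packs, for every HPTE
backing a transparent hugepage, a 1-bit secondary-hash flag and a 3-bit hidx
into the top nibble of one byte, with bit 3 as the valid flag and the low three
bits kept zero. The following self-contained sketch is not part of the patch;
it simply mirrors hpte_valid()/hpte_hash_index()/mark_hpte_slot_valid() in user
space to show how one slot value round-trips. The 256-entry array corresponds
to the 16MB-hugepage/64K-HPTE case mentioned in the comment.]

#include <stdio.h>

#define HPTE_SLOTS	256	/* 16MB hugepage backed by 64K HPTEs */

/* These mirror the kernel helpers from the hunk above. */
static unsigned int hpte_valid(unsigned char *slots, int index)
{
	return (slots[index] >> 3) & 0x1;	/* bit 3 = valid */
}

static unsigned int hpte_hash_index(unsigned char *slots, int index)
{
	return slots[index] >> 4;		/* bits 4-7 = secondary|hidx */
}

static void mark_hpte_slot_valid(unsigned char *slots, unsigned int index,
				 unsigned int hidx)
{
	slots[index] = hidx << 4 | 0x1 << 3;	/* low three bits stay zero */
}

int main(void)
{
	unsigned char slot_array[HPTE_SLOTS] = { 0 };

	/* Record that subpage 5 was hashed into slot 0xb of its group. */
	mark_hpte_slot_valid(slot_array, 5, 0xb);

	printf("index 5: valid=%u hidx=0x%x\n",
	       hpte_valid(slot_array, 5), hpte_hash_index(slot_array, 5));
	printf("index 6: valid=%u\n", hpte_valid(slot_array, 6));
	return 0;
}
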
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 19ea3852514c..aac630b4a15e 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -8,32 +8,6 @@
 #include <asm/book3s/64/hash.h>
 #include <asm/barrier.h>
 
-
-/*
- * Size of EA range mapped by our pagetables.
- */
-#define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
-			    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
-#define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define PMD_CACHE_INDEX	(PMD_INDEX_SIZE + 1)
-#else
-#define PMD_CACHE_INDEX	PMD_INDEX_SIZE
-#endif
-/*
- * Define the address range of the kernel non-linear virtual area
- */
-#define KERN_VIRT_START ASM_CONST(0xD000000000000000)
-#define KERN_VIRT_SIZE	ASM_CONST(0x0000100000000000)
-/*
- * The vmalloc space starts at the beginning of that region, and
- * occupies half of it on hash CPUs and a quarter of it on Book3E
- * (we keep a quarter for the virtual memmap)
- */
-#define VMALLOC_START	KERN_VIRT_START
-#define VMALLOC_SIZE	(KERN_VIRT_SIZE >> 1)
-#define VMALLOC_END	(VMALLOC_START + VMALLOC_SIZE)
 /*
  * The second half of the kernel virtual space is used for IO mappings,
  * it's itself carved into the PIO region (ISA and PHB IO space) and
@@ -52,150 +26,9 @@
 #define IOREMAP_BASE	(PHB_IO_END)
 #define IOREMAP_END	(KERN_VIRT_START + KERN_VIRT_SIZE)
 
-/*
- * Region IDs
- */
-#define REGION_SHIFT		60UL
-#define REGION_MASK		(0xfUL << REGION_SHIFT)
-#define REGION_ID(ea)		(((unsigned long)(ea)) >> REGION_SHIFT)
-
-#define VMALLOC_REGION_ID	(REGION_ID(VMALLOC_START))
-#define KERNEL_REGION_ID	(REGION_ID(PAGE_OFFSET))
-#define VMEMMAP_REGION_ID	(0xfUL)	/* Server only */
-#define USER_REGION_ID		(0UL)
-
-/*
- * Defines the address of the vmemap area, in its own region on
- * hash table CPUs.
- */
-#define VMEMMAP_BASE		(VMEMMAP_REGION_ID << REGION_SHIFT)
 #define vmemmap			((struct page *)VMEMMAP_BASE)
 
-
-#ifdef CONFIG_PPC_MM_SLICES
-#define HAVE_ARCH_UNMAPPED_AREA
-#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
-#else
-#error "Hash Config needs mm slice enabled"
-#endif /* CONFIG_PPC_MM_SLICES */
-
-/*
- * THP pages can't be special. So use the _PAGE_SPECIAL
- */
-#define _PAGE_SPLITTING _PAGE_SPECIAL
-
-/*
- * We need to differentiate between explicit huge page and THP huge
- * page, since THP huge page also need to track real subpage details
- */
-#define _PAGE_THP_HUGE  _PAGE_4K_PFN
-
-/*
- * set of bits not changed in pmd_modify.
- */
-#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
-			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
-#define _PTE_NONE_MASK	_PAGE_HPTEFLAGS
-/*
- * The mask convered by the RPN must be a ULL on 32-bit platforms with
- * 64-bit PTEs
- * FIXME!! double check the RPN_MAX May be not used
- */
-//#define PTE_RPN_MAX	(1UL << (32 - PTE_RPN_SHIFT))
-#define PTE_RPN_MASK	(~((1UL << PTE_RPN_SHIFT) - 1))
-/*
- * _PAGE_CHG_MASK masks of bits that are to be preserved across
- * pgprot changes
- */
-#define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
-			 _PAGE_ACCESSED | _PAGE_SPECIAL)
-/*
- * Mask of bits returned by pte_pgprot()
- */
-#define PAGE_PROT_BITS	(_PAGE_GUARDED | _PAGE_COHERENT | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU | _PAGE_4K_PFN | \
-			 _PAGE_USER | _PAGE_ACCESSED |  \
-			 _PAGE_RW |  _PAGE_DIRTY | _PAGE_EXEC)
-/*
- * We define 2 sets of base prot bits, one for basic pages (ie,
- * cacheable kernel and user pages) and one for non cacheable
- * pages. We always set _PAGE_COHERENT when SMP is enabled or
- * the processor might need it for DMA coherency.
- */
-#define _PAGE_BASE_NC	(_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE)
-#define _PAGE_BASE	(_PAGE_BASE_NC | _PAGE_COHERENT)
-
-/* Permission masks used to generate the __P and __S table,
- *
- * Note:__pgprot is defined in arch/powerpc/include/asm/page.h
- *
- * Write permissions imply read permissions for now (we could make write-only
- * pages on BookE but we don't bother for now). Execute permission control is
- * possible on platforms that define _PAGE_EXEC
- *
- * Note due to the way vm flags are laid out, the bits are XWR
- */
-#define PAGE_NONE	__pgprot(_PAGE_BASE)
-#define PAGE_SHARED	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW)
-#define PAGE_SHARED_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_RW | \
-				 _PAGE_EXEC)
-#define PAGE_COPY	__pgprot(_PAGE_BASE | _PAGE_USER )
-#define PAGE_COPY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
-#define PAGE_READONLY	__pgprot(_PAGE_BASE | _PAGE_USER )
-#define PAGE_READONLY_X	__pgprot(_PAGE_BASE | _PAGE_USER | _PAGE_EXEC)
-
-#define __P000	PAGE_NONE
-#define __P001	PAGE_READONLY
-#define __P010	PAGE_COPY
-#define __P011	PAGE_COPY
-#define __P100	PAGE_READONLY_X
-#define __P101	PAGE_READONLY_X
-#define __P110	PAGE_COPY_X
-#define __P111	PAGE_COPY_X
-
-#define __S000	PAGE_NONE
-#define __S001	PAGE_READONLY
-#define __S010	PAGE_SHARED
-#define __S011	PAGE_SHARED
-#define __S100	PAGE_READONLY_X
-#define __S101	PAGE_READONLY_X
-#define __S110	PAGE_SHARED_X
-#define __S111	PAGE_SHARED_X
-
-/* Permission masks used for kernel mappings */
-#define PAGE_KERNEL	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RW)
-#define PAGE_KERNEL_NC	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
-				 _PAGE_NO_CACHE)
-#define PAGE_KERNEL_NCG	__pgprot(_PAGE_BASE_NC | _PAGE_KERNEL_RW | \
-				 _PAGE_NO_CACHE | _PAGE_GUARDED)
-#define PAGE_KERNEL_X	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RWX)
-#define PAGE_KERNEL_RO	__pgprot(_PAGE_BASE | _PAGE_KERNEL_RO)
-#define PAGE_KERNEL_ROX	__pgprot(_PAGE_BASE | _PAGE_KERNEL_ROX)
-
-/* Protection used for kernel text. We want the debuggers to be able to
- * set breakpoints anywhere, so don't write protect the kernel text
- * on platforms where such control is possible.
- */
-#if defined(CONFIG_KGDB) || defined(CONFIG_XMON) || defined(CONFIG_BDI_SWITCH) ||\
-	defined(CONFIG_KPROBES) || defined(CONFIG_DYNAMIC_FTRACE)
-#define PAGE_KERNEL_TEXT	PAGE_KERNEL_X
-#else
-#define PAGE_KERNEL_TEXT	PAGE_KERNEL_ROX
-#endif
-
-/* Make modules code happy. We don't set RO yet */
-#define PAGE_KERNEL_EXEC	PAGE_KERNEL_X
-
-/*
- * Don't just check for any non zero bits in __PAGE_USER, since for book3e
- * and PTE_64BIT, PAGE_KERNEL_X contains _PAGE_BAP_SR which is also in
- * _PAGE_USER.  Need to explicitly match _PAGE_BAP_UR bit in that case too.
- */
-#define pte_user(val)		((val & _PAGE_USER) == _PAGE_USER)
-
 /* Advertise special mapping type for AGP */
-#define PAGE_AGP		(PAGE_KERNEL_NC)
 #define HAVE_PAGE_AGP
 
 /* Advertise support for _PAGE_SPECIAL */
@@ -234,12 +67,6 @@
 
 #endif /* __real_pte */
 
-
-/* pte_clear moved to later in this file */
-
-#define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
-#define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
-
 static inline void pmd_set(pmd_t *pmdp, unsigned long val)
 {
 	*pmdp = __pmd(val);
@@ -250,13 +77,8 @@ static inline void pmd_clear(pmd_t *pmdp)
 	*pmdp = __pmd(0);
 }
 
-
 #define pmd_none(pmd)		(!pmd_val(pmd))
-#define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
-				 || (pmd_val(pmd) & PMD_BAD_BITS))
 #define	pmd_present(pmd)	(!pmd_none(pmd))
-#define pmd_page_vaddr(pmd)	(pmd_val(pmd) & ~PMD_MASKED_BITS)
-extern struct page *pmd_page(pmd_t pmd);
 
 static inline void pud_set(pud_t *pudp, unsigned long val)
 {
@@ -269,13 +91,10 @@ static inline void pud_clear(pud_t *pudp)
 }
 
 #define pud_none(pud)		(!pud_val(pud))
-#define	pud_bad(pud)		(!is_kernel_addr(pud_val(pud)) \
-				 || (pud_val(pud) & PUD_BAD_BITS))
 #define pud_present(pud)	(pud_val(pud) != 0)
-#define pud_page_vaddr(pud)	(pud_val(pud) & ~PUD_MASKED_BITS)
 
 extern struct page *pud_page(pud_t pud);
-
+extern struct page *pmd_page(pmd_t pmd);
 static inline pte_t pud_pte(pud_t pud)
 {
 	return __pte(pud_val(pud));
@@ -296,15 +115,14 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
  * Find an entry in a page-table-directory.  We combine the address region
  * (the high order N bits) and the pgd portion of the address.
  */
-#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
 
 #define pgd_offset(mm, address)	 ((mm)->pgd + pgd_index(address))
 
 #define pmd_offset(pudp,addr) \
-  (((pmd_t *) pud_page_vaddr(*(pudp))) + (((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1)))
+	(((pmd_t *) pud_page_vaddr(*(pudp))) + pmd_index(addr))
 
 #define pte_offset_kernel(dir,addr) \
-  (((pte_t *) pmd_page_vaddr(*(dir))) + (((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)))
+	(((pte_t *) pmd_page_vaddr(*(dir))) + pte_index(addr))
 
 #define pte_offset_map(dir,addr)	pte_offset_kernel((dir), (addr))
 #define pte_unmap(pte)			do { } while(0)
@@ -312,132 +130,6 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
 /* to find an entry in a kernel page-table-directory */
 /* This now only contains the vmalloc pages */
 #define pgd_offset_k(address) pgd_offset(&init_mm, address)
-extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
-			    pte_t *ptep, unsigned long pte, int huge);
-
-/* Atomic PTE updates */
-static inline unsigned long pte_update(struct mm_struct *mm,
-				       unsigned long addr,
-				       pte_t *ptep, unsigned long clr,
-				       unsigned long set,
-				       int huge)
-{
-	unsigned long old, tmp;
-
-	__asm__ __volatile__(
-	"1:	ldarx	%0,0,%3		# pte_update\n\
-	andi.	%1,%0,%6\n\
-	bne-	1b \n\
-	andc	%1,%0,%4 \n\
-	or	%1,%1,%7\n\
-	stdcx.	%1,0,%3 \n\
-	bne-	1b"
-	: "=&r" (old), "=&r" (tmp), "=m" (*ptep)
-	: "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
-	: "cc" );
-	/* huge pages use the old page table lock */
-	if (!huge)
-		assert_pte_locked(mm, addr);
-
-	if (old & _PAGE_HASHPTE)
-		hpte_need_flush(mm, addr, ptep, old, huge);
-
-	return old;
-}
-
-static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
-					      unsigned long addr, pte_t *ptep)
-{
-	unsigned long old;
-
-	if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
-		return 0;
-	old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
-	return (old & _PAGE_ACCESSED) != 0;
-}
-#define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
-#define ptep_test_and_clear_young(__vma, __addr, __ptep)		   \
-({									   \
-	int __r;							   \
-	__r = __ptep_test_and_clear_young((__vma)->vm_mm, __addr, __ptep); \
-	__r;								   \
-})
-
-#define __HAVE_ARCH_PTEP_SET_WRPROTECT
-static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
-				      pte_t *ptep)
-{
-
-	if ((pte_val(*ptep) & _PAGE_RW) == 0)
-		return;
-
-	pte_update(mm, addr, ptep, _PAGE_RW, 0, 0);
-}
-
-static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
-					   unsigned long addr, pte_t *ptep)
-{
-	if ((pte_val(*ptep) & _PAGE_RW) == 0)
-		return;
-
-	pte_update(mm, addr, ptep, _PAGE_RW, 0, 1);
-}
-
-/*
- * We currently remove entries from the hashtable regardless of whether
- * the entry was young or dirty. The generic routines only flush if the
- * entry was young or dirty which is not good enough.
- *
- * We should be more intelligent about this but for the moment we override
- * these functions and force a tlb flush unconditionally
- */
-#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
-#define ptep_clear_flush_young(__vma, __address, __ptep)		\
-({									\
-	int __young = __ptep_test_and_clear_young((__vma)->vm_mm, __address, \
-						  __ptep);		\
-	__young;							\
-})
-
-#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
-				       unsigned long addr, pte_t *ptep)
-{
-	unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
-	return __pte(old);
-}
-
-static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
-			     pte_t * ptep)
-{
-	pte_update(mm, addr, ptep, ~0UL, 0, 0);
-}
-
-
-/* Set the dirty and/or accessed bits atomically in a linux PTE, this
- * function doesn't need to flush the hash entry
- */
-static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
-{
-	unsigned long bits = pte_val(entry) &
-		(_PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_RW | _PAGE_EXEC);
-
-	unsigned long old, tmp;
-
-	__asm__ __volatile__(
-	"1:	ldarx	%0,0,%4\n\
-		andi.	%1,%0,%6\n\
-		bne-	1b \n\
-		or	%0,%3,%0\n\
-		stdcx.	%0,0,%4\n\
-		bne-	1b"
-	:"=&r" (old), "=&r" (tmp), "=m" (*ptep)
-	:"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
-	:"cc");
-}
-
-#define __HAVE_ARCH_PTE_SAME
-#define pte_same(A,B)	(((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)
 
 #define pte_ERROR(e) \
 	pr_err("%s:%d: bad pte %08lx.\n", __FILE__, __LINE__, pte_val(e))
@@ -472,54 +164,9 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
 
-/*
- * The linux hugepage PMD now include the pmd entries followed by the address
- * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
- * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
- * each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and
- * with 4K HPTE we need 4096 entries. Both will fit in a 4K pgtable_t.
- *
- * The last three bits are intentionally left to zero. This memory location
- * are also used as normal page PTE pointers. So if we have any pointers
- * left around while we collapse a hugepage, we need to make sure
- * _PAGE_PRESENT bit of that is zero when we look at them
- */
-static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
-{
-	return (hpte_slot_array[index] >> 3) & 0x1;
-}
-
-static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
-					   int index)
-{
-	return hpte_slot_array[index] >> 4;
-}
-
-static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
-					unsigned int index, unsigned int hidx)
-{
-	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
-}
-
 struct page *realmode_pfn_to_page(unsigned long pfn);
 
-static inline char *get_hpte_slot_array(pmd_t *pmdp)
-{
-	/*
-	 * The hpte hindex is stored in the pgtable whose address is in the
-	 * second half of the PMD
-	 *
-	 * Order this load with the test for pmd_trans_huge in the caller
-	 */
-	smp_rmb();
-	return *(char **)(pmdp + PTRS_PER_PMD);
-
-
-}
-
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
-				   pmd_t *pmdp, unsigned long old_pmd);
 extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot);
 extern pmd_t mk_pmd(struct page *page, pgprot_t pgprot);
 extern pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot);
@@ -527,55 +174,9 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 		       pmd_t *pmdp, pmd_t pmd);
 extern void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
 				 pmd_t *pmd);
-/*
- *
- * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
- * page. The hugetlbfs page table walking and mangling paths are totally
- * separated form the core VM paths and they're differentiated by
- *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
- *
- * pmd_trans_huge() is defined as false at build time if
- * CONFIG_TRANSPARENT_HUGEPAGE=n to optimize away code blocks at build
- * time in such case.
- *
- * For ppc64 we need to differntiate from explicit hugepages from THP, because
- * for THP we also track the subpage details at the pmd level. We don't do
- * that for explicit huge pages.
- *
- */
-static inline int pmd_trans_huge(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
-}
-
-static inline int pmd_trans_splitting(pmd_t pmd)
-{
-	if (pmd_trans_huge(pmd))
-		return pmd_val(pmd) & _PAGE_SPLITTING;
-	return 0;
-}
-
 extern int has_transparent_hugepage(void);
-#else
-static inline void hpte_do_hugepage_flush(struct mm_struct *mm,
-					  unsigned long addr, pmd_t *pmdp,
-					  unsigned long old_pmd)
-{
-
-	WARN(1, "%s called with THP disabled\n", __func__);
-}
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-static inline int pmd_large(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
-}
 
 static inline pte_t pmd_pte(pmd_t pmd)
 {
@@ -610,44 +211,11 @@ static inline pmd_t pmd_mkhuge(pmd_t pmd)
 	return pmd;
 }
 
-static inline pmd_t pmd_mknotpresent(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
-}
-
-static inline pmd_t pmd_mksplitting(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
-}
-
-#define __HAVE_ARCH_PMD_SAME
-static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
-{
-	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
-}
-
 #define __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS
 extern int pmdp_set_access_flags(struct vm_area_struct *vma,
 				 unsigned long address, pmd_t *pmdp,
 				 pmd_t entry, int dirty);
 
-extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
-					 unsigned long addr,
-					 pmd_t *pmdp,
-					 unsigned long clr,
-					 unsigned long set);
-
-static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
-					      unsigned long addr, pmd_t *pmdp)
-{
-	unsigned long old;
-
-	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
-		return 0;
-	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
-	return ((old & _PAGE_ACCESSED) != 0);
-}
-
 #define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
 extern int pmdp_test_and_clear_young(struct vm_area_struct *vma,
 				     unsigned long address, pmd_t *pmdp);
@@ -659,17 +227,6 @@ extern int pmdp_clear_flush_young(struct vm_area_struct *vma,
 extern pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
 				     unsigned long addr, pmd_t *pmdp);
 
-#define __HAVE_ARCH_PMDP_SET_WRPROTECT
-static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
-				      pmd_t *pmdp)
-{
-
-	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
-		return;
-
-	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
-}
-
 #define __HAVE_ARCH_PMDP_SPLITTING_FLUSH
 extern void pmdp_splitting_flush(struct vm_area_struct *vma,
 				 unsigned long address, pmd_t *pmdp);
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index a27b8cef51d7..8f7338678fdc 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -18,12 +18,6 @@ struct mm_struct;
 #include <asm/pgtable-book3e.h>
 #endif /* !CONFIG_PPC_BOOK3S */
 
-/*
- * We save the slot number & secondary bit in the second half of the
- * PTE page. We use the 8 bytes per each pte entry.
- */
-#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
-
 #ifndef __ASSEMBLY__
 
 #include <asm/tlbflush.h>
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 12/31] powerpc/mm: Move PTE bits from generic functions to hash64 functions.
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (10 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 11/31] powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 13/31] powerpc/booke: Move nohash headers (part 1) Aneesh Kumar K.V
                   ` (20 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Functions which operate on PTE bits are moved to hash*.h, and other
generic functions are moved to pgtable.h.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 183 ++++++++++++++++++++++++
 arch/powerpc/include/asm/book3s/64/hash.h    | 151 ++++++++++++++++++++
 arch/powerpc/include/asm/book3s/64/pgtable.h |   6 +
 arch/powerpc/include/asm/book3s/pgtable.h    | 204 ---------------------------
 4 files changed, 340 insertions(+), 204 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 9e47515b2e01..35c699016a97 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -294,6 +294,189 @@ void pgtable_cache_init(void);
 extern int get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep,
 		      pmd_t **pmdp);
 
+/* Generic accessors to PTE bits */
+static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
+static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
+static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSED); }
+static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
+static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
+static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
+
+static inline int pte_present(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_PRESENT;
+}
+
+/* Conversion functions: convert a page and protection to a page entry,
+ * and a page entry and page directory to the page they refer to.
+ *
+ * Even if PTEs can be unsigned long long, a PFN is always an unsigned
+ * long for now.
+ */
+static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
+{
+	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
+		     pgprot_val(pgprot));
+}
+
+static inline unsigned long pte_pfn(pte_t pte)
+{
+	return pte_val(pte) >> PTE_RPN_SHIFT;
+}
+
+/* Generic modifiers for PTE bits */
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_RW);
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_RW);
+}
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SPECIAL);
+}
+
+static inline pte_t pte_mkhuge(pte_t pte)
+{
+	return pte;
+}
+
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
+}
+
+
+
+/* This low level function performs the actual PTE insertion
+ * Setting the PTE depends on the MMU type and other factors. It's
+ * an horrible mess that I'm not going to try to clean up now but
+ * I'm keeping it in one place rather than spread around
+ */
+static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte, int percpu)
+{
+#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
+	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
+	 * helper pte_update() which does an atomic update. We need to do that
+	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
+	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
+	 * the hash bits instead (ie, same as the non-SMP case)
+	 */
+	if (percpu)
+		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+			      | (pte_val(pte) & ~_PAGE_HASHPTE));
+	else
+		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
+
+#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
+	/* Second case is 32-bit with 64-bit PTE.  In this case, we
+	 * can just store as long as we do the two halves in the right order
+	 * with a barrier in between. This is possible because we take care,
+	 * in the hash code, to pre-invalidate if the PTE was already hashed,
+	 * which synchronizes us with any concurrent invalidation.
+	 * In the percpu case, we also fallback to the simple update preserving
+	 * the hash bits
+	 */
+	if (percpu) {
+		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+			      | (pte_val(pte) & ~_PAGE_HASHPTE));
+		return;
+	}
+	if (pte_val(*ptep) & _PAGE_HASHPTE)
+		flush_hash_entry(mm, ptep, addr);
+	__asm__ __volatile__("\
+		stw%U0%X0 %2,%0\n\
+		eieio\n\
+		stw%U0%X0 %L2,%1"
+	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
+	: "r" (pte) : "memory");
+
+#elif defined(CONFIG_PPC_STD_MMU_32)
+	/* Third case is 32-bit hash table in UP mode, we need to preserve
+	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
+	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
+	 * and see we need to keep track that this PTE needs invalidating
+	 */
+	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
+		      | (pte_val(pte) & ~_PAGE_HASHPTE));
+
+#else
+#error "Not supported "
+#endif
+}
+
+/*
+ * Macro to mark a page protection value as "uncacheable".
+ */
+
+#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU)
+
+#define pgprot_noncached pgprot_noncached
+static inline pgprot_t pgprot_noncached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE | _PAGE_GUARDED);
+}
+
+#define pgprot_noncached_wc pgprot_noncached_wc
+static inline pgprot_t pgprot_noncached_wc(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE);
+}
+
+#define pgprot_cached pgprot_cached
+static inline pgprot_t pgprot_cached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT);
+}
+
+#define pgprot_cached_wthru pgprot_cached_wthru
+static inline pgprot_t pgprot_cached_wthru(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT | _PAGE_WRITETHRU);
+}
+
+#define pgprot_cached_noncoherent pgprot_cached_noncoherent
+static inline pgprot_t pgprot_cached_noncoherent(pgprot_t prot)
+{
+	return __pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL);
+}
+
+#define pgprot_writecombine pgprot_writecombine
+static inline pgprot_t pgprot_writecombine(pgprot_t prot)
+{
+	return pgprot_noncached_wc(prot);
+}
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_PGTABLE_PPC32_H */
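
[Editor's note: the pgprot_noncached()/pgprot_cached*() helpers carried into
this header are what drivers go through when they need uncached user mappings.
Below is a minimal sketch of that usage pattern, not part of the patch;
mydev_mmap() and mydev_phys are hypothetical names used only for illustration.]

#include <linux/fs.h>
#include <linux/mm.h>

static unsigned long mydev_phys;	/* assumed to be filled in at probe time */

static int mydev_mmap(struct file *file, struct vm_area_struct *vma)
{
	unsigned long size = vma->vm_end - vma->vm_start;

	/* Drop existing cache-control bits, mark the mapping non-cached and guarded. */
	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

	return remap_pfn_range(vma, vma->vm_start, mydev_phys >> PAGE_SHIFT,
			       size, vma->vm_page_prot);
}
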
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 954062beca9e..c74916b5c657 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -485,6 +485,157 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
 	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
 }
 
+/* Generic accessors to PTE bits */
+static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
+static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
+static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSED); }
+static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
+static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
+static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
+
+#ifdef CONFIG_NUMA_BALANCING
+/*
+ * These work without NUMA balancing but the kernel does not care. See the
+ * comment in include/asm-generic/pgtable.h . On powerpc, this will only
+ * work for user pages and always return true for kernel pages.
+ */
+static inline int pte_protnone(pte_t pte)
+{
+	return (pte_val(pte) &
+		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
+}
+#endif /* CONFIG_NUMA_BALANCING */
+
+static inline int pte_present(pte_t pte)
+{
+	return pte_val(pte) & _PAGE_PRESENT;
+}
+
+/* Conversion functions: convert a page and protection to a page entry,
+ * and a page entry and page directory to the page they refer to.
+ *
+ * Even if PTEs can be unsigned long long, a PFN is always an unsigned
+ * long for now.
+ */
+static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
+{
+	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
+		     pgprot_val(pgprot));
+}
+
+static inline unsigned long pte_pfn(pte_t pte)
+{
+	return pte_val(pte) >> PTE_RPN_SHIFT;
+}
+
+/* Generic modifiers for PTE bits */
+static inline pte_t pte_wrprotect(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_RW);
+}
+
+static inline pte_t pte_mkclean(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkold(pte_t pte)
+{
+	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkwrite(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_RW);
+}
+
+static inline pte_t pte_mkdirty(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_DIRTY);
+}
+
+static inline pte_t pte_mkyoung(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_ACCESSED);
+}
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+	return __pte(pte_val(pte) | _PAGE_SPECIAL);
+}
+
+static inline pte_t pte_mkhuge(pte_t pte)
+{
+	return pte;
+}
+
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
+}
+
+/* This low level function performs the actual PTE insertion
+ * Setting the PTE depends on the MMU type and other factors. It's
+ * an horrible mess that I'm not going to try to clean up now but
+ * I'm keeping it in one place rather than spread around
+ */
+static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte, int percpu)
+{
+	/*
+	 * Anything else just stores the PTE normally. That covers all 64-bit
+	 * cases, and 32-bit non-hash with 32-bit PTEs.
+	 */
+	*ptep = pte;
+}
+
+/*
+ * Macro to mark a page protection value as "uncacheable".
+ */
+
+#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
+			 _PAGE_WRITETHRU)
+
+#define pgprot_noncached pgprot_noncached
+static inline pgprot_t pgprot_noncached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE | _PAGE_GUARDED);
+}
+
+#define pgprot_noncached_wc pgprot_noncached_wc
+static inline pgprot_t pgprot_noncached_wc(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_NO_CACHE);
+}
+
+#define pgprot_cached pgprot_cached
+static inline pgprot_t pgprot_cached(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT);
+}
+
+#define pgprot_cached_wthru pgprot_cached_wthru
+static inline pgprot_t pgprot_cached_wthru(pgprot_t prot)
+{
+	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
+			_PAGE_COHERENT | _PAGE_WRITETHRU);
+}
+
+#define pgprot_cached_noncoherent pgprot_cached_noncoherent
+static inline pgprot_t pgprot_cached_noncoherent(pgprot_t prot)
+{
+	return __pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL);
+}
+
+#define pgprot_writecombine pgprot_writecombine
+static inline pgprot_t pgprot_writecombine(pgprot_t prot)
+{
+	return pgprot_noncached_wc(prot);
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
 				   pmd_t *pmdp, unsigned long old_pmd);
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index aac630b4a15e..f2ace2cac7bb 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -201,6 +201,12 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
 #define pmd_mkdirty(pmd)	pte_pmd(pte_mkdirty(pmd_pte(pmd)))
 #define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
+#ifdef CONFIG_NUMA_BALANCING
+static inline int pmd_protnone(pmd_t pmd)
+{
+	return pte_protnone(pmd_pte(pmd));
+}
+#endif /* CONFIG_NUMA_BALANCING */
 
 #define __HAVE_ARCH_PMD_WRITE
 #define pmd_write(pmd)		pte_write(pmd_pte(pmd))
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index ebd6677ea017..8b0f4a29259a 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -9,221 +9,17 @@
 
 #define FIRST_USER_ADDRESS	0UL
 #ifndef __ASSEMBLY__
-
-/* Generic accessors to PTE bits */
-static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
-static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
-static inline int pte_young(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_ACCESSED); }
-static inline int pte_special(pte_t pte)	{ return !!(pte_val(pte) & _PAGE_SPECIAL); }
-static inline int pte_none(pte_t pte)		{ return (pte_val(pte) & ~_PTE_NONE_MASK) == 0; }
-static inline pgprot_t pte_pgprot(pte_t pte)	{ return __pgprot(pte_val(pte) & PAGE_PROT_BITS); }
-
-#ifdef CONFIG_NUMA_BALANCING
-/*
- * These work without NUMA balancing but the kernel does not care. See the
- * comment in include/asm-generic/pgtable.h . On powerpc, this will only
- * work for user pages and always return true for kernel pages.
- */
-static inline int pte_protnone(pte_t pte)
-{
-	return (pte_val(pte) &
-		(_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
-}
-
-static inline int pmd_protnone(pmd_t pmd)
-{
-	return pte_protnone(pmd_pte(pmd));
-}
-#endif /* CONFIG_NUMA_BALANCING */
-
-static inline int pte_present(pte_t pte)
-{
-	return pte_val(pte) & _PAGE_PRESENT;
-}
-
-/* Conversion functions: convert a page and protection to a page entry,
- * and a page entry and page directory to the page they refer to.
- *
- * Even if PTEs can be unsigned long long, a PFN is always an unsigned
- * long for now.
- */
-static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
-{
-	return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
-		     pgprot_val(pgprot));
-}
-
-static inline unsigned long pte_pfn(pte_t pte)
-{
-	return pte_val(pte) >> PTE_RPN_SHIFT;
-}
-
-/* Generic modifiers for PTE bits */
-static inline pte_t pte_wrprotect(pte_t pte)
-{
-	return __pte(pte_val(pte) & ~_PAGE_RW);
-}
-
-static inline pte_t pte_mkclean(pte_t pte)
-{
-	return __pte(pte_val(pte) & ~_PAGE_DIRTY);
-}
-
-static inline pte_t pte_mkold(pte_t pte)
-{
-	return __pte(pte_val(pte) & ~_PAGE_ACCESSED);
-}
-
-static inline pte_t pte_mkwrite(pte_t pte)
-{
-	return __pte(pte_val(pte) | _PAGE_RW);
-}
-
-static inline pte_t pte_mkdirty(pte_t pte)
-{
-	return __pte(pte_val(pte) | _PAGE_DIRTY);
-}
-
-static inline pte_t pte_mkyoung(pte_t pte)
-{
-	return __pte(pte_val(pte) | _PAGE_ACCESSED);
-}
-
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	return __pte(pte_val(pte) | _PAGE_SPECIAL);
-}
-
-static inline pte_t pte_mkhuge(pte_t pte)
-{
-	return pte;
-}
-
-static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
-{
-	return __pte((pte_val(pte) & _PAGE_CHG_MASK) | pgprot_val(newprot));
-}
-
-
 /* Insert a PTE, top-level function is out of line. It uses an inline
  * low level function in the respective pgtable-* files
  */
 extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 		       pte_t pte);
 
-/* This low level function performs the actual PTE insertion
- * Setting the PTE depends on the MMU type and other factors. It's
- * an horrible mess that I'm not going to try to clean up now but
- * I'm keeping it in one place rather than spread around
- */
-static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
-				pte_t *ptep, pte_t pte, int percpu)
-{
-#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
-	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
-	 * helper pte_update() which does an atomic update. We need to do that
-	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
-	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
-	 * the hash bits instead (ie, same as the non-SMP case)
-	 */
-	if (percpu)
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-	else
-		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
-
-#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
-	/* Second case is 32-bit with 64-bit PTE.  In this case, we
-	 * can just store as long as we do the two halves in the right order
-	 * with a barrier in between. This is possible because we take care,
-	 * in the hash code, to pre-invalidate if the PTE was already hashed,
-	 * which synchronizes us with any concurrent invalidation.
-	 * In the percpu case, we also fallback to the simple update preserving
-	 * the hash bits
-	 */
-	if (percpu) {
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-		return;
-	}
-	if (pte_val(*ptep) & _PAGE_HASHPTE)
-		flush_hash_entry(mm, ptep, addr);
-	__asm__ __volatile__("\
-		stw%U0%X0 %2,%0\n\
-		eieio\n\
-		stw%U0%X0 %L2,%1"
-	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
-	: "r" (pte) : "memory");
-
-#elif defined(CONFIG_PPC_STD_MMU_32)
-	/* Third case is 32-bit hash table in UP mode, we need to preserve
-	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
-	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
-	 * and see we need to keep track that this PTE needs invalidating
-	 */
-	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-		      | (pte_val(pte) & ~_PAGE_HASHPTE));
-
-#else
-	/* Anything else just stores the PTE normally. That covers all 64-bit
-	 * cases, and 32-bit non-hash with 32-bit PTEs.
-	 */
-	*ptep = pte;
-#endif
-}
-
 
 #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
 extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
 				 pte_t *ptep, pte_t entry, int dirty);
 
-/*
- * Macro to mark a page protection value as "uncacheable".
- */
-
-#define _PAGE_CACHE_CTL	(_PAGE_COHERENT | _PAGE_GUARDED | _PAGE_NO_CACHE | \
-			 _PAGE_WRITETHRU)
-
-#define pgprot_noncached pgprot_noncached
-static inline pgprot_t pgprot_noncached(pgprot_t prot)
-{
-	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
-			_PAGE_NO_CACHE | _PAGE_GUARDED);
-}
-
-#define pgprot_noncached_wc pgprot_noncached_wc
-static inline pgprot_t pgprot_noncached_wc(pgprot_t prot)
-{
-	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
-			_PAGE_NO_CACHE);
-}
-
-#define pgprot_cached pgprot_cached
-static inline pgprot_t pgprot_cached(pgprot_t prot)
-{
-	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
-			_PAGE_COHERENT);
-}
-
-#define pgprot_cached_wthru pgprot_cached_wthru
-static inline pgprot_t pgprot_cached_wthru(pgprot_t prot)
-{
-	return __pgprot((pgprot_val(prot) & ~_PAGE_CACHE_CTL) |
-			_PAGE_COHERENT | _PAGE_WRITETHRU);
-}
-
-#define pgprot_cached_noncoherent pgprot_cached_noncoherent
-static inline pgprot_t pgprot_cached_noncoherent(pgprot_t prot)
-{
-	return __pgprot(pgprot_val(prot) & ~_PAGE_CACHE_CTL);
-}
-
-#define pgprot_writecombine pgprot_writecombine
-static inline pgprot_t pgprot_writecombine(pgprot_t prot)
-{
-	return pgprot_noncached_wc(prot);
-}
-
 struct file;
 extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 				     unsigned long size, pgprot_t vma_prot);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 13/31] powerpc/booke: Move nohash headers (part 1)
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (11 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 12/31] powerpc/mm: Move PTE bits from generic functions to hash64 functions Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 14/31] powerpc/booke: Move nohash headers (part 2) Aneesh Kumar K.V
                   ` (19 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Move the booke-related headers below nohash/32 or nohash/64

We are splitting this change into multiple patches to make the rebasing
easier. The following patches can be folded into this if needed.
They are kept separate for easier review.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/{pgtable-book3e.h => nohash/pgtable.h} | 0
 arch/powerpc/include/asm/pgtable.h                              | 2 +-
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename arch/powerpc/include/asm/{pgtable-book3e.h => nohash/pgtable.h} (100%)

diff --git a/arch/powerpc/include/asm/pgtable-book3e.h b/arch/powerpc/include/asm/nohash/pgtable.h
similarity index 100%
rename from arch/powerpc/include/asm/pgtable-book3e.h
rename to arch/powerpc/include/asm/nohash/pgtable.h
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 8f7338678fdc..ac9fb114e25d 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -15,7 +15,7 @@ struct mm_struct;
 #ifdef CONFIG_PPC_BOOK3S
 #include <asm/book3s/pgtable.h>
 #else
-#include <asm/pgtable-book3e.h>
+#include <asm/nohash/pgtable.h>
 #endif /* !CONFIG_PPC_BOOK3S */
 
 #ifndef __ASSEMBLY__
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 14/31] powerpc/booke: Move nohash headers (part 2)
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (12 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 13/31] powerpc/booke: Move nohash headers (part 1) Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 15/31] powerpc/booke: Move nohash headers (part 3) Aneesh Kumar K.V
                   ` (18 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/{pgtable-ppc32.h => nohash/32/pgtable.h} | 0
 arch/powerpc/include/asm/{pgtable-ppc64.h => nohash/64/pgtable.h} | 2 +-
 arch/powerpc/include/asm/nohash/pgtable.h                         | 8 ++++----
 3 files changed, 5 insertions(+), 5 deletions(-)
 rename arch/powerpc/include/asm/{pgtable-ppc32.h => nohash/32/pgtable.h} (100%)
 rename arch/powerpc/include/asm/{pgtable-ppc64.h => nohash/64/pgtable.h} (99%)

diff --git a/arch/powerpc/include/asm/pgtable-ppc32.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
similarity index 100%
rename from arch/powerpc/include/asm/pgtable-ppc32.h
rename to arch/powerpc/include/asm/nohash/32/pgtable.h
diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
similarity index 99%
rename from arch/powerpc/include/asm/pgtable-ppc64.h
rename to arch/powerpc/include/asm/nohash/64/pgtable.h
index 6be203d43fd1..9b4f9fcd64de 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -18,7 +18,7 @@
  * Size of EA range mapped by our pagetables.
  */
 #define PGTABLE_EADDR_SIZE (PTE_INDEX_SIZE + PMD_INDEX_SIZE + \
-                	    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
+			    PUD_INDEX_SIZE + PGD_INDEX_SIZE + PAGE_SHIFT)
 #define PGTABLE_RANGE (ASM_CONST(1) << PGTABLE_EADDR_SIZE)
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 91325997ba25..c0c41a2409d2 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -1,10 +1,10 @@
-#ifndef _ASM_POWERPC_PGTABLE_BOOK3E_H
-#define _ASM_POWERPC_PGTABLE_BOOK3E_H
+#ifndef _ASM_POWERPC_NOHASH_PGTABLE_H
+#define _ASM_POWERPC_NOHASH_PGTABLE_H
 
 #if defined(CONFIG_PPC64)
-#include <asm/pgtable-ppc64.h>
+#include <asm/nohash/64/pgtable.h>
 #else
-#include <asm/pgtable-ppc32.h>
+#include <asm/nohash/32/pgtable.h>
 #endif
 
 #ifndef __ASSEMBLY__
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 15/31] powerpc/booke: Move nohash headers (part 3)
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (13 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 14/31] powerpc/booke: Move nohash headers (part 2) Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 16/31] powerpc/booke: Move nohash headers (part 4) Aneesh Kumar K.V
                   ` (17 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 .../include/asm/{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} |  0
 .../asm/{pgtable-ppc64-64k.h => nohash/64/pgtable-64k.h}       |  0
 arch/powerpc/include/asm/nohash/64/pgtable.h                   | 10 +++++-----
 3 files changed, 5 insertions(+), 5 deletions(-)
 rename arch/powerpc/include/asm/{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} (100%)
 rename arch/powerpc/include/asm/{pgtable-ppc64-64k.h => nohash/64/pgtable-64k.h} (100%)

diff --git a/arch/powerpc/include/asm/pgtable-ppc64-4k.h b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
similarity index 100%
rename from arch/powerpc/include/asm/pgtable-ppc64-4k.h
rename to arch/powerpc/include/asm/nohash/64/pgtable-4k.h
diff --git a/arch/powerpc/include/asm/pgtable-ppc64-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
similarity index 100%
rename from arch/powerpc/include/asm/pgtable-ppc64-64k.h
rename to arch/powerpc/include/asm/nohash/64/pgtable-64k.h
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 9b4f9fcd64de..aebb83f84b69 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -1,14 +1,14 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC64_H_
-#define _ASM_POWERPC_PGTABLE_PPC64_H_
+#ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_H
+#define _ASM_POWERPC_NOHASH_64_PGTABLE_H
 /*
  * This file contains the functions and defines necessary to modify and use
  * the ppc64 hashed page table.
  */
 
 #ifdef CONFIG_PPC_64K_PAGES
-#include <asm/pgtable-ppc64-64k.h>
+#include <asm/nohash/64/pgtable-64k.h>
 #else
-#include <asm/pgtable-ppc64-4k.h>
+#include <asm/nohash/64/pgtable-4k.h>
 #endif
 #include <asm/barrier.h>
 
@@ -637,4 +637,4 @@ static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
 	return true;
 }
 #endif /* __ASSEMBLY__ */
-#endif /* _ASM_POWERPC_PGTABLE_PPC64_H_ */
+#endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_H */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 16/31] powerpc/booke: Move nohash headers (part 4)
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (14 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 15/31] powerpc/booke: Move nohash headers (part 3) Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 17/31] powerpc/booke: Move nohash headers (part 5) Aneesh Kumar K.V
                   ` (16 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/nohash/32/pgtable.h             | 16 ++++++++--------
 arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h       |  0
 arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h       |  0
 arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h       |  0
 arch/powerpc/include/asm/{ => nohash/32}/pte-fsl-booke.h |  0
 arch/powerpc/include/asm/nohash/64/pgtable.h             |  2 +-
 arch/powerpc/include/asm/{ => nohash}/pte-book3e.h       |  0
 7 files changed, 9 insertions(+), 9 deletions(-)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h (100%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h (100%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h (100%)
 rename arch/powerpc/include/asm/{ => nohash/32}/pte-fsl-booke.h (100%)
 rename arch/powerpc/include/asm/{ => nohash}/pte-book3e.h (100%)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index fbb23c54b998..c82cbf52d19e 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC32_H
-#define _ASM_POWERPC_PGTABLE_PPC32_H
+#ifndef _ASM_POWERPC_NOHASH_32_PGTABLE_H
+#define _ASM_POWERPC_NOHASH_32_PGTABLE_H
 
 #include <asm-generic/pgtable-nopmd.h>
 
@@ -106,15 +106,15 @@ extern int icache_44x_need_flush;
  */
 
 #if defined(CONFIG_40x)
-#include <asm/pte-40x.h>
+#include <asm/nohash/32/pte-40x.h>
 #elif defined(CONFIG_44x)
-#include <asm/pte-44x.h>
+#include <asm/nohash/32/pte-44x.h>
 #elif defined(CONFIG_FSL_BOOKE) && defined(CONFIG_PTE_64BIT)
-#include <asm/pte-book3e.h>
+#include <asm/nohash/pte-book3e.h>
 #elif defined(CONFIG_FSL_BOOKE)
-#include <asm/pte-fsl-booke.h>
+#include <asm/nohash/32/pte-fsl-booke.h>
 #elif defined(CONFIG_8xx)
-#include <asm/pte-8xx.h>
+#include <asm/nohash/32/pte-8xx.h>
 #endif
 
 /* And here we include common definitions */
@@ -340,4 +340,4 @@ extern int get_pteptr(struct mm_struct *mm, unsigned long addr, pte_t **ptep,
 
 #endif /* !__ASSEMBLY__ */
 
-#endif /* _ASM_POWERPC_PGTABLE_PPC32_H */
+#endif /* _ASM_POWERPC_NOHASH_32_PGTABLE_H */
diff --git a/arch/powerpc/include/asm/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-40x.h
rename to arch/powerpc/include/asm/nohash/32/pte-40x.h
diff --git a/arch/powerpc/include/asm/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-44x.h
rename to arch/powerpc/include/asm/nohash/32/pte-44x.h
diff --git a/arch/powerpc/include/asm/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-8xx.h
rename to arch/powerpc/include/asm/nohash/32/pte-8xx.h
diff --git a/arch/powerpc/include/asm/pte-fsl-booke.h b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-fsl-booke.h
rename to arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index aebb83f84b69..c24e03f22655 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -97,7 +97,7 @@
 /*
  * Include the PTE bits definitions
  */
-#include <asm/pte-book3e.h>
+#include <asm/nohash/pte-book3e.h>
 #include <asm/pte-common.h>
 
 #ifdef CONFIG_PPC_MM_SLICES
diff --git a/arch/powerpc/include/asm/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
similarity index 100%
rename from arch/powerpc/include/asm/pte-book3e.h
rename to arch/powerpc/include/asm/nohash/pte-book3e.h
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 17/31] powerpc/booke: Move nohash headers (part 5)
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (15 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 16/31] powerpc/booke: Move nohash headers (part 4) Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 18/31] powerpc/mm: Increase the pte frag size Aneesh Kumar K.V
                   ` (15 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/nohash/32/pte-40x.h       | 6 +++---
 arch/powerpc/include/asm/nohash/32/pte-44x.h       | 6 +++---
 arch/powerpc/include/asm/nohash/32/pte-8xx.h       | 6 +++---
 arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h | 6 +++---
 arch/powerpc/include/asm/nohash/64/pgtable-4k.h    | 6 +++---
 arch/powerpc/include/asm/nohash/64/pgtable-64k.h   | 6 +++---
 arch/powerpc/include/asm/nohash/pte-book3e.h       | 6 +++---
 7 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h
index 486b1ef81338..9624ebdacc47 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-40x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_40x_H
-#define _ASM_POWERPC_PTE_40x_H
+#ifndef _ASM_POWERPC_NOHASH_32_PTE_40x_H
+#define _ASM_POWERPC_NOHASH_32_PTE_40x_H
 #ifdef __KERNEL__
 
 /*
@@ -61,4 +61,4 @@
 #define PTE_ATOMIC_UPDATES	1
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_40x_H */
+#endif /*  _ASM_POWERPC_NOHASH_32_PTE_40x_H */
diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h
index 36f75fab23f5..fdab41c654ef 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-44x.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_44x_H
-#define _ASM_POWERPC_PTE_44x_H
+#ifndef _ASM_POWERPC_NOHASH_32_PTE_44x_H
+#define _ASM_POWERPC_NOHASH_32_PTE_44x_H
 #ifdef __KERNEL__
 
 /*
@@ -94,4 +94,4 @@
 
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_44x_H */
+#endif /*  _ASM_POWERPC_NOHASH_32_PTE_44x_H */
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index a0e2ba960976..3742b1919661 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_8xx_H
-#define _ASM_POWERPC_PTE_8xx_H
+#ifndef _ASM_POWERPC_NOHASH_32_PTE_8xx_H
+#define _ASM_POWERPC_NOHASH_32_PTE_8xx_H
 #ifdef __KERNEL__
 
 /*
@@ -62,4 +62,4 @@
 				 _PAGE_HWWRITE | _PAGE_EXEC)
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_8xx_H */
+#endif /*  _ASM_POWERPC_NOHASH_32_PTE_8xx_H */
diff --git a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
index 9f5c3d04a1a3..5422d00c6145 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-fsl-booke.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_FSL_BOOKE_H
-#define _ASM_POWERPC_PTE_FSL_BOOKE_H
+#ifndef _ASM_POWERPC_NOHASH_32_PTE_FSL_BOOKE_H
+#define _ASM_POWERPC_NOHASH_32_PTE_FSL_BOOKE_H
 #ifdef __KERNEL__
 
 /* PTE bit definitions for Freescale BookE SW loaded TLB MMU based
@@ -37,4 +37,4 @@
 #define PTE_WIMGE_SHIFT (6)
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_FSL_BOOKE_H */
+#endif /*  _ASM_POWERPC_NOHASH_32_PTE_FSL_BOOKE_H */
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
index 7bace25d6b62..fc7d51753f81 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable-4k.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC64_4K_H
-#define _ASM_POWERPC_PGTABLE_PPC64_4K_H
+#ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
+#define _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H
 /*
  * Entries per page directory level.  The PTE level must use a 64b record
  * for each page table entry.  The PMD and PGD level use a 32b record for
@@ -89,4 +89,4 @@ extern struct page *pgd_page(pgd_t pgd);
 #define remap_4k_pfn(vma, addr, pfn, prot)	\
 	remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot))
 
-#endif /* _ASM_POWERPC_PGTABLE_PPC64_4K_H */
+#endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_4K_H */
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
index 1de35bbd02a6..a44660d76096 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PGTABLE_PPC64_64K_H
-#define _ASM_POWERPC_PGTABLE_PPC64_64K_H
+#ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_64K_H
+#define _ASM_POWERPC_NOHASH_64_PGTABLE_64K_H
 
 #include <asm-generic/pgtable-nopud.h>
 
@@ -41,4 +41,4 @@
 #define pgd_pte(pgd)	(pud_pte(((pud_t){ pgd })))
 #define pte_pgd(pte)	((pgd_t)pte_pud(pte))
 
-#endif /* _ASM_POWERPC_PGTABLE_PPC64_64K_H */
+#endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_64K_H */
diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
index 8d8473278d91..e16807b78edf 100644
--- a/arch/powerpc/include/asm/nohash/pte-book3e.h
+++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
@@ -1,5 +1,5 @@
-#ifndef _ASM_POWERPC_PTE_BOOK3E_H
-#define _ASM_POWERPC_PTE_BOOK3E_H
+#ifndef _ASM_POWERPC_NOHASH_PTE_BOOK3E_H
+#define _ASM_POWERPC_NOHASH_PTE_BOOK3E_H
 #ifdef __KERNEL__
 
 /* PTE bit definitions for processors compliant to the Book3E
@@ -84,4 +84,4 @@
 #endif
 
 #endif /* __KERNEL__ */
-#endif /*  _ASM_POWERPC_PTE_FSL_BOOKE_H */
+#endif /*  _ASM_POWERPC_NOHASH_PTE_BOOK3E_H */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 18/31] powerpc/mm: Increase the pte frag size.
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (16 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 17/31] powerpc/booke: Move nohash headers (part 5) Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 19/31] powerpc/mm: Convert 4k hash insert to C Aneesh Kumar K.V
                   ` (14 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

We will use the increased size to store more information about the 4K
ptes when using a 64K page size. The idea is to free up bits in pte_t.
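
As a reading aid (not part of the patch), a rough sketch of the
arithmetic behind the new numbers, assuming the 64K-page layout visible
in the old defines (256 ptes per fragment) and the one byte of hash
slot state per 4K subpage added later in this series:

	/* illustrative only, not part of the patch */
	#define EX_PTRS_PER_PTE		256			/* 64K pages */
	#define EX_PTE_BYTES		(EX_PTRS_PER_PTE * 8)	/* 2K of ptes */
	#define EX_HIDX_BYTES		(EX_PTRS_PER_PTE * 16)	/* 4K of hash slot info */
	/* 2K + 4K = 6K, rounded up to a power of two gives the 8K fragment */
	#define EX_PTE_FRAG_SIZE	(1UL << 13)
	#define EX_PTE_FRAG_NR		(65536 / EX_PTE_FRAG_SIZE)	/* 8 per 64K page */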

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pgalloc-64.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index d8cde71f6734..4f1cc6c46728 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -164,15 +164,15 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
 
 #else /* if CONFIG_PPC_64K_PAGES */
 /*
- * we support 16 fragments per PTE page.
+ * we support 8 fragments per PTE page.
  */
-#define PTE_FRAG_NR	16
+#define PTE_FRAG_NR	8
 /*
- * We use a 2K PTE page fragment and another 2K for storing
- * real_pte_t hash index
+ * We use a 2K PTE page fragment and another 4K for storing the
+ * real_pte_t hash index, rounding the whole thing up to 8K.
  */
-#define PTE_FRAG_SIZE_SHIFT  12
-#define PTE_FRAG_SIZE (2 * PTRS_PER_PTE * sizeof(pte_t))
+#define PTE_FRAG_SIZE_SHIFT  13
+#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT)
 
 extern pte_t *page_table_alloc(struct mm_struct *, unsigned long, int);
 extern void page_table_free(struct mm_struct *, unsigned long *, int);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 19/31] powerpc/mm: Convert 4k hash insert to C
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (17 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 18/31] powerpc/mm: Increase the pte frag size Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 20/31] powerpc/mm: update __real_pte to take address as argument Aneesh Kumar K.V
                   ` (13 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/Makefile        |   3 +
 arch/powerpc/mm/hash64_64k.c    | 202 +++++++++++++++++++++
 arch/powerpc/mm/hash_low_64.S   | 380 ----------------------------------------
 arch/powerpc/mm/hash_utils_64.c |   4 +-
 4 files changed, 208 insertions(+), 381 deletions(-)
 create mode 100644 arch/powerpc/mm/hash64_64k.c

diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index 3eb73a38220d..f80ad1a76cc8 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -18,6 +18,9 @@ obj-$(CONFIG_PPC_STD_MMU_32)	+= ppc_mmu_32.o
 obj-$(CONFIG_PPC_STD_MMU)	+= hash_low_$(CONFIG_WORD_SIZE).o \
 				   tlb_hash$(CONFIG_WORD_SIZE).o \
 				   mmu_context_hash$(CONFIG_WORD_SIZE).o
+ifeq ($(CONFIG_PPC_STD_MMU_64),y)
+obj-$(CONFIG_PPC_64K_PAGES)	+= hash64_64k.o
+endif
 obj-$(CONFIG_PPC_ICSWX)		+= icswx.o
 obj-$(CONFIG_PPC_ICSWX_PID)	+= icswx_pid.o
 obj-$(CONFIG_40x)		+= 40x_mmu.o
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
new file mode 100644
index 000000000000..da133f2a51a7
--- /dev/null
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -0,0 +1,202 @@
+/*
+ * Copyright IBM Corporation, 2015
+ * Author Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#include <linux/mm.h>
+#include <asm/machdep.h>
+#include <asm/mmu.h>
+
+int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
+		   pte_t *ptep, unsigned long trap, unsigned long flags,
+		   int ssize, int subpg_prot)
+{
+	real_pte_t rpte;
+	unsigned long *hidxp;
+	unsigned long hpte_group;
+	unsigned int subpg_index;
+	unsigned long shift = 12; /* 4K */
+	unsigned long rflags, pa, hidx;
+	unsigned long old_pte, new_pte, subpg_pte;
+	unsigned long vpn, hash, slot;
+
+	/*
+	 * atomically mark the linux large page PTE busy and dirty
+	 */
+	do {
+		pte_t pte = READ_ONCE(*ptep);
+
+		old_pte = pte_val(pte);
+		/* If PTE busy, retry the access */
+		if (unlikely(old_pte & _PAGE_BUSY))
+			return 0;
+		/* If PTE permissions don't match, take page fault */
+		if (unlikely(access & ~old_pte))
+			return 1;
+		/*
+		 * Try to lock the PTE, add ACCESSED and DIRTY if it was
+		 * a write access. Since this is 4K insert of 64K page size
+		 * also add _PAGE_COMBO
+		 */
+		new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED | _PAGE_COMBO;
+		if (access & _PAGE_RW)
+			new_pte |= _PAGE_DIRTY;
+	} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
+					  old_pte, new_pte));
+	/*
+	 * Handle the subpage protection bits
+	 */
+	subpg_pte = new_pte & ~subpg_prot;
+	/*
+	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
+	 * need to add in 0x1 if it's a read-only user page
+	 */
+	rflags = subpg_pte & _PAGE_USER;
+	if ((subpg_pte & _PAGE_USER) && !((subpg_pte & _PAGE_RW) &&
+					(subpg_pte & _PAGE_DIRTY)))
+		rflags |= 0x1;
+	/*
+	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
+	 */
+	rflags |= ((subpg_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
+	/*
+	 * Always add C and Memory coherence bit
+	 */
+	rflags |= HPTE_R_C | HPTE_R_M;
+	/*
+	 * Add in WIMG bits
+	 */
+	rflags |= (subpg_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
+				_PAGE_COHERENT | _PAGE_GUARDED));
+
+	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
+	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {
+
+		/*
+		 * No CPU has hugepages but lacks no execute, so we
+		 * don't need to worry about that case
+		 */
+		rflags = hash_page_do_lazy_icache(rflags, __pte(old_pte), trap);
+	}
+
+	subpg_index = (ea & (PAGE_SIZE - 1)) >> shift;
+	vpn  = hpt_vpn(ea, vsid, ssize);
+	rpte = __real_pte(__pte(old_pte), ptep);
+	/*
+	 * None of the sub 4k pages is hashed
+	 */
+	if (!(old_pte & _PAGE_HASHPTE))
+		goto htab_insert_hpte;
+	/*
+	 * Check if the pte was already inserted into the hash table
+	 * as a 64k HW page, and invalidate the 64k HPTE if so.
+	 */
+	if (!(old_pte & _PAGE_COMBO)) {
+		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
+		old_pte &= ~_PAGE_HPTE_SUB;
+		goto htab_insert_hpte;
+	}
+	/*
+	 * Check for sub page valid and update
+	 */
+	if (__rpte_sub_valid(rpte, subpg_index)) {
+		int ret;
+
+		hash = hpt_hash(vpn, shift, ssize);
+		hidx = __rpte_to_hidx(rpte, subpg_index);
+		if (hidx & _PTEIDX_SECONDARY)
+			hash = ~hash;
+		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+		slot += hidx & _PTEIDX_GROUP_IX;
+
+		ret = ppc_md.hpte_updatepp(slot, rflags, vpn,
+					   MMU_PAGE_4K, MMU_PAGE_4K,
+					   ssize, flags);
+		/*
+		 * If we failed, typically because the HPTE wasn't really here,
+		 * we try an insertion.
+		 */
+		if (ret == -1)
+			goto htab_insert_hpte;
+
+		*ptep = __pte(new_pte & ~_PAGE_BUSY);
+		return 0;
+	}
+
+htab_insert_hpte:
+	/*
+	 * handle _PAGE_4K_PFN case
+	 */
+	if (old_pte & _PAGE_4K_PFN) {
+		/*
+		 * All the sub 4k page have the same
+		 * physical address.
+		 */
+		pa = pte_pfn(__pte(old_pte)) << HW_PAGE_SHIFT;
+	} else {
+		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
+		pa += (subpg_index << shift);
+	}
+	hash = hpt_hash(vpn, shift, ssize);
+repeat:
+	hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+
+	/* Insert into the hash table, primary slot */
+	slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
+				  MMU_PAGE_4K, MMU_PAGE_4K, ssize);
+	/*
+	 * Primary is full, try the secondary
+	 */
+	if (unlikely(slot == -1)) {
+		hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+		slot = ppc_md.hpte_insert(hpte_group, vpn, pa,
+					  rflags, HPTE_V_SECONDARY,
+					  MMU_PAGE_4K, MMU_PAGE_4K, ssize);
+		if (slot == -1) {
+			if (mftb() & 0x1)
+				hpte_group = ((hash & htab_hash_mask) *
+					      HPTES_PER_GROUP) & ~0x7UL;
+			ppc_md.hpte_remove(hpte_group);
+			/*
+			 * FIXME!! Should we try the group from which we removed?
+			 */
+			goto repeat;
+		}
+	}
+	/*
+	 * Hypervisor failure. Restore old pmd and return -1
+	 * similar to __hash_page_*
+	 */
+	if (unlikely(slot == -2)) {
+		*ptep = __pte(old_pte);
+		hash_failure_debug(ea, access, vsid, trap, ssize,
+				   MMU_PAGE_4K, MMU_PAGE_4K, old_pte);
+		return -1;
+	}
+	/*
+	 * Insert slot number & secondary bit in PTE second half,
+	 * clear _PAGE_BUSY and set appropriate HPTE slot bit
+	 * Since we have _PAGE_BUSY set on ptep, we can be sure
+	 * nobody is updating hidx.
+	 */
+	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+	/* __real_pte use pte_val() any idea why ? FIXME!! */
+	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
+	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
+	new_pte |= (_PAGE_HPTE_SUB0 >> subpg_index);
+	/*
+	 * check __real_pte for details on matching smp_rmb()
+	 */
+	smp_wmb();
+	*ptep = __pte(new_pte & ~_PAGE_BUSY);
+	return 0;
+}
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 3b49e3295901..6b4d4c1d0628 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -328,381 +328,8 @@ htab_pte_insert_failure:
 	li	r3,-1
 	b	htab_bail
 
-
 #else /* CONFIG_PPC_64K_PAGES */
 
-
-/*****************************************************************************
- *                                                                           *
- *           64K SW & 4K or 64K HW in a 4K segment pages implementation      *
- *                                                                           *
- *****************************************************************************/
-
-/* _hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
- *		 pte_t *ptep, unsigned long trap, unsigned local flags,
- *		 int ssize, int subpg_prot)
- */
-
-/*
- * For now, we do NOT implement Admixed pages
- */
-_GLOBAL(__hash_page_4K)
-	mflr	r0
-	std	r0,16(r1)
-	stdu	r1,-STACKFRAMESIZE(r1)
-	/* Save all params that we need after a function call */
-	std	r6,STK_PARAM(R6)(r1)
-	std	r8,STK_PARAM(R8)(r1)
-	std	r9,STK_PARAM(R9)(r1)
-
-	/* Save non-volatile registers.
-	 * r31 will hold "old PTE"
-	 * r30 is "new PTE"
-	 * r29 is vpn
-	 * r28 is a hash value
-	 * r27 is hashtab mask (maybe dynamic patched instead ?)
-	 * r26 is the hidx mask
-	 * r25 is the index in combo page
-	 */
-	std	r25,STK_REG(R25)(r1)
-	std	r26,STK_REG(R26)(r1)
-	std	r27,STK_REG(R27)(r1)
-	std	r28,STK_REG(R28)(r1)
-	std	r29,STK_REG(R29)(r1)
-	std	r30,STK_REG(R30)(r1)
-	std	r31,STK_REG(R31)(r1)
-
-	/* Step 1:
-	 *
-	 * Check permissions, atomically mark the linux PTE busy
-	 * and hashed.
-	 */
-1:
-	ldarx	r31,0,r6
-	/* Check access rights (access & ~(pte_val(*ptep))) */
-	andc.	r0,r4,r31
-	bne-	htab_wrong_access
-	/* Check if PTE is busy */
-	andi.	r0,r31,_PAGE_BUSY
-	/* If so, just bail out and refault if needed. Someone else
-	 * is changing this PTE anyway and might hash it.
-	 */
-	bne-	htab_bail_ok
-	/* Prepare new PTE value (turn access RW into DIRTY, then
-	 * add BUSY and ACCESSED)
-	 */
-	rlwinm	r30,r4,32-9+7,31-7,31-7	/* _PAGE_RW -> _PAGE_DIRTY */
-	or	r30,r30,r31
-	ori	r30,r30,_PAGE_BUSY | _PAGE_ACCESSED
-	oris	r30,r30,_PAGE_COMBO@h
-	/* Write the linux PTE atomically (setting busy) */
-	stdcx.	r30,0,r6
-	bne-	1b
-	isync
-
-	/* Step 2:
-	 *
-	 * Insert/Update the HPTE in the hash table. At this point,
-	 * r4 (access) is re-useable, we use it for the new HPTE flags
-	 */
-
-	/* Load the hidx index */
-	rldicl	r25,r3,64-12,60
-
-BEGIN_FTR_SECTION
-	cmpdi	r9,0			/* check segment size */
-	bne	3f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_1T_SEGMENT)
-	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT - VPN_SHIFT
-	/*
-	 * clrldi r3,r3,64 - SID_SHIFT -->  ea & 0xfffffff
-	 * srdi	 r28,r3,VPN_SHIFT
-	 */
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT - VPN_SHIFT)
-	or	r29,r28,r29
-	/*
-	 * Calculate hash value for primary slot and store it in r28
-	 * r3 = va, r5 = vsid
-	 * r0 = (va >> 12) & ((1ul << (28 - 12)) -1)
-	 */
-	rldicl	r0,r3,64-12,48
-	xor	r28,r5,r0		/* hash */
-	b	4f
-
-3:	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT_1T - VPN_SHIFT
-	/*
-	 * clrldi r3,r3,64 - SID_SHIFT_1T -->  ea & 0xffffffffff
-	 * srdi	r28,r3,VPN_SHIFT
-	 */
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT_1T - VPN_SHIFT)
-	or	r29,r28,r29
-
-	/*
-	 * Calculate hash value for primary slot and
-	 * store it in r28  for 1T segment
-	 * r3 = va, r5 = vsid
-	 */
-	sldi	r28,r5,25		/* vsid << 25 */
-	/* r0 = (va >> 12) & ((1ul << (40 - 12)) -1) */
-	rldicl	r0,r3,64-12,36
-	xor	r28,r28,r5		/* vsid ^ ( vsid << 25) */
-	xor	r28,r28,r0		/* hash */
-
-	/* Convert linux PTE bits into HW equivalents */
-4:
-#ifdef CONFIG_PPC_SUBPAGE_PROT
-	andc	r10,r30,r10
-	andi.	r3,r10,0x1fe		/* Get basic set of flags */
-	rlwinm	r0,r10,32-9+1,30,30	/* _PAGE_RW -> _PAGE_USER (r0) */
-#else
-	andi.	r3,r30,0x1fe		/* Get basic set of flags */
-	rlwinm	r0,r30,32-9+1,30,30	/* _PAGE_RW -> _PAGE_USER (r0) */
-#endif
-	xori	r3,r3,HPTE_R_N		/* _PAGE_EXEC -> NOEXEC */
-	rlwinm	r4,r30,32-7+1,30,30	/* _PAGE_DIRTY -> _PAGE_USER (r4) */
-	and	r0,r0,r4		/* _PAGE_RW & _PAGE_DIRTY ->r0 bit 30*/
-	andc	r0,r3,r0		/* r0 = pte & ~r0 */
-	rlwimi	r3,r0,32-1,31,31	/* Insert result into PP lsb */
-	/*
-	 * Always add "C" bit for perf. Memory coherence is always enabled
-	 */
-	ori	r3,r3,HPTE_R_C | HPTE_R_M
-
-	/* We eventually do the icache sync here (maybe inline that
-	 * code rather than call a C function...)
-	 */
-BEGIN_FTR_SECTION
-	mr	r4,r30
-	mr	r5,r7
-	bl	hash_page_do_lazy_icache
-END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)
-
-	/* At this point, r3 contains new PP bits, save them in
-	 * place of "access" in the param area (sic)
-	 */
-	std	r3,STK_PARAM(R4)(r1)
-
-	/* Get htab_hash_mask */
-	ld	r4,htab_hash_mask@got(2)
-	ld	r27,0(r4)	/* htab_hash_mask -> r27 */
-
-	/* Check if we may already be in the hashtable, in this case, we
-	 * go to out-of-line code to try to modify the HPTE. We look for
-	 * the bit at (1 >> (index + 32))
-	 */
-	rldicl.	r0,r31,64-12,48
-	li	r26,0			/* Default hidx */
-	beq	htab_insert_pte
-
-	/*
-	 * Check if the pte was already inserted into the hash table
-	 * as a 64k HW page, and invalidate the 64k HPTE if so.
-	 */
-	andis.	r0,r31,_PAGE_COMBO@h
-	beq	htab_inval_old_hpte
-
-	ld	r6,STK_PARAM(R6)(r1)
-	ori	r26,r6,PTE_PAGE_HIDX_OFFSET /* Load the hidx mask. */
-	ld	r26,0(r26)
-	addi	r5,r25,36		/* Check actual HPTE_SUB bit, this */
-	rldcr.	r0,r31,r5,0		/* must match pgtable.h definition */
-	bne	htab_modify_pte
-
-htab_insert_pte:
-	/* real page number in r5, PTE RPN value + index */
-	andis.	r0,r31,_PAGE_4K_PFN@h
-	srdi	r5,r31,PTE_RPN_SHIFT
-	bne-	htab_special_pfn
-	sldi	r5,r5,PAGE_FACTOR
-	add	r5,r5,r25
-htab_special_pfn:
-	sldi	r5,r5,HW_PAGE_SHIFT
-
-	/* Calculate primary group hash */
-	and	r0,r28,r27
-	rldicr	r3,r0,3,63-3		/* r0 = (hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,0			/* !bolted, !secondary */
-	li	r8,MMU_PAGE_4K		/* page size */
-	li	r9,MMU_PAGE_4K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl htab_call_hpte_insert1
-htab_call_hpte_insert1:
-	bl	.			/* patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge	htab_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	htab_pte_insert_failure
-
-	/* Now try secondary slot */
-
-	/* real page number in r5, PTE RPN value + index */
-	andis.	r0,r31,_PAGE_4K_PFN@h
-	srdi	r5,r31,PTE_RPN_SHIFT
-	bne-	3f
-	sldi	r5,r5,PAGE_FACTOR
-	add	r5,r5,r25
-3:	sldi	r5,r5,HW_PAGE_SHIFT
-
-	/* Calculate secondary group hash */
-	andc	r0,r27,r28
-	rldicr	r3,r0,3,63-3		/* r0 = (~hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,HPTE_V_SECONDARY	/* !bolted, secondary */
-	li	r8,MMU_PAGE_4K		/* page size */
-	li	r9,MMU_PAGE_4K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl htab_call_hpte_insert2
-htab_call_hpte_insert2:
-	bl	.			/* patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge+	htab_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	htab_pte_insert_failure
-
-	/* Both are full, we need to evict something */
-	mftb	r0
-	/* Pick a random group based on TB */
-	andi.	r0,r0,1
-	mr	r5,r28
-	bne	2f
-	not	r5,r5
-2:	and	r0,r5,r27
-	rldicr	r3,r0,3,63-3		/* r0 = (hash & mask) << 3 */
-	/* Call ppc_md.hpte_remove */
-.globl htab_call_hpte_remove
-htab_call_hpte_remove:
-	bl	.			/* patched by htab_finish_init() */
-
-	/* Try all again */
-	b	htab_insert_pte
-
-	/*
-	 * Call out to C code to invalidate an 64k HW HPTE that is
-	 * useless now that the segment has been switched to 4k pages.
-	 */
-htab_inval_old_hpte:
-	mr	r3,r29			/* vpn */
-	mr	r4,r31			/* PTE.pte */
-	li	r5,0			/* PTE.hidx */
-	li	r6,MMU_PAGE_64K		/* psize */
-	ld	r7,STK_PARAM(R9)(r1)	/* ssize */
-	ld	r8,STK_PARAM(R8)(r1)	/* flags */
-	bl	flush_hash_page
-	/* Clear out _PAGE_HPTE_SUB bits in the new linux PTE */
-	lis	r0,_PAGE_HPTE_SUB@h
-	ori	r0,r0,_PAGE_HPTE_SUB@l
-	andc	r30,r30,r0
-	b	htab_insert_pte
-	
-htab_bail_ok:
-	li	r3,0
-	b	htab_bail
-
-htab_pte_insert_ok:
-	/* Insert slot number & secondary bit in PTE second half,
-	 * clear _PAGE_BUSY and set approriate HPTE slot bit
-	 */
-	ld	r6,STK_PARAM(R6)(r1)
-	li	r0,_PAGE_BUSY
-	andc	r30,r30,r0
-	/* HPTE SUB bit */
-	li	r0,1
-	subfic	r5,r25,27		/* Must match bit position in */
-	sld	r0,r0,r5		/* pgtable.h */
-	or	r30,r30,r0
-	/* hindx */
-	sldi	r5,r25,2
-	sld	r3,r3,r5
-	li	r4,0xf
-	sld	r4,r4,r5
-	andc	r26,r26,r4
-	or	r26,r26,r3
-	ori	r5,r6,PTE_PAGE_HIDX_OFFSET
-	std	r26,0(r5)
-	lwsync
-	std	r30,0(r6)
-	li	r3, 0
-htab_bail:
-	ld	r25,STK_REG(R25)(r1)
-	ld	r26,STK_REG(R26)(r1)
-	ld	r27,STK_REG(R27)(r1)
-	ld	r28,STK_REG(R28)(r1)
-	ld	r29,STK_REG(R29)(r1)
-	ld      r30,STK_REG(R30)(r1)
-	ld      r31,STK_REG(R31)(r1)
-	addi    r1,r1,STACKFRAMESIZE
-	ld      r0,16(r1)
-	mtlr    r0
-	blr
-
-htab_modify_pte:
-	/* Keep PP bits in r4 and slot idx from the PTE around in r3 */
-	mr	r4,r3
-	sldi	r5,r25,2
-	srd	r3,r26,r5
-
-	/* Secondary group ? if yes, get a inverted hash value */
-	mr	r5,r28
-	andi.	r0,r3,0x8 /* page secondary ? */
-	beq	1f
-	not	r5,r5
-1:	andi.	r3,r3,0x7 /* extract idx alone */
-
-	/* Calculate proper slot value for ppc_md.hpte_updatepp */
-	and	r0,r5,r27
-	rldicr	r0,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-	add	r3,r0,r3	/* add slot idx */
-
-	/* Call ppc_md.hpte_updatepp */
-	mr	r5,r29			/* vpn */
-	li	r6,MMU_PAGE_4K		/* base page size */
-	li	r7,MMU_PAGE_4K		/* actual page size */
-	ld	r8,STK_PARAM(R9)(r1)	/* segment size */
-	ld	r9,STK_PARAM(R8)(r1)	/* get "flags" param */
-.globl htab_call_hpte_updatepp
-htab_call_hpte_updatepp:
-	bl	.			/* patched by htab_finish_init() */
-
-	/* if we failed because typically the HPTE wasn't really here
-	 * we try an insertion.
-	 */
-	cmpdi	0,r3,-1
-	beq-	htab_insert_pte
-
-	/* Clear the BUSY bit and Write out the PTE */
-	li	r0,_PAGE_BUSY
-	andc	r30,r30,r0
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r30,0(r6)
-	li	r3,0
-	b	htab_bail
-
-htab_wrong_access:
-	/* Bail out clearing reservation */
-	stdcx.	r31,0,r6
-	li	r3,1
-	b	htab_bail
-
-htab_pte_insert_failure:
-	/* Bail out restoring old PTE */
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r31,0(r6)
-	li	r3,-1
-	b	htab_bail
-
-#endif /* CONFIG_PPC_64K_PAGES */
-
-#ifdef CONFIG_PPC_64K_PAGES
-
 /*****************************************************************************
  *                                                                           *
  *           64K SW & 64K HW in a 64K segment pages implementation           *
@@ -994,10 +621,3 @@ ht64_pte_insert_failure:
 
 
 #endif /* CONFIG_PPC_64K_PAGES */
-
-
-/*****************************************************************************
- *                                                                           *
- *           Huge pages implementation is in hugetlbpage.c                   *
- *                                                                           *
- *****************************************************************************/
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 7f9616f7c479..9fcad40c16e9 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -653,7 +653,7 @@ static void __init htab_finish_init(void)
 	patch_branch(ht64_call_hpte_updatepp,
 		ppc_function_entry(ppc_md.hpte_updatepp),
 		BRANCH_SET_LINK);
-#endif /* CONFIG_PPC_64K_PAGES */
+#else /* !CONFIG_PPC_64K_PAGES */
 
 	patch_branch(htab_call_hpte_insert1,
 		ppc_function_entry(ppc_md.hpte_insert),
@@ -667,6 +667,8 @@ static void __init htab_finish_init(void)
 	patch_branch(htab_call_hpte_updatepp,
 		ppc_function_entry(ppc_md.hpte_updatepp),
 		BRANCH_SET_LINK);
+#endif
+
 }
 
 static void __init htab_initialize(void)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 20/31] powerpc/mm: update __real_pte to take address as argument
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (18 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 19/31] powerpc/mm: Convert 4k hash insert to C Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 21/31] powerpc/mm: make pte page hash index slot 8 bits Aneesh Kumar K.V
                   ` (12 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

We will use this in a later patch to compute the right hash index.
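
A sketch of how the extra address argument gets used (taken from the
next patch in this series, shown here only for context):

	/* locate the per-pte hash slot bytes in the second half of the
	 * pte page; indx, pte_headp and the 16-byte stride are what the
	 * following patch introduces
	 */
	unsigned long indx = pte_index(addr);
	pte_t *pte_headp = ptep - indx;
	unsigned char *hidx = (unsigned char *)(pte_headp + PTRS_PER_PTE) + 16 * indx;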

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 2 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 4 ++--
 arch/powerpc/include/asm/nohash/64/pgtable.h  | 4 ++--
 arch/powerpc/mm/hash64_64k.c                  | 2 +-
 arch/powerpc/mm/tlb_hash64.c                  | 2 +-
 5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index ee073822145d..ced5a17a8d1a 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -78,7 +78,7 @@
  * generic accessors and iterators here
  */
 #define __real_pte __real_pte
-static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
+static inline real_pte_t __real_pte(unsigned long addr, pte_t pte, pte_t *ptep)
 {
 	real_pte_t rpte;
 
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f2ace2cac7bb..3117f0495b74 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -44,10 +44,10 @@
 #ifndef __real_pte
 
 #ifdef CONFIG_STRICT_MM_TYPECHECKS
-#define __real_pte(e,p)		((real_pte_t){(e)})
+#define __real_pte(a,e,p)	((real_pte_t){(e)})
 #define __rpte_to_pte(r)	((r).pte)
 #else
-#define __real_pte(e,p)		(e)
+#define __real_pte(a,e,p)	(e)
 #define __rpte_to_pte(r)	(__pte(r))
 #endif
 #define __rpte_to_hidx(r,index)	(pte_val(__rpte_to_pte(r)) >> 12)
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index c24e03f22655..f389f2d6789e 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -115,10 +115,10 @@
 #ifndef __real_pte
 
 #ifdef CONFIG_STRICT_MM_TYPECHECKS
-#define __real_pte(e,p)		((real_pte_t){(e)})
+#define __real_pte(a,e,p)	((real_pte_t){(e)})
 #define __rpte_to_pte(r)	((r).pte)
 #else
-#define __real_pte(e,p)		(e)
+#define __real_pte(a,e,p)	(e)
 #define __rpte_to_pte(r)	(__pte(r))
 #endif
 #define __rpte_to_hidx(r,index)	(pte_val(__rpte_to_pte(r)) >> 12)
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index da133f2a51a7..456aa3bfa8f1 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -90,7 +90,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 
 	subpg_index = (ea & (PAGE_SIZE - 1)) >> shift;
 	vpn  = hpt_vpn(ea, vsid, ssize);
-	rpte = __real_pte(__pte(old_pte), ptep);
+	rpte = __real_pte(ea, __pte(old_pte), ptep);
 	/*
 	 * None of the sub 4k pages is hashed
 	 */
diff --git a/arch/powerpc/mm/tlb_hash64.c b/arch/powerpc/mm/tlb_hash64.c
index f7b80391bee7..dd0fd1783bcc 100644
--- a/arch/powerpc/mm/tlb_hash64.c
+++ b/arch/powerpc/mm/tlb_hash64.c
@@ -89,7 +89,7 @@ void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
 	}
 	WARN_ON(vsid == 0);
 	vpn = hpt_vpn(addr, vsid, ssize);
-	rpte = __real_pte(__pte(pte), ptep);
+	rpte = __real_pte(addr, __pte(pte), ptep);
 
 	/*
 	 * Check if we have an active batch on this CPU. If not, just
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 21/31] powerpc/mm: make pte page hash index slot 8 bits
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (19 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 20/31] powerpc/mm: update __real_pte to take address as argument Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 22/31] powerpc/mm: Don't track subpage valid bit in pte_t Aneesh Kumar K.V
                   ` (11 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Currently we use 4 bits for each slot and pack the information for all
16 slots of a 64K linux page into a 64-bit value. To do this we use 16
bits of pte_t. Move the hash slot valid bits out of pte_t and place
them in the second half of the pte page. We also use 8 bits per slot.
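
For clarity, a sketch of the per-subpage byte this introduces, matching
the format comment and the store/load sites in the diff below:

	/* bit 7        bits 6-4   bit 3    bits 2-0
	 * [secondary]  [hidx]     [valid]  [unused]
	 */
	unsigned char enc = (unsigned char)(slot << 4 | 0x1 << 3);	/* store side */
	bool valid = (enc >> 3) & 0x1;		/* what __rpte_sub_valid() tests */
	unsigned long hidx = enc >> 4;		/* what __rpte_to_hidx() returns */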

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 48 +++++++++++++++------------
 arch/powerpc/include/asm/book3s/64/hash.h     |  5 ---
 arch/powerpc/include/asm/page.h               |  4 +--
 arch/powerpc/mm/hash64_64k.c                  | 34 +++++++++++++++----
 4 files changed, 56 insertions(+), 35 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index ced5a17a8d1a..dafc2f31c843 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -78,33 +78,39 @@
  * generic accessors and iterators here
  */
 #define __real_pte __real_pte
-static inline real_pte_t __real_pte(unsigned long addr, pte_t pte, pte_t *ptep)
-{
-	real_pte_t rpte;
-
-	rpte.pte = pte;
-	rpte.hidx = 0;
-	if (pte_val(pte) & _PAGE_COMBO) {
-		/*
-		 * Make sure we order the hidx load against the _PAGE_COMBO
-		 * check. The store side ordering is done in __hash_page_4K
-		 */
-		smp_rmb();
-		rpte.hidx = pte_val(*((ptep) + PTRS_PER_PTE));
-	}
-	return rpte;
-}
-
+extern real_pte_t __real_pte(unsigned long addr, pte_t pte, pte_t *ptep);
 static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 {
 	if ((pte_val(rpte.pte) & _PAGE_COMBO))
-		return (rpte.hidx >> (index<<2)) & 0xf;
+		return (unsigned long) rpte.hidx[index] >> 4;
 	return (pte_val(rpte.pte) >> 12) & 0xf;
 }
 
-#define __rpte_to_pte(r)	((r).pte)
-#define __rpte_sub_valid(rpte, index) \
-	(pte_val(rpte.pte) & (_PAGE_HPTE_SUB0 >> (index)))
+static inline pte_t __rpte_to_pte(real_pte_t rpte)
+{
+	return rpte.pte;
+}
+/*
+ * We look at the second half of the pte page to determine whether
+ * the sub 4k hpte is valid. We use 8 bits per index, and we have
+ * 16 indices mapping a full 64K page. Hence for each
+ * 64K linux page we use 128 bits from the second half of the pte page.
+ * The encoding in the second half of the page is as below:
+ * [ index 15 ] .........................[index 0]
+ * [bit 127 ..................................bit 0]
+ * format of each index
+ * bit 7 ........ bit0
+ * [one bit secondary][ 3 bit hidx][1 bit valid][000]
+ */
+static inline bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
+{
+	unsigned char index_val = rpte.hidx[index];
+
+	if ((index_val >> 3) & 0x1)
+		return true;
+	return false;
+}
+
 /*
  * Trick: we set __end to va + 64k, which happens to work for
  * a 16M page as well as we want only one iteration
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index c74916b5c657..050d22aaa050 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -214,11 +214,6 @@
 
 #define PMD_BAD_BITS		(PTE_TABLE_SIZE-1)
 #define PUD_BAD_BITS		(PMD_TABLE_SIZE-1)
-/*
- * We save the slot number & secondary bit in the second half of the
- * PTE page. We use the 8 bytes per each pte entry.
- */
-#define PTE_PAGE_HIDX_OFFSET (PTRS_PER_PTE * 8)
 
 #ifndef __ASSEMBLY__
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 9d2f38e1b21d..9c3211eb487c 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -295,7 +295,7 @@ static inline pte_basic_t pte_val(pte_t x)
  * the "second half" part of the PTE for pseudo 64k pages
  */
 #if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC_STD_MMU_64)
-typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
+typedef struct { pte_t pte; unsigned char *hidx; } real_pte_t;
 #else
 typedef struct { pte_t pte; } real_pte_t;
 #endif
@@ -347,7 +347,7 @@ static inline pte_basic_t pte_val(pte_t pte)
 }
 
 #if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_PPC_STD_MMU_64)
-typedef struct { pte_t pte; unsigned long hidx; } real_pte_t;
+typedef struct { pte_t pte; unsigned char *hidx; } real_pte_t;
 #else
 typedef pte_t real_pte_t;
 #endif
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 456aa3bfa8f1..c40ee12cc922 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -16,12 +16,32 @@
 #include <asm/machdep.h>
 #include <asm/mmu.h>
 
+real_pte_t __real_pte(unsigned long addr, pte_t pte, pte_t *ptep)
+{
+	int indx;
+	real_pte_t rpte;
+	pte_t *pte_headp;
+
+	rpte.pte = pte;
+	rpte.hidx = NULL;
+	if (pte_val(pte) & _PAGE_COMBO) {
+		indx = pte_index(addr);
+		pte_headp = ptep - indx;
+		/*
+		 * Make sure we order the hidx load against the _PAGE_COMBO
+		 * check. The store side ordering is done in __hash_page_4K
+		 */
+		smp_rmb();
+		rpte.hidx = (unsigned char *)(pte_headp + PTRS_PER_PTE) + (16 * indx);
+	}
+	return rpte;
+}
+
 int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		   pte_t *ptep, unsigned long trap, unsigned long flags,
 		   int ssize, int subpg_prot)
 {
 	real_pte_t rpte;
-	unsigned long *hidxp;
 	unsigned long hpte_group;
 	unsigned int subpg_index;
 	unsigned long shift = 12; /* 4K */
@@ -90,7 +110,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 
 	subpg_index = (ea & (PAGE_SIZE - 1)) >> shift;
 	vpn  = hpt_vpn(ea, vsid, ssize);
-	rpte = __real_pte(ea, __pte(old_pte), ptep);
+	if (!(old_pte & _PAGE_COMBO))
+		rpte = __real_pte(ea, __pte(old_pte | _PAGE_COMBO), ptep);
+	else
+		rpte = __real_pte(ea, __pte(old_pte), ptep);
 	/*
 	 * None of the sub 4k pages is hashed
 	 */
@@ -188,11 +211,8 @@ repeat:
 	 * Since we have _PAGE_BUSY set on ptep, we can be sure
 	 * nobody is updating hidx.
 	 */
-	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-	/* __real_pte use pte_val() any idea why ? FIXME!! */
-	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
-	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
-	new_pte |= (_PAGE_HPTE_SUB0 >> subpg_index);
+	rpte.hidx[subpg_index] = (unsigned char)(slot << 4 | 0x1 << 3);
+	new_pte |= _PAGE_HPTE_SUB0;
 	/*
 	 * check __real_pte for details on matching smp_rmb()
 	 */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 22/31] powerpc/mm: Don't track subpage valid bit in pte_t
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (20 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 21/31] powerpc/mm: make pte page hash index slot 8 bits Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 23/31] powerpc/mm: Increase the width of #define Aneesh Kumar K.V
                   ` (10 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

This frees up 11 bits in pte_t. In a later patch we also change the
pte_t format so that we can start supporting migration ptes at the
pmd level.
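
The 11-bit accounting, as a worked example (illustrative only; the bit
values are the ones visible in the hunks below):

	/* 64K-page pte bits before this patch (hash-64k.h hunk):
	 *	_PAGE_HPTE_SUB	0x0ffff000	16 bits of per-subpage state
	 * bits of that range still occupied afterwards (hash.h hunk):
	 *	_PAGE_F_GIX	0x07000		3 bits
	 *	_PAGE_F_SECOND	0x08000		1 bit
	 *	_PAGE_SPECIAL	0x10000		1 bit (relocated here)
	 * leaving 16 - 5 = 11 pte bits free for other use.
	 */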

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 10 +--------
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 29 ++-------------------------
 arch/powerpc/include/asm/book3s/64/hash.h     |  4 ++++
 arch/powerpc/mm/hash64_64k.c                  |  4 ++--
 arch/powerpc/mm/hash_low_64.S                 |  6 +-----
 arch/powerpc/mm/hugetlbpage-hash64.c          |  5 +----
 arch/powerpc/mm/pgtable_64.c                  |  2 +-
 7 files changed, 12 insertions(+), 48 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 537eacecf6e9..75e8b9326e4b 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -47,17 +47,9 @@
 /* Bits to mask out from a PGD to get to the PUD page */
 #define PGD_MASKED_BITS		0
 
-/* PTE bits */
-#define _PAGE_HASHPTE	0x0400 /* software: pte has an associated HPTE */
-#define _PAGE_SECONDARY 0x8000 /* software: HPTE is in secondary group */
-#define _PAGE_GROUP_IX  0x7000 /* software: HPTE index within group */
-#define _PAGE_F_SECOND  _PAGE_SECONDARY
-#define _PAGE_F_GIX     _PAGE_GROUP_IX
-#define _PAGE_SPECIAL	0x10000 /* software: special page */
-
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | \
-			 _PAGE_SECONDARY | _PAGE_GROUP_IX)
+			 _PAGE_F_SECOND | _PAGE_F_GIX)
 
 /* shift to put page number into pte */
 #define PTE_RPN_SHIFT	(17)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index dafc2f31c843..b363d73ca225 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -31,33 +31,8 @@
 /* Bits to mask out from a PGD/PUD to get to the PMD page */
 #define PUD_MASKED_BITS		0x1ff
 
-/* Additional PTE bits (don't change without checking asm in hash_low.S) */
-#define _PAGE_SPECIAL	0x00000400 /* software: special page */
-#define _PAGE_HPTE_SUB	0x0ffff000 /* combo only: sub pages HPTE bits */
-#define _PAGE_HPTE_SUB0	0x08000000 /* combo only: first sub page */
-#define _PAGE_COMBO	0x10000000 /* this is a combo 4k page */
-#define _PAGE_4K_PFN	0x20000000 /* PFN is for a single 4k page */
-
-/* For 64K page, we don't have a separate _PAGE_HASHPTE bit. Instead,
- * we set that to be the whole sub-bits mask. The C code will only
- * test this, so a multi-bit mask will work. For combo pages, this
- * is equivalent as effectively, the old _PAGE_HASHPTE was an OR of
- * all the sub bits. For real 64k pages, we now have the assembly set
- * _PAGE_HPTE_SUB0 in addition to setting the HIDX bits which overlap
- * that mask. This is fine as long as the HIDX bits are never set on
- * a PTE that isn't hashed, which is the case today.
- *
- * A little nit is for the huge page C code, which does the hashing
- * in C, we need to provide which bit to use.
- */
-#define _PAGE_HASHPTE	_PAGE_HPTE_SUB
-
-/* Note the full page bits must be in the same location as for normal
- * 4k pages as the same assembly will be used to insert 64K pages
- * whether the kernel has CONFIG_PPC_64K_PAGES or not
- */
-#define _PAGE_F_SECOND  0x00008000 /* full page: hidx bits */
-#define _PAGE_F_GIX     0x00007000 /* full page: hidx bits */
+#define _PAGE_COMBO	0x00020000 /* this is a combo 4k page */
+#define _PAGE_4K_PFN	0x00040000 /* PFN is for a single 4k page */
 
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | _PAGE_COMBO)
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 050d22aaa050..96475b1dbb7d 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -83,7 +83,11 @@
 #define _PAGE_DIRTY		0x0080 /* C: page changed */
 #define _PAGE_ACCESSED		0x0100 /* R: page referenced */
 #define _PAGE_RW		0x0200 /* software: user write access allowed */
+#define _PAGE_HASHPTE		0x0400 /* software: pte has an associated HPTE */
 #define _PAGE_BUSY		0x0800 /* software: PTE & hash are busy */
+#define _PAGE_F_GIX		0x7000 /* full page: hidx bits */
+#define _PAGE_F_SECOND		0x8000 /* Whether to use secondary hash or not */
+#define _PAGE_SPECIAL		0x10000 /* software: special page */
 
 /* No separate kernel read-only */
 #define _PAGE_KERNEL_RW		(_PAGE_RW | _PAGE_DIRTY) /* user access blocked by key */
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index c40ee12cc922..1fabf7c9ecf2 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -125,7 +125,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 */
 	if (!(old_pte & _PAGE_COMBO)) {
 		flush_hash_page(vpn, rpte, MMU_PAGE_64K, ssize, flags);
-		old_pte &= ~_PAGE_HPTE_SUB;
+		old_pte &= ~(_PAGE_HASHPTE | _PAGE_F_GIX | _PAGE_F_SECOND);
 		goto htab_insert_hpte;
 	}
 	/*
@@ -212,7 +212,7 @@ repeat:
 	 * nobody is updating hidx.
 	 */
 	rpte.hidx[subpg_index] = (unsigned char)(slot << 4 | 0x1 << 3);
-	new_pte |= _PAGE_HPTE_SUB0;
+	new_pte |= _PAGE_HASHPTE;
 	/*
 	 * check __real_pte for details on matching smp_rmb()
 	 */
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 6b4d4c1d0628..359839a57f26 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -285,7 +285,7 @@ htab_modify_pte:
 
 	/* Secondary group ? if yes, get a inverted hash value */
 	mr	r5,r28
-	andi.	r0,r31,_PAGE_SECONDARY
+	andi.	r0,r31,_PAGE_F_SECOND
 	beq	1f
 	not	r5,r5
 1:
@@ -473,11 +473,7 @@ ht64_insert_pte:
 	lis	r0,_PAGE_HPTEFLAGS@h
 	ori	r0,r0,_PAGE_HPTEFLAGS@l
 	andc	r30,r30,r0
-#ifdef CONFIG_PPC_64K_PAGES
-	oris	r30,r30,_PAGE_HPTE_SUB0@h
-#else
 	ori	r30,r30,_PAGE_HASHPTE
-#endif
 	/* Phyical address in r5 */
 	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
 	sldi	r5,r5,PAGE_SHIFT
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index d94b1af53a93..7584e8445512 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -91,11 +91,8 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
 
 		/* clear HPTE slot informations in new PTE */
-#ifdef CONFIG_PPC_64K_PAGES
-		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HPTE_SUB0;
-#else
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
-#endif
+
 		/* Add in WIMG bits */
 		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
 				      _PAGE_COHERENT | _PAGE_GUARDED));
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index d692ae31cfc7..3967e3cce03e 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -625,7 +625,7 @@ void pmdp_splitting_flush(struct vm_area_struct *vma,
 	"1:	ldarx	%0,0,%3\n\
 		andi.	%1,%0,%6\n\
 		bne-	1b \n\
-		ori	%1,%0,%4 \n\
+		oris	%1,%0,%4@h \n\
 		stdcx.	%1,0,%3 \n\
 		bne-	1b"
 	: "=&r" (old), "=&r" (tmp), "=m" (*pmdp)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 23/31] powerpc/mm: Increase the width of #define
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (21 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 22/31] powerpc/mm: Don't track subpage valid bit in pte_t Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 24/31] powerpc/mm: Convert __hash_page_64K to C Aneesh Kumar K.V
                   ` (9 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

No real change, only style changes

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 96475b1dbb7d..987076574395 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -71,22 +71,22 @@
  * We could create separate kernel read-only if we used the 3 PP bits
  * combinations that newer processors provide but we currently don't.
  */
-#define _PAGE_PRESENT		0x0001 /* software: pte contains a translation */
-#define _PAGE_USER		0x0002 /* matches one of the PP bits */
+#define _PAGE_PRESENT		0x00001 /* software: pte contains a translation */
+#define _PAGE_USER		0x00002 /* matches one of the PP bits */
 #define _PAGE_BIT_SWAP_TYPE	2
-#define _PAGE_EXEC		0x0004 /* No execute on POWER4 and newer (we invert) */
-#define _PAGE_GUARDED		0x0008
+#define _PAGE_EXEC		0x00004 /* No execute on POWER4 and newer (we invert) */
+#define _PAGE_GUARDED		0x00008
 /* We can derive Memory coherence from _PAGE_NO_CACHE */
 #define _PAGE_COHERENT		0x0
-#define _PAGE_NO_CACHE		0x0020 /* I: cache inhibit */
-#define _PAGE_WRITETHRU		0x0040 /* W: cache write-through */
-#define _PAGE_DIRTY		0x0080 /* C: page changed */
-#define _PAGE_ACCESSED		0x0100 /* R: page referenced */
-#define _PAGE_RW		0x0200 /* software: user write access allowed */
-#define _PAGE_HASHPTE		0x0400 /* software: pte has an associated HPTE */
-#define _PAGE_BUSY		0x0800 /* software: PTE & hash are busy */
-#define _PAGE_F_GIX		0x7000 /* full page: hidx bits */
-#define _PAGE_F_SECOND		0x8000 /* Whether to use secondary hash or not */
+#define _PAGE_NO_CACHE		0x00020 /* I: cache inhibit */
+#define _PAGE_WRITETHRU		0x00040 /* W: cache write-through */
+#define _PAGE_DIRTY		0x00080 /* C: page changed */
+#define _PAGE_ACCESSED		0x00100 /* R: page referenced */
+#define _PAGE_RW		0x00200 /* software: user write access allowed */
+#define _PAGE_HASHPTE		0x00400 /* software: pte has an associated HPTE */
+#define _PAGE_BUSY		0x00800 /* software: PTE & hash are busy */
+#define _PAGE_F_GIX		0x07000 /* full page: hidx bits */
+#define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
 #define _PAGE_SPECIAL		0x10000 /* software: special page */
 
 /* No separate kernel read-only */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 24/31] powerpc/mm: Convert __hash_page_64K to C
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (22 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 23/31] powerpc/mm: Increase the width of #define Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 25/31] powerpc/mm: Convert 4k insert from asm " Aneesh Kumar K.V
                   ` (8 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Convert from asm to C
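
One detail worth calling out in the C version (a reading aid, not part
of the change itself): the slot value returned by ppc_md.hpte_insert()
is assumed to carry the secondary-hash flag in bit 3, which is why a
single masked shift folds both the group index and the secondary bit
into the pte on the insert path, and why the update path only needs
those two fields to find the HPTE again:

	/* insert path, as in the new __hash_page_64K() below */
	new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
	new_pte |= (slot << _PAGE_F_GIX_SHIFT) & (_PAGE_F_SECOND | _PAGE_F_GIX);

	/* update path */
	if (old_pte & _PAGE_F_SECOND)
		hash = ~hash;
	slot  = (hash & htab_hash_mask) * HPTES_PER_GROUP;
	slot += (old_pte & _PAGE_F_GIX) >> _PAGE_F_GIX_SHIFT;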

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   3 +-
 arch/powerpc/include/asm/book3s/64/hash.h     |   1 +
 arch/powerpc/mm/hash64_64k.c                  | 134 +++++++++++-
 arch/powerpc/mm/hash_low_64.S                 | 290 +-------------------------
 arch/powerpc/mm/hash_utils_64.c               |  19 +-
 5 files changed, 137 insertions(+), 310 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index b363d73ca225..f46fbd6cd837 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -35,7 +35,8 @@
 #define _PAGE_4K_PFN	0x00040000 /* PFN is for a single 4k page */
 
 /* PTE flags to conserve for HPTE identification */
-#define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_HASHPTE | _PAGE_COMBO)
+#define _PAGE_HPTEFLAGS (_PAGE_BUSY | _PAGE_F_SECOND | \
+			 _PAGE_F_GIX | _PAGE_HASHPTE | _PAGE_COMBO)
 
 /* Shift to put page number into pte.
  *
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 987076574395..fb6de2954c4d 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -86,6 +86,7 @@
 #define _PAGE_HASHPTE		0x00400 /* software: pte has an associated HPTE */
 #define _PAGE_BUSY		0x00800 /* software: PTE & hash are busy */
 #define _PAGE_F_GIX		0x07000 /* full page: hidx bits */
+#define _PAGE_F_GIX_SHIFT	12
 #define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
 #define _PAGE_SPECIAL		0x10000 /* software: special page */
 
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 1fabf7c9ecf2..d6a98ef374f3 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -44,10 +44,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	real_pte_t rpte;
 	unsigned long hpte_group;
 	unsigned int subpg_index;
-	unsigned long shift = 12; /* 4K */
 	unsigned long rflags, pa, hidx;
 	unsigned long old_pte, new_pte, subpg_pte;
 	unsigned long vpn, hash, slot;
+	unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift;
 
 	/*
 	 * atomically mark the linux large page PTE busy and dirty
@@ -212,7 +212,7 @@ repeat:
 	 * nobody is undating hidx.
 	 */
 	rpte.hidx[subpg_index] = (unsigned char)(slot << 4 | 0x1 << 3);
-	new_pte |= _PAGE_HASHPTE;
+	new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE | _PAGE_COMBO;
 	/*
 	 * check __real_pte for details on matching smp_rmb()
 	 */
@@ -220,3 +220,133 @@ repeat:
 	*ptep = __pte(new_pte & ~_PAGE_BUSY);
 	return 0;
 }
+
+int __hash_page_64K(unsigned long ea, unsigned long access,
+		    unsigned long vsid, pte_t *ptep, unsigned long trap,
+		    unsigned long flags, int ssize)
+{
+
+	unsigned long hpte_group;
+	unsigned long rflags, pa;
+	unsigned long old_pte, new_pte;
+	unsigned long vpn, hash, slot;
+	unsigned long shift = mmu_psize_defs[MMU_PAGE_64K].shift;
+
+	/*
+	 * atomically mark the linux large page PTE busy and dirty
+	 */
+	do {
+		pte_t pte = READ_ONCE(*ptep);
+
+		old_pte = pte_val(pte);
+		/* If PTE busy, retry the access */
+		if (unlikely(old_pte & _PAGE_BUSY))
+			return 0;
+		/* If PTE permissions don't match, take page fault */
+		if (unlikely(access & ~old_pte))
+			return 1;
+		/*
+		 * Check if PTE has the cache-inhibit bit set
+		 * If so, bail out and refault as a 4k page
+		 */
+		if (!mmu_has_feature(MMU_FTR_CI_LARGE_PAGE) &&
+		    unlikely(old_pte & _PAGE_NO_CACHE))
+			return 0;
+		/*
+		 * Try to lock the PTE, add ACCESSED and DIRTY if it was
+		 * a write access. This is the full 64K page insert, so
+		 * there is no _PAGE_COMBO to add here.
+		 */
+		new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED;
+		if (access & _PAGE_RW)
+			new_pte |= _PAGE_DIRTY;
+	} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
+					  old_pte, new_pte));
+	/*
+	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
+	 * need to add in 0x1 if it's a read-only user page
+	 */
+	rflags = new_pte & _PAGE_USER;
+	if ((new_pte & _PAGE_USER) && !((new_pte & _PAGE_RW) &&
+					(new_pte & _PAGE_DIRTY)))
+		rflags |= 0x1;
+	/*
+	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
+	 */
+	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
+	/*
+	 * Always add C and Memory coherence bit
+	 */
+	rflags |= HPTE_R_C | HPTE_R_M;
+	/*
+	 * Add in WIMG bits
+	 */
+	rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
+				_PAGE_COHERENT | _PAGE_GUARDED));
+
+	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
+	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
+		rflags = hash_page_do_lazy_icache(rflags, __pte(old_pte), trap);
+
+	vpn  = hpt_vpn(ea, vsid, ssize);
+	if (unlikely(old_pte & _PAGE_HASHPTE)) {
+		/*
+		 * There MIGHT be an HPTE for this pte
+		 */
+		hash = hpt_hash(vpn, shift, ssize);
+		if (old_pte & _PAGE_F_SECOND)
+			hash = ~hash;
+		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+		slot += (old_pte & _PAGE_F_GIX) >> _PAGE_F_GIX_SHIFT;
+
+		if (ppc_md.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,
+					 MMU_PAGE_64K, ssize, flags) == -1)
+			old_pte &= ~_PAGE_HPTEFLAGS;
+	}
+
+	if (likely(!(old_pte & _PAGE_HASHPTE))) {
+
+		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
+		hash = hpt_hash(vpn, shift, ssize);
+
+repeat:
+		hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+
+		/* Insert into the hash table, primary slot */
+		slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
+				  MMU_PAGE_64K, MMU_PAGE_64K, ssize);
+		/*
+		 * Primary is full, try the secondary
+		 */
+		if (unlikely(slot == -1)) {
+			hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+			slot = ppc_md.hpte_insert(hpte_group, vpn, pa,
+						  rflags, HPTE_V_SECONDARY,
+						  MMU_PAGE_64K, MMU_PAGE_64K, ssize);
+			if (slot == -1) {
+				if (mftb() & 0x1)
+					hpte_group = ((hash & htab_hash_mask) *
+						      HPTES_PER_GROUP) & ~0x7UL;
+				ppc_md.hpte_remove(hpte_group);
+				/*
+				 * FIXME!! Should we try the group from which we removed?
+				 */
+				goto repeat;
+			}
+		}
+		/*
+		 * Hypervisor failure. Restore old pmd and return -1
+		 * similar to __hash_page_*
+		 */
+		if (unlikely(slot == -2)) {
+			*ptep = __pte(old_pte);
+			hash_failure_debug(ea, access, vsid, trap, ssize,
+					   MMU_PAGE_64K, MMU_PAGE_64K, old_pte);
+			return -1;
+		}
+		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
+		new_pte |= (slot << _PAGE_F_GIX_SHIFT) & (_PAGE_F_SECOND | _PAGE_F_GIX);
+	}
+	*ptep = __pte(new_pte & ~_PAGE_BUSY);
+	return 0;
+}
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
index 359839a57f26..f7d49cf0ccb7 100644
--- a/arch/powerpc/mm/hash_low_64.S
+++ b/arch/powerpc/mm/hash_low_64.S
@@ -328,292 +328,4 @@ htab_pte_insert_failure:
 	li	r3,-1
 	b	htab_bail
 
-#else /* CONFIG_PPC_64K_PAGES */
-
-/*****************************************************************************
- *                                                                           *
- *           64K SW & 64K HW in a 64K segment pages implementation           *
- *                                                                           *
- *****************************************************************************/
-
-_GLOBAL(__hash_page_64K)
-	mflr	r0
-	std	r0,16(r1)
-	stdu	r1,-STACKFRAMESIZE(r1)
-	/* Save all params that we need after a function call */
-	std	r6,STK_PARAM(R6)(r1)
-	std	r8,STK_PARAM(R8)(r1)
-	std	r9,STK_PARAM(R9)(r1)
-
-	/* Save non-volatile registers.
-	 * r31 will hold "old PTE"
-	 * r30 is "new PTE"
-	 * r29 is vpn
-	 * r28 is a hash value
-	 * r27 is hashtab mask (maybe dynamic patched instead ?)
-	 */
-	std	r27,STK_REG(R27)(r1)
-	std	r28,STK_REG(R28)(r1)
-	std	r29,STK_REG(R29)(r1)
-	std	r30,STK_REG(R30)(r1)
-	std	r31,STK_REG(R31)(r1)
-
-	/* Step 1:
-	 *
-	 * Check permissions, atomically mark the linux PTE busy
-	 * and hashed.
-	 */
-1:
-	ldarx	r31,0,r6
-	/* Check access rights (access & ~(pte_val(*ptep))) */
-	andc.	r0,r4,r31
-	bne-	ht64_wrong_access
-	/* Check if PTE is busy */
-	andi.	r0,r31,_PAGE_BUSY
-	/* If so, just bail out and refault if needed. Someone else
-	 * is changing this PTE anyway and might hash it.
-	 */
-	bne-	ht64_bail_ok
-BEGIN_FTR_SECTION
-	/* Check if PTE has the cache-inhibit bit set */
-	andi.	r0,r31,_PAGE_NO_CACHE
-	/* If so, bail out and refault as a 4k page */
-	bne-	ht64_bail_ok
-END_MMU_FTR_SECTION_IFCLR(MMU_FTR_CI_LARGE_PAGE)
-	/* Prepare new PTE value (turn access RW into DIRTY, then
-	 * add BUSY and ACCESSED)
-	 */
-	rlwinm	r30,r4,32-9+7,31-7,31-7	/* _PAGE_RW -> _PAGE_DIRTY */
-	or	r30,r30,r31
-	ori	r30,r30,_PAGE_BUSY | _PAGE_ACCESSED
-	/* Write the linux PTE atomically (setting busy) */
-	stdcx.	r30,0,r6
-	bne-	1b
-	isync
-
-	/* Step 2:
-	 *
-	 * Insert/Update the HPTE in the hash table. At this point,
-	 * r4 (access) is re-useable, we use it for the new HPTE flags
-	 */
-
-BEGIN_FTR_SECTION
-	cmpdi	r9,0			/* check segment size */
-	bne	3f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_1T_SEGMENT)
-	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT - VPN_SHIFT
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT - VPN_SHIFT)
-	or	r29,r28,r29
-
-	/* Calculate hash value for primary slot and store it in r28
-	 * r3 = va, r5 = vsid
-	 * r0 = (va >> 16) & ((1ul << (28 - 16)) -1)
-	 */
-	rldicl	r0,r3,64-16,52
-	xor	r28,r5,r0		/* hash */
-	b	4f
-
-3:	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT_1T - VPN_SHIFT
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT_1T - VPN_SHIFT)
-	or	r29,r28,r29
-	/*
-	 * calculate hash value for primary slot and
-	 * store it in r28 for 1T segment
-	 * r3 = va, r5 = vsid
-	 */
-	sldi	r28,r5,25		/* vsid << 25 */
-	/* r0 = (va >> 16) & ((1ul << (40 - 16)) -1) */
-	rldicl	r0,r3,64-16,40
-	xor	r28,r28,r5		/* vsid ^ ( vsid << 25) */
-	xor	r28,r28,r0		/* hash */
-
-	/* Convert linux PTE bits into HW equivalents */
-4:	andi.	r3,r30,0x1fe		/* Get basic set of flags */
-	xori	r3,r3,HPTE_R_N		/* _PAGE_EXEC -> NOEXEC */
-	rlwinm	r0,r30,32-9+1,30,30	/* _PAGE_RW -> _PAGE_USER (r0) */
-	rlwinm	r4,r30,32-7+1,30,30	/* _PAGE_DIRTY -> _PAGE_USER (r4) */
-	and	r0,r0,r4		/* _PAGE_RW & _PAGE_DIRTY ->r0 bit 30*/
-	andc	r0,r30,r0		/* r0 = pte & ~r0 */
-	rlwimi	r3,r0,32-1,31,31	/* Insert result into PP lsb */
-	/*
-	 * Always add "C" bit for perf. Memory coherence is always enabled
-	 */
-	ori	r3,r3,HPTE_R_C | HPTE_R_M
-
-	/* We eventually do the icache sync here (maybe inline that
-	 * code rather than call a C function...)
-	 */
-BEGIN_FTR_SECTION
-	mr	r4,r30
-	mr	r5,r7
-	bl	hash_page_do_lazy_icache
-END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)
-
-	/* At this point, r3 contains new PP bits, save them in
-	 * place of "access" in the param area (sic)
-	 */
-	std	r3,STK_PARAM(R4)(r1)
-
-	/* Get htab_hash_mask */
-	ld	r4,htab_hash_mask@got(2)
-	ld	r27,0(r4)	/* htab_hash_mask -> r27 */
-
-	/* Check if we may already be in the hashtable, in this case, we
-	 * go to out-of-line code to try to modify the HPTE
-	 */
-	rldicl.	r0,r31,64-12,48
-	bne	ht64_modify_pte
-
-ht64_insert_pte:
-	/* Clear hpte bits in new pte (we also clear BUSY btw) and
-	 * add _PAGE_HPTE_SUB0
-	 */
-	lis	r0,_PAGE_HPTEFLAGS@h
-	ori	r0,r0,_PAGE_HPTEFLAGS@l
-	andc	r30,r30,r0
-	ori	r30,r30,_PAGE_HASHPTE
-	/* Phyical address in r5 */
-	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
-	sldi	r5,r5,PAGE_SHIFT
-
-	/* Calculate primary group hash */
-	and	r0,r28,r27
-	rldicr	r3,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,0			/* !bolted, !secondary */
-	li	r8,MMU_PAGE_64K
-	li	r9,MMU_PAGE_64K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl ht64_call_hpte_insert1
-ht64_call_hpte_insert1:
-	bl	.			/* patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge	ht64_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	ht64_pte_insert_failure
-
-	/* Now try secondary slot */
-
-	/* Phyical address in r5 */
-	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
-	sldi	r5,r5,PAGE_SHIFT
-
-	/* Calculate secondary group hash */
-	andc	r0,r27,r28
-	rldicr	r3,r0,3,63-3	/* r0 = (~hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,HPTE_V_SECONDARY	/* !bolted, secondary */
-	li	r8,MMU_PAGE_64K
-	li	r9,MMU_PAGE_64K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl ht64_call_hpte_insert2
-ht64_call_hpte_insert2:
-	bl	.			/* patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge+	ht64_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	ht64_pte_insert_failure
-
-	/* Both are full, we need to evict something */
-	mftb	r0
-	/* Pick a random group based on TB */
-	andi.	r0,r0,1
-	mr	r5,r28
-	bne	2f
-	not	r5,r5
-2:	and	r0,r5,r27
-	rldicr	r3,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-	/* Call ppc_md.hpte_remove */
-.globl ht64_call_hpte_remove
-ht64_call_hpte_remove:
-	bl	.			/* patched by htab_finish_init() */
-
-	/* Try all again */
-	b	ht64_insert_pte
-
-ht64_bail_ok:
-	li	r3,0
-	b	ht64_bail
-
-ht64_pte_insert_ok:
-	/* Insert slot number & secondary bit in PTE */
-	rldimi	r30,r3,12,63-15
-
-	/* Write out the PTE with a normal write
-	 * (maybe add eieio may be good still ?)
-	 */
-ht64_write_out_pte:
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r30,0(r6)
-	li	r3, 0
-ht64_bail:
-	ld	r27,STK_REG(R27)(r1)
-	ld	r28,STK_REG(R28)(r1)
-	ld	r29,STK_REG(R29)(r1)
-	ld      r30,STK_REG(R30)(r1)
-	ld      r31,STK_REG(R31)(r1)
-	addi    r1,r1,STACKFRAMESIZE
-	ld      r0,16(r1)
-	mtlr    r0
-	blr
-
-ht64_modify_pte:
-	/* Keep PP bits in r4 and slot idx from the PTE around in r3 */
-	mr	r4,r3
-	rlwinm	r3,r31,32-12,29,31
-
-	/* Secondary group ? if yes, get a inverted hash value */
-	mr	r5,r28
-	andi.	r0,r31,_PAGE_F_SECOND
-	beq	1f
-	not	r5,r5
-1:
-	/* Calculate proper slot value for ppc_md.hpte_updatepp */
-	and	r0,r5,r27
-	rldicr	r0,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-	add	r3,r0,r3	/* add slot idx */
-
-	/* Call ppc_md.hpte_updatepp */
-	mr	r5,r29			/* vpn */
-	li	r6,MMU_PAGE_64K		/* base page size */
-	li	r7,MMU_PAGE_64K		/* actual page size */
-	ld	r8,STK_PARAM(R9)(r1)	/* segment size */
-	ld	r9,STK_PARAM(R8)(r1)	/* get "flags" param */
-.globl ht64_call_hpte_updatepp
-ht64_call_hpte_updatepp:
-	bl	.			/* patched by htab_finish_init() */
-
-	/* if we failed because typically the HPTE wasn't really here
-	 * we try an insertion.
-	 */
-	cmpdi	0,r3,-1
-	beq-	ht64_insert_pte
-
-	/* Clear the BUSY bit and Write out the PTE */
-	li	r0,_PAGE_BUSY
-	andc	r30,r30,r0
-	b	ht64_write_out_pte
-
-ht64_wrong_access:
-	/* Bail out clearing reservation */
-	stdcx.	r31,0,r6
-	li	r3,1
-	b	ht64_bail
-
-ht64_pte_insert_failure:
-	/* Bail out restoring old PTE */
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r31,0(r6)
-	li	r3,-1
-	b	ht64_bail
-
-
-#endif /* CONFIG_PPC_64K_PAGES */
+#endif
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 9fcad40c16e9..d890580a4c87 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -633,28 +633,11 @@ extern u32 htab_call_hpte_insert1[];
 extern u32 htab_call_hpte_insert2[];
 extern u32 htab_call_hpte_remove[];
 extern u32 htab_call_hpte_updatepp[];
-extern u32 ht64_call_hpte_insert1[];
-extern u32 ht64_call_hpte_insert2[];
-extern u32 ht64_call_hpte_remove[];
-extern u32 ht64_call_hpte_updatepp[];
 
 static void __init htab_finish_init(void)
 {
-#ifdef CONFIG_PPC_64K_PAGES
-	patch_branch(ht64_call_hpte_insert1,
-		ppc_function_entry(ppc_md.hpte_insert),
-		BRANCH_SET_LINK);
-	patch_branch(ht64_call_hpte_insert2,
-		ppc_function_entry(ppc_md.hpte_insert),
-		BRANCH_SET_LINK);
-	patch_branch(ht64_call_hpte_remove,
-		ppc_function_entry(ppc_md.hpte_remove),
-		BRANCH_SET_LINK);
-	patch_branch(ht64_call_hpte_updatepp,
-		ppc_function_entry(ppc_md.hpte_updatepp),
-		BRANCH_SET_LINK);
-#else /* !CONFIG_PPC_64K_PAGES */
 
+#ifdef CONFIG_PPC_4K_PAGES
 	patch_branch(htab_call_hpte_insert1,
 		ppc_function_entry(ppc_md.hpte_insert),
 		BRANCH_SET_LINK);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 25/31] powerpc/mm: Convert 4k insert from asm to C
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (23 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 24/31] powerpc/mm: Convert __hash_page_64K to C Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 26/31] powerpc/mm: Remove the dependency on pte bit position in asm code Aneesh Kumar K.V
                   ` (7 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

This is similar to the 64K insert. Maybe we want to consolidate the two paths later.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/Makefile        |   6 +-
 arch/powerpc/mm/hash64_4k.c     | 139 +++++++++++++++++
 arch/powerpc/mm/hash_low_64.S   | 331 ----------------------------------------
 arch/powerpc/mm/hash_utils_64.c |  26 ----
 4 files changed, 142 insertions(+), 360 deletions(-)
 create mode 100644 arch/powerpc/mm/hash64_4k.c
 delete mode 100644 arch/powerpc/mm/hash_low_64.S
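
The assembly's ldarx/stdcx. sequence becomes a cmpxchg loop on the pte in C.
A stand-alone, user-space sketch of that lock-and-retry pattern (the flag
values, helper name and return codes below are illustrative, not the
kernel's definitions):

#include <stdint.h>
#include <stdio.h>

#define PTE_BUSY     0x00800ULL
#define PTE_ACCESSED 0x00100ULL
#define PTE_DIRTY    0x00080ULL
#define PTE_RW       0x00200ULL

/*
 * Read the pte, bail out if it is already busy or if the access is not
 * permitted, otherwise atomically install a copy with BUSY and ACCESSED
 * set (plus DIRTY for a write).  Returns 0 for "busy, just refault",
 * 1 for "permission fault", -1 for "locked, go on and insert".
 */
static int lock_pte(uint64_t *ptep, uint64_t access)
{
	uint64_t old_pte, new_pte;

	do {
		old_pte = __atomic_load_n(ptep, __ATOMIC_RELAXED);
		if (old_pte & PTE_BUSY)		/* someone else is hashing it */
			return 0;
		if (access & ~old_pte)		/* permissions do not match */
			return 1;
		new_pte = old_pte | PTE_BUSY | PTE_ACCESSED;
		if (access & PTE_RW)
			new_pte |= PTE_DIRTY;
	} while (!__atomic_compare_exchange_n(ptep, &old_pte, new_pte, 0,
					      __ATOMIC_ACQUIRE, __ATOMIC_RELAXED));
	return -1;
}

int main(void)
{
	uint64_t pte = PTE_RW;		/* writable, not yet hashed */

	printf("lock_pte returned %d, pte now %#llx\n",
	       lock_pte(&pte, PTE_RW), (unsigned long long)pte);
	return 0;
}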

diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index f80ad1a76cc8..1ffeda85c086 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -14,11 +14,11 @@ obj-$(CONFIG_PPC_MMU_NOHASH)	+= mmu_context_nohash.o tlb_nohash.o \
 obj-$(CONFIG_PPC_BOOK3E)	+= tlb_low_$(CONFIG_WORD_SIZE)e.o
 hash64-$(CONFIG_PPC_NATIVE)	:= hash_native_64.o
 obj-$(CONFIG_PPC_STD_MMU_64)	+= hash_utils_64.o slb_low.o slb.o $(hash64-y)
-obj-$(CONFIG_PPC_STD_MMU_32)	+= ppc_mmu_32.o
-obj-$(CONFIG_PPC_STD_MMU)	+= hash_low_$(CONFIG_WORD_SIZE).o \
-				   tlb_hash$(CONFIG_WORD_SIZE).o \
+obj-$(CONFIG_PPC_STD_MMU_32)	+= ppc_mmu_32.o hash_low_32.o
+obj-$(CONFIG_PPC_STD_MMU)	+= tlb_hash$(CONFIG_WORD_SIZE).o \
 				   mmu_context_hash$(CONFIG_WORD_SIZE).o
 ifeq ($(CONFIG_PPC_STD_MMU_64),y)
+obj-$(CONFIG_PPC_4K_PAGES)	+= hash64_4k.o
 obj-$(CONFIG_PPC_64K_PAGES)	+= hash64_64k.o
 endif
 obj-$(CONFIG_PPC_ICSWX)		+= icswx.o
diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
new file mode 100644
index 000000000000..3b49c6f18741
--- /dev/null
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -0,0 +1,139 @@
+/*
+ * Copyright IBM Corporation, 2015
+ * Author Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#include <linux/mm.h>
+#include <asm/machdep.h>
+#include <asm/mmu.h>
+
+int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
+		   pte_t *ptep, unsigned long trap, unsigned long flags,
+		   int ssize, int subpg_prot)
+{
+	unsigned long hpte_group;
+	unsigned long rflags, pa;
+	unsigned long old_pte, new_pte;
+	unsigned long vpn, hash, slot;
+	unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift;
+
+	/*
+	 * atomically mark the linux large page PTE busy and dirty
+	 */
+	do {
+		pte_t pte = READ_ONCE(*ptep);
+
+		old_pte = pte_val(pte);
+		/* If PTE busy, retry the access */
+		if (unlikely(old_pte & _PAGE_BUSY))
+			return 0;
+		/* If PTE permissions don't match, take page fault */
+		if (unlikely(access & ~old_pte))
+			return 1;
+		/*
+		 * Try to lock the PTE, add ACCESSED and DIRTY if it was
+		 * a write access. Since this is 4K insert of 64K page size
+		 * also add _PAGE_COMBO
+		 */
+		new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED | _PAGE_HASHPTE;
+		if (access & _PAGE_RW)
+			new_pte |= _PAGE_DIRTY;
+	} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
+					  old_pte, new_pte));
+	/*
+	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
+	 * need to add in 0x1 if it's a read-only user page
+	 */
+	rflags = new_pte & _PAGE_USER;
+	if ((new_pte & _PAGE_USER) && !((new_pte & _PAGE_RW) &&
+					(new_pte & _PAGE_DIRTY)))
+		rflags |= 0x1;
+	/*
+	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
+	 */
+	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
+	/*
+	 * Always add C and Memory coherence bit
+	 */
+	rflags |= HPTE_R_C | HPTE_R_M;
+	/*
+	 * Add in WIMG bits
+	 */
+	rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
+				_PAGE_COHERENT | _PAGE_GUARDED));
+
+	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
+	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
+		rflags = hash_page_do_lazy_icache(rflags, __pte(old_pte), trap);
+
+	vpn  = hpt_vpn(ea, vsid, ssize);
+	if (unlikely(old_pte & _PAGE_HASHPTE)) {
+		/*
+		 * There MIGHT be an HPTE for this pte
+		 */
+		hash = hpt_hash(vpn, shift, ssize);
+		if (old_pte & _PAGE_F_SECOND)
+			hash = ~hash;
+		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+		slot += (old_pte & _PAGE_F_GIX) >> _PAGE_F_GIX_SHIFT;
+
+		if (ppc_md.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_4K,
+					 MMU_PAGE_4K, ssize, flags) == -1)
+			old_pte &= ~_PAGE_HPTEFLAGS;
+	}
+
+	if (likely(!(old_pte & _PAGE_HASHPTE))) {
+
+		pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
+		hash = hpt_hash(vpn, shift, ssize);
+
+repeat:
+		hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+
+		/* Insert into the hash table, primary slot */
+		slot = ppc_md.hpte_insert(hpte_group, vpn, pa, rflags, 0,
+				  MMU_PAGE_4K, MMU_PAGE_4K, ssize);
+		/*
+		 * Primary is full, try the secondary
+		 */
+		if (unlikely(slot == -1)) {
+			hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
+			slot = ppc_md.hpte_insert(hpte_group, vpn, pa,
+						  rflags, HPTE_V_SECONDARY,
+						  MMU_PAGE_4K, MMU_PAGE_4K, ssize);
+			if (slot == -1) {
+				if (mftb() & 0x1)
+					hpte_group = ((hash & htab_hash_mask) *
+						      HPTES_PER_GROUP) & ~0x7UL;
+				ppc_md.hpte_remove(hpte_group);
+				/*
+				 * FIXME!! Should be try the group from which we removed ?
+				 */
+				goto repeat;
+			}
+		}
+		/*
+		 * Hypervisor failure. Restore old pmd and return -1
+		 * similar to __hash_page_*
+		 */
+		if (unlikely(slot == -2)) {
+			*ptep = __pte(old_pte);
+			hash_failure_debug(ea, access, vsid, trap, ssize,
+					   MMU_PAGE_4K, MMU_PAGE_4K, old_pte);
+			return -1;
+		}
+		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
+		new_pte |= (slot << _PAGE_F_GIX_SHIFT) & (_PAGE_F_SECOND | _PAGE_F_GIX);
+	}
+	*ptep = __pte(new_pte & ~_PAGE_BUSY);
+	return 0;
+}
diff --git a/arch/powerpc/mm/hash_low_64.S b/arch/powerpc/mm/hash_low_64.S
deleted file mode 100644
index f7d49cf0ccb7..000000000000
--- a/arch/powerpc/mm/hash_low_64.S
+++ /dev/null
@@ -1,331 +0,0 @@
-/*
- * ppc64 MMU hashtable management routines
- *
- * (c) Copyright IBM Corp. 2003, 2005
- *
- * Maintained by: Benjamin Herrenschmidt
- *                <benh@kernel.crashing.org>
- *
- * This file is covered by the GNU Public Licence v2 as
- * described in the kernel's COPYING file.
- */
-
-#include <asm/reg.h>
-#include <asm/pgtable.h>
-#include <asm/mmu.h>
-#include <asm/page.h>
-#include <asm/types.h>
-#include <asm/ppc_asm.h>
-#include <asm/asm-offsets.h>
-#include <asm/cputable.h>
-
-	.text
-
-/*
- * Stackframe:
- *		
- *         +-> Back chain			(SP + 256)
- *         |   General register save area	(SP + 112)
- *         |   Parameter save area		(SP + 48)
- *         |   TOC save area			(SP + 40)
- *         |   link editor doubleword		(SP + 32)
- *         |   compiler doubleword		(SP + 24)
- *         |   LR save area			(SP + 16)
- *         |   CR save area			(SP + 8)
- * SP ---> +-- Back chain			(SP + 0)
- */
-
-#ifndef CONFIG_PPC_64K_PAGES
-
-/*****************************************************************************
- *                                                                           *
- *           4K SW & 4K HW pages implementation                              *
- *                                                                           *
- *****************************************************************************/
-
-
-/*
- * _hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
- *		 pte_t *ptep, unsigned long trap, unsigned long flags,
- *		 int ssize)
- *
- * Adds a 4K page to the hash table in a segment of 4K pages only
- */
-
-_GLOBAL(__hash_page_4K)
-	mflr	r0
-	std	r0,16(r1)
-	stdu	r1,-STACKFRAMESIZE(r1)
-	/* Save all params that we need after a function call */
-	std	r6,STK_PARAM(R6)(r1)
-	std	r8,STK_PARAM(R8)(r1)
-	std	r9,STK_PARAM(R9)(r1)
-	
-	/* Save non-volatile registers.
-	 * r31 will hold "old PTE"
-	 * r30 is "new PTE"
-	 * r29 is vpn
-	 * r28 is a hash value
-	 * r27 is hashtab mask (maybe dynamic patched instead ?)
-	 */
-	std	r27,STK_REG(R27)(r1)
-	std	r28,STK_REG(R28)(r1)
-	std	r29,STK_REG(R29)(r1)
-	std	r30,STK_REG(R30)(r1)
-	std	r31,STK_REG(R31)(r1)
-	
-	/* Step 1:
-	 *
-	 * Check permissions, atomically mark the linux PTE busy
-	 * and hashed.
-	 */ 
-1:
-	ldarx	r31,0,r6
-	/* Check access rights (access & ~(pte_val(*ptep))) */
-	andc.	r0,r4,r31
-	bne-	htab_wrong_access
-	/* Check if PTE is busy */
-	andi.	r0,r31,_PAGE_BUSY
-	/* If so, just bail out and refault if needed. Someone else
-	 * is changing this PTE anyway and might hash it.
-	 */
-	bne-	htab_bail_ok
-
-	/* Prepare new PTE value (turn access RW into DIRTY, then
-	 * add BUSY,HASHPTE and ACCESSED)
-	 */
-	rlwinm	r30,r4,32-9+7,31-7,31-7	/* _PAGE_RW -> _PAGE_DIRTY */
-	or	r30,r30,r31
-	ori	r30,r30,_PAGE_BUSY | _PAGE_ACCESSED | _PAGE_HASHPTE
-	/* Write the linux PTE atomically (setting busy) */
-	stdcx.	r30,0,r6
-	bne-	1b
-	isync
-
-	/* Step 2:
-	 *
-	 * Insert/Update the HPTE in the hash table. At this point,
-	 * r4 (access) is re-useable, we use it for the new HPTE flags
-	 */
-
-BEGIN_FTR_SECTION
-	cmpdi	r9,0			/* check segment size */
-	bne	3f
-END_MMU_FTR_SECTION_IFSET(MMU_FTR_1T_SEGMENT)
-	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT - VPN_SHIFT
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT - VPN_SHIFT)
-	or	r29,r28,r29
-	/*
-	 * Calculate hash value for primary slot and store it in r28
-	 * r3 = va, r5 = vsid
-	 * r0 = (va >> 12) & ((1ul << (28 - 12)) -1)
-	 */
-	rldicl	r0,r3,64-12,48
-	xor	r28,r5,r0		/* hash */
-	b	4f
-
-3:	/* Calc vpn and put it in r29 */
-	sldi	r29,r5,SID_SHIFT_1T - VPN_SHIFT
-	rldicl  r28,r3,64 - VPN_SHIFT,64 - (SID_SHIFT_1T - VPN_SHIFT)
-	or	r29,r28,r29
-
-	/*
-	 * calculate hash value for primary slot and
-	 * store it in r28 for 1T segment
-	 * r3 = va, r5 = vsid
-	 */
-	sldi	r28,r5,25		/* vsid << 25 */
-	/* r0 =  (va >> 12) & ((1ul << (40 - 12)) -1) */
-	rldicl	r0,r3,64-12,36
-	xor	r28,r28,r5		/* vsid ^ ( vsid << 25) */
-	xor	r28,r28,r0		/* hash */
-
-	/* Convert linux PTE bits into HW equivalents */
-4:	andi.	r3,r30,0x1fe		/* Get basic set of flags */
-	xori	r3,r3,HPTE_R_N		/* _PAGE_EXEC -> NOEXEC */
-	rlwinm	r0,r30,32-9+1,30,30	/* _PAGE_RW -> _PAGE_USER (r0) */
-	rlwinm	r4,r30,32-7+1,30,30	/* _PAGE_DIRTY -> _PAGE_USER (r4) */
-	and	r0,r0,r4		/* _PAGE_RW & _PAGE_DIRTY ->r0 bit 30*/
-	andc	r0,r30,r0		/* r0 = pte & ~r0 */
-	rlwimi	r3,r0,32-1,31,31	/* Insert result into PP lsb */
-	/*
-	 * Always add "C" bit for perf. Memory coherence is always enabled
-	 */
-	ori	r3,r3,HPTE_R_C | HPTE_R_M
-
-	/* We eventually do the icache sync here (maybe inline that
-	 * code rather than call a C function...) 
-	 */
-BEGIN_FTR_SECTION
-	mr	r4,r30
-	mr	r5,r7
-	bl	hash_page_do_lazy_icache
-END_FTR_SECTION(CPU_FTR_NOEXECUTE|CPU_FTR_COHERENT_ICACHE, CPU_FTR_NOEXECUTE)
-
-	/* At this point, r3 contains new PP bits, save them in
-	 * place of "access" in the param area (sic)
-	 */
-	std	r3,STK_PARAM(R4)(r1)
-
-	/* Get htab_hash_mask */
-	ld	r4,htab_hash_mask@got(2)
-	ld	r27,0(r4)	/* htab_hash_mask -> r27 */
-
-	/* Check if we may already be in the hashtable, in this case, we
-	 * go to out-of-line code to try to modify the HPTE
-	 */
-	andi.	r0,r31,_PAGE_HASHPTE
-	bne	htab_modify_pte
-
-htab_insert_pte:
-	/* Clear hpte bits in new pte (we also clear BUSY btw) and
-	 * add _PAGE_HASHPTE
-	 */
-	lis	r0,_PAGE_HPTEFLAGS@h
-	ori	r0,r0,_PAGE_HPTEFLAGS@l
-	andc	r30,r30,r0
-	ori	r30,r30,_PAGE_HASHPTE
-
-	/* physical address r5 */
-	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
-	sldi	r5,r5,PAGE_SHIFT
-
-	/* Calculate primary group hash */
-	and	r0,r28,r27
-	rldicr	r3,r0,3,63-3		/* r3 = (hash & mask) << 3 */
-
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,0			/* !bolted, !secondary */
-	li	r8,MMU_PAGE_4K		/* page size */
-	li	r9,MMU_PAGE_4K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl htab_call_hpte_insert1
-htab_call_hpte_insert1:
-	bl	.			/* Patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge	htab_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	htab_pte_insert_failure
-
-	/* Now try secondary slot */
-	
-	/* physical address r5 */
-	rldicl	r5,r31,64-PTE_RPN_SHIFT,PTE_RPN_SHIFT
-	sldi	r5,r5,PAGE_SHIFT
-
-	/* Calculate secondary group hash */
-	andc	r0,r27,r28
-	rldicr	r3,r0,3,63-3	/* r0 = (~hash & mask) << 3 */
-	
-	/* Call ppc_md.hpte_insert */
-	ld	r6,STK_PARAM(R4)(r1)	/* Retrieve new pp bits */
-	mr	r4,r29			/* Retrieve vpn */
-	li	r7,HPTE_V_SECONDARY	/* !bolted, secondary */
-	li	r8,MMU_PAGE_4K		/* page size */
-	li	r9,MMU_PAGE_4K		/* actual page size */
-	ld	r10,STK_PARAM(R9)(r1)	/* segment size */
-.globl htab_call_hpte_insert2
-htab_call_hpte_insert2:
-	bl	.			/* Patched by htab_finish_init() */
-	cmpdi	0,r3,0
-	bge+	htab_pte_insert_ok	/* Insertion successful */
-	cmpdi	0,r3,-2			/* Critical failure */
-	beq-	htab_pte_insert_failure
-
-	/* Both are full, we need to evict something */
-	mftb	r0
-	/* Pick a random group based on TB */
-	andi.	r0,r0,1
-	mr	r5,r28
-	bne	2f
-	not	r5,r5
-2:	and	r0,r5,r27
-	rldicr	r3,r0,3,63-3	/* r0 = (hash & mask) << 3 */	
-	/* Call ppc_md.hpte_remove */
-.globl htab_call_hpte_remove
-htab_call_hpte_remove:
-	bl	.			/* Patched by htab_finish_init() */
-
-	/* Try all again */
-	b	htab_insert_pte	
-
-htab_bail_ok:
-	li	r3,0
-	b	htab_bail
-
-htab_pte_insert_ok:
-	/* Insert slot number & secondary bit in PTE */
-	rldimi	r30,r3,12,63-15
-		
-	/* Write out the PTE with a normal write
-	 * (maybe add eieio may be good still ?)
-	 */
-htab_write_out_pte:
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r30,0(r6)
-	li	r3, 0
-htab_bail:
-	ld	r27,STK_REG(R27)(r1)
-	ld	r28,STK_REG(R28)(r1)
-	ld	r29,STK_REG(R29)(r1)
-	ld      r30,STK_REG(R30)(r1)
-	ld      r31,STK_REG(R31)(r1)
-	addi    r1,r1,STACKFRAMESIZE
-	ld      r0,16(r1)
-	mtlr    r0
-	blr
-
-htab_modify_pte:
-	/* Keep PP bits in r4 and slot idx from the PTE around in r3 */
-	mr	r4,r3
-	rlwinm	r3,r31,32-12,29,31
-
-	/* Secondary group ? if yes, get a inverted hash value */
-	mr	r5,r28
-	andi.	r0,r31,_PAGE_F_SECOND
-	beq	1f
-	not	r5,r5
-1:
-	/* Calculate proper slot value for ppc_md.hpte_updatepp */
-	and	r0,r5,r27
-	rldicr	r0,r0,3,63-3	/* r0 = (hash & mask) << 3 */
-	add	r3,r0,r3	/* add slot idx */
-
-	/* Call ppc_md.hpte_updatepp */
-	mr	r5,r29			/* vpn */
-	li	r6,MMU_PAGE_4K		/* base page size */
-	li	r7,MMU_PAGE_4K		/* actual page size */
-	ld	r8,STK_PARAM(R9)(r1)	/* segment size */
-	ld	r9,STK_PARAM(R8)(r1)	/* get "flags" param */
-.globl htab_call_hpte_updatepp
-htab_call_hpte_updatepp:
-	bl	.			/* Patched by htab_finish_init() */
-
-	/* if we failed because typically the HPTE wasn't really here
-	 * we try an insertion. 
-	 */
-	cmpdi	0,r3,-1
-	beq-	htab_insert_pte
-
-	/* Clear the BUSY bit and Write out the PTE */
-	li	r0,_PAGE_BUSY
-	andc	r30,r30,r0
-	b	htab_write_out_pte
-
-htab_wrong_access:
-	/* Bail out clearing reservation */
-	stdcx.	r31,0,r6
-	li	r3,1
-	b	htab_bail
-
-htab_pte_insert_failure:
-	/* Bail out restoring old PTE */
-	ld	r6,STK_PARAM(R6)(r1)
-	std	r31,0(r6)
-	li	r3,-1
-	b	htab_bail
-
-#endif
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index d890580a4c87..db35e7d83088 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -629,31 +629,6 @@ int remove_section_mapping(unsigned long start, unsigned long end)
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
-extern u32 htab_call_hpte_insert1[];
-extern u32 htab_call_hpte_insert2[];
-extern u32 htab_call_hpte_remove[];
-extern u32 htab_call_hpte_updatepp[];
-
-static void __init htab_finish_init(void)
-{
-
-#ifdef CONFIG_PPC_4K_PAGES
-	patch_branch(htab_call_hpte_insert1,
-		ppc_function_entry(ppc_md.hpte_insert),
-		BRANCH_SET_LINK);
-	patch_branch(htab_call_hpte_insert2,
-		ppc_function_entry(ppc_md.hpte_insert),
-		BRANCH_SET_LINK);
-	patch_branch(htab_call_hpte_remove,
-		ppc_function_entry(ppc_md.hpte_remove),
-		BRANCH_SET_LINK);
-	patch_branch(htab_call_hpte_updatepp,
-		ppc_function_entry(ppc_md.hpte_updatepp),
-		BRANCH_SET_LINK);
-#endif
-
-}
-
 static void __init htab_initialize(void)
 {
 	unsigned long table;
@@ -800,7 +775,6 @@ static void __init htab_initialize(void)
 					 mmu_linear_psize, mmu_kernel_ssize));
 	}
 
-	htab_finish_init();
 
 	DBG(" <- htab_initialize()\n");
 }
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 26/31] powerpc/mm: Remove the dependency on pte bit position in asm code
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (24 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 25/31] powerpc/mm: Convert 4k insert from asm " Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 27/31] powerpc/mm: Add helper for converting pte bit to hpte bits Aneesh Kumar K.V
                   ` (6 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

We should not depend on pte bit positions in asm code. Avoid that
by moving part of the logic to C.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 16 +++-------------
 arch/powerpc/mm/hash_utils_64.c      | 29 +++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 13 deletions(-)
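
For reference, the access mask is now built from symbolic pte flags in C
instead of shifting MSR/DSISR bits into hard-coded pte bit positions.
A stand-alone sketch of that derivation (all constant values below are
illustrative, not the kernel's definitions):

#include <stdio.h>

#define PAGE_PRESENT 0x001UL
#define PAGE_USER    0x002UL
#define PAGE_EXEC    0x004UL
#define PAGE_RW      0x200UL
#define MSR_PR       0x4000UL
#define DSISR_STORE  0x02000000UL

/* derive the required access permissions from the fault information */
static unsigned long build_access(unsigned long msr, unsigned long dsisr,
				  unsigned long trap, int user_region)
{
	unsigned long access = PAGE_PRESENT;

	if (dsisr & DSISR_STORE)
		access |= PAGE_RW;
	if ((msr & MSR_PR) || user_region)
		access |= PAGE_USER;
	if (trap == 0x400)		/* instruction storage interrupt */
		access |= PAGE_EXEC;
	return access;
}

int main(void)
{
	/* a userspace store fault needs PRESENT, RW and USER */
	printf("access = %#lx\n", build_access(MSR_PR, DSISR_STORE, 0x300, 1));
	return 0;
}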

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 0a0399c2af11..34920f11dbdd 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1556,28 +1556,18 @@ do_hash_page:
 	lwz	r0,TI_PREEMPT(r11)	/* If we're in an "NMI" */
 	andis.	r0,r0,NMI_MASK@h	/* (i.e. an irq when soft-disabled) */
 	bne	77f			/* then don't call hash_page now */
-	/*
-	 * We need to set the _PAGE_USER bit if MSR_PR is set or if we are
-	 * accessing a userspace segment (even from the kernel). We assume
-	 * kernel addresses always have the high bit set.
-	 */
-	rlwinm	r4,r4,32-25+9,31-9,31-9	/* DSISR_STORE -> _PAGE_RW */
-	rotldi	r0,r3,15		/* Move high bit into MSR_PR posn */
-	orc	r0,r12,r0		/* MSR_PR | ~high_bit */
-	rlwimi	r4,r0,32-13,30,30	/* becomes _PAGE_USER access bit */
-	ori	r4,r4,1			/* add _PAGE_PRESENT */
-	rlwimi	r4,r5,22+2,31-2,31-2	/* Set _PAGE_EXEC if trap is 0x400 */
 
 	/*
 	 * r3 contains the faulting address
-	 * r4 contains the required access permissions
+	 * r4 msr
 	 * r5 contains the trap number
 	 * r6 contains dsisr
 	 *
 	 * at return r3 = 0 for success, 1 for page fault, negative for error
 	 */
+        mr 	r4,r12
 	ld      r6,_DSISR(r1)
-	bl	hash_page		/* build HPTE if possible */
+	bl	__hash_page		/* build HPTE if possible */
 	cmpdi	r3,0			/* see if hash_page succeeded */
 
 	/* Success */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index db35e7d83088..04d549527eaa 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1162,6 +1162,35 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
 }
 EXPORT_SYMBOL_GPL(hash_page);
 
+int __hash_page(unsigned long ea, unsigned long msr, unsigned long trap,
+		unsigned long dsisr)
+{
+	unsigned long access = _PAGE_PRESENT;
+	unsigned long flags = 0;
+	struct mm_struct *mm = current->mm;
+
+	if (REGION_ID(ea) == VMALLOC_REGION_ID)
+		mm = &init_mm;
+
+	if (dsisr & DSISR_NOHPTE)
+		flags |= HPTE_NOHPTE_UPDATE;
+
+	if (dsisr & DSISR_ISSTORE)
+		access |= _PAGE_RW;
+	/*
+	 * We need to set the _PAGE_USER bit if MSR_PR is set or if we are
+	 * accessing a userspace segment (even from the kernel). We assume
+	 * kernel addresses always have the high bit set.
+	 */
+	if ((msr & MSR_PR) || (REGION_ID(ea) == USER_REGION_ID))
+		access |= _PAGE_USER;
+
+	if (trap == 0x400)
+		access |= _PAGE_EXEC;
+
+	return hash_page_mm(mm, ea, access, trap, flags);
+}
+
 void hash_preload(struct mm_struct *mm, unsigned long ea,
 		  unsigned long access, unsigned long trap)
 {
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 27/31] powerpc/mm: Add helper for converting pte bit to hpte bits
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (25 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 26/31] powerpc/mm: Remove the dependency on pte bit position in asm code Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 28/31] powerpc/mm: Move WIMG update to helper Aneesh Kumar K.V
                   ` (5 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

Instead of open coding it in multiple code paths, export the helper
and add more documentation. Also make sure we don't make assumptions
regarding pte bit positions.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h |  1 +
 arch/powerpc/mm/hash64_4k.c               | 13 +-----------
 arch/powerpc/mm/hash64_64k.c              | 35 +++----------------------------
 arch/powerpc/mm/hash_utils_64.c           | 22 ++++++++++++-------
 arch/powerpc/mm/hugepage-hash64.c         | 13 +-----------
 arch/powerpc/mm/hugetlbpage-hash64.c      |  4 +---
 6 files changed, 21 insertions(+), 67 deletions(-)
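
The PP encoding the helper produces is: kernel pages get PP 0x0, user
read/write (and dirty) pages get 0x2, and other user pages get 0x3 (read
only).  A minimal user-space sketch of just that part, with illustrative
flag values (not the kernel's definitions):

#include <stdio.h>

#define PAGE_USER  0x002UL
#define PAGE_RW    0x200UL
#define PAGE_DIRTY 0x080UL

static unsigned long pp_bits(unsigned long pteflags)
{
	unsigned long pp = 0;

	if (pteflags & PAGE_USER) {
		pp |= 0x2;
		/* read-only user page: add in 0x1 */
		if (!((pteflags & PAGE_RW) && (pteflags & PAGE_DIRTY)))
			pp |= 0x1;
	}
	return pp;
}

int main(void)
{
	printf("kernel:        %lx\n", pp_bits(0));
	printf("user rw dirty: %lx\n", pp_bits(PAGE_USER | PAGE_RW | PAGE_DIRTY));
	printf("user ro:       %lx\n", pp_bits(PAGE_USER));
	return 0;
}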

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index fb6de2954c4d..0d9800a32d61 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -240,6 +240,7 @@ extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
 					 pmd_t *pmdp,
 					 unsigned long clr,
 					 unsigned long set);
+extern unsigned long htab_convert_pte_flags(unsigned long pteflags);
 /* Atomic PTE updates */
 static inline unsigned long pte_update(struct mm_struct *mm,
 				       unsigned long addr,
diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index 3b49c6f18741..ee863137035a 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -53,18 +53,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
 	 * need to add in 0x1 if it's a read-only user page
 	 */
-	rflags = new_pte & _PAGE_USER;
-	if ((new_pte & _PAGE_USER) && !((new_pte & _PAGE_RW) &&
-					(new_pte & _PAGE_DIRTY)))
-		rflags |= 0x1;
-	/*
-	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
-	 */
-	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
-	/*
-	 * Always add C and Memory coherence bit
-	 */
-	rflags |= HPTE_R_C | HPTE_R_M;
+	rflags = htab_convert_pte_flags(new_pte);
 	/*
 	 * Add in WIMG bits
 	 */
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index d6a98ef374f3..21d9e513e043 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -76,22 +76,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * Handle the subpage protection bits
 	 */
 	subpg_pte = new_pte & ~subpg_prot;
-	/*
-	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
-	 * need to add in 0x1 if it's a read-only user page
-	 */
-	rflags = subpg_pte & _PAGE_USER;
-	if ((subpg_pte & _PAGE_USER) && !((subpg_pte & _PAGE_RW) &&
-					(subpg_pte & _PAGE_DIRTY)))
-		rflags |= 0x1;
-	/*
-	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
-	 */
-	rflags |= ((subpg_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
-	/*
-	 * Always add C and Memory coherence bit
-	 */
-	rflags |= HPTE_R_C | HPTE_R_M;
+	rflags = htab_convert_pte_flags(subpg_pte);
 	/*
 	 * Add in WIMG bits
 	 */
@@ -262,22 +247,8 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 			new_pte |= _PAGE_DIRTY;
 	} while (old_pte != __cmpxchg_u64((unsigned long *)ptep,
 					  old_pte, new_pte));
-	/*
-	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
-	 * need to add in 0x1 if it's a read-only user page
-	 */
-	rflags = new_pte & _PAGE_USER;
-	if ((new_pte & _PAGE_USER) && !((new_pte & _PAGE_RW) &&
-					(new_pte & _PAGE_DIRTY)))
-		rflags |= 0x1;
-	/*
-	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
-	 */
-	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
-	/*
-	 * Always add C and Memory coherence bit
-	 */
-	rflags |= HPTE_R_C | HPTE_R_M;
+
+	rflags = htab_convert_pte_flags(new_pte);
 	/*
 	 * Add in WIMG bits
 	 */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 04d549527eaa..3b5e547b965d 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -159,20 +159,26 @@ static struct mmu_psize_def mmu_psize_defaults_gp[] = {
 	},
 };
 
-static unsigned long htab_convert_pte_flags(unsigned long pteflags)
+unsigned long htab_convert_pte_flags(unsigned long pteflags)
 {
-	unsigned long rflags = pteflags & 0x1fa;
+	unsigned long rflags = 0;
 
 	/* _PAGE_EXEC -> NOEXEC */
 	if ((pteflags & _PAGE_EXEC) == 0)
 		rflags |= HPTE_R_N;
-
-	/* PP bits. PAGE_USER is already PP bit 0x2, so we only
-	 * need to add in 0x1 if it's a read-only user page
+	/*
+	 * PP bits:
+	 * Linux use slb key 0 for kernel and 1 for user.
+	 * kernel areas are mapped by PP bits 00
+	 * and and there is no kernel RO (_PAGE_KERNEL_RO).
+	 * User area mapped by 0x2 and read only use by
+	 * 0x3.
 	 */
-	if ((pteflags & _PAGE_USER) && !((pteflags & _PAGE_RW) &&
-					 (pteflags & _PAGE_DIRTY)))
-		rflags |= 1;
+	if (pteflags & _PAGE_USER) {
+		rflags |= 0x2;
+		if (!((pteflags & _PAGE_RW) && (pteflags & _PAGE_DIRTY)))
+			rflags |= 0x1;
+	}
 	/*
 	 * Always add "C" bit for perf. Memory coherence is always enabled
 	 */
diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
index 4d87122cf6a7..91fcac6f989d 100644
--- a/arch/powerpc/mm/hugepage-hash64.c
+++ b/arch/powerpc/mm/hugepage-hash64.c
@@ -54,18 +54,7 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 			new_pmd |= _PAGE_DIRTY;
 	} while (old_pmd != __cmpxchg_u64((unsigned long *)pmdp,
 					  old_pmd, new_pmd));
-	/*
-	 * PP bits. _PAGE_USER is already PP bit 0x2, so we only
-	 * need to add in 0x1 if it's a read-only user page
-	 */
-	rflags = new_pmd & _PAGE_USER;
-	if ((new_pmd & _PAGE_USER) && !((new_pmd & _PAGE_RW) &&
-					   (new_pmd & _PAGE_DIRTY)))
-		rflags |= 0x1;
-	/*
-	 * _PAGE_EXEC -> HW_NO_EXEC since it's inverted
-	 */
-	rflags |= ((new_pmd & _PAGE_EXEC) ? 0 : HPTE_R_N);
+	rflags = htab_convert_pte_flags(new_pmd);
 
 #if 0
 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index 7584e8445512..304c8520506e 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -59,10 +59,8 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 			new_pte |= _PAGE_DIRTY;
 	} while(old_pte != __cmpxchg_u64((unsigned long *)ptep,
 					 old_pte, new_pte));
+	rflags = htab_convert_pte_flags(new_pte);
 
-	rflags = 0x2 | (!(new_pte & _PAGE_RW));
-	/* _PAGE_EXEC -> HW_NO_EXEC since it's inverted */
-	rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
 	sz = ((1UL) << shift);
 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
 		/* No CPU has hugepages but lacks no execute, so we
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 28/31] powerpc/mm: Move WIMG update to helper.
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (26 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 27/31] powerpc/mm: Add helper for converting pte bit to hpte bits Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 29/31] powerpc/mm: Move hugetlb related headers Aneesh Kumar K.V
                   ` (4 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

The only difference here is that we apply the WIMG mapping early, so the
rflags passed to updatepp will also be changed.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash64_4k.c          |  5 -----
 arch/powerpc/mm/hash64_64k.c         | 10 ----------
 arch/powerpc/mm/hash_utils_64.c      | 13 ++++++++++++-
 arch/powerpc/mm/hugepage-hash64.c    |  7 -------
 arch/powerpc/mm/hugetlbpage-hash64.c |  8 --------
 5 files changed, 12 insertions(+), 31 deletions(-)
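
A small sketch of the WIMG translation that now happens inside
htab_convert_pte_flags(), so both hpte_updatepp and hpte_insert see the
same rflags (constant values are illustrative, not the kernel's):

#include <stdio.h>

#define PAGE_WRITETHRU 0x040UL
#define PAGE_NO_CACHE  0x020UL
#define PAGE_GUARDED   0x008UL

#define HPTE_W 0x40UL
#define HPTE_I 0x20UL
#define HPTE_G 0x08UL

static unsigned long add_wimg(unsigned long rflags, unsigned long pteflags)
{
	if (pteflags & PAGE_WRITETHRU)
		rflags |= HPTE_W;
	if (pteflags & PAGE_NO_CACHE)
		rflags |= HPTE_I;
	if (pteflags & PAGE_GUARDED)
		rflags |= HPTE_G;
	return rflags;
}

int main(void)
{
	/* an uncached, guarded mapping carries I and G into updatepp as well */
	printf("rflags = %#lx\n", add_wimg(0, PAGE_NO_CACHE | PAGE_GUARDED));
	return 0;
}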

diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index ee863137035a..e7c04542ba62 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -54,11 +54,6 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * need to add in 0x1 if it's a read-only user page
 	 */
 	rflags = htab_convert_pte_flags(new_pte);
-	/*
-	 * Add in WIMG bits
-	 */
-	rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				_PAGE_COHERENT | _PAGE_GUARDED));
 
 	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 21d9e513e043..84867a1491a2 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -77,11 +77,6 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 */
 	subpg_pte = new_pte & ~subpg_prot;
 	rflags = htab_convert_pte_flags(subpg_pte);
-	/*
-	 * Add in WIMG bits
-	 */
-	rflags |= (subpg_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				_PAGE_COHERENT | _PAGE_GUARDED));
 
 	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) {
@@ -249,11 +244,6 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 					  old_pte, new_pte));
 
 	rflags = htab_convert_pte_flags(new_pte);
-	/*
-	 * Add in WIMG bits
-	 */
-	rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				_PAGE_COHERENT | _PAGE_GUARDED));
 
 	if (!cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 3b5e547b965d..3d261bc6fef8 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -182,7 +182,18 @@ unsigned long htab_convert_pte_flags(unsigned long pteflags)
 	/*
 	 * Always add "C" bit for perf. Memory coherence is always enabled
 	 */
-	return rflags | HPTE_R_C | HPTE_R_M;
+	rflags |=  HPTE_R_C | HPTE_R_M;
+	/*
+	 * Add in WIG bits
+	 */
+	if (pteflags & _PAGE_WRITETHRU)
+		rflags |= HPTE_R_W;
+	if (pteflags & _PAGE_NO_CACHE)
+		rflags |= HPTE_R_I;
+	if (pteflags & _PAGE_GUARDED)
+		rflags |= HPTE_R_G;
+
+	return rflags;
 }
 
 int htab_bolt_mapping(unsigned long vstart, unsigned long vend,
diff --git a/arch/powerpc/mm/hugepage-hash64.c b/arch/powerpc/mm/hugepage-hash64.c
index 91fcac6f989d..1f666de0110a 100644
--- a/arch/powerpc/mm/hugepage-hash64.c
+++ b/arch/powerpc/mm/hugepage-hash64.c
@@ -120,13 +120,6 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
 		pa = pmd_pfn(__pmd(old_pmd)) << PAGE_SHIFT;
 		new_pmd |= _PAGE_HASHPTE;
 
-		/* Add in WIMG bits */
-		rflags |= (new_pmd & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				      _PAGE_GUARDED));
-		/*
-		 * enable the memory coherence always
-		 */
-		rflags |= HPTE_R_M;
 repeat:
 		hpte_group = ((hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
 
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index 304c8520506e..0734e4daffef 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -91,14 +91,6 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		/* clear HPTE slot informations in new PTE */
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
 
-		/* Add in WIMG bits */
-		rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
-				      _PAGE_COHERENT | _PAGE_GUARDED));
-		/*
-		 * enable the memory coherence always
-		 */
-		rflags |= HPTE_R_M;
-
 		slot = hpte_insert_repeating(hash, vpn, pa, rflags, 0,
 					     mmu_psize, ssize);
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 29/31] powerpc/mm: Move hugetlb related headers
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (27 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 28/31] powerpc/mm: Move WIMG update to helper Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 30/31] powerpc/mm: Move THP headers around Aneesh Kumar K.V
                   ` (3 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

W.r.t. hugetlb, we support two formats for the pmd. With book3s_64 and
a 64K linux page size, we can have ptes at the pmd level, hence we
don't need to support hugepd there. For everything else hugepd
is supported and pmd_huge() is 0.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 31 +++++++++++++++
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 51 +++++++++++++++++++++++++
 arch/powerpc/include/asm/nohash/pgtable.h     | 25 +++++++++++++
 arch/powerpc/include/asm/page.h               | 42 ++-------------------
 arch/powerpc/mm/hugetlbpage.c                 | 54 ---------------------------
 5 files changed, 111 insertions(+), 92 deletions(-)
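
On book3s_64 with 64K pages a huge page is a leaf pte stored directly in
the pmd/pud/pgd, so the moved helpers only have to test the low two bits
of the entry.  A stand-alone sketch of that test (the sample values are
made up):

#include <stdint.h>
#include <stdio.h>

/* a table pointer is aligned, so its low two bits are 00; a leaf pte is not */
static int entry_is_huge(uint64_t val)
{
	return (val & 0x3) != 0x0;
}

int main(void)
{
	uint64_t table_ptr = 0xc000000001234000ULL;	/* pointer to next level */
	uint64_t huge_pte  = 0xc000000040000385ULL;	/* leaf huge-page pte */

	printf("table: %d, huge pte: %d\n",
	       entry_is_huge(table_ptr), entry_is_huge(huge_pte));
	return 0;
}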

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 75e8b9326e4b..b4d25529d179 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -93,6 +93,37 @@ extern struct page *pgd_page(pgd_t pgd);
 #define remap_4k_pfn(vma, addr, pfn, prot)	\
 	remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot))
 
+#ifdef CONFIG_HUGETLB_PAGE
+/*
+ * For 4k page size, we support explicit hugepage via hugepd
+ */
+static inline int pmd_huge(pmd_t pmd)
+{
+	return 0;
+}
+
+static inline int pud_huge(pud_t pud)
+{
+	return 0;
+}
+
+static inline int pgd_huge(pgd_t pgd)
+{
+	return 0;
+}
+#define pgd_huge pgd_huge
+
+static inline int hugepd_ok(hugepd_t hpd)
+{
+	/*
+	 * hugepd pointer, bottom two bits == 00 and next 4 bits
+	 * indicate size of table
+	 */
+	return (((hpd.pd & 0x3) == 0x0) && ((hpd.pd & HUGEPD_SHIFT_MASK) != 0));
+}
+#define is_hugepd(hpd)		(hugepd_ok(hpd))
+#endif
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_4K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index f46fbd6cd837..20865ca7a179 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -119,6 +119,57 @@ static inline bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
 #define pgd_pte(pgd)	(pud_pte(((pud_t){ pgd })))
 #define pte_pgd(pte)	((pgd_t)pte_pud(pte))
 
+#ifdef CONFIG_HUGETLB_PAGE
+/*
+ * We have PGD_INDEX_SIZ = 12 and PTE_INDEX_SIZE = 8, so that we can have
+ * 16GB hugepage pte in PGD and 16MB hugepage pte at PMD;
+ *
+ * Defined in such a way that we can optimize away code block at build time
+ * if CONFIG_HUGETLB_PAGE=n.
+ */
+static inline int pmd_huge(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pmd_val(pmd) & 0x3) != 0x0);
+}
+
+static inline int pud_huge(pud_t pud)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pud_val(pud) & 0x3) != 0x0);
+}
+
+static inline int pgd_huge(pgd_t pgd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pgd_val(pgd) & 0x3) != 0x0);
+}
+#define pgd_huge pgd_huge
+
+#ifdef CONFIG_DEBUG_VM
+extern int hugepd_ok(hugepd_t hpd);
+#define is_hugepd(hpd)               (hugepd_ok(hpd))
+#else
+/*
+ * With 64k page size, we have hugepage ptes in the pgd and pmd entries. We don't
+ * need to setup hugepage directory for them. Our pte and page directory format
+ * enable us to have this enabled.
+ */
+static inline int hugepd_ok(hugepd_t hpd)
+{
+	return 0;
+}
+#define is_hugepd(pdep)			0
+#endif /* CONFIG_DEBUG_VM */
+
+#endif /* CONFIG_HUGETLB_PAGE */
+
 #endif	/* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index c0c41a2409d2..1263c22d60d8 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -223,5 +223,30 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 				     unsigned long size, pgprot_t vma_prot);
 #define __HAVE_PHYS_MEM_ACCESS_PROT
 
+#ifdef CONFIG_HUGETLB_PAGE
+static inline int hugepd_ok(hugepd_t hpd)
+{
+	return (hpd.pd > 0);
+}
+
+static inline int pmd_huge(pmd_t pmd)
+{
+	return 0;
+}
+
+static inline int pud_huge(pud_t pud)
+{
+	return 0;
+}
+
+static inline int pgd_huge(pgd_t pgd)
+{
+	return 0;
+}
+#define pgd_huge		pgd_huge
+
+#define is_hugepd(hpd)		(hugepd_ok(hpd))
+#endif
+
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 9c3211eb487c..f63b2761cdd0 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -386,45 +386,11 @@ typedef unsigned long pgprot_t;
 
 typedef struct { signed long pd; } hugepd_t;
 
-#ifdef CONFIG_HUGETLB_PAGE
-#ifdef CONFIG_PPC_BOOK3S_64
-#ifdef CONFIG_PPC_64K_PAGES
-/*
- * With 64k page size, we have hugepage ptes in the pgd and pmd entries. We don't
- * need to setup hugepage directory for them. Our pte and page directory format
- * enable us to have this enabled. But to avoid errors when implementing new
- * features disable hugepd for 64K. We enable a debug version here, So we catch
- * wrong usage.
- */
-#ifdef CONFIG_DEBUG_VM
-extern int hugepd_ok(hugepd_t hpd);
-#else
-#define hugepd_ok(x)	(0)
-#endif
-#else
-static inline int hugepd_ok(hugepd_t hpd)
-{
-	/*
-	 * hugepd pointer, bottom two bits == 00 and next 4 bits
-	 * indicate size of table
-	 */
-	return (((hpd.pd & 0x3) == 0x0) && ((hpd.pd & HUGEPD_SHIFT_MASK) != 0));
-}
-#endif
-#else
-static inline int hugepd_ok(hugepd_t hpd)
-{
-	return (hpd.pd > 0);
-}
-#endif
-
-#define is_hugepd(hpd)               (hugepd_ok(hpd))
-#define pgd_huge pgd_huge
-int pgd_huge(pgd_t pgd);
-#else /* CONFIG_HUGETLB_PAGE */
-#define is_hugepd(pdep)			0
-#define pgd_huge(pgd)			0
+#ifndef CONFIG_HUGETLB_PAGE
+#define is_hugepd(pdep)		(0)
+#define pgd_huge(pgd)		(0)
 #endif /* CONFIG_HUGETLB_PAGE */
+
 #define __hugepd(x) ((hugepd_t) { (x) })
 
 struct page;
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 9833fee493ec..f20c1a9065da 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -53,43 +53,6 @@ static unsigned nr_gpages;
 
 #define hugepd_none(hpd)	((hpd).pd == 0)
 
-#ifdef CONFIG_PPC_BOOK3S_64
-/*
- * At this point we do the placement change only for BOOK3S 64. This would
- * possibly work on other subarchs.
- */
-
-/*
- * We have PGD_INDEX_SIZ = 12 and PTE_INDEX_SIZE = 8, so that we can have
- * 16GB hugepage pte in PGD and 16MB hugepage pte at PMD;
- *
- * Defined in such a way that we can optimize away code block at build time
- * if CONFIG_HUGETLB_PAGE=n.
- */
-int pmd_huge(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
-}
-
-int pud_huge(pud_t pud)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pud_val(pud) & 0x3) != 0x0);
-}
-
-int pgd_huge(pgd_t pgd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pgd_val(pgd) & 0x3) != 0x0);
-}
-
 #if defined(CONFIG_PPC_64K_PAGES) && defined(CONFIG_DEBUG_VM)
 /*
  * This enables us to catch the wrong page directory format
@@ -108,23 +71,6 @@ int hugepd_ok(hugepd_t hpd)
 }
 #endif
 
-#else
-int pmd_huge(pmd_t pmd)
-{
-	return 0;
-}
-
-int pud_huge(pud_t pud)
-{
-	return 0;
-}
-
-int pgd_huge(pgd_t pgd)
-{
-	return 0;
-}
-#endif
-
 pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
 {
 	/* Only called for hugetlbfs pages, hence can ignore THP */
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 30/31] powerpc/mm: Move THP headers around
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (28 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 29/31] powerpc/mm: Move hugetlb related headers Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 10:08 ` [PATCH V4 31/31] powerpc/mm: Add a _PAGE_PTE bit Aneesh Kumar K.V
                   ` (2 subsequent siblings)
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

We support THP only with book3s_64 and a 64K page size. Move
the THP details to hash-64k.h to make that clear.

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 126 +++++++++++++
 arch/powerpc/include/asm/book3s/64/hash.h     | 223 +++++------------------
 arch/powerpc/include/asm/nohash/64/pgtable.h  | 253 +-------------------------
 arch/powerpc/mm/hash_native_64.c              |  10 +
 arch/powerpc/mm/pgtable_64.c                  |   2 +-
 arch/powerpc/platforms/pseries/lpar.c         |  10 +
 6 files changed, 201 insertions(+), 423 deletions(-)
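
The hpte_slot_array described in the moved comment packs one byte per
subpage HPTE as [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000 ].
A stand-alone sketch of the encode/decode (helper names here are made up):

#include <stdio.h>

static unsigned char make_slot(unsigned int hidx)
{
	/* hash slot information in bits 4-7, valid flag in bit 3 */
	return (unsigned char)(hidx << 4 | 0x1 << 3);
}

static unsigned int slot_valid(unsigned char b) { return (b >> 3) & 0x1; }
static unsigned int slot_hidx(unsigned char b)  { return b >> 4; }

int main(void)
{
	unsigned char b = make_slot(0xb);	/* secondary hash, group slot 3 */

	printf("byte %#x: valid=%u hidx=%#x\n",
	       (unsigned int)b, slot_valid(b), slot_hidx(b));
	return 0;
}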

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 20865ca7a179..34eab4542b85 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -170,6 +170,132 @@ static inline int hugepd_ok(hugepd_t hpd)
 
 #endif /* CONFIG_HUGETLB_PAGE */
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
+					 unsigned long addr,
+					 pmd_t *pmdp,
+					 unsigned long clr,
+					 unsigned long set);
+static inline char *get_hpte_slot_array(pmd_t *pmdp)
+{
+	/*
+	 * The hpte hindex is stored in the pgtable whose address is in the
+	 * second half of the PMD
+	 *
+	 * Order this load with the test for pmd_trans_huge in the caller
+	 */
+	smp_rmb();
+	return *(char **)(pmdp + PTRS_PER_PMD);
+
+
+}
+/*
+ * The linux hugepage PMD now include the pmd entries followed by the address
+ * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
+ * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
+ * each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and
+ * with 4K HPTE we need 4096 entries. Both will fit in a 4K pgtable_t.
+ *
+ * The last three bits are intentionally left to zero. This memory location
+ * are also used as normal page PTE pointers. So if we have any pointers
+ * left around while we collapse a hugepage, we need to make sure
+ * _PAGE_PRESENT bit of that is zero when we look at them
+ */
+static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
+{
+	return (hpte_slot_array[index] >> 3) & 0x1;
+}
+
+static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
+					   int index)
+{
+	return hpte_slot_array[index] >> 4;
+}
+
+static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
+					unsigned int index, unsigned int hidx)
+{
+	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
+}
+
+/*
+ *
+ * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
+ * page. The hugetlbfs page table walking and mangling paths are totally
+ * separated form the core VM paths and they're differentiated by
+ *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
+ *
+ * pmd_trans_huge() is defined as false at build time if
+ * CONFIG_TRANSPARENT_HUGEPAGE=n to optimize away code blocks at build
+ * time in such case.
+ *
+ * For ppc64 we need to differntiate from explicit hugepages from THP, because
+ * for THP we also track the subpage details at the pmd level. We don't do
+ * that for explicit huge pages.
+ *
+ */
+static inline int pmd_trans_huge(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
+}
+
+static inline int pmd_trans_splitting(pmd_t pmd)
+{
+	if (pmd_trans_huge(pmd))
+		return pmd_val(pmd) & _PAGE_SPLITTING;
+	return 0;
+}
+
+static inline int pmd_large(pmd_t pmd)
+{
+	/*
+	 * leaf pte for huge page, bottom two bits != 00
+	 */
+	return ((pmd_val(pmd) & 0x3) != 0x0);
+}
+
+static inline pmd_t pmd_mknotpresent(pmd_t pmd)
+{
+	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
+}
+
+static inline pmd_t pmd_mksplitting(pmd_t pmd)
+{
+	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
+}
+
+#define __HAVE_ARCH_PMD_SAME
+static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
+{
+	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
+}
+
+static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
+					      unsigned long addr, pmd_t *pmdp)
+{
+	unsigned long old;
+
+	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+		return 0;
+	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
+	return ((old & _PAGE_ACCESSED) != 0);
+}
+
+#define __HAVE_ARCH_PMDP_SET_WRPROTECT
+static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
+				      pmd_t *pmdp)
+{
+
+	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
+		return;
+
+	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
+}
+
+#endif /*  CONFIG_TRANSPARENT_HUGEPAGE */
 #endif	/* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 0d9800a32d61..e63556c98c32 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -2,6 +2,55 @@
 #define _ASM_POWERPC_BOOK3S_64_HASH_H
 #ifdef __KERNEL__
 
+/*
+ * Common bits between 4K and 64K pages in a linux-style PTE.
+ * These match the bits in the (hardware-defined) PowerPC PTE as closely
+ * as possible. Additional bits may be defined in pgtable-hash64-*.h
+ *
+ * Note: We only support user read/write permissions. Supervisor always
+ * have full read/write to pages above PAGE_OFFSET (pages below that
+ * always use the user access permissions).
+ *
+ * We could create separate kernel read-only if we used the 3 PP bits
+ * combinations that newer processors provide but we currently don't.
+ */
+#define _PAGE_PRESENT		0x00001 /* software: pte contains a translation */
+#define _PAGE_USER		0x00002 /* matches one of the PP bits */
+#define _PAGE_BIT_SWAP_TYPE	2
+#define _PAGE_EXEC		0x00004 /* No execute on POWER4 and newer (we invert) */
+#define _PAGE_GUARDED		0x00008
+/* We can derive Memory coherence from _PAGE_NO_CACHE */
+#define _PAGE_COHERENT		0x0
+#define _PAGE_NO_CACHE		0x00020 /* I: cache inhibit */
+#define _PAGE_WRITETHRU		0x00040 /* W: cache write-through */
+#define _PAGE_DIRTY		0x00080 /* C: page changed */
+#define _PAGE_ACCESSED		0x00100 /* R: page referenced */
+#define _PAGE_RW		0x00200 /* software: user write access allowed */
+#define _PAGE_HASHPTE		0x00400 /* software: pte has an associated HPTE */
+#define _PAGE_BUSY		0x00800 /* software: PTE & hash are busy */
+#define _PAGE_F_GIX		0x07000 /* full page: hidx bits */
+#define _PAGE_F_GIX_SHIFT	12
+#define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
+#define _PAGE_SPECIAL		0x10000 /* software: special page */
+
+/*
+ * THP pages can't be special. So use the _PAGE_SPECIAL
+ */
+#define _PAGE_SPLITTING _PAGE_SPECIAL
+
+/*
+ * We need to differentiate between explicit huge page and THP huge
+ * page, since THP huge page also need to track real subpage details
+ */
+#define _PAGE_THP_HUGE  _PAGE_4K_PFN
+
+/*
+ * set of bits not changed in pmd_modify.
+ */
+#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
+			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
+			 _PAGE_THP_HUGE)
+
 #ifdef CONFIG_PPC_64K_PAGES
 #include <asm/book3s/64/hash-64k.h>
 #else
@@ -59,36 +108,6 @@
 #else
 #error "Hash Config needs mm slice enabled"
 #endif /* CONFIG_PPC_MM_SLICES */
-/*
- * Common bits between 4K and 64K pages in a linux-style PTE.
- * These match the bits in the (hardware-defined) PowerPC PTE as closely
- * as possible. Additional bits may be defined in pgtable-hash64-*.h
- *
- * Note: We only support user read/write permissions. Supervisor always
- * have full read/write to pages above PAGE_OFFSET (pages below that
- * always use the user access permissions).
- *
- * We could create separate kernel read-only if we used the 3 PP bits
- * combinations that newer processors provide but we currently don't.
- */
-#define _PAGE_PRESENT		0x00001 /* software: pte contains a translation */
-#define _PAGE_USER		0x00002 /* matches one of the PP bits */
-#define _PAGE_BIT_SWAP_TYPE	2
-#define _PAGE_EXEC		0x00004 /* No execute on POWER4 and newer (we invert) */
-#define _PAGE_GUARDED		0x00008
-/* We can derive Memory coherence from _PAGE_NO_CACHE */
-#define _PAGE_COHERENT		0x0
-#define _PAGE_NO_CACHE		0x00020 /* I: cache inhibit */
-#define _PAGE_WRITETHRU		0x00040 /* W: cache write-through */
-#define _PAGE_DIRTY		0x00080 /* C: page changed */
-#define _PAGE_ACCESSED		0x00100 /* R: page referenced */
-#define _PAGE_RW		0x00200 /* software: user write access allowed */
-#define _PAGE_HASHPTE		0x00400 /* software: pte has an associated HPTE */
-#define _PAGE_BUSY		0x00800 /* software: PTE & hash are busy */
-#define _PAGE_F_GIX		0x07000 /* full page: hidx bits */
-#define _PAGE_F_GIX_SHIFT	12
-#define _PAGE_F_SECOND		0x08000 /* Whether to use secondary hash or not */
-#define _PAGE_SPECIAL		0x10000 /* software: special page */
 
 /* No separate kernel read-only */
 #define _PAGE_KERNEL_RW		(_PAGE_RW | _PAGE_DIRTY) /* user access blocked by key */
@@ -107,24 +126,6 @@
 
 /* Hash table based platforms need atomic updates of the linux PTE */
 #define PTE_ATOMIC_UPDATES	1
-
-/*
- * THP pages can't be special. So use the _PAGE_SPECIAL
- */
-#define _PAGE_SPLITTING _PAGE_SPECIAL
-
-/*
- * We need to differentiate between explicit huge page and THP huge
- * page, since THP huge page also need to track real subpage details
- */
-#define _PAGE_THP_HUGE  _PAGE_4K_PFN
-
-/*
- * set of bits not changed in pmd_modify.
- */
-#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
-			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
 #define _PTE_NONE_MASK	_PAGE_HPTEFLAGS
 /*
  * The mask convered by the RPN must be a ULL on 32-bit platforms with
@@ -235,11 +236,6 @@
 
 extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
 			    pte_t *ptep, unsigned long pte, int huge);
-extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
-					 unsigned long addr,
-					 pmd_t *pmdp,
-					 unsigned long clr,
-					 unsigned long set);
 extern unsigned long htab_convert_pte_flags(unsigned long pteflags);
 /* Atomic PTE updates */
 static inline unsigned long pte_update(struct mm_struct *mm,
@@ -365,127 +361,6 @@ static inline void __ptep_set_access_flags(pte_t *ptep, pte_t entry)
 #define __HAVE_ARCH_PTE_SAME
 #define pte_same(A,B)	(((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)
 
-static inline char *get_hpte_slot_array(pmd_t *pmdp)
-{
-	/*
-	 * The hpte hindex is stored in the pgtable whose address is in the
-	 * second half of the PMD
-	 *
-	 * Order this load with the test for pmd_trans_huge in the caller
-	 */
-	smp_rmb();
-	return *(char **)(pmdp + PTRS_PER_PMD);
-
-
-}
-/*
- * The linux hugepage PMD now include the pmd entries followed by the address
- * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
- * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
- * each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and
- * with 4K HPTE we need 4096 entries. Both will fit in a 4K pgtable_t.
- *
- * The last three bits are intentionally left to zero. This memory location
- * are also used as normal page PTE pointers. So if we have any pointers
- * left around while we collapse a hugepage, we need to make sure
- * _PAGE_PRESENT bit of that is zero when we look at them
- */
-static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
-{
-	return (hpte_slot_array[index] >> 3) & 0x1;
-}
-
-static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
-					   int index)
-{
-	return hpte_slot_array[index] >> 4;
-}
-
-static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
-					unsigned int index, unsigned int hidx)
-{
-	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
-}
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-/*
- *
- * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
- * page. The hugetlbfs page table walking and mangling paths are totally
- * separated form the core VM paths and they're differentiated by
- *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
- *
- * pmd_trans_huge() is defined as false at build time if
- * CONFIG_TRANSPARENT_HUGEPAGE=n to optimize away code blocks at build
- * time in such case.
- *
- * For ppc64 we need to differntiate from explicit hugepages from THP, because
- * for THP we also track the subpage details at the pmd level. We don't do
- * that for explicit huge pages.
- *
- */
-static inline int pmd_trans_huge(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
-}
-
-static inline int pmd_trans_splitting(pmd_t pmd)
-{
-	if (pmd_trans_huge(pmd))
-		return pmd_val(pmd) & _PAGE_SPLITTING;
-	return 0;
-}
-
-#endif
-static inline int pmd_large(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
-}
-
-static inline pmd_t pmd_mknotpresent(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
-}
-
-static inline pmd_t pmd_mksplitting(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
-}
-
-#define __HAVE_ARCH_PMD_SAME
-static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
-{
-	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
-}
-
-static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
-					      unsigned long addr, pmd_t *pmdp)
-{
-	unsigned long old;
-
-	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
-		return 0;
-	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
-	return ((old & _PAGE_ACCESSED) != 0);
-}
-
-#define __HAVE_ARCH_PMDP_SET_WRPROTECT
-static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
-				      pmd_t *pmdp)
-{
-
-	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
-		return;
-
-	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
-}
-
 /* Generic accessors to PTE bits */
 static inline int pte_write(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_RW);}
 static inline int pte_dirty(pte_t pte)		{ return !!(pte_val(pte) & _PAGE_DIRTY); }
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index f389f2d6789e..c4dff4d41c26 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -154,6 +154,11 @@ static inline void pmd_clear(pmd_t *pmdp)
 	*pmdp = __pmd(0);
 }
 
+static inline pte_t pmd_pte(pmd_t pmd)
+{
+	return __pte(pmd_val(pmd));
+}
+
 #define pmd_none(pmd)		(!pmd_val(pmd))
 #define	pmd_bad(pmd)		(!is_kernel_addr(pmd_val(pmd)) \
 				 || (pmd_val(pmd) & PMD_BAD_BITS))
@@ -389,252 +394,4 @@ void pgtable_cache_add(unsigned shift, void (*ctor)(void *));
 void pgtable_cache_init(void);
 #endif /* __ASSEMBLY__ */
 
-/*
- * THP pages can't be special. So use the _PAGE_SPECIAL
- */
-#define _PAGE_SPLITTING _PAGE_SPECIAL
-
-/*
- * We need to differentiate between explicit huge page and THP huge
- * page, since THP huge page also need to track real subpage details
- */
-#define _PAGE_THP_HUGE  _PAGE_4K_PFN
-
-/*
- * set of bits not changed in pmd_modify.
- */
-#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
-			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
-
-#ifndef __ASSEMBLY__
-/*
- * The linux hugepage PMD now include the pmd entries followed by the address
- * to the stashed pgtable_t. The stashed pgtable_t contains the hpte bits.
- * [ 1 bit secondary | 3 bit hidx | 1 bit valid | 000]. We use one byte per
- * each HPTE entry. With 16MB hugepage and 64K HPTE we need 256 entries and
- * with 4K HPTE we need 4096 entries. Both will fit in a 4K pgtable_t.
- *
- * The last three bits are intentionally left to zero. This memory location
- * are also used as normal page PTE pointers. So if we have any pointers
- * left around while we collapse a hugepage, we need to make sure
- * _PAGE_PRESENT bit of that is zero when we look at them
- */
-static inline unsigned int hpte_valid(unsigned char *hpte_slot_array, int index)
-{
-	return (hpte_slot_array[index] >> 3) & 0x1;
-}
-
-static inline unsigned int hpte_hash_index(unsigned char *hpte_slot_array,
-					   int index)
-{
-	return hpte_slot_array[index] >> 4;
-}
-
-static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
-					unsigned int index, unsigned int hidx)
-{
-	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
-}
-
-struct page *realmode_pfn_to_page(unsigned long pfn);
-
-static inline char *get_hpte_slot_array(pmd_t *pmdp)
-{
-	/*
-	 * The hpte hindex is stored in the pgtable whose address is in the
-	 * second half of the PMD
-	 *
-	 * Order this load with the test for pmd_trans_huge in the caller
-	 */
-	smp_rmb();
-	return *(char **)(pmdp + PTRS_PER_PMD);
-
-
-}
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-extern void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
-				   pmd_t *pmdp, unsigned long old_pmd);
-extern pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot);
-extern pmd_t mk_pmd(struct page *page, pgprot_t pgprot);
-extern pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot);
-extern void set_pmd_at(struct mm_struct *mm, unsigned long addr,
-		       pmd_t *pmdp, pmd_t pmd);
-extern void update_mmu_cache_pmd(struct vm_area_struct *vma, unsigned long addr,
-				 pmd_t *pmd);
-/*
- *
- * For core kernel code by design pmd_trans_huge is never run on any hugetlbfs
- * page. The hugetlbfs page table walking and mangling paths are totally
- * separated form the core VM paths and they're differentiated by
- *  VM_HUGETLB being set on vm_flags well before any pmd_trans_huge could run.
- *
- * pmd_trans_huge() is defined as false at build time if
- * CONFIG_TRANSPARENT_HUGEPAGE=n to optimize away code blocks at build
- * time in such case.
- *
- * For ppc64 we need to differntiate from explicit hugepages from THP, because
- * for THP we also track the subpage details at the pmd level. We don't do
- * that for explicit huge pages.
- *
- */
-static inline int pmd_trans_huge(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
-}
-
-static inline int pmd_trans_splitting(pmd_t pmd)
-{
-	if (pmd_trans_huge(pmd))
-		return pmd_val(pmd) & _PAGE_SPLITTING;
-	return 0;
-}
-
-extern int has_transparent_hugepage(void);
-#else
-static inline void hpte_do_hugepage_flush(struct mm_struct *mm,
-					  unsigned long addr, pmd_t *pmdp,
-					  unsigned long old_pmd)
-{
-
-	WARN(1, "%s called with THP disabled\n", __func__);
-}
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
-
-static inline int pmd_large(pmd_t pmd)
-{
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
-}
-
-static inline pte_t pmd_pte(pmd_t pmd)
-{
-	return __pte(pmd_val(pmd));
-}
-
-static inline pmd_t pte_pmd(pte_t pte)
-{
-	return __pmd(pte_val(pte));
-}
-
-static inline pte_t *pmdp_ptep(pmd_t *pmd)
-{
-	return (pte_t *)pmd;
-}
-
-#define pmd_pfn(pmd)		pte_pfn(pmd_pte(pmd))
-#define pmd_dirty(pmd)		pte_dirty(pmd_pte(pmd))
-#define pmd_young(pmd)		pte_young(pmd_pte(pmd))
-#define pmd_mkold(pmd)		pte_pmd(pte_mkold(pmd_pte(pmd)))
-#define pmd_wrprotect(pmd)	pte_pmd(pte_wrprotect(pmd_pte(pmd)))
-#define pmd_mkdirty(pmd)	pte_pmd(pte_mkdirty(pmd_pte(pmd)))
-#define pmd_mkyoung(pmd)	pte_pmd(pte_mkyoung(pmd_pte(pmd)))
-#define pmd_mkwrite(pmd)	pte_pmd(pte_mkwrite(pmd_pte(pmd)))
-
-#define __HAVE_ARCH_PMD_WRITE
-#define pmd_write(pmd)		pte_write(pmd_pte(pmd))
-
-static inline pmd_t pmd_mkhuge(pmd_t pmd)
-{
-	/* Do nothing, mk_pmd() does this part.  */
-	return pmd;
-}
-
-static inline pmd_t pmd_mknotpresent(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) & ~_PAGE_PRESENT);
-}
-
-static inline pmd_t pmd_mksplitting(pmd_t pmd)
-{
-	return __pmd(pmd_val(pmd) | _PAGE_SPLITTING);
-}
-
-#define __HAVE_ARCH_PMD_SAME
-static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
-{
-	return (((pmd_val(pmd_a) ^ pmd_val(pmd_b)) & ~_PAGE_HPTEFLAGS) == 0);
-}
-
-#define __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS
-extern int pmdp_set_access_flags(struct vm_area_struct *vma,
-				 unsigned long address, pmd_t *pmdp,
-				 pmd_t entry, int dirty);
-
-extern unsigned long pmd_hugepage_update(struct mm_struct *mm,
-					 unsigned long addr,
-					 pmd_t *pmdp,
-					 unsigned long clr,
-					 unsigned long set);
-
-static inline int __pmdp_test_and_clear_young(struct mm_struct *mm,
-					      unsigned long addr, pmd_t *pmdp)
-{
-	unsigned long old;
-
-	if ((pmd_val(*pmdp) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
-		return 0;
-	old = pmd_hugepage_update(mm, addr, pmdp, _PAGE_ACCESSED, 0);
-	return ((old & _PAGE_ACCESSED) != 0);
-}
-
-#define __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
-extern int pmdp_test_and_clear_young(struct vm_area_struct *vma,
-				     unsigned long address, pmd_t *pmdp);
-#define __HAVE_ARCH_PMDP_CLEAR_YOUNG_FLUSH
-extern int pmdp_clear_flush_young(struct vm_area_struct *vma,
-				  unsigned long address, pmd_t *pmdp);
-
-#define __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR
-extern pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
-				     unsigned long addr, pmd_t *pmdp);
-
-#define __HAVE_ARCH_PMDP_SET_WRPROTECT
-static inline void pmdp_set_wrprotect(struct mm_struct *mm, unsigned long addr,
-				      pmd_t *pmdp)
-{
-
-	if ((pmd_val(*pmdp) & _PAGE_RW) == 0)
-		return;
-
-	pmd_hugepage_update(mm, addr, pmdp, _PAGE_RW, 0);
-}
-
-#define __HAVE_ARCH_PMDP_SPLITTING_FLUSH
-extern void pmdp_splitting_flush(struct vm_area_struct *vma,
-				 unsigned long address, pmd_t *pmdp);
-
-extern pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
-				 unsigned long address, pmd_t *pmdp);
-#define pmdp_collapse_flush pmdp_collapse_flush
-
-#define __HAVE_ARCH_PGTABLE_DEPOSIT
-extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
-				       pgtable_t pgtable);
-#define __HAVE_ARCH_PGTABLE_WITHDRAW
-extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
-
-#define __HAVE_ARCH_PMDP_INVALIDATE
-extern void pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
-			    pmd_t *pmdp);
-
-#define pmd_move_must_withdraw pmd_move_must_withdraw
-struct spinlock;
-static inline int pmd_move_must_withdraw(struct spinlock *new_pmd_ptl,
-					 struct spinlock *old_pmd_ptl)
-{
-	/*
-	 * Archs like ppc64 use pgtable to store per pmd
-	 * specific information. So when we switch the pmd,
-	 * we should also withdraw and deposit the pgtable
-	 */
-	return true;
-}
-#endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_H */
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index 13befa35d8a8..538390638b63 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -429,6 +429,7 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long vpn,
 	local_irq_restore(flags);
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static void native_hugepage_invalidate(unsigned long vsid,
 				       unsigned long addr,
 				       unsigned char *hpte_slot_array,
@@ -482,6 +483,15 @@ static void native_hugepage_invalidate(unsigned long vsid,
 	}
 	local_irq_restore(flags);
 }
+#else
+static void native_hugepage_invalidate(unsigned long vsid,
+				       unsigned long addr,
+				       unsigned char *hpte_slot_array,
+				       int psize, int ssize, int local)
+{
+	WARN(1, "%s called without THP support\n", __func__);
+}
+#endif
 
 static inline int __hpte_actual_psize(unsigned int lp, int psize)
 {
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 3967e3cce03e..d42dd289abfe 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -359,7 +359,7 @@ struct page *pud_page(pud_t pud)
 struct page *pmd_page(pmd_t pmd)
 {
 	if (pmd_trans_huge(pmd) || pmd_huge(pmd))
-		return pfn_to_page(pmd_pfn(pmd));
+		return pte_page(pmd_pte(pmd));
 	return virt_to_page(pmd_page_vaddr(pmd));
 }
 
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index b7a67e3d2201..6d46547871aa 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -396,6 +396,7 @@ static void pSeries_lpar_hpte_invalidate(unsigned long slot, unsigned long vpn,
 	BUG_ON(lpar_rc != H_SUCCESS);
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 /*
  * Limit iterations holding pSeries_lpar_tlbie_lock to 3. We also need
  * to make sure that we avoid bouncing the hypervisor tlbie lock.
@@ -494,6 +495,15 @@ static void pSeries_lpar_hugepage_invalidate(unsigned long vsid,
 		__pSeries_lpar_hugepage_invalidate(slot_array, vpn_array,
 						   index, psize, ssize);
 }
+#else
+static void pSeries_lpar_hugepage_invalidate(unsigned long vsid,
+					     unsigned long addr,
+					     unsigned char *hpte_slot_array,
+					     int psize, int ssize, int local)
+{
+	WARN(1, "%s called without THP support\n", __func__);
+}
+#endif
 
 static void pSeries_lpar_hpte_removebolted(unsigned long ea,
 					   int psize, int ssize)
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH V4 31/31] powerpc/mm: Add a _PAGE_PTE bit
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (29 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 30/31] powerpc/mm: Move THP headers around Aneesh Kumar K.V
@ 2015-10-17 10:08 ` Aneesh Kumar K.V
  2015-10-17 13:22 ` [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Benjamin Herrenschmidt
  2015-10-22 18:40 ` Denis Kirjanov
  32 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-17 10:08 UTC (permalink / raw)
  To: benh, paulus, mpe, Scott Wood; +Cc: linuxppc-dev, Aneesh Kumar K.V

For a pte entry we will have _PAGE_PTE set. Our pte page
addresses have a minimum alignment requirement of HUGEPD_SHIFT_MASK + 1.
We use the lower 7 bits to indicate a hugepd, i.e.

For pmd and pgd entries we can find:
1) _PAGE_PTE set -> indicates a pte entry
2) bits [2..6] non-zero -> indicates a hugepd.
   They also encode the size. We skip bit 1 (_PAGE_PRESENT).
3) otherwise -> a pointer to the next table.
   (An illustrative sketch of this classification follows below.)
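
For illustration only (not part of the patch): a minimal, self-contained C
sketch of the classification above. The bit values are stand-ins following
the description (bit 0 standing in for _PAGE_PTE, bits [2..6] for the hugepd
shift), and the helper name is made up; none of these are the kernel's actual
definitions, which live in the hunks below.

	#include <stdio.h>

	/* Stand-in values mirroring the description above (illustrative only). */
	#define PTE_BIT           0x01UL  /* bit 0, plays the role of _PAGE_PTE  */
	#define HUGEPD_SHIFT_BITS 0x7cUL  /* bits [2..6], encode the hugepd size */

	enum entry_kind { ENTRY_PTE, ENTRY_HUGEPD, ENTRY_NEXT_TABLE };

	/* Hypothetical helper: classify a pmd/pgd-level entry value. */
	static enum entry_kind classify_entry(unsigned long val)
	{
		if (val & PTE_BIT)              /* case 1: leaf pte entry        */
			return ENTRY_PTE;
		if (val & HUGEPD_SHIFT_BITS)    /* case 2: hugepd pointer + size */
			return ENTRY_HUGEPD;
		return ENTRY_NEXT_TABLE;        /* case 3: pointer to next table */
	}

	int main(void)
	{
		printf("%d %d %d\n",
		       classify_entry(0x01),    /* pte                  -> 0 */
		       classify_entry(0x10),    /* hugepd (bit 4 set)   -> 1 */
		       classify_entry(0x00));   /* next-level table ptr -> 2 */
		return 0;
	}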

Acked-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |  9 ++++++---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 23 +++++++++--------------
 arch/powerpc/include/asm/book3s/64/hash.h     | 13 +++++++------
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  3 +--
 arch/powerpc/include/asm/pte-common.h         |  5 +++++
 arch/powerpc/mm/hugetlbpage.c                 |  4 ++--
 arch/powerpc/mm/pgtable.c                     |  4 ++++
 arch/powerpc/mm/pgtable_64.c                  |  7 +------
 8 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index b4d25529d179..e59832c94609 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -116,10 +116,13 @@ static inline int pgd_huge(pgd_t pgd)
 static inline int hugepd_ok(hugepd_t hpd)
 {
 	/*
-	 * hugepd pointer, bottom two bits == 00 and next 4 bits
-	 * indicate size of table
+	 * if it is not a pte and has the hugepd shift mask
+	 * set, then it is a hugepd directory pointer
 	 */
-	return (((hpd.pd & 0x3) == 0x0) && ((hpd.pd & HUGEPD_SHIFT_MASK) != 0));
+	if (!(hpd.pd & _PAGE_PTE) &&
+	    ((hpd.pd & HUGEPD_SHIFT_MASK) != 0))
+		return true;
+	return false;
 }
 #define is_hugepd(hpd)		(hugepd_ok(hpd))
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 34eab4542b85..957d66d13a97 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -130,25 +130,25 @@ static inline bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
 static inline int pmd_huge(pmd_t pmd)
 {
 	/*
-	 * leaf pte for huge page, bottom two bits != 00
+	 * leaf pte for huge page
 	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
+	return !!(pmd_val(pmd) & _PAGE_PTE);
 }
 
 static inline int pud_huge(pud_t pud)
 {
 	/*
-	 * leaf pte for huge page, bottom two bits != 00
+	 * leaf pte for huge page
 	 */
-	return ((pud_val(pud) & 0x3) != 0x0);
+	return !!(pud_val(pud) & _PAGE_PTE);
 }
 
 static inline int pgd_huge(pgd_t pgd)
 {
 	/*
-	 * leaf pte for huge page, bottom two bits != 00
+	 * leaf pte for huge page
 	 */
-	return ((pgd_val(pgd) & 0x3) != 0x0);
+	return !!(pgd_val(pgd) & _PAGE_PTE);
 }
 #define pgd_huge pgd_huge
 
@@ -236,10 +236,8 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
  */
 static inline int pmd_trans_huge(pmd_t pmd)
 {
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return (pmd_val(pmd) & 0x3) && (pmd_val(pmd) & _PAGE_THP_HUGE);
+	return !!((pmd_val(pmd) & (_PAGE_PTE | _PAGE_THP_HUGE)) ==
+		  (_PAGE_PTE | _PAGE_THP_HUGE));
 }
 
 static inline int pmd_trans_splitting(pmd_t pmd)
@@ -251,10 +249,7 @@ static inline int pmd_trans_splitting(pmd_t pmd)
 
 static inline int pmd_large(pmd_t pmd)
 {
-	/*
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
-	return ((pmd_val(pmd) & 0x3) != 0x0);
+	return !!(pmd_val(pmd) & _PAGE_PTE);
 }
 
 static inline pmd_t pmd_mknotpresent(pmd_t pmd)
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index e63556c98c32..7fef24c4c81c 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -14,11 +14,12 @@
  * We could create separate kernel read-only if we used the 3 PP bits
  * combinations that newer processors provide but we currently don't.
  */
-#define _PAGE_PRESENT		0x00001 /* software: pte contains a translation */
-#define _PAGE_USER		0x00002 /* matches one of the PP bits */
+#define _PAGE_PTE		0x00001
+#define _PAGE_PRESENT		0x00002 /* software: pte contains a translation */
+#define _PAGE_USER		0x00004 /* matches one of the PP bits */
 #define _PAGE_BIT_SWAP_TYPE	2
-#define _PAGE_EXEC		0x00004 /* No execute on POWER4 and newer (we invert) */
-#define _PAGE_GUARDED		0x00008
+#define _PAGE_EXEC		0x00008 /* No execute on POWER4 and newer (we invert) */
+#define _PAGE_GUARDED		0x00010
 /* We can derive Memory coherence from _PAGE_NO_CACHE */
 #define _PAGE_COHERENT		0x0
 #define _PAGE_NO_CACHE		0x00020 /* I: cache inhibit */
@@ -49,7 +50,7 @@
  */
 #define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS |		\
 			 _PAGE_DIRTY | _PAGE_ACCESSED | _PAGE_SPLITTING | \
-			 _PAGE_THP_HUGE)
+			 _PAGE_THP_HUGE | _PAGE_PTE)
 
 #ifdef CONFIG_PPC_64K_PAGES
 #include <asm/book3s/64/hash-64k.h>
@@ -139,7 +140,7 @@
  * pgprot changes
  */
 #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
-			 _PAGE_ACCESSED | _PAGE_SPECIAL)
+			 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE)
 /*
  * Mask of bits returned by pte_pgprot()
  */
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 3117f0495b74..0b43ca60dcb9 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -213,8 +213,7 @@ static inline int pmd_protnone(pmd_t pmd)
 
 static inline pmd_t pmd_mkhuge(pmd_t pmd)
 {
-	/* Do nothing, mk_pmd() does this part.  */
-	return pmd;
+	return __pmd(pmd_val(pmd) | (_PAGE_PTE | _PAGE_THP_HUGE));
 }
 
 #define __HAVE_ARCH_PMDP_SET_ACCESS_FLAGS
diff --git a/arch/powerpc/include/asm/pte-common.h b/arch/powerpc/include/asm/pte-common.h
index 71537a319fc8..1ec67b043065 100644
--- a/arch/powerpc/include/asm/pte-common.h
+++ b/arch/powerpc/include/asm/pte-common.h
@@ -40,6 +40,11 @@
 #else
 #define _PAGE_RW 0
 #endif
+
+#ifndef _PAGE_PTE
+#define _PAGE_PTE 0
+#endif
+
 #ifndef _PMD_PRESENT_MASK
 #define _PMD_PRESENT_MASK	_PMD_PRESENT
 #endif
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index f20c1a9065da..9f861ea44da7 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -912,8 +912,8 @@ void flush_dcache_icache_hugepage(struct page *page)
  * We have 4 cases for pgds and pmds:
  * (1) invalid (all zeroes)
  * (2) pointer to next table, as normal; bottom 6 bits == 0
- * (3) leaf pte for huge page, bottom two bits != 00
- * (4) hugepd pointer, bottom two bits == 00, next 4 bits indicate size of table
+ * (3) leaf pte for huge page _PAGE_PTE set
+ * (4) hugepd pointer, _PAGE_PTE = 0 and bits [2..6] indicate size of table
  *
  * So long as we atomically load page table pointers we are safe against teardown,
  * we can follow the address down to the the page and take a ref on it.
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 83dfcb55ffef..83dfd7925c72 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -179,6 +179,10 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
 	 */
 	VM_WARN_ON((pte_val(*ptep) & (_PAGE_PRESENT | _PAGE_USER)) ==
 		(_PAGE_PRESENT | _PAGE_USER));
+	/*
+	 * Add the pte bit when trying to set a pte
+	 */
+	pte = __pte(pte_val(pte) | _PAGE_PTE);
 
 	/* Note: mm->context.id might not yet have been assigned as
 	 * this context might not have been activated yet when this
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index d42dd289abfe..ea6bc31debb0 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -765,13 +765,8 @@ static pmd_t pmd_set_protbits(pmd_t pmd, pgprot_t pgprot)
 pmd_t pfn_pmd(unsigned long pfn, pgprot_t pgprot)
 {
 	unsigned long pmdv;
-	/*
-	 * For a valid pte, we would have _PAGE_PRESENT always
-	 * set. We use this to check THP page at pmd level.
-	 * leaf pte for huge page, bottom two bits != 00
-	 */
+
 	pmdv = pfn << PTE_RPN_SHIFT;
-	pmdv |= _PAGE_THP_HUGE;
 	return pmd_set_protbits(__pmd(pmdv), pgprot);
 }
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (30 preceding siblings ...)
  2015-10-17 10:08 ` [PATCH V4 31/31] powerpc/mm: Add a _PAGE_PTE bit Aneesh Kumar K.V
@ 2015-10-17 13:22 ` Benjamin Herrenschmidt
  2015-10-19  3:17   ` Aneesh Kumar K.V
  2015-10-19  8:31   ` Aneesh Kumar K.V
  2015-10-22 18:40 ` Denis Kirjanov
  32 siblings, 2 replies; 41+ messages in thread
From: Benjamin Herrenschmidt @ 2015-10-17 13:22 UTC (permalink / raw)
  To: Aneesh Kumar K.V, paulus, mpe, Scott Wood; +Cc: linuxppc-dev

On Sat, 2015-10-17 at 15:38 +0530, Aneesh Kumar K.V wrote:
> Hi All,
> 
> This patch series attempt to update book3s 64 linux page table format to
> make it more flexible. Our current pte format is very restrictive and we
> overload multiple pte bits. This is due to the non-availability of free bits
> in pte_t. We use pte_t to track the validity of 4K subpages. This patch
> series free up pte_t of 11 bits by moving 4K subpage tracking to the
> lower half of PTE page. The pte format is updated such that we have a
> better method for identifying a pte entry at pmd level. This will also enable
> us to implement hugetlb migration(not yet done in this series). 

I still have serious concerns about the fact that we now use 4 times
more memory for page tables than strictly necessary. We were using
twice as much before.

We need to find a way to not allocate all those "other halves" when not
needed.

I understand it's tricky, we tend to notice we need the second half too
late...

Maybe if we could escalate the hash miss into a minor fault when the
second half is needed and not present, we can then allocate it from the

For demotion of the vmap space, we might have to be a bit smarter,
maybe detect at ioremap/vmap time and flag the mm as needing second
halves for everything (and allocate them).

Of course if the machine doesn't do hw 64k, we would always allocate
the second half.

The question then becomes how to reference it from the first half.

A completely parallel tree means a lot more walks for each PTE; is
there something in the PTE page's struct page we can use, maybe?

> efore making the changes to the pte format, I am splitting the
> pte header definition such that we now have the below layout for headers
> 
> book3s
>    32
>      hash.h pgtable.h 
>    64
>      hash.h  pgtable.h hash-4k.h hash-64k.h
> booke
>   32
>      pgtable.h pte-40x.h pte-44x.h pte-8xx.h pte-fsl-booke.h
>   64
>     pgtable-4k.h  pgtable-64k.h  pgtable.h
> 
> I have done the header split such that booke headers and modified to the minimum so as to avoid
> causing breakage in booke.
> 
> The patch series can also be found at
> https://github.com/kvaneesh/linux.git book3s-pte-format 
> https://github.com/kvaneesh/linux/commits/book3s-pte-format
> 
> 
> Performance numbers with and without patch series.
> 
> Path length __hash_page_4k
> with patch: 196
> without patch: 142
> 
> Path length __hash_page_64k
> with patch: 219
> without patch: 154
> 
> But even if we have a path lengh increase of around 50 instructions. We don't see
> the impact when running workload. I tried the kernelbuild test. 
> 
> With THP enabled (which is default) we see an improvement. I haven't fully looked at
> the reason. This could be due to reduced contention of ptl lock. __hash_thp_page is
> already a C code.
> 
> make -j64 vmlinux modules 
> With fix:
> ---------
> real    1m35.509s
> user    56m8.565s
> sys     4m34.973s
> 
> real    1m32.174s
> user    57m2.336s
> sys     4m39.142s
> 
> Without fix:
> ---------------
> real    1m37.703s
> user    58m50.783s
> sys     7m52.440s
> 
> real    1m37.890s
> user    57m55.445s
> sys     7m50.501s
> 
> THP disabled:
> 
> make -j64 vmlinux modules 
> With fix:
> ---------
> real    1m37.197s
> user    58m28.672s
> sys     7m58.188s
> 
> real    1m44.638s
> user    58m37.551s
> sys     7m53.960s
> 
> Without fix:
> ------------
> real    1m41.224s
> user    58m46.944s
> sys     7m49.714s
> 
> real    1m42.585s
> user    59m14.019s
> sys     7m52.714s
> 
> 
> Changes from V3:
> * Add missing #define pgprot_*
> * Add Acked-by
> 
> Changes from V2:
> * rebase to -next for powerpc tree
> 
> Changes from V1:
> 1) Build fix with STRICT_MM_TYPES enabled 
> 2) pte_mkwrite fix for nohash
> 3) rebase to latest linus tree.
> 
> 
> Aneesh Kumar K.V (31):
>   powerpc/mm: move pte headers to book3s directory
>   powerpc/mm: move pte headers to book3s directory (part 2)
>   powerpc/mm: make a separate copy for book3s
>   powerpc/mm: make a separate copy for book3s (part 2)
>   powerpc/mm: Move hash specific pte width and other defines to book3s
>   powerpc/mm: Delete booke bits from book3s
>   powerpc/mm: Don't have generic headers introduce functions touching
>     pte bits
>   powerpc/mm: Drop pte-common.h from BOOK3S 64
>   powerpc/mm: Don't use pte_val as lvalue
>   powerpc/mm: Don't use pmd_val,pud_val and pgd_val as lvalue
>   powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h
>   powerpc/mm: Move PTE bits from generic functions to hash64 functions.
>   powerpc/booke: Move nohash headers (part 1)
>   powerpc/booke: Move nohash headers (part 2)
>   powerpc/booke: Move nohash headers (part 3)
>   powerpc/booke: Move nohash headers (part 4)
>   powerpc/booke: Move nohash headers (part 5)
>   powerpc/mm: Increase the pte frag size.
>   powerpc/mm: Convert 4k hash insert to C
>   powerpc/mm: update __real_pte to take address as argument
>   powerpc/mm: make pte page hash index slot 8 bits
>   powerpc/mm: Don't track subpage valid bit in pte_t
>   powerpc/mm: Increase the width of #define
>   powerpc/mm: Convert __hash_page_64K to C
>   powerpc/mm: Convert 4k insert from asm to C
>   powerpc/mm: Remove the dependency on pte bit position in asm code
>   powerpc/mm: Add helper for converting pte bit to hpte bits
>   powerpc/mm: Move WIMG update to helper.
>   powerpc/mm: Move hugetlb related headers
>   powerpc/mm: Move THP headers around
>   powerpc/mm: Add a _PAGE_PTE bit
> 
>  .../include/asm/{pte-hash32.h => book3s/32/hash.h} |    6 +-
>  .../asm/{pgtable-ppc32.h => book3s/32/pgtable.h}   |  286 ++++--
>  .../{pgtable-ppc64-4k.h => book3s/64/hash-4k.h}    |   58 +-
>  arch/powerpc/include/asm/book3s/64/hash-64k.h      |  296 ++++++
>  arch/powerpc/include/asm/book3s/64/hash.h          |  530 +++++++++++
>  arch/powerpc/include/asm/book3s/64/pgtable.h       |  266 ++++++
>  arch/powerpc/include/asm/book3s/pgtable.h          |   29 +
>  arch/powerpc/include/asm/mmu-hash64.h              |    2 +-
>  .../asm/{pgtable-ppc32.h => nohash/32/pgtable.h}   |   25 +-
>  arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h |    6 +-
>  arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h |    6 +-
>  arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h |    6 +-
>  .../include/asm/{ => nohash/32}/pte-fsl-booke.h    |    6 +-
>  .../{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} |   12 +-
>  .../64/pgtable-64k.h}                              |    6 +-
>  .../asm/{pgtable-ppc64.h => nohash/64/pgtable.h}   |  307 +-----
>  arch/powerpc/include/asm/{ => nohash}/pgtable.h    |  175 ++--
>  arch/powerpc/include/asm/{ => nohash}/pte-book3e.h |    6 +-
>  arch/powerpc/include/asm/page.h                    |   90 +-
>  arch/powerpc/include/asm/pgalloc-32.h              |   34 +-
>  arch/powerpc/include/asm/pgalloc-64.h              |   29 +-
>  arch/powerpc/include/asm/pgtable.h                 |  200 +---
>  arch/powerpc/include/asm/pte-common.h              |    5 +
>  arch/powerpc/include/asm/pte-hash64-4k.h           |   17 -
>  arch/powerpc/include/asm/pte-hash64-64k.h          |  102 --
>  arch/powerpc/include/asm/pte-hash64.h              |   54 --
>  arch/powerpc/kernel/exceptions-64s.S               |   16 +-
>  arch/powerpc/mm/40x_mmu.c                          |   10 +-
>  arch/powerpc/mm/Makefile                           |    9 +-
>  arch/powerpc/mm/hash64_4k.c                        |  123 +++
>  arch/powerpc/mm/hash64_64k.c                       |  313 ++++++
>  arch/powerpc/mm/hash_low_64.S                      | 1003 --------------------
>  arch/powerpc/mm/hash_native_64.c                   |   10 +
>  arch/powerpc/mm/hash_utils_64.c                    |  105 +-
>  arch/powerpc/mm/hugepage-hash64.c                  |   20 +-
>  arch/powerpc/mm/hugetlbpage-hash64.c               |   15 +-
>  arch/powerpc/mm/hugetlbpage.c                      |   58 +-
>  arch/powerpc/mm/pgtable.c                          |    4 +
>  arch/powerpc/mm/pgtable_64.c                       |   28 +-
>  arch/powerpc/mm/tlb_hash64.c                       |    2 +-
>  arch/powerpc/platforms/pseries/lpar.c              |   10 +
>  41 files changed, 2184 insertions(+), 2101 deletions(-)
>  rename arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} (93%)
>  copy arch/powerpc/include/asm/{pgtable-ppc32.h => book3s/32/pgtable.h} (62%)
>  copy arch/powerpc/include/asm/{pgtable-ppc64-4k.h => book3s/64/hash-4k.h} (71%)
>  create mode 100644 arch/powerpc/include/asm/book3s/64/hash-64k.h
>  create mode 100644 arch/powerpc/include/asm/book3s/64/hash.h
>  create mode 100644 arch/powerpc/include/asm/book3s/64/pgtable.h
>  create mode 100644 arch/powerpc/include/asm/book3s/pgtable.h
>  rename arch/powerpc/include/asm/{pgtable-ppc32.h => nohash/32/pgtable.h} (96%)
>  rename arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h (95%)
>  rename arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h (96%)
>  rename arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h (95%)
>  rename arch/powerpc/include/asm/{ => nohash/32}/pte-fsl-booke.h (88%)
>  rename arch/powerpc/include/asm/{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} (92%)
>  rename arch/powerpc/include/asm/{pgtable-ppc64-64k.h => nohash/64/pgtable-64k.h} (90%)
>  rename arch/powerpc/include/asm/{pgtable-ppc64.h => nohash/64/pgtable.h} (56%)
>  copy arch/powerpc/include/asm/{ => nohash}/pgtable.h (62%)
>  rename arch/powerpc/include/asm/{ => nohash}/pte-book3e.h (95%)
>  delete mode 100644 arch/powerpc/include/asm/pte-hash64-4k.h
>  delete mode 100644 arch/powerpc/include/asm/pte-hash64-64k.h
>  delete mode 100644 arch/powerpc/include/asm/pte-hash64.h
>  create mode 100644 arch/powerpc/mm/hash64_4k.c
>  create mode 100644 arch/powerpc/mm/hash64_64k.c
>  delete mode 100644 arch/powerpc/mm/hash_low_64.S
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
  2015-10-17 13:22 ` [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Benjamin Herrenschmidt
@ 2015-10-19  3:17   ` Aneesh Kumar K.V
  2015-10-19  8:31   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-19  3:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, paulus, mpe, Scott Wood; +Cc: linuxppc-dev

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Sat, 2015-10-17 at 15:38 +0530, Aneesh Kumar K.V wrote:
>> Hi All,
>> 
>> This patch series attempt to update book3s 64 linux page table format to
>> make it more flexible. Our current pte format is very restrictive and we
>> overload multiple pte bits. This is due to the non-availability of free bits
>> in pte_t. We use pte_t to track the validity of 4K subpages. This patch
>> series free up pte_t of 11 bits by moving 4K subpage tracking to the
>> lower half of PTE page. The pte format is updated such that we have a
>> better method for identifying a pte entry at pmd level. This will also enable
>> us to implement hugetlb migration(not yet done in this series). 
>
> I still have serious concerns about the fact that we now use 4 times
> more memory for page tables than strictly necessary. We were using
> twice as much before.
>
> We need to find a way to not allocate all those "other halves" when not
> needed.
>
> I understand it's tricky, we tend to notice we need the second half too
> late...
>
> Maybe if we could escalate the hash miss into a minor fault when the
> second half is needed and not present, we can then allocate it from the
>
> For demotion of the vmap space, we might have to be a bit smarter,
> maybe detect at ioremap/vmap time and flag the mm as needed second
> halves for everything (and allocate them).
>
> Of course if the machine doesn't do hw 64k, we would always allocate
> the second half.
>
> The question then becomes how to reference it from the first half.
>
> A completely parallel tree means a lot more walks for each PTE, is
> there something in the PTE page's struct page we can use maybe ?
>

We could use page->index. I have an early patch (it still finds wrong
ptes) which increases the fragment count to 32. IMHO, we should
get this series merged and not depend on increasing the fragment
count to 32 now. I was planning to get that merged in the next
merge window. That gives us sufficient time to test and also avoids
pushing all the changes in one merge window, which will allow us to
easily isolate issues if they arise.

    powerpc/mm: Move subpage tracking allocation out of pgtable
    
    This is the first part of the attempt to reduce pagetable usage for 64k.
    The goal is to allocate 4k subpage tracking memory only if we need to
    
    Not-Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

diff --git a/arch/powerpc/include/asm/pgalloc-64.h b/arch/powerpc/include/asm/pgalloc-64.h
index 4f1cc6c46728..f4243d9264c4 100644
--- a/arch/powerpc/include/asm/pgalloc-64.h
+++ b/arch/powerpc/include/asm/pgalloc-64.h
@@ -166,13 +166,24 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
 /*
  * we support 8 fragments per PTE page.
  */
-#define PTE_FRAG_NR	8
+#define PTE_FRAG_NR	32
 /*
  * We use a 2K PTE page fragment and another 4K for storing
  * real_pte_t hash index. Rounding the entire thing to 8K
  */
-#define PTE_FRAG_SIZE_SHIFT  13
+#define PTE_FRAG_SIZE_SHIFT  11
 #define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT)
+/*
+ * Size of array used to track the respective 4K subpage details
+ * Total entries in a pte page:
+ * (PTE_FRAG_NR * PTE_FRAG_SIZE) / 8
+ * Space needed per 64K pte:
+ * 16 bytes (One byte per 4K hpte)
+ * Total space for 4K PTE frag size
+ * (PTE_FRAG_NR * PTE_FRAG_SIZE) * 2
+ * ie, order 1 pages.
+ */
+#define PTE_4K_FRAG_SIZE (PTE_FRAG_SIZE << 1)
 
 extern pte_t *page_table_alloc(struct mm_struct *, unsigned long, int);
 extern void page_table_free(struct mm_struct *, unsigned long *, int);
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 84867a1491a2..08c141a844ca 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -15,24 +15,40 @@
 #include <linux/mm.h>
 #include <asm/machdep.h>
 #include <asm/mmu.h>
+#include <asm/pgalloc.h>
+
+static inline int pte_frag_index(unsigned long ptep, unsigned long pte_page)
+{
+	return  (ptep - pte_page)/ PTE_FRAG_SIZE;
+
+}
+
+static inline int pte_4k_track_index(int frag_index, int pte_index)
+{
+	return (PTE_4K_FRAG_SIZE * frag_index) + (pte_index * 16);
+
+}
 
 real_pte_t __real_pte(unsigned long addr, pte_t pte, pte_t *ptep)
 {
 	int indx;
 	real_pte_t rpte;
-	pte_t *pte_headp;
+	struct page *pte_page = virt_to_page(ptep);
 
 	rpte.pte = pte;
 	rpte.hidx = NULL;
 	if (pte_val(pte) & _PAGE_COMBO) {
+		int frag_index = pte_frag_index((unsigned long) ptep,
+					(unsigned long)page_address(pte_page));
+
 		indx = pte_index(addr);
-		pte_headp = ptep - indx;
 		/*
 		 * Make sure we order the hidx load against the _PAGE_COMBO
 		 * check. The store side ordering is done in __hash_page_4K
 		 */
 		smp_rmb();
-		rpte.hidx = (unsigned char *)(pte_headp + PTRS_PER_PTE) + (16 * indx);
+		rpte.hidx = ((unsigned char *)pte_page->index) +
+				pte_4k_track_index(frag_index, indx);
 	}
 	return rpte;
 }
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index 4e4efbc2658e..e25cba47febb 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -121,6 +121,7 @@ static void destroy_pagetable_page(struct mm_struct *mm)
 	count = atomic_sub_return(PTE_FRAG_NR - count, &page->_count);
 	if (!count) {
 		pgtable_page_dtor(page);
+		free_pages((unsigned long)page->index, 1);
 		free_hot_cold_page(page, 0);
 	}
 }
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index ea6bc31debb0..e6fe21af45ab 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -386,10 +386,18 @@ static pte_t *get_from_cache(struct mm_struct *mm)
 static pte_t *__alloc_for_cache(struct mm_struct *mm, int kernel)
 {
 	void *ret = NULL;
+	struct page *subpage_pte;
 	struct page *page = alloc_page(GFP_KERNEL | __GFP_NOTRACK |
 				       __GFP_REPEAT | __GFP_ZERO);
 	if (!page)
 		return NULL;
+
+	subpage_pte = alloc_pages(GFP_KERNEL | __GFP_NOTRACK |
+				  __GFP_REPEAT | __GFP_ZERO, 1);
+	WARN(page->index, "Page index is not null\n");
+	WARN(page->freelist, "Free list is not null\n");
+	page->index = (unsigned long)page_address(subpage_pte);
+
 	if (!kernel && !pgtable_page_ctor(page)) {
 		__free_page(page);
 		return NULL;
@@ -426,6 +434,8 @@ void page_table_free(struct mm_struct *mm, unsigned long *table, int kernel)
 {
 	struct page *page = virt_to_page(table);
 	if (put_page_testzero(page)) {
+		free_pages(page->index, 1);
+		page->index = 0;
 		if (!kernel)
 			pgtable_page_dtor(page);
 		free_hot_cold_page(page, 0);
@@ -437,6 +447,8 @@ static void page_table_free_rcu(void *table)
 {
 	struct page *page = virt_to_page(table);
 	if (put_page_testzero(page)) {
+		free_pages(page->index, 1);
+		page->index = 0;
 		pgtable_page_dtor(page);
 		free_hot_cold_page(page, 0);
 	}
@@ -471,6 +483,8 @@ void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
 		/* PTE page needs special handling */
 		struct page *page = virt_to_page(table);
 		if (put_page_testzero(page)) {
+			free_pages(page->index, 1);
+			page->index = 0;
 			pgtable_page_dtor(page);
 			free_hot_cold_page(page, 0);
 		}
 

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
  2015-10-17 13:22 ` [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Benjamin Herrenschmidt
  2015-10-19  3:17   ` Aneesh Kumar K.V
@ 2015-10-19  8:31   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-19  8:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, paulus, mpe, Scott Wood; +Cc: linuxppc-dev

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Sat, 2015-10-17 at 15:38 +0530, Aneesh Kumar K.V wrote:
>> Hi All,
>> 
>> This patch series attempt to update book3s 64 linux page table format to
>> make it more flexible. Our current pte format is very restrictive and we
>> overload multiple pte bits. This is due to the non-availability of free bits
>> in pte_t. We use pte_t to track the validity of 4K subpages. This patch
>> series free up pte_t of 11 bits by moving 4K subpage tracking to the
>> lower half of PTE page. The pte format is updated such that we have a
>> better method for identifying a pte entry at pmd level. This will also enable
>> us to implement hugetlb migration(not yet done in this series). 
>
> I still have serious concerns about the fact that we now use 4 times
> more memory for page tables than strictly necessary. We were using
> twice as much before.
>
> We need to find a way to not allocate all those "other halves" when not
> needed.
>
> I understand it's tricky, we tend to notice we need the second half too
> late...
>
> Maybe if we could escalate the hash miss into a minor fault when the
> second half is needed and not present, we can then allocate it from the
>
> For demotion of the vmap space, we might have to be a bit smarter,
> maybe detect at ioremap/vmap time and flag the mm as needed second
> halves for everything (and allocate them).
>
> Of course if the machine doesn't do hw 64k, we would always allocate
> the second half.
>

I am now trying to do the conditional alloc, and one of the unpleasant
side effects of that is spreading 4K subpage information all around
the code. We have the cases below:

1) subpage protection:
This will result in demotion of the segment when we call sys_subpage_prot.
A new pgtable_t allocation will then also allocate the subpage tracking
page, by checking in pte_alloc_one as below:
pte_t *page_table_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel)
{
	pte_t *pte;

	pte = get_from_cache(mm);
	if (pte)
		goto out;

	pte = __alloc_for_cache(mm, kernel);
out:
	if (REGION_ID(vmaddr) == USER_REGION_ID) {
		int slice_psize;

		slice_psize = get_slice_psize(mm, vmaddr);
		/*
		 * 64K linux page size with 4K segment base page size. Allocate
		 * the 4k subpage track page
		 */
		if (slice_psize == MMU_PAGE_4K)
			alloc_and_update_4k(pte);
	}
	return pte;
}

For an existing allocation, trying to insert a 4K hpte will push the fault
to handle_mm_fault, and I am looking at adding the below in update_mmu_cache:

	/*
	 * The fault was sent to us because, we didn't had the subpage
	 * tracking page
	 */
	if (pte_val(*ptep) & _PAGE_COMBO)
		alloc_and_update_4k(ptep);

We would have marked the pte with _PAGE_COMBO in __hash_page_4k.

2) ioremap with cache inhibited restrictions:
We can handle that in map_kernel_page
		/*
		 * if pte is non-cacheable and we have restrictions on
		 * using non cacheable large pages, we will have to use
		 * 4K subpages. Allocate the 4K region, but skip the
		 * segment demote
		 */
		if (mmu_ci_restrictions && (flags & _PAGE_NO_CACHE)) {
			alloc_and_update_4k(ptep);
		}

3) remap_4k_pfn:
Can we mark the segment demoted here? If so, the pte alloc will handle
that, provided the remapped address is in the user region.

4) vmalloc space demotion because somebody did a remap_4k_pfn to an address
in that range:
Not yet sure about this 

5) Anything else I missed?
You mentioned the vmap space; I was not able to find out when we would
demote an address in the 0xf region.

I will try to see if there is a better way to isolate the subpage
handling, but it looks like we are making the code more complex by
doing the above.

Also, how do I evaluate the impact of doubling the pgtable_t size?
Looking at the above, I am wondering whether this is really an issue we
need to worry about. We are better off than architectures with a 4K page
size in terms of pte pages: we allocate one pte page per 128MB of address
range, whereas for 4K page size architectures that is one per every 2MB.
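
For reference, a quick stand-alone sketch (user-space C, not kernel code)
checking those coverage numbers; the per-page entry counts of 2^11 (64K
pages) and 2^9 (typical 4K-page architectures) are my assumptions for
illustration:

	#include <stdio.h>

	int main(void)
	{
		/* bytes of address space mapped by one pte page */
		unsigned long ppc64_64k = (1UL << 11) * (64UL << 10);
		unsigned long arch_4k   = (1UL << 9)  * (4UL  << 10);

		printf("64K pages: %lu MB per pte page\n", ppc64_64k >> 20); /* 128 MB */
		printf("4K pages:  %lu MB per pte page\n", arch_4k   >> 20); /*   2 MB */
		return 0;
	}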

-aneesh

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
  2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
                   ` (31 preceding siblings ...)
  2015-10-17 13:22 ` [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Benjamin Herrenschmidt
@ 2015-10-22 18:40 ` Denis Kirjanov
  2015-10-23  6:06   ` Aneesh Kumar K.V
  32 siblings, 1 reply; 41+ messages in thread
From: Denis Kirjanov @ 2015-10-22 18:40 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, paulus, mpe, Scott Wood, linuxppc-dev

On 10/17/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
> Hi All,
>
> This patch series attempt to update book3s 64 linux page table format to
> make it more flexible. Our current pte format is very restrictive and we
> overload multiple pte bits. This is due to the non-availability of free bits
> in pte_t. We use pte_t to track the validity of 4K subpages. This patch
> series free up pte_t of 11 bits by moving 4K subpage tracking to the
> lower half of PTE page. The pte format is updated such that we have a
> better method for identifying a pte entry at pmd level. This will also
> enable
> us to implement hugetlb migration(not yet done in this series).

Hi, what tree does this apply to?

>
> Before making the changes to the pte format, I am splitting the
> pte header definition such that we now have the below layout for headers
>
> book3s
>    32
>      hash.h pgtable.h
>    64
>      hash.h  pgtable.h hash-4k.h hash-64k.h
> booke
>   32
>      pgtable.h pte-40x.h pte-44x.h pte-8xx.h pte-fsl-booke.h
>   64
>     pgtable-4k.h  pgtable-64k.h  pgtable.h
>
> I have done the header split such that booke headers and modified to the
> minimum so as to avoid
> causing breakage in booke.
>
> The patch series can also be found at
> https://github.com/kvaneesh/linux.git book3s-pte-format
> https://github.com/kvaneesh/linux/commits/book3s-pte-format
>
>
> Performance numbers with and without patch series.
>
> Path length __hash_page_4k
> with patch: 196
> without patch: 142
>
> Path length __hash_page_64k
> with patch: 219
> without patch: 154
>
> But even if we have a path lengh increase of around 50 instructions. We
> don't see
> the impact when running workload. I tried the kernelbuild test.
>
> With THP enabled (which is default) we see an improvement. I haven't fully
> looked at
> the reason. This could be due to reduced contention of ptl lock.
> __hash_thp_page is
> already a C code.
>
> make -j64 vmlinux modules
> With fix:
> ---------
> real    1m35.509s
> user    56m8.565s
> sys     4m34.973s
>
> real    1m32.174s
> user    57m2.336s
> sys     4m39.142s
>
> Without fix:
> ---------------
> real    1m37.703s
> user    58m50.783s
> sys     7m52.440s
>
> real    1m37.890s
> user    57m55.445s
> sys     7m50.501s
>
> THP disabled:
>
> make -j64 vmlinux modules
> With fix:
> ---------
> real    1m37.197s
> user    58m28.672s
> sys     7m58.188s
>
> real    1m44.638s
> user    58m37.551s
> sys     7m53.960s
>
> Without fix:
> ------------
> real    1m41.224s
> user    58m46.944s
> sys     7m49.714s
>
> real    1m42.585s
> user    59m14.019s
> sys     7m52.714s
>
>
> Changes from V3:
> * Add missing #define pgprot_*
> * Add Acked-by
>
> Changes from V2:
> * rebase to -next for powerpc tree
>
> Changes from V1:
> 1) Build fix with STRICT_MM_TYPES enabled
> 2) pte_mkwrite fix for nohash
> 3) rebase to latest linus tree.
>
>
> Aneesh Kumar K.V (31):
>   powerpc/mm: move pte headers to book3s directory
>   powerpc/mm: move pte headers to book3s directory (part 2)
>   powerpc/mm: make a separate copy for book3s
>   powerpc/mm: make a separate copy for book3s (part 2)
>   powerpc/mm: Move hash specific pte width and other defines to book3s
>   powerpc/mm: Delete booke bits from book3s
>   powerpc/mm: Don't have generic headers introduce functions touching
>     pte bits
>   powerpc/mm: Drop pte-common.h from BOOK3S 64
>   powerpc/mm: Don't use pte_val as lvalue
>   powerpc/mm: Don't use pmd_val,pud_val and pgd_val as lvalue
>   powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h
>   powerpc/mm: Move PTE bits from generic functions to hash64 functions.
>   powerpc/booke: Move nohash headers (part 1)
>   powerpc/booke: Move nohash headers (part 2)
>   powerpc/booke: Move nohash headers (part 3)
>   powerpc/booke: Move nohash headers (part 4)
>   powerpc/booke: Move nohash headers (part 5)
>   powerpc/mm: Increase the pte frag size.
>   powerpc/mm: Convert 4k hash insert to C
>   powerpc/mm: update __real_pte to take address as argument
>   powerpc/mm: make pte page hash index slot 8 bits
>   powerpc/mm: Don't track subpage valid bit in pte_t
>   powerpc/mm: Increase the width of #define
>   powerpc/mm: Convert __hash_page_64K to C
>   powerpc/mm: Convert 4k insert from asm to C
>   powerpc/mm: Remove the dependency on pte bit position in asm code
>   powerpc/mm: Add helper for converting pte bit to hpte bits
>   powerpc/mm: Move WIMG update to helper.
>   powerpc/mm: Move hugetlb related headers
>   powerpc/mm: Move THP headers around
>   powerpc/mm: Add a _PAGE_PTE bit
>
>  .../include/asm/{pte-hash32.h => book3s/32/hash.h} |    6 +-
>  .../asm/{pgtable-ppc32.h => book3s/32/pgtable.h}   |  286 ++++--
>  .../{pgtable-ppc64-4k.h => book3s/64/hash-4k.h}    |   58 +-
>  arch/powerpc/include/asm/book3s/64/hash-64k.h      |  296 ++++++
>  arch/powerpc/include/asm/book3s/64/hash.h          |  530 +++++++++++
>  arch/powerpc/include/asm/book3s/64/pgtable.h       |  266 ++++++
>  arch/powerpc/include/asm/book3s/pgtable.h          |   29 +
>  arch/powerpc/include/asm/mmu-hash64.h              |    2 +-
>  .../asm/{pgtable-ppc32.h => nohash/32/pgtable.h}   |   25 +-
>  arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h |    6 +-
>  arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h |    6 +-
>  arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h |    6 +-
>  .../include/asm/{ => nohash/32}/pte-fsl-booke.h    |    6 +-
>  .../{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} |   12 +-
>  .../64/pgtable-64k.h}                              |    6 +-
>  .../asm/{pgtable-ppc64.h => nohash/64/pgtable.h}   |  307 +-----
>  arch/powerpc/include/asm/{ => nohash}/pgtable.h    |  175 ++--
>  arch/powerpc/include/asm/{ => nohash}/pte-book3e.h |    6 +-
>  arch/powerpc/include/asm/page.h                    |   90 +-
>  arch/powerpc/include/asm/pgalloc-32.h              |   34 +-
>  arch/powerpc/include/asm/pgalloc-64.h              |   29 +-
>  arch/powerpc/include/asm/pgtable.h                 |  200 +---
>  arch/powerpc/include/asm/pte-common.h              |    5 +
>  arch/powerpc/include/asm/pte-hash64-4k.h           |   17 -
>  arch/powerpc/include/asm/pte-hash64-64k.h          |  102 --
>  arch/powerpc/include/asm/pte-hash64.h              |   54 --
>  arch/powerpc/kernel/exceptions-64s.S               |   16 +-
>  arch/powerpc/mm/40x_mmu.c                          |   10 +-
>  arch/powerpc/mm/Makefile                           |    9 +-
>  arch/powerpc/mm/hash64_4k.c                        |  123 +++
>  arch/powerpc/mm/hash64_64k.c                       |  313 ++++++
>  arch/powerpc/mm/hash_low_64.S                      | 1003
> --------------------
>  arch/powerpc/mm/hash_native_64.c                   |   10 +
>  arch/powerpc/mm/hash_utils_64.c                    |  105 +-
>  arch/powerpc/mm/hugepage-hash64.c                  |   20 +-
>  arch/powerpc/mm/hugetlbpage-hash64.c               |   15 +-
>  arch/powerpc/mm/hugetlbpage.c                      |   58 +-
>  arch/powerpc/mm/pgtable.c                          |    4 +
>  arch/powerpc/mm/pgtable_64.c                       |   28 +-
>  arch/powerpc/mm/tlb_hash64.c                       |    2 +-
>  arch/powerpc/platforms/pseries/lpar.c              |   10 +
>  41 files changed, 2184 insertions(+), 2101 deletions(-)
>  rename arch/powerpc/include/asm/{pte-hash32.h => book3s/32/hash.h} (93%)
>  copy arch/powerpc/include/asm/{pgtable-ppc32.h => book3s/32/pgtable.h}
> (62%)
>  copy arch/powerpc/include/asm/{pgtable-ppc64-4k.h => book3s/64/hash-4k.h}
> (71%)
>  create mode 100644 arch/powerpc/include/asm/book3s/64/hash-64k.h
>  create mode 100644 arch/powerpc/include/asm/book3s/64/hash.h
>  create mode 100644 arch/powerpc/include/asm/book3s/64/pgtable.h
>  create mode 100644 arch/powerpc/include/asm/book3s/pgtable.h
>  rename arch/powerpc/include/asm/{pgtable-ppc32.h => nohash/32/pgtable.h} (96%)
>  rename arch/powerpc/include/asm/{ => nohash/32}/pte-40x.h (95%)
>  rename arch/powerpc/include/asm/{ => nohash/32}/pte-44x.h (96%)
>  rename arch/powerpc/include/asm/{ => nohash/32}/pte-8xx.h (95%)
>  rename arch/powerpc/include/asm/{ => nohash/32}/pte-fsl-booke.h (88%)
>  rename arch/powerpc/include/asm/{pgtable-ppc64-4k.h => nohash/64/pgtable-4k.h} (92%)
>  rename arch/powerpc/include/asm/{pgtable-ppc64-64k.h => nohash/64/pgtable-64k.h} (90%)
>  rename arch/powerpc/include/asm/{pgtable-ppc64.h => nohash/64/pgtable.h} (56%)
>  copy arch/powerpc/include/asm/{ => nohash}/pgtable.h (62%)
>  rename arch/powerpc/include/asm/{ => nohash}/pte-book3e.h (95%)
>  delete mode 100644 arch/powerpc/include/asm/pte-hash64-4k.h
>  delete mode 100644 arch/powerpc/include/asm/pte-hash64-64k.h
>  delete mode 100644 arch/powerpc/include/asm/pte-hash64.h
>  create mode 100644 arch/powerpc/mm/hash64_4k.c
>  create mode 100644 arch/powerpc/mm/hash64_64k.c
>  delete mode 100644 arch/powerpc/mm/hash_low_64.S
>
> --
> 2.5.0
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev


* Re: [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
  2015-10-22 18:40 ` Denis Kirjanov
@ 2015-10-23  6:06   ` Aneesh Kumar K.V
  2015-10-23 19:08     ` Denis Kirjanov
  0 siblings, 1 reply; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-10-23  6:06 UTC (permalink / raw)
  To: Denis Kirjanov; +Cc: benh, paulus, mpe, Scott Wood, linuxppc-dev

Denis Kirjanov <kda@linux-powerpc.org> writes:

> On 10/17/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>> Hi All,
>>
>> This patch series attempt to update book3s 64 linux page table format to
>> make it more flexible. Our current pte format is very restrictive and we
>> overload multiple pte bits. This is due to the non-availability of free bits
>> in pte_t. We use pte_t to track the validity of 4K subpages. This patch
>> series free up pte_t of 11 bits by moving 4K subpage tracking to the
>> lower half of PTE page. The pte format is updated such that we have a
>> better method for identifying a pte entry at pmd level. This will also
>> enable
>> us to implement hugetlb migration(not yet done in this series).
>
> Hi, what tree does this apply to?
>

The powerpc tree, next branch:

https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/
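
For reference, a minimal sketch of that setup (the local branch name and
mbox filename below are only illustrative):

  git clone git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
  cd linux
  git checkout -b book3s-pte-test origin/next   # series is based on the next branch
  git am powerpc-mm-book3s-v4.mbox              # illustrative name for the 31-patch mbox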

-aneesh


* Re: [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
  2015-10-23  6:06   ` Aneesh Kumar K.V
@ 2015-10-23 19:08     ` Denis Kirjanov
  2015-11-03  5:02       ` Aneesh Kumar K.V
  0 siblings, 1 reply; 41+ messages in thread
From: Denis Kirjanov @ 2015-10-23 19:08 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, paulus, mpe, Scott Wood, linuxppc-dev

On 10/23/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
> Denis Kirjanov <kda@linux-powerpc.org> writes:
>
>> On 10/17/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>> Hi All,
>>>
>>> This patch series attempt to update book3s 64 linux page table format to
>>> make it more flexible. Our current pte format is very restrictive and we
>>> overload multiple pte bits. This is due to the non-availability of free
>>> bits
>>> in pte_t. We use pte_t to track the validity of 4K subpages. This patch
>>> series free up pte_t of 11 bits by moving 4K subpage tracking to the
>>> lower half of PTE page. The pte format is updated such that we have a
>>> better method for identifying a pte entry at pmd level. This will also
>>> enable
>>> us to implement hugetlb migration(not yet done in this series).
>>
>> Hi, what tree does this apply to?
>>
>
> linux-powerpc tree -next branch
>
> https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/

Ok, but I'm getting the following errors when applying the first patch
from your series:

error: patch failed: arch/powerpc/include/asm/pgalloc-64.h:166
error: arch/powerpc/include/asm/pgalloc-64.h: patch does not apply
error: arch/powerpc/mm/hash64_64k.c: No such file or directory

kda@hydra ~/devel/linux $ git branch
  master
* next
kda@hydra ~/devel/linux $ git remote show origin
* remote origin
  Fetch URL: git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
  Push  URL: git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git

Or am I missing something?

>
> -aneesh
>
>


* Re: [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
  2015-10-23 19:08     ` Denis Kirjanov
@ 2015-11-03  5:02       ` Aneesh Kumar K.V
  2015-11-10  8:28         ` Denis Kirjanov
  0 siblings, 1 reply; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-11-03  5:02 UTC (permalink / raw)
  To: Denis Kirjanov; +Cc: benh, paulus, mpe, Scott Wood, linuxppc-dev

Denis Kirjanov <kda@linux-powerpc.org> writes:

> On 10/23/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>> Denis Kirjanov <kda@linux-powerpc.org> writes:
>>
>>> On 10/17/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>>> Hi All,
>>>>
>>>> This patch series attempt to update book3s 64 linux page table format to
>>>> make it more flexible. Our current pte format is very restrictive and we
>>>> overload multiple pte bits. This is due to the non-availability of free
>>>> bits
>>>> in pte_t. We use pte_t to track the validity of 4K subpages. This patch
>>>> series free up pte_t of 11 bits by moving 4K subpage tracking to the
>>>> lower half of PTE page. The pte format is updated such that we have a
>>>> better method for identifying a pte entry at pmd level. This will also
>>>> enable
>>>> us to implement hugetlb migration(not yet done in this series).
>>>
>>> Hi, what tree does this apply to?
>>>
>>
>> linux-powerpc tree -next branch
>>
>> https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/
>
> Ok, but I'm getting the following error after applying the first patch
> from your series:
>
> error: patch failed: arch/powerpc/include/asm/pgalloc-64.h:166
> error: arch/powerpc/include/asm/pgalloc-64.h: patch does not apply
> error: arch/powerpc/mm/hash64_64k.c: No such file or directory
>
> kda@hydra ~/devel/linux $ git branch
>   master
> * next
> kda@hydra ~/devel/linux $ git remote show origin
> * remote origin
>   Fetch URL: git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
>   Push  URL: git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
>
> Or am I missing something?


Why not use the git tree instead? The patch series can also be found at:
https://github.com/kvaneesh/linux.git book3s-pte-format 
https://github.com/kvaneesh/linux/commits/book3s-pte-format
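
If applying the mbox by hand keeps failing, a minimal sketch of fetching
that branch directly (the remote name is arbitrary):

  git remote add kvaneesh https://github.com/kvaneesh/linux.git
  git fetch kvaneesh book3s-pte-format
  git checkout -b book3s-pte-format kvaneesh/book3s-pte-format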


-aneesh


* Re: [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
  2015-11-03  5:02       ` Aneesh Kumar K.V
@ 2015-11-10  8:28         ` Denis Kirjanov
  2015-11-10 16:00           ` Aneesh Kumar K.V
  0 siblings, 1 reply; 41+ messages in thread
From: Denis Kirjanov @ 2015-11-10  8:28 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: Scott Wood, paulus, linuxppc-dev, Mel Gorman

On 11/3/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
> Denis Kirjanov <kda@linux-powerpc.org> writes:
>
>> On 10/23/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>> Denis Kirjanov <kda@linux-powerpc.org> writes:
>>>
>>>> On 10/17/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>>>>> Hi All,
>>>>>
>>>>> This patch series attempt to update book3s 64 linux page table format
>>>>> to
>>>>> make it more flexible. Our current pte format is very restrictive and
>>>>> we
>>>>> overload multiple pte bits. This is due to the non-availability of free
>>>>> bits
>>>>> in pte_t. We use pte_t to track the validity of 4K subpages. This patch
>>>>> series free up pte_t of 11 bits by moving 4K subpage tracking to the
>>>>> lower half of PTE page. The pte format is updated such that we have a
>>>>> better method for identifying a pte entry at pmd level. This will also
>>>>> enable
>>>>> us to implement hugetlb migration(not yet done in this series).
>>>>
>>>> Hi, what tree does this apply to?
>>>>
>>>
>>> linux-powerpc tree -next branch
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git/
>>
>> Ok, but I'm getting the following error after applying the first patch
>> from your series:
>>
>> error: patch failed: arch/powerpc/include/asm/pgalloc-64.h:166
>> error: arch/powerpc/include/asm/pgalloc-64.h: patch does not apply
>> error: arch/powerpc/mm/hash64_64k.c: No such file or directory
>>
>> kda@hydra ~/devel/linux $ git branch
>>   master
>> * next
>> kda@hydra ~/devel/linux $ git remote show origin
>> * remote origin
>>   Fetch URL: git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
>>   Push  URL: git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
>>
>> Or am I missing something?
>
>
> Why not use
>
> The patch series can also be found at
> https://github.com/kvaneesh/linux.git book3s-pte-format
> https://github.com/kvaneesh/linux/commits/book3s-pte-format

I've started playing with your changes.
I also want to run Mel Gorman's mm tests[0], but they need some
work before they will run here.
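
For what it's worth, a rough sketch of a typical mmtests invocation (the
repository location and config name are assumptions, and as noted above
the harness may need adapting before it runs here):

  git clone https://github.com/gormanm/mmtests.git   # assumed upstream location
  cd mmtests
  # config name below is illustrative; pick whichever workload config applies
  ./run-mmtests.sh --config configs/config-workload-kernbench test-book3s-pte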

Thanks!

[0] https://lwn.net/Articles/454121/
>
>
> -aneesh
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev


* Re: [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64
  2015-11-10  8:28         ` Denis Kirjanov
@ 2015-11-10 16:00           ` Aneesh Kumar K.V
  0 siblings, 0 replies; 41+ messages in thread
From: Aneesh Kumar K.V @ 2015-11-10 16:00 UTC (permalink / raw)
  To: Denis Kirjanov; +Cc: Scott Wood, paulus, linuxppc-dev, Mel Gorman

Denis Kirjanov <kda@linux-powerpc.org> writes:

> On 11/3/15, Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> wrote:
>> Denis Kirjanov <kda@linux-powerpc.org> writes:
>>
........
..........

>> Why not use
>>
>> The patch series can also be found at
>> https://github.com/kvaneesh/linux.git book3s-pte-format
>> https://github.com/kvaneesh/linux/commits/book3s-pte-format
>
> I've started playing with your changes.
> Also I want to run Mel Gorman's mm tests[0] but it needs some
> work to adapt them to run.
>

I did post numbers for one config here:

http://article.gmane.org/gmane.linux.ports.ppc.embedded/88548

-aneesh


end of thread, other threads:[~2015-11-10 16:01 UTC | newest]

Thread overview: 41+ messages
2015-10-17 10:08 [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 01/31] powerpc/mm: move pte headers to book3s directory Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 02/31] powerpc/mm: move pte headers to book3s directory (part 2) Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 03/31] powerpc/mm: make a separate copy for book3s Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 04/31] powerpc/mm: make a separate copy for book3s (part 2) Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 05/31] powerpc/mm: Move hash specific pte width and other defines to book3s Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 06/31] powerpc/mm: Delete booke bits from book3s Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 07/31] powerpc/mm: Don't have generic headers introduce functions touching pte bits Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 08/31] powerpc/mm: Drop pte-common.h from BOOK3S 64 Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 09/31] powerpc/mm: Don't use pte_val as lvalue Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 10/31] powerpc/mm: Don't use pmd_val, pud_val and pgd_val " Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 11/31] powerpc/mm: Move hash64 PTE bits from book3s/64/pgtable.h to hash.h Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 12/31] powerpc/mm: Move PTE bits from generic functions to hash64 functions Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 13/31] powerpc/booke: Move nohash headers (part 1) Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 14/31] powerpc/booke: Move nohash headers (part 2) Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 15/31] powerpc/booke: Move nohash headers (part 3) Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 16/31] powerpc/booke: Move nohash headers (part 4) Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 17/31] powerpc/booke: Move nohash headers (part 5) Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 18/31] powerpc/mm: Increase the pte frag size Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 19/31] powerpc/mm: Convert 4k hash insert to C Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 20/31] powerpc/mm: update __real_pte to take address as argument Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 21/31] powerpc/mm: make pte page hash index slot 8 bits Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 22/31] powerpc/mm: Don't track subpage valid bit in pte_t Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 23/31] powerpc/mm: Increase the width of #define Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 24/31] powerpc/mm: Convert __hash_page_64K to C Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 25/31] powerpc/mm: Convert 4k insert from asm " Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 26/31] powerpc/mm: Remove the dependency on pte bit position in asm code Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 27/31] powerpc/mm: Add helper for converting pte bit to hpte bits Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 28/31] powerpc/mm: Move WIMG update to helper Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 29/31] powerpc/mm: Move hugetlb related headers Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 30/31] powerpc/mm: Move THP headers around Aneesh Kumar K.V
2015-10-17 10:08 ` [PATCH V4 31/31] powerpc/mm: Add a _PAGE_PTE bit Aneesh Kumar K.V
2015-10-17 13:22 ` [PATCH V4 00/31] powerpc/mm: Update page table format for book3s 64 Benjamin Herrenschmidt
2015-10-19  3:17   ` Aneesh Kumar K.V
2015-10-19  8:31   ` Aneesh Kumar K.V
2015-10-22 18:40 ` Denis Kirjanov
2015-10-23  6:06   ` Aneesh Kumar K.V
2015-10-23 19:08     ` Denis Kirjanov
2015-11-03  5:02       ` Aneesh Kumar K.V
2015-11-10  8:28         ` Denis Kirjanov
2015-11-10 16:00           ` Aneesh Kumar K.V
