* [PATCH 0/6] powerpc: Free up RPAGE_RSV bits in 64K PTE
@ 2017-07-21 18:52 Ram Pai
  2017-07-21 18:52 ` [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Ram Pai @ 2017-07-21 18:52 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: benh, paulus, mpe, khandual, aneesh.kumar, bsingharora, hbabu,
	linuxram, mhocko


RPAGE_RSV0..4 pte bits are currently used for hpte slot
tracking. We need these bits for memory-protection
keys. Luckily these four bits are relatively easy
to move compared with the other candidate bits.

For 64K linux-ptes backed by 4k hptes, these bits
are used for tracking the validity of the slot value
stored in the second-part-of-the-pte. We devise a new
mechanism for tracking the validity without using
those bits. The mechanism is explained in the first
patch.

For 64K  linux-pte  backed by 64K hptes, we simply move
the   slot  tracking bits to the second-part-of-the-pte.

The above  mechanism  is also used to free the bits for
hugetlb linux-ptes.


Testing:
--------
Has survived kernel compilation on multiple platforms:
p8 powernv hash-mode, p9 powernv hash-mode, p7 powervm,
p8-powervm, p8-kvm-guest.

Has survived git-bisect on p8  power-nv  with  64K page
and 4K page.

History:
-------
This patchset  is  a  spin-off from the memkey patchset.

version v7:
	(1) GIX bit reset change  moved  to  the second
		patch  -- noticed by Aneesh.
	(2) Separated these patches from memkey patchset
	(3) merged a  bunch  of  patches, that used the
       		helper function, into one.
version v6:
	(1) No changes related to pte.

version v5:
	(1) No changes related to pte.

version v4:
	(1) No changes related to pte.

version v3:
	(1) split the patches into smaller consumable
		patches.
	(2) A bug fix while  invalidating a hpte slot
		in __hash_page_4K()
       		-- noticed by Aneesh
	

version v2:
 	(1) fixed a  bug  in 4k  hpte  backed 64k pte
       		where  page    invalidation   was not
		done  correctly,  and  initialization
	       	of    second-part-of-the-pte  was not
		done    correctly  if the pte was not
	       	yet Hashed with a hpte.
	       	   --	Reported by Aneesh.
	

version v1: Initial version



Ram Pai (6):
  powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  powerpc: capture the PTE format changes in the dump pte report
  powerpc: introduce pte_set_hash_slot() helper
  powerpc: introduce pte_get_hash_gslot() helper
  powerpc: use helper functions to get and set hash slots

 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   20 ++++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   60 ++++++++----
 arch/powerpc/include/asm/book3s/64/hash.h     |    7 +-
 arch/powerpc/mm/dump_linuxpagetables.c        |    3 +-
 arch/powerpc/mm/hash64_4k.c                   |   14 +--
 arch/powerpc/mm/hash64_64k.c                  |  124 +++++++++++++------------
 arch/powerpc/mm/hash_utils_64.c               |   35 +++++--
 arch/powerpc/mm/hugetlbpage-hash64.c          |   16 +--
 8 files changed, 165 insertions(+), 114 deletions(-)


* [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-07-21 18:52 [PATCH 0/6] powerpc: Free up RPAGE_RSV bits in 64K PTE Ram Pai
@ 2017-07-21 18:52 ` Ram Pai
  2017-07-26 10:35   ` Aneesh Kumar K.V
  2017-07-21 18:52 ` [PATCH 2/6] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Ram Pai @ 2017-07-21 18:52 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: benh, paulus, mpe, khandual, aneesh.kumar, bsingharora, hbabu,
	linuxram, mhocko

Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6,
in the 4K backed HPTE pages. These bits continue to be used
for 64K backed HPTE pages in this patch, but will be freed
up in the next patch. The bit numbers are big-endian as
defined in the ISA3.0.
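
As an illustrative aside (not part of the patch): in this big-endian
numbering, bit 0 is the most significant bit of the 64-bit PTE, so bit n
corresponds to the mask 1 << (63 - n). The helper names below are
hypothetical and exist only to show the conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask for big-endian (ISA-style) bit n of a 64-bit PTE. */
static inline uint64_t be_bit_mask(unsigned int n)
{
	assert(n < 64);
	return UINT64_C(1) << (63 - n);
}

/* Bits 4..7 in this numbering form the nibble selected by a shift of 56. */
static inline uint64_t be_bits_4_to_7(void)
{
	return be_bit_mask(4) | be_bit_mask(5) | be_bit_mask(6) | be_bit_mask(7);
}
```

Under this convention, bit 4 is the same bit as 1 << (56 + 3), which is the
relationship the BUILD_BUG_ON() in hash__early_init_mmu() checks for
H_PAGE_F_SECOND.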

The patch does the following change to the 4k hpte backed
64K PTE's format.

H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the figure
		below)
V0 which occupied bit 4 is not used anymore.
V1 which occupied bit 5 is not used anymore.
V2 which occupied bit 6 is not used anymore.
V3 which occupied bit 7 is not used anymore.

Before the patch, the 4k backed 64k PTE format was as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x|B|V0|V1|V2|V3|x| | |x|x|................|x|x|x|x| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

After the patch, the 4k backed 64k PTE format is as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |  |  |  |  |x|B| |x|x|................|.|.|.|.| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

The four bits S,G,I,X (one quadruplet per 4k HPTE) that
cache the hash-bucket slot value are initialized to
1,1,1,1, indicating an invalid slot. If a HPTE gets
cached in a 1111 slot (i.e. slot index 7 of the secondary
hash bucket), it is released immediately. In other words,
even though 1111 is a valid slot value in the hash
bucket, we consider it invalid and release the slot and
the HPTE. This gives us the opportunity to determine
the validity of the S,G,I,X bits based on their contents and
not on any of the bits V0,V1,V2 or V3 in the primary PTE.
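
A minimal userspace sketch of the idea described above, assuming the
second part of the PTE is modelled as a plain 64-bit word holding sixteen
4-bit slot values. The names hidx_init(), hidx_get(), hidx_set() and
hidx_sub_valid() are hypothetical; the kernel code in this patch uses
hpte_soft_invalid() and __rpte_sub_valid() instead. Unlike the patch, this
sketch masks the slot to four bits before storing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOT_INVALID UINT64_C(0xf) /* 1111: reserved by software as "no slot cached" */

/* All sixteen quadruplets start out as 1111, i.e. invalid. */
static inline uint64_t hidx_init(void)
{
	return ~UINT64_C(0);
}

/* Extract the 4-bit slot value cached for one 4k subpage (0..15). */
static inline uint64_t hidx_get(uint64_t hidx, unsigned int subpg)
{
	assert(subpg < 16);
	return (hidx >> (subpg << 2)) & 0xf;
}

/* Validity is decided purely by the cached value, not by separate V bits. */
static inline bool hidx_sub_valid(uint64_t hidx, unsigned int subpg)
{
	return hidx_get(hidx, subpg) != SLOT_INVALID;
}

/* Replace the quadruplet of one subpage with a new slot value. */
static inline uint64_t hidx_set(uint64_t hidx, unsigned int subpg, uint64_t slot)
{
	assert(subpg < 16);
	hidx &= ~(UINT64_C(0xf) << (subpg << 2));
	return hidx | ((slot & 0xf) << (subpg << 2));
}
```

Storing 1111 through hidx_set() leaves the subpage looking invalid, which
is why the patch has to release an HPTE that lands in that slot.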

When   we  release  a    HPTE    cached in the 1111 slot
we also    release  a  legitimate   slot  in the primary
hash bucket  and  unmap  its  corresponding  HPTE.  This
is  to  ensure   that  we do get a HPTE cached in a slot
of the primary hash bucket, the next time we retry.

Though treating the 1111 slot as invalid reduces the
number of available slots in the hash bucket and may
have an effect on the performance, the probability of
hitting a 1111 slot is extremely low.

Compared  to  the  current   scheme, the above described
scheme  reduces  the  number of false hash table updates
significantly   and    has  the   added   advantage   of
releasing  four  valuable  PTE bits for other purpose.

NOTE: even though bits 3, 4, 5, 6, 7 are not used when
the 64K PTE is backed by 4k HPTE, they continue to be
used if the PTE gets backed by 64k HPTE. The next
patch will decouple that as well, and truly release the
bits.

This idea was jointly developed by Paul Mackerras,
Aneesh, Michael Ellerman and myself.

4K PTE format remains unchanged currently.

The patch does the following code changes
a) PTE flags are split between 64k and 4k  header files.
b) __hash_page_4K()  is  reimplemented   to reflect the
   above logic.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |    2 +
 arch/powerpc/include/asm/book3s/64/hash-64k.h |    8 +--
 arch/powerpc/include/asm/book3s/64/hash.h     |    1 -
 arch/powerpc/mm/hash64_64k.c                  |   74 ++++++++++++++++---------
 arch/powerpc/mm/hash_utils_64.c               |    4 +-
 5 files changed, 55 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 0c4e470..f959c00 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -16,6 +16,8 @@
 #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
+#define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
+
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
 			 H_PAGE_F_SECOND | H_PAGE_F_GIX)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 9732837..62e580c 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -12,18 +12,14 @@
  */
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
+#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
+
 /*
  * We need to differentiate between explicit huge page and THP huge
  * page, since THP huge page also need to track real subpage details
  */
 #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
 
-/*
- * Used to track subpage group valid if H_PAGE_COMBO is set
- * This overloads H_PAGE_F_GIX and H_PAGE_F_SECOND
- */
-#define H_PAGE_COMBO_VALID	(H_PAGE_F_GIX | H_PAGE_F_SECOND)
-
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \
 			 H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 4e957b0..2d72964 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -9,7 +9,6 @@
  */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
 #define H_PAGE_F_GIX_SHIFT	56
-#define H_PAGE_BUSY		_RPAGE_RSV1 /* software: PTE & hash are busy */
 #define H_PAGE_F_SECOND		_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
 #define H_PAGE_F_GIX		(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_HASHPTE		_RPAGE_RPN43	/* PTE has associated HPTE */
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 1a68cb1..7b92204 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -15,34 +15,22 @@
 #include <linux/mm.h>
 #include <asm/machdep.h>
 #include <asm/mmu.h>
+
 /*
- * index from 0 - 15
+ * return true, if the entry has a slot value which
+ * the software considers as invalid.
  */
-bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
+static inline bool hpte_soft_invalid(unsigned long slot)
 {
-	unsigned long g_idx;
-	unsigned long ptev = pte_val(rpte.pte);
-
-	g_idx = (ptev & H_PAGE_COMBO_VALID) >> H_PAGE_F_GIX_SHIFT;
-	index = index >> 2;
-	if (g_idx & (0x1 << index))
-		return true;
-	else
-		return false;
+	return ((slot & 0xfUL) == 0xfUL);
 }
+
 /*
  * index from 0 - 15
  */
-static unsigned long mark_subptegroup_valid(unsigned long ptev, unsigned long index)
+bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
 {
-	unsigned long g_idx;
-
-	if (!(ptev & H_PAGE_COMBO))
-		return ptev;
-	index = index >> 2;
-	g_idx = 0x1 << index;
-
-	return ptev | (g_idx << H_PAGE_F_GIX_SHIFT);
+	return !(hpte_soft_invalid(rpte.hidx >> (index << 2)));
 }
 
 int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
@@ -50,12 +38,12 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		   int ssize, int subpg_prot)
 {
 	real_pte_t rpte;
-	unsigned long *hidxp;
 	unsigned long hpte_group;
+	unsigned long *hidxp;
 	unsigned int subpg_index;
 	unsigned long rflags, pa, hidx;
 	unsigned long old_pte, new_pte, subpg_pte;
-	unsigned long vpn, hash, slot;
+	unsigned long vpn, hash, slot, gslot;
 	unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift;
 
 	/*
@@ -148,6 +136,15 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	}
 
 htab_insert_hpte:
+
+	/*
+	 * initialize all hidx entries to invalid value,
+	 * the first time the PTE is about to allocate
+	 * a 4K hpte
+	 */
+	if (!(old_pte & H_PAGE_COMBO))
+		rpte.hidx = ~0x0UL;
+
 	/*
 	 * handle H_PAGE_4K_PFN case
 	 */
@@ -172,15 +169,41 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * Primary is full, try the secondary
 	 */
 	if (unlikely(slot == -1)) {
+		bool soft_invalid;
+
 		hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
 		slot = mmu_hash_ops.hpte_insert(hpte_group, vpn, pa,
 						rflags, HPTE_V_SECONDARY,
 						MMU_PAGE_4K, MMU_PAGE_4K,
 						ssize);
-		if (slot == -1) {
-			if (mftb() & 0x1)
+
+		soft_invalid = hpte_soft_invalid(slot);
+		if (unlikely(soft_invalid)) {
+			/*
+			 * we got a valid slot from a hardware point of view.
+			 * but we cannot use it, because we use this special
+			 * value; as     defined   by    hpte_soft_invalid(),
+			 * to  track    invalid  slots.  We  cannot  use  it.
+			 * So invalidate it.
+			 */
+			gslot = slot & _PTEIDX_GROUP_IX;
+			mmu_hash_ops.hpte_invalidate(hpte_group+gslot, vpn,
+				MMU_PAGE_4K, MMU_PAGE_4K,
+				ssize, 0);
+		}
+
+		if (unlikely(slot == -1 || soft_invalid)) {
+			/*
+			 * for soft invalid slot, lets   ensure that we
+			 * release a slot from  the primary,   with the
+			 * hope that we  will  acquire that slot   next
+			 * time we try. This will ensure that we do not
+			 * get the same soft-invalid slot.
+			 */
+			if (soft_invalid || (mftb() & 0x1))
 				hpte_group = ((hash & htab_hash_mask) *
 					      HPTES_PER_GROUP) & ~0x7UL;
+
 			mmu_hash_ops.hpte_remove(hpte_group);
 			/*
 			 * FIXME!! Should be try the group from which we removed ?
@@ -207,12 +230,11 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
 	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
 	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
-	new_pte = mark_subptegroup_valid(new_pte, subpg_index);
-	new_pte |=  H_PAGE_HASHPTE;
 	/*
 	 * check __real_pte for details on matching smp_rmb()
 	 */
 	smp_wmb();
+	new_pte |=  H_PAGE_HASHPTE;
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
 }
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index f2095ce..1b494d0 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -975,8 +975,9 @@ void __init hash__early_init_devtree(void)
 
 void __init hash__early_init_mmu(void)
 {
+#ifndef CONFIG_PPC_64K_PAGES
 	/*
-	 * We have code in __hash_page_64K() and elsewhere, which assumes it can
+	 * We have code in __hash_page_4K() and elsewhere, which assumes it can
 	 * do the following:
 	 *   new_pte |= (slot << H_PAGE_F_GIX_SHIFT) & (H_PAGE_F_SECOND | H_PAGE_F_GIX);
 	 *
@@ -987,6 +988,7 @@ void __init hash__early_init_mmu(void)
 	 * with a BUILD_BUG_ON().
 	 */
 	BUILD_BUG_ON(H_PAGE_F_SECOND != (1ul  << (H_PAGE_F_GIX_SHIFT + 3)));
+#endif /* CONFIG_PPC_64K_PAGES */
 
 	htab_init_page_sizes();
 
-- 
1.7.1


* [PATCH 2/6] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-07-21 18:52 [PATCH 0/6] powerpc: Free up RPAGE_RSV bits in 64K PTE Ram Pai
  2017-07-21 18:52 ` [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
@ 2017-07-21 18:52 ` Ram Pai
  2017-07-21 18:52 ` [PATCH 3/6] powerpc: capture the PTE format changes in the dump pte report Ram Pai
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Ram Pai @ 2017-07-21 18:52 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: benh, paulus, mpe, khandual, aneesh.kumar, bsingharora, hbabu,
	linuxram, mhocko

Rearrange 64K PTE bits to  free  up  bits 3, 4, 5  and  6
in the 64K backed HPTE pages. This along with the earlier
patch will  entirely free  up the four bits from 64K PTE.
The bit numbers are  big-endian as defined in the  ISA3.0

This patch  does  the  following change to 64K PTE backed
by 64K HPTE.

H_PAGE_F_SECOND (S) which occupied bit 4 moves to the
	second part of the pte to bit 60.
H_PAGE_F_GIX (G,I,X) which occupied bit 5, 6 and 7 also
	moves to the second part of the pte to bit 61,
	62 and 63 respectively.

since bit 7 is now freed up, we move H_PAGE_BUSY (B) from
bit  9  to  bit  7.

The second part of the PTE will hold
(H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.
NOTE: None of the bits in the secondary PTE were used
by 64k-HPTE backed PTE.
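
A small sketch of the resulting layout, assuming the second part of the
PTE is modelled as a 64-bit word whose low nibble (big-endian bits 60..63)
holds S,G,I,X. The macro values mirror _PTEIDX_SECONDARY and
_PTEIDX_GROUP_IX as used in the diff below; the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PTEIDX_SECONDARY UINT64_C(0x8) /* S: big-endian bit 60 of the second part */
#define PTEIDX_GROUP_IX  UINT64_C(0x7) /* G,I,X: big-endian bits 61..63 */

/* Store a 4-bit slot value in the low nibble of the second part. */
static inline uint64_t second_part_set_slot(uint64_t hidx, uint64_t slot)
{
	assert(slot <= 0xf);
	hidx &= ~UINT64_C(0xf);
	return hidx | slot;
}

/* Non-zero if the cached HPTE sits in the secondary hash bucket. */
static inline int second_part_is_secondary(uint64_t hidx)
{
	return (hidx & PTEIDX_SECONDARY) != 0;
}

/* Index (0..7) of the HPTE within its hash bucket. */
static inline unsigned int second_part_group_index(uint64_t hidx)
{
	return (unsigned int)(hidx & PTEIDX_GROUP_IX);
}
```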

Before the patch, the 64K HPTE backed 64k PTE format was
as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |S |G |I |X |x|B| |x|x|................|x|x|x|x| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
| | | | |  |  |  |  | | | | |..................| | | | | <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

After the patch, the 64k HPTE backed 64k PTE format is
as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |  |  |  |B |x| | |x|x|................|.|.|.|.| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
| | | | |  |  |  |  | | | | |..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

The above PTE changes are applicable to hugetlb pages as well.

The patch does the following code changes:

a) moves the H_PAGE_F_SECOND and H_PAGE_F_GIX to 4k PTE
	header since they are no longer needed by the 64k PTEs.
b) abstracts  out __real_pte() and __rpte_to_hidx() so the
	caller  need not know the bit location of the slot.
c) moves the slot bits to the secondary pte.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |    3 ++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   29 +++++++++-----------
 arch/powerpc/include/asm/book3s/64/hash.h     |    3 --
 arch/powerpc/mm/hash64_64k.c                  |   34 +++++++++++++++++-------
 arch/powerpc/mm/hugetlbpage-hash64.c          |   26 +++++++++++++++---
 5 files changed, 61 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index f959c00..d2cf949 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -16,6 +16,9 @@
 #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
+#define H_PAGE_F_GIX_SHIFT	56
+#define H_PAGE_F_SECOND	_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX	(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
 
 /* PTE flags to conserve for HPTE identification */
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 62e580c..c281f18 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -12,7 +12,7 @@
  */
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
-#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
+#define H_PAGE_BUSY	_RPAGE_RPN44     /* software: PTE & hash are busy */
 
 /*
  * We need to differentiate between explicit huge page and THP huge
@@ -21,8 +21,7 @@
 #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
 
 /* PTE flags to conserve for HPTE identification */
-#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \
-			 H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)
+#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | H_PAGE_COMBO)
 /*
  * we support 16 fragments per PTE page of 64K size.
  */
@@ -50,24 +49,22 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
 	unsigned long *hidxp;
 
 	rpte.pte = pte;
-	rpte.hidx = 0;
-	if (pte_val(pte) & H_PAGE_COMBO) {
-		/*
-		 * Make sure we order the hidx load against the H_PAGE_COMBO
-		 * check. The store side ordering is done in __hash_page_4K
-		 */
-		smp_rmb();
-		hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-		rpte.hidx = *hidxp;
-	}
+	/*
+	 * Ensure that we do not read the hidx before we read
+	 * the pte. Because the writer side is  expected
+	 * to finish writing the hidx first followed by the pte,
+	 * by using smp_wmb().
+	 * pte_set_hash_slot() ensures that.
+	 */
+	smp_rmb();
+	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+	rpte.hidx = *hidxp;
 	return rpte;
 }
 
 static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 {
-	if ((pte_val(rpte.pte) & H_PAGE_COMBO))
-		return (rpte.hidx >> (index<<2)) & 0xf;
-	return (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;
+	return ((rpte.hidx >> (index<<2)) & 0xfUL);
 }
 
 #define __rpte_to_pte(r)	((r).pte)
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 2d72964..d27f885 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -8,9 +8,6 @@
  *
  */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
-#define H_PAGE_F_GIX_SHIFT	56
-#define H_PAGE_F_SECOND		_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
-#define H_PAGE_F_GIX		(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_HASHPTE		_RPAGE_RPN43	/* PTE has associated HPTE */
 
 #ifdef CONFIG_PPC_64K_PAGES
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 7b92204..e922a70 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -104,8 +104,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		 * On hash insert failure we use old pte value and we don't
 		 * want slot information there if we have a insert failure.
 		 */
-		old_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
-		new_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
+		old_pte &= ~H_PAGE_HASHPTE;
+		new_pte &= ~H_PAGE_HASHPTE;
 		goto htab_insert_hpte;
 	}
 	/*
@@ -243,6 +243,8 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 		    unsigned long vsid, pte_t *ptep, unsigned long trap,
 		    unsigned long flags, int ssize)
 {
+	real_pte_t rpte;
+	unsigned long *hidxp;
 	unsigned long hpte_group;
 	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte;
@@ -279,6 +281,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 	} while (!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
 
 	rflags = htab_convert_pte_flags(new_pte);
+	rpte = __real_pte(__pte(old_pte), ptep);
 
 	if (cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -286,15 +289,17 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 
 	vpn  = hpt_vpn(ea, vsid, ssize);
 	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
-		/*
-		 * There MIGHT be an HPTE for this pte
-		 */
+		unsigned long hash, slot, hidx;
+
 		hash = hpt_hash(vpn, shift, ssize);
-		if (old_pte & H_PAGE_F_SECOND)
+		hidx = __rpte_to_hidx(rpte, 0);
+		if (hidx & _PTEIDX_SECONDARY)
 			hash = ~hash;
 		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
-
+		slot += hidx & _PTEIDX_GROUP_IX;
+		/*
+		 * There MIGHT be an HPTE for this pte
+		 */
 		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,
 					       MMU_PAGE_64K, ssize,
 					       flags) == -1)
@@ -344,9 +349,18 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 					   MMU_PAGE_64K, MMU_PAGE_64K, old_pte);
 			return -1;
 		}
+
+		/*
+		 * Insert slot number & secondary bit in PTE second half.
+		 */
+		hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+		rpte.hidx &= ~(0xfUL);
+		*hidxp = rpte.hidx  | (slot & 0xfUL);
+		/*
+		 * check __real_pte for details on matching smp_rmb()
+		 */
+		smp_wmb();
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
-		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
 	}
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index a84bb44..5964b6d 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -22,6 +22,10 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		     pte_t *ptep, unsigned long trap, unsigned long flags,
 		     int ssize, unsigned int shift, unsigned int mmu_psize)
 {
+	real_pte_t rpte;
+#ifdef CONFIG_PPC_64K_PAGES
+	unsigned long *hidxp;
+#endif /* CONFIG_PPC_64K_PAGES */
 	unsigned long vpn;
 	unsigned long old_pte, new_pte;
 	unsigned long rflags, pa, sz;
@@ -61,6 +65,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 	} while(!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
 
 	rflags = htab_convert_pte_flags(new_pte);
+	rpte = __real_pte(__pte(old_pte), ptep);
 
 	sz = ((1UL) << shift);
 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -71,13 +76,14 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 	/* Check if pte already has an hpte (case 2) */
 	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
 		/* There MIGHT be an HPTE for this pte */
-		unsigned long hash, slot;
+		unsigned long hash, slot, hidx;
 
 		hash = hpt_hash(vpn, shift, ssize);
-		if (old_pte & H_PAGE_F_SECOND)
+		hidx = __rpte_to_hidx(rpte, 0);
+		if (hidx & _PTEIDX_SECONDARY)
 			hash = ~hash;
 		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
+		slot += hidx & _PTEIDX_GROUP_IX;
 
 		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, mmu_psize,
 					       mmu_psize, ssize, flags) == -1)
@@ -106,8 +112,18 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 			return -1;
 		}
 
-		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+#ifdef CONFIG_PPC_64K_PAGES
+		/*
+		 * Insert slot number & secondary bit in PTE second half.
+		 */
+		hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+		rpte.hidx &= ~(0xfUL);
+		*hidxp = rpte.hidx  | (slot & 0xfUL);
+		/*
+		 * check __real_pte for details on matching smp_rmb()
+		 */
+		smp_wmb();
+#endif /* CONFIG_PPC_64K_PAGES */
 	}
 
 	/*
-- 
1.7.1


* [PATCH 3/6] powerpc: capture the PTE format changes in the dump pte report
  2017-07-21 18:52 [PATCH 0/6] powerpc: Free up RPAGE_RSV bits in 64K PTE Ram Pai
  2017-07-21 18:52 ` [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
  2017-07-21 18:52 ` [PATCH 2/6] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
@ 2017-07-21 18:52 ` Ram Pai
  2017-07-21 18:52 ` [PATCH 4/6] powerpc: introduce pte_set_hash_slot() helper Ram Pai
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Ram Pai @ 2017-07-21 18:52 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: benh, paulus, mpe, khandual, aneesh.kumar, bsingharora, hbabu,
	linuxram, mhocko

H_PAGE_F_SECOND and H_PAGE_F_GIX are no longer in the 64K main-PTE.
Capture these changes in the dump pte report.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/mm/dump_linuxpagetables.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/mm/dump_linuxpagetables.c b/arch/powerpc/mm/dump_linuxpagetables.c
index 44fe483..5627edd 100644
--- a/arch/powerpc/mm/dump_linuxpagetables.c
+++ b/arch/powerpc/mm/dump_linuxpagetables.c
@@ -213,7 +213,7 @@ struct flag_info {
 		.val	= H_PAGE_4K_PFN,
 		.set	= "4K_pfn",
 	}, {
-#endif
+#else /* CONFIG_PPC_64K_PAGES */
 		.mask	= H_PAGE_F_GIX,
 		.val	= H_PAGE_F_GIX,
 		.set	= "f_gix",
@@ -224,6 +224,7 @@ struct flag_info {
 		.val	= H_PAGE_F_SECOND,
 		.set	= "f_second",
 	}, {
+#endif /* CONFIG_PPC_64K_PAGES */
 #endif
 		.mask	= _PAGE_SPECIAL,
 		.val	= _PAGE_SPECIAL,
-- 
1.7.1


* [PATCH 4/6] powerpc: introduce pte_set_hash_slot() helper
  2017-07-21 18:52 [PATCH 0/6] powerpc: Free up RPAGE_RSV bits in 64K PTE Ram Pai
                   ` (2 preceding siblings ...)
  2017-07-21 18:52 ` [PATCH 3/6] powerpc: capture the PTE format changes in the dump pte report Ram Pai
@ 2017-07-21 18:52 ` Ram Pai
  2017-07-21 18:52 ` [PATCH 5/6] powerpc: introduce pte_get_hash_gslot() helper Ram Pai
  2017-07-21 18:52 ` [PATCH 6/6] powerpc: use helper functions to get and set hash slots Ram Pai
  5 siblings, 0 replies; 11+ messages in thread
From: Ram Pai @ 2017-07-21 18:52 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: benh, paulus, mpe, khandual, aneesh.kumar, bsingharora, hbabu,
	linuxram, mhocko

Introduce pte_set_hash_slot(). It sets the (H_PAGE_F_SECOND|H_PAGE_F_GIX)
bits at the appropriate location in the PTE of a 4K PTE. For a
64K PTE, it sets the bits in the second part of the PTE. Though
the implementation for the former just needs the slot parameter, it does
take some additional parameters to keep the prototype consistent.

This function  will  be  handy  as  we   work   towards  re-arranging the
bits in the later patches.
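
To illustrate why a consistent prototype helps the caller, here is a
simplified userspace model. It is only a sketch: the functions below take
a pointer to a 64-bit word instead of the (ptep, rpte) pair used by the
real helper, their names are hypothetical, and the memory barrier that the
64K variant issues in the kernel is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define F_GIX_SHIFT 56
#define F_SLOT_MASK (UINT64_C(0xf) << F_GIX_SHIFT) /* S|G|I|X in the 4K PTE */

/* 4K PTE model: the slot lives in the PTE itself; hidxp and subpg are unused. */
static inline uint64_t set_hash_slot_4k(uint64_t *hidxp, unsigned int subpg,
					uint64_t slot)
{
	(void)hidxp;
	(void)subpg;
	return (slot << F_GIX_SHIFT) & F_SLOT_MASK;
}

/* 64K PTE model: the slot goes into the second part; no PTE bits change. */
static inline uint64_t set_hash_slot_64k(uint64_t *hidxp, unsigned int subpg,
					 uint64_t slot)
{
	assert(subpg < 16);
	*hidxp &= ~(UINT64_C(0xf) << (subpg << 2));
	*hidxp |= (slot & 0xf) << (subpg << 2);
	return 0;
}
```

In both models the caller can OR the return value into the new PTE without
knowing where the slot is actually stored.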

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   15 +++++++++++++++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   25 +++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index d2cf949..dc153c6 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -53,6 +53,21 @@ static inline int hash__hugepd_ok(hugepd_t hpd)
 }
 #endif
 
+/*
+ * 4k pte format is  different  from  64k  pte  format.  Saving  the
+ * hash_slot is just a matter of returning the pte bits that need to
+ * be modified. On 64k pte, things are a  little  more  involved and
+ * hence  needs   many   more  parameters  to  accomplish  the  same.
+ * However we  want  to abstract this out from the caller by keeping
+ * the prototype consistent across the two formats.
+ */
+static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
+			unsigned int subpg_index, unsigned long slot)
+{
+	return (slot << H_PAGE_F_GIX_SHIFT) &
+		(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
 static inline char *get_hpte_slot_array(pmd_t *pmdp)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index c281f18..89ef5a9 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -67,6 +67,31 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 	return ((rpte.hidx >> (index<<2)) & 0xfUL);
 }
 
+/*
+ * Commit the hash slot and return pte bits that needs to be modified.
+ * The caller is expected to modify the pte bits accordingly and
+ * commit the pte to memory.
+ */
+static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
+		unsigned int subpg_index, unsigned long slot)
+{
+	unsigned long *hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+
+	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
+	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
+	/*
+	 * Commit the hidx bits to memory before returning.
+	 * Anyone reading  pte  must  ensure hidx bits are
+	 * read  only  after  reading the pte by using the
+	 * read-side  barrier  smp_rmb(). __real_pte() can
+	 * help ensure that.
+	 */
+	smp_wmb();
+
+	/* no pte bits to be modified, return 0x0UL */
+	return 0x0UL;
+}
+
 #define __rpte_to_pte(r)	((r).pte)
 extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
 /*
-- 
1.7.1


* [PATCH 5/6] powerpc: introduce pte_get_hash_gslot() helper
  2017-07-21 18:52 [PATCH 0/6] powerpc: Free up RPAGE_RSV bits in 64K PTE Ram Pai
                   ` (3 preceding siblings ...)
  2017-07-21 18:52 ` [PATCH 4/6] powerpc: introduce pte_set_hash_slot() helper Ram Pai
@ 2017-07-21 18:52 ` Ram Pai
  2017-07-21 18:52 ` [PATCH 6/6] powerpc: use helper functions to get and set hash slots Ram Pai
  5 siblings, 0 replies; 11+ messages in thread
From: Ram Pai @ 2017-07-21 18:52 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: benh, paulus, mpe, khandual, aneesh.kumar, bsingharora, hbabu,
	linuxram, mhocko

Introduce pte_get_hash_gslot(), which returns the slot number of the
HPTE in the global hash table.

This function will come in handy as we work towards re-arranging the
PTE bits in the later patches.
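
As a rough illustration of the arithmetic the helper performs, here is a minimal user-space sketch. It is not the kernel code: sketch_hidx() and sketch_gslot() are hypothetical names, the hash is passed in directly instead of being computed by hpt_hash(), hash_mask stands in for htab_hash_mask, and the constant values are assumed to match _PTEIDX_SECONDARY, _PTEIDX_GROUP_IX and HPTES_PER_GROUP. It also assumes a 64-bit unsigned long.

```c
#include <assert.h>

/* Assumed to mirror the kernel constants of the same meaning. */
#define SKETCH_HPTES_PER_GROUP   8
#define SKETCH_PTEIDX_SECONDARY  0x8UL	/* models _PTEIDX_SECONDARY */
#define SKETCH_PTEIDX_GROUP_IX   0x7UL	/* models _PTEIDX_GROUP_IX */

/* Extract the 4-bit cached slot value for one subpage (hypothetical helper). */
static unsigned long sketch_hidx(unsigned long hidx_word,
				 unsigned int subpg_index)
{
	assert(subpg_index < 16);	/* 16 nibbles in a 64-bit word */
	return (hidx_word >> (subpg_index << 2)) & 0xfUL;
}

/*
 * Compute the global slot: pick the primary or secondary bucket from
 * the hash, then add the index within the group of 8 HPTEs.
 */
static unsigned long sketch_gslot(unsigned long hash, unsigned long hash_mask,
				  unsigned long hidx_word,
				  unsigned int subpg_index)
{
	unsigned long hidx = sketch_hidx(hidx_word, subpg_index);

	if (hidx & SKETCH_PTEIDX_SECONDARY)
		hash = ~hash;		/* secondary bucket uses the inverted hash */

	return (hash & hash_mask) * SKETCH_HPTES_PER_GROUP +
	       (hidx & SKETCH_PTEIDX_GROUP_IX);
}
```

With hash 0x5, a mask of 0xff and a cached value of 0x3 for subpage 0, the sketch yields primary group 5, index 3. Setting the secondary bit in the cached value selects the group derived from the inverted hash instead.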

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h |    3 +++
 arch/powerpc/mm/hash_utils_64.c           |   18 ++++++++++++++++++
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index d27f885..277158c 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -156,6 +156,9 @@ static inline int hash__pte_none(pte_t pte)
 	return (pte_val(pte) & ~H_PTE_NONE_MASK) == 0;
 }
 
+unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
+		int ssize, real_pte_t rpte, unsigned int subpg_index);
+
 /* This low level function performs the actual PTE insertion
  * Setting the PTE depends on the MMU type and other factors. It's
  * an horrible mess that I'm not going to try to clean up now but
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 1b494d0..d3604da 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1591,6 +1591,24 @@ static inline void tm_flush_hash_page(int local)
 }
 #endif
 
+/*
+ * Return the global hash slot of the HPTE cached for the given
+ * pte and subpage index.
+ */
+unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
+		int ssize, real_pte_t rpte, unsigned int subpg_index)
+{
+	unsigned long hash, slot, hidx;
+
+	hash = hpt_hash(vpn, shift, ssize);
+	hidx = __rpte_to_hidx(rpte, subpg_index);
+	if (hidx & _PTEIDX_SECONDARY)
+		hash = ~hash;
+	slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+	slot += hidx & _PTEIDX_GROUP_IX;
+	return slot;
+}
+
 /* WARNING: This is called from hash_low_64.S, if you change this prototype,
  *          do not forget to update the assembly call site !
  */
-- 
1.7.1


* [PATCH 6/6] powerpc: use helper functions to get and set hash slots
  2017-07-21 18:52 [PATCH 0/6] powerpc: Free up RPAGE_RSV bits in 64K PTE Ram Pai
                   ` (4 preceding siblings ...)
  2017-07-21 18:52 ` [PATCH 5/6] powerpc: introduce pte_get_hash_gslot() helper Ram Pai
@ 2017-07-21 18:52 ` Ram Pai
  5 siblings, 0 replies; 11+ messages in thread
From: Ram Pai @ 2017-07-21 18:52 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: benh, paulus, mpe, khandual, aneesh.kumar, bsingharora, hbabu,
	linuxram, mhocko

Replace redundant code in __hash_page_64K(), __hash_page_huge(),
both variants of __hash_page_4K() (hash64_4k.c and hash64_64k.c)
and flush_hash_page() with the helper functions
pte_get_hash_gslot() and pte_set_hash_slot().
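
For readers following the conversion, the open-coded sequence that the 64K pte_set_hash_slot() replaces boils down to rewriting one 4-bit field in the hidx word. The sketch below models only that bit manipulation. sketch_set_slot() is a hypothetical name, it operates on a plain value rather than on the word stored PTRS_PER_PTE entries after the pte, and it omits the smp_wmb() the real helper issues. A 64-bit unsigned long is assumed.

```c
#include <assert.h>

/*
 * Replace the 4-bit slot value cached for subpg_index and return the
 * updated hidx word (user-space model, no memory barrier).
 */
static unsigned long sketch_set_slot(unsigned long hidx_word,
				     unsigned int subpg_index,
				     unsigned long slot)
{
	unsigned int shift = subpg_index << 2;	/* 4 bits per subpage */

	assert(subpg_index < 16 && slot <= 0xfUL);

	hidx_word &= ~(0xfUL << shift);		/* clear the old nibble */
	return hidx_word | (slot << shift);	/* install the new slot */
}
```

Callers that pass subpg_index 0, as __hash_page_64K() and __hash_page_huge() do in this patch, only ever touch the lowest nibble, which matches the `rpte.hidx &= ~(0xfUL)` code being removed.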

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/mm/hash64_4k.c          |   14 +++-----
 arch/powerpc/mm/hash64_64k.c         |   58 +++++++--------------------------
 arch/powerpc/mm/hash_utils_64.c      |   13 ++-----
 arch/powerpc/mm/hugetlbpage-hash64.c |   28 ++--------------
 4 files changed, 27 insertions(+), 86 deletions(-)

diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index 6fa450c..a1eebc1 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -20,6 +20,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		   pte_t *ptep, unsigned long trap, unsigned long flags,
 		   int ssize, int subpg_prot)
 {
+	real_pte_t rpte;
 	unsigned long hpte_group;
 	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte;
@@ -54,6 +55,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * need to add in 0x1 if it's a read-only user page
 	 */
 	rflags = htab_convert_pte_flags(new_pte);
+	rpte = __real_pte(__pte(old_pte), ptep);
 
 	if (cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -64,13 +66,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		/*
 		 * There MIGHT be an HPTE for this pte
 		 */
-		hash = hpt_hash(vpn, shift, ssize);
-		if (old_pte & H_PAGE_F_SECOND)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
+		unsigned long gslot = pte_get_hash_gslot(vpn, shift,
+						ssize, rpte, 0);
 
-		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_4K,
+		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_4K,
 					       MMU_PAGE_4K, ssize, flags) == -1)
 			old_pte &= ~_PAGE_HPTEFLAGS;
 	}
@@ -118,8 +117,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 			return -1;
 		}
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
-		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+		new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
 	}
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index e922a70..6c1c87a 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -39,9 +39,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 {
 	real_pte_t rpte;
 	unsigned long hpte_group;
-	unsigned long *hidxp;
 	unsigned int subpg_index;
-	unsigned long rflags, pa, hidx;
+	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte, subpg_pte;
 	unsigned long vpn, hash, slot, gslot;
 	unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift;
@@ -114,18 +113,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	if (__rpte_sub_valid(rpte, subpg_index)) {
 		int ret;
 
-		hash = hpt_hash(vpn, shift, ssize);
-		hidx = __rpte_to_hidx(rpte, subpg_index);
-		if (hidx & _PTEIDX_SECONDARY)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += hidx & _PTEIDX_GROUP_IX;
-
-		ret = mmu_hash_ops.hpte_updatepp(slot, rflags, vpn,
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte,
+				subpg_index);
+		ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn,
 						 MMU_PAGE_4K, MMU_PAGE_4K,
 						 ssize, flags);
 		/*
-		 *if we failed because typically the HPTE wasn't really here
+		 * if we failed because typically the HPTE wasn't really here
 		 * we try an insertion.
 		 */
 		if (ret == -1)
@@ -221,20 +215,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 				   MMU_PAGE_4K, MMU_PAGE_4K, old_pte);
 		return -1;
 	}
-	/*
-	 * Insert slot number & secondary bit in PTE second half,
-	 * clear H_PAGE_BUSY and set appropriate HPTE slot bit
-	 * Since we have H_PAGE_BUSY set on ptep, we can be sure
-	 * nobody is undating hidx.
-	 */
-	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
-	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
-	/*
-	 * check __real_pte for details on matching smp_rmb()
-	 */
-	smp_wmb();
-	new_pte |=  H_PAGE_HASHPTE;
+
+	new_pte |= pte_set_hash_slot(ptep, rpte, subpg_index, slot);
+	new_pte |= H_PAGE_HASHPTE;
+
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
 }
@@ -244,7 +228,6 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 		    unsigned long flags, int ssize)
 {
 	real_pte_t rpte;
-	unsigned long *hidxp;
 	unsigned long hpte_group;
 	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte;
@@ -289,18 +272,12 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 
 	vpn  = hpt_vpn(ea, vsid, ssize);
 	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
-		unsigned long hash, slot, hidx;
-
-		hash = hpt_hash(vpn, shift, ssize);
-		hidx = __rpte_to_hidx(rpte, 0);
-		if (hidx & _PTEIDX_SECONDARY)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += hidx & _PTEIDX_GROUP_IX;
+		unsigned long gslot;
 		/*
 		 * There MIGHT be an HPTE for this pte
 		 */
-		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
+		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_64K,
 					       MMU_PAGE_64K, ssize,
 					       flags) == -1)
 			old_pte &= ~_PAGE_HPTEFLAGS;
@@ -350,17 +327,8 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 			return -1;
 		}
 
-		/*
-		 * Insert slot number & secondary bit in PTE second half.
-		 */
-		hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-		rpte.hidx &= ~(0xfUL);
-		*hidxp = rpte.hidx  | (slot & 0xfUL);
-		/*
-		 * check __real_pte for details on matching smp_rmb()
-		 */
-		smp_wmb();
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
+		new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
 	}
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index d3604da..d863696 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1615,23 +1615,18 @@ unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
 void flush_hash_page(unsigned long vpn, real_pte_t pte, int psize, int ssize,
 		     unsigned long flags)
 {
-	unsigned long hash, index, shift, hidx, slot;
+	unsigned long index, shift, gslot;
 	int local = flags & HPTE_LOCAL_UPDATE;
 
 	DBG_LOW("flush_hash_page(vpn=%016lx)\n", vpn);
 	pte_iterate_hashed_subpages(pte, psize, vpn, index, shift) {
-		hash = hpt_hash(vpn, shift, ssize);
-		hidx = __rpte_to_hidx(pte, index);
-		if (hidx & _PTEIDX_SECONDARY)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += hidx & _PTEIDX_GROUP_IX;
-		DBG_LOW(" sub %ld: hash=%lx, hidx=%lx\n", index, slot, hidx);
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, pte, index);
+		DBG_LOW(" sub %ld: gslot=%lx\n", index, gslot);
 		/*
 		 * We use same base page size and actual psize, because we don't
 		 * use these functions for hugepage
 		 */
-		mmu_hash_ops.hpte_invalidate(slot, vpn, psize, psize,
+		mmu_hash_ops.hpte_invalidate(gslot, vpn, psize, psize,
 					     ssize, local);
 	} pte_iterate_hashed_end();
 
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index 5964b6d..e6dcd50 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -23,9 +23,6 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		     int ssize, unsigned int shift, unsigned int mmu_psize)
 {
 	real_pte_t rpte;
-#ifdef CONFIG_PPC_64K_PAGES
-	unsigned long *hidxp;
-#endif /* CONFIG_PPC_64K_PAGES */
 	unsigned long vpn;
 	unsigned long old_pte, new_pte;
 	unsigned long rflags, pa, sz;
@@ -76,16 +73,10 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 	/* Check if pte already has an hpte (case 2) */
 	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
 		/* There MIGHT be an HPTE for this pte */
-		unsigned long hash, slot, hidx;
+		unsigned long gslot;
 
-		hash = hpt_hash(vpn, shift, ssize);
-		hidx = __rpte_to_hidx(rpte, 0);
-		if (hidx & _PTEIDX_SECONDARY)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += hidx & _PTEIDX_GROUP_IX;
-
-		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, mmu_psize,
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
+		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, mmu_psize,
 					       mmu_psize, ssize, flags) == -1)
 			old_pte &= ~_PAGE_HPTEFLAGS;
 	}
@@ -112,18 +103,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 			return -1;
 		}
 
-#ifdef CONFIG_PPC_64K_PAGES
-		/*
-		 * Insert slot number & secondary bit in PTE second half.
-		 */
-		hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-		rpte.hidx &= ~(0xfUL);
-		*hidxp = rpte.hidx  | (slot & 0xfUL);
-		/*
-		 * check __real_pte for details on matching smp_rmb()
-		 */
-		smp_wmb();
-#endif /* CONFIG_PPC_64K_PAGES */
+		new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
 	}
 
 	/*
-- 
1.7.1


* Re: [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-07-21 18:52 ` [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
@ 2017-07-26 10:35   ` Aneesh Kumar K.V
  2017-07-26 16:06     ` Ram Pai
  0 siblings, 1 reply; 11+ messages in thread
From: Aneesh Kumar K.V @ 2017-07-26 10:35 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, mpe, khandual, bsingharora, hbabu, linuxram, mhocko

Ram Pai <linuxram@us.ibm.com> writes:

> Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6
> in the 4K backed HPTE pages. These bits continue to be used
> for 64K backed HPTE pages in this patch, but will be freed
> up in the next patch. The bit numbers are big-endian, as
> defined in ISA 3.0.
>
> The patch makes the following changes to the 4k hpte backed
> 64K PTE's format.
>
> H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the figure
> 		below)
> V0 which occupied bit 4 is not used anymore.
> V1 which occupied bit 5 is not used anymore.
> V2 which occupied bit 6 is not used anymore.
> V3 which occupied bit 7 is not used anymore.
>
> Before the patch, the 4k backed 64k PTE format was as follows
>
>  0 1 2 3 4  5  6  7  8 9 10...........................63
>  : : : : :  :  :  :  : : :                            :
>  v v v v v  v  v  v  v v v                            v
>
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
> |x|x|x|B|V0|V1|V2|V3|x| | |x|x|................|x|x|x|x| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
> |S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
>
> After the patch, the 4k backed 64k PTE format is as follows
>
>  0 1 2 3 4  5  6  7  8 9 10...........................63
>  : : : : :  :  :  :  : : :                            :
>  v v v v v  v  v  v  v v v                            v
>
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
> |x|x|x| |  |  |  |  |x|B| |x|x|................|.|.|.|.| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
> |S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
>
> The four bits S,G,I,X (one quadruplet per 4k HPTE) that
> cache the hash-bucket slot value are initialized to
> 1,1,1,1, indicating an invalid slot. If a HPTE gets
> cached in a 1111 slot (i.e. the 7th slot of the secondary hash
> bucket), it is released immediately. In other words,
> even though 1111 is a valid slot value in the hash
> bucket, we consider it invalid and release the slot and
> the HPTE. This gives us the opportunity to determine
> the validity of the S,G,I,X bits based on their contents and
> not on any of the bits V0,V1,V2 or V3 in the primary PTE.
>
> When we release a HPTE cached in the 1111 slot,
> we also release a legitimate slot in the primary
> hash bucket and unmap its corresponding HPTE. This
> is to ensure that we do get a HPTE cached in a slot
> of the primary hash bucket the next time we retry.
>
> Though treating the 1111 slot as invalid reduces the
> number of available slots in the hash bucket and may
> have an effect on performance, the probability of
> hitting a 1111 slot is extremely low.
>
> Compared to the current scheme, the scheme described above
> reduces the number of false hash table updates
> significantly and has the added advantage of
> releasing four valuable PTE bits for other purposes.
>
> NOTE: even though bits 3, 4, 5, 6, 7 are not used when
> the 64K PTE is backed by 4k HPTE, they continue to be
> used if the PTE gets backed by 64k HPTE. The next
> patch will decouple that as well, and truly release the
> bits.
>
> This idea was jointly developed by Paul Mackerras,
> Aneesh, Michael Ellerman and myself.
>
> 4K PTE format remains unchanged currently.
>
> The patch does the following code changes
> a) PTE flags are split between 64k and 4k  header files.
> b) __hash_page_4K()  is  reimplemented   to reflect the
>    above logic.
>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hash-4k.h  |    2 +
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |    8 +--
>  arch/powerpc/include/asm/book3s/64/hash.h     |    1 -
>  arch/powerpc/mm/hash64_64k.c                  |   74 ++++++++++++++++---------
>  arch/powerpc/mm/hash_utils_64.c               |    4 +-
>  5 files changed, 55 insertions(+), 34 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> index 0c4e470..f959c00 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> @@ -16,6 +16,8 @@
>  #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
>  #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
>
> +#define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
> +
>  /* PTE flags to conserve for HPTE identification */
>  #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
>  			 H_PAGE_F_SECOND | H_PAGE_F_GIX)
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 9732837..62e580c 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -12,18 +12,14 @@
>   */
>  #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
>  #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
> +#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */


Why are we moving H_PAGE_BUSY? Right now the 4k and 64k linux page table
formats look similar. We use the lower RPN bits only for subpage
tracking/details.


> +
>  /*
>   * We need to differentiate between explicit huge page and THP huge
>   * page, since THP huge page also need to track real subpage details
>   */
>  #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
>


-aneesh


* Re: [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-07-26 10:35   ` Aneesh Kumar K.V
@ 2017-07-26 16:06     ` Ram Pai
  2017-07-27  1:59       ` Aneesh Kumar K.V
  0 siblings, 1 reply; 11+ messages in thread
From: Ram Pai @ 2017-07-26 16:06 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: linuxppc-dev, benh, paulus, mpe, khandual, bsingharora, hbabu, mhocko

On Wed, Jul 26, 2017 at 04:05:48PM +0530, Aneesh Kumar K.V wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> > index 9732837..62e580c 100644
> > --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> > +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> > @@ -12,18 +12,14 @@
> >   */
> >  #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
> >  #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
> > +#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
> 
> 
> Why are we moving H_PAGE_BUSY. Right now 4k and 64k linux page table
> format looks similar.

The goal is to clear off all the _RPAGE_RSV* bits so that they can be
used for protection keys. The aim is to keep the protection bits in the
_RPAGE_RSV* bits, so that they will work as-is whenever the radix MMU
enables protection keys.

Yes, this makes the 64K PTE format differ from the 4K PTE format.
Hopefully it is a small inconvenience. The PTE format for 4K is not
exactly the same as the 64K PTE format anyway. For example, the higher
RPN bits are used on 4K but not on 64K, and the lower RPN bits are used
on 64K but not on 4K.

RP
> We use the lower RPN bits only for subpage
> tracking/details.
> 
> 
> > +
> >  /*
> >   * We need to differentiate between explicit huge page and THP huge
> >   * page, since THP huge page also need to track real subpage details
> >   */
> >  #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
> >
> 
> 
> -aneesh

-- 
Ram Pai


* Re: [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-07-26 16:06     ` Ram Pai
@ 2017-07-27  1:59       ` Aneesh Kumar K.V
  2017-07-27  8:00         ` Ram Pai
  0 siblings, 1 reply; 11+ messages in thread
From: Aneesh Kumar K.V @ 2017-07-27  1:59 UTC (permalink / raw)
  To: Ram Pai
  Cc: linuxppc-dev, benh, paulus, mpe, khandual, bsingharora, hbabu, mhocko



On 07/26/2017 09:36 PM, Ram Pai wrote:
> On Wed, Jul 26, 2017 at 04:05:48PM +0530, Aneesh Kumar K.V wrote:
>> Ram Pai <linuxram@us.ibm.com> writes:
>>

>>> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>>> index 9732837..62e580c 100644
>>> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
>>> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>>> @@ -12,18 +12,14 @@
>>>    */
>>>   #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
>>>   #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
>>> +#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
>>
>>
>> Why are we moving H_PAGE_BUSY. Right now 4k and 64k linux page table
>> format looks similar.
> 
> The goal is to clear off all the _RPAGE_RSV* bits so that they can be
> used for protection keys.  the aim is to keep the protection-bits in the
> _RPAGE_RSV* bits, so that they will work as-is whenever radix MMU enables
> protection keys.
> 
> Yes this makes the PTE format differ from 4k PTE. Hopefully it is a
> small inconvenience. The PTE format for 4K is anyway not exactly the
> same compared to 64K PTE format. For example, higher RPN bits are
> used on 4K but not on 64k. lower RPN bits are used on 64k but not
> on 4k.
I was wondering why this is done in this patch. In the next patch you do:

--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -12,7 +12,7 @@
   */
  #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
  #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
-#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
+#define H_PAGE_BUSY	_RPAGE_RPN44     /* software: PTE & hash are busy */



-aneesh


* Re: [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-07-27  1:59       ` Aneesh Kumar K.V
@ 2017-07-27  8:00         ` Ram Pai
  0 siblings, 0 replies; 11+ messages in thread
From: Ram Pai @ 2017-07-27  8:00 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: linuxppc-dev, benh, paulus, mpe, khandual, bsingharora, hbabu, mhocko

On Thu, Jul 27, 2017 at 07:29:32AM +0530, Aneesh Kumar K.V wrote:
> 
> 
> On 07/26/2017 09:36 PM, Ram Pai wrote:
> >On Wed, Jul 26, 2017 at 04:05:48PM +0530, Aneesh Kumar K.V wrote:
> >>Ram Pai <linuxram@us.ibm.com> writes:
> >>
> 
> >>>diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> >>>index 9732837..62e580c 100644
> >>>--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> >>>+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> >>>@@ -12,18 +12,14 @@
> >>>   */
> >>>  #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
> >>>  #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
> >>>+#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
> >>
> >>
> >>Why are we moving H_PAGE_BUSY. Right now 4k and 64k linux page table
> >>format looks similar.
> >
> >The goal is to clear off all the _RPAGE_RSV* bits so that they can be
> >used for protection keys.  the aim is to keep the protection-bits in the
> >_RPAGE_RSV* bits, so that they will work as-is whenever radix MMU enables
> >protection keys.
> >
> >Yes this makes the PTE format differ from 4k PTE. Hopefully it is a
> >small inconvenience. The PTE format for 4K is anyway not exactly the
> >same compared to 64K PTE format. For example, higher RPN bits are
> >used on 4K but not on 64k. lower RPN bits are used on 64k but not
> >on 4k.
> I was wondering why in this patch ? You do in the next patch

True. That is because in this patch we have not yet freed up bit
_RPAGE_RPN44; it is still used by H_PAGE_F_GIX for 64K backed HPTEs.
Hence I have temporarily parked H_PAGE_BUSY at _RPAGE_RPN42.

I could leave H_PAGE_BUSY at bit _RPAGE_RSV1 and move it to
_RPAGE_RPN44 in the next patch. But by doing so, I would not have
truly released bit _RPAGE_RSV1 for 4K backed hptes, as claimed in the
title of this patch.

> 
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -12,7 +12,7 @@
>   */
>  #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
>  #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
> -#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
> +#define H_PAGE_BUSY	_RPAGE_RPN44     /* software: PTE & hash are busy */
> 
...
-- 
Ram Pai

