linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 0/7] powerpc: Free up RPAGE_RSV bits
@ 2017-09-08 22:44 Ram Pai
  2017-09-08 22:44 ` [PATCH 1/7] powerpc: introduce pte_set_hash_slot() helper Ram Pai
                   ` (32 more replies)
  0 siblings, 33 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

RPAGE_RSV0..4 pte bits are currently used for hpte slot
tracking. We need these bits for memory-protection
keys. Luckily these four bits are relatively easier
to move among all the other candidate bits.

For 64K linux-ptes backed by 4K hptes, these bits are
used for tracking the validity of the slot value stored
in the second-part-of-the-pte. We devise a new mechanism
for tracking the validity without using those bits. The
mechanism is explained in the patch.

For 64K linux-ptes backed by 64K hptes, we simply move
the slot tracking bits to the second-part-of-the-pte.

The above mechanism is also used to free the bits for
hugetlb linux-ptes.

For 4K linux-ptes, we have only 3 free bits available.
We swizzle around the bits and release RPAGE_RSV{2,3,4}
for memory protection keys.

Testing:
--------
Has survived kernel compilation on multiple platforms:
p8 powernv hash-mode, p9 powernv hash-mode, p7 powervm,
p8 powervm, p8 kvm-guest.

Has survived git-bisect on p8 powernv with 64K pages
and with 4K pages.

History:
-------
This patchset is a spin-off from the memkey patchset.

version v9:
	(1) rearranged the patch order. First the helper
		routines are defined, followed by the
		patches that make use of the helpers.

version v8:
	(1) an additional patch added to free up
		RSV{2,3,4} on 4K linux-ptes.

version v7:
	(1) GIX bit reset change moved to the second
		patch -- noticed by Aneesh.
	(2) Separated these patches from the memkey patchset.
	(3) merged a bunch of patches, that used the
		helper functions, into one.

version v6:
	(1) No changes related to pte.

version v5:
	(1) No changes related to pte.

version v4:
	(1) No changes related to pte.

version v3:
	(1) split the patches into smaller consumable
		patches.
	(2) A bug fix while invalidating a hpte slot
		in __hash_page_4K()
		-- noticed by Aneesh

version v2:
	(1) fixed a bug in 4K hpte backed 64K pte,
		where page invalidation was not done
		correctly, and initialization of the
		second-part-of-the-pte was not done
		correctly if the pte was not yet
		hashed with a hpte.
		-- Reported by Aneesh.

version v1: Initial version

Ram Pai (7):
  powerpc: introduce pte_set_hash_slot() helper
  powerpc: introduce pte_get_hash_gslot() helper
  powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6
  powerpc: use helper functions to get and set hash slots
  powerpc: capture the PTE format changes in the dump pte report

 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   21 ++++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   61 ++++++++----
 arch/powerpc/include/asm/book3s/64/hash.h     |    8 +-
 arch/powerpc/mm/dump_linuxpagetables.c        |    3 +-
 arch/powerpc/mm/hash64_4k.c                   |   14 +--
 arch/powerpc/mm/hash64_64k.c                  |  131 +++++++++++++------------
 arch/powerpc/mm/hash_utils_64.c               |   35 +++++--
 arch/powerpc/mm/hugetlbpage-hash64.c          |   18 ++--
 8 files changed, 171 insertions(+), 120 deletions(-)


* [PATCH 1/7] powerpc: introduce pte_set_hash_slot() helper
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-13  7:55   ` Balbir Singh
  2017-10-19  4:52   ` Michael Ellerman
  2017-09-08 22:44 ` [PATCH 2/7] powerpc: introduce pte_get_hash_gslot() helper Ram Pai
                   ` (31 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Introduce pte_set_hash_slot(). It sets the (H_PAGE_F_SECOND|H_PAGE_F_GIX)
bits at the appropriate location in the PTE for 4K PTEs. For 64K PTEs,
it sets the bits in the second part of the PTE. Though the implementation
for the former just needs the slot parameter, it does take some additional
parameters to keep the prototype consistent.

This function will be handy as we work towards re-arranging the
bits in the later patches.
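
For context, the return value of this helper is meant to be OR'ed into
the new PTE value by the caller. A minimal usage sketch, modeled on the
call sites converted later in this series:

	new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
	/* 64K: commits the hidx; 4K: returns the F_SECOND/F_GIX bits */
	new_pte |= pte_set_hash_slot(ptep, rpte, subpg_index, slot);
	*ptep = __pte(new_pte & ~H_PAGE_BUSY);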

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   15 +++++++++++++++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   25 +++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 0c4e470..8909039 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -48,6 +48,21 @@ static inline int hash__hugepd_ok(hugepd_t hpd)
 }
 #endif
 
+/*
+ * 4k pte format is  different  from  64k  pte  format.  Saving  the
+ * hash_slot is just a matter of returning the pte bits that need to
+ * be modified. On 64k pte, things are a  little  more  involved and
+ * hence  needs   many   more  parameters  to  accomplish  the  same.
+ * However we  want  to abstract this out from the caller by keeping
+ * the prototype consistent across the two formats.
+ */
+static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
+			unsigned int subpg_index, unsigned long slot)
+{
+	return (slot << H_PAGE_F_GIX_SHIFT) &
+		(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+}
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 
 static inline char *get_hpte_slot_array(pmd_t *pmdp)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 9732837..6652669 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -74,6 +74,31 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 	return (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;
 }
 
+/*
+ * Commit the hash slot and return pte bits that needs to be modified.
+ * The caller is expected to modify the pte bits accordingly and
+ * commit the pte to memory.
+ */
+static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
+		unsigned int subpg_index, unsigned long slot)
+{
+	unsigned long *hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+
+	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
+	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
+	/*
+	 * Commit the hidx bits to memory before returning.
+	 * Anyone reading  pte  must  ensure hidx bits are
+	 * read  only  after  reading the pte by using the
+	 * read-side  barrier  smp_rmb(). __real_pte() can
+	 * help ensure that.
+	 */
+	smp_wmb();
+
+	/* no pte bits to be modified, return 0x0UL */
+	return 0x0UL;
+}
+
 #define __rpte_to_pte(r)	((r).pte)
 extern bool __rpte_sub_valid(real_pte_t rpte, unsigned long index);
 /*
-- 
1.7.1


* [PATCH 2/7] powerpc: introduce pte_get_hash_gslot() helper
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
  2017-09-08 22:44 ` [PATCH 1/7] powerpc: introduce pte_set_hash_slot() helper Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-13  9:32   ` Balbir Singh
  2017-09-08 22:44 ` [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
                   ` (30 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Introduce pte_get_hash_gslot(), which returns the slot number of the
HPTE in the global hash table.

This function will come in handy as we work towards re-arranging the
PTE bits in the later patches.
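
A minimal usage sketch, taken from the call sites converted later in
the series; the returned global slot feeds directly into the hash MMU
ops:

	gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, subpg_index);
	ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn,
			MMU_PAGE_4K, MMU_PAGE_4K, ssize, flags);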

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h |    3 +++
 arch/powerpc/mm/hash_utils_64.c           |   18 ++++++++++++++++++
 2 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index f884520..060c059 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -166,6 +166,9 @@ static inline int hash__pte_none(pte_t pte)
 	return (pte_val(pte) & ~H_PTE_NONE_MASK) == 0;
 }
 
+unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
+		int ssize, real_pte_t rpte, unsigned int subpg_index);
+
 /* This low level function performs the actual PTE insertion
  * Setting the PTE depends on the MMU type and other factors. It's
  * an horrible mess that I'm not going to try to clean up now but
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 67ec2e9..e68f053 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1591,6 +1591,24 @@ static inline void tm_flush_hash_page(int local)
 }
 #endif
 
+/*
+ * return the global hash slot, corresponding to the given
+ * pte, which contains the hpte.
+ */
+unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
+		int ssize, real_pte_t rpte, unsigned int subpg_index)
+{
+	unsigned long hash, slot, hidx;
+
+	hash = hpt_hash(vpn, shift, ssize);
+	hidx = __rpte_to_hidx(rpte, subpg_index);
+	if (hidx & _PTEIDX_SECONDARY)
+		hash = ~hash;
+	slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
+	slot += hidx & _PTEIDX_GROUP_IX;
+	return slot;
+}
+
 /* WARNING: This is called from hash_low_64.S, if you change this prototype,
  *          do not forget to update the assembly call site !
  */
-- 
1.7.1


* [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
  2017-09-08 22:44 ` [PATCH 1/7] powerpc: introduce pte_set_hash_slot() helper Ram Pai
  2017-09-08 22:44 ` [PATCH 2/7] powerpc: introduce pte_get_hash_gslot() helper Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-14  1:18   ` Balbir Singh
  2017-10-19  3:25   ` Michael Ellerman
  2017-09-08 22:44 ` [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
                   ` (29 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6
in the 4K backed HPTE pages. These bits continue to be
used for 64K backed HPTE pages in this patch, but will
be freed up in the next patch. The bit numbers are
big-endian, as defined in ISA 3.0.

The patch makes the following change to the 4K hpte
backed 64K PTE format:

H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the
		figure below)
V0 which occupied bit 4 is not used anymore.
V1 which occupied bit 5 is not used anymore.
V2 which occupied bit 6 is not used anymore.
V3 which occupied bit 7 is not used anymore.

Before the patch, the 4k backed 64k PTE format was as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x|B|V0|V1|V2|V3|x| | |x|x|................|x|x|x|x| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

After the patch, the 4k backed 64k PTE format is as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |  |  |  |  |x|B| |x|x|................|.|.|.|.| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
|S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

The four bits S,G,I,X (one quadruplet per 4K HPTE) that
cache the hash-bucket slot value are initialized to
1,1,1,1, indicating an invalid slot. If a HPTE gets
cached in a 1111 slot (i.e. the 7th slot of the secondary
hash bucket), it is released immediately. In other words,
even though 1111 is a valid slot value in the hash
bucket, we consider it invalid and release the slot and
the HPTE. This gives us the opportunity to determine the
validity of the S,G,I,X bits based on their contents and
not on any of the bits V0, V1, V2 or V3 in the primary PTE.
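
The test for a software-invalid slot reduces to a mask of the four
slot bits, as implemented by the hpte_soft_invalid() helper this
patch introduces:

	static inline bool hpte_soft_invalid(unsigned long slot)
	{
		return ((slot & 0xfUL) == 0xfUL);
	}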

When we release a HPTE cached in the 1111 slot,
we also release a legitimate slot in the primary
hash bucket and unmap its corresponding HPTE. This
is to ensure that we do get a HPTE cached in a slot
of the primary hash bucket the next time we retry.

Though treating the 1111 slot as invalid reduces the
number of available slots in the hash bucket and may
have an effect on performance, the probability of
hitting a 1111 slot is extremely low.

Compared to the current scheme, the above scheme
reduces the number of false hash table updates
significantly and has the added advantage of releasing
four valuable PTE bits for other purposes.

NOTE: even though bits 3, 4, 5, 6 and 7 are not used when
the 64K PTE is backed by a 4K HPTE, they continue to be
used if the PTE gets backed by a 64K HPTE. The next
patch will decouple that as well, and truly release the
bits.

This idea was jointly developed by Paul Mackerras,
Aneesh, Michael Ellerman and myself.

4K PTE format remains unchanged currently.

The patch makes the following code changes:
a) PTE flags are split between the 64K and 4K header files.
b) __hash_page_4K() is reimplemented to reflect the
   above logic.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |    2 +
 arch/powerpc/include/asm/book3s/64/hash-64k.h |    8 +--
 arch/powerpc/include/asm/book3s/64/hash.h     |    1 -
 arch/powerpc/mm/hash64_64k.c                  |  106 +++++++++++++------------
 arch/powerpc/mm/hash_utils_64.c               |    4 +-
 5 files changed, 63 insertions(+), 58 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 8909039..e66bfeb 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -16,6 +16,8 @@
 #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
+#define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
+
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
 			 H_PAGE_F_SECOND | H_PAGE_F_GIX)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 6652669..e038f1c 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -12,18 +12,14 @@
  */
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
+#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
+
 /*
  * We need to differentiate between explicit huge page and THP huge
  * page, since THP huge page also need to track real subpage details
  */
 #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
 
-/*
- * Used to track subpage group valid if H_PAGE_COMBO is set
- * This overloads H_PAGE_F_GIX and H_PAGE_F_SECOND
- */
-#define H_PAGE_COMBO_VALID	(H_PAGE_F_GIX | H_PAGE_F_SECOND)
-
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \
 			 H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 060c059..8ce4112 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -9,7 +9,6 @@
  */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
 #define H_PAGE_F_GIX_SHIFT	56
-#define H_PAGE_BUSY		_RPAGE_RSV1 /* software: PTE & hash are busy */
 #define H_PAGE_F_SECOND		_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
 #define H_PAGE_F_GIX		(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_HASHPTE		_RPAGE_RPN43	/* PTE has associated HPTE */
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index 1a68cb1..c6c5559 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -15,34 +15,22 @@
 #include <linux/mm.h>
 #include <asm/machdep.h>
 #include <asm/mmu.h>
+
 /*
- * index from 0 - 15
+ * return true, if the entry has a slot value which
+ * the software considers as invalid.
  */
-bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
+static inline bool hpte_soft_invalid(unsigned long slot)
 {
-	unsigned long g_idx;
-	unsigned long ptev = pte_val(rpte.pte);
-
-	g_idx = (ptev & H_PAGE_COMBO_VALID) >> H_PAGE_F_GIX_SHIFT;
-	index = index >> 2;
-	if (g_idx & (0x1 << index))
-		return true;
-	else
-		return false;
+	return ((slot & 0xfUL) == 0xfUL);
 }
+
 /*
  * index from 0 - 15
  */
-static unsigned long mark_subptegroup_valid(unsigned long ptev, unsigned long index)
+bool __rpte_sub_valid(real_pte_t rpte, unsigned long index)
 {
-	unsigned long g_idx;
-
-	if (!(ptev & H_PAGE_COMBO))
-		return ptev;
-	index = index >> 2;
-	g_idx = 0x1 << index;
-
-	return ptev | (g_idx << H_PAGE_F_GIX_SHIFT);
+	return !(hpte_soft_invalid(rpte.hidx >> (index << 2)));
 }
 
 int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
@@ -50,12 +38,11 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		   int ssize, int subpg_prot)
 {
 	real_pte_t rpte;
-	unsigned long *hidxp;
 	unsigned long hpte_group;
 	unsigned int subpg_index;
-	unsigned long rflags, pa, hidx;
+	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte, subpg_pte;
-	unsigned long vpn, hash, slot;
+	unsigned long vpn, hash, slot, gslot;
 	unsigned long shift = mmu_psize_defs[MMU_PAGE_4K].shift;
 
 	/*
@@ -126,18 +113,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	if (__rpte_sub_valid(rpte, subpg_index)) {
 		int ret;
 
-		hash = hpt_hash(vpn, shift, ssize);
-		hidx = __rpte_to_hidx(rpte, subpg_index);
-		if (hidx & _PTEIDX_SECONDARY)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += hidx & _PTEIDX_GROUP_IX;
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte,
+					subpg_index);
+		ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn,
+			MMU_PAGE_4K, MMU_PAGE_4K, ssize, flags);
 
-		ret = mmu_hash_ops.hpte_updatepp(slot, rflags, vpn,
-						 MMU_PAGE_4K, MMU_PAGE_4K,
-						 ssize, flags);
 		/*
-		 *if we failed because typically the HPTE wasn't really here
+		 * if we failed because typically the HPTE wasn't really here
 		 * we try an insertion.
 		 */
 		if (ret == -1)
@@ -148,6 +130,15 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	}
 
 htab_insert_hpte:
+
+	/*
+	 * initialize all hidx entries to invalid value,
+	 * the first time the PTE is about to allocate
+	 * a 4K hpte
+	 */
+	if (!(old_pte & H_PAGE_COMBO))
+		rpte.hidx = ~0x0UL;
+
 	/*
 	 * handle H_PAGE_4K_PFN case
 	 */
@@ -172,15 +163,41 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * Primary is full, try the secondary
 	 */
 	if (unlikely(slot == -1)) {
+		bool soft_invalid;
+
 		hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
 		slot = mmu_hash_ops.hpte_insert(hpte_group, vpn, pa,
 						rflags, HPTE_V_SECONDARY,
 						MMU_PAGE_4K, MMU_PAGE_4K,
 						ssize);
-		if (slot == -1) {
-			if (mftb() & 0x1)
+
+		soft_invalid = hpte_soft_invalid(slot);
+		if (unlikely(soft_invalid)) {
+			/*
+			 * We got a valid slot from a hardware point of view,
+			 * but we cannot use it, because we use this special
+			 * value, as defined by hpte_soft_invalid(), to
+			 * track invalid slots. Since we cannot use the
+			 * slot, invalidate it.
+			 */
+			gslot = slot & _PTEIDX_GROUP_IX;
+			mmu_hash_ops.hpte_invalidate(hpte_group+gslot, vpn,
+				MMU_PAGE_4K, MMU_PAGE_4K,
+				ssize, 0);
+		}
+
+		if (unlikely(slot == -1 || soft_invalid)) {
+			/*
+			 * For a soft-invalid slot, let's ensure that we
+			 * release a slot from the primary, with the
+			 * hope that we will acquire that slot next
+			 * time we try. This will ensure that we do not
+			 * get the same soft-invalid slot.
+			 */
+			if (soft_invalid || (mftb() & 0x1))
 				hpte_group = ((hash & htab_hash_mask) *
 					      HPTES_PER_GROUP) & ~0x7UL;
+
 			mmu_hash_ops.hpte_remove(hpte_group);
 			/*
 			 * FIXME!! Should be try the group from which we removed ?
@@ -198,21 +215,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 				   MMU_PAGE_4K, MMU_PAGE_4K, old_pte);
 		return -1;
 	}
-	/*
-	 * Insert slot number & secondary bit in PTE second half,
-	 * clear H_PAGE_BUSY and set appropriate HPTE slot bit
-	 * Since we have H_PAGE_BUSY set on ptep, we can be sure
-	 * nobody is undating hidx.
-	 */
-	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
-	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
-	new_pte = mark_subptegroup_valid(new_pte, subpg_index);
-	new_pte |=  H_PAGE_HASHPTE;
-	/*
-	 * check __real_pte for details on matching smp_rmb()
-	 */
-	smp_wmb();
+
+	new_pte |= pte_set_hash_slot(ptep, rpte, subpg_index, slot);
+	new_pte |= H_PAGE_HASHPTE;
+
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
 }
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index e68f053..a40c7bc 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -978,8 +978,9 @@ void __init hash__early_init_devtree(void)
 
 void __init hash__early_init_mmu(void)
 {
+#ifndef CONFIG_PPC_64K_PAGES
 	/*
-	 * We have code in __hash_page_64K() and elsewhere, which assumes it can
+	 * We have code in __hash_page_4K() and elsewhere, which assumes it can
 	 * do the following:
 	 *   new_pte |= (slot << H_PAGE_F_GIX_SHIFT) & (H_PAGE_F_SECOND | H_PAGE_F_GIX);
 	 *
@@ -990,6 +991,7 @@ void __init hash__early_init_mmu(void)
 	 * with a BUILD_BUG_ON().
 	 */
 	BUILD_BUG_ON(H_PAGE_F_SECOND != (1ul  << (H_PAGE_F_GIX_SHIFT + 3)));
+#endif /* CONFIG_PPC_64K_PAGES */
 
 	htab_init_page_sizes();
 
-- 
1.7.1


* [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (2 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-14  1:44   ` Balbir Singh
  2017-09-14  8:13   ` Benjamin Herrenschmidt
  2017-09-08 22:44 ` [PATCH 5/7] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6 Ram Pai
                   ` (28 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6
in the 64K backed HPTE pages. This, along with the earlier
patch, will entirely free up the four bits from the 64K PTE.
The bit numbers are big-endian, as defined in ISA 3.0.

This patch makes the following change to 64K PTEs backed
by 64K HPTEs:

H_PAGE_F_SECOND (S) which occupied bit 4 moves to the
	second part of the pte, to bit 60.
H_PAGE_F_GIX (G,I,X) which occupied bits 5, 6 and 7 also
	moves to the second part of the pte, to bits 61,
	62 and 63 respectively.

Since bit 7 is now freed up, we move H_PAGE_BUSY (B) from
bit 9 to bit 7.

The second part of the PTE will hold
(H_PAGE_F_SECOND|H_PAGE_F_GIX) at bits 60,61,62,63.
NOTE: none of the bits in the second part of the PTE were
used by 64K-HPTE backed PTEs before this change.

Before the patch, the 64K HPTE backed 64k PTE format was
as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |S |G |I |X |x|B| |x|x|................|x|x|x|x| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
| | | | |  |  |  |  | | | | |..................| | | | | <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

After the patch, the 64k HPTE backed 64k PTE format is
as follows

 0 1 2 3 4  5  6  7  8 9 10...........................63
 : : : : :  :  :  :  : : :                            :
 v v v v v  v  v  v  v v v                            v

,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x| |  |  |  |B |x| | |x|x|................|.|.|.|.| <- primary pte
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
| | | | |  |  |  |  | | | | |..................|S|G|I|X| <- secondary pte
'_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'

The above PTE changes are applicable to hugetlb pages as well.

The patch makes the following code changes:

a) moves H_PAGE_F_SECOND and H_PAGE_F_GIX to the 4K PTE
	header, since they are no longer needed by the 64K PTEs.
b) abstracts out __real_pte() and __rpte_to_hidx() so the
	caller need not know the bit location of the slot.
c) moves the slot bits to the second part of the pte.
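
With the slot bits always held in the second part of the PTE, reading
the hidx no longer depends on H_PAGE_COMBO; the reworked
__rpte_to_hidx() in this patch reduces to a shift and a mask:

	static inline unsigned long __rpte_to_hidx(real_pte_t rpte,
						   unsigned long index)
	{
		return ((rpte.hidx >> (index << 2)) & 0xfUL);
	}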

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |    3 ++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   29 +++++++++++-------------
 arch/powerpc/include/asm/book3s/64/hash.h     |    3 --
 arch/powerpc/mm/hash64_64k.c                  |   23 ++++++++-----------
 arch/powerpc/mm/hugetlbpage-hash64.c          |   18 ++++++---------
 5 files changed, 33 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index e66bfeb..dc153c6 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -16,6 +16,9 @@
 #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
+#define H_PAGE_F_GIX_SHIFT	56
+#define H_PAGE_F_SECOND	_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX	(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
 
 /* PTE flags to conserve for HPTE identification */
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index e038f1c..89ef5a9 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -12,7 +12,7 @@
  */
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
-#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
+#define H_PAGE_BUSY	_RPAGE_RPN44     /* software: PTE & hash are busy */
 
 /*
  * We need to differentiate between explicit huge page and THP huge
@@ -21,8 +21,7 @@
 #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
 
 /* PTE flags to conserve for HPTE identification */
-#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \
-			 H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)
+#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | H_PAGE_COMBO)
 /*
  * we support 16 fragments per PTE page of 64K size.
  */
@@ -50,24 +49,22 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
 	unsigned long *hidxp;
 
 	rpte.pte = pte;
-	rpte.hidx = 0;
-	if (pte_val(pte) & H_PAGE_COMBO) {
-		/*
-		 * Make sure we order the hidx load against the H_PAGE_COMBO
-		 * check. The store side ordering is done in __hash_page_4K
-		 */
-		smp_rmb();
-		hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
-		rpte.hidx = *hidxp;
-	}
+	/*
+	 * Ensure that we do not read the hidx before we read
+	 * the pte, because the writer side is expected to
+	 * finish writing the hidx first and then the pte,
+	 * ordered by smp_wmb().
+	 * pte_set_hash_slot() ensures that ordering.
+	 */
+	smp_rmb();
+	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
+	rpte.hidx = *hidxp;
 	return rpte;
 }
 
 static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
 {
-	if ((pte_val(rpte.pte) & H_PAGE_COMBO))
-		return (rpte.hidx >> (index<<2)) & 0xf;
-	return (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;
+	return ((rpte.hidx >> (index<<2)) & 0xfUL);
 }
 
 /*
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 8ce4112..46f3a23 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -8,9 +8,6 @@
  *
  */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
-#define H_PAGE_F_GIX_SHIFT	56
-#define H_PAGE_F_SECOND		_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
-#define H_PAGE_F_GIX		(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
 #define H_PAGE_HASHPTE		_RPAGE_RPN43	/* PTE has associated HPTE */
 
 #ifdef CONFIG_PPC_64K_PAGES
diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
index c6c5559..9c63844 100644
--- a/arch/powerpc/mm/hash64_64k.c
+++ b/arch/powerpc/mm/hash64_64k.c
@@ -103,8 +103,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		 * On hash insert failure we use old pte value and we don't
 		 * want slot information there if we have a insert failure.
 		 */
-		old_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
-		new_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
+		old_pte &= ~H_PAGE_HASHPTE;
+		new_pte &= ~H_PAGE_HASHPTE;
 		goto htab_insert_hpte;
 	}
 	/*
@@ -227,6 +227,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 		    unsigned long vsid, pte_t *ptep, unsigned long trap,
 		    unsigned long flags, int ssize)
 {
+	real_pte_t rpte;
 	unsigned long hpte_group;
 	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte;
@@ -263,6 +264,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 	} while (!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
 
 	rflags = htab_convert_pte_flags(new_pte);
+	rpte = __real_pte(__pte(old_pte), ptep);
 
 	if (cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -270,18 +272,13 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 
 	vpn  = hpt_vpn(ea, vsid, ssize);
 	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
+		unsigned long gslot;
 		/*
 		 * There MIGHT be an HPTE for this pte
 		 */
-		hash = hpt_hash(vpn, shift, ssize);
-		if (old_pte & H_PAGE_F_SECOND)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
-
-		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,
-					       MMU_PAGE_64K, ssize,
-					       flags) == -1)
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
+		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_64K,
+				MMU_PAGE_64K, ssize, flags) == -1)
 			old_pte &= ~_PAGE_HPTEFLAGS;
 	}
 
@@ -328,9 +325,9 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
 					   MMU_PAGE_64K, MMU_PAGE_64K, old_pte);
 			return -1;
 		}
+
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
-		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+		new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
 	}
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
index a84bb44..d52d667 100644
--- a/arch/powerpc/mm/hugetlbpage-hash64.c
+++ b/arch/powerpc/mm/hugetlbpage-hash64.c
@@ -22,6 +22,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		     pte_t *ptep, unsigned long trap, unsigned long flags,
 		     int ssize, unsigned int shift, unsigned int mmu_psize)
 {
+	real_pte_t rpte;
 	unsigned long vpn;
 	unsigned long old_pte, new_pte;
 	unsigned long rflags, pa, sz;
@@ -61,6 +62,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 	} while(!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
 
 	rflags = htab_convert_pte_flags(new_pte);
+	rpte = __real_pte(__pte(old_pte), ptep);
 
 	sz = ((1UL) << shift);
 	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -71,16 +73,11 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 	/* Check if pte already has an hpte (case 2) */
 	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
 		/* There MIGHT be an HPTE for this pte */
-		unsigned long hash, slot;
+		unsigned long gslot;
 
-		hash = hpt_hash(vpn, shift, ssize);
-		if (old_pte & H_PAGE_F_SECOND)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
-
-		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, mmu_psize,
-					       mmu_psize, ssize, flags) == -1)
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
+		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, mmu_psize,
+				mmu_psize, ssize, flags) == -1)
 			old_pte &= ~_PAGE_HPTEFLAGS;
 	}
 
@@ -106,8 +103,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 			return -1;
 		}
 
-		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+		new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
 	}
 
 	/*
-- 
1.7.1


* [PATCH 5/7] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (3 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-14  1:48   ` Balbir Singh
  2017-09-08 22:44 ` [PATCH 6/7] powerpc: use helper functions to get and set hash slots Ram Pai
                   ` (27 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

We need PTE bits 3, 4, 5, 6 and 57 to support protection-keys,
because these are the bits we want to consolidate on across all
configurations to support protection keys.

Bits 3, 4, 5 and 6 are currently used on 4K-pte kernels. But bits
9 and 10 are available. Hence we use the two available bits and
free up bits 5 and 6. We will still not be able to free up bits 3
and 4. In the absence of any other free bits, we will have to
stay satisfied with what we have :-(. This means we will not
be able to support 32 protection keys, but only 8. The bit
numbers are big-endian, as defined in ISA 3.0.

This patch makes the following change to the 4K PTE:

H_PAGE_F_SECOND (S) which occupied bit 4 moves to bit 7.
H_PAGE_F_GIX (G,I,X) which occupied bits 5, 6 and 7 also move
	to bits 8, 9 and 10 respectively.
H_PAGE_HASHPTE (H) which occupied bit 8 moves to bit 4.

Before the patch, the 4k PTE format was as follows

 0 1 2 3 4  5  6  7  8 9 10....................57.....63
 : : : : :  :  :  :  : : :                      :     :
 v v v v v  v  v  v  v v v                      v     v
,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x|B|S |G |I |X |H| | |x|x|................| |x|x|x|
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'

After the patch, the 4k PTE format is as follows

 0 1 2 3 4  5  6  7  8 9 10....................57.....63
 : : : : :  :  :  :  : : :                      :     :
 v v v v v  v  v  v  v v v                      v     v
,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
|x|x|x|B|H |  |  |S |G|I|X|x|x|................| |.|.|.|
'_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'

The patch has no code changes; just swizzles around bits.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |    7 ++++---
 arch/powerpc/include/asm/book3s/64/hash-64k.h |    1 +
 arch/powerpc/include/asm/book3s/64/hash.h     |    1 -
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index dc153c6..5187249 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -16,10 +16,11 @@
 #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
 #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
 
-#define H_PAGE_F_GIX_SHIFT	56
-#define H_PAGE_F_SECOND	_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
-#define H_PAGE_F_GIX	(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
+#define H_PAGE_F_GIX_SHIFT	53
+#define H_PAGE_F_SECOND	_RPAGE_RPN44	/* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX	(_RPAGE_RPN43 | _RPAGE_RPN42 | _RPAGE_RPN41)
 #define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
+#define H_PAGE_HASHPTE	_RPAGE_RSV2     /* software: PTE & hash are busy */
 
 /* PTE flags to conserve for HPTE identification */
 #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 89ef5a9..8576060 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -13,6 +13,7 @@
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
 #define H_PAGE_BUSY	_RPAGE_RPN44     /* software: PTE & hash are busy */
+#define H_PAGE_HASHPTE	_RPAGE_RPN43	/* PTE has associated HPTE */
 
 /*
  * We need to differentiate between explicit huge page and THP huge
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 46f3a23..953795e 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -8,7 +8,6 @@
  *
  */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
-#define H_PAGE_HASHPTE		_RPAGE_RPN43	/* PTE has associated HPTE */
 
 #ifdef CONFIG_PPC_64K_PAGES
 #include <asm/book3s/64/hash-64k.h>
-- 
1.7.1


* [PATCH 6/7] powerpc: use helper functions to get and set hash slots
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (4 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 5/7] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6 Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-08 22:44 ` [PATCH 7/7] powerpc: capture the PTE format changes in the dump pte report Ram Pai
                   ` (26 subsequent siblings)
  32 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Replace redundant code in __hash_page_4K() and flush_hash_page()
with the helper functions pte_get_hash_gslot() and pte_set_hash_slot().

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/mm/hash64_4k.c     |   14 ++++++--------
 arch/powerpc/mm/hash_utils_64.c |   13 ++++---------
 2 files changed, 10 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/mm/hash64_4k.c b/arch/powerpc/mm/hash64_4k.c
index 6fa450c..a1eebc1 100644
--- a/arch/powerpc/mm/hash64_4k.c
+++ b/arch/powerpc/mm/hash64_4k.c
@@ -20,6 +20,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		   pte_t *ptep, unsigned long trap, unsigned long flags,
 		   int ssize, int subpg_prot)
 {
+	real_pte_t rpte;
 	unsigned long hpte_group;
 	unsigned long rflags, pa;
 	unsigned long old_pte, new_pte;
@@ -54,6 +55,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 	 * need to add in 0x1 if it's a read-only user page
 	 */
 	rflags = htab_convert_pte_flags(new_pte);
+	rpte = __real_pte(__pte(old_pte), ptep);
 
 	if (cpu_has_feature(CPU_FTR_NOEXECUTE) &&
 	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
@@ -64,13 +66,10 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 		/*
 		 * There MIGHT be an HPTE for this pte
 		 */
-		hash = hpt_hash(vpn, shift, ssize);
-		if (old_pte & H_PAGE_F_SECOND)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
+		unsigned long gslot = pte_get_hash_gslot(vpn, shift,
+						ssize, rpte, 0);
 
-		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_4K,
+		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_4K,
 					       MMU_PAGE_4K, ssize, flags) == -1)
 			old_pte &= ~_PAGE_HPTEFLAGS;
 	}
@@ -118,8 +117,7 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
 			return -1;
 		}
 		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
-		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
-			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
+		new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
 	}
 	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
 	return 0;
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index a40c7bc..0dff57b 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1617,23 +1617,18 @@ unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
 void flush_hash_page(unsigned long vpn, real_pte_t pte, int psize, int ssize,
 		     unsigned long flags)
 {
-	unsigned long hash, index, shift, hidx, slot;
+	unsigned long index, shift, gslot;
 	int local = flags & HPTE_LOCAL_UPDATE;
 
 	DBG_LOW("flush_hash_page(vpn=%016lx)\n", vpn);
 	pte_iterate_hashed_subpages(pte, psize, vpn, index, shift) {
-		hash = hpt_hash(vpn, shift, ssize);
-		hidx = __rpte_to_hidx(pte, index);
-		if (hidx & _PTEIDX_SECONDARY)
-			hash = ~hash;
-		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
-		slot += hidx & _PTEIDX_GROUP_IX;
-		DBG_LOW(" sub %ld: hash=%lx, hidx=%lx\n", index, slot, hidx);
+		gslot = pte_get_hash_gslot(vpn, shift, ssize, pte, index);
+		DBG_LOW(" sub %ld: gslot=%lx\n", index, gslot);
 		/*
 		 * We use same base page size and actual psize, because we don't
 		 * use these functions for hugepage
 		 */
-		mmu_hash_ops.hpte_invalidate(slot, vpn, psize, psize,
+		mmu_hash_ops.hpte_invalidate(gslot, vpn, psize, psize,
 					     ssize, local);
 	} pte_iterate_hashed_end();
 
-- 
1.7.1


* [PATCH 7/7] powerpc: capture the PTE format changes in the dump pte report
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (5 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 6/7] powerpc: use helper functions to get and set hash slots Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-14  3:22   ` Balbir Singh
  2017-09-08 22:44 ` [PATCH 00/25] powerpc: Memory Protection Keys Ram Pai
                   ` (25 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

The H_PAGE_F_SECOND and H_PAGE_F_GIX bits are no longer in the
64K main PTE. Capture these changes in the dump pte report.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/mm/dump_linuxpagetables.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/mm/dump_linuxpagetables.c b/arch/powerpc/mm/dump_linuxpagetables.c
index c9282d2..0323dc4 100644
--- a/arch/powerpc/mm/dump_linuxpagetables.c
+++ b/arch/powerpc/mm/dump_linuxpagetables.c
@@ -213,7 +213,7 @@ struct flag_info {
 		.val	= H_PAGE_4K_PFN,
 		.set	= "4K_pfn",
 	}, {
-#endif
+#else /* CONFIG_PPC_64K_PAGES */
 		.mask	= H_PAGE_F_GIX,
 		.val	= H_PAGE_F_GIX,
 		.set	= "f_gix",
@@ -224,6 +224,7 @@ struct flag_info {
 		.val	= H_PAGE_F_SECOND,
 		.set	= "f_second",
 	}, {
+#endif /* CONFIG_PPC_64K_PAGES */
 #endif
 		.mask	= _PAGE_SPECIAL,
 		.val	= _PAGE_SPECIAL,
-- 
1.7.1


* [PATCH 00/25] powerpc: Memory Protection Keys
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (6 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 7/7] powerpc: capture the PTE format changes in the dump pte report Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-08 22:44 ` [PATCH 01/25] powerpc: initial pkey plumbing Ram Pai
                   ` (24 subsequent siblings)
  32 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Memory protection keys enable applications to protect their
address space from inadvertent access or corruption by
themselves.

These patches, along with the pte-bit freeing patch series,
enable the protection key feature on powerpc for both 4K and
64K hashpage kernels. A subsequent patch series that changes
the generic and x86 code will expose memkey features
through sysfs and provide testcases and Documentation
updates.

Patches are based on the powerpc -next branch. All patches
can be found at --
https://github.com/rampai/memorykeys.git memkey.v8

The overall idea:
-----------------
 A process allocates a key and associates it with
 an address range within its address space.
 The process then can dynamically set read/write
 permissions on the key without involving the
 kernel. Any code that violates the permissions
 of the address space, as defined by its associated
 key, will receive a segmentation fault.
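
 A rough user-space sketch of that flow (hedged: the wrapper
 names below follow the generic pkey syscall interface as
 exposed by glibc; they are not something added by this series):

	int pkey = pkey_alloc(0, 0);
	/* key-protect an existing mapping */
	pkey_mprotect(addr, len, PROT_READ | PROT_WRITE, pkey);
	/* later: deny writes purely in userspace (the AMR on powerpc) */
	pkey_set(pkey, PKEY_DISABLE_WRITE);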

This patch series enables the feature on the PPC64 HPTE
platform.

ISA 3.0, section 5.7.13, describes the detailed
specification.


High-level view of the design:
------------------------------
When an application associates a key with an address
range, program the key into the Linux PTE. When the MMU
detects a page fault, allocate a hash page and program
the key into the HPTE. And finally, when the MMU detects
a key violation due to an invalid application access,
invoke the registered signal handler and provide the
violated key number.
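
On the receiving end, a handler could look roughly like the sketch
below (an assumption, not code from this series: it relies on the
generic SEGV_PKUERR/si_pkey reporting, whose powerpc wiring arrives
in the later patches):

	static void segv_handler(int sig, siginfo_t *si, void *ctx)
	{
		if (si->si_code == SEGV_PKUERR)
			printf("access violated pkey %d\n", si->si_pkey);
	}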


Testing:
-------
This patch series has passed all the protection key
tests available in the selftests directory. The
tests are updated to work on both x86 and powerpc.
NOTE: All the selftest-related patches will be part
of a separate patch series.


History:
-------
version v8:
	(1) Contents of the AMR register withdrawn from
		the siginfo structure. Applications can
		always read the AMR register.
	(2) AMR/IAMR/UAMOR are now available through the
		ptrace system call. -- thanks to Thiago
	(3) code changes to handle legacy power cpus
		that do not support execute-disable.
	(4) incorporates many code improvement
		suggestions.

version v7:
	(1) refers to device tree property to enable
		protection keys.
	(2) adds 4K PTE support.
	(3) fixes a couple of bugs noticed by Thiago
	(4) decouples this patch series from arch-
	    independent code. This patch series can
	    now stand by itself, with one kludge
	    patch(2).

version v6:
	(1) selftest changes are broken down into 20
		incremental patches.
	(2) A separate key allocation mask that
		includes PKEY_DISABLE_EXECUTE is
		added for powerpc
	(3) pkey feature is enabled for the 64K HPT case
		only. RPT and 4K HPT are disabled.
	(4) Documentation is updated to better
		capture the semantics.
	(5) introduced arch_pkeys_enabled() to find
		if an arch enables pkeys. Corresponding
		change to the logic that displays the
		key value in smaps.
	(6) code rearranged in many places based on
		comments from Dave Hansen, Balbir,
		Anshuman.
	(7) fixed one bug where a bogus key could be
		associated successfully in
		pkey_mprotect().

version v5:
	(1) reverted back to the old design -- store
	    the key in the pte, instead of bypassing
	    it. The v4 design slowed down the hash
	    page path.
	(2) detects key violation when the kernel is
		told to access user pages.
	(3) further refined the patches into smaller
		consumable units
	(4) page fault handlers capture the faulting
	    key from the pte instead of the vma. This
	    closes a race between the key update in
	    the vma and a key fault caused by the key
	    programmed in the pte.
	(5) a key created with access-denied should
	    also set it up to deny write. Fixed it.
	(6) protection-key number is displayed in
		smaps the x86 way.

version v4:
	(1) patches no longer depend on the pte bits
		to program the hpte
			-- comment by Balbir
	(2) documentation updates
	(3) fixed a bug in the selftest.
	(4) unlike x86, powerpc lets the signal handler
		change key permission bits; the
		change will persist across signal
		handler boundaries. Earlier we
		allowed the signal handler to
		modify a field in the siginfo
		structure which would then be used
		by the kernel to program the key
		protection register (AMR)
		-- resolves an issue raised by Ben:
		"Calls to sys_swapcontext with a
		made-up context will end up with a
		crap AMR if done by code who didn't
		know about that register".
	(5) these changes enable protection keys on
		4k-page kernels as well.

version v3:
	(1) split the patches into smaller consumable
		patches.
	(2) added the ability to disable execute
		permission on a key at creation.
	(3) rename calc_pte_to_hpte_pkey_bits() to
	    pte_to_hpte_pkey_bits()
		-- suggested by Anshuman
	(4) some code optimization and clarity in
		do_page_fault()
	(5) A bug fix while invalidating a hpte slot
		in __hash_page_4K()
		-- noticed by Aneesh

version v2:
	(1) documentation and selftest added.
	(2) fixed a bug in 4K hpte backed 64K pte,
		where page invalidation was not done
		correctly, and initialization of the
		second-part-of-the-pte was not done
		correctly if the pte was not yet
		hashed with a hpte.
		-- Reported by Aneesh.
	(3) Fixed ABI breakage caused in the siginfo
		structure.
		-- Reported by Anshuman.

version v1: Initial version

Ram Pai (24):
  powerpc: initial pkey plumbing
  powerpc: define an additional vma bit for protection keys.
  powerpc: track allocation status of all pkeys
  powerpc: helper function to read,write AMR,IAMR,UAMOR registers
  powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
  powerpc: cleanup AMR, IAMR when a key is allocated or freed
  powerpc: implementation for arch_set_user_pkey_access()
  powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
  powerpc: ability to create execute-disabled pkeys
  powerpc: store and restore the pkey state across context switches
  powerpc: introduce execute-only pkey
  powerpc: ability to associate pkey to a vma
  powerpc: implementation for arch_override_mprotect_pkey()
  powerpc: map vma key-protection bits to pte key bits.
  powerpc: sys_pkey_mprotect() system call
  powerpc: Program HPTE key protection bits
  powerpc: helper to validate key-access permissions of a pte
  powerpc: check key protection for user page access
  powerpc: implementation for arch_vma_access_permitted()
  powerpc: Handle exceptions caused by pkey violation
  powerpc: introduce get_pte_pkey() helper
  powerpc: capture the violated protection key on fault
  powerpc: Deliver SEGV signal on pkey violation
  powerpc: Enable pkey subsystem

Thiago Jung Bauermann (1):
  powerpc/ptrace: Add memory protection key regset

 arch/powerpc/Kconfig                          |   16 +
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |   10 +
 arch/powerpc/include/asm/book3s/64/mmu.h      |   10 +
 arch/powerpc/include/asm/book3s/64/pgtable.h  |   74 +++++-
 arch/powerpc/include/asm/cputable.h           |   15 +-
 arch/powerpc/include/asm/mman.h               |   16 +-
 arch/powerpc/include/asm/mmu_context.h        |   21 ++
 arch/powerpc/include/asm/paca.h               |    3 +
 arch/powerpc/include/asm/pkeys.h              |  252 +++++++++++++++++
 arch/powerpc/include/asm/processor.h          |    5 +
 arch/powerpc/include/asm/systbl.h             |    3 +
 arch/powerpc/include/asm/unistd.h             |    6 +-
 arch/powerpc/include/uapi/asm/elf.h           |    1 +
 arch/powerpc/include/uapi/asm/mman.h          |    6 +
 arch/powerpc/include/uapi/asm/unistd.h        |    3 +
 arch/powerpc/kernel/asm-offsets.c             |    5 +
 arch/powerpc/kernel/process.c                 |   10 +
 arch/powerpc/kernel/prom.c                    |   19 ++
 arch/powerpc/kernel/ptrace.c                  |   66 +++++
 arch/powerpc/kernel/setup_64.c                |    4 +
 arch/powerpc/kernel/traps.c                   |   22 ++
 arch/powerpc/mm/Makefile                      |    1 +
 arch/powerpc/mm/fault.c                       |   46 +++-
 arch/powerpc/mm/hash_utils_64.c               |   26 ++
 arch/powerpc/mm/mmu_context_book3s64.c        |    2 +
 arch/powerpc/mm/pkeys.c                       |  374 +++++++++++++++++++++++++
 include/uapi/linux/elf.h                      |    1 +
 27 files changed, 1000 insertions(+), 17 deletions(-)
 create mode 100644 arch/powerpc/include/asm/pkeys.h
 create mode 100644 arch/powerpc/mm/pkeys.c


* [PATCH 01/25] powerpc: initial pkey plumbing
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (7 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 00/25] powerpc: Memory Protection Keys Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-14  3:32   ` Balbir Singh
  2017-10-19  4:20   ` Michael Ellerman
  2017-09-08 22:44 ` [PATCH 02/25] powerpc: define an additional vma bit for protection keys Ram Pai
                   ` (23 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Basic plumbing to initialize the pkey subsystem.
Nothing is enabled yet. A later patch will enable it
once all the infrastructure is in place.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/Kconfig                   |   16 +++++++++++
 arch/powerpc/include/asm/mmu_context.h |    5 +++
 arch/powerpc/include/asm/pkeys.h       |   45 ++++++++++++++++++++++++++++++++
 arch/powerpc/kernel/setup_64.c         |    4 +++
 arch/powerpc/mm/Makefile               |    1 +
 arch/powerpc/mm/hash_utils_64.c        |    1 +
 arch/powerpc/mm/pkeys.c                |   33 +++++++++++++++++++++++
 7 files changed, 105 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/pkeys.h
 create mode 100644 arch/powerpc/mm/pkeys.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9fc3c0b..a4cd210 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -864,6 +864,22 @@ config SECCOMP
 
 	  If unsure, say Y. Only embedded should say N here.
 
+config PPC64_MEMORY_PROTECTION_KEYS
+	prompt "PowerPC Memory Protection Keys"
+	def_bool y
+	# Note: only available in 64-bit mode
+	depends on PPC64
+	select ARCH_USES_HIGH_VMA_FLAGS
+	select ARCH_HAS_PKEYS
+	---help---
+	  Memory Protection Keys provides a mechanism for enforcing
+	  page-based protections, but without requiring modification of the
+	  page tables when an application changes protection domains.
+
+	  For details, see Documentation/vm/protection-keys.txt
+
+	  If unsure, say y.
+
 endmenu
 
 config ISA_DMA_API
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 3095925..7badf29 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -141,5 +141,10 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 	/* by default, allow everything */
 	return true;
 }
+
+#ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+#define pkey_initialize()
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 #endif /* __KERNEL__ */
 #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
new file mode 100644
index 0000000..c02305a
--- /dev/null
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -0,0 +1,45 @@
+#ifndef _ASM_PPC64_PKEYS_H
+#define _ASM_PPC64_PKEYS_H
+
+extern bool pkey_inited;
+extern bool pkey_execute_disable_support;
+#define ARCH_VM_PKEY_FLAGS 0
+
+static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
+{
+	return (pkey == 0);
+}
+
+static inline int mm_pkey_alloc(struct mm_struct *mm)
+{
+	return -1;
+}
+
+static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
+{
+	return -EINVAL;
+}
+
+/*
+ * Try to dedicate one of the protection keys to be used as an
+ * execute-only protection key.
+ */
+static inline int execute_only_pkey(struct mm_struct *mm)
+{
+	return 0;
+}
+
+static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
+		int prot, int pkey)
+{
+	return 0;
+}
+
+static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+		unsigned long init_val)
+{
+	return 0;
+}
+
+extern void pkey_initialize(void);
+#endif /*_ASM_PPC64_PKEYS_H */
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index b89c6aa..3b67014 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -37,6 +37,7 @@
 #include <linux/memblock.h>
 #include <linux/memory.h>
 #include <linux/nmi.h>
+#include <linux/pkeys.h>
 
 #include <asm/io.h>
 #include <asm/kdump.h>
@@ -316,6 +317,9 @@ void __init early_setup(unsigned long dt_ptr)
 	/* Initialize the hash table or TLB handling */
 	early_init_mmu();
 
+	/* initialize the key subsystem */
+	pkey_initialize();
+
 	/*
 	 * At this point, we can let interrupts switch to virtual mode
 	 * (the MMU has been setup), so adjust the MSR in the PACA to
diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index fb844d2..927620a 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -43,3 +43,4 @@ obj-$(CONFIG_PPC_COPRO_BASE)	+= copro_fault.o
 obj-$(CONFIG_SPAPR_TCE_IOMMU)	+= mmu_context_iommu.o
 obj-$(CONFIG_PPC_PTDUMP)	+= dump_linuxpagetables.o
 obj-$(CONFIG_PPC_HTDUMP)	+= dump_hashpagetable.o
+obj-$(CONFIG_PPC64_MEMORY_PROTECTION_KEYS)	+= pkeys.o
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 0dff57b..67f62b5 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -35,6 +35,7 @@
 #include <linux/memblock.h>
 #include <linux/context_tracking.h>
 #include <linux/libfdt.h>
+#include <linux/pkeys.h>
 
 #include <asm/debugfs.h>
 #include <asm/processor.h>
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
new file mode 100644
index 0000000..418a05b
--- /dev/null
+++ b/arch/powerpc/mm/pkeys.c
@@ -0,0 +1,33 @@
+/*
+ * PowerPC Memory Protection Keys management
+ * Copyright (c) 2015, Intel Corporation.
+ * Copyright (c) 2017, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+#include <linux/pkeys.h>                /* PKEY_*                       */
+
+bool pkey_inited;
+bool pkey_execute_disable_support;
+
+void __init pkey_initialize(void)
+{
+	/* disable the pkey system till everything
+	 * is in place. A patch further down the
+	 * line will enable it.
+	 */
+	pkey_inited = false;
+
+	/*
+	 * disable execute_disable support for now.
+	 * A patch further down will enable it.
+	 */
+	pkey_execute_disable_support = false;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 02/25] powerpc: define an additional vma bit for protection keys.
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (8 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 01/25] powerpc: initial pkey plumbing Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-14  4:38   ` Balbir Singh
  2017-10-23  9:25   ` Aneesh Kumar K.V
  2017-09-08 22:44 ` [PATCH 03/25] powerpc: track allocation status of all pkeys Ram Pai
                   ` (22 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

powerpc needs an additional vma bit to support 32 keys.
Till the additional vma bit lands in include/linux/mm.h
we have to define it in a powerpc-specific header file.
This is needed to get pkeys working on power.
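
As an illustration (not part of the patch), a minimal standalone
sketch of how five high vma-flag bits can carry a 5-bit pkey. The
shift mirrors VM_HIGH_ARCH_BIT_0 (bit 32); the mask name is made up
for the demo:

	#include <stdio.h>

	#define VM_PKEY_SHIFT	32	/* VM_HIGH_ARCH_BIT_0 */
	#define VM_PKEY_MASK	(0x1fUL << VM_PKEY_SHIFT) /* keys 0..31 */

	int main(void)
	{
		unsigned long vm_flags = 0;
		int pkey = 21;		/* any value 0..31 fits */

		vm_flags |= ((unsigned long)pkey << VM_PKEY_SHIFT)
				& VM_PKEY_MASK;
		printf("stored pkey = %lu\n",
		       (vm_flags & VM_PKEY_MASK) >> VM_PKEY_SHIFT);
		return 0;
	}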

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index c02305a..44e01a2 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -3,6 +3,24 @@
 
 extern bool pkey_inited;
 extern bool pkey_execute_disable_support;
+
+/*
+ * powerpc needs an additional vma bit to support 32 keys.
+ * Till the additional vma bit lands in include/linux/mm.h
+ * we have to carry the hunk below. This is  needed to get
+ * pkeys working on power. -- Ram
+ */
+#ifndef VM_HIGH_ARCH_BIT_4
+#define VM_HIGH_ARCH_BIT_4	36
+#define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
+#define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
+#define VM_PKEY_BIT0	VM_HIGH_ARCH_0
+#define VM_PKEY_BIT1	VM_HIGH_ARCH_1
+#define VM_PKEY_BIT2	VM_HIGH_ARCH_2
+#define VM_PKEY_BIT3	VM_HIGH_ARCH_3
+#define VM_PKEY_BIT4	VM_HIGH_ARCH_4
+#endif
+
 #define ARCH_VM_PKEY_FLAGS 0
 
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 03/25] powerpc: track allocation status of all pkeys
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (9 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 02/25] powerpc: define an additional vma bit for protection keys Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-10-07 10:02   ` Michael Ellerman
                     ` (3 more replies)
  2017-09-08 22:44 ` [PATCH 04/25] powerpc: helper function to read, write AMR, IAMR, UAMOR registers Ram Pai
                   ` (21 subsequent siblings)
  32 siblings, 4 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

A total of 32 keys are available on power7 and above.
However pkeys 0 and 1 are reserved, so effectively we
have 30 pkeys.

On 4K kernels, we do not have 5 bits in the PTE to
represent all 32 keys; we only have 3 bits, which is
enough for 8 keys. Two of those (pkey 0 and pkey 1)
are reserved, so effectively we have 6 pkeys.

This patch keeps track of reserved keys, allocated keys
and keys that are currently free.

It also adds the skeletal functions and macros that the
architecture-independent code expects to be available.
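
As an illustration (not from the patch), a userspace sketch of the
allocation-map idea: one bit per key, reserved keys pre-set, and a
first-zero-bit search for allocation. The kernel's ffz() is modelled
here with ffs() on the complement:

	#include <stdio.h>
	#include <strings.h>	/* ffs() */

	static unsigned int alloc_map = 0x3;	/* keys 0,1 reserved */

	static int key_alloc(void)
	{
		if (alloc_map == ~0x0u)
			return -1;	/* no zero bit to find */
		int key = ffs(~alloc_map) - 1;	/* first free key */
		alloc_map |= 1u << key;
		return key;
	}

	static void key_free(int key)
	{
		alloc_map &= ~(1u << key);
	}

	int main(void)
	{
		printf("first: %d\n", key_alloc());	/* 2 */
		printf("next : %d\n", key_alloc());	/* 3 */
		key_free(2);
		printf("again: %d\n", key_alloc());	/* 2, reused */
		return 0;
	}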

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu.h |    9 ++++
 arch/powerpc/include/asm/mmu_context.h   |    1 +
 arch/powerpc/include/asm/pkeys.h         |   72 ++++++++++++++++++++++++++++--
 arch/powerpc/mm/mmu_context_book3s64.c   |    2 +
 arch/powerpc/mm/pkeys.c                  |   28 ++++++++++++
 5 files changed, 108 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index c3b00e8..55950f4 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -107,6 +107,15 @@ struct patb_entry {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	struct list_head iommu_group_mem_list;
 #endif
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	/*
+	 * Each bit represents one protection key.
+	 * bit set   -> key allocated
+	 * bit unset -> key available for allocation
+	 */
+	u32 pkey_allocation_map;
+#endif
 } mm_context_t;
 
 /*
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 7badf29..c705a5d 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -144,6 +144,7 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 
 #ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 #define pkey_initialize()
+#define pkey_mm_init(mm)
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 44e01a2..133f8c4 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -3,6 +3,8 @@
 
 extern bool pkey_inited;
 extern bool pkey_execute_disable_support;
+extern int pkeys_total; /* total pkeys as per device tree */
+extern u32 initial_allocation_mask;/* bits set for reserved keys */
 
 /*
  * powerpc needs an additional vma bit to support 32 keys.
@@ -21,21 +23,76 @@
 #define VM_PKEY_BIT4	VM_HIGH_ARCH_4
 #endif
 
-#define ARCH_VM_PKEY_FLAGS 0
+#define arch_max_pkey()  pkeys_total
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
+				VM_PKEY_BIT3 | VM_PKEY_BIT4)
+
+#define pkey_alloc_mask(pkey) (0x1 << pkey)
+
+#define mm_pkey_allocation_map(mm)	(mm->context.pkey_allocation_map)
+
+#define mm_set_pkey_allocated(mm, pkey) {	\
+	mm_pkey_allocation_map(mm) |= pkey_alloc_mask(pkey); \
+}
+
+#define mm_set_pkey_free(mm, pkey) {	\
+	mm_pkey_allocation_map(mm) &= ~pkey_alloc_mask(pkey);	\
+}
+
+#define mm_set_pkey_is_allocated(mm, pkey)	\
+	(mm_pkey_allocation_map(mm) & pkey_alloc_mask(pkey))
+
+#define mm_set_pkey_is_reserved(mm, pkey) (initial_allocation_mask & \
+					pkey_alloc_mask(pkey))
 
 static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 {
-	return (pkey == 0);
+	/* a reserved key is never considered as 'explicitly allocated' */
+	return ((pkey < arch_max_pkey()) &&
+		!mm_set_pkey_is_reserved(mm, pkey) &&
+		mm_set_pkey_is_allocated(mm, pkey));
 }
 
+/*
+ * Returns a positive, 5-bit key on success, or -1 on failure.
+ */
 static inline int mm_pkey_alloc(struct mm_struct *mm)
 {
-	return -1;
+	/*
+	 * Note: this is the one and only place we make sure
+	 * that the pkey is valid as far as the hardware is
+	 * concerned.  The rest of the kernel trusts that
+	 * only good, valid pkeys come out of here.
+	 */
+	u32 all_pkeys_mask = (u32)(~(0x0));
+	int ret;
+
+	if (!pkey_inited)
+		return -1;
+	/*
+	 * Are we out of pkeys?  We must handle this specially
+	 * because ffz() behavior is undefined if there are no
+	 * zeros.
+	 */
+	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
+		return -1;
+
+	ret = ffz((u32)mm_pkey_allocation_map(mm));
+	mm_set_pkey_allocated(mm, ret);
+	return ret;
 }
 
 static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
 {
-	return -EINVAL;
+	if (!pkey_inited)
+		return -1;
+
+	if (!mm_pkey_is_allocated(mm, pkey))
+		return -EINVAL;
+
+	mm_set_pkey_free(mm, pkey);
+
+	return 0;
 }
 
 /*
@@ -59,5 +116,12 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	return 0;
 }
 
+static inline void pkey_mm_init(struct mm_struct *mm)
+{
+	if (!pkey_inited)
+		return;
+	mm_pkey_allocation_map(mm) = initial_allocation_mask;
+}
+
 extern void pkey_initialize(void);
 #endif /*_ASM_PPC64_PKEYS_H */
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index 05e1538..5df223a 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -16,6 +16,7 @@
 #include <linux/string.h>
 #include <linux/types.h>
 #include <linux/mm.h>
+#include <linux/pkeys.h>
 #include <linux/spinlock.h>
 #include <linux/idr.h>
 #include <linux/export.h>
@@ -118,6 +119,7 @@ static int hash__init_new_context(struct mm_struct *mm)
 
 	subpage_prot_init_new_context(mm);
 
+	pkey_mm_init(mm);
 	return index;
 }
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 418a05b..ebc9e84 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -16,9 +16,13 @@
 
 bool pkey_inited;
 bool pkey_execute_disable_support;
+int  pkeys_total;		/* total pkeys as per device tree */
+u32  initial_allocation_mask;	/* bits set for reserved keys */
 
 void __init pkey_initialize(void)
 {
+	int os_reserved, i;
+
 	/* disable the pkey system till everything
 	 * is in place. A patch further down the
 	 * line will enable it.
@@ -30,4 +34,28 @@ void __init pkey_initialize(void)
 	 * A patch further down will enable it.
 	 */
 	pkey_execute_disable_support = false;
+
+	/* Lets assume 32 keys */
+	pkeys_total = 32;
+
+#ifdef CONFIG_PPC_4K_PAGES
+	/*
+	 * the OS can manage only 8 pkeys
+	 * due to its inability to represent
+	 * them in the linux 4K-PTE.
+	 */
+	os_reserved = pkeys_total-8;
+#else
+	os_reserved = 0;
+#endif
+	/*
+	 * Bits are in LE format.
+	 * NOTE: keys 0 and 1 are reserved.
+	 * key 0 is the default key, which allows read/write/execute.
+	 * key 1 is recommended not to be used.
+	 * PowerISA(3.0) page 1015, programming note.
+	 */
+	initial_allocation_mask = ~0x0;
+	for (i = 2; i < (pkeys_total - os_reserved); i++)
+		initial_allocation_mask &= ~(0x1<<i);
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 04/25] powerpc: helper function to read, write AMR, IAMR, UAMOR registers
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (10 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 03/25] powerpc: track allocation status of all pkeys Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-10-18  3:17   ` [PATCH 04/25] powerpc: helper function to read,write AMR,IAMR,UAMOR registers Balbir Singh
  2017-09-08 22:44 ` [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers Ram Pai
                   ` (20 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Implements helper functions to read and write the key-related
registers: AMR, IAMR and UAMOR.

The AMR register tracks the read/write permissions of a key.
The IAMR register tracks the execute permission of a key.
The UAMOR register enables and disables a key.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   31 ++++++++++++++++++++++++++
 1 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index b9aff51..73ed52c 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -438,6 +438,37 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
 		pte_update(mm, addr, ptep, 0, _PAGE_PRIVILEGED, 1);
 }
 
+#include <asm/reg.h>
+static inline u64 read_amr(void)
+{
+	return mfspr(SPRN_AMR);
+}
+static inline void write_amr(u64 value)
+{
+	mtspr(SPRN_AMR, value);
+}
+extern bool pkey_execute_disable_support;
+static inline u64 read_iamr(void)
+{
+	if (pkey_execute_disable_support)
+		return mfspr(SPRN_IAMR);
+	else
+		return 0x0UL;
+}
+static inline void write_iamr(u64 value)
+{
+	if (pkey_execute_disable_support)
+		mtspr(SPRN_IAMR, value);
+}
+static inline u64 read_uamor(void)
+{
+	return mfspr(SPRN_UAMOR);
+}
+static inline void write_uamor(u64 value)
+{
+	mtspr(SPRN_UAMOR, value);
+}
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long addr, pte_t *ptep)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (11 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 04/25] powerpc: helper function to read, write AMR, IAMR, UAMOR registers Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-10-18  3:24   ` Balbir Singh
  2017-10-24  6:25   ` Aneesh Kumar K.V
  2017-09-08 22:44 ` [PATCH 06/25] powerpc: cleaup AMR, iAMR when a key is allocated or freed Ram Pai
                   ` (19 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Introduce helper functions that can initialize the bits in the AMR,
IAMR and UAMOR registers; the bits that correspond to the given pkey.
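
As a worked example (not part of the patch), the AMR/IAMR layout the
helpers write to packs two bits per key, with key 0 in the topmost
pair. The pkeyshift() formula below is the one this patch adds:

	#include <stdio.h>
	#include <stdint.h>

	#define AMR_BITS_PER_PKEY 2
	#define PKEY_REG_BITS (sizeof(uint64_t) * 8)
	#define pkeyshift(pkey) \
		(PKEY_REG_BITS - ((pkey) + 1) * AMR_BITS_PER_PKEY)

	int main(void)
	{
		uint64_t amr = 0;
		int pkey = 2;	/* shift = 64 - 3*2 = 58 */

		amr |= 0x3ULL << pkeyshift(pkey); /* deny read+write */
		printf("AMR: 0x%016llx\n",	/* 0x0c00000000000000 */
		       (unsigned long long)amr);
		return 0;
	}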

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h |    1 +
 arch/powerpc/mm/pkeys.c          |   46 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 133f8c4..5a83ed7 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -26,6 +26,7 @@
 #define arch_max_pkey()  pkeys_total
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
 				VM_PKEY_BIT3 | VM_PKEY_BIT4)
+#define AMR_BITS_PER_PKEY 2
 
 #define pkey_alloc_mask(pkey) (0x1 << pkey)
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index ebc9e84..178aa33 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -59,3 +59,49 @@ void __init pkey_initialize(void)
 	for (i = 2; i < (pkeys_total - os_reserved); i++)
 		initial_allocation_mask &= ~(0x1<<i);
 }
+
+#define PKEY_REG_BITS (sizeof(u64)*8)
+#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
+
+static inline void init_amr(int pkey, u8 init_bits)
+{
+	u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
+	u64 old_amr = read_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
+
+	write_amr(old_amr | new_amr_bits);
+}
+
+static inline void init_iamr(int pkey, u8 init_bits)
+{
+	u64 new_iamr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
+	u64 old_iamr = read_iamr() & ~((u64)(0x3ul) << pkeyshift(pkey));
+
+	write_iamr(old_iamr | new_iamr_bits);
+}
+
+static void pkey_status_change(int pkey, bool enable)
+{
+	u64 old_uamor;
+
+	/* reset the AMR and IAMR bits for this key */
+	init_amr(pkey, 0x0);
+	init_iamr(pkey, 0x0);
+
+	/* enable/disable key */
+	old_uamor = read_uamor();
+	if (enable)
+		old_uamor |= (0x3ul << pkeyshift(pkey));
+	else
+		old_uamor &= ~(0x3ul << pkeyshift(pkey));
+	write_uamor(old_uamor);
+}
+
+void __arch_activate_pkey(int pkey)
+{
+	pkey_status_change(pkey, true);
+}
+
+void __arch_deactivate_pkey(int pkey)
+{
+	pkey_status_change(pkey, false);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 06/25] powerpc: cleanup AMR, iAMR when a key is allocated or freed
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (12 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-10-18  3:34   ` [PATCH 06/25] powerpc: cleanup AMR,iAMR " Balbir Singh
  2017-10-23  9:43   ` [PATCH 06/25] powerpc: cleanup AMR, iAMR " Aneesh Kumar K.V
  2017-09-08 22:44 ` [PATCH 07/25] powerpc: implementation for arch_set_user_pkey_access() Ram Pai
                   ` (18 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Clean up the bits corresponding to a key in the AMR and IAMR
registers when the key is newly allocated/activated or freed.
We don't want residual bits to cause the hardware to enforce
unintended behavior when the key is activated or freed.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 5a83ed7..53bf13b 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -54,6 +54,8 @@ static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
 		mm_set_pkey_is_allocated(mm, pkey));
 }
 
+extern void __arch_activate_pkey(int pkey);
+extern void __arch_deactivate_pkey(int pkey);
 /*
  * Returns a positive, 5-bit key on success, or -1 on failure.
  */
@@ -80,6 +82,12 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
 
 	ret = ffz((u32)mm_pkey_allocation_map(mm));
 	mm_set_pkey_allocated(mm, ret);
+
+	/*
+	 * enable the key in the hardware
+	 */
+	if (ret > 0)
+		__arch_activate_pkey(ret);
 	return ret;
 }
 
@@ -91,6 +99,10 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
 	if (!mm_pkey_is_allocated(mm, pkey))
 		return -EINVAL;
 
+	/*
+	 * Disable the key in the hardware
+	 */
+	__arch_deactivate_pkey(pkey);
 	mm_set_pkey_free(mm, pkey);
 
 	return 0;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 07/25] powerpc: implementation for arch_set_user_pkey_access()
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (13 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 06/25] powerpc: cleanup AMR, iAMR when a key is allocated or freed Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-09-08 22:44 ` [PATCH 08/25] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls Ram Pai
                   ` (17 subsequent siblings)
  32 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

This patch provides the detailed implementation for
a user to allocate a key and enable it in the hardware.

It provides the plumbing, but it cannot be used until
the system call is implemented. The next patch will
do so.
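
As an illustration (not part of the patch), the init_val-to-AMR-bits
translation that __arch_set_user_pkey_access() performs, modelled as
a standalone function. The constant values match the generic uapi
defines and this patch:

	#include <stdio.h>
	#include <stdint.h>

	#define PKEY_DISABLE_ACCESS	0x1
	#define PKEY_DISABLE_WRITE	0x2
	#define AMR_RD_BIT		0x1UL
	#define AMR_WR_BIT		0x2UL

	static uint64_t amr_bits_for(unsigned long init_val)
	{
		uint64_t bits = 0;

		if (init_val & PKEY_DISABLE_ACCESS)
			bits |= AMR_RD_BIT | AMR_WR_BIT; /* no access */
		else if (init_val & PKEY_DISABLE_WRITE)
			bits |= AMR_WR_BIT;		 /* read-only */
		return bits;
	}

	int main(void)
	{
		printf("0x%llx 0x%llx\n",	/* 0x3 0x2 */
		       (unsigned long long)amr_bits_for(PKEY_DISABLE_ACCESS),
		       (unsigned long long)amr_bits_for(PKEY_DISABLE_WRITE));
		return 0;
	}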

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h |    9 ++++++++-
 arch/powerpc/mm/pkeys.c          |   28 ++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 53bf13b..7fd48a4 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -24,6 +24,9 @@
 #endif
 
 #define arch_max_pkey()  pkeys_total
+#define AMR_RD_BIT 0x1UL
+#define AMR_WR_BIT 0x2UL
+#define IAMR_EX_BIT 0x1UL
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
 				VM_PKEY_BIT3 | VM_PKEY_BIT4)
 #define AMR_BITS_PER_PKEY 2
@@ -123,10 +126,14 @@ static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 	return 0;
 }
 
+extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+		unsigned long init_val);
 static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val)
 {
-	return 0;
+	if (!pkey_inited)
+		return -EINVAL;
+	return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
 static inline void pkey_mm_init(struct mm_struct *mm)
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 178aa33..cc5be6a 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -12,6 +12,7 @@
  * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
  * more details.
  */
+#include <asm/mman.h>
 #include <linux/pkeys.h>                /* PKEY_*                       */
 
 bool pkey_inited;
@@ -63,6 +64,11 @@ void __init pkey_initialize(void)
 #define PKEY_REG_BITS (sizeof(u64)*8)
 #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
 
+static bool is_pkey_enabled(int pkey)
+{
+	return !!(read_uamor() & (0x3ul << pkeyshift(pkey)));
+}
+
 static inline void init_amr(int pkey, u8 init_bits)
 {
 	u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
@@ -105,3 +111,25 @@ void __arch_deactivate_pkey(int pkey)
 {
 	pkey_status_change(pkey, false);
 }
+
+/*
+ * Set the access rights in the AMR, IAMR and UAMOR registers
+ * for @pkey to that specified in @init_val.
+ */
+int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
+		unsigned long init_val)
+{
+	u64 new_amr_bits = 0x0ul;
+
+	if (!is_pkey_enabled(pkey))
+		return -EINVAL;
+
+	/* Set the bits we need in AMR:  */
+	if (init_val & PKEY_DISABLE_ACCESS)
+		new_amr_bits |= AMR_RD_BIT | AMR_WR_BIT;
+	else if (init_val & PKEY_DISABLE_WRITE)
+		new_amr_bits |= AMR_WR_BIT;
+
+	init_amr(pkey, new_amr_bits);
+	return 0;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 08/25] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (14 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 07/25] powerpc: implementation for arch_set_user_pkey_access() Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-10-24 15:48   ` Michael Ellerman
  2017-09-08 22:44 ` [PATCH 09/25] powerpc: ability to create execute-disabled pkeys Ram Pai
                   ` (16 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Finally, this patch provides the ability for a process to
allocate and free a protection key.
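
A minimal userspace sketch (not part of the patch) exercising the two
new calls through syscall(2), using the numbers this patch assigns on
powerpc. With glibc support in place you would call pkey_alloc() and
pkey_free() directly:

	#include <stdio.h>
	#include <unistd.h>
	#include <sys/syscall.h>

	#ifndef __NR_pkey_alloc
	#define __NR_pkey_alloc	384	/* from this patch */
	#define __NR_pkey_free	385
	#endif

	int main(void)
	{
		/* flags = 0, initial access rights = 0 (allow all) */
		long pkey = syscall(__NR_pkey_alloc, 0, 0);

		if (pkey < 0) {
			perror("pkey_alloc");
			return 1;
		}
		printf("got pkey %ld\n", pkey);
		syscall(__NR_pkey_free, (int)pkey);
		return 0;
	}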

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/systbl.h      |    2 ++
 arch/powerpc/include/asm/unistd.h      |    4 +---
 arch/powerpc/include/uapi/asm/unistd.h |    2 ++
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 1c94708..22dd776 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -388,3 +388,5 @@
 COMPAT_SYS_SPU(pwritev2)
 SYSCALL(kexec_file_load)
 SYSCALL(statx)
+SYSCALL(pkey_alloc)
+SYSCALL(pkey_free)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index 9ba11db..e0273bc 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,13 +12,11 @@
 #include <uapi/asm/unistd.h>
 
 
-#define NR_syscalls		384
+#define NR_syscalls		386
 
 #define __NR__exit __NR_exit
 
 #define __IGNORE_pkey_mprotect
-#define __IGNORE_pkey_alloc
-#define __IGNORE_pkey_free
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index b85f142..7993a07 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -394,5 +394,7 @@
 #define __NR_pwritev2		381
 #define __NR_kexec_file_load	382
 #define __NR_statx		383
+#define __NR_pkey_alloc		384
+#define __NR_pkey_free		385
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 09/25] powerpc: ability to create execute-disabled pkeys
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (15 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 08/25] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-10-18  3:42   ` Balbir Singh
  2017-10-24  4:36   ` Aneesh Kumar K.V
  2017-09-08 22:44 ` [PATCH 10/25] powerpc: store and restore the pkey state across context switches Ram Pai
                   ` (15 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

powerpc has hardware support to disable execute on a pkey.
This patch enables the ability to create execute-disabled
keys.
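
A userspace sketch (not part of the patch) of allocating such a key:
passing PKEY_DISABLE_EXECUTE as the initial access rights should now
be accepted, since the patch folds it into PKEY_ACCESS_MASK:

	#include <stdio.h>
	#include <unistd.h>
	#include <sys/syscall.h>

	#ifndef PKEY_DISABLE_EXECUTE
	#define PKEY_DISABLE_EXECUTE	0x4	/* from this patch */
	#endif
	#ifndef __NR_pkey_alloc
	#define __NR_pkey_alloc		384	/* powerpc, patch 08 */
	#endif

	int main(void)
	{
		long pkey = syscall(__NR_pkey_alloc, 0,
				    PKEY_DISABLE_EXECUTE);

		if (pkey < 0)
			perror("pkey_alloc(PKEY_DISABLE_EXECUTE)");
		else
			printf("execute-disabled pkey: %ld\n", pkey);
		return 0;
	}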

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/uapi/asm/mman.h |    6 ++++++
 arch/powerpc/mm/pkeys.c              |   16 ++++++++++++++++
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
index ab45cc2..f272b09 100644
--- a/arch/powerpc/include/uapi/asm/mman.h
+++ b/arch/powerpc/include/uapi/asm/mman.h
@@ -45,4 +45,10 @@
 #define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)	/* 1GB   HugeTLB Page */
 #define MAP_HUGE_16GB	(34 << MAP_HUGE_SHIFT)	/* 16GB  HugeTLB Page */
 
+/* override any generic PKEY Permission defines */
+#define PKEY_DISABLE_EXECUTE   0x4
+#undef PKEY_ACCESS_MASK
+#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
+				PKEY_DISABLE_WRITE  |\
+				PKEY_DISABLE_EXECUTE)
 #endif /* _UAPI_ASM_POWERPC_MMAN_H */
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index cc5be6a..2282864 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -24,6 +24,14 @@ void __init pkey_initialize(void)
 {
 	int os_reserved, i;
 
+	/*
+	 * we define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
+	 * generic defines for PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.
+	 * Ensure that the bits are distinct.
+	 */
+	BUILD_BUG_ON(PKEY_DISABLE_EXECUTE &
+		     (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+
 	/* disable the pkey system till everything
 	 * is in place. A patch further down the
 	 * line will enable it.
@@ -120,10 +128,18 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 		unsigned long init_val)
 {
 	u64 new_amr_bits = 0x0ul;
+	u64 new_iamr_bits = 0x0ul;
 
 	if (!is_pkey_enabled(pkey))
 		return -EINVAL;
 
+	if ((init_val & PKEY_DISABLE_EXECUTE)) {
+		if (!pkey_execute_disable_support)
+			return -EINVAL;
+		new_iamr_bits |= IAMR_EX_BIT;
+	}
+	init_iamr(pkey, new_iamr_bits);
+
 	/* Set the bits we need in AMR:  */
 	if (init_val & PKEY_DISABLE_ACCESS)
 		new_amr_bits |= AMR_RD_BIT | AMR_WR_BIT;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 10/25] powerpc: store and restore the pkey state across context switches
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (16 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 09/25] powerpc: ability to create execute-disabled pkeys Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-10-18  3:49   ` Balbir Singh
  2017-09-08 22:44 ` [PATCH 11/25] powerpc: introduce execute-only pkey Ram Pai
                   ` (14 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Store and restore the AMR, IAMR and UAMOR register state of the task
before scheduling out and after scheduling in, respectively.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h     |    4 +++
 arch/powerpc/include/asm/processor.h |    5 ++++
 arch/powerpc/kernel/process.c        |   10 ++++++++
 arch/powerpc/mm/pkeys.c              |   39 ++++++++++++++++++++++++++++++++++
 4 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 7fd48a4..78c5362 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -143,5 +143,9 @@ static inline void pkey_mm_init(struct mm_struct *mm)
 	mm_pkey_allocation_map(mm) = initial_allocation_mask;
 }
 
+extern void thread_pkey_regs_save(struct thread_struct *thread);
+extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
+			struct thread_struct *old_thread);
+extern void thread_pkey_regs_init(struct thread_struct *thread);
 extern void pkey_initialize(void);
 #endif /*_ASM_PPC64_PKEYS_H */
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index fab7ff8..de9d9ba 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -309,6 +309,11 @@ struct thread_struct {
 	struct thread_vr_state ckvr_state; /* Checkpointed VR state */
 	unsigned long	ckvrsave; /* Checkpointed VRSAVE */
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	unsigned long	amr;
+	unsigned long	iamr;
+	unsigned long	uamor;
+#endif
 #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
 	void*		kvm_shadow_vcpu; /* KVM internal data */
 #endif /* CONFIG_KVM_BOOK3S_32_HANDLER */
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index a0c74bb..ba80002 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -42,6 +42,7 @@
 #include <linux/hw_breakpoint.h>
 #include <linux/uaccess.h>
 #include <linux/elf-randomize.h>
+#include <linux/pkeys.h>
 
 #include <asm/pgtable.h>
 #include <asm/io.h>
@@ -1085,6 +1086,9 @@ static inline void save_sprs(struct thread_struct *t)
 		t->tar = mfspr(SPRN_TAR);
 	}
 #endif
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	thread_pkey_regs_save(t);
+#endif
 }
 
 static inline void restore_sprs(struct thread_struct *old_thread,
@@ -1120,6 +1124,9 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 			mtspr(SPRN_TAR, new_thread->tar);
 	}
 #endif
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	thread_pkey_regs_restore(new_thread, old_thread);
+#endif
 }
 
 #ifdef CONFIG_PPC_BOOK3S_64
@@ -1705,6 +1712,9 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
 	current->thread.tm_tfiar = 0;
 	current->thread.load_tm = 0;
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	thread_pkey_regs_init(&current->thread);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 }
 EXPORT_SYMBOL(start_thread);
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 2282864..7cd1be4 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -149,3 +149,42 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	init_amr(pkey, new_amr_bits);
 	return 0;
 }
+
+void thread_pkey_regs_save(struct thread_struct *thread)
+{
+	if (!pkey_inited)
+		return;
+
+	/* @TODO skip saving any registers if the thread
+	 * has not used any keys yet.
+	 */
+
+	thread->amr = read_amr();
+	thread->iamr = read_iamr();
+	thread->uamor = read_uamor();
+}
+
+void thread_pkey_regs_restore(struct thread_struct *new_thread,
+			struct thread_struct *old_thread)
+{
+	if (!pkey_inited)
+		return;
+
+	/* @TODO just reset uamor to zero if the new_thread
+	 * has not used any keys yet.
+	 */
+
+	if (old_thread->amr != new_thread->amr)
+		write_amr(new_thread->amr);
+	if (old_thread->iamr != new_thread->iamr)
+		write_iamr(new_thread->iamr);
+	if (old_thread->uamor != new_thread->uamor)
+		write_uamor(new_thread->uamor);
+}
+
+void thread_pkey_regs_init(struct thread_struct *thread)
+{
+	write_amr(0x0ul);
+	write_iamr(0x0ul);
+	write_uamor(0x0ul);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 11/25] powerpc: introduce execute-only pkey
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (17 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 10/25] powerpc: store and restore the pkey state across context switches Ram Pai
@ 2017-09-08 22:44 ` Ram Pai
  2017-10-18  4:15   ` Balbir Singh
  2017-09-08 22:45 ` [PATCH 12/25] powerpc: ability to associate pkey to a vma Ram Pai
                   ` (13 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:44 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

This patch provides the implementation of the execute-only pkey.
The architecture-independent layer expects the arch-dependent
layer to support the ability to create and enable a special
key which has execute-only permission.
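
From userspace (a sketch, not part of the patch) no new API is
involved: a plain mprotect(PROT_EXEC) is enough for the kernel to
back the mapping with the execute-only pkey transparently:

	#include <stdio.h>
	#include <sys/mman.h>

	int main(void)
	{
		size_t len = 4096;
		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;
		/* ... emit code into p here ... */
		if (mprotect(p, len, PROT_EXEC)) /* execute-only */
			perror("mprotect");
		/* loads/stores to p should now take a key fault */
		munmap(p, len);
		return 0;
	}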

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu.h |    1 +
 arch/powerpc/include/asm/pkeys.h         |    9 ++++-
 arch/powerpc/mm/pkeys.c                  |   57 ++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 55950f4..ee18ba0 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -115,6 +115,7 @@ struct patb_entry {
 	 * bit unset -> key available for allocation
 	 */
 	u32 pkey_allocation_map;
+	s16 execute_only_pkey; /* key holding execute-only protection */
 #endif
 } mm_context_t;
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 78c5362..0cf115f 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -115,11 +115,16 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
  * Try to dedicate one of the protection keys to be used as an
  * execute-only protection key.
  */
+extern int __execute_only_pkey(struct mm_struct *mm);
 static inline int execute_only_pkey(struct mm_struct *mm)
 {
-	return 0;
+	if (!pkey_inited || !pkey_execute_disable_support)
+		return -1;
+
+	return __execute_only_pkey(mm);
 }
 
+
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 		int prot, int pkey)
 {
@@ -141,6 +146,8 @@ static inline void pkey_mm_init(struct mm_struct *mm)
 	if (!pkey_inited)
 		return;
 	mm_pkey_allocation_map(mm) = initial_allocation_mask;
+	/* -1 means unallocated or invalid */
+	mm->context.execute_only_pkey = -1;
 }
 
 extern void thread_pkey_regs_save(struct thread_struct *thread);
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 7cd1be4..8a24983 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -188,3 +188,60 @@ void thread_pkey_regs_init(struct thread_struct *thread)
 	write_iamr(0x0ul);
 	write_uamor(0x0ul);
 }
+
+static inline bool pkey_allows_readwrite(int pkey)
+{
+	int pkey_shift = pkeyshift(pkey);
+
+	if (!(read_uamor() & (0x3UL << pkey_shift)))
+		return true;
+
+	return !(read_amr() & ((AMR_RD_BIT|AMR_WR_BIT) << pkey_shift));
+}
+
+int __execute_only_pkey(struct mm_struct *mm)
+{
+	bool need_to_set_mm_pkey = false;
+	int execute_only_pkey = mm->context.execute_only_pkey;
+	int ret;
+
+	/* Do we need to assign a pkey for mm's execute-only maps? */
+	if (execute_only_pkey == -1) {
+		/* Go allocate one to use, which might fail */
+		execute_only_pkey = mm_pkey_alloc(mm);
+		if (execute_only_pkey < 0)
+			return -1;
+		need_to_set_mm_pkey = true;
+	}
+
+	/*
+	 * We do not want to go through the relatively costly
+	 * dance to set AMR if we do not need to.  Check it
+	 * first and assume that if the execute-only pkey is
+	 * readwrite-disabled then we do not have to set it
+	 * ourselves.
+	 */
+	if (!need_to_set_mm_pkey &&
+	    !pkey_allows_readwrite(execute_only_pkey))
+		return execute_only_pkey;
+
+	/*
+	 * Set up AMR so that it denies access for everything
+	 * other than execution.
+	 */
+	ret = __arch_set_user_pkey_access(current, execute_only_pkey,
+			(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
+	/*
+	 * If the AMR-set operation failed somehow, free the key
+	 * and return -1, effectively disabling execute-only support.
+	 */
+	if (ret) {
+		mm_set_pkey_free(mm, execute_only_pkey);
+		return -1;
+	}
+
+	/* We got one, store it and use it from here on out */
+	if (need_to_set_mm_pkey)
+		mm->context.execute_only_pkey = execute_only_pkey;
+	return execute_only_pkey;
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 12/25] powerpc: ability to associate pkey to a vma
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (18 preceding siblings ...)
  2017-09-08 22:44 ` [PATCH 11/25] powerpc: introduce execute-only pkey Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-18  4:27   ` Balbir Singh
  2017-09-08 22:45 ` [PATCH 13/25] powerpc: implementation for arch_override_mprotect_pkey() Ram Pai
                   ` (12 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Arch-independent code expects the arch to map
a pkey into the vma's protection bit setting.
This patch provides that ability.
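
As a worked example (not part of the patch): since VM_HIGH_ARCH_0..4
are the contiguous vma-flag bits 32..36, pkey 21 (0b10101) sets
VM_PKEY_BIT0, VM_PKEY_BIT2 and VM_PKEY_BIT4. A standalone demo that
exploits that contiguity (the patch tests each bit explicitly):

	#include <stdio.h>

	#define VM_PKEY_SHIFT 32	/* VM_HIGH_ARCH_BIT_0 */

	static unsigned long pkey_to_vmflag_bits(unsigned short pkey)
	{
		return (unsigned long)(pkey & 0x1f) << VM_PKEY_SHIFT;
	}

	int main(void)
	{
		/* 0x1500000000: bits 32, 34 and 36 */
		printf("vm flag bits: 0x%lx\n", pkey_to_vmflag_bits(21));
		return 0;
	}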

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/mman.h  |    8 +++++++-
 arch/powerpc/include/asm/pkeys.h |   18 ++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 30922f6..067eec2 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -13,6 +13,7 @@
 
 #include <asm/cputable.h>
 #include <linux/mm.h>
+#include <linux/pkeys.h>
 #include <asm/cpu_has_feature.h>
 
 /*
@@ -22,7 +23,12 @@
 static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
 		unsigned long pkey)
 {
-	return (prot & PROT_SAO) ? VM_SAO : 0;
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	return (((prot & PROT_SAO) ? VM_SAO : 0) |
+			pkey_to_vmflag_bits(pkey));
+#else
+	return ((prot & PROT_SAO) ? VM_SAO : 0);
+#endif
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
 
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 0cf115f..f13e913 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -23,6 +23,24 @@
 #define VM_PKEY_BIT4	VM_HIGH_ARCH_4
 #endif
 
+/* override any generic PKEY Permission defines */
+#define PKEY_DISABLE_EXECUTE   0x4
+#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
+				PKEY_DISABLE_WRITE  |\
+				PKEY_DISABLE_EXECUTE)
+
+static inline u64 pkey_to_vmflag_bits(u16 pkey)
+{
+	if (!pkey_inited)
+		return 0x0UL;
+
+	return (((pkey & 0x1UL) ? VM_PKEY_BIT0 : 0x0UL) |
+		((pkey & 0x2UL) ? VM_PKEY_BIT1 : 0x0UL) |
+		((pkey & 0x4UL) ? VM_PKEY_BIT2 : 0x0UL) |
+		((pkey & 0x8UL) ? VM_PKEY_BIT3 : 0x0UL) |
+		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
+}
+
 #define arch_max_pkey()  pkeys_total
 #define AMR_RD_BIT 0x1UL
 #define AMR_WR_BIT 0x2UL
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 13/25] powerpc: implementation for arch_override_mprotect_pkey()
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (19 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 12/25] powerpc: ability to associate pkey to a vma Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-18  4:36   ` Balbir Singh
  2017-09-08 22:45 ` [PATCH 14/25] powerpc: map vma key-protection bits to pte key bits Ram Pai
                   ` (11 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Arch-independent code calls arch_override_mprotect_pkey()
to return a pkey that best matches the requested protection.

This patch provides the implementation.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/mmu_context.h |    5 +++
 arch/powerpc/include/asm/pkeys.h       |   17 ++++++++++-
 arch/powerpc/mm/pkeys.c                |   47 ++++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index c705a5d..8e5a87e 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -145,6 +145,11 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 #ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 #define pkey_initialize()
 #define pkey_mm_init(mm)
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+	return 0;
+}
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index f13e913..d2fffef 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -41,6 +41,16 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
 		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
 }
 
+#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
+				VM_PKEY_BIT3 | VM_PKEY_BIT4)
+
+static inline int vma_pkey(struct vm_area_struct *vma)
+{
+	if (!pkey_inited)
+		return 0;
+	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
+}
+
 #define arch_max_pkey()  pkeys_total
 #define AMR_RD_BIT 0x1UL
 #define AMR_WR_BIT 0x2UL
@@ -142,11 +152,14 @@ static inline int execute_only_pkey(struct mm_struct *mm)
 	return __execute_only_pkey(mm);
 }
 
-
+extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
+		int prot, int pkey);
 static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
 		int prot, int pkey)
 {
-	return 0;
+	if (!pkey_inited)
+		return 0;
+	return __arch_override_mprotect_pkey(vma, prot, pkey);
 }
 
 extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 8a24983..fb1a76a 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -245,3 +245,50 @@ int __execute_only_pkey(struct mm_struct *mm)
 		mm->context.execute_only_pkey = execute_only_pkey;
 	return execute_only_pkey;
 }
+
+static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
+{
+	/* Do this check first since the vm_flags should be hot */
+	if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
+		return false;
+
+	return (vma_pkey(vma) == vma->vm_mm->context.execute_only_pkey);
+}
+
+/*
+ * This should only be called for *plain* mprotect calls.
+ */
+int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
+		int pkey)
+{
+	/*
+	 * Is this an mprotect_pkey() call?  If so, never
+	 * override the value that came from the user.
+	 */
+	if (pkey != -1)
+		return pkey;
+
+	/*
+	 * If the currently associated pkey is execute-only,
+	 * but the requested protection requires read or write,
+	 * move it back to the default pkey.
+	 */
+	if (vma_is_pkey_exec_only(vma) &&
+	    (prot & (PROT_READ|PROT_WRITE)))
+		return 0;
+
+	/*
+	 * The requested protection is execute-only. Hence
+	 * let's use an execute-only pkey.
+	 */
+	if (prot == PROT_EXEC) {
+		pkey = execute_only_pkey(vma->vm_mm);
+		if (pkey > 0)
+			return pkey;
+	}
+
+	/*
+	 * nothing to override.
+	 */
+	return vma_pkey(vma);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 14/25] powerpc: map vma key-protection bits to pte key bits.
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (20 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 13/25] powerpc: implementation for arch_override_mprotect_pkey() Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-18  4:39   ` Balbir Singh
  2017-09-08 22:45 ` [PATCH 15/25] powerpc: sys_pkey_mprotect() system call Ram Pai
                   ` (10 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Map the key protection bits of the vma to the pkey bits in
the PTE.

The PTE bits used for the pkey are 3, 4, 5, 6 and 57. The
first four are the same four bits that were freed up initially
in this patch series. Remember? :-) Without those four bits
this patch wouldn't be possible.

BUT, on 4K kernels, bits 3 and 4 could not be freed up.
Remember? Hence we have to be satisfied with bits 5, 6 and 57.
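
As a worked example (not part of the patch), note the reversal in
the mapping: VM_PKEY_BIT0, the key's least significant bit, lands in
H_PAGE_PKEY_BIT4, because the H_PAGE_PKEY_BIT* names count from the
PTE's high bit. The constants below are the 64K-page values from
this patch (_RPAGE_RSV1 is assumed to be 0x1000000000000000UL):

	#include <stdio.h>
	#include <stdint.h>

	#define H_PAGE_PKEY_BIT0 0x1000000000000000ULL	/* _RPAGE_RSV1 */
	#define H_PAGE_PKEY_BIT1 0x0800000000000000ULL	/* _RPAGE_RSV2 */
	#define H_PAGE_PKEY_BIT2 0x0400000000000000ULL	/* _RPAGE_RSV3 */
	#define H_PAGE_PKEY_BIT3 0x0200000000000000ULL	/* _RPAGE_RSV4 */
	#define H_PAGE_PKEY_BIT4 0x0000000000000040ULL	/* _RPAGE_RSV5 */

	/* pkey -> vma bits -> PTE bits, collapsed into one step */
	static uint64_t pkey_to_pte_bits(unsigned int pkey)
	{
		return ((pkey & 0x01) ? H_PAGE_PKEY_BIT4 : 0) |
		       ((pkey & 0x02) ? H_PAGE_PKEY_BIT3 : 0) |
		       ((pkey & 0x04) ? H_PAGE_PKEY_BIT2 : 0) |
		       ((pkey & 0x08) ? H_PAGE_PKEY_BIT1 : 0) |
		       ((pkey & 0x10) ? H_PAGE_PKEY_BIT0 : 0);
	}

	int main(void)
	{
		printf("pkey 1 -> 0x%016llx\n",	/* bit 57 only */
		       (unsigned long long)pkey_to_pte_bits(1));
		return 0;
	}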

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   25 ++++++++++++++++++++++++-
 arch/powerpc/include/asm/mman.h              |    8 ++++++++
 arch/powerpc/include/asm/pkeys.h             |   12 ++++++++++++
 3 files changed, 44 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 73ed52c..5935d4e 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -38,6 +38,7 @@
 #define _RPAGE_RSV2		0x0800000000000000UL
 #define _RPAGE_RSV3		0x0400000000000000UL
 #define _RPAGE_RSV4		0x0200000000000000UL
+#define _RPAGE_RSV5		0x00040UL
 
 #define _PAGE_PTE		0x4000000000000000UL	/* distinguishes PTEs from pointers */
 #define _PAGE_PRESENT		0x8000000000000000UL	/* pte contains a translation */
@@ -57,6 +58,25 @@
 /* Max physical address bit as per radix table */
 #define _RPAGE_PA_MAX		57
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+#ifdef CONFIG_PPC_64K_PAGES
+#define H_PAGE_PKEY_BIT0	_RPAGE_RSV1
+#define H_PAGE_PKEY_BIT1	_RPAGE_RSV2
+#else /* CONFIG_PPC_64K_PAGES */
+#define H_PAGE_PKEY_BIT0	0 /* _RPAGE_RSV1 is not available */
+#define H_PAGE_PKEY_BIT1	0 /* _RPAGE_RSV2 is not available */
+#endif /* CONFIG_PPC_64K_PAGES */
+#define H_PAGE_PKEY_BIT2	_RPAGE_RSV3
+#define H_PAGE_PKEY_BIT3	_RPAGE_RSV4
+#define H_PAGE_PKEY_BIT4	_RPAGE_RSV5
+#else /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+#define H_PAGE_PKEY_BIT0	0
+#define H_PAGE_PKEY_BIT1	0
+#define H_PAGE_PKEY_BIT2	0
+#define H_PAGE_PKEY_BIT3	0
+#define H_PAGE_PKEY_BIT4	0
+#endif /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 /*
  * Max physical address bit we will use for now.
  *
@@ -120,13 +140,16 @@
 #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
 			 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE |	\
 			 _PAGE_SOFT_DIRTY)
+
+#define H_PAGE_PKEY  (H_PAGE_PKEY_BIT0 | H_PAGE_PKEY_BIT1 | H_PAGE_PKEY_BIT2 | \
+			H_PAGE_PKEY_BIT3 | H_PAGE_PKEY_BIT4)
 /*
  * Mask of bits returned by pte_pgprot()
  */
 #define PAGE_PROT_BITS  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \
 			 H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \
 			 _PAGE_READ | _PAGE_WRITE |  _PAGE_DIRTY | _PAGE_EXEC | \
-			 _PAGE_SOFT_DIRTY)
+			 _PAGE_SOFT_DIRTY | H_PAGE_PKEY)
 /*
  * We define 2 sets of base prot bits, one for basic pages (ie,
  * cacheable kernel and user pages) and one for non cacheable
diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
index 067eec2..3f7220f 100644
--- a/arch/powerpc/include/asm/mman.h
+++ b/arch/powerpc/include/asm/mman.h
@@ -32,12 +32,20 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
 }
 #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
 
+
 static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
 {
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	return (vm_flags & VM_SAO) ?
+		__pgprot(_PAGE_SAO | vmflag_to_page_pkey_bits(vm_flags)) :
+		__pgprot(0 | vmflag_to_page_pkey_bits(vm_flags));
+#else
 	return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0);
+#endif
 }
 #define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
 
+
 static inline bool arch_validate_prot(unsigned long prot)
 {
 	if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_SAO))
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index d2fffef..0d2488a 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -41,6 +41,18 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
 		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
 }
 
+static inline u64 vmflag_to_page_pkey_bits(u64 vm_flags)
+{
+	if (!pkey_inited)
+		return 0x0UL;
+
+	return (((vm_flags & VM_PKEY_BIT0) ? H_PAGE_PKEY_BIT4 : 0x0UL) |
+		((vm_flags & VM_PKEY_BIT1) ? H_PAGE_PKEY_BIT3 : 0x0UL) |
+		((vm_flags & VM_PKEY_BIT2) ? H_PAGE_PKEY_BIT2 : 0x0UL) |
+		((vm_flags & VM_PKEY_BIT3) ? H_PAGE_PKEY_BIT1 : 0x0UL) |
+		((vm_flags & VM_PKEY_BIT4) ? H_PAGE_PKEY_BIT0 : 0x0UL));
+}
+
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
 				VM_PKEY_BIT3 | VM_PKEY_BIT4)
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 15/25] powerpc: sys_pkey_mprotect() system call
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (21 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 14/25] powerpc: map vma key-protection bits to pte key bits Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-09-08 22:45 ` [PATCH 16/25] powerpc: Program HPTE key protection bits Ram Pai
                   ` (9 subsequent siblings)
  32 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

This patch provides the ability for a process to
associate a pkey with an address range.
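
A userspace sketch (not part of the patch) tying the pieces
together: allocate a key with writes denied, then attach it to a
mapping with the new call (number 386 on powerpc per this patch):

	#include <stdio.h>
	#include <unistd.h>
	#include <sys/mman.h>
	#include <sys/syscall.h>

	#ifndef __NR_pkey_mprotect
	#define __NR_pkey_mprotect	386	/* from this patch */
	#define __NR_pkey_alloc		384
	#endif
	#define PKEY_DISABLE_WRITE	0x2

	int main(void)
	{
		void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		long pkey = syscall(__NR_pkey_alloc, 0,
				    PKEY_DISABLE_WRITE);

		if (p == MAP_FAILED || pkey < 0)
			return 1;
		if (syscall(__NR_pkey_mprotect, p, 4096,
			    PROT_READ | PROT_WRITE, (int)pkey))
			perror("pkey_mprotect");
		/* writes to p now fault until the AMR is relaxed */
		return 0;
	}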

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/systbl.h      |    1 +
 arch/powerpc/include/asm/unistd.h      |    4 +---
 arch/powerpc/include/uapi/asm/unistd.h |    1 +
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/systbl.h b/arch/powerpc/include/asm/systbl.h
index 22dd776..b33b551 100644
--- a/arch/powerpc/include/asm/systbl.h
+++ b/arch/powerpc/include/asm/systbl.h
@@ -390,3 +390,4 @@
 SYSCALL(statx)
 SYSCALL(pkey_alloc)
 SYSCALL(pkey_free)
+SYSCALL(pkey_mprotect)
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index e0273bc..daf1ba9 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -12,12 +12,10 @@
 #include <uapi/asm/unistd.h>
 
 
-#define NR_syscalls		386
+#define NR_syscalls		387
 
 #define __NR__exit __NR_exit
 
-#define __IGNORE_pkey_mprotect
-
 #ifndef __ASSEMBLY__
 
 #include <linux/types.h>
diff --git a/arch/powerpc/include/uapi/asm/unistd.h b/arch/powerpc/include/uapi/asm/unistd.h
index 7993a07..71ae45e 100644
--- a/arch/powerpc/include/uapi/asm/unistd.h
+++ b/arch/powerpc/include/uapi/asm/unistd.h
@@ -396,5 +396,6 @@
 #define __NR_statx		383
 #define __NR_pkey_alloc		384
 #define __NR_pkey_free		385
+#define __NR_pkey_mprotect	386
 
 #endif /* _UAPI_ASM_POWERPC_UNISTD_H_ */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 16/25] powerpc: Program HPTE key protection bits
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (22 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 15/25] powerpc: sys_pkey_mprotect() system call Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-18  4:43   ` Balbir Singh
  2017-09-08 22:45 ` [PATCH 17/25] powerpc: helper to validate key-access permissions of a pte Ram Pai
                   ` (8 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Map the PTE protection key bits to the HPTE key protection bits
while creating HPTE entries.
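
As a worked example (illustrative, using the macros introduced
earlier in this series): pkey 5 (0b00101) is represented in the
PTE as H_PAGE_PKEY_BIT2 | H_PAGE_PKEY_BIT4. pte_to_hpte_pkey_bits()
maps those to HPTE_R_KEY_BIT2 | HPTE_R_KEY_BIT4, i.e. 0x800 | 0x200,
both of which lie in HPTE_R_KEY_LO. Keys of 8 and above additionally
set bits in HPTE_R_KEY_HI.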

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |    5 +++++
 arch/powerpc/include/asm/mmu_context.h        |    6 ++++++
 arch/powerpc/include/asm/pkeys.h              |   13 +++++++++++++
 arch/powerpc/mm/hash_utils_64.c               |    1 +
 4 files changed, 25 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 508275b..2e22357 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -90,6 +90,8 @@
 #define HPTE_R_PP0		ASM_CONST(0x8000000000000000)
 #define HPTE_R_TS		ASM_CONST(0x4000000000000000)
 #define HPTE_R_KEY_HI		ASM_CONST(0x3000000000000000)
+#define HPTE_R_KEY_BIT0		ASM_CONST(0x2000000000000000)
+#define HPTE_R_KEY_BIT1		ASM_CONST(0x1000000000000000)
 #define HPTE_R_RPN_SHIFT	12
 #define HPTE_R_RPN		ASM_CONST(0x0ffffffffffff000)
 #define HPTE_R_RPN_3_0		ASM_CONST(0x01fffffffffff000)
@@ -104,6 +106,9 @@
 #define HPTE_R_C		ASM_CONST(0x0000000000000080)
 #define HPTE_R_R		ASM_CONST(0x0000000000000100)
 #define HPTE_R_KEY_LO		ASM_CONST(0x0000000000000e00)
+#define HPTE_R_KEY_BIT2		ASM_CONST(0x0000000000000800)
+#define HPTE_R_KEY_BIT3		ASM_CONST(0x0000000000000400)
+#define HPTE_R_KEY_BIT4		ASM_CONST(0x0000000000000200)
 #define HPTE_R_KEY		(HPTE_R_KEY_LO | HPTE_R_KEY_HI)
 
 #define HPTE_V_1TB_SEG		ASM_CONST(0x4000000000000000)
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 8e5a87e..04e9221 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -150,6 +150,12 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 {
 	return 0;
 }
+
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+	return 0x0UL;
+}
+
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 0d2488a..cd3924c 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -67,6 +67,19 @@ static inline int vma_pkey(struct vm_area_struct *vma)
 #define AMR_RD_BIT 0x1UL
 #define AMR_WR_BIT 0x2UL
 #define IAMR_EX_BIT 0x1UL
+
+static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
+{
+	if (!pkey_inited)
+		return 0x0UL;
+
+	return (((pteflags & H_PAGE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL) |
+		((pteflags & H_PAGE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
+		((pteflags & H_PAGE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
+		((pteflags & H_PAGE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
+		((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
+}
+
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
 				VM_PKEY_BIT3 | VM_PKEY_BIT4)
 #define AMR_BITS_PER_PKEY 2
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 67f62b5..a739a2d 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -232,6 +232,7 @@ unsigned long htab_convert_pte_flags(unsigned long pteflags)
 		 */
 		rflags |= HPTE_R_M;
 
+	rflags |= pte_to_hpte_pkey_bits(pteflags);
 	return rflags;
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 17/25] powerpc: helper to validate key-access permissions of a pte
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (23 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 16/25] powerpc: Program HPTE key protection bits Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-18  4:48   ` Balbir Singh
  2017-09-08 22:45 ` [PATCH 18/25] powerpc: check key protection for user page access Ram Pai
                   ` (7 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Add a helper function that checks whether read, write or
execute access is allowed on the pte.
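
To illustrate the AMR layout this check relies on (assuming 32
keys, AMR_BITS_PER_PKEY = 2, and key 0 occupying the most
significant bit pair): pkey 2 sits at shift (32 - 2 - 1) * 2 = 58,
so an AMR value of (AMR_WR_BIT << 58) denies writes, but still
permits reads, through pages tagged with key 2.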

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |    4 +++
 arch/powerpc/include/asm/pkeys.h             |   12 +++++++++++
 arch/powerpc/mm/pkeys.c                      |   28 ++++++++++++++++++++++++++
 3 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 5935d4e..bd244b3 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -492,6 +492,10 @@ static inline void write_uamor(u64 value)
 	mtspr(SPRN_UAMOR, value);
 }
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long addr, pte_t *ptep)
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index cd3924c..50522a0 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -80,6 +80,18 @@ static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
 		((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
 }
 
+static inline u16 pte_to_pkey_bits(u64 pteflags)
+{
+	if (!pkey_inited)
+		return 0x0UL;
+
+	return (((pteflags & H_PAGE_PKEY_BIT0) ? 0x10 : 0x0UL) |
+		((pteflags & H_PAGE_PKEY_BIT1) ? 0x8 : 0x0UL) |
+		((pteflags & H_PAGE_PKEY_BIT2) ? 0x4 : 0x0UL) |
+		((pteflags & H_PAGE_PKEY_BIT3) ? 0x2 : 0x0UL) |
+		((pteflags & H_PAGE_PKEY_BIT4) ? 0x1 : 0x0UL));
+}
+
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
 				VM_PKEY_BIT3 | VM_PKEY_BIT4)
 #define AMR_BITS_PER_PKEY 2
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index fb1a76a..24589d9 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -292,3 +292,31 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
 	 */
 	return vma_pkey(vma);
 }
+
+static bool pkey_access_permitted(int pkey, bool write, bool execute)
+{
+	int pkey_shift;
+	u64 amr;
+
+	if (!pkey)
+		return true;
+
+	pkey_shift = pkeyshift(pkey);
+	if (!(read_uamor() & (0x3UL << pkey_shift)))
+		return true;
+
+	if (execute && !(read_iamr() & (IAMR_EX_BIT << pkey_shift)))
+		return true;
+
+	amr = read_amr(); /* delay reading amr until absolutely needed */
+	return ((!write && !(amr & (AMR_RD_BIT << pkey_shift))) ||
+		(write &&  !(amr & (AMR_WR_BIT << pkey_shift))));
+}
+
+bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
+{
+	if (!pkey_inited)
+		return true;
+	return pkey_access_permitted(pte_to_pkey_bits(pte),
+			write, execute);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 18/25] powerpc: check key protection for user page access
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (24 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 17/25] powerpc: helper to validate key-access permissions of a pte Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-18 19:57   ` Balbir Singh
  2017-09-08 22:45 ` [PATCH 19/25] powerpc: implementation for arch_vma_access_permitted() Ram Pai
                   ` (6 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Make sure that the kernel does not access user pages without
checking their key protection.
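
A hypothetical caller sketch (not part of this patch): any kernel
path about to dereference a user page through its pte would do
roughly

	if (!pte_access_permitted(pte, write))
		return -EFAULT;	/* basic or key protection forbids it */

so that the key check happens alongside the usual present and
write checks.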

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index bd244b3..d22bb4d 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -494,6 +494,20 @@ static inline void write_uamor(u64 value)
 
 #ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
+
+#define pte_access_permitted(pte, write) \
+	(pte_present(pte) && \
+	 ((!(write) || pte_write(pte)) && \
+	  arch_pte_access_permitted(pte_val(pte), !!write, 0)))
+
+/*
+ * We store the key in the pmd for huge tlb pages,
+ * so the key protection needs to be checked as well.
+ */
+#define pmd_access_permitted(pmd, write) \
+	(pmd_present(pmd) && \
+	 ((!(write) || pmd_write(pmd)) && \
+	  arch_pte_access_permitted(pmd_val(pmd), !!write, 0)))
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 19/25] powerpc: implementation for arch_vma_access_permitted()
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (25 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 18/25] powerpc: check key protection for user page access Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-18 23:20   ` Balbir Singh
  2017-10-24 15:48   ` Michael Ellerman
  2017-09-08 22:45 ` [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation Ram Pai
                   ` (5 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

This patch provides the implementation for
arch_vma_access_permitted(). It returns true if the
requested access is allowed by the pkey associated
with the vma.
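
For example, when a kernel thread faults on a vma, or when the
vma belongs to another process (a foreign access, e.g. via
ptrace), the current CPU's AMR/IAMR say nothing useful about that
mm; the function then returns true and leaves the decision to the
normal page permissions.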

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/mmu_context.h |    5 +++-
 arch/powerpc/mm/pkeys.c                |   43 ++++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 04e9221..9a56355 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -135,6 +135,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
 {
 }
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+bool arch_vma_access_permitted(struct vm_area_struct *vma,
+			bool write, bool execute, bool foreign);
+#else /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 		bool write, bool execute, bool foreign)
 {
@@ -142,7 +146,6 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 	return true;
 }
 
-#ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 #define pkey_initialize()
 #define pkey_mm_init(mm)
 
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 24589d9..21c3b42 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -320,3 +320,46 @@ bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
 	return pkey_access_permitted(pte_to_pkey_bits(pte),
 			write, execute);
 }
+
+/*
+ * We only want to enforce protection keys on the current process
+ * because we effectively have no access to AMR/IAMR for other
+ * processes or any way to tell *which* AMR/IAMR in a threaded
+ * process we could use.
+ *
+ * So do not enforce things if the VMA is not from the current
+ * mm, or if we are in a kernel thread.
+ */
+static inline bool vma_is_foreign(struct vm_area_struct *vma)
+{
+	if (!current->mm)
+		return true;
+	/*
+	 * if the VMA is from another process, then AMR/IAMR has no
+	 * relevance and should not be enforced.
+	 */
+	if (current->mm != vma->vm_mm)
+		return true;
+
+	return false;
+}
+
+bool arch_vma_access_permitted(struct vm_area_struct *vma,
+		bool write, bool execute, bool foreign)
+{
+	int pkey;
+
+	if (!pkey_inited)
+		return true;
+
+	/* allow access if the VMA is not one from this process */
+	if (foreign || vma_is_foreign(vma))
+		return true;
+
+	pkey = vma_pkey(vma);
+
+	if (!pkey)
+		return true;
+
+	return pkey_access_permitted(pkey, write, execute);
+}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (26 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 19/25] powerpc: implementation for arch_vma_access_permitted() Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-18 23:27   ` Balbir Singh
  2017-10-24 15:47   ` Michael Ellerman
  2017-09-08 22:45 ` [PATCH 21/25] powerpc: introduce get_pte_pkey() helper Ram Pai
                   ` (4 subsequent siblings)
  32 siblings, 2 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Handle Data and Instruction exceptions caused by memory
protection keys.

The CPU will detect the key fault if the HPTE is already
programmed with the key.

However, if the HPTE is not yet hashed, a key fault will
not be detected by the hardware. The software detects the
pkey violation in that case.
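
For example, a load from a page whose HPTE carries a key that the
AMR currently denies arrives as a data storage interrupt with
DSISR_KEYFAULT set and is handled by bad_page_fault_exception()
below. If the page was never hashed, the access arrives as an
ordinary page fault and the arch_vma_access_permitted() check in
__do_page_fault() catches the violation instead.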

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/mm/fault.c |   37 ++++++++++++++++++++++++++++++++-----
 1 files changed, 32 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 4797d08..a16bc43 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -145,6 +145,23 @@ static noinline int bad_area(struct pt_regs *regs, unsigned long address)
 	return __bad_area(regs, address, SEGV_MAPERR);
 }
 
+static int bad_page_fault_exception(struct pt_regs *regs, unsigned long address,
+					int si_code)
+{
+	int sig = SIGBUS;
+	int code = BUS_OBJERR;
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	if (si_code & DSISR_KEYFAULT) {
+		sig = SIGSEGV;
+		code = SEGV_PKUERR;
+	}
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
+	_exception(sig, regs, code, address);
+	return 0;
+}
+
 static int do_sigbus(struct pt_regs *regs, unsigned long address,
 		     unsigned int fault)
 {
@@ -391,11 +408,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 		return 0;
 
 	if (unlikely(page_fault_is_bad(error_code))) {
-		if (is_user) {
-			_exception(SIGBUS, regs, BUS_OBJERR, address);
-			return 0;
-		}
-		return SIGBUS;
+		if (!is_user)
+			return SIGBUS;
+		return bad_page_fault_exception(regs, address, error_code);
 	}
 
 	/* Additional sanity check(s) */
@@ -492,6 +507,18 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (unlikely(access_error(is_write, is_exec, vma)))
 		return bad_area(regs, address);
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
+			is_exec, 0))
+		return __bad_area(regs, address, SEGV_PKUERR);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
+
+	/* handle_mm_fault() needs to know if it is an instruction
+	 * access fault.
+	 */
+	if (is_exec)
+		flags |= FAULT_FLAG_INSTRUCTION;
 	/*
 	 * If for any reason at all we couldn't handle the fault,
 	 * make sure we exit gracefully rather than endlessly redo
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 21/25] powerpc: introduce get_pte_pkey() helper
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (27 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-18 23:29   ` Balbir Singh
  2017-09-08 22:45 ` [PATCH 22/25] powerpc: capture the violated protection key on fault Ram Pai
                   ` (3 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

The get_pte_pkey() helper returns the pkey associated with
an address in a given mm_struct.
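
The fault handling code will use it in a later patch, e.g.:

	get_paca()->paca_pkey = get_pte_pkey(current->mm, address);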

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |    5 +++++
 arch/powerpc/mm/hash_utils_64.c               |   24 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 2e22357..8716031 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -451,6 +451,11 @@ extern int hash_page(unsigned long ea, unsigned long access, unsigned long trap,
 int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
 		     pte_t *ptep, unsigned long trap, unsigned long flags,
 		     int ssize, unsigned int shift, unsigned int mmu_psize);
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+u16 get_pte_pkey(struct mm_struct *mm, unsigned long address);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern int __hash_page_thp(unsigned long ea, unsigned long access,
 			   unsigned long vsid, pmd_t *pmdp, unsigned long trap,
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index a739a2d..5917d45 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1572,6 +1572,30 @@ void hash_preload(struct mm_struct *mm, unsigned long ea,
 	local_irq_restore(flags);
 }
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+/*
+ * return the protection key associated with the given address
+ * and the mm_struct.
+ */
+u16 get_pte_pkey(struct mm_struct *mm, unsigned long address)
+{
+	pte_t *ptep;
+	u16 pkey = 0;
+	unsigned long flags;
+
+	if (!mm || !mm->pgd)
+		return 0;
+
+	local_irq_save(flags);
+	ptep = find_linux_pte(mm->pgd, address, NULL, NULL);
+	if (ptep)
+		pkey = pte_to_pkey_bits(pte_val(READ_ONCE(*ptep)));
+	local_irq_restore(flags);
+
+	return pkey;
+}
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 static inline void tm_flush_hash_page(int local)
 {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 22/25] powerpc: capture the violated protection key on fault
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (28 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 21/25] powerpc: introduce get_pte_pkey() helper Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-24 15:46   ` Michael Ellerman
  2017-09-08 22:45 ` [PATCH 23/25] powerpc: Deliver SEGV signal on pkey violation Ram Pai
                   ` (2 subsequent siblings)
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Capture the violated protection key in the paca.
This value will later be used to inform the signal
handler.

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/paca.h   |    3 +++
 arch/powerpc/kernel/asm-offsets.c |    5 +++++
 arch/powerpc/mm/fault.c           |   11 ++++++++++-
 3 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 04b60af..51c89c1 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -97,6 +97,9 @@ struct paca_struct {
 	struct dtl_entry *dispatch_log_end;
 #endif /* CONFIG_PPC_STD_MMU_64 */
 	u64 dscr_default;		/* per-CPU default DSCR */
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	u16 paca_pkey;                  /* exception causing pkey */
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 #ifdef CONFIG_PPC_STD_MMU_64
 	/*
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 8cfb20e..361f0d4 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -241,6 +241,11 @@ int main(void)
 	OFFSET(PACAHWCPUID, paca_struct, hw_cpu_id);
 	OFFSET(PACAKEXECSTATE, paca_struct, kexec_state);
 	OFFSET(PACA_DSCR_DEFAULT, paca_struct, dscr_default);
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	OFFSET(PACA_PKEY, paca_struct, paca_pkey);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 	OFFSET(ACCOUNT_STARTTIME, paca_struct, accounting.starttime);
 	OFFSET(ACCOUNT_STARTTIME_USER, paca_struct, accounting.starttime_user);
 	OFFSET(ACCOUNT_USER_TIME, paca_struct, accounting.utime);
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index a16bc43..ad31f6e 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -153,6 +153,7 @@ static int bad_page_fault_exception(struct pt_regs *regs, unsigned long address,
 
 #ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 	if (si_code & DSISR_KEYFAULT) {
+		get_paca()->paca_pkey = get_pte_pkey(current->mm, address);
 		sig = SIGSEGV;
 		code = SEGV_PKUERR;
 	}
@@ -509,8 +510,16 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 
 #ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
 	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
-			is_exec, 0))
+			is_exec, 0)) {
+		/*
+		 * The pgd-pud-pmd-pte page table tree may not have been
+		 * fully set up yet, so we cannot walk it to locate the
+		 * pte and hence the key. Use vma_pkey() to get the key
+		 * instead of get_pte_pkey().
+		 */
+		get_paca()->paca_pkey = vma_pkey(vma);
 		return __bad_area(regs, address, SEGV_PKUERR);
+	}
 #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
 
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 23/25] powerpc: Deliver SEGV signal on pkey violation
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (29 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 22/25] powerpc: capture the violated protection key on fault Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-10-24 15:46   ` Michael Ellerman
  2017-09-08 22:45 ` [PATCH 24/25] powerpc/ptrace: Add memory protection key regset Ram Pai
  2017-09-08 22:45 ` [PATCH 25/25] powerpc: Enable pkey subsystem Ram Pai
  32 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

The value of the pkey whose protection got violated
is made available in the si_pkey field of the siginfo
structure.

Also keep the thread's pkey-register fields up to date.
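
A minimal, hypothetical userspace consumer of si_pkey could look
roughly like this (assuming a libc whose siginfo_t exposes
si_pkey; a real handler should restrict itself to
async-signal-safe calls):

	#define _GNU_SOURCE
	#include <signal.h>
	#include <stdio.h>
	#include <stdlib.h>

	static void segv_handler(int sig, siginfo_t *info, void *ctx)
	{
		if (info->si_code == SEGV_PKUERR)
			fprintf(stderr, "pkey %d violated at %p\n",
				info->si_pkey, info->si_addr);
		_Exit(1);
	}

	/* registration, e.g. early in main(): */
	struct sigaction sa = {
		.sa_sigaction	= segv_handler,
		.sa_flags	= SA_SIGINFO,
	};
	sigaction(SIGSEGV, &sa, NULL);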

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/kernel/traps.c |   22 ++++++++++++++++++++++
 1 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index ec74e20..f2a310d 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -20,6 +20,7 @@
 #include <linux/sched/debug.h>
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/pkeys.h>
 #include <linux/stddef.h>
 #include <linux/unistd.h>
 #include <linux/ptrace.h>
@@ -265,6 +266,15 @@ void user_single_step_siginfo(struct task_struct *tsk,
 	info->si_addr = (void __user *)regs->nip;
 }
 
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+static void fill_sig_info_pkey(int si_code, siginfo_t *info, unsigned long addr)
+{
+	if (info->si_signo != SIGSEGV || si_code != SEGV_PKUERR)
+		return;
+	info->si_pkey = get_paca()->paca_pkey;
+}
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
 {
 	siginfo_t info;
@@ -292,6 +302,18 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
 	info.si_signo = signr;
 	info.si_code = code;
 	info.si_addr = (void __user *) addr;
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	/*
+	 * Update the thread's pkey-related fields.
+	 * Core-dump handlers and other subsystems
+	 * depend on those values.
+	 */
+	thread_pkey_regs_save(&current->thread);
+	/* update the violated-key value */
+	fill_sig_info_pkey(code, &info, addr);
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 	force_sig_info(signr, &info, current);
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 24/25] powerpc/ptrace: Add memory protection key regset
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (30 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 23/25] powerpc: Deliver SEGV signal on pkey violation Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  2017-09-08 22:45 ` [PATCH 25/25] powerpc: Enable pkey subsystem Ram Pai
  32 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

From: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>

The AMR/IAMR/UAMOR are part of the program context.
Allow them to be accessed via ptrace and through core files.
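
A debugger could then read the three registers of a stopped
tracee with something like the following sketch, where pid is the
task being traced and NT_PPC_PKEY is the note type added by this
patch:

	#include <elf.h>
	#include <sys/ptrace.h>
	#include <sys/uio.h>

	#ifndef NT_PPC_PKEY
	#define NT_PPC_PKEY	0x110
	#endif

	unsigned long regs[3];	/* AMR, IAMR, UAMOR, in that order */
	struct iovec iov = {
		.iov_base	= regs,
		.iov_len	= sizeof(regs),
	};

	ptrace(PTRACE_GETREGSET, pid, (void *)NT_PPC_PKEY, &iov);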

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pkeys.h    |    5 +++
 arch/powerpc/include/uapi/asm/elf.h |    1 +
 arch/powerpc/kernel/ptrace.c        |   66 +++++++++++++++++++++++++++++++++++
 include/uapi/linux/elf.h            |    1 +
 4 files changed, 73 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index 50522a0..a0111de 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -209,6 +209,11 @@ static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
 	return __arch_set_user_pkey_access(tsk, pkey, init_val);
 }
 
+static inline bool arch_pkeys_enabled(void)
+{
+	return pkey_inited;
+}
+
 static inline void pkey_mm_init(struct mm_struct *mm)
 {
 	if (!pkey_inited)
diff --git a/arch/powerpc/include/uapi/asm/elf.h b/arch/powerpc/include/uapi/asm/elf.h
index b2c6fdd..923e6d5 100644
--- a/arch/powerpc/include/uapi/asm/elf.h
+++ b/arch/powerpc/include/uapi/asm/elf.h
@@ -96,6 +96,7 @@
 #define ELF_NTMSPRREG	3	/* include tfhar, tfiar, texasr */
 #define ELF_NEBB	3	/* includes ebbrr, ebbhr, bescr */
 #define ELF_NPMU	5	/* includes siar, sdar, sier, mmcr2, mmcr0 */
+#define ELF_NPKEY	3	/* includes amr, iamr, uamor */
 
 typedef unsigned long elf_greg_t64;
 typedef elf_greg_t64 elf_gregset_t64[ELF_NGREG];
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 07cd22e..6a9d3ec 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -39,6 +39,7 @@
 #include <asm/pgtable.h>
 #include <asm/switch_to.h>
 #include <asm/tm.h>
+#include <asm/pkeys.h>
 #include <asm/asm-prototypes.h>
 
 #define CREATE_TRACE_POINTS
@@ -1775,6 +1776,61 @@ static int pmu_set(struct task_struct *target,
 	return ret;
 }
 #endif
+
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+static int pkey_active(struct task_struct *target,
+		       const struct user_regset *regset)
+{
+	if (!arch_pkeys_enabled())
+		return -ENODEV;
+
+	return regset->n;
+}
+
+static int pkey_get(struct task_struct *target,
+		    const struct user_regset *regset,
+		    unsigned int pos, unsigned int count,
+		    void *kbuf, void __user *ubuf)
+{
+	BUILD_BUG_ON(TSO(amr) + sizeof(unsigned long) != TSO(iamr));
+	BUILD_BUG_ON(TSO(iamr) + sizeof(unsigned long) != TSO(uamor));
+
+	if (!arch_pkeys_enabled())
+		return -ENODEV;
+
+	return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+				   &target->thread.amr, 0,
+				   ELF_NPKEY * sizeof(unsigned long));
+}
+
+static int pkey_set(struct task_struct *target,
+		      const struct user_regset *regset,
+		      unsigned int pos, unsigned int count,
+		      const void *kbuf, const void __user *ubuf)
+{
+	u64 new_amr;
+	int ret;
+
+	if (!arch_pkeys_enabled())
+		return -ENODEV;
+
+	/* Only the AMR can be set from userspace */
+	if (pos != 0 || count != sizeof(new_amr))
+		return -EINVAL;
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+				 &new_amr, 0, sizeof(new_amr));
+	if (ret)
+		return ret;
+
+	/* UAMOR determines which bits of the AMR can be set from userspace. */
+	target->thread.amr = (new_amr & target->thread.uamor) |
+		(target->thread.amr & ~target->thread.uamor);
+
+	return 0;
+}
+#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
+
 /*
  * These are our native regset flavors.
  */
@@ -1809,6 +1865,9 @@ enum powerpc_regset {
 	REGSET_EBB,		/* EBB registers */
 	REGSET_PMR,		/* Performance Monitor Registers */
 #endif
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	REGSET_PKEY,		/* AMR register */
+#endif
 };
 
 static const struct user_regset native_regsets[] = {
@@ -1914,6 +1973,13 @@ enum powerpc_regset {
 		.active = pmu_active, .get = pmu_get, .set = pmu_set
 	},
 #endif
+#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
+	[REGSET_PKEY] = {
+		.core_note_type = NT_PPC_PKEY, .n = ELF_NPKEY,
+		.size = sizeof(u64), .align = sizeof(u64),
+		.active = pkey_active, .get = pkey_get, .set = pkey_set
+	},
+#endif
 };
 
 static const struct user_regset_view user_ppc_native_view = {
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index b5280db..0708516 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -395,6 +395,7 @@
 #define NT_PPC_TM_CTAR	0x10d		/* TM checkpointed Target Address Register */
 #define NT_PPC_TM_CPPR	0x10e		/* TM checkpointed Program Priority Register */
 #define NT_PPC_TM_CDSCR	0x10f		/* TM checkpointed Data Stream Control Register */
+#define NT_PPC_PKEY	0x110		/* Memory Protection Keys registers */
 #define NT_386_TLS	0x200		/* i386 TLS slots (struct user_desc) */
 #define NT_386_IOPERM	0x201		/* x86 io permission bitmap (1=deny) */
 #define NT_X86_XSTATE	0x202		/* x86 extended state using xsave */
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* [PATCH 25/25] powerpc: Enable pkey subsystem
  2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
                   ` (31 preceding siblings ...)
  2017-09-08 22:45 ` [PATCH 24/25] powerpc/ptrace: Add memory protection key regset Ram Pai
@ 2017-09-08 22:45 ` Ram Pai
  32 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-08 22:45 UTC (permalink / raw)
  To: mpe, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

PAPR defines the 'ibm,processor-storage-keys' property. It
exports two values. The first value indicates the number of
data-access keys and the second indicates the number of
instruction-access keys. Though this hints that a key can be
used for either data access or instruction access only, that
is not the case in reality: any key can be of either kind.
This patch adds up all the keys and uses that as the total
number of keys available to us.

Non-PAPR platforms do not define this property in the device
tree yet. Here, we hardcode the CPUs that support pkeys by
consulting PowerISA 3.0.
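
For reference, the property carries two cells. A made-up device
tree fragment advertising 32 data-access keys and no separate
instruction-access keys would look like:

	cpu@0 {
		...
		ibm,processor-storage-keys = <0x20 0x0>;
	};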

Signed-off-by: Ram Pai <linuxram@us.ibm.com>
---
 arch/powerpc/include/asm/cputable.h    |   15 ++++++++++-----
 arch/powerpc/include/asm/mmu_context.h |    1 +
 arch/powerpc/include/asm/pkeys.h       |   21 +++++++++++++++++++++
 arch/powerpc/kernel/prom.c             |   19 +++++++++++++++++++
 arch/powerpc/mm/pkeys.c                |   19 ++++++++++++++-----
 5 files changed, 65 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/cputable.h b/arch/powerpc/include/asm/cputable.h
index a9bf921..31ed1d2 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -214,7 +214,9 @@ enum {
 #define CPU_FTR_DAWR			LONG_ASM_CONST(0x0400000000000000)
 #define CPU_FTR_DABRX			LONG_ASM_CONST(0x0800000000000000)
 #define CPU_FTR_PMAO_BUG		LONG_ASM_CONST(0x1000000000000000)
+#define CPU_FTR_PKEY			LONG_ASM_CONST(0x2000000000000000)
 #define CPU_FTR_POWER9_DD1		LONG_ASM_CONST(0x4000000000000000)
+#define CPU_FTR_PKEY_EXECUTE		LONG_ASM_CONST(0x8000000000000000)
 
 #ifndef __ASSEMBLY__
 
@@ -435,7 +437,8 @@ enum {
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
 	    CPU_FTR_COHERENT_ICACHE | CPU_FTR_PURR | \
-	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_DABRX)
+	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_DABRX | \
+	    CPU_FTR_PKEY)
 #define CPU_FTRS_POWER6 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -443,7 +446,7 @@ enum {
 	    CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
 	    CPU_FTR_DSCR | CPU_FTR_UNALIGNED_LD_STD | \
 	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_CFAR | \
-	    CPU_FTR_DABRX)
+	    CPU_FTR_DABRX | CPU_FTR_PKEY)
 #define CPU_FTRS_POWER7 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -452,7 +455,7 @@ enum {
 	    CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
 	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
 	    CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | \
-	    CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX)
+	    CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX | CPU_FTR_PKEY)
 #define CPU_FTRS_POWER8 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
 	    CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | CPU_FTR_ARCH_206 |\
 	    CPU_FTR_MMCRA | CPU_FTR_SMT | \
@@ -462,7 +465,8 @@ enum {
 	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
 	    CPU_FTR_ICSWX | CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
 	    CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
-	    CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP)
+	    CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_PKEY |\
+	    CPU_FTR_PKEY_EXECUTE)
 #define CPU_FTRS_POWER8E (CPU_FTRS_POWER8 | CPU_FTR_PMAO_BUG)
 #define CPU_FTRS_POWER8_DD1 (CPU_FTRS_POWER8 & ~CPU_FTR_DBELL)
 #define CPU_FTRS_POWER9 (CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
@@ -474,7 +478,8 @@ enum {
 	    CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
 	    CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
 	    CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
-	    CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300)
+	    CPU_FTR_ARCH_207S | CPU_FTR_TM_COMP | CPU_FTR_ARCH_300 | \
+	    CPU_FTR_PKEY | CPU_FTR_PKEY_EXECUTE)
 #define CPU_FTRS_POWER9_DD1 ((CPU_FTRS_POWER9 | CPU_FTR_POWER9_DD1) & \
 			     (~CPU_FTR_SAO))
 #define CPU_FTRS_CELL	(CPU_FTR_USE_TB | CPU_FTR_LWSYNC | \
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 9a56355..98ac713 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -148,6 +148,7 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 
 #define pkey_initialize()
 #define pkey_mm_init(mm)
+#define pkey_mmu_values(total_data, total_execute)
 
 static inline int vma_pkey(struct vm_area_struct *vma)
 {
diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
index a0111de..baac435 100644
--- a/arch/powerpc/include/asm/pkeys.h
+++ b/arch/powerpc/include/asm/pkeys.h
@@ -1,9 +1,12 @@
 #ifndef _ASM_PPC64_PKEYS_H
 #define _ASM_PPC64_PKEYS_H
 
+#include <asm/firmware.h>
+
 extern bool pkey_inited;
 extern bool pkey_execute_disable_support;
 extern int pkeys_total; /* total pkeys as per device tree */
+extern int pkey_total_execute; /* total execute pkeys as per device tree */
 extern u32 initial_allocation_mask;/* bits set for reserved keys */
 
 /*
@@ -223,6 +226,24 @@ static inline void pkey_mm_init(struct mm_struct *mm)
 	mm->context.execute_only_pkey = -1;
 }
 
+static inline void pkey_mmu_values(int total_data, int total_execute)
+{
+	/*
+	 * Since any pkey can be used for data or execute, we
+	 * will just treat all keys as equal and track them
+	 * as one entity.
+	 */
+	pkeys_total = total_data;
+}
+
+static inline bool pkey_mmu_enabled(void)
+{
+	if (firmware_has_feature(FW_FEATURE_LPAR))
+		return pkeys_total;
+	else
+		return cpu_has_feature(CPU_FTR_PKEY);
+}
+
 extern void thread_pkey_regs_save(struct thread_struct *thread);
 extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
 			struct thread_struct *old_thread);
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index f830562..f61da26 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -35,6 +35,7 @@
 #include <linux/of_fdt.h>
 #include <linux/libfdt.h>
 #include <linux/cpu.h>
+#include <linux/pkeys.h>
 
 #include <asm/prom.h>
 #include <asm/rtas.h>
@@ -228,6 +229,23 @@ static void __init check_cpu_pa_features(unsigned long node)
 		      ibm_pa_features, ARRAY_SIZE(ibm_pa_features));
 }
 
+static void __init check_cpu_pkey_feature(unsigned long node)
+{
+	const __be32 *ftrs;
+	int len, total_data, total_execute;
+
+	ftrs = of_get_flat_dt_prop(node,
+		"ibm,processor-storage-keys", &len);
+	if (ftrs == NULL)
+		return;
+
+	len /= sizeof(int);
+	total_execute = (len >= 2) ? be32_to_cpu(ftrs[1]) : 0;
+	total_data = (len >= 1) ? be32_to_cpu(ftrs[0]) : 0;
+	pkey_mmu_values(total_data, total_execute);
+}
+
+
 #ifdef CONFIG_PPC_STD_MMU_64
 static void __init init_mmu_slb_size(unsigned long node)
 {
@@ -391,6 +409,7 @@ static int __init early_init_dt_scan_cpus(unsigned long node,
 
 		check_cpu_feature_properties(node);
 		check_cpu_pa_features(node);
+		check_cpu_pkey_feature(node);
 	}
 
 	identical_pvr_fixup(node);
diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
index 21c3b42..c3ed473 100644
--- a/arch/powerpc/mm/pkeys.c
+++ b/arch/powerpc/mm/pkeys.c
@@ -37,15 +37,24 @@ void __init pkey_initialize(void)
 	 * line will enable it.
 	 */
 	pkey_inited = false;
+	if (pkey_mmu_enabled())
+		pkey_inited = !radix_enabled();
+	if (!pkey_inited)
+		return;
 
 	/*
-	 * disable execute_disable support for now.
-	 * A patch further down will enable it.
+	 * The device tree cannot be relied on for
+	 * execute_disable support, so we depend on
+	 * the CPU_FTR_PKEY_EXECUTE feature bit.
 	 */
-	pkey_execute_disable_support = false;
+	pkey_execute_disable_support = cpu_has_feature(CPU_FTR_PKEY_EXECUTE);
 
-	/* Lets assume 32 keys */
-	pkeys_total = 32;
+	/*
+	 * Let's assume 32 keys if we are not told
+	 * the number of pkeys.
+	 */
+	if (!pkeys_total)
+		pkeys_total = 32;
 
 #ifdef CONFIG_PPC_4K_PAGES
 	/*
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* Re: [PATCH 1/7] powerpc: introduce pte_set_hash_slot() helper
  2017-09-08 22:44 ` [PATCH 1/7] powerpc: introduce pte_set_hash_slot() helper Ram Pai
@ 2017-09-13  7:55   ` Balbir Singh
  2017-10-19  4:52   ` Michael Ellerman
  1 sibling, 0 replies; 134+ messages in thread
From: Balbir Singh @ 2017-09-13  7:55 UTC (permalink / raw)
  To: Ram Pai
  Cc: Michael Ellerman, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Benjamin Herrenschmidt, Paul Mackerras, Anshuman Khandual,
	Aneesh Kumar KV, Haren Myneni/Beaverton/IBM, Michal Hocko,
	Thiago Jung Bauermann, Eric W. Biederman

On Sat, Sep 9, 2017 at 8:44 AM, Ram Pai <linuxram@us.ibm.com> wrote:
> Introduce pte_set_hash_slot(). It  sets the (H_PAGE_F_SECOND|H_PAGE_F_GIX)
> bits at  the   appropriate   location   in   the   4K  PTE.  For
> 64K PTE, it  sets  the  bits  in  the  second  part  of  the  PTE. Though
> the implementation  for the former just needs the slot parameter, it does
> take some additional parameters to keep the prototype consistent.
>
> This function  will  be  handy  as  we   work   towards  re-arranging the
> bits in the later patches.
>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hash-4k.h  |   15 +++++++++++++++
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |   25 +++++++++++++++++++++++++
>  2 files changed, 40 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> index 0c4e470..8909039 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> @@ -48,6 +48,21 @@ static inline int hash__hugepd_ok(hugepd_t hpd)
>  }
>  #endif
>
> +/*
> + * 4k pte format is  different  from  64k  pte  format.  Saving  the
> + * hash_slot is just a matter of returning the pte bits that need to
> + * be modified. On 64k pte, things are a  little  more  involved and
> + * hence  needs   many   more  parameters  to  accomplish  the  same.
> + * However we  want  to abstract this out from the caller by keeping
> + * the prototype consistent across the two formats.
> + */
> +static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
> +                       unsigned int subpg_index, unsigned long slot)
> +{
> +       return (slot << H_PAGE_F_GIX_SHIFT) &
> +               (H_PAGE_F_SECOND | H_PAGE_F_GIX);
> +}
> +
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>
>  static inline char *get_hpte_slot_array(pmd_t *pmdp)
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 9732837..6652669 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -74,6 +74,31 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
>         return (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;
>  }
>
> +/*
> + * Commit the hash slot and return pte bits that needs to be modified.
> + * The caller is expected to modify the pte bits accordingly and
> + * commit the pte to memory.
> + */
> +static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
> +               unsigned int subpg_index, unsigned long slot)
> +{
> +       unsigned long *hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
> +
> +       rpte.hidx &= ~(0xfUL << (subpg_index << 2));
> +       *hidxp = rpte.hidx  | (slot << (subpg_index << 2));
> +       /*
> +        * Commit the hidx bits to memory before returning.
> +        * Anyone reading  pte  must  ensure hidx bits are
> +        * read  only  after  reading the pte by using the

Can you lose the "only" and make it "read after reading the pte"?
"read only" is easy to confuse with "read-only".

> +        * read-side  barrier  smp_rmb(). __real_pte() can
> +        * help ensure that.
> +        */
> +       smp_wmb();
> +
> +       /* no pte bits to be modified, return 0x0UL */
> +       return 0x0UL;

Acked-by: Balbir Singh <bsingharora@gmail.com>

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 2/7] powerpc: introduce pte_get_hash_gslot() helper
  2017-09-08 22:44 ` [PATCH 2/7] powerpc: introduce pte_get_hash_gslot() helper Ram Pai
@ 2017-09-13  9:32   ` Balbir Singh
  2017-09-13 20:10     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-09-13  9:32 UTC (permalink / raw)
  To: Ram Pai
  Cc: Michael Ellerman, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Benjamin Herrenschmidt, Paul Mackerras, Anshuman Khandual,
	Aneesh Kumar KV, Haren Myneni/Beaverton/IBM, Michal Hocko,
	Thiago Jung Bauermann, Eric W. Biederman

On Sat, Sep 9, 2017 at 8:44 AM, Ram Pai <linuxram@us.ibm.com> wrote:
> Introduce pte_get_hash_gslot(), which returns the slot number of the
> HPTE in the global hash table.
>
> This function will come in handy as we work towards re-arranging the
> PTE bits in the later patches.
>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hash.h |    3 +++
>  arch/powerpc/mm/hash_utils_64.c           |   18 ++++++++++++++++++
>  2 files changed, 21 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
> index f884520..060c059 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -166,6 +166,9 @@ static inline int hash__pte_none(pte_t pte)
>         return (pte_val(pte) & ~H_PTE_NONE_MASK) == 0;
>  }
>
> +unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
> +               int ssize, real_pte_t rpte, unsigned int subpg_index);
> +
>  /* This low level function performs the actual PTE insertion
>   * Setting the PTE depends on the MMU type and other factors. It's
>   * an horrible mess that I'm not going to try to clean up now but
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 67ec2e9..e68f053 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -1591,6 +1591,24 @@ static inline void tm_flush_hash_page(int local)
>  }
>  #endif
>
> +/*
> + * return the global hash slot, corresponding to the given
> + * pte, which contains the hpte.

Does this work with native/guest page tables? I guess both.
The comment sounds trivial, could you please elaborate more.
Looking at the code, it seems like given a real pte, we use
the hash value and hidx to figure out the slot value in the global
slot information. This uses information in the software page
tables. Is that correct? Do we have to consider validity and
present state here or is that guaranteed?

> + */
> +unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
> +               int ssize, real_pte_t rpte, unsigned int subpg_index)
> +{
> +       unsigned long hash, slot, hidx;
> +
> +       hash = hpt_hash(vpn, shift, ssize);
> +       hidx = __rpte_to_hidx(rpte, subpg_index);
> +       if (hidx & _PTEIDX_SECONDARY)
> +               hash = ~hash;
> +       slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> +       slot += hidx & _PTEIDX_GROUP_IX;
> +       return slot;
> +}
> +
>  /* WARNING: This is called from hash_low_64.S, if you change this prototype,
>   *          do not forget to update the assembly call site !
>   */

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 2/7] powerpc: introduce pte_get_hash_gslot() helper
  2017-09-13  9:32   ` Balbir Singh
@ 2017-09-13 20:10     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-13 20:10 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Michael Ellerman, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT),
	Benjamin Herrenschmidt, Paul Mackerras, Anshuman Khandual,
	Aneesh Kumar KV, Haren Myneni/Beaverton/IBM, Michal Hocko,
	Thiago Jung Bauermann, Eric W. Biederman

On Wed, Sep 13, 2017 at 07:32:57PM +1000, Balbir Singh wrote:
> On Sat, Sep 9, 2017 at 8:44 AM, Ram Pai <linuxram@us.ibm.com> wrote:
> > Introduce pte_get_hash_gslot(), which returns the slot number of the
> > HPTE in the global hash table.
> >
> > This function will come in handy as we work towards re-arranging the
> > PTE bits in the later patches.
> >
> > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/book3s/64/hash.h |    3 +++
> >  arch/powerpc/mm/hash_utils_64.c           |   18 ++++++++++++++++++
> >  2 files changed, 21 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
> > index f884520..060c059 100644
> > --- a/arch/powerpc/include/asm/book3s/64/hash.h
> > +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> > @@ -166,6 +166,9 @@ static inline int hash__pte_none(pte_t pte)
> >         return (pte_val(pte) & ~H_PTE_NONE_MASK) == 0;
> >  }
> >
> > +unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
> > +               int ssize, real_pte_t rpte, unsigned int subpg_index);
> > +
> >  /* This low level function performs the actual PTE insertion
> >   * Setting the PTE depends on the MMU type and other factors. It's
> >   * an horrible mess that I'm not going to try to clean up now but
> > diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> > index 67ec2e9..e68f053 100644
> > --- a/arch/powerpc/mm/hash_utils_64.c
> > +++ b/arch/powerpc/mm/hash_utils_64.c
> > @@ -1591,6 +1591,24 @@ static inline void tm_flush_hash_page(int local)
> >  }
> >  #endif
> >
> > +/*
> > + * return the global hash slot, corresponding to the given
> > + * pte, which contains the hpte.
> 
> Does this work with native/guest page tables? I guess both.

Yes. It is supposed to work with native as well as guest page tables.
The code has been this way for ages. This patch encapsulates
the logic in a standalone function.

> The comment sounds trivial, could you please elaborate more.
> Looking at the code, it seems like given a real pte, we use
> the hash value and hidx to figure out the slot value in the global
> slot information. This uses information in the software page
> tables. Is that correct?

Yes. This uses information passed to it by the caller; the information
is expected to be derived from the Linux page table.

> Do we have to consider validity and
> present state here or is that guaranteed?

This function's job is to do the math and return the global slot based
on the input. It will return the calculated value regardless of the validity of
its inputs.

It's the caller's job to validate the pte and ensure that it is hashed
before meaningfully using the return value of this function.
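
As a concrete example: with hidx = 0xa, _PTEIDX_SECONDARY (0x8)
is set and the group index (hidx & _PTEIDX_GROUP_IX) is 2, so the
function returns ((~hash) & htab_hash_mask) * HPTES_PER_GROUP + 2,
i.e. slot 2 of a group in the secondary hash bucket.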

> 
> > + */
> > +unsigned long pte_get_hash_gslot(unsigned long vpn, unsigned long shift,
> > +               int ssize, real_pte_t rpte, unsigned int subpg_index)
> > +{
> > +       unsigned long hash, slot, hidx;
> > +
> > +       hash = hpt_hash(vpn, shift, ssize);
> > +       hidx = __rpte_to_hidx(rpte, subpg_index);
> > +       if (hidx & _PTEIDX_SECONDARY)
> > +               hash = ~hash;
> > +       slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> > +       slot += hidx & _PTEIDX_GROUP_IX;
> > +       return slot;
> > +}
> > +
> >  /* WARNING: This is called from hash_low_64.S, if you change this prototype,
> >   *          do not forget to update the assembly call site !
> >   */
> 
> Balbir Singh.

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-09-08 22:44 ` [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
@ 2017-09-14  1:18   ` Balbir Singh
  2017-10-19  3:25   ` Michael Ellerman
  1 sibling, 0 replies; 134+ messages in thread
From: Balbir Singh @ 2017-09-14  1:18 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:43 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Rearrange 64K PTE bits to  free  up  bits 3, 4, 5  and  6
> in the 4K backed HPTE pages. These bits continue to be used
> for 64K backed HPTE pages in this patch, but will be freed
> up in the next patch. The  bit  numbers are big-endian  as
> defined in ISA 3.0.
> 
> The patch does the following change to the 4k hpte backed
> 64K PTE's format.
> 
> H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the figure
> 		below)
> V0 which occupied bit 4 is not used anymore.
> V1 which occupied bit 5 is not used anymore.
> V2 which occupied bit 6 is not used anymore.
> V3 which occupied bit 7 is not used anymore.
> 
> Before the patch, the 4k backed 64k PTE format was as follows
> 
>  0 1 2 3 4  5  6  7  8 9 10...........................63
>  : : : : :  :  :  :  : : :                            :
>  v v v v v  v  v  v  v v v                            v
> 
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
> |x|x|x|B|V0|V1|V2|V3|x| | |x|x|................|x|x|x|x| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
> |S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
> 
> After the patch, the 4k backed 64k PTE format is as follows
> 
>  0 1 2 3 4  5  6  7  8 9 10...........................63
>  : : : : :  :  :  :  : : :                            :
>  v v v v v  v  v  v  v v v                            v
> 
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
> |x|x|x| |  |  |  |  |x|B| |x|x|................|.|.|.|.| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
> |S|G|I|X|S |G |I |X |S|G|I|X|..................|S|G|I|X| <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
> 
> the four  bits S,G,I,X (one quadruplet per 4k HPTE) that
> cache  the  hash-bucket  slot  value, are initialized  to
> 1,1,1,1, indicating  an  invalid  slot.   If  a HPTE gets
> cached in a 1111  slot(i.e 7th  slot  of  secondary hash
> bucket), it is  released  immediately. In  other  words,
> even  though 1111   is   a valid slot  value in the hash
> bucket, we consider it invalid and  release the slot and
> the HPTE.  This  gives  us  the opportunity to determine
> the validity of S,G,I,X  bits  based on its contents and
> not on any of the bits V0,V1,V2 or V3 in the primary PTE
> 
> When   we  release  a    HPTE    cached in the 1111 slot
> we also    release  a  legitimate   slot  in the primary
> hash bucket  and  unmap  its  corresponding  HPTE.  This
> is  to  ensure   that  we do get a HPTE cached in a slot
> of the primary hash bucket, the next time we retry.
> 
> Though  treating  the 1111  slot  as  invalid  reduces  the
> number of  available  slots  in the hash bucket and  may
> have  an  effect   on the performance, the probability of
> hitting a 1111 slot is extremely low.
> 
> Compared  to  the   current    scheme,  the above scheme
> reduces  the   number  of   false   hash  table  updates
> significantly and  has the  added advantage of releasing
> four  valuable  PTE bits for other purpose.
> 
> NOTE: even though bits 3, 4, 5, 6, 7 are  not  used  when
> the  64K  PTE is backed by 4k HPTE,  they continue to be
> used  if  the  PTE  gets  backed  by 64k HPTE.  The next
> patch will decouple that as well, and truly  release the
> bits.
> 
> This idea was jointly developed by Paul Mackerras,
> Aneesh, Michael Ellerman and myself.
> 

Acked-by: Balbir Singh <bsingharora@gmail.com>

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-09-08 22:44 ` [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
@ 2017-09-14  1:44   ` Balbir Singh
  2017-09-14 17:54     ` Ram Pai
  2017-09-14  8:13   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-09-14  1:44 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:44 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Rearrange 64K PTE bits to  free  up  bits 3, 4, 5  and  6
> in the 64K backed HPTE pages. This along with the earlier
> patch will  entirely free  up the four bits from 64K PTE.
> The bit numbers are  big-endian as defined in  ISA 3.0.
> 
> This patch  does  the  following change to 64K PTE backed
> by 64K HPTE.
> 
> H_PAGE_F_SECOND (S) which  occupied  bit  4  moves to the
> 	second part of the pte to bit 60.
> H_PAGE_F_GIX (G,I,X) which  occupied  bits 5, 6 and 7 also
> 	moves  to  the   second part of the pte, to bits 61,
>        	62 and 63 respectively.
> 
> Since bit 7 is now freed up, we move H_PAGE_BUSY (B) from
> bit  9  to  bit  7.
> 
> The second part of the PTE will hold
> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.
> NOTE: none of the bits in the secondary PTE were used
> by the 64k-HPTE backed PTE before this patch.
> 
> Before the patch, the 64K HPTE backed 64k PTE format was
> as follows
> 
>  0 1 2 3 4  5  6  7  8 9 10...........................63
>  : : : : :  :  :  :  : : :                            :
>  v v v v v  v  v  v  v v v                            v
> 
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
> |x|x|x| |S |G |I |X |x|B| |x|x|................|x|x|x|x| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
> | | | | |  |  |  |  | | | | |..................| | | | | <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
> 
> After the patch, the 64k HPTE backed 64k PTE format is
> as follows
> 
>  0 1 2 3 4  5  6  7  8 9 10...........................63
>  : : : : :  :  :  :  : : :                            :
>  v v v v v  v  v  v  v v v                            v
> 
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,------------------,-,-,-,
> |x|x|x| |  |  |  |B |x| | |x|x|................|.|.|.|.| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_'________________'_'_'_'_'
> | | | | |  |  |  |  | | | | |..................|S|G|I|X| <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'__________________'_'_'_'_'
> 
> The above PTE changes are applicable to hugetlb pages as well.
> 
> The patch does the following code changes:
> 
> a) moves  H_PAGE_F_SECOND  and  H_PAGE_F_GIX to the 4k PTE
> 	header  since they are no longer needed by the 64k
> 	PTEs.
> b) abstracts  out __real_pte() and __rpte_to_hidx() so the
> 	caller  need not know the bit location of the slot.
> c) moves the slot bits to the secondary pte.
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hash-4k.h  |    3 ++
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |   29 +++++++++++-------------
>  arch/powerpc/include/asm/book3s/64/hash.h     |    3 --
>  arch/powerpc/mm/hash64_64k.c                  |   23 ++++++++-----------
>  arch/powerpc/mm/hugetlbpage-hash64.c          |   18 ++++++---------
>  5 files changed, 33 insertions(+), 43 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> index e66bfeb..dc153c6 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> @@ -16,6 +16,9 @@
>  #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
>  #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
>  
> +#define H_PAGE_F_GIX_SHIFT	56
> +#define H_PAGE_F_SECOND	_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
> +#define H_PAGE_F_GIX	(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
>  #define H_PAGE_BUSY	_RPAGE_RSV1     /* software: PTE & hash are busy */
>  
>  /* PTE flags to conserve for HPTE identification */
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index e038f1c..89ef5a9 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -12,7 +12,7 @@
>   */
>  #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
>  #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
> -#define H_PAGE_BUSY	_RPAGE_RPN42     /* software: PTE & hash are busy */
> +#define H_PAGE_BUSY	_RPAGE_RPN44     /* software: PTE & hash are busy */
>  
>  /*
>   * We need to differentiate between explicit huge page and THP huge
> @@ -21,8 +21,7 @@
>  #define H_PAGE_THP_HUGE  H_PAGE_4K_PFN
>  
>  /* PTE flags to conserve for HPTE identification */
> -#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_F_SECOND | \
> -			 H_PAGE_F_GIX | H_PAGE_HASHPTE | H_PAGE_COMBO)
> +#define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | H_PAGE_COMBO)
>  /*
>   * we support 16 fragments per PTE page of 64K size.
>   */
> @@ -50,24 +49,22 @@ static inline real_pte_t __real_pte(pte_t pte, pte_t *ptep)
>  	unsigned long *hidxp;
>  
>  	rpte.pte = pte;
> -	rpte.hidx = 0;
> -	if (pte_val(pte) & H_PAGE_COMBO) {
> -		/*
> -		 * Make sure we order the hidx load against the H_PAGE_COMBO
> -		 * check. The store side ordering is done in __hash_page_4K
> -		 */
> -		smp_rmb();
> -		hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
> -		rpte.hidx = *hidxp;
> -	}
> +	/*
> +	 * Ensure that we do not read the hidx before we read
> +	 * the pte: the writer side is expected to finish
> +	 * writing the hidx first, followed by the pte, using
> +	 * smp_wmb(). pte_set_hash_slot() ensures that ordering
> +	 * on the store side.
> +	 */
> +	smp_rmb();
> +	hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
> +	rpte.hidx = *hidxp;
>  	return rpte;
>  }
>  
>  static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
>  {
> -	if ((pte_val(rpte.pte) & H_PAGE_COMBO))
> -		return (rpte.hidx >> (index<<2)) & 0xf;
> -	return (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;
> +	return ((rpte.hidx >> (index<<2)) & 0xfUL);
>  }
>  
>  /*
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
> index 8ce4112..46f3a23 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -8,9 +8,6 @@
>   *
>   */
>  #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
> -#define H_PAGE_F_GIX_SHIFT	56
> -#define H_PAGE_F_SECOND		_RPAGE_RSV2	/* HPTE is in 2ndary HPTEG */
> -#define H_PAGE_F_GIX		(_RPAGE_RSV3 | _RPAGE_RSV4 | _RPAGE_RPN44)
>  #define H_PAGE_HASHPTE		_RPAGE_RPN43	/* PTE has associated HPTE */
>  
>  #ifdef CONFIG_PPC_64K_PAGES
> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> index c6c5559..9c63844 100644
> --- a/arch/powerpc/mm/hash64_64k.c
> +++ b/arch/powerpc/mm/hash64_64k.c
> @@ -103,8 +103,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  		 * On hash insert failure we use old pte value and we don't
>  		 * want slot information there if we have a insert failure.
>  		 */
> -		old_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
> -		new_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
> +		old_pte &= ~H_PAGE_HASHPTE;
> +		new_pte &= ~H_PAGE_HASHPTE;

Shouldn't we set old/new_pte.slot = invalid, via rpte.hidx?

>  		goto htab_insert_hpte;
>  	}
>  	/*
> @@ -227,6 +227,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
>  		    unsigned long vsid, pte_t *ptep, unsigned long trap,
>  		    unsigned long flags, int ssize)
>  {
> +	real_pte_t rpte;
>  	unsigned long hpte_group;
>  	unsigned long rflags, pa;
>  	unsigned long old_pte, new_pte;
> @@ -263,6 +264,7 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
>  	} while (!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
>  
>  	rflags = htab_convert_pte_flags(new_pte);
> +	rpte = __real_pte(__pte(old_pte), ptep);
>  
>  	if (cpu_has_feature(CPU_FTR_NOEXECUTE) &&
>  	    !cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
> @@ -270,18 +272,13 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
>  
>  	vpn  = hpt_vpn(ea, vsid, ssize);
>  	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
> +		unsigned long gslot;
>  		/*
>  		 * There MIGHT be an HPTE for this pte
>  		 */
> -		hash = hpt_hash(vpn, shift, ssize);
> -		if (old_pte & H_PAGE_F_SECOND)
> -			hash = ~hash;
> -		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> -		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
> -
> -		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, MMU_PAGE_64K,
> -					       MMU_PAGE_64K, ssize,
> -					       flags) == -1)
> +		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
> +		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, MMU_PAGE_64K,
> +				MMU_PAGE_64K, ssize, flags) == -1)
>  			old_pte &= ~_PAGE_HPTEFLAGS;
>  	}
>  
> @@ -328,9 +325,9 @@ int __hash_page_64K(unsigned long ea, unsigned long access,
>  					   MMU_PAGE_64K, MMU_PAGE_64K, old_pte);
>  			return -1;
>  		}
> +
>  		new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | H_PAGE_HASHPTE;
> -		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
> -			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
> +		new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
>  	}
>  	*ptep = __pte(new_pte & ~H_PAGE_BUSY);
>  	return 0;
> diff --git a/arch/powerpc/mm/hugetlbpage-hash64.c b/arch/powerpc/mm/hugetlbpage-hash64.c
> index a84bb44..d52d667 100644
> --- a/arch/powerpc/mm/hugetlbpage-hash64.c
> +++ b/arch/powerpc/mm/hugetlbpage-hash64.c
> @@ -22,6 +22,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
>  		     pte_t *ptep, unsigned long trap, unsigned long flags,
>  		     int ssize, unsigned int shift, unsigned int mmu_psize)
>  {
> +	real_pte_t rpte;
>  	unsigned long vpn;
>  	unsigned long old_pte, new_pte;
>  	unsigned long rflags, pa, sz;
> @@ -61,6 +62,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
>  	} while(!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
>  
>  	rflags = htab_convert_pte_flags(new_pte);
> +	rpte = __real_pte(__pte(old_pte), ptep);
>  
>  	sz = ((1UL) << shift);
>  	if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
> @@ -71,16 +73,11 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
>  	/* Check if pte already has an hpte (case 2) */
>  	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
>  		/* There MIGHT be an HPTE for this pte */
> -		unsigned long hash, slot;
> +		unsigned long gslot;
>  
> -		hash = hpt_hash(vpn, shift, ssize);
> -		if (old_pte & H_PAGE_F_SECOND)
> -			hash = ~hash;
> -		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> -		slot += (old_pte & H_PAGE_F_GIX) >> H_PAGE_F_GIX_SHIFT;
> -
> -		if (mmu_hash_ops.hpte_updatepp(slot, rflags, vpn, mmu_psize,
> -					       mmu_psize, ssize, flags) == -1)
> +		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
> +		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, mmu_psize,
> +				mmu_psize, ssize, flags) == -1)
>  			old_pte &= ~_PAGE_HPTEFLAGS;
>  	}
>  
> @@ -106,8 +103,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
>  			return -1;
>  		}
>  
> -		new_pte |= (slot << H_PAGE_F_GIX_SHIFT) &
> -			(H_PAGE_F_SECOND | H_PAGE_F_GIX);
> +		new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);
>  	}
>  
>  	/*

Balbir
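
For readers skimming the hunks above: the abstraction in (b) and (c)
reduces both call sites to one two-step pattern. A condensed sketch
(declarations elided and control flow simplified; the helper
signatures are the ones used in the hunks above):

	rpte = __real_pte(__pte(old_pte), ptep);  /* pte + hidx, smp_rmb() inside */

	if (unlikely(old_pte & H_PAGE_HASHPTE)) {
		/* locate the HPTE without decoding S,G,I,X by hand */
		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte, 0);
		if (mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn, psize,
					       psize, ssize, flags) == -1)
			old_pte &= ~_PAGE_HPTEFLAGS;
	}

	/*
	 * Later, after a successful insert, publish the new slot.
	 * pte_set_hash_slot() issues the smp_wmb() that pairs with
	 * the smp_rmb() in __real_pte().
	 */
	new_pte |= pte_set_hash_slot(ptep, rpte, 0, slot);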

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 5/7] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6
  2017-09-08 22:44 ` [PATCH 5/7] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6 Ram Pai
@ 2017-09-14  1:48   ` Balbir Singh
  2017-09-14 17:23     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-09-14  1:48 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:45 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> We  need  PTE bits 3, 4, 5, 6 and 57 to support protection-keys,
> because these are  the bits we want to consolidate on across all
> configurations to support protection keys.
> 
> Bits 3, 4, 5 and 6 are currently used on 4K-pte kernels.  But bits
> 9 and 10 are available.  Hence  we  use the two available bits and
> free up bits 5 and 6.  We will still not be able to free up bits 3
> and 4. In the absence  of  any  other free bits, we will have to
> stay satisfied  with  what we have :-(.   This means we will not
> be  able  to support  32  protection  keys, but only 8.  The bit
> numbers are  big-endian as defined in the ISA 3.0.
>

Any chance that for 4k PTEs we can do slot searching for the PTE?
I guess that adds additional complexity.

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 7/7] powerpc: capture the PTE format changes in the dump pte report
  2017-09-08 22:44 ` [PATCH 7/7] powerpc: capture the PTE format changes in the dump pte report Ram Pai
@ 2017-09-14  3:22   ` Balbir Singh
  2017-09-14 17:19     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-09-14  3:22 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:47 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> The H_PAGE_F_SECOND and H_PAGE_F_GIX bits are no longer in the 64K main-PTE.
> Capture these changes in the dump pte report.
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---

So we lose slot and secondary information for 64K PTE's with
this change?

Balbir

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 01/25] powerpc: initial pkey plumbing
  2017-09-08 22:44 ` [PATCH 01/25] powerpc: initial pkey plumbing Ram Pai
@ 2017-09-14  3:32   ` Balbir Singh
  2017-09-14 16:17     ` Ram Pai
  2017-10-19  4:20   ` Michael Ellerman
  1 sibling, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-09-14  3:32 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:49 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Basic  plumbing  to   initialize  the   pkey  system.
> Nothing is enabled yet. A later patch will enable it
> once all the infrastructure is in place.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/Kconfig                   |   16 +++++++++++
>  arch/powerpc/include/asm/mmu_context.h |    5 +++
>  arch/powerpc/include/asm/pkeys.h       |   45 ++++++++++++++++++++++++++++++++
>  arch/powerpc/kernel/setup_64.c         |    4 +++
>  arch/powerpc/mm/Makefile               |    1 +
>  arch/powerpc/mm/hash_utils_64.c        |    1 +
>  arch/powerpc/mm/pkeys.c                |   33 +++++++++++++++++++++++
>  7 files changed, 105 insertions(+), 0 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/pkeys.h
>  create mode 100644 arch/powerpc/mm/pkeys.c
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 9fc3c0b..a4cd210 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -864,6 +864,22 @@ config SECCOMP
>  
>  	  If unsure, say Y. Only embedded should say N here.
>  
> +config PPC64_MEMORY_PROTECTION_KEYS
> +	prompt "PowerPC Memory Protection Keys"
> +	def_bool y
> +	# Note: only available in 64-bit mode
> +	depends on PPC64

This is not sufficient, right? You need PPC_BOOK3S_64
at least for compile time?

> +	select ARCH_USES_HIGH_VMA_FLAGS
> +	select ARCH_HAS_PKEYS
> +	---help---
> +	  Memory Protection Keys provides a mechanism for enforcing
> +	  page-based protections, but without requiring modification of the
> +	  page tables when an application changes protection domains.
> +
> +	  For details, see Documentation/vm/protection-keys.txt
> +
> +	  If unsure, say y.
> +
>  endmenu
>  
>  config ISA_DMA_API
> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index 3095925..7badf29 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -141,5 +141,10 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
>  	/* by default, allow everything */
>  	return true;
>  }
> +
> +#ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +#define pkey_initialize()
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> +
>  #endif /* __KERNEL__ */
>  #endif /* __ASM_POWERPC_MMU_CONTEXT_H */
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> new file mode 100644
> index 0000000..c02305a
> --- /dev/null
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -0,0 +1,45 @@
> +#ifndef _ASM_PPC64_PKEYS_H
> +#define _ASM_PPC64_PKEYS_H
> +
> +extern bool pkey_inited;
> +extern bool pkey_execute_disable_support;
> +#define ARCH_VM_PKEY_FLAGS 0
> +
> +static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> +{
> +	return (pkey == 0);
> +}
> +
> +static inline int mm_pkey_alloc(struct mm_struct *mm)
> +{
> +	return -1;
> +}
> +
> +static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> +{
> +	return -EINVAL;
> +}
> +
> +/*
> + * Try to dedicate one of the protection keys to be used as an
> + * execute-only protection key.
> + */
> +static inline int execute_only_pkey(struct mm_struct *mm)
> +{
> +	return 0;
> +}
> +
> +static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> +		int prot, int pkey)
> +{
> +	return 0;
> +}
> +
> +static inline int arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> +		unsigned long init_val)
> +{
> +	return 0;
> +}
> +
> +extern void pkey_initialize(void);
> +#endif /*_ASM_PPC64_PKEYS_H */
> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index b89c6aa..3b67014 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -37,6 +37,7 @@
>  #include <linux/memblock.h>
>  #include <linux/memory.h>
>  #include <linux/nmi.h>
> +#include <linux/pkeys.h>
>  
>  #include <asm/io.h>
>  #include <asm/kdump.h>
> @@ -316,6 +317,9 @@ void __init early_setup(unsigned long dt_ptr)
>  	/* Initialize the hash table or TLB handling */
>  	early_init_mmu();
>  
> +	/* initialize the key subsystem */
> +	pkey_initialize();
> +
>  	/*
>  	 * At this point, we can let interrupts switch to virtual mode
>  	 * (the MMU has been setup), so adjust the MSR in the PACA to
> diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
> index fb844d2..927620a 100644
> --- a/arch/powerpc/mm/Makefile
> +++ b/arch/powerpc/mm/Makefile
> @@ -43,3 +43,4 @@ obj-$(CONFIG_PPC_COPRO_BASE)	+= copro_fault.o
>  obj-$(CONFIG_SPAPR_TCE_IOMMU)	+= mmu_context_iommu.o
>  obj-$(CONFIG_PPC_PTDUMP)	+= dump_linuxpagetables.o
>  obj-$(CONFIG_PPC_HTDUMP)	+= dump_hashpagetable.o
> +obj-$(CONFIG_PPC64_MEMORY_PROTECTION_KEYS)	+= pkeys.o
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 0dff57b..67f62b5 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -35,6 +35,7 @@
>  #include <linux/memblock.h>
>  #include <linux/context_tracking.h>
>  #include <linux/libfdt.h>
> +#include <linux/pkeys.h>
>  
>  #include <asm/debugfs.h>
>  #include <asm/processor.h>
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> new file mode 100644
> index 0000000..418a05b
> --- /dev/null
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -0,0 +1,33 @@
> +/*
> + * PowerPC Memory Protection Keys management
> + * Copyright (c) 2015, Intel Corporation.
> + * Copyright (c) 2017, IBM Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + */
> +#include <linux/pkeys.h>                /* PKEY_*                       */
> +
> +bool pkey_inited;
> +bool pkey_execute_disable_support;
> +
> +void __init pkey_initialize(void)
> +{
> +	/* disable the pkey system till everything
> +	 * is in place. A patch further down the
> +	 * line will enable it.
> +	 */

Comment style is broken

> +	pkey_inited = false;
> +
> +	/*
> +	 * disable execute_disable support for now.
> +	 * A patch further down will enable it.
> +	 */
> +	pkey_execute_disable_support = false;
> +}

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 02/25] powerpc: define an additional vma bit for protection keys.
  2017-09-08 22:44 ` [PATCH 02/25] powerpc: define an additional vma bit for protection keys Ram Pai
@ 2017-09-14  4:38   ` Balbir Singh
  2017-09-14  8:11     ` Benjamin Herrenschmidt
  2017-09-14 16:15     ` Ram Pai
  2017-10-23  9:25   ` Aneesh Kumar K.V
  1 sibling, 2 replies; 134+ messages in thread
From: Balbir Singh @ 2017-09-14  4:38 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:50 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> powerpc needs an additional vma bit to support 32 keys.
> Till the additional vma bit lands in include/linux/mm.h
> we have to define  it  in a powerpc-specific header file.
> This is  needed to get pkeys working on power.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---

"This" being an arch specific hack for the additional bit?

Balbir

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 02/25] powerpc: define an additional vma bit for protection keys.
  2017-09-14  4:38   ` Balbir Singh
@ 2017-09-14  8:11     ` Benjamin Herrenschmidt
  2017-10-23 21:06       ` Ram Pai
  2017-09-14 16:15     ` Ram Pai
  1 sibling, 1 reply; 134+ messages in thread
From: Benjamin Herrenschmidt @ 2017-09-14  8:11 UTC (permalink / raw)
  To: Balbir Singh, Ram Pai
  Cc: mpe, linuxppc-dev, paulus, khandual, aneesh.kumar, hbabu, mhocko,
	bauerman, ebiederm

On Thu, 2017-09-14 at 14:38 +1000, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:50 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > powerpc needs an additional vma bit to support 32 keys.
> > Till the additional vma bit lands in include/linux/mm.h
> > we have to define  it  in a powerpc-specific header file.
> > This is  needed to get pkeys working on power.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> 
> "This" being an arch specific hack for the additional bit?

Arch VMA bits? Really? I'd rather we limit ourselves to 16 keys first,
then push for adding the extra bit to the generic code.

Ben.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-09-08 22:44 ` [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
  2017-09-14  1:44   ` Balbir Singh
@ 2017-09-14  8:13   ` Benjamin Herrenschmidt
  2017-10-23  8:52     ` Aneesh Kumar K.V
  2017-10-23 19:22     ` Ram Pai
  1 sibling, 2 replies; 134+ messages in thread
From: Benjamin Herrenschmidt @ 2017-09-14  8:13 UTC (permalink / raw)
  To: Ram Pai, mpe, linuxppc-dev
  Cc: paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm

On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:
> The second part of the PTE will hold
> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bits 60,61,62,63.
> NOTE: none of the bits in the secondary PTE were in use
> by the 64k-HPTE backed PTE.

Have you measured the performance impact of this? With the second part
of the PTE being in a different cache line, there could be one...

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 02/25] powerpc: define an additional vma bit for protection keys.
  2017-09-14  4:38   ` Balbir Singh
  2017-09-14  8:11     ` Benjamin Herrenschmidt
@ 2017-09-14 16:15     ` Ram Pai
  1 sibling, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-14 16:15 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Sep 14, 2017 at 02:38:07PM +1000, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:50 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > powerpc needs an additional vma bit to support 32 keys.
> > Till the additional vma bit lands in include/linux/mm.h
> > we have to define  it  in a powerpc-specific header file.
> > This is  needed to get pkeys working on power.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> 
> "This" being an arch specific hack for the additional bit?


Yes, an arch-specific hack.  I am trying to get the arch-specific
changes merged in parallel, along with these patches. I don't know
which one will merge first. Regardless of which patch-set
lands first, I have organized the code such that nothing
breaks.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 01/25] powerpc: initial pkey plumbing
  2017-09-14  3:32   ` Balbir Singh
@ 2017-09-14 16:17     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-14 16:17 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Sep 14, 2017 at 01:32:05PM +1000, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:49 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > Basic  plumbing  to   initialize  the   pkey  system.
> > Nothing is enabled yet. A later patch will enable it
> > once all the infrastructure is in place.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/Kconfig                   |   16 +++++++++++
> >  arch/powerpc/include/asm/mmu_context.h |    5 +++
> >  arch/powerpc/include/asm/pkeys.h       |   45 ++++++++++++++++++++++++++++++++
> >  arch/powerpc/kernel/setup_64.c         |    4 +++
> >  arch/powerpc/mm/Makefile               |    1 +
> >  arch/powerpc/mm/hash_utils_64.c        |    1 +
> >  arch/powerpc/mm/pkeys.c                |   33 +++++++++++++++++++++++
> >  7 files changed, 105 insertions(+), 0 deletions(-)
> >  create mode 100644 arch/powerpc/include/asm/pkeys.h
> >  create mode 100644 arch/powerpc/mm/pkeys.c
> > 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 9fc3c0b..a4cd210 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -864,6 +864,22 @@ config SECCOMP
> >  
> >  	  If unsure, say Y. Only embedded should say N here.
> >  
> > +config PPC64_MEMORY_PROTECTION_KEYS
> > +	prompt "PowerPC Memory Protection Keys"
> > +	def_bool y
> > +	# Note: only available in 64-bit mode
> > +	depends on PPC64
> 
> This is not sufficient, right? You need PPC_BOOK3S_64
> at least for compile time?

Ok. I had not thought too deeply about this. Thanks for the input.


> 
> > +	select ARCH_USES_HIGH_VMA_FLAGS
> > +
.....
> > +void __init pkey_initialize(void)
> > +{
> > +	/* disable the pkey system till everything
> > +	 * is in place. A patch further down the
> > +	 * line will enable it.
> > +	 */
> 
> Comment style is broken
> 

checkpatch.pl does not complain.  So is it really a broken
comment style, or does checkpatch.pl need to be fixed?


RP
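
(For reference, the kernel's conventional multi-line comment style
puts no text on the opening line; the quoted block above starts its
text right after the /*. Something like:)

	/*
	 * Disable the pkey system till everything is in place.
	 * A patch further down the line will enable it.
	 */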

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 7/7] powerpc: capture the PTE format changes in the dump pte report
  2017-09-14  3:22   ` Balbir Singh
@ 2017-09-14 17:19     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-14 17:19 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Sep 14, 2017 at 01:22:27PM +1000, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:47 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > The H_PAGE_F_SECOND and H_PAGE_F_GIX bits are no longer in the 64K main-PTE.
> > Capture these changes in the dump pte report.
> > 
> > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> 
> So we lose slot and secondary information for 64K PTE's with
> this change?

Yes. It was anyway not there for 4k-backed 64k ptes. Now it won't
be there for any 64k ptes.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 5/7] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6
  2017-09-14  1:48   ` Balbir Singh
@ 2017-09-14 17:23     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-14 17:23 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Sep 14, 2017 at 11:48:34AM +1000, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:45 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > We  need  PTE bits 3, 4, 5, 6 and 57 to support protection-keys,
> > because these are  the bits we want to consolidate on across all
> > configurations to support protection keys.
> > 
> > Bits 3, 4, 5 and 6 are currently used on 4K-pte kernels.  But bits
> > 9 and 10 are available.  Hence  we  use the two available bits and
> > free up bits 5 and 6.  We will still not be able to free up bits 3
> > and 4. In the absence  of  any  other free bits, we will have to
> > stay satisfied  with  what we have :-(.   This means we will not
> > be  able  to support  32  protection  keys, but only 8.  The bit
> > numbers are  big-endian as defined in the ISA 3.0.
> >
> 
> Any chance that for 4k PTEs we can do slot searching for the PTE?
> I guess that adds additional complexity.

Aneesh, I think, is working on moving slot information out of the PTE.
If that happens, we will have room to support more keys.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-09-14  1:44   ` Balbir Singh
@ 2017-09-14 17:54     ` Ram Pai
  2017-09-14 18:25       ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-09-14 17:54 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Sep 14, 2017 at 11:44:49AM +1000, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:44 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > Rearrange 64K PTE bits to  free  up  bits 3, 4, 5  and  6
> > in the 64K backed HPTE pages. This along with the earlier
> > patch will  entirely free  up the four bits from 64K PTE.
..snip...
> > diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> > index c6c5559..9c63844 100644
> > --- a/arch/powerpc/mm/hash64_64k.c
> > +++ b/arch/powerpc/mm/hash64_64k.c
> > @@ -103,8 +103,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
> >  		 * On hash insert failure we use old pte value and we don't
> >  		 * want slot information there if we have a insert failure.
> >  		 */
> > -		old_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
> > -		new_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
> > +		old_pte &= ~H_PAGE_HASHPTE;
> > +		new_pte &= ~H_PAGE_HASHPTE;
> 
> Shouldn't we set old/new_pte.slot = invalid, via rpte.hidx?

By resetting the H_PAGE_HASHPTE flag, we are invalidating
the slot information.  Would that not be sufficient?

RP

> 
..snip...
> 
> Balbir

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-09-14 17:54     ` Ram Pai
@ 2017-09-14 18:25       ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-09-14 18:25 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Sep 14, 2017 at 10:54:08AM -0700, Ram Pai wrote:
> On Thu, Sep 14, 2017 at 11:44:49AM +1000, Balbir Singh wrote:
> > On Fri,  8 Sep 2017 15:44:44 -0700
> > Ram Pai <linuxram@us.ibm.com> wrote:
> > 
> > > Rearrange 64K PTE bits to  free  up  bits 3, 4, 5  and  6
> > > in the 64K backed HPTE pages. This along with the earlier
> > > patch will  entirely free  up the four bits from 64K PTE.
..snip...
> > > diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> > > index c6c5559..9c63844 100644
> > > --- a/arch/powerpc/mm/hash64_64k.c
> > > +++ b/arch/powerpc/mm/hash64_64k.c
> > > @@ -103,8 +103,8 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
> > >  		 * On hash insert failure we use old pte value and we don't
> > >  		 * want slot information there if we have a insert failure.
> > >  		 */
> > > -		old_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
> > > -		new_pte &= ~(H_PAGE_HASHPTE | H_PAGE_F_GIX | H_PAGE_F_SECOND);
> > > +		old_pte &= ~H_PAGE_HASHPTE;
> > > +		new_pte &= ~H_PAGE_HASHPTE;
> > 
> > Shouldn't we set old/new_pte.slot = invalid, via rpte.hidx?
> 
> By resetting the H_PAGE_HASHPTE flag, we are invalidating
> the slot information.  Would that not be sufficient?

I think I misunderstood your question. Yes, rpte.hidx will have
to be reset to invalid. The code does that further down in that
function:

	if (!(old_pte & H_PAGE_COMBO))
		rpte.hidx = ~0x0UL;


RP
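
To see why ~0x0UL serves as "invalid" here: with every nibble set,
__rpte_to_hidx() returns 0xF for any subpage index, and 0xF is the
slot value the new scheme treats as "no HPTE cached". A tiny worked
example:

	unsigned long hidx  = ~0x0UL;	/* every 4-bit nibble is 1111 */
	unsigned long index = 3;	/* any subpage index, 0..15 */
	unsigned long slot  = (hidx >> (index << 2)) & 0xfUL;
	/* slot == 0xF for every index, i.e. no valid slot recorded */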

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 03/25] powerpc: track allocation status of all pkeys
  2017-09-08 22:44 ` [PATCH 03/25] powerpc: track allocation status of all pkeys Ram Pai
@ 2017-10-07 10:02   ` Michael Ellerman
  2017-10-08 23:02     ` Ram Pai
  2017-10-18  2:47   ` Balbir Singh
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 134+ messages in thread
From: Michael Ellerman @ 2017-10-07 10:02 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> Total 32 keys are available on power7 and above. However
> pkey 0,1 are reserved. So effectively we  have  30 pkeys.
>
> On 4K kernels, we do not  have  5  bits  in  the  PTE to
> represent  all the keys; we only have 3 bits. Two of those
> keys are reserved; pkey 0 and pkey 1. So effectively  we
> have 6 pkeys.
>
> This patch keeps track of reserved keys, allocated  keys
> and keys that are currently free.
>
> Also it  adds  skeletal  functions  and macros, that the
> architecture-independent code expects to be available.
>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/mmu.h |    9 ++++
>  arch/powerpc/include/asm/mmu_context.h   |    1 +
>  arch/powerpc/include/asm/pkeys.h         |   72 ++++++++++++++++++++++++++++--
>  arch/powerpc/mm/mmu_context_book3s64.c   |    2 +
>  arch/powerpc/mm/pkeys.c                  |   28 ++++++++++++
>  5 files changed, 108 insertions(+), 4 deletions(-)

This doesn't build for me, with pseries_le_defconfig. I assume it built
for you. So something has changed upstream maybe?


In file included from ../include/linux/pkeys.h:8:0,
                 from ../mm/mprotect.c:26:
../mm/mprotect.c: In function ‘do_mprotect_pkey’:
../arch/powerpc/include/asm/pkeys.h:27:29: error: ‘VM_PKEY_BIT0’ undeclared (first use in this function)
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
                             ^
../mm/mprotect.c:470:6: note: in expansion of macro ‘ARCH_VM_PKEY_FLAGS’
      ARCH_VM_PKEY_FLAGS;
      ^~~~~~~~~~~~~~~~~~
../arch/powerpc/include/asm/pkeys.h:27:29: note: each undeclared identifier is reported only once for each function it appears in
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
                             ^
../mm/mprotect.c:470:6: note: in expansion of macro ‘ARCH_VM_PKEY_FLAGS’
      ARCH_VM_PKEY_FLAGS;
      ^~~~~~~~~~~~~~~~~~
../arch/powerpc/include/asm/pkeys.h:27:44: error: ‘VM_PKEY_BIT1’ undeclared (first use in this function)
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
                                            ^
../mm/mprotect.c:470:6: note: in expansion of macro ‘ARCH_VM_PKEY_FLAGS’
      ARCH_VM_PKEY_FLAGS;
      ^~~~~~~~~~~~~~~~~~
../arch/powerpc/include/asm/pkeys.h:27:59: error: ‘VM_PKEY_BIT2’ undeclared (first use in this function)
 #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
                                                           ^
../mm/mprotect.c:470:6: note: in expansion of macro ‘ARCH_VM_PKEY_FLAGS’
      ARCH_VM_PKEY_FLAGS;
      ^~~~~~~~~~~~~~~~~~
../arch/powerpc/include/asm/pkeys.h:28:5: error: ‘VM_PKEY_BIT3’ undeclared (first use in this function)
     VM_PKEY_BIT3 | VM_PKEY_BIT4)
     ^
../mm/mprotect.c:470:6: note: in expansion of macro ‘ARCH_VM_PKEY_FLAGS’
      ARCH_VM_PKEY_FLAGS;
      ^~~~~~~~~~~~~~~~~~
../arch/powerpc/include/asm/pkeys.h:28:20: error: ‘VM_PKEY_BIT4’ undeclared (first use in this function)
     VM_PKEY_BIT3 | VM_PKEY_BIT4)
                    ^
../mm/mprotect.c:470:6: note: in expansion of macro ‘ARCH_VM_PKEY_FLAGS’
      ARCH_VM_PKEY_FLAGS;
      ^~~~~~~~~~~~~~~~~~
../scripts/Makefile.build:311: recipe for target 'mm/mprotect.o' failed


cheers
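
(As context for what this patch bookkeeps: the allocation state fits
in one bitmask over the 32 keys. A minimal sketch, where all names
and the initial mask are assumptions for illustration, not code from
the patch:)

	#define PKEYS_TOTAL	32

	/* pkey 0 and pkey 1 are reserved, so they start out as taken */
	static u32 allocated_pkeys = 0x3;

	static int pkey_alloc_one(void)
	{
		int pkey;

		/* find the first key that is neither reserved nor allocated */
		for (pkey = 0; pkey < PKEYS_TOTAL; pkey++) {
			if (!(allocated_pkeys & (1u << pkey))) {
				allocated_pkeys |= 1u << pkey;
				return pkey;
			}
		}
		return -1;	/* no free key */
	}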

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 03/25] powerpc: track allocation status of all pkeys
  2017-10-07 10:02   ` Michael Ellerman
@ 2017-10-08 23:02     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-08 23:02 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm

On Sat, Oct 07, 2017 at 09:02:55PM +1100, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > Total 32 keys are available on power7 and above. However
> > pkey 0,1 are reserved. So effectively we  have  30 pkeys.
> >
> > On 4K kernels, we do not  have  5  bits  in  the  PTE to
> > represent  all the keys; we only have 3bits.Two of those
> > keys are reserved; pkey 0 and pkey 1. So effectively  we
> > have 6 pkeys.
> >
> > This patch keeps track of reserved keys, allocated  keys
> > and keys that are currently free.
> >
> > Also it  adds  skeletal  functions  and macros, that the
> > architecture-independent code expects to be available.
> >
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/book3s/64/mmu.h |    9 ++++
> >  arch/powerpc/include/asm/mmu_context.h   |    1 +
> >  arch/powerpc/include/asm/pkeys.h         |   72 ++++++++++++++++++++++++++++--
> >  arch/powerpc/mm/mmu_context_book3s64.c   |    2 +
> >  arch/powerpc/mm/pkeys.c                  |   28 ++++++++++++
> >  5 files changed, 108 insertions(+), 4 deletions(-)
> 
> This doesn't build for me, with pseries_le_defconfig. I assume it built
> for you. So something has changed upstream maybe?
> 

Yes. :(
The following upstream commit broke my patches:
df3735c5b40fad8d0d28eb8ab065fe955b3347ee

Will fix and send you a patch.

RP



> 
> In file included from ../include/linux/pkeys.h:8:0,
>                  from ../mm/mprotect.c:26:
> ../mm/mprotect.c: In function ‘do_mprotect_pkey’:
> ../arch/powerpc/include/asm/pkeys.h:27:29: error: ‘VM_PKEY_BIT0’ undeclared (first use in this function)
>  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
>                              ^
..snip...

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 03/25] powerpc: track allocation status of all pkeys
  2017-09-08 22:44 ` [PATCH 03/25] powerpc: track allocation status of all pkeys Ram Pai
  2017-10-07 10:02   ` Michael Ellerman
@ 2017-10-18  2:47   ` Balbir Singh
  2017-10-23  9:41   ` Aneesh Kumar K.V
  2017-10-24  6:28   ` Aneesh Kumar K.V
  3 siblings, 0 replies; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  2:47 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:51 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Total 32 keys are available on power7 and above. However
> pkey 0,1 are reserved. So effectively we  have  30 pkeys.
> 
> On 4K kernels, we do not  have  5  bits  in  the  PTE to
> represent  all the keys; we only have 3 bits. Two of those
> keys are reserved; pkey 0 and pkey 1. So effectively  we
> have 6 pkeys.
> 
> This patch keeps track of reserved keys, allocated  keys
> and keys that are currently free.
> 
> Also it  adds  skeletal  functions  and macros, that the
> architecture-independent code expects to be available.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>

I ended up reviewing v7 of the patch. Is this v8?
I think the comments still apply to this revision.

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 04/25] powerpc: helper function to read,write AMR,IAMR,UAMOR registers
  2017-09-08 22:44 ` [PATCH 04/25] powerpc: helper function to read, write AMR, IAMR, UAMOR registers Ram Pai
@ 2017-10-18  3:17   ` Balbir Singh
  2017-10-18  3:42     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  3:17 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:52 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Implements helper functions to read and write the key-related
> registers: AMR, IAMR, UAMOR.
> 
> The AMR register tracks the read/write permissions of a key
> The IAMR register tracks the execute permission of a key
> The UAMOR register enables and disables a key
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h |   31 ++++++++++++++++++++++++++
>  1 files changed, 31 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index b9aff51..73ed52c 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -438,6 +438,37 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
>  		pte_update(mm, addr, ptep, 0, _PAGE_PRIVILEGED, 1);
>  }
>  
> +#include <asm/reg.h>
> +static inline u64 read_amr(void)
> +{
> +	return mfspr(SPRN_AMR);
> +}
> +static inline void write_amr(u64 value)
> +{
> +	mtspr(SPRN_AMR, value);
> +}

Do we care to validate the values, or is that left to
the caller?

> +extern bool pkey_execute_disable_support;
> +static inline u64 read_iamr(void)
> +{
> +	if (pkey_execute_disable_support)
> +		return mfspr(SPRN_IAMR);
> +	else
> +		return 0x0UL;
> +}
> +static inline void write_iamr(u64 value)
> +{
> +	if (pkey_execute_disable_support)
> +		mtspr(SPRN_IAMR, value);
> +}
> +static inline u64 read_uamor(void)
> +{
> +	return mfspr(SPRN_UAMOR);
> +}
> +static inline void write_uamor(u64 value)
> +{
> +	mtspr(SPRN_UAMOR, value);
> +}
> +
>  #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
>  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  				       unsigned long addr, pte_t *ptep)

Looks reasonable otherwise

Acked-by: Balbir Singh <bsingharora@gmail.com>

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
  2017-09-08 22:44 ` [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers Ram Pai
@ 2017-10-18  3:24   ` Balbir Singh
  2017-10-18 20:38     ` Ram Pai
  2017-10-24  6:25   ` Aneesh Kumar K.V
  1 sibling, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  3:24 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:53 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Introduce helper functions that can initialize the bits in the AMR,
> IAMR and UAMOR registers, i.e. the bits that correspond to the given pkey.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/pkeys.h |    1 +
>  arch/powerpc/mm/pkeys.c          |   46 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 47 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 133f8c4..5a83ed7 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -26,6 +26,7 @@
>  #define arch_max_pkey()  pkeys_total
>  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
>  				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> +#define AMR_BITS_PER_PKEY 2
>  
>  #define pkey_alloc_mask(pkey) (0x1 << pkey)
>  
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index ebc9e84..178aa33 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -59,3 +59,49 @@ void __init pkey_initialize(void)
>  	for (i = 2; i < (pkeys_total - os_reserved); i++)
>  		initial_allocation_mask &= ~(0x1<<i);
>  }
> +
> +#define PKEY_REG_BITS (sizeof(u64)*8)
> +#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
> +
> +static inline void init_amr(int pkey, u8 init_bits)
> +{
> +	u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
> +	u64 old_amr = read_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
> +

Do we need to check for reserved keys, or is that at a layer above?
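
(For concreteness: with AMR_BITS_PER_PKEY = 2, pkeyshift(2) = 64 - (2+1)*2
= 58, so key 2's read/write bits sit at AMR bits 58 and 59, counting from
the least-significant bit.)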

> +	write_amr(old_amr | new_amr_bits);
> +}
> +
> +static inline void init_iamr(int pkey, u8 init_bits)
> +{
> +	u64 new_iamr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
> +	u64 old_iamr = read_iamr() & ~((u64)(0x3ul) << pkeyshift(pkey));
> +
> +	write_iamr(old_iamr | new_iamr_bits);

Do we need to check for reserved keys here?

> +}
> +
> +static void pkey_status_change(int pkey, bool enable)
> +{
> +	u64 old_uamor;
> +
> +	/* reset the AMR and IAMR bits for this key */
> +	init_amr(pkey, 0x0);
> +	init_iamr(pkey, 0x0);
> +
> +	/* enable/disable key */
> +	old_uamor = read_uamor();
> +	if (enable)
> +		old_uamor |= (0x3ul << pkeyshift(pkey));
> +	else
> +		old_uamor &= ~(0x3ul << pkeyshift(pkey));
> +	write_uamor(old_uamor);
> +}
> +
> +void __arch_activate_pkey(int pkey)
> +{
> +	pkey_status_change(pkey, true);
> +}
> +
> +void __arch_deactivate_pkey(int pkey)
> +{
> +	pkey_status_change(pkey, false);
> +}


Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 06/25] powerpc: cleaup AMR,iAMR when a key is allocated or freed
  2017-09-08 22:44 ` [PATCH 06/25] powerpc: cleaup AMR, iAMR when a key is allocated or freed Ram Pai
@ 2017-10-18  3:34   ` Balbir Singh
  2017-10-23  9:43     ` [PATCH 06/25] powerpc: cleaup AMR, iAMR " Aneesh Kumar K.V
  2017-10-23  9:43   ` [PATCH 06/25] powerpc: cleaup AMR, iAMR " Aneesh Kumar K.V
  1 sibling, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  3:34 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:54 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Clean up the bits corresponding to a key in the AMR and IAMR
> registers when the key is newly allocated/activated or is freed.
> We don't want residual bits to cause the hardware to enforce
> unintended behavior when the key is activated or freed.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/pkeys.h |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 5a83ed7..53bf13b 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -54,6 +54,8 @@ static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
>  		mm_set_pkey_is_allocated(mm, pkey));
>  }
>  
> +extern void __arch_activate_pkey(int pkey);
> +extern void __arch_deactivate_pkey(int pkey);
>  /*
>   * Returns a positive, 5-bit key on success, or -1 on failure.
>   */
> @@ -80,6 +82,12 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
>  
>  	ret = ffz((u32)mm_pkey_allocation_map(mm));
>  	mm_set_pkey_allocated(mm, ret);
> +
> +	/*
> +	 * enable the key in the hardware
> +	 */
> +	if (ret > 0)
> +		__arch_activate_pkey(ret);
>  	return ret;
>  }
>  
> @@ -91,6 +99,10 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
>  	if (!mm_pkey_is_allocated(mm, pkey))
>  		return -EINVAL;
>  
> +	/*
> +	 * Disable the key in the hardware
> +	 */
> +	__arch_deactivate_pkey(pkey);
>  	mm_set_pkey_free(mm, pkey);
>  
>  	return 0;

I think some of these patches can be merged, too much fine granularity
is hurting my ability to see the larger function/implementation.

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 04/25] powerpc: helper function to read,write AMR,IAMR,UAMOR registers
  2017-10-18  3:17   ` [PATCH 04/25] powerpc: helper function to read,write AMR,IAMR,UAMOR registers Balbir Singh
@ 2017-10-18  3:42     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-18  3:42 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Oct 18, 2017 at 02:17:35PM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:52 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > Implements helper functions to read and write the key-related
> > registers: AMR, IAMR, UAMOR.
> > 
> > The AMR register tracks the read/write permissions of a key
> > The IAMR register tracks the execute permission of a key
> > The UAMOR register enables and disables a key
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/book3s/64/pgtable.h |   31 ++++++++++++++++++++++++++
> >  1 files changed, 31 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > index b9aff51..73ed52c 100644
> > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > @@ -438,6 +438,37 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
> >  		pte_update(mm, addr, ptep, 0, _PAGE_PRIVILEGED, 1);
> >  }
> >  
> > +#include <asm/reg.h>
> > +static inline u64 read_amr(void)
> > +{
> > +	return mfspr(SPRN_AMR);
> > +}
> > +static inline void write_amr(u64 value)
> > +{
> > +	mtspr(SPRN_AMR, value);
> > +}
> 
> Do we care to validate the values, or is that left to
> the caller?


No. Caller is expected to validate the values.
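
A minimal sketch of the caller-side validation I have in mind --
pkeyshift() and arch_max_pkey() are helpers that arrive later in this
series, and set_amr_field() is purely illustrative naming, not part of
the patch:

	static int set_amr_field(int pkey, u64 bits)
	{
		/* the validation that the raw accessors skip */
		if (pkey < 0 || pkey >= arch_max_pkey() || (bits & ~0x3UL))
			return -EINVAL;
		/* clear the key's 2-bit field, then install the new bits */
		write_amr((read_amr() & ~(0x3UL << pkeyshift(pkey))) |
			  (bits << pkeyshift(pkey)));
		return 0;
	}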

> 
> > +extern bool pkey_execute_disable_support;
> > +static inline u64 read_iamr(void)
> > +{
> > +	if (pkey_execute_disable_support)
> > +		return mfspr(SPRN_IAMR);
> > +	else
> > +		return 0x0UL;
> > +}
> > +static inline void write_iamr(u64 value)
> > +{
> > +	if (pkey_execute_disable_support)
> > +		mtspr(SPRN_IAMR, value);
> > +}
> > +static inline u64 read_uamor(void)
> > +{
> > +	return mfspr(SPRN_UAMOR);
> > +}
> > +static inline void write_uamor(u64 value)
> > +{
> > +	mtspr(SPRN_UAMOR, value);
> > +}
> > +
> >  #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
> >  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
> >  				       unsigned long addr, pte_t *ptep)
> 
> Looks reasonable otherwise

Thanks!
RP


> Acked-by: Balbir Singh <bsingharora@gmail.com>

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 09/25] powerpc: ability to create execute-disabled pkeys
  2017-09-08 22:44 ` [PATCH 09/25] powerpc: ability to create execute-disabled pkeys Ram Pai
@ 2017-10-18  3:42   ` Balbir Singh
  2017-10-18  5:15     ` Ram Pai
  2017-10-24  4:36   ` Aneesh Kumar K.V
  1 sibling, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  3:42 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:57 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> powerpc has hardware support to disable execute on a pkey.
> This patch enables the ability to create execute-disabled
> keys.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/uapi/asm/mman.h |    6 ++++++
>  arch/powerpc/mm/pkeys.c              |   16 ++++++++++++++++
>  2 files changed, 22 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
> index ab45cc2..f272b09 100644
> --- a/arch/powerpc/include/uapi/asm/mman.h
> +++ b/arch/powerpc/include/uapi/asm/mman.h
> @@ -45,4 +45,10 @@
>  #define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)	/* 1GB   HugeTLB Page */
>  #define MAP_HUGE_16GB	(34 << MAP_HUGE_SHIFT)	/* 16GB  HugeTLB Page */
>  
> +/* override any generic PKEY Permission defines */
> +#define PKEY_DISABLE_EXECUTE   0x4
> +#undef PKEY_ACCESS_MASK
> +#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
> +				PKEY_DISABLE_WRITE  |\
> +				PKEY_DISABLE_EXECUTE)
>  #endif /* _UAPI_ASM_POWERPC_MMAN_H */
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index cc5be6a..2282864 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -24,6 +24,14 @@ void __init pkey_initialize(void)
>  {
>  	int os_reserved, i;
>  
> +	/*
> +	 * we define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
> +	 * generic defines for PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.
> +	 * Ensure that the bits a distinct.
> +	 */
> +	BUILD_BUG_ON(PKEY_DISABLE_EXECUTE &
> +		     (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));

Will these values ever change? It's good to have, I guess.

> +
>  	/* disable the pkey system till everything
>  	 * is in place. A patch further down the
>  	 * line will enable it.
> @@ -120,10 +128,18 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
>  		unsigned long init_val)
>  {
>  	u64 new_amr_bits = 0x0ul;
> +	u64 new_iamr_bits = 0x0ul;
>  
>  	if (!is_pkey_enabled(pkey))
>  		return -EINVAL;
>  
> +	if ((init_val & PKEY_DISABLE_EXECUTE)) {
> +		if (!pkey_execute_disable_support)
> +			return -EINVAL;
> +		new_iamr_bits |= IAMR_EX_BIT;
> +	}
> +	init_iamr(pkey, new_iamr_bits);
> +

Where do we check the reserved keys?

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 10/25] powerpc: store and restore the pkey state across context switches
  2017-09-08 22:44 ` [PATCH 10/25] powerpc: store and restore the pkey state across context switches Ram Pai
@ 2017-10-18  3:49   ` Balbir Singh
  2017-10-18 20:47     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  3:49 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:58 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Store and restore the AMR, IAMR and UAMOR register state of the task
> before scheduling out and after scheduling in, respectively.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/pkeys.h     |    4 +++
>  arch/powerpc/include/asm/processor.h |    5 ++++
>  arch/powerpc/kernel/process.c        |   10 ++++++++
>  arch/powerpc/mm/pkeys.c              |   39 ++++++++++++++++++++++++++++++++++
>  4 files changed, 58 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 7fd48a4..78c5362 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -143,5 +143,9 @@ static inline void pkey_mm_init(struct mm_struct *mm)
>  	mm_pkey_allocation_map(mm) = initial_allocation_mask;
>  }
>  
> +extern void thread_pkey_regs_save(struct thread_struct *thread);
> +extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
> +			struct thread_struct *old_thread);
> +extern void thread_pkey_regs_init(struct thread_struct *thread);
>  extern void pkey_initialize(void);
>  #endif /*_ASM_PPC64_PKEYS_H */
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index fab7ff8..de9d9ba 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -309,6 +309,11 @@ struct thread_struct {
>  	struct thread_vr_state ckvr_state; /* Checkpointed VR state */
>  	unsigned long	ckvrsave; /* Checkpointed VRSAVE */
>  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	unsigned long	amr;
> +	unsigned long	iamr;
> +	unsigned long	uamor;
> +#endif
>  #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
>  	void*		kvm_shadow_vcpu; /* KVM internal data */
>  #endif /* CONFIG_KVM_BOOK3S_32_HANDLER */
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index a0c74bb..ba80002 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -42,6 +42,7 @@
>  #include <linux/hw_breakpoint.h>
>  #include <linux/uaccess.h>
>  #include <linux/elf-randomize.h>
> +#include <linux/pkeys.h>
>  
>  #include <asm/pgtable.h>
>  #include <asm/io.h>
> @@ -1085,6 +1086,9 @@ static inline void save_sprs(struct thread_struct *t)
>  		t->tar = mfspr(SPRN_TAR);
>  	}
>  #endif
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	thread_pkey_regs_save(t);
> +#endif

Just define two variants of thread_pkey_regs_save() based on
CONFIG_PPC64_MEMORY_PROTECTION_KEYS and remove the #ifdefs from
process.c. Ditto for the lines below.
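
I.e. a sketch of the usual stub idiom in pkeys.h:

	#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
	extern void thread_pkey_regs_save(struct thread_struct *thread);
	#else
	static inline void thread_pkey_regs_save(struct thread_struct *thread) { }
	#endif

The empty stub compiles away, so process.c can call it unconditionally.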

>  }
>  
>  static inline void restore_sprs(struct thread_struct *old_thread,
> @@ -1120,6 +1124,9 @@ static inline void restore_sprs(struct thread_struct *old_thread,
>  			mtspr(SPRN_TAR, new_thread->tar);
>  	}
>  #endif
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	thread_pkey_regs_restore(new_thread, old_thread);
> +#endif
>  }
>  
>  #ifdef CONFIG_PPC_BOOK3S_64
> @@ -1705,6 +1712,9 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
>  	current->thread.tm_tfiar = 0;
>  	current->thread.load_tm = 0;
>  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	thread_pkey_regs_init(&current->thread);
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
>  }
>  EXPORT_SYMBOL(start_thread);
>  
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index 2282864..7cd1be4 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -149,3 +149,42 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
>  	init_amr(pkey, new_amr_bits);
>  	return 0;
>  }
> +
> +void thread_pkey_regs_save(struct thread_struct *thread)
> +{
> +	if (!pkey_inited)
> +		return;
> +
> +	/* @TODO skip saving any registers if the thread
> +	 * has not used any keys yet.
> +	 */

Comment style is broken.

> +
> +	thread->amr = read_amr();
> +	thread->iamr = read_iamr();
> +	thread->uamor = read_uamor();
> +}
> +
> +void thread_pkey_regs_restore(struct thread_struct *new_thread,
> +			struct thread_struct *old_thread)
> +{
> +	if (!pkey_inited)
> +		return;
> +
> +	/* @TODO just reset uamor to zero if the new_thread
> +	 * has not used any keys yet.
> +	 */

Comment style is broken.

> +
> +	if (old_thread->amr != new_thread->amr)
> +		write_amr(new_thread->amr);
> +	if (old_thread->iamr != new_thread->iamr)
> +		write_iamr(new_thread->iamr);
> +	if (old_thread->uamor != new_thread->uamor)
> +		write_uamor(new_thread->uamor);

Is this order correct? Ideally, you want to write the uamor first,
but since we are in supervisor state, I think we can get away
with this order. Do we want to expose the uamor to user space
for it to modify the AMR directly?

> +}
> +
> +void thread_pkey_regs_init(struct thread_struct *thread)
> +{
> +	write_amr(0x0ul);
> +	write_iamr(0x0ul);
> +	write_uamor(0x0ul);

This is not correct, reserved keys should not be set to 0's

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 11/25] powerpc: introduce execute-only pkey
  2017-09-08 22:44 ` [PATCH 11/25] powerpc: introduce execute-only pkey Ram Pai
@ 2017-10-18  4:15   ` Balbir Singh
  2017-10-18 20:57     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  4:15 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:44:59 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> This patch provides the implementation of execute-only pkey.
> The architecture-independent layer expects the arch-dependent
> layer, to support the ability to create and enable a special
> key which has execute-only permission.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/mmu.h |    1 +
>  arch/powerpc/include/asm/pkeys.h         |    9 ++++-
>  arch/powerpc/mm/pkeys.c                  |   57 ++++++++++++++++++++++++++++++
>  3 files changed, 66 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> index 55950f4..ee18ba0 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> @@ -115,6 +115,7 @@ struct patb_entry {
>  	 * bit unset -> key available for allocation
>  	 */
>  	u32 pkey_allocation_map;
> +	s16 execute_only_pkey; /* key holding execute-only protection */
>  #endif
>  } mm_context_t;
>  
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 78c5362..0cf115f 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -115,11 +115,16 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
>   * Try to dedicate one of the protection keys to be used as an
>   * execute-only protection key.
>   */
> +extern int __execute_only_pkey(struct mm_struct *mm);
>  static inline int execute_only_pkey(struct mm_struct *mm)
>  {
> -	return 0;
> +	if (!pkey_inited || !pkey_execute_disable_support)
> +		return -1;
> +
> +	return __execute_only_pkey(mm);
>  }
>  
> +
>  static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
>  		int prot, int pkey)
>  {
> @@ -141,6 +146,8 @@ static inline void pkey_mm_init(struct mm_struct *mm)
>  	if (!pkey_inited)
>  		return;
>  	mm_pkey_allocation_map(mm) = initial_allocation_mask;
> +	/* -1 means unallocated or invalid */
> +	mm->context.execute_only_pkey = -1;
>  }
>  
>  extern void thread_pkey_regs_save(struct thread_struct *thread);
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index 7cd1be4..8a24983 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -188,3 +188,60 @@ void thread_pkey_regs_init(struct thread_struct *thread)
>  	write_iamr(0x0ul);
>  	write_uamor(0x0ul);
>  }
> +
> +static inline bool pkey_allows_readwrite(int pkey)
> +{
> +	int pkey_shift = pkeyshift(pkey);
> +
> +	if (!(read_uamor() & (0x3UL << pkey_shift)))
> +		return true;

If the uamor field for key 0 is 0b10, for example, or 0b01 (only one
of the two bits set), it's a bug. The above check might miss it.
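
A stricter variant might test the full two-bit field, e.g. (a sketch of
what I mean; pkey_user_modifiable() is my naming):

	static bool pkey_user_modifiable(int pkey)
	{
		u64 field = (read_uamor() >> pkeyshift(pkey)) & 0x3UL;

		/* a half-set field (01 or 10) would indicate a bug */
		WARN_ON_ONCE(field != 0x0UL && field != 0x3UL);
		return field == 0x3UL;
	}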

> +
> +	return !(read_amr() & ((AMR_RD_BIT|AMR_WR_BIT) << pkey_shift));
> +}
> +
> +int __execute_only_pkey(struct mm_struct *mm)
> +{
> +	bool need_to_set_mm_pkey = false;
> +	int execute_only_pkey = mm->context.execute_only_pkey;
> +	int ret;
> +
> +	/* Do we need to assign a pkey for mm's execute-only maps? */
> +	if (execute_only_pkey == -1) {
> +		/* Go allocate one to use, which might fail */
> +		execute_only_pkey = mm_pkey_alloc(mm);
> +		if (execute_only_pkey < 0)
> +			return -1;
> +		need_to_set_mm_pkey = true;
> +	}
> +
> +	/*
> +	 * We do not want to go through the relatively costly
> +	 * dance to set AMR if we do not need to.  Check it
> +	 * first and assume that if the execute-only pkey is
> +	 * readwrite-disabled than we do not have to set it
> +	 * ourselves.
> +	 */
> +	if (!need_to_set_mm_pkey &&
> +	    !pkey_allows_readwrite(execute_only_pkey))
> +		return execute_only_pkey;
> +
> +	/*
> +	 * Set up AMR so that it denies access for everything
> +	 * other than execution.
> +	 */
> +	ret = __arch_set_user_pkey_access(current, execute_only_pkey,
> +			(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
> +	/*
> +	 * If the AMR-set operation failed somehow, just return
> +	 * 0 and effectively disable execute-only support.
> +	 */
> +	if (ret) {
> +		mm_set_pkey_free(mm, execute_only_pkey);
> +		return -1;
> +	}
> +
> +	/* We got one, store it and use it from here on out */
> +	if (need_to_set_mm_pkey)
> +		mm->context.execute_only_pkey = execute_only_pkey;
> +	return execute_only_pkey;
> +}

Looks good otherwise

Acked-by: Balbir Singh <bsingharora@gmail.com>

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 12/25] powerpc: ability to associate pkey to a vma
  2017-09-08 22:45 ` [PATCH 12/25] powerpc: ability to associate pkey to a vma Ram Pai
@ 2017-10-18  4:27   ` Balbir Singh
  2017-10-18 21:01     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  4:27 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:45:00 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> arch-independent code expects the arch to  map
> a  pkey  into the vma's protection bit setting.
> The patch provides that ability.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/mman.h  |    8 +++++++-
>  arch/powerpc/include/asm/pkeys.h |   18 ++++++++++++++++++
>  2 files changed, 25 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
> index 30922f6..067eec2 100644
> --- a/arch/powerpc/include/asm/mman.h
> +++ b/arch/powerpc/include/asm/mman.h
> @@ -13,6 +13,7 @@
>  
>  #include <asm/cputable.h>
>  #include <linux/mm.h>
> +#include <linux/pkeys.h>
>  #include <asm/cpu_has_feature.h>
>  
>  /*
> @@ -22,7 +23,12 @@
>  static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>  		unsigned long pkey)
>  {
> -	return (prot & PROT_SAO) ? VM_SAO : 0;
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	return (((prot & PROT_SAO) ? VM_SAO : 0) |
> +			pkey_to_vmflag_bits(pkey));
> +#else
> +	return ((prot & PROT_SAO) ? VM_SAO : 0);
> +#endif
>  }
>  #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
>  
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 0cf115f..f13e913 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -23,6 +23,24 @@
>  #define VM_PKEY_BIT4	VM_HIGH_ARCH_4
>  #endif
>  
> +/* override any generic PKEY Permission defines */
> +#define PKEY_DISABLE_EXECUTE   0x4
> +#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
> +				PKEY_DISABLE_WRITE  |\
> +				PKEY_DISABLE_EXECUTE)
> +
> +static inline u64 pkey_to_vmflag_bits(u16 pkey)
> +{
> +	if (!pkey_inited)
> +		return 0x0UL;
> +
> +	return (((pkey & 0x1UL) ? VM_PKEY_BIT0 : 0x0UL) |
> +		((pkey & 0x2UL) ? VM_PKEY_BIT1 : 0x0UL) |
> +		((pkey & 0x4UL) ? VM_PKEY_BIT2 : 0x0UL) |
> +		((pkey & 0x8UL) ? VM_PKEY_BIT3 : 0x0UL) |
> +		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
> +}

Assuming that the bits from VM_PKEY_BIT0 to VM_PKEY_BIT4 are
contiguous, the conditional checks can be removed:

(pkey & 0x1fUL) << VM_PKEY_BIT0?
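
That is, roughly -- a sketch assuming VM_PKEY_BIT0..VM_PKEY_BIT4 are
five consecutive vm_flags bits starting at VM_PKEY_SHIFT:

	static inline u64 pkey_to_vmflag_bits(u16 pkey)
	{
		/* the five key bits map 1:1 onto the five VM_PKEY flags */
		return (u64)(pkey & 0x1fUL) << VM_PKEY_SHIFT;
	}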


Balbir Singh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 13/25] powerpc: implementation for arch_override_mprotect_pkey()
  2017-09-08 22:45 ` [PATCH 13/25] powerpc: implementation for arch_override_mprotect_pkey() Ram Pai
@ 2017-10-18  4:36   ` Balbir Singh
  2017-10-18 21:10     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  4:36 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:45:01 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> arch independent code calls arch_override_mprotect_pkey()
> to return a pkey that best matches the requested protection.
> 
> This patch provides the implementation.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/mmu_context.h |    5 +++
>  arch/powerpc/include/asm/pkeys.h       |   17 ++++++++++-
>  arch/powerpc/mm/pkeys.c                |   47 ++++++++++++++++++++++++++++++++
>  3 files changed, 67 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index c705a5d..8e5a87e 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -145,6 +145,11 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
>  #ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
>  #define pkey_initialize()
>  #define pkey_mm_init(mm)
> +
> +static inline int vma_pkey(struct vm_area_struct *vma)
> +{
> +	return 0;
> +}
>  #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
>  
>  #endif /* __KERNEL__ */
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index f13e913..d2fffef 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -41,6 +41,16 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
>  		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
>  }
>  
> +#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
> +				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> +
> +static inline int vma_pkey(struct vm_area_struct *vma)
> +{
> +	if (!pkey_inited)
> +		return 0;

We don't want pkey_inited checks in all these functions; why do we need
a conditional branch in every one? Even if we do, it should be a jump
label. I would rather we just removed !pkey_inited unless really
required.
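
A jump-label version would look roughly like this (a sketch; the
pkey_disabled static key is my naming, defined once in pkeys.c):

	/* pkeys.c */
	DEFINE_STATIC_KEY_TRUE(pkey_disabled);

	/* pkeys.h */
	DECLARE_STATIC_KEY_TRUE(pkey_disabled);

	static inline int vma_pkey(struct vm_area_struct *vma)
	{
		if (static_branch_likely(&pkey_disabled))
			return 0;
		return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
	}

Once initialization flips the key with static_branch_disable(), the
check is patched in place instead of loading and testing a global on
every call.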

> +	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
> +}
> +
>  #define arch_max_pkey()  pkeys_total
>  #define AMR_RD_BIT 0x1UL
>  #define AMR_WR_BIT 0x2UL
> @@ -142,11 +152,14 @@ static inline int execute_only_pkey(struct mm_struct *mm)
>  	return __execute_only_pkey(mm);
>  }
>  
> -
> +extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
> +		int prot, int pkey);
>  static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
>  		int prot, int pkey)
>  {
> -	return 0;
> +	if (!pkey_inited)
> +		return 0;
> +	return __arch_override_mprotect_pkey(vma, prot, pkey);
>  }
>  
>  extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index 8a24983..fb1a76a 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -245,3 +245,50 @@ int __execute_only_pkey(struct mm_struct *mm)
>  		mm->context.execute_only_pkey = execute_only_pkey;
>  	return execute_only_pkey;
>  }
> +
> +static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
> +{
> +	/* Do this check first since the vm_flags should be hot */
> +	if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
> +		return false;
> +
> +	return (vma_pkey(vma) == vma->vm_mm->context.execute_only_pkey);
> +}
> +
> +/*
> + * This should only be called for *plain* mprotect calls.

What's a plain mprotect call?

> + */
> +int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
> +		int pkey)
> +{
> +	/*
> +	 * Is this an mprotect_pkey() call?  If so, never
> +	 * override the value that came from the user.
> +	 */
> +	if (pkey != -1)
> +		return pkey;

If the user specified a key, we always use it? Presumably the user
got it from pkey_alloc(); in other cases, the user was lazy and used
-1 in the mprotect call?
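
For reference, a userspace-side sketch of the two paths (glibc wrappers
assumed; addr/len are placeholders):

	int pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);

	/* explicit key: the kernel must honor it as-is */
	pkey_mprotect(addr, len, PROT_READ, pkey);

	/* plain mprotect: reaches this code with pkey == -1, so the
	 * kernel is free to pick one, e.g. the execute-only key */
	mprotect(addr, len, PROT_EXEC);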

> +
> +	/*
> +	 * If the currently associated pkey is execute-only,
> +	 * but the requested protection requires read or write,
> +	 * move it back to the default pkey.
> +	 */
> +	if (vma_is_pkey_exec_only(vma) &&
> +	    (prot & (PROT_READ|PROT_WRITE)))
> +		return 0;
> +
> +	/*
> +	 * the requested protection is execute-only. Hence
> +	 * lets use a execute-only pkey.
> +	 */
> +	if (prot == PROT_EXEC) {
> +		pkey = execute_only_pkey(vma->vm_mm);
> +		if (pkey > 0)
> +			return pkey;
> +	}
> +
> +	/*
> +	 * nothing to override.
> +	 */
> +	return vma_pkey(vma);
> +}

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 14/25] powerpc: map vma key-protection bits to pte key bits.
  2017-09-08 22:45 ` [PATCH 14/25] powerpc: map vma key-protection bits to pte key bits Ram Pai
@ 2017-10-18  4:39   ` Balbir Singh
  2017-10-18 21:14     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  4:39 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:45:02 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Map the key protection bits of the vma to the pkey bits in
> the PTE.
> 
> The PTE bits used for pkey are 3, 4, 5, 6 and 57. The first
> four bits are the same four bits that were freed up initially
> in this patch series. Remember? :-) Without those four bits
> this patch wouldn't be possible.
> 
> BUT, on 4K kernels, bits 3 and 4 could not be freed up. Remember?
> Hence we have to be satisfied with 5, 6 and 57.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h |   25 ++++++++++++++++++++++++-
>  arch/powerpc/include/asm/mman.h              |    8 ++++++++
>  arch/powerpc/include/asm/pkeys.h             |   12 ++++++++++++
>  3 files changed, 44 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 73ed52c..5935d4e 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -38,6 +38,7 @@
>  #define _RPAGE_RSV2		0x0800000000000000UL
>  #define _RPAGE_RSV3		0x0400000000000000UL
>  #define _RPAGE_RSV4		0x0200000000000000UL
> +#define _RPAGE_RSV5		0x00040UL
>  
>  #define _PAGE_PTE		0x4000000000000000UL	/* distinguishes PTEs from pointers */
>  #define _PAGE_PRESENT		0x8000000000000000UL	/* pte contains a translation */
> @@ -57,6 +58,25 @@
>  /* Max physical address bit as per radix table */
>  #define _RPAGE_PA_MAX		57
>  
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +#ifdef CONFIG_PPC_64K_PAGES
> +#define H_PAGE_PKEY_BIT0	_RPAGE_RSV1
> +#define H_PAGE_PKEY_BIT1	_RPAGE_RSV2
> +#else /* CONFIG_PPC_64K_PAGES */
> +#define H_PAGE_PKEY_BIT0	0 /* _RPAGE_RSV1 is not available */
> +#define H_PAGE_PKEY_BIT1	0 /* _RPAGE_RSV2 is not available */
> +#endif /* CONFIG_PPC_64K_PAGES */
> +#define H_PAGE_PKEY_BIT2	_RPAGE_RSV3
> +#define H_PAGE_PKEY_BIT3	_RPAGE_RSV4
> +#define H_PAGE_PKEY_BIT4	_RPAGE_RSV5
> +#else /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> +#define H_PAGE_PKEY_BIT0	0
> +#define H_PAGE_PKEY_BIT1	0
> +#define H_PAGE_PKEY_BIT2	0
> +#define H_PAGE_PKEY_BIT3	0
> +#define H_PAGE_PKEY_BIT4	0
> +#endif /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */

H_PTE_PKEY_BITX?

> +
>  /*
>   * Max physical address bit we will use for now.
>   *
> @@ -120,13 +140,16 @@
>  #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
>  			 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE |	\
>  			 _PAGE_SOFT_DIRTY)
> +
> +#define H_PAGE_PKEY  (H_PAGE_PKEY_BIT0 | H_PAGE_PKEY_BIT1 | H_PAGE_PKEY_BIT2 | \
> +			H_PAGE_PKEY_BIT3 | H_PAGE_PKEY_BIT4)
>  /*
>   * Mask of bits returned by pte_pgprot()
>   */
>  #define PAGE_PROT_BITS  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \
>  			 H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \
>  			 _PAGE_READ | _PAGE_WRITE |  _PAGE_DIRTY | _PAGE_EXEC | \
> -			 _PAGE_SOFT_DIRTY)
> +			 _PAGE_SOFT_DIRTY | H_PAGE_PKEY)
>  /*
>   * We define 2 sets of base prot bits, one for basic pages (ie,
>   * cacheable kernel and user pages) and one for non cacheable
> diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
> index 067eec2..3f7220f 100644
> --- a/arch/powerpc/include/asm/mman.h
> +++ b/arch/powerpc/include/asm/mman.h
> @@ -32,12 +32,20 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
>  }
>  #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
>  
> +
>  static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
>  {
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	return (vm_flags & VM_SAO) ?
> +		__pgprot(_PAGE_SAO | vmflag_to_page_pkey_bits(vm_flags)) :
> +		__pgprot(0 | vmflag_to_page_pkey_bits(vm_flags));
> +#else
>  	return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0);
> +#endif
>  }
>  #define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
>  
> +
>  static inline bool arch_validate_prot(unsigned long prot)
>  {
>  	if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_SAO))
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index d2fffef..0d2488a 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -41,6 +41,18 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
>  		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
>  }
>  
> +static inline u64 vmflag_to_page_pkey_bits(u64 vm_flags)

vmflag_to_pte_pkey_bits?

> +{
> +	if (!pkey_inited)
> +		return 0x0UL;
> +
> +	return (((vm_flags & VM_PKEY_BIT0) ? H_PAGE_PKEY_BIT4 : 0x0UL) |
> +		((vm_flags & VM_PKEY_BIT1) ? H_PAGE_PKEY_BIT3 : 0x0UL) |
> +		((vm_flags & VM_PKEY_BIT2) ? H_PAGE_PKEY_BIT2 : 0x0UL) |
> +		((vm_flags & VM_PKEY_BIT3) ? H_PAGE_PKEY_BIT1 : 0x0UL) |
> +		((vm_flags & VM_PKEY_BIT4) ? H_PAGE_PKEY_BIT0 : 0x0UL));
> +}
> +
>  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
>  				VM_PKEY_BIT3 | VM_PKEY_BIT4)
>  

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 16/25] powerpc: Program HPTE key protection bits
  2017-09-08 22:45 ` [PATCH 16/25] powerpc: Program HPTE key protection bits Ram Pai
@ 2017-10-18  4:43   ` Balbir Singh
  0 siblings, 0 replies; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  4:43 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:45:04 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Map the PTE protection key bits to the HPTE key protection bits,
> while creating HPTE  entries.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/mmu-hash.h |    5 +++++
>  arch/powerpc/include/asm/mmu_context.h        |    6 ++++++
>  arch/powerpc/include/asm/pkeys.h              |   13 +++++++++++++
>  arch/powerpc/mm/hash_utils_64.c               |    1 +
>  4 files changed, 25 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 508275b..2e22357 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -90,6 +90,8 @@
>  #define HPTE_R_PP0		ASM_CONST(0x8000000000000000)
>  #define HPTE_R_TS		ASM_CONST(0x4000000000000000)
>  #define HPTE_R_KEY_HI		ASM_CONST(0x3000000000000000)
> +#define HPTE_R_KEY_BIT0		ASM_CONST(0x2000000000000000)
> +#define HPTE_R_KEY_BIT1		ASM_CONST(0x1000000000000000)
>  #define HPTE_R_RPN_SHIFT	12
>  #define HPTE_R_RPN		ASM_CONST(0x0ffffffffffff000)
>  #define HPTE_R_RPN_3_0		ASM_CONST(0x01fffffffffff000)
> @@ -104,6 +106,9 @@
>  #define HPTE_R_C		ASM_CONST(0x0000000000000080)
>  #define HPTE_R_R		ASM_CONST(0x0000000000000100)
>  #define HPTE_R_KEY_LO		ASM_CONST(0x0000000000000e00)
> +#define HPTE_R_KEY_BIT2		ASM_CONST(0x0000000000000800)
> +#define HPTE_R_KEY_BIT3		ASM_CONST(0x0000000000000400)
> +#define HPTE_R_KEY_BIT4		ASM_CONST(0x0000000000000200)
>  #define HPTE_R_KEY		(HPTE_R_KEY_LO | HPTE_R_KEY_HI)
>  
>  #define HPTE_V_1TB_SEG		ASM_CONST(0x4000000000000000)
> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index 8e5a87e..04e9221 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -150,6 +150,12 @@ static inline int vma_pkey(struct vm_area_struct *vma)
>  {
>  	return 0;
>  }
> +
> +static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
> +{
> +	return 0x0UL;
> +}
> +
>  #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
>  
>  #endif /* __KERNEL__ */
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 0d2488a..cd3924c 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -67,6 +67,19 @@ static inline int vma_pkey(struct vm_area_struct *vma)
>  #define AMR_RD_BIT 0x1UL
>  #define AMR_WR_BIT 0x2UL
>  #define IAMR_EX_BIT 0x1UL
> +
> +static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
> +{
> +	if (!pkey_inited)
> +		return 0x0UL;
> +
> +	return (((pteflags & H_PAGE_PKEY_BIT0) ? HPTE_R_KEY_BIT0 : 0x0UL) |
> +		((pteflags & H_PAGE_PKEY_BIT1) ? HPTE_R_KEY_BIT1 : 0x0UL) |
> +		((pteflags & H_PAGE_PKEY_BIT2) ? HPTE_R_KEY_BIT2 : 0x0UL) |
> +		((pteflags & H_PAGE_PKEY_BIT3) ? HPTE_R_KEY_BIT3 : 0x0UL) |
> +		((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
> +}
> +
>  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
>  				VM_PKEY_BIT3 | VM_PKEY_BIT4)
>  #define AMR_BITS_PER_PKEY 2
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 67f62b5..a739a2d 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -232,6 +232,7 @@ unsigned long htab_convert_pte_flags(unsigned long pteflags)
>  		 */
>  		rflags |= HPTE_R_M;
>  
> +	rflags |= pte_to_hpte_pkey_bits(pteflags);
>  	return rflags;
>  }
>  


Looks good!

Acked-by: Balbir Singh <bsingharora@gmail.com>

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 17/25] powerpc: helper to validate key-access permissions of a pte
  2017-09-08 22:45 ` [PATCH 17/25] powerpc: helper to validate key-access permissions of a pte Ram Pai
@ 2017-10-18  4:48   ` Balbir Singh
  2017-10-18 21:19     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18  4:48 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:45:05 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> helper function that checks if the read/write/execute is allowed
> on the pte.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h |    4 +++
>  arch/powerpc/include/asm/pkeys.h             |   12 +++++++++++
>  arch/powerpc/mm/pkeys.c                      |   28 ++++++++++++++++++++++++++
>  3 files changed, 44 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 5935d4e..bd244b3 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -492,6 +492,10 @@ static inline void write_uamor(u64 value)
>  	mtspr(SPRN_UAMOR, value);
>  }
>  
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> +
>  #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
>  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  				       unsigned long addr, pte_t *ptep)
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index cd3924c..50522a0 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -80,6 +80,18 @@ static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
>  		((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
>  }
>  
> +static inline u16 pte_to_pkey_bits(u64 pteflags)
> +{
> +	if (!pkey_inited)
> +		return 0x0UL;
> +
> +	return (((pteflags & H_PAGE_PKEY_BIT0) ? 0x10 : 0x0UL) |
> +		((pteflags & H_PAGE_PKEY_BIT1) ? 0x8 : 0x0UL) |
> +		((pteflags & H_PAGE_PKEY_BIT2) ? 0x4 : 0x0UL) |
> +		((pteflags & H_PAGE_PKEY_BIT3) ? 0x2 : 0x0UL) |
> +		((pteflags & H_PAGE_PKEY_BIT4) ? 0x1 : 0x0UL));
> +}
> +
>  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
>  				VM_PKEY_BIT3 | VM_PKEY_BIT4)
>  #define AMR_BITS_PER_PKEY 2
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index fb1a76a..24589d9 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -292,3 +292,31 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
>  	 */
>  	return vma_pkey(vma);
>  }
> +
> +static bool pkey_access_permitted(int pkey, bool write, bool execute)
> +{
> +	int pkey_shift;
> +	u64 amr;
> +
> +	if (!pkey)
> +		return true;

Why would we have pkey set to 0? It's reserved. Why do we return true?

> +
> +	pkey_shift = pkeyshift(pkey);
> +	if (!(read_uamor() & (0x3UL << pkey_shift)))
> +		return true;
> +
> +	if (execute && !(read_iamr() & (IAMR_EX_BIT << pkey_shift)))
> +		return true;
> +
> +	amr = read_amr(); /* delay reading amr until absolutely needed */
> +	return ((!write && !(amr & (AMR_RD_BIT << pkey_shift))) ||
> +		(write &&  !(amr & (AMR_WR_BIT << pkey_shift))));
> +}
> +
> +bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
> +{
> +	if (!pkey_inited)
> +		return true;

Again, don't like the pkey_inited bits :)

> +	return pkey_access_permitted(pte_to_pkey_bits(pte),
> +			write, execute);
> +}

Balbir Singh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 09/25] powerpc: ability to create execute-disabled pkeys
  2017-10-18  3:42   ` Balbir Singh
@ 2017-10-18  5:15     ` Ram Pai
  2017-10-24  6:58       ` Aneesh Kumar K.V
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-18  5:15 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Oct 18, 2017 at 02:42:56PM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:57 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > powerpc has hardware support to disable execute on a pkey.
> > This patch enables the ability to create execute-disabled
> > keys.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/uapi/asm/mman.h |    6 ++++++
> >  arch/powerpc/mm/pkeys.c              |   16 ++++++++++++++++
> >  2 files changed, 22 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
> > index ab45cc2..f272b09 100644
> > --- a/arch/powerpc/include/uapi/asm/mman.h
> > +++ b/arch/powerpc/include/uapi/asm/mman.h
> > @@ -45,4 +45,10 @@
> >  #define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)	/* 1GB   HugeTLB Page */
> >  #define MAP_HUGE_16GB	(34 << MAP_HUGE_SHIFT)	/* 16GB  HugeTLB Page */
> >  
> > +/* override any generic PKEY Permission defines */
> > +#define PKEY_DISABLE_EXECUTE   0x4
> > +#undef PKEY_ACCESS_MASK
> > +#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
> > +				PKEY_DISABLE_WRITE  |\
> > +				PKEY_DISABLE_EXECUTE)
> >  #endif /* _UAPI_ASM_POWERPC_MMAN_H */
> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > index cc5be6a..2282864 100644
> > --- a/arch/powerpc/mm/pkeys.c
> > +++ b/arch/powerpc/mm/pkeys.c
> > @@ -24,6 +24,14 @@ void __init pkey_initialize(void)
> >  {
> >  	int os_reserved, i;
> >  
> > +	/*
> > +	 * we define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
> > +	 * generic defines for PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.
> > +	 * Ensure that the bits are distinct.
> > +	 */
> > +	BUILD_BUG_ON(PKEY_DISABLE_EXECUTE &
> > +		     (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
> 
> Will these values ever change? It's good to have, I guess.
> 
> > +
> >  	/* disable the pkey system till everything
> >  	 * is in place. A patch further down the
> >  	 * line will enable it.
> > @@ -120,10 +128,18 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> >  		unsigned long init_val)
> >  {
> >  	u64 new_amr_bits = 0x0ul;
> > +	u64 new_iamr_bits = 0x0ul;
> >  
> >  	if (!is_pkey_enabled(pkey))
> >  		return -EINVAL;
> >  
> > +	if ((init_val & PKEY_DISABLE_EXECUTE)) {
> > +		if (!pkey_execute_disable_support)
> > +			return -EINVAL;
> > +		new_iamr_bits |= IAMR_EX_BIT;
> > +	}
> > +	init_iamr(pkey, new_iamr_bits);
> > +
> 
> Where do we check the reserved keys?

The main gate-keepers against spurious keys are the system calls.
sys_pkey_mprotect(), sys_pkey_free() and sys_pkey_modify() are the ones
that check against reserved and unallocated keys.  Once a key has
passed that check, all other internal functions trust the key values
provided to them. I can put in additional checks, but that would
unnecessarily chew a few CPU cycles.

Agree?
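
Roughly the kind of gate-keeping check I mean, at the syscall boundary
(a sketch; pkey_is_reserved() is illustrative naming):

	static bool pkey_is_valid(struct mm_struct *mm, int pkey)
	{
		return pkey >= 0 && pkey < arch_max_pkey() &&
		       !pkey_is_reserved(pkey) &&
		       mm_pkey_is_allocated(mm, pkey);
	}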

BTW: you raise a good point, though; I may have missed guarding against
unallocated or reserved keys in sys_pkey_modify(). That is a
powerpc-specific system call that I introduced to change the permissions
on a key.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 18/25] powerpc: check key protection for user page access
  2017-09-08 22:45 ` [PATCH 18/25] powerpc: check key protection for user page access Ram Pai
@ 2017-10-18 19:57   ` Balbir Singh
  2017-10-18 21:29     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18 19:57 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:45:06 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Make sure that the kernel does not access user pages without
> checking their key-protection.
>

Why? This makes the routines AMR/thread-specific? Looks like
x86 does this as well, but these routines are used by GUP from
the kernel.

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
  2017-10-18  3:24   ` Balbir Singh
@ 2017-10-18 20:38     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-18 20:38 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Oct 18, 2017 at 02:24:03PM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:53 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > Introduce helper functions that can initialize the bits in the AMR,
> > IAMR and UAMOR registers, i.e. the bits that correspond to the given pkey.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/pkeys.h |    1 +
> >  arch/powerpc/mm/pkeys.c          |   46 ++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 47 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index 133f8c4..5a83ed7 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -26,6 +26,7 @@
> >  #define arch_max_pkey()  pkeys_total
> >  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
> >  				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> > +#define AMR_BITS_PER_PKEY 2
> >  
> >  #define pkey_alloc_mask(pkey) (0x1 << pkey)
> >  
> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > index ebc9e84..178aa33 100644
> > --- a/arch/powerpc/mm/pkeys.c
> > +++ b/arch/powerpc/mm/pkeys.c
> > @@ -59,3 +59,49 @@ void __init pkey_initialize(void)
> >  	for (i = 2; i < (pkeys_total - os_reserved); i++)
> >  		initial_allocation_mask &= ~(0x1<<i);
> >  }
> > +
> > +#define PKEY_REG_BITS (sizeof(u64)*8)
> > +#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
> > +
> > +static inline void init_amr(int pkey, u8 init_bits)
> > +{
> > +	u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
> > +	u64 old_amr = read_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
> > +
> 
> Do we need to check for reserved keys, or is that at a layer above?

These routines blindly trust the caller. The assumption is that the
system calls, which are the gate-keepers for the keys, validate the
keys before calling any lower-level functions.

> 
> > +	write_amr(old_amr | new_amr_bits);
> > +}
> > +
> > +static inline void init_iamr(int pkey, u8 init_bits)
> > +{
> > +	u64 new_iamr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
> > +	u64 old_iamr = read_iamr() & ~((u64)(0x3ul) << pkeyshift(pkey));
> > +
> > +	write_iamr(old_iamr | new_iamr_bits);
> 
> Do we need to check for reserved keys here?
> 

ditto..

> > +}
> > +
> > +static void pkey_status_change(int pkey, bool enable)
> > +{
> > +	u64 old_uamor;
> > +
> > +	/* reset the AMR and IAMR bits for this key */
> > +	init_amr(pkey, 0x0);
> > +	init_iamr(pkey, 0x0);
> > +
> > +	/* enable/disable key */
> > +	old_uamor = read_uamor();
> > +	if (enable)
> > +		old_uamor |= (0x3ul << pkeyshift(pkey));
> > +	else
> > +		old_uamor &= ~(0x3ul << pkeyshift(pkey));
> > +	write_uamor(old_uamor);
> > +}
> > +
> > +void __arch_activate_pkey(int pkey)
> > +{
> > +	pkey_status_change(pkey, true);
> > +}
> > +
> > +void __arch_deactivate_pkey(int pkey)
> > +{
> > +	pkey_status_change(pkey, false);
> > +}
> 
> 
> Balbir Singh.

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 10/25] powerpc: store and restore the pkey state across context switches
  2017-10-18  3:49   ` Balbir Singh
@ 2017-10-18 20:47     ` Ram Pai
  2017-10-18 23:00       ` Balbir Singh
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-18 20:47 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Oct 18, 2017 at 02:49:14PM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:58 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > Store and restore the AMR, IAMR and UAMOR register state of the task
> > before scheduling out and after scheduling in, respectively.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/pkeys.h     |    4 +++
> >  arch/powerpc/include/asm/processor.h |    5 ++++
> >  arch/powerpc/kernel/process.c        |   10 ++++++++
> >  arch/powerpc/mm/pkeys.c              |   39 ++++++++++++++++++++++++++++++++++
> >  4 files changed, 58 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index 7fd48a4..78c5362 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -143,5 +143,9 @@ static inline void pkey_mm_init(struct mm_struct *mm)
> >  	mm_pkey_allocation_map(mm) = initial_allocation_mask;
> >  }
> >  
> > +extern void thread_pkey_regs_save(struct thread_struct *thread);
> > +extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
> > +			struct thread_struct *old_thread);
> > +extern void thread_pkey_regs_init(struct thread_struct *thread);
> >  extern void pkey_initialize(void);
> >  #endif /*_ASM_PPC64_PKEYS_H */
> > diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> > index fab7ff8..de9d9ba 100644
> > --- a/arch/powerpc/include/asm/processor.h
> > +++ b/arch/powerpc/include/asm/processor.h
> > @@ -309,6 +309,11 @@ struct thread_struct {
> >  	struct thread_vr_state ckvr_state; /* Checkpointed VR state */
> >  	unsigned long	ckvrsave; /* Checkpointed VRSAVE */
> >  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	unsigned long	amr;
> > +	unsigned long	iamr;
> > +	unsigned long	uamor;
> > +#endif
> >  #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
> >  	void*		kvm_shadow_vcpu; /* KVM internal data */
> >  #endif /* CONFIG_KVM_BOOK3S_32_HANDLER */
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index a0c74bb..ba80002 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -42,6 +42,7 @@
> >  #include <linux/hw_breakpoint.h>
> >  #include <linux/uaccess.h>
> >  #include <linux/elf-randomize.h>
> > +#include <linux/pkeys.h>
> >  
> >  #include <asm/pgtable.h>
> >  #include <asm/io.h>
> > @@ -1085,6 +1086,9 @@ static inline void save_sprs(struct thread_struct *t)
> >  		t->tar = mfspr(SPRN_TAR);
> >  	}
> >  #endif
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	thread_pkey_regs_save(t);
> > +#endif
> 
> Just define two variants of thread_pkey_regs_save() based on
> CONFIG_PPC64_MEMORY_PROTECTION_KEYS and remove the #ifdefs from
> process.c. Ditto for the lines below.

ok.

> 
> >  }
> >  
> >  static inline void restore_sprs(struct thread_struct *old_thread,
> > @@ -1120,6 +1124,9 @@ static inline void restore_sprs(struct thread_struct *old_thread,
> >  			mtspr(SPRN_TAR, new_thread->tar);
> >  	}
> >  #endif
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	thread_pkey_regs_restore(new_thread, old_thread);
> > +#endif

ok.

> >  }
> >  
> >  #ifdef CONFIG_PPC_BOOK3S_64
> > @@ -1705,6 +1712,9 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
> >  	current->thread.tm_tfiar = 0;
> >  	current->thread.load_tm = 0;
> >  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	thread_pkey_regs_init(&current->thread);
> > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> >  }
> >  EXPORT_SYMBOL(start_thread);
> >  
> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > index 2282864..7cd1be4 100644
> > --- a/arch/powerpc/mm/pkeys.c
> > +++ b/arch/powerpc/mm/pkeys.c
> > @@ -149,3 +149,42 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> >  	init_amr(pkey, new_amr_bits);
> >  	return 0;
> >  }
> > +
> > +void thread_pkey_regs_save(struct thread_struct *thread)
> > +{
> > +	if (!pkey_inited)
> > +		return;
> > +
> > +	/* @TODO skip saving any registers if the thread
> > +	 * has not used any keys yet.
> > +	 */
> 
> Comment style is broken

ok. this time I will fix them. It slips past my radar screen because
checkpatch.pl does not complain.
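
For the record, the difference -- the flagged wing style versus the
preferred multi-line form:

	/* the wing style used in the hunks above,
	 * which checkpatch apparently lets slide
	 */

	/*
	 * The preferred kernel style: the opening line
	 * carries no text, and each line starts with " *".
	 */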

> 
> > +
> > +	thread->amr = read_amr();
> > +	thread->iamr = read_iamr();
> > +	thread->uamor = read_uamor();
> > +}
> > +
> > +void thread_pkey_regs_restore(struct thread_struct *new_thread,
> > +			struct thread_struct *old_thread)
> > +{
> > +	if (!pkey_inited)
> > +		return;
> > +
> > +	/* @TODO just reset uamor to zero if the new_thread
> > +	 * has not used any keys yet.
> > +	 */
> 
> Comment style is broken.
> 
> > +
> > +	if (old_thread->amr != new_thread->amr)
> > +		write_amr(new_thread->amr);
> > +	if (old_thread->iamr != new_thread->iamr)
> > +		write_iamr(new_thread->iamr);
> > +	if (old_thread->uamor != new_thread->uamor)
> > +		write_uamor(new_thread->uamor);
> 
> Is this order correct? Ideally, You want to write the uamor first
> but since we are in supervisor state, I think we can get away
> with this order. 

we could be in hypervisor state too, as is the case when we run
a powernv kernel.

But.. does it matter in which order they are written? If
the thread is in the kernel, it cannot execute any instructions
in userspace, so it won't see an intermediate state. Right?
Or am I getting this wrong?

> Do we want to expose the uamor to user space
> for it to modify the AMR directly?

sorry, I did not understand the comment. UAMOR cannot
be accessed from userspace, and there are no system calls
currently to help userspace program the UAMOR on its
behalf.

> 
> > +}
> > +
> > +void thread_pkey_regs_init(struct thread_struct *thread)
> > +{
> > +	write_amr(0x0ul);
> > +	write_iamr(0x0ul);
> > +	write_uamor(0x0ul);
> 
> This is not correct, reserved keys should not be set to 0's

ok. makes sense. best to not touch reserved key bits here.
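
Roughly what I have in mind -- a sketch only, where pkey_reserved_mask is a
made-up name covering the AMR/IAMR/UAMOR bits of the reserved keys:

void thread_pkey_regs_init(struct thread_struct *thread)
{
	if (!pkey_inited)
		return;

	/*
	 * Zero only the bits for keys we hand out; bits belonging
	 * to reserved keys are preserved as firmware left them.
	 */
	write_amr(read_amr() & pkey_reserved_mask);
	write_iamr(read_iamr() & pkey_reserved_mask);
	write_uamor(read_uamor() & pkey_reserved_mask);
}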

> 
> Balbir Singh.

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 11/25] powerpc: introduce execute-only pkey
  2017-10-18  4:15   ` Balbir Singh
@ 2017-10-18 20:57     ` Ram Pai
  2017-10-18 23:02       ` Balbir Singh
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-18 20:57 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Oct 18, 2017 at 03:15:22PM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:44:59 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > This patch provides the implementation of execute-only pkey.
> > The architecture-independent layer expects the arch-dependent
> > layer, to support the ability to create and enable a special
> > key which has execute-only permission.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/book3s/64/mmu.h |    1 +
> >  arch/powerpc/include/asm/pkeys.h         |    9 ++++-
> >  arch/powerpc/mm/pkeys.c                  |   57 ++++++++++++++++++++++++++++++
> >  3 files changed, 66 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> > index 55950f4..ee18ba0 100644
> > --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> > +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> > @@ -115,6 +115,7 @@ struct patb_entry {
> >  	 * bit unset -> key available for allocation
> >  	 */
> >  	u32 pkey_allocation_map;
> > +	s16 execute_only_pkey; /* key holding execute-only protection */
> >  #endif
> >  } mm_context_t;
> >  
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index 78c5362..0cf115f 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -115,11 +115,16 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> >   * Try to dedicate one of the protection keys to be used as an
> >   * execute-only protection key.
> >   */
> > +extern int __execute_only_pkey(struct mm_struct *mm);
> >  static inline int execute_only_pkey(struct mm_struct *mm)
> >  {
> > -	return 0;
> > +	if (!pkey_inited || !pkey_execute_disable_support)
> > +		return -1;
> > +
> > +	return __execute_only_pkey(mm);
> >  }
> >  
> > +
> >  static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> >  		int prot, int pkey)
> >  {
> > @@ -141,6 +146,8 @@ static inline void pkey_mm_init(struct mm_struct *mm)
> >  	if (!pkey_inited)
> >  		return;
> >  	mm_pkey_allocation_map(mm) = initial_allocation_mask;
> > +	/* -1 means unallocated or invalid */
> > +	mm->context.execute_only_pkey = -1;
> >  }
> >  
> >  extern void thread_pkey_regs_save(struct thread_struct *thread);
> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > index 7cd1be4..8a24983 100644
> > --- a/arch/powerpc/mm/pkeys.c
> > +++ b/arch/powerpc/mm/pkeys.c
> > @@ -188,3 +188,60 @@ void thread_pkey_regs_init(struct thread_struct *thread)
> >  	write_iamr(0x0ul);
> >  	write_uamor(0x0ul);
> >  }
> > +
> > +static inline bool pkey_allows_readwrite(int pkey)
> > +{
> > +	int pkey_shift = pkeyshift(pkey);
> > +
> > +	if (!(read_uamor() & (0x3UL << pkey_shift)))
> > +		return true;
> 
> If uamor for key 0 is 0x10 for example or 0x01 it's a bug.
> The above check might miss it.


The spec says both of the bits corresponding to a key are either set or
reset; they cannot be anything else.

cut-n-paste from the ISA...
----------------------------------------------------
Software must ensure that both bits of each even/odd
bit pair of the AMOR contain the same value. -- i.e.,
the contents of register RS for mtspr specifying the
AMOR must be such that (RS)_{2n} = (RS)_{2n+1} for every
n in the range 0:31 -- and likewise for the UAMOR.
---------------------------------------------------------

> 
> > +
> > +	return !(read_amr() & ((AMR_RD_BIT|AMR_WR_BIT) << pkey_shift));
> > +}
> > +
> > +int __execute_only_pkey(struct mm_struct *mm)
> > +{
> > +	bool need_to_set_mm_pkey = false;
> > +	int execute_only_pkey = mm->context.execute_only_pkey;
> > +	int ret;
> > +
> > +	/* Do we need to assign a pkey for mm's execute-only maps? */
> > +	if (execute_only_pkey == -1) {
> > +		/* Go allocate one to use, which might fail */
> > +		execute_only_pkey = mm_pkey_alloc(mm);
> > +		if (execute_only_pkey < 0)
> > +			return -1;
> > +		need_to_set_mm_pkey = true;
> > +	}
> > +
> > +	/*
> > +	 * We do not want to go through the relatively costly
> > +	 * dance to set AMR if we do not need to.  Check it
> > +	 * first and assume that if the execute-only pkey is
> > +	 * readwrite-disabled than we do not have to set it
> > +	 * ourselves.
> > +	 */
> > +	if (!need_to_set_mm_pkey &&
> > +	    !pkey_allows_readwrite(execute_only_pkey))
> > +		return execute_only_pkey;
> > +
> > +	/*
> > +	 * Set up AMR so that it denies access for everything
> > +	 * other than execution.
> > +	 */
> > +	ret = __arch_set_user_pkey_access(current, execute_only_pkey,
> > +			(PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
> > +	/*
> > +	 * If the AMR-set operation failed somehow, just return
> > +	 * 0 and effectively disable execute-only support.
> > +	 */
> > +	if (ret) {
> > +		mm_set_pkey_free(mm, execute_only_pkey);
> > +		return -1;
> > +	}
> > +
> > +	/* We got one, store it and use it from here on out */
> > +	if (need_to_set_mm_pkey)
> > +		mm->context.execute_only_pkey = execute_only_pkey;
> > +	return execute_only_pkey;
> > +}
> 
> Looks good otherwise
> 
> Acked-by: Balbir Singh <bsingharora@gmail.com>

thanks.

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 12/25] powerpc: ability to associate pkey to a vma
  2017-10-18  4:27   ` Balbir Singh
@ 2017-10-18 21:01     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-18 21:01 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Oct 18, 2017 at 03:27:33PM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:45:00 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > arch-independent code expects the arch to  map
> > a  pkey  into the vma's protection bit setting.
> > The patch provides that ability.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/mman.h  |    8 +++++++-
> >  arch/powerpc/include/asm/pkeys.h |   18 ++++++++++++++++++
> >  2 files changed, 25 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
> > index 30922f6..067eec2 100644
> > --- a/arch/powerpc/include/asm/mman.h
> > +++ b/arch/powerpc/include/asm/mman.h
> > @@ -13,6 +13,7 @@
> >  
> >  #include <asm/cputable.h>
> >  #include <linux/mm.h>
> > +#include <linux/pkeys.h>
> >  #include <asm/cpu_has_feature.h>
> >  
> >  /*
> > @@ -22,7 +23,12 @@
> >  static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> >  		unsigned long pkey)
> >  {
> > -	return (prot & PROT_SAO) ? VM_SAO : 0;
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	return (((prot & PROT_SAO) ? VM_SAO : 0) |
> > +			pkey_to_vmflag_bits(pkey));
> > +#else
> > +	return ((prot & PROT_SAO) ? VM_SAO : 0);
> > +#endif
> >  }
> >  #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> >  
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index 0cf115f..f13e913 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -23,6 +23,24 @@
> >  #define VM_PKEY_BIT4	VM_HIGH_ARCH_4
> >  #endif
> >  
> > +/* override any generic PKEY Permission defines */
> > +#define PKEY_DISABLE_EXECUTE   0x4
> > +#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
> > +				PKEY_DISABLE_WRITE  |\
> > +				PKEY_DISABLE_EXECUTE)
> > +
> > +static inline u64 pkey_to_vmflag_bits(u16 pkey)
> > +{
> > +	if (!pkey_inited)
> > +		return 0x0UL;
> > +
> > +	return (((pkey & 0x1UL) ? VM_PKEY_BIT0 : 0x0UL) |
> > +		((pkey & 0x2UL) ? VM_PKEY_BIT1 : 0x0UL) |
> > +		((pkey & 0x4UL) ? VM_PKEY_BIT2 : 0x0UL) |
> > +		((pkey & 0x8UL) ? VM_PKEY_BIT3 : 0x0UL) |
> > +		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
> > +}
> 
> Assuming that there is a linear order between VM_PKEY_BIT4 to
> VM_PKEY_BIT0, the conditional checks can be removed
> 
> (pkey & 0x1fUL) << VM_PKEY_BIT0?

yes. currently they are linear. But I am afraid it will break without
notice someday if someone decides to change the values of VM_PKEY_BITx to
be non-contiguous. I can put in a BUILD_BUG_ON() assertion, I suppose, but
thought this would be safer.
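
To make the guarded shortcut concrete (a sketch; VM_PKEY_SHIFT is the bit
position of VM_PKEY_BIT0, as used elsewhere in the series):

static inline u64 pkey_to_vmflag_bits(u16 pkey)
{
	/* a spot check that the VM_PKEY_BITx values stayed contiguous */
	BUILD_BUG_ON(VM_PKEY_BIT4 != (VM_PKEY_BIT0 << 4));

	if (!pkey_inited)
		return 0x0UL;

	return (u64)(pkey & 0x1fUL) << VM_PKEY_SHIFT;
}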

RP

> 
> 
> Balbir Singh

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 13/25] powerpc: implementation for arch_override_mprotect_pkey()
  2017-10-18  4:36   ` Balbir Singh
@ 2017-10-18 21:10     ` Ram Pai
  2017-10-18 23:04       ` Balbir Singh
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-18 21:10 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Oct 18, 2017 at 03:36:35PM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:45:01 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > arch independent code calls arch_override_mprotect_pkey()
> > to return a pkey that best matches the requested protection.
> > 
> > This patch provides the implementation.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/mmu_context.h |    5 +++
> >  arch/powerpc/include/asm/pkeys.h       |   17 ++++++++++-
> >  arch/powerpc/mm/pkeys.c                |   47 ++++++++++++++++++++++++++++++++
> >  3 files changed, 67 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> > index c705a5d..8e5a87e 100644
> > --- a/arch/powerpc/include/asm/mmu_context.h
> > +++ b/arch/powerpc/include/asm/mmu_context.h
> > @@ -145,6 +145,11 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
> >  #ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> >  #define pkey_initialize()
> >  #define pkey_mm_init(mm)
> > +
> > +static inline int vma_pkey(struct vm_area_struct *vma)
> > +{
> > +	return 0;
> > +}
> >  #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> >  
> >  #endif /* __KERNEL__ */
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index f13e913..d2fffef 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -41,6 +41,16 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
> >  		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
> >  }
> >  
> > +#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
> > +				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> > +
> > +static inline int vma_pkey(struct vm_area_struct *vma)
> > +{
> > +	if (!pkey_inited)
> > +		return 0;
> 
> We don't want pkey_inited to be present in all functions, why do we need
> a conditional branch for all functions. Even if we do, it should be a jump
> label. I would rather we just removed !pkey_inited unless really really
> required.

No. we really really need it.  For example, when we build a kernel with
the PROTECTION_KEYS config enabled and run that kernel on an older processor,
or on a system where the key feature is not enabled in the device tree,
we have to fail all the calls that get called in by the arch-neutral code.

Hence we need this check.

BTW: jump labels are awkward IMHO, unless absolutely needed.

> 
> > +	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
> > +}
> > +
> >  #define arch_max_pkey()  pkeys_total
> >  #define AMR_RD_BIT 0x1UL
> >  #define AMR_WR_BIT 0x2UL
> > @@ -142,11 +152,14 @@ static inline int execute_only_pkey(struct mm_struct *mm)
> >  	return __execute_only_pkey(mm);
> >  }
> >  
> > -
> > +extern int __arch_override_mprotect_pkey(struct vm_area_struct *vma,
> > +		int prot, int pkey);
> >  static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> >  		int prot, int pkey)
> >  {
> > -	return 0;
> > +	if (!pkey_inited)
> > +		return 0;
> > +	return __arch_override_mprotect_pkey(vma, prot, pkey);
> >  }
> >  
> >  extern int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > index 8a24983..fb1a76a 100644
> > --- a/arch/powerpc/mm/pkeys.c
> > +++ b/arch/powerpc/mm/pkeys.c
> > @@ -245,3 +245,50 @@ int __execute_only_pkey(struct mm_struct *mm)
> >  		mm->context.execute_only_pkey = execute_only_pkey;
> >  	return execute_only_pkey;
> >  }
> > +
> > +static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
> > +{
> > +	/* Do this check first since the vm_flags should be hot */
> > +	if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
> > +		return false;
> > +
> > +	return (vma_pkey(vma) == vma->vm_mm->context.execute_only_pkey);
> > +}
> > +
> > +/*
> > + * This should only be called for *plain* mprotect calls.
> 
> What's a plain mprotect call?

there is sys_mprotect() and now there is a sys_pkey_mprotect() call.
The 'plain' one is the former.
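
From the userspace side the distinction looks like this (illustrative only,
assuming the glibc wrappers):

	mprotect(addr, len, PROT_READ);             /* "plain": arch hook sees pkey == -1 */
	pkey_mprotect(addr, len, PROT_READ, pkey);  /* explicit key from pkey_alloc()     */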

> 
> > + */
> > +int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
> > +		int pkey)
> > +{
> > +	/*
> > +	 * Is this an mprotect_pkey() call?  If so, never
> > +	 * override the value that came from the user.
> > +	 */
> > +	if (pkey != -1)
> > +		return pkey;
> 
> If the user specified a key, we always use it? Presumably the user
> got it from pkey_alloc(), in other cases, the user was lazy and used
> -1 in the mprotect call?

in the plain sys_mprotect() no key is specified. In that case this
function gets called with a pkey of -1.

> 
> > +
> > +	/*
> > +	 * If the currently associated pkey is execute-only,
> > +	 * but the requested protection requires read or write,
> > +	 * move it back to the default pkey.
> > +	 */
> > +	if (vma_is_pkey_exec_only(vma) &&
> > +	    (prot & (PROT_READ|PROT_WRITE)))
> > +		return 0;
> > +
> > +	/*
> > +	 * the requested protection is execute-only. Hence
> > +	 * lets use a execute-only pkey.
> > +	 */
> > +	if (prot == PROT_EXEC) {
> > +		pkey = execute_only_pkey(vma->vm_mm);
> > +		if (pkey > 0)
> > +			return pkey;
> > +	}
> > +
> > +	/*
> > +	 * nothing to override.
> > +	 */
> > +	return vma_pkey(vma);
> > +}
> 
> Balbir Singh.

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 14/25] powerpc: map vma key-protection bits to pte key bits.
  2017-10-18  4:39   ` Balbir Singh
@ 2017-10-18 21:14     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-18 21:14 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Oct 18, 2017 at 03:39:11PM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:45:02 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > map  the  key  protection  bits of the vma to the pkey bits in
> > the PTE.
> > 
> > The Pte  bits used  for pkey  are  3,4,5,6  and 57. The  first
> > four bits are the same four bits that were freed up  initially
> > in this patch series. remember? :-) Without those four bits
> > this patch wouldn't be possible.
> > 
> > BUT, on a 4K kernel, bits 3 and 4 could not be freed up. remember?
> > Hence we have to be satisfied with 5, 6 and 57.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/book3s/64/pgtable.h |   25 ++++++++++++++++++++++++-
> >  arch/powerpc/include/asm/mman.h              |    8 ++++++++
> >  arch/powerpc/include/asm/pkeys.h             |   12 ++++++++++++
> >  3 files changed, 44 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > index 73ed52c..5935d4e 100644
> > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > @@ -38,6 +38,7 @@
> >  #define _RPAGE_RSV2		0x0800000000000000UL
> >  #define _RPAGE_RSV3		0x0400000000000000UL
> >  #define _RPAGE_RSV4		0x0200000000000000UL
> > +#define _RPAGE_RSV5		0x00040UL
> >  
> >  #define _PAGE_PTE		0x4000000000000000UL	/* distinguishes PTEs from pointers */
> >  #define _PAGE_PRESENT		0x8000000000000000UL	/* pte contains a translation */
> > @@ -57,6 +58,25 @@
> >  /* Max physical address bit as per radix table */
> >  #define _RPAGE_PA_MAX		57
> >  
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +#ifdef CONFIG_PPC_64K_PAGES
> > +#define H_PAGE_PKEY_BIT0	_RPAGE_RSV1
> > +#define H_PAGE_PKEY_BIT1	_RPAGE_RSV2
> > +#else /* CONFIG_PPC_64K_PAGES */
> > +#define H_PAGE_PKEY_BIT0	0 /* _RPAGE_RSV1 is not available */
> > +#define H_PAGE_PKEY_BIT1	0 /* _RPAGE_RSV2 is not available */
> > +#endif /* CONFIG_PPC_64K_PAGES */
> > +#define H_PAGE_PKEY_BIT2	_RPAGE_RSV3
> > +#define H_PAGE_PKEY_BIT3	_RPAGE_RSV4
> > +#define H_PAGE_PKEY_BIT4	_RPAGE_RSV5
> > +#else /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> > +#define H_PAGE_PKEY_BIT0	0
> > +#define H_PAGE_PKEY_BIT1	0
> > +#define H_PAGE_PKEY_BIT2	0
> > +#define H_PAGE_PKEY_BIT3	0
> > +#define H_PAGE_PKEY_BIT4	0
> > +#endif /*  CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> 
> H_PTE_PKEY_BITX?

ok. makes sense. will do.

> 
> > +
> >  /*
> >   * Max physical address bit we will use for now.
> >   *
> > @@ -120,13 +140,16 @@
> >  #define _PAGE_CHG_MASK	(PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
> >  			 _PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE |	\
> >  			 _PAGE_SOFT_DIRTY)
> > +
> > +#define H_PAGE_PKEY  (H_PAGE_PKEY_BIT0 | H_PAGE_PKEY_BIT1 | H_PAGE_PKEY_BIT2 | \
> > +			H_PAGE_PKEY_BIT3 | H_PAGE_PKEY_BIT4)
> >  /*
> >   * Mask of bits returned by pte_pgprot()
> >   */
> >  #define PAGE_PROT_BITS  (_PAGE_SAO | _PAGE_NON_IDEMPOTENT | _PAGE_TOLERANT | \
> >  			 H_PAGE_4K_PFN | _PAGE_PRIVILEGED | _PAGE_ACCESSED | \
> >  			 _PAGE_READ | _PAGE_WRITE |  _PAGE_DIRTY | _PAGE_EXEC | \
> > -			 _PAGE_SOFT_DIRTY)
> > +			 _PAGE_SOFT_DIRTY | H_PAGE_PKEY)
> >  /*
> >   * We define 2 sets of base prot bits, one for basic pages (ie,
> >   * cacheable kernel and user pages) and one for non cacheable
> > diff --git a/arch/powerpc/include/asm/mman.h b/arch/powerpc/include/asm/mman.h
> > index 067eec2..3f7220f 100644
> > --- a/arch/powerpc/include/asm/mman.h
> > +++ b/arch/powerpc/include/asm/mman.h
> > @@ -32,12 +32,20 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
> >  }
> >  #define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
> >  
> > +
> >  static inline pgprot_t arch_vm_get_page_prot(unsigned long vm_flags)
> >  {
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	return (vm_flags & VM_SAO) ?
> > +		__pgprot(_PAGE_SAO | vmflag_to_page_pkey_bits(vm_flags)) :
> > +		__pgprot(0 | vmflag_to_page_pkey_bits(vm_flags));
> > +#else
> >  	return (vm_flags & VM_SAO) ? __pgprot(_PAGE_SAO) : __pgprot(0);
> > +#endif
> >  }
> >  #define arch_vm_get_page_prot(vm_flags) arch_vm_get_page_prot(vm_flags)
> >  
> > +
> >  static inline bool arch_validate_prot(unsigned long prot)
> >  {
> >  	if (prot & ~(PROT_READ | PROT_WRITE | PROT_EXEC | PROT_SEM | PROT_SAO))
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index d2fffef..0d2488a 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -41,6 +41,18 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
> >  		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
> >  }
> >  
> > +static inline u64 vmflag_to_page_pkey_bits(u64 vm_flags)
> 
> vmflag_to_pte_pkey_bits?

ok. if you insist :). will do.

> 
> > +{
> > +	if (!pkey_inited)
> > +		return 0x0UL;
> > +
> > +	return (((vm_flags & VM_PKEY_BIT0) ? H_PAGE_PKEY_BIT4 : 0x0UL) |
> > +		((vm_flags & VM_PKEY_BIT1) ? H_PAGE_PKEY_BIT3 : 0x0UL) |
> > +		((vm_flags & VM_PKEY_BIT2) ? H_PAGE_PKEY_BIT2 : 0x0UL) |
> > +		((vm_flags & VM_PKEY_BIT3) ? H_PAGE_PKEY_BIT1 : 0x0UL) |
> > +		((vm_flags & VM_PKEY_BIT4) ? H_PAGE_PKEY_BIT0 : 0x0UL));
> > +}
> > +
> >  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
> >  				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> >  
> 
> Balbir Singh.

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 17/25] powerpc: helper to validate key-access permissions of a pte
  2017-10-18  4:48   ` Balbir Singh
@ 2017-10-18 21:19     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-18 21:19 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, Oct 18, 2017 at 03:48:31PM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:45:05 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > helper function that checks if the read/write/execute is allowed
> > on the pte.
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/book3s/64/pgtable.h |    4 +++
> >  arch/powerpc/include/asm/pkeys.h             |   12 +++++++++++
> >  arch/powerpc/mm/pkeys.c                      |   28 ++++++++++++++++++++++++++
> >  3 files changed, 44 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > index 5935d4e..bd244b3 100644
> > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> > @@ -492,6 +492,10 @@ static inline void write_uamor(u64 value)
> >  	mtspr(SPRN_UAMOR, value);
> >  }
> >  
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +extern bool arch_pte_access_permitted(u64 pte, bool write, bool execute);
> > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> > +
> >  #define __HAVE_ARCH_PTEP_GET_AND_CLEAR
> >  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
> >  				       unsigned long addr, pte_t *ptep)
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index cd3924c..50522a0 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -80,6 +80,18 @@ static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)
> >  		((pteflags & H_PAGE_PKEY_BIT4) ? HPTE_R_KEY_BIT4 : 0x0UL));
> >  }
> >  
> > +static inline u16 pte_to_pkey_bits(u64 pteflags)
> > +{
> > +	if (!pkey_inited)
> > +		return 0x0UL;
> > +
> > +	return (((pteflags & H_PAGE_PKEY_BIT0) ? 0x10 : 0x0UL) |
> > +		((pteflags & H_PAGE_PKEY_BIT1) ? 0x8 : 0x0UL) |
> > +		((pteflags & H_PAGE_PKEY_BIT2) ? 0x4 : 0x0UL) |
> > +		((pteflags & H_PAGE_PKEY_BIT3) ? 0x2 : 0x0UL) |
> > +		((pteflags & H_PAGE_PKEY_BIT4) ? 0x1 : 0x0UL));
> > +}
> > +
> >  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
> >  				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> >  #define AMR_BITS_PER_PKEY 2
> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > index fb1a76a..24589d9 100644
> > --- a/arch/powerpc/mm/pkeys.c
> > +++ b/arch/powerpc/mm/pkeys.c
> > @@ -292,3 +292,31 @@ int __arch_override_mprotect_pkey(struct vm_area_struct *vma, int prot,
> >  	 */
> >  	return vma_pkey(vma);
> >  }
> > +
> > +static bool pkey_access_permitted(int pkey, bool write, bool execute)
> > +{
> > +	int pkey_shift;
> > +	u64 amr;
> > +
> > +	if (!pkey)
> > +		return true;
> 
> Why would we have pkey set to 0, it's reserved. Why do we return true?

pkey 0 is reserved in some weird sense. It is the default key: it is
omnipresent, cannot be allocated or freed, but can be used at any time and
allows read/write/execute at all times.
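
In register terms the invariant amounts to (a sketch, reusing pkeyshift()
from this series):

	((read_amr()   >> pkeyshift(0)) & 0x3UL) == 0x0UL  /* key 0 never access-disabled */
	((read_uamor() >> pkeyshift(0)) & 0x3UL) == 0x0UL  /* key 0 never user-modifiable */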

> 
> > +
> > +	pkey_shift = pkeyshift(pkey);
> > +	if (!(read_uamor() & (0x3UL << pkey_shift)))
> > +		return true;
> > +
> > +	if (execute && !(read_iamr() & (IAMR_EX_BIT << pkey_shift)))
> > +		return true;
> > +
> > +	amr = read_amr(); /* delay reading amr uptil absolutely needed*/
> > +	return ((!write && !(amr & (AMR_RD_BIT << pkey_shift))) ||
> > +		(write &&  !(amr & (AMR_WR_BIT << pkey_shift))));
> > +}
> > +
> > +bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
> > +{
> > +	if (!pkey_inited)
> > +		return true;
> 
> Again, don't like the pkey_inited bits :)

suggest something. :) running out of ideas ;)

> 
> > +	return pkey_access_permitted(pte_to_pkey_bits(pte),
> > +			write, execute);
> > +}
> 
> Balbir Singh

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 18/25] powerpc: check key protection for user page access
  2017-10-18 19:57   ` Balbir Singh
@ 2017-10-18 21:29     ` Ram Pai
  2017-10-18 23:08       ` Balbir Singh
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-18 21:29 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Oct 19, 2017 at 06:57:32AM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:45:06 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > Make sure that the kernel does not access user pages without
> > checking their key-protection.
> >
> 
> Why? This makes the routines AMR/thread specific? Looks like
> x86 does this as well

Yes. The memkey semantics implemented by x86 assume that the keys and
their access-permissions are per thread.  In other words, a key which is
enabled in the context of one thread will not be enabled in the context
of another thread.
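
Illustrated from userspace (a hedged sketch; pkey_set() here is the glibc
wrapper over the AMR update):

	/* thread A */
	int pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS);
	pkey_mprotect(buf, len, PROT_READ | PROT_WRITE, pkey);
	pkey_set(pkey, 0);	/* opens access in thread A's AMR only */

	/*
	 * thread B touching buf still takes a key fault:
	 * nothing ever updated B's AMR.
	 */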

> but these routines are used by GUP from
> the kernel.

See a problem?

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 10/25] powerpc: store and restore the pkey state across context switches
  2017-10-18 20:47     ` Ram Pai
@ 2017-10-18 23:00       ` Balbir Singh
  2017-10-19  0:52         ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18 23:00 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, 18 Oct 2017 13:47:05 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> On Wed, Oct 18, 2017 at 02:49:14PM +1100, Balbir Singh wrote:
> > On Fri,  8 Sep 2017 15:44:58 -0700
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >   
> > > Store and restore the AMR, IAMR and UAMOR register state of the task
> > > before scheduling out and after scheduling in, respectively.
> > > 
> > > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > > ---
> > >  arch/powerpc/include/asm/pkeys.h     |    4 +++
> > >  arch/powerpc/include/asm/processor.h |    5 ++++
> > >  arch/powerpc/kernel/process.c        |   10 ++++++++
> > >  arch/powerpc/mm/pkeys.c              |   39 ++++++++++++++++++++++++++++++++++
> > >  4 files changed, 58 insertions(+), 0 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > > index 7fd48a4..78c5362 100644
> > > --- a/arch/powerpc/include/asm/pkeys.h
> > > +++ b/arch/powerpc/include/asm/pkeys.h
> > > @@ -143,5 +143,9 @@ static inline void pkey_mm_init(struct mm_struct *mm)
> > >  	mm_pkey_allocation_map(mm) = initial_allocation_mask;
> > >  }
> > >  
> > > +extern void thread_pkey_regs_save(struct thread_struct *thread);
> > > +extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
> > > +			struct thread_struct *old_thread);
> > > +extern void thread_pkey_regs_init(struct thread_struct *thread);
> > >  extern void pkey_initialize(void);
> > >  #endif /*_ASM_PPC64_PKEYS_H */
> > > diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> > > index fab7ff8..de9d9ba 100644
> > > --- a/arch/powerpc/include/asm/processor.h
> > > +++ b/arch/powerpc/include/asm/processor.h
> > > @@ -309,6 +309,11 @@ struct thread_struct {
> > >  	struct thread_vr_state ckvr_state; /* Checkpointed VR state */
> > >  	unsigned long	ckvrsave; /* Checkpointed VRSAVE */
> > >  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> > > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > > +	unsigned long	amr;
> > > +	unsigned long	iamr;
> > > +	unsigned long	uamor;
> > > +#endif
> > >  #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
> > >  	void*		kvm_shadow_vcpu; /* KVM internal data */
> > >  #endif /* CONFIG_KVM_BOOK3S_32_HANDLER */
> > > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > > index a0c74bb..ba80002 100644
> > > --- a/arch/powerpc/kernel/process.c
> > > +++ b/arch/powerpc/kernel/process.c
> > > @@ -42,6 +42,7 @@
> > >  #include <linux/hw_breakpoint.h>
> > >  #include <linux/uaccess.h>
> > >  #include <linux/elf-randomize.h>
> > > +#include <linux/pkeys.h>
> > >  
> > >  #include <asm/pgtable.h>
> > >  #include <asm/io.h>
> > > @@ -1085,6 +1086,9 @@ static inline void save_sprs(struct thread_struct *t)
> > >  		t->tar = mfspr(SPRN_TAR);
> > >  	}
> > >  #endif
> > > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > > +	thread_pkey_regs_save(t);
> > > +#endif  
> > 
> > Just define two variants of thread_pkey_regs_save() based on
> > CONFIG_PPC64_MEMORY_PROTECTION_KEYS and remove the #ifdefs from process.c
> > Ditto for the lines below  
> 
> ok.
> 
> >   
> > >  }
> > >  
> > >  static inline void restore_sprs(struct thread_struct *old_thread,
> > > @@ -1120,6 +1124,9 @@ static inline void restore_sprs(struct thread_struct *old_thread,
> > >  			mtspr(SPRN_TAR, new_thread->tar);
> > >  	}
> > >  #endif
> > > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > > +	thread_pkey_regs_restore(new_thread, old_thread);
> > > +#endif  
> 
> ok.
> 
> > >  }
> > >  
> > >  #ifdef CONFIG_PPC_BOOK3S_64
> > > @@ -1705,6 +1712,9 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
> > >  	current->thread.tm_tfiar = 0;
> > >  	current->thread.load_tm = 0;
> > >  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> > > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > > +	thread_pkey_regs_init(&current->thread);
> > > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> > >  }
> > >  EXPORT_SYMBOL(start_thread);
> > >  
> > > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > > index 2282864..7cd1be4 100644
> > > --- a/arch/powerpc/mm/pkeys.c
> > > +++ b/arch/powerpc/mm/pkeys.c
> > > @@ -149,3 +149,42 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> > >  	init_amr(pkey, new_amr_bits);
> > >  	return 0;
> > >  }
> > > +
> > > +void thread_pkey_regs_save(struct thread_struct *thread)
> > > +{
> > > +	if (!pkey_inited)
> > > +		return;
> > > +
> > > +	/* @TODO skip saving any registers if the thread
> > > +	 * has not used any keys yet.
> > > +	 */  
> > 
> > Comment style is broken  
> 
> > ok. this time I will fix them. It slips past my radar screen because
> > checkpatch.pl does not complain.
>

Yep, there is an lkml thread on this style of coding. It's
best avoided.

> >   
> > > +
> > > +	thread->amr = read_amr();
> > > +	thread->iamr = read_iamr();
> > > +	thread->uamor = read_uamor();
> > > +}
> > > +
> > > +void thread_pkey_regs_restore(struct thread_struct *new_thread,
> > > +			struct thread_struct *old_thread)
> > > +{
> > > +	if (!pkey_inited)
> > > +		return;
> > > +
> > > +	/* @TODO just reset uamor to zero if the new_thread
> > > +	 * has not used any keys yet.
> > > +	 */  
> > 
> > Comment style is broken.
> >   
> > > +
> > > +	if (old_thread->amr != new_thread->amr)
> > > +		write_amr(new_thread->amr);
> > > +	if (old_thread->iamr != new_thread->iamr)
> > > +		write_iamr(new_thread->iamr);
> > > +	if (old_thread->uamor != new_thread->uamor)
> > > +		write_uamor(new_thread->uamor);  
> > 
> > Is this order correct? Ideally, You want to write the uamor first
> > but since we are in supervisor state, I think we can get away
> > with this order.   
> 
> we could be in hypervisor state too, as is the case when we run
> a powernv kernel.
> 
> > But.. does it matter in which order they are written? If
> > the thread is in the kernel, it cannot execute any instructions
> > in userspace, so it won't see an intermediate state. Right?
> > Or am I getting this wrong?

You are right, since uamor + amor control what can and
cannot be set, there is a subtle dependency, but it does
not apply to the kernel doing the context switch. AMR has
two SPR values, 13 and 29. I presume we are using SPR #29
here?

> 
> > Do we want to expose the uamor to user space
> > for it to modify the AMR directly?  
> 
> > sorry, I did not understand the comment. UAMOR cannot
> > be accessed from userspace, and there are no system calls
> > currently to help userspace program the UAMOR on its
> > behalf.
> 

I was just wondering how two threads can share a key if
they decide to. They would need uamor to give them permissions
to the same set of keys and then reuse the key via
pkey_mprotect(.., pkey). I am missing the bit about how
the uamor values across these threads would be synchronized.


> >   
> > > +}
> > > +
> > > +void thread_pkey_regs_init(struct thread_struct *thread)
> > > +{
> > > +	write_amr(0x0ul);
> > > +	write_iamr(0x0ul);
> > > +	write_uamor(0x0ul);  
> > 
> > This is not correct, reserved keys should not be set to 0's  
> 
> ok. makes sense. best to not touch reserved key bits here.

Also this implies that at init time, the thread has access to all
keys, but it can't modify any of the keys in the AMR/IAMR.

> 
> > 
> > Balbir Singh.  
> 

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 11/25] powerpc: introduce execute-only pkey
  2017-10-18 20:57     ` Ram Pai
@ 2017-10-18 23:02       ` Balbir Singh
  2017-10-19 15:52         ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18 23:02 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, 18 Oct 2017 13:57:39 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> On Wed, Oct 18, 2017 at 03:15:22PM +1100, Balbir Singh wrote:
> > On Fri,  8 Sep 2017 15:44:59 -0700
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >   
> > > This patch provides the implementation of execute-only pkey.
> > > The architecture-independent layer expects the arch-dependent
> > > layer, to support the ability to create and enable a special
> > > key which has execute-only permission.
> > > 
> > > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > > ---
> > >  arch/powerpc/include/asm/book3s/64/mmu.h |    1 +
> > >  arch/powerpc/include/asm/pkeys.h         |    9 ++++-
> > >  arch/powerpc/mm/pkeys.c                  |   57 ++++++++++++++++++++++++++++++
> > >  3 files changed, 66 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> > > index 55950f4..ee18ba0 100644
> > > --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> > > +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> > > @@ -115,6 +115,7 @@ struct patb_entry {
> > >  	 * bit unset -> key available for allocation
> > >  	 */
> > >  	u32 pkey_allocation_map;
> > > +	s16 execute_only_pkey; /* key holding execute-only protection */
> > >  #endif
> > >  } mm_context_t;
> > >  
> > > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > > index 78c5362..0cf115f 100644
> > > --- a/arch/powerpc/include/asm/pkeys.h
> > > +++ b/arch/powerpc/include/asm/pkeys.h
> > > @@ -115,11 +115,16 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> > >   * Try to dedicate one of the protection keys to be used as an
> > >   * execute-only protection key.
> > >   */
> > > +extern int __execute_only_pkey(struct mm_struct *mm);
> > >  static inline int execute_only_pkey(struct mm_struct *mm)
> > >  {
> > > -	return 0;
> > > +	if (!pkey_inited || !pkey_execute_disable_support)
> > > +		return -1;
> > > +
> > > +	return __execute_only_pkey(mm);
> > >  }
> > >  
> > > +
> > >  static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> > >  		int prot, int pkey)
> > >  {
> > > @@ -141,6 +146,8 @@ static inline void pkey_mm_init(struct mm_struct *mm)
> > >  	if (!pkey_inited)
> > >  		return;
> > >  	mm_pkey_allocation_map(mm) = initial_allocation_mask;
> > > +	/* -1 means unallocated or invalid */
> > > +	mm->context.execute_only_pkey = -1;
> > >  }
> > >  
> > >  extern void thread_pkey_regs_save(struct thread_struct *thread);
> > > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > > index 7cd1be4..8a24983 100644
> > > --- a/arch/powerpc/mm/pkeys.c
> > > +++ b/arch/powerpc/mm/pkeys.c
> > > @@ -188,3 +188,60 @@ void thread_pkey_regs_init(struct thread_struct *thread)
> > >  	write_iamr(0x0ul);
> > >  	write_uamor(0x0ul);
> > >  }
> > > +
> > > +static inline bool pkey_allows_readwrite(int pkey)
> > > +{
> > > +	int pkey_shift = pkeyshift(pkey);
> > > +
> > > +	if (!(read_uamor() & (0x3UL << pkey_shift)))
> > > +		return true;  
> > 
> > If uamor for key 0 is 0x10 for example or 0x01 it's a bug.
> > The above check might miss it.  
> 
> 
> The spec says both of the bits corresponding to a key are either set or
> reset; they cannot be anything else.
>

I agree, that's why I said it's a bug if the values are such.
Do we care to validate that both bits are the same?
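
If we did, a cheap check could look like this (a sketch, reusing pkeyshift()
from the series):

static inline bool uamor_pkey_bits_consistent(int pkey)
{
	u64 pair = (read_uamor() >> pkeyshift(pkey)) & 0x3UL;

	/* per the ISA, both bits of the even/odd pair must agree */
	return pair == 0x0UL || pair == 0x3UL;
}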

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 13/25] powerpc: implementation for arch_override_mprotect_pkey()
  2017-10-18 21:10     ` Ram Pai
@ 2017-10-18 23:04       ` Balbir Singh
  2017-10-19 16:39         ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18 23:04 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, 18 Oct 2017 14:10:41 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> On Wed, Oct 18, 2017 at 03:36:35PM +1100, Balbir Singh wrote:
> > On Fri,  8 Sep 2017 15:45:01 -0700
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >   
> > > arch independent code calls arch_override_mprotect_pkey()
> > > to return a pkey that best matches the requested protection.
> > > 
> > > This patch provides the implementation.
> > > 
> > > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > > ---
> > >  arch/powerpc/include/asm/mmu_context.h |    5 +++
> > >  arch/powerpc/include/asm/pkeys.h       |   17 ++++++++++-
> > >  arch/powerpc/mm/pkeys.c                |   47 ++++++++++++++++++++++++++++++++
> > >  3 files changed, 67 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> > > index c705a5d..8e5a87e 100644
> > > --- a/arch/powerpc/include/asm/mmu_context.h
> > > +++ b/arch/powerpc/include/asm/mmu_context.h
> > > @@ -145,6 +145,11 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
> > >  #ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > >  #define pkey_initialize()
> > >  #define pkey_mm_init(mm)
> > > +
> > > +static inline int vma_pkey(struct vm_area_struct *vma)
> > > +{
> > > +	return 0;
> > > +}
> > >  #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> > >  
> > >  #endif /* __KERNEL__ */
> > > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > > index f13e913..d2fffef 100644
> > > --- a/arch/powerpc/include/asm/pkeys.h
> > > +++ b/arch/powerpc/include/asm/pkeys.h
> > > @@ -41,6 +41,16 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
> > >  		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
> > >  }
> > >  
> > > +#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
> > > +				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> > > +
> > > +static inline int vma_pkey(struct vm_area_struct *vma)
> > > +{
> > > +	if (!pkey_inited)
> > > +		return 0;  
> > 
> > We don't want pkey_inited to be present in all functions, why do we need
> > a conditional branch for all functions. Even if we do, it should be a jump
> > label. I would rather we just removed !pkey_inited unless really really
> > required.  
> 
> No. we really really need it.  For example, when we build a kernel with
> the PROTECTION_KEYS config enabled and run that kernel on an older processor,
> or on a system where the key feature is not enabled in the device tree,
> we have to fail all the calls that get called in by the arch-neutral code.
> 
> Hence we need this check.
> 

Use an mmu_feature then; it's already designed and optimized for that
purpose

> BTW: jump labels are awkward IMHO, unless absolutely needed.
>

The if checks all over the place will hurt performance, and we want to have
this enabled by default; we may need an mmu feature or jump labels
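
Something along these lines (MMU_FTR_PKEY is an invented name for the
sketch; mmu_has_feature() compiles down to a patched branch):

static inline int vma_pkey(struct vm_area_struct *vma)
{
	/* nop-patched on CPUs without the feature, so effectively free */
	if (!mmu_has_feature(MMU_FTR_PKEY))
		return 0;

	return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
}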

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 18/25] powerpc: check key protection for user page access
  2017-10-18 21:29     ` Ram Pai
@ 2017-10-18 23:08       ` Balbir Singh
  2017-10-19 16:46         ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18 23:08 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Wed, 18 Oct 2017 14:29:24 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> On Thu, Oct 19, 2017 at 06:57:32AM +1100, Balbir Singh wrote:
> > On Fri,  8 Sep 2017 15:45:06 -0700
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >   
> > > Make sure that the kernel does not access user pages without
> > > checking their key-protection.
> > >  
> > 
> > Why? This makes the routines AMR/thread specific? Looks like
> > x86 does this as well  
> 
> Yes. The memkey semantics implemented by x86 assume that the keys and
> their access-permissions are per thread.  In other words, a key which is
> enabled in the context of one thread will not be enabled in the context
> of another thread.
> 
> > but these routines are used by GUP from
> > the kernel.  
> 
> See a problem?
>

No, I don't understand why gup (called from driver context, probably) should
worry about permissions and keys.

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 19/25] powerpc: implementation for arch_vma_access_permitted()
  2017-09-08 22:45 ` [PATCH 19/25] powerpc: implementation for arch_vma_access_permitted() Ram Pai
@ 2017-10-18 23:20   ` Balbir Singh
  2017-10-24 15:48   ` Michael Ellerman
  1 sibling, 0 replies; 134+ messages in thread
From: Balbir Singh @ 2017-10-18 23:20 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:45:07 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> This patch provides the implementation for
> arch_vma_access_permitted(). Returns true if the
> requested access is allowed by pkey associated with the
> vma.
> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/mmu_context.h |    5 +++-
>  arch/powerpc/mm/pkeys.c                |   43 ++++++++++++++++++++++++++++++++
>  2 files changed, 47 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index 04e9221..9a56355 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -135,6 +135,10 @@ static inline void arch_bprm_mm_init(struct mm_struct *mm,
>  {
>  }
>  
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +bool arch_vma_access_permitted(struct vm_area_struct *vma,
> +			bool write, bool execute, bool foreign);
> +#else /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
>  static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
>  		bool write, bool execute, bool foreign)
>  {
> @@ -142,7 +146,6 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
>  	return true;
>  }
>  
> -#ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
>  #define pkey_initialize()
>  #define pkey_mm_init(mm)
>  
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index 24589d9..21c3b42 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -320,3 +320,46 @@ bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
>  	return pkey_access_permitted(pte_to_pkey_bits(pte),
>  			write, execute);
>  }
> +
> +/*
> + * We only want to enforce protection keys on the current process
> + * because we effectively have no access to AMR/IAMR for other
> + * processes or any way to tell *which * AMR/IAMR in a threaded
> + * process we could use.
> + *
> + * So do not enforce things if the VMA is not from the current
> + * mm, or if we are in a kernel thread.
> + */
> +static inline bool vma_is_foreign(struct vm_area_struct *vma)
> +{
> +	if (!current->mm)
> +		return true;
> +	/*
> +	 * if the VMA is from another process, then AMR/IAMR has no
> +	 * relevance and should not be enforced.
> +	 */
> +	if (current->mm != vma->vm_mm)
> +		return true;
> +
> +	return false;
> +}
> +
> +bool arch_vma_access_permitted(struct vm_area_struct *vma,
> +		bool write, bool execute, bool foreign)
> +{
> +	int pkey;
> +
> +	if (!pkey_inited)
> +		return true;
> +
> +	/* allow access if the VMA is not one from this process */
> +	if (foreign || vma_is_foreign(vma))
> +		return true;
> +
> +	pkey = vma_pkey(vma);
> +
> +	if (!pkey)
> +		return true;
> +
> +	return pkey_access_permitted(pkey, write, execute);
> +}

Again, I think this is GUP; I don't really understand the top-level
use case of enforcing permissions for GUP in a thread context.

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation
  2017-09-08 22:45 ` [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation Ram Pai
@ 2017-10-18 23:27   ` Balbir Singh
  2017-10-19 16:53     ` Ram Pai
  2017-10-24 15:47   ` Michael Ellerman
  1 sibling, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18 23:27 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:45:08 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> Handle Data and  Instruction exceptions caused by memory
> protection-key.
> 
> The CPU will detect the key fault if the HPTE is already
> programmed with the key.
> 
> However if the HPTE is not  hashed, a key fault will not
> be detected by the  hardware. The   software will detect
> pkey violation in such a case.


This bit is not clear.

> 
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/mm/fault.c |   37 ++++++++++++++++++++++++++++++++-----
>  1 files changed, 32 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 4797d08..a16bc43 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -145,6 +145,23 @@ static noinline int bad_area(struct pt_regs *regs, unsigned long address)
>  	return __bad_area(regs, address, SEGV_MAPERR);
>  }
>  
> +static int bad_page_fault_exception(struct pt_regs *regs, unsigned long address,
> +					int si_code)
> +{
> +	int sig = SIGBUS;
> +	int code = BUS_OBJERR;
> +
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	if (si_code & DSISR_KEYFAULT) {
> +		sig = SIGSEGV;
> +		code = SEGV_PKUERR;
> +	}
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> +
> +	_exception(sig, regs, code, address);
> +	return 0;
> +}
> +
>  static int do_sigbus(struct pt_regs *regs, unsigned long address,
>  		     unsigned int fault)
>  {
> @@ -391,11 +408,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  		return 0;
>  
>  	if (unlikely(page_fault_is_bad(error_code))) {
> -		if (is_user) {
> -			_exception(SIGBUS, regs, BUS_OBJERR, address);
> -			return 0;
> -		}
> -		return SIGBUS;
> +		if (!is_user)
> +			return SIGBUS;
> +		return bad_page_fault_exception(regs, address, error_code);
>  	}
>  
>  	/* Additional sanity check(s) */
> @@ -492,6 +507,18 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  	if (unlikely(access_error(is_write, is_exec, vma)))
>  		return bad_area(regs, address);
>  
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
> +			is_exec, 0))
> +		return __bad_area(regs, address, SEGV_PKUERR);


Hmm.. this is for the missing entry in the HPT and software detecting the
fault you mentioned above? Why do we need this case?

> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> +
> +
> +	/* handle_mm_fault() needs to know if its a instruction access
> +	 * fault.
> +	 */

comment style

> +	if (is_exec)
> +		flags |= FAULT_FLAG_INSTRUCTION;
>  	/*
>  	 * If for any reason at all we couldn't handle the fault,
>  	 * make sure we exit gracefully rather than endlessly redo

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 21/25] powerpc: introduce get_pte_pkey() helper
  2017-09-08 22:45 ` [PATCH 21/25] powerpc: introduce get_pte_pkey() helper Ram Pai
@ 2017-10-18 23:29   ` Balbir Singh
  2017-10-19 16:55     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Balbir Singh @ 2017-10-18 23:29 UTC (permalink / raw)
  To: Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Fri,  8 Sep 2017 15:45:09 -0700
Ram Pai <linuxram@us.ibm.com> wrote:

> get_pte_pkey() helper returns the pkey associated with
> an address corresponding to a given mm_struct.
>

This is really get_mm_addr_key() -- no?

Balbir Singh.

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 10/25] powerpc: store and restore the pkey state across context switches
  2017-10-18 23:00       ` Balbir Singh
@ 2017-10-19  0:52         ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-19  0:52 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Oct 19, 2017 at 10:00:38AM +1100, Balbir Singh wrote:
> On Wed, 18 Oct 2017 13:47:05 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > On Wed, Oct 18, 2017 at 02:49:14PM +1100, Balbir Singh wrote:
> > > On Fri,  8 Sep 2017 15:44:58 -0700
> > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >   
> > > > Store and restore the AMR, IAMR and UAMOR register state of the task
> > > > before scheduling out and after scheduling in, respectively.
> > > > 
> > > > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > > > ---
> > > >  arch/powerpc/include/asm/pkeys.h     |    4 +++
> > > >  arch/powerpc/include/asm/processor.h |    5 ++++
> > > >  arch/powerpc/kernel/process.c        |   10 ++++++++
> > > >  arch/powerpc/mm/pkeys.c              |   39 ++++++++++++++++++++++++++++++++++
> > > >  4 files changed, 58 insertions(+), 0 deletions(-)
> > > > 
> > > > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > > > index 7fd48a4..78c5362 100644
> > > > --- a/arch/powerpc/include/asm/pkeys.h
> > > > +++ b/arch/powerpc/include/asm/pkeys.h
> > > > @@ -143,5 +143,9 @@ static inline void pkey_mm_init(struct mm_struct *mm)
> > > >  	mm_pkey_allocation_map(mm) = initial_allocation_mask;
> > > >  }
> > > >  
> > > > +extern void thread_pkey_regs_save(struct thread_struct *thread);
> > > > +extern void thread_pkey_regs_restore(struct thread_struct *new_thread,
> > > > +			struct thread_struct *old_thread);
> > > > +extern void thread_pkey_regs_init(struct thread_struct *thread);
> > > >  extern void pkey_initialize(void);
> > > >  #endif /*_ASM_PPC64_PKEYS_H */
> > > > diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> > > > index fab7ff8..de9d9ba 100644
> > > > --- a/arch/powerpc/include/asm/processor.h
> > > > +++ b/arch/powerpc/include/asm/processor.h
> > > > @@ -309,6 +309,11 @@ struct thread_struct {
> > > >  	struct thread_vr_state ckvr_state; /* Checkpointed VR state */
> > > >  	unsigned long	ckvrsave; /* Checkpointed VRSAVE */
> > > >  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> > > > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > > > +	unsigned long	amr;
> > > > +	unsigned long	iamr;
> > > > +	unsigned long	uamor;
> > > > +#endif
> > > >  #ifdef CONFIG_KVM_BOOK3S_32_HANDLER
> > > >  	void*		kvm_shadow_vcpu; /* KVM internal data */
> > > >  #endif /* CONFIG_KVM_BOOK3S_32_HANDLER */
> > > > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > > > index a0c74bb..ba80002 100644
> > > > --- a/arch/powerpc/kernel/process.c
> > > > +++ b/arch/powerpc/kernel/process.c
> > > > @@ -42,6 +42,7 @@
> > > >  #include <linux/hw_breakpoint.h>
> > > >  #include <linux/uaccess.h>
> > > >  #include <linux/elf-randomize.h>
> > > > +#include <linux/pkeys.h>
> > > >  
> > > >  #include <asm/pgtable.h>
> > > >  #include <asm/io.h>
> > > > @@ -1085,6 +1086,9 @@ static inline void save_sprs(struct thread_struct *t)
> > > >  		t->tar = mfspr(SPRN_TAR);
> > > >  	}
> > > >  #endif
> > > > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > > > +	thread_pkey_regs_save(t);
> > > > +#endif  
> > > 
> > > Just define two variants of thread_pkey_regs_save() based on
> > > CONFIG_PPC64_MEMORY_PROTECTION_KEYS and remove the #ifdefs from process.c
> > > Ditto for the lines below  
> > 
> > ok.
> > 
> > >   
> > > >  }
> > > >  
> > > >  static inline void restore_sprs(struct thread_struct *old_thread,
> > > > @@ -1120,6 +1124,9 @@ static inline void restore_sprs(struct thread_struct *old_thread,
> > > >  			mtspr(SPRN_TAR, new_thread->tar);
> > > >  	}
> > > >  #endif
> > > > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > > > +	thread_pkey_regs_restore(new_thread, old_thread);
> > > > +#endif  
> > 
> > ok.
> > 
> > > >  }
> > > >  
> > > >  #ifdef CONFIG_PPC_BOOK3S_64
> > > > @@ -1705,6 +1712,9 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp)
> > > >  	current->thread.tm_tfiar = 0;
> > > >  	current->thread.load_tm = 0;
> > > >  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
> > > > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > > > +	thread_pkey_regs_init(&current->thread);
> > > > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> > > >  }
> > > >  EXPORT_SYMBOL(start_thread);
> > > >  
> > > > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > > > index 2282864..7cd1be4 100644
> > > > --- a/arch/powerpc/mm/pkeys.c
> > > > +++ b/arch/powerpc/mm/pkeys.c
> > > > @@ -149,3 +149,42 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> > > >  	init_amr(pkey, new_amr_bits);
> > > >  	return 0;
> > > >  }
> > > > +
> > > > +void thread_pkey_regs_save(struct thread_struct *thread)
> > > > +{
> > > > +	if (!pkey_inited)
> > > > +		return;
> > > > +
> > > > +	/* @TODO skip saving any registers if the thread
> > > > +	 * has not used any keys yet.
> > > > +	 */  
> > > 
> > > Comment style is broken  
> > 
> > ok. this time I will fix them. It slips past my radar screen because
> > checkpatch.pl does not complain.
> >
> 
> Yep, there is an lkml thread on this style of coding. It's
> best avoided.
> 
> > >   
> > > > +
> > > > +	thread->amr = read_amr();
> > > > +	thread->iamr = read_iamr();
> > > > +	thread->uamor = read_uamor();
> > > > +}
> > > > +
> > > > +void thread_pkey_regs_restore(struct thread_struct *new_thread,
> > > > +			struct thread_struct *old_thread)
> > > > +{
> > > > +	if (!pkey_inited)
> > > > +		return;
> > > > +
> > > > +	/* @TODO just reset uamor to zero if the new_thread
> > > > +	 * has not used any keys yet.
> > > > +	 */  
> > > 
> > > Comment style is broken.
> > >   
> > > > +
> > > > +	if (old_thread->amr != new_thread->amr)
> > > > +		write_amr(new_thread->amr);
> > > > +	if (old_thread->iamr != new_thread->iamr)
> > > > +		write_iamr(new_thread->iamr);
> > > > +	if (old_thread->uamor != new_thread->uamor)
> > > > +		write_uamor(new_thread->uamor);  
> > > 
> > > Is this order correct? Ideally, you want to write the uamor first
> > > but since we are in supervisor state, I think we can get away
> > > with this order.   
> > 
> > we could be in hypervisor state too, as is the case when we run
> > a powernv kernel.
> > 
> > But.. does it matter in which order they are written? If
> > the thread is in the kernel, it cannot execute any instructions
> > in userspace. So it won't see an intermediate state. Right?
> > Or am I getting this wrong?
> 
> You are right, since uamor + amor control what can and
> cannot be set, there is a subtle dependency, but it does
> not apply to the kernel doing the context switch. AMR has
> two SPR values, 13 and 29. I presume we are using SPR #29
> here?

it is SPRN_AMR, which is 29 (0x1d)

> 
> > 
> > > Do we want to expose the uamor to user space
> > > for it to modify the AMR directly?  
> > 
> > sorry, I did not understand the comment. UAMOR cannot
> > be accessed from userspace, and there are no system calls
> > currently to help userspace program the UAMOR on its
> > behalf.
> > 
> 
> I was just wondering how two threads can share a key if
> they decide to. They would need uamor to give them permissions
> to the same set of keys and then reuse the key via
> pkey_mprotect(.., pkey). I am missing the bit about how
> the uamors across these threads would be synchronized.

As it stands now, two threads are discouraged from sharing the same key,
since we don't provide synchronization of keys across threads. A key
allocated in one thread's context has no meaning in the context of
another thread.  I think it is constraining to the application, but
that is how the semantics were defined, and I have implemented them that
way for powerpc.  Eventually, I am sure, as applications start
exploiting this feature they will demand more flexibility. But for now
that is what it is.  So given the above semantics, there is currently
no need to synchronize uamor/amr/iamr across threads.

> 
> 
> > >   
> > > > +}
> > > > +
> > > > +void thread_pkey_regs_init(struct thread_struct *thread)
> > > > +{
> > > > +	write_amr(0x0ul);
> > > > +	write_iamr(0x0ul);
> > > > +	write_uamor(0x0ul);  
> > > 
> > > This is not correct, reserved keys should not be set to 0's  
> > 
> > ok. makes sense. best to not touch reserved key bits here.
> 
> Also this implies that at init time, the thread has access to all
> keys, but it can't modify any of the keys in the AMR/IAMR.

it shouldn't modify the bits corresponding to the reserved keys.

These patches don't touch AMOR; the hypervisor is entirely in control of
the AMOR.

AMOR is the master controller for these keys. If a reserved key is
disabled in AMOR, any changes to the bits corresponding to the
reserved key in AMR or IAMR or UAMOR will not have any effect.

But if AMOR has enabled a reserved key, then we will cause bad things by
changing the bits in AMR/IAMR/UAMOR. So you are right. We better not
touch the bits corresponding to the reserved keys.
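
i.e. something like this (untested sketch; pkey_reserved_mask is a
made-up name for a mask that has only the reserved-key bits set):

void thread_pkey_regs_init(struct thread_struct *thread)
{
	/* clear the user-key bits, but preserve the reserved-key bits */
	write_amr(read_amr() & pkey_reserved_mask);
	write_iamr(read_iamr() & pkey_reserved_mask);
	write_uamor(read_uamor() & pkey_reserved_mask);
}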

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-09-08 22:44 ` [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
  2017-09-14  1:18   ` Balbir Singh
@ 2017-10-19  3:25   ` Michael Ellerman
  2017-10-19 17:02     ` Ram Pai
  2017-10-23  8:47     ` Aneesh Kumar K.V
  1 sibling, 2 replies; 134+ messages in thread
From: Michael Ellerman @ 2017-10-19  3:25 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> index 1a68cb1..c6c5559 100644
> --- a/arch/powerpc/mm/hash64_64k.c
> +++ b/arch/powerpc/mm/hash64_64k.c
> @@ -126,18 +113,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  	if (__rpte_sub_valid(rpte, subpg_index)) {
>  		int ret;
>  
> -		hash = hpt_hash(vpn, shift, ssize);
> -		hidx = __rpte_to_hidx(rpte, subpg_index);
> -		if (hidx & _PTEIDX_SECONDARY)
> -			hash = ~hash;
> -		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> -		slot += hidx & _PTEIDX_GROUP_IX;
> +		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte,
> +					subpg_index);
> +		ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn,
> +			MMU_PAGE_4K, MMU_PAGE_4K, ssize, flags);

This was formatted correctly before:
  
> -		ret = mmu_hash_ops.hpte_updatepp(slot, rflags, vpn,
> -						 MMU_PAGE_4K, MMU_PAGE_4K,
> -						 ssize, flags);
>  		/*
> -		 *if we failed because typically the HPTE wasn't really here
> +		 * if we failed because typically the HPTE wasn't really here

If you're fixing it up please make it "If ...".

>  		 * we try an insertion.
>  		 */
>  		if (ret == -1)
> @@ -148,6 +130,15 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  	}
>  
>  htab_insert_hpte:
> +
> +	/*
> +	 * initialize all hidx entries to invalid value,
> +	 * the first time the PTE is about to allocate
> +	 * a 4K hpte
> +	 */

Should be:
	/*
	 * Initialize all hidx entries to invalid value, the first time
         * the PTE is about to allocate a 4K HPTE.
	 */

> +	if (!(old_pte & H_PAGE_COMBO))
> +		rpte.hidx = ~0x0UL;
> +

Paul had the idea that if we biased the slot number by 1, we could make
the "invalid" value be == 0.

That would avoid needing to do that above, and also mean the value is
correctly invalid from the get-go, which would be good IMO.

I think now that you've added the slot accessors it would be pretty easy
to do.
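
Roughly (untested sketch):

	/* set: store (slot + 1) in the nibble, so hidx == 0 means "invalid" */
	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
	rpte.hidx |= ((slot + 1) & 0xfUL) << (subpg_index << 2);

	/* get: hidx == 0 means "never hashed", else the slot is hidx - 1 */
	hidx = (rpte.hidx >> (subpg_index << 2)) & 0xfUL;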


>  	/*
>  	 * handle H_PAGE_4K_PFN case
>  	 */
> @@ -172,15 +163,41 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>  	 * Primary is full, try the secondary
>  	 */
>  	if (unlikely(slot == -1)) {
> +		bool soft_invalid;
> +
>  		hpte_group = ((~hash & htab_hash_mask) * HPTES_PER_GROUP) & ~0x7UL;
>  		slot = mmu_hash_ops.hpte_insert(hpte_group, vpn, pa,
>  						rflags, HPTE_V_SECONDARY,
>  						MMU_PAGE_4K, MMU_PAGE_4K,
>  						ssize);
> -		if (slot == -1) {
> -			if (mftb() & 0x1)
> +
> +		soft_invalid = hpte_soft_invalid(slot);
> +		if (unlikely(soft_invalid)) {


> +			/*
> +			 * we got a valid slot from a hardware point of view.
> +			 * but we cannot use it, because we use this special
> +			 * value; as     defined   by    hpte_soft_invalid(),
> +			 * to  track    invalid  slots.  We  cannot  use  it.
> +			 * So invalidate it.
> +			 */
> +			gslot = slot & _PTEIDX_GROUP_IX;
> +			mmu_hash_ops.hpte_invalidate(hpte_group+gslot, vpn,
> +				MMU_PAGE_4K, MMU_PAGE_4K,
> +				ssize, 0);

Please:
			mmu_hash_ops.hpte_invalidate(hpte_group+gslot, vpn,
                        			     MMU_PAGE_4K, MMU_PAGE_4K,
						     ssize, 0);

> +		}
> +
> +		if (unlikely(slot == -1 || soft_invalid)) {
> +			/*
> +			 * for soft invalid slot, lets   ensure that we

For .. let's


cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 01/25] powerpc: initial pkey plumbing
  2017-09-08 22:44 ` [PATCH 01/25] powerpc: initial pkey plumbing Ram Pai
  2017-09-14  3:32   ` Balbir Singh
@ 2017-10-19  4:20   ` Michael Ellerman
  2017-10-19 17:11     ` Ram Pai
  1 sibling, 1 reply; 134+ messages in thread
From: Michael Ellerman @ 2017-10-19  4:20 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 9fc3c0b..a4cd210 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -864,6 +864,22 @@ config SECCOMP
>  
>  	  If unsure, say Y. Only embedded should say N here.
>  
> +config PPC64_MEMORY_PROTECTION_KEYS

That's pretty wordy, can we make it CONFIG_PPC_MEM_KEYS ?

I think you're a sufficient vim wizard to search and replace all
usages at once, if not I can do it before I apply the series.

> +	prompt "PowerPC Memory Protection Keys"
> +	def_bool y
> +	# Note: only available in 64-bit mode

We don't need the note, that's exactly what the next line says:
> +	depends on PPC64

But shouldn't it be BOOK3S_64 ?

I don't think it works on BookE does it?

> +	select ARCH_USES_HIGH_VMA_FLAGS
> +	select ARCH_HAS_PKEYS
> +	---help---

I prefer just "help".

> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index 3095925..7badf29 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -141,5 +141,10 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
>  	/* by default, allow everything */
>  	return true;
>  }
> +
> +#ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +#define pkey_initialize()
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */

You don't need ifdefs around that. But you also don't need it (see below).

> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> new file mode 100644
> index 0000000..c02305a
> --- /dev/null
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -0,0 +1,45 @@
> +#ifndef _ASM_PPC64_PKEYS_H
> +#define _ASM_PPC64_PKEYS_H

_ASM_POWERPC_KEYS_H

Missing copyright header here.

> +
> +extern bool pkey_inited;
> +extern bool pkey_execute_disable_support;
> +#define ARCH_VM_PKEY_FLAGS 0
> +
> +static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> +{
> +	return (pkey == 0);

That means pkey 1 is not allocated and pkey 0 is?

Surely this should just return false for now?

> +}
> +
> +static inline int mm_pkey_alloc(struct mm_struct *mm)
> +{
> +	return -1;
> +}
> +
> +static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> +{
> +	return -EINVAL;
> +}
> +
> +/*
> + * Try to dedicate one of the protection keys to be used as an
> + * execute-only protection key.
> + */
> +static inline int execute_only_pkey(struct mm_struct *mm)
> +{
> +	return 0;
> +}
> +
> +static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> +		int prot, int pkey)

static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
					      int prot, int pkey)

> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index b89c6aa..3b67014 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -316,6 +317,9 @@ void __init early_setup(unsigned long dt_ptr)
>  	/* Initialize the hash table or TLB handling */
>  	early_init_mmu();
>  
> +	/* initialize the key subsystem */
> +	pkey_initialize();
> +

I'm not sure we need to initialise this that early, but if we do, it
should be done in early_init_mmu(), not here.

> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 0dff57b..67f62b5 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -35,6 +35,7 @@
>  #include <linux/memblock.h>
>  #include <linux/context_tracking.h>
>  #include <linux/libfdt.h>
> +#include <linux/pkeys.h>

This should go in a later patch when it's needed.

> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> new file mode 100644
> index 0000000..418a05b
> --- /dev/null
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -0,0 +1,33 @@
> +/*
> + * PowerPC Memory Protection Keys management
> + * Copyright (c) 2015, Intel Corporation.

Is any of it really copyright Intel?

> + * Copyright (c) 2017, IBM Corporation.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2, as published by the Free Software Foundation.

We're meant to use "or later" on new code.

> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.

But not this part.

> + */

Blank line.

> +#include <linux/pkeys.h>                /* PKEY_*                       */

Comment is wrong and unnecessary.

> +bool pkey_inited;
> +bool pkey_execute_disable_support;
> +
> +void __init pkey_initialize(void)
> +{
> +	/* disable the pkey system till everything
> +	 * is in place. A patch further down the
> +	 * line will enable it.
> +	 */

	/*
         * Disable the pkey system till everything is in place. A patch
         * further down the line will enable it.
	 */

I'm going to stop commenting on every badly formatted comment :)


> +	pkey_inited = false;
> +
> +	/*
> +	 * disable execute_disable support for now.
> +	 * A patch further down will enable it.
> +	 */
> +	pkey_execute_disable_support = false;

Those are both false anyway, so I'm not sure we really needed to
initialise them to false again, but I don't feel that strongly about it.

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 1/7] powerpc: introduce pte_set_hash_slot() helper
  2017-09-08 22:44 ` [PATCH 1/7] powerpc: introduce pte_set_hash_slot() helper Ram Pai
  2017-09-13  7:55   ` Balbir Singh
@ 2017-10-19  4:52   ` Michael Ellerman
  1 sibling, 0 replies; 134+ messages in thread
From: Michael Ellerman @ 2017-10-19  4:52 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index 9732837..6652669 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -74,6 +74,31 @@ static inline unsigned long __rpte_to_hidx(real_pte_t rpte, unsigned long index)
>  	return (pte_val(rpte.pte) >> H_PAGE_F_GIX_SHIFT) & 0xf;
>  }
>  
> +/*
> + * Commit the hash slot and return pte bits that needs to be modified.
> + * The caller is expected to modify the pte bits accordingly and
> + * commit the pte to memory.
> + */
> +static inline unsigned long pte_set_hash_slot(pte_t *ptep, real_pte_t rpte,
> +		unsigned int subpg_index, unsigned long slot)
> +{
> +	unsigned long *hidxp = (unsigned long *)(ptep + PTRS_PER_PTE);
> +
> +	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
> +	*hidxp = rpte.hidx  | (slot << (subpg_index << 2));
                           ^
                           stray space here
> +	/*
> +	 * Commit the hidx bits to memory before returning.

I'd prefer we didn't use "commit", it implies the bits are actually
written to memory by the barrier, which is not true. The barrier is just
a barrier or fence which prevents some reorderings of the things before
it and the things after it.

> +	 * Anyone reading  pte  must  ensure hidx bits are
> +	 * read  only  after  reading the pte by using the
> +	 * read-side  barrier  smp_rmb().

That seems OK. Though I'm reminded that I dislike your justified
comments; the odd spacing is jarring to read.

>          __real_pte() can
> +	 * help ensure that.

It doesn't help, it *does* do that.
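
i.e. the pairing would be roughly:

	/* writer -- pte_set_hash_slot() */
	*hidxp = rpte.hidx | (slot << (subpg_index << 2));
	smp_wmb();
	/* ... the caller then updates the pte ... */

	/* reader -- __real_pte() */
	rpte.pte = READ_ONCE(*ptep);
	smp_rmb();
	rpte.hidx = *(unsigned long *)(ptep + PTRS_PER_PTE);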

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 11/25] powerpc: introduce execute-only pkey
  2017-10-18 23:02       ` Balbir Singh
@ 2017-10-19 15:52         ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-19 15:52 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Oct 19, 2017 at 10:02:13AM +1100, Balbir Singh wrote:
> On Wed, 18 Oct 2017 13:57:39 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > On Wed, Oct 18, 2017 at 03:15:22PM +1100, Balbir Singh wrote:
> > > On Fri,  8 Sep 2017 15:44:59 -0700
> > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >   
> > > > This patch provides the implementation of execute-only pkey.
> > > > The architecture-independent layer expects the arch-dependent
> > > > layer, to support the ability to create and enable a special
> > > > key which has execute-only permission.
> > > > 
> > > > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > > > ---
> > > >  arch/powerpc/include/asm/book3s/64/mmu.h |    1 +
> > > >  arch/powerpc/include/asm/pkeys.h         |    9 ++++-
> > > >  arch/powerpc/mm/pkeys.c                  |   57 ++++++++++++++++++++++++++++++
> > > >  3 files changed, 66 insertions(+), 1 deletions(-)
> > > > 
> > > > diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> > > > index 55950f4..ee18ba0 100644
> > > > --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> > > > +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> > > > @@ -115,6 +115,7 @@ struct patb_entry {
> > > >  	 * bit unset -> key available for allocation
> > > >  	 */
> > > >  	u32 pkey_allocation_map;
> > > > +	s16 execute_only_pkey; /* key holding execute-only protection */
> > > >  #endif
> > > >  } mm_context_t;
> > > >  
> > > > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > > > index 78c5362..0cf115f 100644
> > > > --- a/arch/powerpc/include/asm/pkeys.h
> > > > +++ b/arch/powerpc/include/asm/pkeys.h
> > > > @@ -115,11 +115,16 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> > > >   * Try to dedicate one of the protection keys to be used as an
> > > >   * execute-only protection key.
> > > >   */
> > > > +extern int __execute_only_pkey(struct mm_struct *mm);
> > > >  static inline int execute_only_pkey(struct mm_struct *mm)
> > > >  {
> > > > -	return 0;
> > > > +	if (!pkey_inited || !pkey_execute_disable_support)
> > > > +		return -1;
> > > > +
> > > > +	return __execute_only_pkey(mm);
> > > >  }
> > > >  
> > > > +
> > > >  static inline int arch_override_mprotect_pkey(struct vm_area_struct *vma,
> > > >  		int prot, int pkey)
> > > >  {
> > > > @@ -141,6 +146,8 @@ static inline void pkey_mm_init(struct mm_struct *mm)
> > > >  	if (!pkey_inited)
> > > >  		return;
> > > >  	mm_pkey_allocation_map(mm) = initial_allocation_mask;
> > > > +	/* -1 means unallocated or invalid */
> > > > +	mm->context.execute_only_pkey = -1;
> > > >  }
> > > >  
> > > >  extern void thread_pkey_regs_save(struct thread_struct *thread);
> > > > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > > > index 7cd1be4..8a24983 100644
> > > > --- a/arch/powerpc/mm/pkeys.c
> > > > +++ b/arch/powerpc/mm/pkeys.c
> > > > @@ -188,3 +188,60 @@ void thread_pkey_regs_init(struct thread_struct *thread)
> > > >  	write_iamr(0x0ul);
> > > >  	write_uamor(0x0ul);
> > > >  }
> > > > +
> > > > +static inline bool pkey_allows_readwrite(int pkey)
> > > > +{
> > > > +	int pkey_shift = pkeyshift(pkey);
> > > > +
> > > > +	if (!(read_uamor() & (0x3UL << pkey_shift)))
> > > > +		return true;  
> > > 
> > > If uamor for key 0 is 0x10 for example or 0x01 it's a bug.
> > > The above check might miss it.  
> > 
> > 
> > The spec says both the bits corresponding to a key are set or
> > reset; they cannot be anything else.
> >
> 
> I agree, that's why I said it's a bug if the values are such.
> Do we care to validate that both bits are the same?

I will put in an assert. Will that work?
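
Something along these lines (untested; pkey_shift being the key's bit
position in the UAMOR):

	u64 bits = (read_uamor() >> pkey_shift) & 0x3UL;

	WARN_ON(bits != 0x0UL && bits != 0x3UL);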

RP

> 
> Balbir Singh.

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 13/25] powerpc: implementation for arch_override_mprotect_pkey()
  2017-10-18 23:04       ` Balbir Singh
@ 2017-10-19 16:39         ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-19 16:39 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Oct 19, 2017 at 10:04:40AM +1100, Balbir Singh wrote:
> On Wed, 18 Oct 2017 14:10:41 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > On Wed, Oct 18, 2017 at 03:36:35PM +1100, Balbir Singh wrote:
> > > On Fri,  8 Sep 2017 15:45:01 -0700
> > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >   
> > > > arch independent code calls arch_override_mprotect_pkey()
> > > > to return a pkey that best matches the requested protection.
> > > > 
> > > > This patch provides the implementation.
> > > > 
> > > > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > > > ---
> > > >  arch/powerpc/include/asm/mmu_context.h |    5 +++
> > > >  arch/powerpc/include/asm/pkeys.h       |   17 ++++++++++-
> > > >  arch/powerpc/mm/pkeys.c                |   47 ++++++++++++++++++++++++++++++++
> > > >  3 files changed, 67 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> > > > index c705a5d..8e5a87e 100644
> > > > --- a/arch/powerpc/include/asm/mmu_context.h
> > > > +++ b/arch/powerpc/include/asm/mmu_context.h
> > > > @@ -145,6 +145,11 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
> > > >  #ifndef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > > >  #define pkey_initialize()
> > > >  #define pkey_mm_init(mm)
> > > > +
> > > > +static inline int vma_pkey(struct vm_area_struct *vma)
> > > > +{
> > > > +	return 0;
> > > > +}
> > > >  #endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> > > >  
> > > >  #endif /* __KERNEL__ */
> > > > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > > > index f13e913..d2fffef 100644
> > > > --- a/arch/powerpc/include/asm/pkeys.h
> > > > +++ b/arch/powerpc/include/asm/pkeys.h
> > > > @@ -41,6 +41,16 @@ static inline u64 pkey_to_vmflag_bits(u16 pkey)
> > > >  		((pkey & 0x10UL) ? VM_PKEY_BIT4 : 0x0UL));
> > > >  }
> > > >  
> > > > +#define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
> > > > +				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> > > > +
> > > > +static inline int vma_pkey(struct vm_area_struct *vma)
> > > > +{
> > > > +	if (!pkey_inited)
> > > > +		return 0;  
> > > 
> > > We don't want pkey_inited to be present in all functions, why do we need
> > > a conditional branch for all functions. Even if we do, it should be a jump
> > > label. I would rather we just removed !pkey_inited unless really really
> > > required.  
> > 
> > No. we really really need it.  For example, when we build a kernel with
> > the PROTECTION_KEYS config enabled and run that kernel on an older
> > processor, or on a system where the key feature is not enabled in the
> > device tree, we have to fail all the calls that get called-in by the
> > arch-neutral code.
> > 
> > Hence we need this check.
> > 
> 
> Use a mmu_feature then, it's already designed and optimized for that
> purpose

We rely on a combination of cpu_feature and firmware_feature. But that
is still not sufficient. It is also gated on !radix_enabled().

In other words, the pkey system is enabled

a) if we are in an lpar and the device tree has the feature enabled,
	there are more than zero keys enabled, and radix is not enabled,

	OR

b) if we are on baremetal, the CPU is power5 or later, and radix is not
	enabled.

All these criteria determine the value of 'pkey_inited'.
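
In pseudo-code the gating is roughly as below; pkeys_in_device_tree(),
pkeys_total and cpu_supports_pkeys() are made-up names for the actual
checks:

	if (radix_enabled()) {
		pkey_inited = false;
	} else if (firmware_has_feature(FW_FEATURE_LPAR)) {
		/* (a) guest: device tree feature plus more than zero keys */
		pkey_inited = pkeys_in_device_tree() && pkeys_total > 0;
	} else {
		/* (b) baremetal: power5 or later CPU */
		pkey_inited = cpu_supports_pkeys();
	}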

> 
> > BTW: jump labels are awkward IMHO, unless absolutely needed.
> >
> 
> The if checks all over the place will hurt performance, and we want to have
> this enabled by default; we may need a mmu feature or jump labels

I like the idea of jump_label now that I googled for it.
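
i.e. something like (sketch, using the generic static-key API):

	DEFINE_STATIC_KEY_FALSE(pkeys_enabled);

	static inline int vma_pkey(struct vm_area_struct *vma)
	{
		if (!static_branch_likely(&pkeys_enabled))
			return 0;
		return (vma->vm_flags & ARCH_VM_PKEY_FLAGS) >> VM_PKEY_SHIFT;
	}

with a one-time static_branch_enable(&pkeys_enabled) from
pkey_initialize() once all the criteria above hold.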

Thanks,
RP

> 
> Balbir Singh.

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 18/25] powerpc: check key protection for user page access
  2017-10-18 23:08       ` Balbir Singh
@ 2017-10-19 16:46         ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-19 16:46 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Oct 19, 2017 at 10:08:57AM +1100, Balbir Singh wrote:
> On Wed, 18 Oct 2017 14:29:24 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > On Thu, Oct 19, 2017 at 06:57:32AM +1100, Balbir Singh wrote:
> > > On Fri,  8 Sep 2017 15:45:06 -0700
> > > Ram Pai <linuxram@us.ibm.com> wrote:
> > >   
> > > > Make sure that the kernel does not access user pages without
> > > > checking their key-protection.
> > > >  
> > > 
> > > Why? This makes the routines AMR/thread specific? Looks like
> > > x86 does this as well  
> > 
> > Yes. The memkey semantics implemented by x86 assume that the keys and
> > their access-permissions are per thread.  In other words, a key which is
> > enabled in the context of one thread will not be enabled in the context
> > of another thread.
> > 
> > > but these routines are used by GUP from
> > > the kernel.  
> > 
> > See a problem?
> >
> 
> No, I don't understand why gup (called from driver context, probably) should
> worry about permissions and keys?

There are some user level features, e.g. pipe, where userspace
donates one of its pages to the kernel to buffer the pipe stream.

But if the donated page has a non-permissive key associated with it,
the kernel should reject it and return failure. Access to a page
associated with a non-permissive key should fail regardless of who
accesses the page (userspace, or kernel on userspace's behalf).

That is the reason we tap into the GUP routines to validate such
access.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation
  2017-10-18 23:27   ` Balbir Singh
@ 2017-10-19 16:53     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-19 16:53 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Oct 19, 2017 at 10:27:52AM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:45:08 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > Handle Data and  Instruction exceptions caused by memory
> > protection-key.
> > 
> > The CPU will detect the key fault if the HPTE is already
> > programmed with the key.
> > 
> > However if the HPTE is not  hashed, a key fault will not
> > be detected by the  hardware. The   software will detect
> > pkey violation in such a case.
> 
> 
> This bit is not clear.
> 
> > 
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/mm/fault.c |   37 ++++++++++++++++++++++++++++++++-----
> >  1 files changed, 32 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index 4797d08..a16bc43 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -145,6 +145,23 @@ static noinline int bad_area(struct pt_regs *regs, unsigned long address)
> >  	return __bad_area(regs, address, SEGV_MAPERR);
> >  }
> >  
> > +static int bad_page_fault_exception(struct pt_regs *regs, unsigned long address,
> > +					int si_code)
> > +{
> > +	int sig = SIGBUS;
> > +	int code = BUS_OBJERR;
> > +
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	if (si_code & DSISR_KEYFAULT) {
> > +		sig = SIGSEGV;
> > +		code = SEGV_PKUERR;
> > +	}
> > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> > +
> > +	_exception(sig, regs, code, address);
> > +	return 0;
> > +}
> > +
> >  static int do_sigbus(struct pt_regs *regs, unsigned long address,
> >  		     unsigned int fault)
> >  {
> > @@ -391,11 +408,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
> >  		return 0;
> >  
> >  	if (unlikely(page_fault_is_bad(error_code))) {
> > -		if (is_user) {
> > -			_exception(SIGBUS, regs, BUS_OBJERR, address);
> > -			return 0;
> > -		}
> > -		return SIGBUS;
> > +		if (!is_user)
> > +			return SIGBUS;
> > +		return bad_page_fault_exception(regs, address, error_code);
> >  	}
> >  
> >  	/* Additional sanity check(s) */
> > @@ -492,6 +507,18 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
> >  	if (unlikely(access_error(is_write, is_exec, vma)))
> >  		return bad_area(regs, address);
> >  
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
> > +			is_exec, 0))
> > +		return __bad_area(regs, address, SEGV_PKUERR);
> 
> 
> Hmm.. this is for the missing entry in the HPT and software detecting the
> fault you mentioned above? Why do we need this case?

I thought I had put in a comment motivating the reason. Seems to have
disappeared. Will add it back.  But here is the reason....

Hardware enforces a key-exception only after the key is programmed into
the HPTE. However there is a window where the key is programmed into the
PTE and is waiting for a page fault so that it can be propagated to the
HPTE. It is during that window that we have to guard against key
violation. The above check closes that small window of vulnerability.
	
RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 21/25] powerpc: introduce get_pte_pkey() helper
  2017-10-18 23:29   ` Balbir Singh
@ 2017-10-19 16:55     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-19 16:55 UTC (permalink / raw)
  To: Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, aneesh.kumar, hbabu,
	mhocko, bauerman, ebiederm

On Thu, Oct 19, 2017 at 10:29:44AM +1100, Balbir Singh wrote:
> On Fri,  8 Sep 2017 15:45:09 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
> 
> > get_pte_pkey() helper returns the pkey associated with
> > an address corresponding to a given mm_struct.
> >
> 
> This is really get_mm_addr_key() -- no?

ok. will be so.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-10-19  3:25   ` Michael Ellerman
@ 2017-10-19 17:02     ` Ram Pai
  2017-10-23  8:47     ` Aneesh Kumar K.V
  1 sibling, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-19 17:02 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm

On Thu, Oct 19, 2017 at 02:25:47PM +1100, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> > index 1a68cb1..c6c5559 100644
> > --- a/arch/powerpc/mm/hash64_64k.c
> > +++ b/arch/powerpc/mm/hash64_64k.c
> > @@ -126,18 +113,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
> >  	if (__rpte_sub_valid(rpte, subpg_index)) {
> >  		int ret;
> >  
> > -		hash = hpt_hash(vpn, shift, ssize);
> > -		hidx = __rpte_to_hidx(rpte, subpg_index);
> > -		if (hidx & _PTEIDX_SECONDARY)
> > -			hash = ~hash;
> > -		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> > -		slot += hidx & _PTEIDX_GROUP_IX;
> > +		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte,
> > +					subpg_index);
> > +		ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn,
> > +			MMU_PAGE_4K, MMU_PAGE_4K, ssize, flags);
> 
> This was formatted correctly before:
>   
> > -		ret = mmu_hash_ops.hpte_updatepp(slot, rflags, vpn,
> > -						 MMU_PAGE_4K, MMU_PAGE_4K,
> > -						 ssize, flags);
> >  		/*
> > -		 *if we failed because typically the HPTE wasn't really here
> > +		 * if we failed because typically the HPTE wasn't really here
> 
> If you're fixing it up please make it "If ...".
> 
> >  		 * we try an insertion.
> >  		 */
> >  		if (ret == -1)
> > @@ -148,6 +130,15 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
> >  	}
> >  
> >  htab_insert_hpte:
> > +
> > +	/*
> > +	 * initialize all hidx entries to invalid value,
> > +	 * the first time the PTE is about to allocate
> > +	 * a 4K hpte
> > +	 */
> 
> Should be:
> 	/*
> 	 * Initialize all hidx entries to invalid value, the first time
>          * the PTE is about to allocate a 4K HPTE.
> 	 */
> 
> > +	if (!(old_pte & H_PAGE_COMBO))
> > +		rpte.hidx = ~0x0UL;
> > +
> 
> Paul had the idea that if we biased the slot number by 1, we could make
> the "invalid" value be == 0.
> 
> That would avoid needing to do that above, and also mean the value is
> correctly invalid from the get-go, which would be good IMO.
> 
> I think now that you've added the slot accessors it would be pretty easy
> to do.

I did attempt to do so, and was not getting it right. The machine became
unstable. So I left it with an accessor, to be revisited at a
later point in time. That time has come... I suppose.  Shall I make it a
separate patch instead of baking it into this patch?

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 01/25] powerpc: initial pkey plumbing
  2017-10-19  4:20   ` Michael Ellerman
@ 2017-10-19 17:11     ` Ram Pai
  2017-10-24  8:17       ` Michael Ellerman
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-19 17:11 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm

On Thu, Oct 19, 2017 at 03:20:36PM +1100, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 9fc3c0b..a4cd210 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -864,6 +864,22 @@ config SECCOMP
> >  
> >  	  If unsure, say Y. Only embedded should say N here.
> >  
> > +config PPC64_MEMORY_PROTECTION_KEYS
> 
> That's pretty wordy, can we make it CONFIG_PPC_MEM_KEYS ?
> 
> I think you're a sufficient vim wizard to search and replace all
> usages at once,

I take that as a compliment for now ;)

> if not I can do it before I apply the series.

Will change it... just that I was trying to keep it similar to what Intel
has: X86_INTEL_MEMORY_PROTECTION_KEYS

> 
> > +	prompt "PowerPC Memory Protection Keys"
> > +	def_bool y
> > +	# Note: only available in 64-bit mode
> 
> We don't need the note, that's exactly what the next line says:
> > +	depends on PPC64
> 
> But shouldn't it be BOOK3S_64 ?
> 
> I don't think it works on BookE does it?
> 
> > +	select ARCH_USES_HIGH_VMA_FLAGS
> > +	select ARCH_HAS_PKEYS
> > +	---help---
> 
> I prefer just "help".

ok.

> 
> > diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> > index 3095925..7badf29 100644
> > --- a/arch/powerpc/include/asm/mmu_context.h
> > +++ b/arch/powerpc/include/asm/mmu_context.h
> > @@ -141,5 +141,10 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
> 	/*

  ....snip...

>          * Disable the pkey system till everything is in place. A patch
>          * further down the line will enable it.
> 	 */
> 
> I'm going to stop commenting on every badly formatted comment :)

did not realize I had acquired a habit of badly formatted
comments. :-( Sorry for causing the pain. Will fix it.


RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-10-19  3:25   ` Michael Ellerman
  2017-10-19 17:02     ` Ram Pai
@ 2017-10-23  8:47     ` Aneesh Kumar K.V
  2017-10-23 16:29       ` Ram Pai
  1 sibling, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-23  8:47 UTC (permalink / raw)
  To: Michael Ellerman, Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Michael Ellerman <mpe@ellerman.id.au> writes:

> Ram Pai <linuxram@us.ibm.com> writes:
>
>> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
>> index 1a68cb1..c6c5559 100644
>> --- a/arch/powerpc/mm/hash64_64k.c
>> +++ b/arch/powerpc/mm/hash64_64k.c
>> @@ -126,18 +113,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>>  	if (__rpte_sub_valid(rpte, subpg_index)) {
>>  		int ret;
>>  
>> -		hash = hpt_hash(vpn, shift, ssize);
>> -		hidx = __rpte_to_hidx(rpte, subpg_index);
>> -		if (hidx & _PTEIDX_SECONDARY)
>> -			hash = ~hash;
>> -		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
>> -		slot += hidx & _PTEIDX_GROUP_IX;
>> +		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte,
>> +					subpg_index);
>> +		ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn,
>> +			MMU_PAGE_4K, MMU_PAGE_4K, ssize, flags);
>
> This was formatted correctly before:
>   
>> -		ret = mmu_hash_ops.hpte_updatepp(slot, rflags, vpn,
>> -						 MMU_PAGE_4K, MMU_PAGE_4K,
>> -						 ssize, flags);
>>  		/*
>> -		 *if we failed because typically the HPTE wasn't really here
>> +		 * if we failed because typically the HPTE wasn't really here
>
> If you're fixing it up please make it "If ...".
>
>>  		 * we try an insertion.
>>  		 */
>>  		if (ret == -1)
>> @@ -148,6 +130,15 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
>>  	}
>>  
>>  htab_insert_hpte:
>> +
>> +	/*
>> +	 * initialize all hidx entries to invalid value,
>> +	 * the first time the PTE is about to allocate
>> +	 * a 4K hpte
>> +	 */
>
> Should be:
> 	/*
> 	 * Initialize all hidx entries to invalid value, the first time
>          * the PTE is about to allocate a 4K HPTE.
> 	 */
>
>> +	if (!(old_pte & H_PAGE_COMBO))
>> +		rpte.hidx = ~0x0UL;
>> +
>
> Paul had the idea that if we biased the slot number by 1, we could make
> the "invalid" value be == 0.
>
> That would avoid needing to that above, and also mean the value is
> correctly invalid from the get-go, which would be good IMO.
>
> I think now that you've added the slot accessors it would be pretty easy
> to do.

That would imply we lose one slot in the primary group, which means we
will do extra work in some cases because our primary now has only 7
slots. And in the case of pseries, the hypervisor will always return the
least available slot, which implies we will do extra hcalls in case of an
hpte insert to an empty group?

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-09-14  8:13   ` Benjamin Herrenschmidt
@ 2017-10-23  8:52     ` Aneesh Kumar K.V
  2017-10-23 23:42       ` Ram Pai
  2017-10-23 19:22     ` Ram Pai
  1 sibling, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-23  8:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Ram Pai, mpe, linuxppc-dev
  Cc: paulus, khandual, bsingharora, hbabu, mhocko, bauerman, ebiederm

Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:
>> The second part of the PTE will hold
>> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.
>> NOTE: None of the bits in the secondary PTE were used
>> by the 64k-HPTE backed PTE.
>
> Have you measured the performance impact of this ? The second part of
> the PTE being in a different cache line there could be one...
>

I am also looking at a patch series removing the slot tracking
completely. With address randomization turned off, no swap in guest/host,
and making sure we touched most of guest ram, I don't find much impact
on performance when we don't track the slot at all. I will post the
patch series with numbers in a day or two. But my test was

while (5000) {
      mmap(128M)
      touch every page of 2048 pages
      munmap()
}
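
Spelled out as a standalone program, that is roughly (my reconstruction,
untested):

#include <sys/mman.h>

#define SZ (128UL << 20)	/* 128M == 2048 x 64K pages */

int main(void)
{
	unsigned long off;
	int i;

	for (i = 0; i < 5000; i++) {
		char *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;
		for (off = 0; off < SZ; off += 64 * 1024)
			p[off] = 1;	/* touch every 64K page */
		munmap(p, SZ);
	}
	return 0;
}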

It could also be the best case in my run, because I might have always
found the hash pte slot in the primary. In one measurement with swap on
and address randomization enabled, I did find a 50% impact. But then I
was not able to recreate that again. So it could be something I did wrong
in the test setup.

Ram,

Will you be able to get a test run with the above loop?

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 02/25] powerpc: define an additional vma bit for protection keys.
  2017-09-08 22:44 ` [PATCH 02/25] powerpc: define an additional vma bit for protection keys Ram Pai
  2017-09-14  4:38   ` Balbir Singh
@ 2017-10-23  9:25   ` Aneesh Kumar K.V
  2017-10-23  9:28     ` Aneesh Kumar K.V
  2017-10-23 17:43     ` Ram Pai
  1 sibling, 2 replies; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-23  9:25 UTC (permalink / raw)
  To: Ram Pai, mpe, linuxppc-dev
  Cc: benh, paulus, khandual, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> powerpc needs an additional vma bit to support 32 keys.
> Till the additional vma bit lands in include/linux/mm.h
> we have to define  it  in powerpc specific header file.
> This is  needed to get pkeys working on power.
>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/pkeys.h |   18 ++++++++++++++++++
>  1 files changed, 18 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index c02305a..44e01a2 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -3,6 +3,24 @@
>
>  extern bool pkey_inited;
>  extern bool pkey_execute_disable_support;
> +
> +/*
> + * powerpc needs an additional vma bit to support 32 keys.
> + * Till the additional vma bit lands in include/linux/mm.h
> + * we have to carry the hunk below. This is  needed to get
> + * pkeys working on power. -- Ram
> + */
> +#ifndef VM_HIGH_ARCH_BIT_4
> +#define VM_HIGH_ARCH_BIT_4	36
> +#define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
> +#define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
> +#define VM_PKEY_BIT0	VM_HIGH_ARCH_0
> +#define VM_PKEY_BIT1	VM_HIGH_ARCH_1
> +#define VM_PKEY_BIT2	VM_HIGH_ARCH_2
> +#define VM_PKEY_BIT3	VM_HIGH_ARCH_3
> +#define VM_PKEY_BIT4	VM_HIGH_ARCH_4
> +#endif
> +
>  #define ARCH_VM_PKEY_FLAGS 0

Do we want them in pkeys.h? Even if they are arch specific, for the
existing ones we have them in include/linux/mm.h. IIUC, vmflags details
are always in mm.h? This will be the first exception to that?


>
>  static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> -- 
> 1.7.1

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 02/25] powerpc: define an additional vma bit for protection keys.
  2017-10-23  9:25   ` Aneesh Kumar K.V
@ 2017-10-23  9:28     ` Aneesh Kumar K.V
  2017-10-23 17:57       ` Ram Pai
  2017-10-23 17:43     ` Ram Pai
  1 sibling, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-23  9:28 UTC (permalink / raw)
  To: Ram Pai, mpe, linuxppc-dev
  Cc: benh, paulus, khandual, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> Ram Pai <linuxram@us.ibm.com> writes:
>
>> powerpc needs an additional vma bit to support 32 keys.
>> Till the additional vma bit lands in include/linux/mm.h
>> we have to define  it  in powerpc specific header file.
>> This is  needed to get pkeys working on power.
>>
>> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
>> ---
>>  arch/powerpc/include/asm/pkeys.h |   18 ++++++++++++++++++
>>  1 files changed, 18 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
>> index c02305a..44e01a2 100644
>> --- a/arch/powerpc/include/asm/pkeys.h
>> +++ b/arch/powerpc/include/asm/pkeys.h
>> @@ -3,6 +3,24 @@
>>
>>  extern bool pkey_inited;
>>  extern bool pkey_execute_disable_support;
>> +
>> +/*
>> + * powerpc needs an additional vma bit to support 32 keys.
>> + * Till the additional vma bit lands in include/linux/mm.h
>> + * we have to carry the hunk below. This is  needed to get
>> + * pkeys working on power. -- Ram
>> + */
>> +#ifndef VM_HIGH_ARCH_BIT_4
>> +#define VM_HIGH_ARCH_BIT_4	36
>> +#define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
>> +#define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
>> +#define VM_PKEY_BIT0	VM_HIGH_ARCH_0
>> +#define VM_PKEY_BIT1	VM_HIGH_ARCH_1
>> +#define VM_PKEY_BIT2	VM_HIGH_ARCH_2
>> +#define VM_PKEY_BIT3	VM_HIGH_ARCH_3
>> +#define VM_PKEY_BIT4	VM_HIGH_ARCH_4
>> +#endif
>> +
>>  #define ARCH_VM_PKEY_FLAGS 0
>
> Do we want them in pkeys.h? Even if they are arch specific, for the
> existing ones we have them in include/linux/mm.h. IIUC, vmflags details
> are always in mm.h? This will be the first exception to that?


Also can we move that 

#if defined (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS)
# define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
# define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
# define VM_PKEY_BIT1	VM_HIGH_ARCH_1
# define VM_PKEY_BIT2	VM_HIGH_ARCH_2
# define VM_PKEY_BIT3	VM_HIGH_ARCH_3
#endif

to

#if defined (CONFIG_ARCH_HAS_PKEYS)
# define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
# define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
# define VM_PKEY_BIT1	VM_HIGH_ARCH_1
# define VM_PKEY_BIT2	VM_HIGH_ARCH_2
# define VM_PKEY_BIT3	VM_HIGH_ARCH_3
#endif


And then later update the generic code to handle PKEY_BIT4?

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 03/25] powerpc: track allocation status of all pkeys
  2017-09-08 22:44 ` [PATCH 03/25] powerpc: track allocation status of all pkeys Ram Pai
  2017-10-07 10:02   ` Michael Ellerman
  2017-10-18  2:47   ` Balbir Singh
@ 2017-10-23  9:41   ` Aneesh Kumar K.V
  2017-10-23 18:14     ` Ram Pai
  2017-10-24  6:28   ` Aneesh Kumar K.V
  3 siblings, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-23  9:41 UTC (permalink / raw)
  To: Ram Pai, mpe, linuxppc-dev
  Cc: benh, paulus, khandual, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> Total 32 keys are available on power7 and above. However
> pkey 0,1 are reserved. So effectively we  have  30 pkeys.

When you say reserved, reserved by whom? Is that part of the ISA or PAPR?
Also, do you expect that to change? If not, why all this indirection?
Can we have the mask as a #define for the 4K and 64K page size
configs?

>
> On 4K kernels, we do not  have  5  bits  in  the  PTE to
> represent  all the keys; we only have 3 bits. Two of those
> keys are reserved; pkey 0 and pkey 1. So effectively  we
> have 6 pkeys.
>
> This patch keeps track of reserved keys, allocated  keys
> and keys that are currently free.
>
> Also it  adds  skeletal  functions  and macros, that the
> architecture-independent code expects to be available.


-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 06/25] powerpc: cleaup AMR, iAMR when a key is allocated or freed
  2017-09-08 22:44 ` [PATCH 06/25] powerpc: cleaup AMR, iAMR when a key is allocated or freed Ram Pai
  2017-10-18  3:34   ` [PATCH 06/25] powerpc: cleaup AMR,iAMR " Balbir Singh
@ 2017-10-23  9:43   ` Aneesh Kumar K.V
  2017-10-23 18:29     ` [PATCH 06/25] powerpc: cleaup AMR,iAMR " Ram Pai
  1 sibling, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-23  9:43 UTC (permalink / raw)
  To: Ram Pai, mpe, linuxppc-dev
  Cc: benh, paulus, khandual, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> cleanup the bits corresponding to a key in the AMR and IAMR
> registers, when the key is newly allocated/activated or is freed.
> We don't want residual bits to cause the hardware to enforce
> unintended behavior when the key is activated or freed.
>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/pkeys.h |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 5a83ed7..53bf13b 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -54,6 +54,8 @@ static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
>  		mm_set_pkey_is_allocated(mm, pkey));
>  }
>
> +extern void __arch_activate_pkey(int pkey);
> +extern void __arch_deactivate_pkey(int pkey);
>  /*
>   * Returns a positive, 5-bit key on success, or -1 on failure.
>   */
> @@ -80,6 +82,12 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
>
>  	ret = ffz((u32)mm_pkey_allocation_map(mm));
>  	mm_set_pkey_allocated(mm, ret);
> +
> +	/*
> +	 * enable the key in the hardware
> +	 */
> +	if (ret > 0)
> +		__arch_activate_pkey(ret);
>  	return ret;
>  }

We are already arch specific because we are defining them in
arch/powerpc/include/asm/, so why __arch_activate_pkey()?

>
> @@ -91,6 +99,10 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
>  	if (!mm_pkey_is_allocated(mm, pkey))
>  		return -EINVAL;
>
> +	/*
> +	 * Disable the key in the hardware
> +	 */
> +	__arch_deactivate_pkey(pkey);
>  	mm_set_pkey_free(mm, pkey);
>
>  	return 0;
> -- 
> 1.7.1

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 06/25] powerpc: cleaup AMR, iAMR when a key is allocated or freed
  2017-10-18  3:34   ` [PATCH 06/25] powerpc: cleaup AMR,iAMR " Balbir Singh
@ 2017-10-23  9:43     ` Aneesh Kumar K.V
  2017-10-23 18:36       ` [PATCH 06/25] powerpc: cleaup AMR,iAMR " Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-23  9:43 UTC (permalink / raw)
  To: Balbir Singh, Ram Pai
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, hbabu, mhocko,
	bauerman, ebiederm

Balbir Singh <bsingharora@gmail.com> writes:

> On Fri,  8 Sep 2017 15:44:54 -0700
> Ram Pai <linuxram@us.ibm.com> wrote:
>
>> cleanup the bits corresponding to a key in the AMR and IAMR
>> registers, when the key is newly allocated/activated or is freed.
>> We don't want residual bits to cause the hardware to enforce
>> unintended behavior when the key is activated or freed.
>> 
>> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
>> ---
>>  arch/powerpc/include/asm/pkeys.h |   12 ++++++++++++
>>  1 files changed, 12 insertions(+), 0 deletions(-)
>> 
>> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
>> index 5a83ed7..53bf13b 100644
>> --- a/arch/powerpc/include/asm/pkeys.h
>> +++ b/arch/powerpc/include/asm/pkeys.h
>> @@ -54,6 +54,8 @@ static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
>>  		mm_set_pkey_is_allocated(mm, pkey));
>>  }
>>  
>> +extern void __arch_activate_pkey(int pkey);
>> +extern void __arch_deactivate_pkey(int pkey);
>>  /*
>>   * Returns a positive, 5-bit key on success, or -1 on failure.
>>   */
>> @@ -80,6 +82,12 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
>>  
>>  	ret = ffz((u32)mm_pkey_allocation_map(mm));
>>  	mm_set_pkey_allocated(mm, ret);
>> +
>> +	/*
>> +	 * enable the key in the hardware
>> +	 */
>> +	if (ret > 0)
>> +		__arch_activate_pkey(ret);
>>  	return ret;
>>  }
>>  
>> @@ -91,6 +99,10 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
>>  	if (!mm_pkey_is_allocated(mm, pkey))
>>  		return -EINVAL;
>>  
>> +	/*
>> +	 * Disable the key in the hardware
>> +	 */
>> +	__arch_deactivate_pkey(pkey);
>>  	mm_set_pkey_free(mm, pkey);
>>  
>>  	return 0;
>
> I think some of these patches can be merged; too much fine granularity
> is hurting my ability to see the larger function/implementation.


Completely agree

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-10-23  8:47     ` Aneesh Kumar K.V
@ 2017-10-23 16:29       ` Ram Pai
  2017-10-25  9:18         ` Michael Ellerman
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-23 16:29 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Michael Ellerman, linuxppc-dev, benh, paulus, khandual,
	bsingharora, hbabu, mhocko, bauerman, ebiederm

On Mon, Oct 23, 2017 at 02:17:39PM +0530, Aneesh Kumar K.V wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> 
> > Ram Pai <linuxram@us.ibm.com> writes:
> >
> >> diff --git a/arch/powerpc/mm/hash64_64k.c b/arch/powerpc/mm/hash64_64k.c
> >> index 1a68cb1..c6c5559 100644
> >> --- a/arch/powerpc/mm/hash64_64k.c
> >> +++ b/arch/powerpc/mm/hash64_64k.c
> >> @@ -126,18 +113,13 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
> >>  	if (__rpte_sub_valid(rpte, subpg_index)) {
> >>  		int ret;
> >>  
> >> -		hash = hpt_hash(vpn, shift, ssize);
> >> -		hidx = __rpte_to_hidx(rpte, subpg_index);
> >> -		if (hidx & _PTEIDX_SECONDARY)
> >> -			hash = ~hash;
> >> -		slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
> >> -		slot += hidx & _PTEIDX_GROUP_IX;
> >> +		gslot = pte_get_hash_gslot(vpn, shift, ssize, rpte,
> >> +					subpg_index);
> >> +		ret = mmu_hash_ops.hpte_updatepp(gslot, rflags, vpn,
> >> +			MMU_PAGE_4K, MMU_PAGE_4K, ssize, flags);
> >
> > This was formatted correctly before:
> >   
> >> -		ret = mmu_hash_ops.hpte_updatepp(slot, rflags, vpn,
> >> -						 MMU_PAGE_4K, MMU_PAGE_4K,
> >> -						 ssize, flags);
> >>  		/*
> >> -		 *if we failed because typically the HPTE wasn't really here
> >> +		 * if we failed because typically the HPTE wasn't really here
> >
> > If you're fixing it up please make it "If ...".
> >
> >>  		 * we try an insertion.
> >>  		 */
> >>  		if (ret == -1)
> >> @@ -148,6 +130,15 @@ int __hash_page_4K(unsigned long ea, unsigned long access, unsigned long vsid,
> >>  	}
> >>  
> >>  htab_insert_hpte:
> >> +
> >> +	/*
> >> +	 * initialize all hidx entries to invalid value,
> >> +	 * the first time the PTE is about to allocate
> >> +	 * a 4K hpte
> >> +	 */
> >
> > Should be:
> > 	/*
> > 	 * Initialize all hidx entries to invalid value, the first time
> >          * the PTE is about to allocate a 4K HPTE.
> > 	 */
> >
> >> +	if (!(old_pte & H_PAGE_COMBO))
> >> +		rpte.hidx = ~0x0UL;
> >> +
> >
> > Paul had the idea that if we biased the slot number by 1, we could make
> > the "invalid" value be == 0.
> >
> > That would avoid needing to do that above, and also mean the value is
> > correctly invalid from the get-go, which would be good IMO.
> >
> > I think now that you've added the slot accessors it would be pretty easy
> > to do.
> 
> That would imply we lose one slot in the primary group, which means we
> will do extra work in some cases because our primary now has only 7
> slots. And in the case of pseries, the hypervisor will always return the
> least available slot, which implies we will do extra hcalls in case of an
> hpte insert to an empty group?


No. that is not the idea.  The idea is that slot 'F' in the secondary
will continue to be an invalid slot, but slots will be represented
offset-by-one in the PTE.  In other words, 0 will be represented as 1,
1 as 2, ... and n as (n+1) % 16.

The idea seems feasible.  It has the advantage that 0 in the PTE
means an invalid slot. But it can be confusing to the casual code-
reader, so we will need to put in a big comment to explain that.
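
In the accessors, something like this (untested sketch):

	/* set: clear the nibble, then store (slot + 1) & 0xf, so that
	 * secondary slot 0xF wraps to 0 == invalid */
	rpte.hidx &= ~(0xfUL << (subpg_index << 2));
	*hidxp = rpte.hidx | (((slot + 1) & 0xfUL) << (subpg_index << 2));

	/* get: 0 means "never hashed", otherwise the slot is hidx - 1 */
	hidx = (rpte.hidx >> (subpg_index << 2)) & 0xfUL;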

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 02/25] powerpc: define an additional vma bit for protection keys.
  2017-10-23  9:25   ` Aneesh Kumar K.V
  2017-10-23  9:28     ` Aneesh Kumar K.V
@ 2017-10-23 17:43     ` Ram Pai
  1 sibling, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-23 17:43 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, bsingharora, hbabu,
	mhocko, bauerman, ebiederm

On Mon, Oct 23, 2017 at 02:55:51PM +0530, Aneesh Kumar K.V wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > powerpc needs an additional vma bit to support 32 keys.
> > Till the additional vma bit lands in include/linux/mm.h
> > we have to define  it  in powerpc specific header file.
> > This is  needed to get pkeys working on power.
> >
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/pkeys.h |   18 ++++++++++++++++++
> >  1 files changed, 18 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index c02305a..44e01a2 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -3,6 +3,24 @@
> >
> >  extern bool pkey_inited;
> >  extern bool pkey_execute_disable_support;
> > +
> > +/*
> > + * powerpc needs an additional vma bit to support 32 keys.
> > + * Till the additional vma bit lands in include/linux/mm.h
> > + * we have to carry the hunk below. This is  needed to get
> > + * pkeys working on power. -- Ram
> > + */
> > +#ifndef VM_HIGH_ARCH_BIT_4
> > +#define VM_HIGH_ARCH_BIT_4	36
> > +#define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
> > +#define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
> > +#define VM_PKEY_BIT0	VM_HIGH_ARCH_0
> > +#define VM_PKEY_BIT1	VM_HIGH_ARCH_1
> > +#define VM_PKEY_BIT2	VM_HIGH_ARCH_2
> > +#define VM_PKEY_BIT3	VM_HIGH_ARCH_3
> > +#define VM_PKEY_BIT4	VM_HIGH_ARCH_4
> > +#endif
> > +
> >  #define ARCH_VM_PKEY_FLAGS 0
> 
> Do we want them in pkeys.h ? Even if they are arch specific for the
> existing ones we have them in include/linux/mm.h. IIUC, vmflags details
> are always in mm.h? This will be the first exception to that?

I am trying to get the real fix into include/linux/mm.h.
If that happens soon enough, then this hunk is not needed.
Till then this is an exception, but it is the **ONLY** exception.

I think your point is to have this hunk in include/linux/mm.h?

If yes, then it would be easier to push the real fix into
include/linux/mm.h instead of pushing this hunk in there.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 02/25] powerpc: define an additional vma bit for protection keys.
  2017-10-23  9:28     ` Aneesh Kumar K.V
@ 2017-10-23 17:57       ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-23 17:57 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, bsingharora, hbabu,
	mhocko, bauerman, ebiederm

On Mon, Oct 23, 2017 at 02:58:55PM +0530, Aneesh Kumar K.V wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> 
> > Ram Pai <linuxram@us.ibm.com> writes:
> >
> >> powerpc needs an additional vma bit to support 32 keys.
> >> Till the additional vma bit lands in include/linux/mm.h
> >> we have to define  it  in powerpc specific header file.
> >> This is  needed to get pkeys working on power.
> >>
> >> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> >> ---
> >>  arch/powerpc/include/asm/pkeys.h |   18 ++++++++++++++++++
> >>  1 files changed, 18 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> >> index c02305a..44e01a2 100644
> >> --- a/arch/powerpc/include/asm/pkeys.h
> >> +++ b/arch/powerpc/include/asm/pkeys.h
> >> @@ -3,6 +3,24 @@
> >>
> >>  extern bool pkey_inited;
> >>  extern bool pkey_execute_disable_support;
> >> +
> >> +/*
> >> + * powerpc needs an additional vma bit to support 32 keys.
> >> + * Till the additional vma bit lands in include/linux/mm.h
> >> + * we have to carry the hunk below. This is  needed to get
> >> + * pkeys working on power. -- Ram
> >> + */
> >> +#ifndef VM_HIGH_ARCH_BIT_4
> >> +#define VM_HIGH_ARCH_BIT_4	36
> >> +#define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
> >> +#define VM_PKEY_SHIFT VM_HIGH_ARCH_BIT_0
> >> +#define VM_PKEY_BIT0	VM_HIGH_ARCH_0
> >> +#define VM_PKEY_BIT1	VM_HIGH_ARCH_1
> >> +#define VM_PKEY_BIT2	VM_HIGH_ARCH_2
> >> +#define VM_PKEY_BIT3	VM_HIGH_ARCH_3
> >> +#define VM_PKEY_BIT4	VM_HIGH_ARCH_4
> >> +#endif
> >> +
> >>  #define ARCH_VM_PKEY_FLAGS 0
> >
> > Do we want them in pkeys.h ? Even if they are arch specific for the
> > existing ones we have them in include/linux/mm.h. IIUC, vmflags details
> > are always in mm.h? This will be the first exception to that?
> 
> 
> Also can we move that 
> 
> #if defined (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS)
> # define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
> # define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
> # define VM_PKEY_BIT1	VM_HIGH_ARCH_1
> # define VM_PKEY_BIT2	VM_HIGH_ARCH_2
> # define VM_PKEY_BIT3	VM_HIGH_ARCH_3
> #endif
> 
> to
> 
> #if defined (CONFIG_ARCH_HAS_PKEYS)
> # define VM_PKEY_SHIFT	VM_HIGH_ARCH_BIT_0
> # define VM_PKEY_BIT0	VM_HIGH_ARCH_0	/* A protection key is a 4-bit value */
> # define VM_PKEY_BIT1	VM_HIGH_ARCH_1
> # define VM_PKEY_BIT2	VM_HIGH_ARCH_2
> # define VM_PKEY_BIT3	VM_HIGH_ARCH_3
> #endif
> 
> 
> And then later update the generic to handle PKEY_BIT4?


Yes. The above changes have been implemented in a patch sent to the mm
mailing list as well as to lkml.

https://lkml.org/lkml/2017/9/15/504

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 03/25] powerpc: track allocation status of all pkeys
  2017-10-23  9:41   ` Aneesh Kumar K.V
@ 2017-10-23 18:14     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-23 18:14 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, bsingharora, hbabu,
	mhocko, bauerman, ebiederm

On Mon, Oct 23, 2017 at 03:11:28PM +0530, Aneesh Kumar K.V wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > Total 32 keys are available on power7 and above. However
> > pkey 0,1 are reserved. So effectively we  have  30 pkeys.
> 
> When you say reserved, reserved by whom? Is that part of ISA or PAPR ?
> Also do you expect that to change. If not why all these indirection?
> Can we have the mask as a #define for 4K and 64K page size
> config?

The reserved keys cannot be determined statically. They depend on the
platform and the key info exported to us by the device tree. Hence the
mask cannot be a compile-time macro.

One of the subsequent patches reads the device tree and sets
pkeys_total appropriately. FYI.

BTW: key 0 is reserved by the ISA.  Key 1 is also reserved, but its
status is nebulous: there is a programming note in the ISA advising
against using key 1, and testing and experimentation with key 1 led
to unpredictable behavior on PowerVM. Key 31 may or may not be used
by the hypervisor; the device tree has to be consulted to determine
that.

Bottom line: the mask can be determined only at runtime.
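
A rough sketch of how such a runtime mask could be assembled (the
device-tree helper below is a made-up name; the actual patch reads
the properties directly and sets pkeys_total as mentioned above):

static u32 compute_reserved_pkeys_mask(void)
{
	/* keys 0 and 1, per the ISA notes above */
	u32 mask = 0x3;

	/* hypothetical helper wrapping the device-tree lookup */
	if (device_tree_reserves_key(31))
		mask |= 1u << 31;

	return mask;
}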

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 06/25] powerpc: cleaup AMR,iAMR when a key is allocated or freed
  2017-10-23  9:43   ` [PATCH 06/25] powerpc: cleaup AMR, iAMR " Aneesh Kumar K.V
@ 2017-10-23 18:29     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-23 18:29 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, bsingharora, hbabu,
	mhocko, bauerman, ebiederm

On Mon, Oct 23, 2017 at 03:13:33PM +0530, Aneesh Kumar K.V wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > cleanup the bits corresponding to a key in the AMR, and IAMR
> > register, when the key is newly allocated/activated or is freed.
> > We dont want some residual bits cause the hardware enforce
> > unintended behavior when the key is activated or freed.
> >
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/pkeys.h |   12 ++++++++++++
> >  1 files changed, 12 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index 5a83ed7..53bf13b 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -54,6 +54,8 @@ static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> >  		mm_set_pkey_is_allocated(mm, pkey));
> >  }
> >
> > +extern void __arch_activate_pkey(int pkey);
> > +extern void __arch_deactivate_pkey(int pkey);
> >  /*
> >   * Returns a positive, 5-bit key on success, or -1 on failure.
> >   */
> > @@ -80,6 +82,12 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
> >
> >  	ret = ffz((u32)mm_pkey_allocation_map(mm));
> >  	mm_set_pkey_allocated(mm, ret);
> > +
> > +	/*
> > +	 * enable the key in the hardware
> > +	 */
> > +	if (ret > 0)
> > +		__arch_activate_pkey(ret);
> >  	return ret;
> >  }
> 
> We are already arch specific because we are defining them in
> arch/powerpc/include/asm/, then why __arch_activate_pkey() ? 

Almost all the memory-key internal functions are named with two
leading underscores, so I am just *loosely* following a convention
within that file. The corresponding functions without the two
underscores are the ones exposed to the arch-neutral code.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 06/25] powerpc: cleaup AMR,iAMR when a key is allocated or freed
  2017-10-23  9:43     ` [PATCH 06/25] powerpc: cleaup AMR, iAMR " Aneesh Kumar K.V
@ 2017-10-23 18:36       ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-23 18:36 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Balbir Singh, mpe, linuxppc-dev, benh, paulus, khandual, hbabu,
	mhocko, bauerman, ebiederm

On Mon, Oct 23, 2017 at 03:13:45PM +0530, Aneesh Kumar K.V wrote:
> Balbir Singh <bsingharora@gmail.com> writes:
> 
> > On Fri,  8 Sep 2017 15:44:54 -0700
> > Ram Pai <linuxram@us.ibm.com> wrote:
> >
> >> cleanup the bits corresponding to a key in the AMR, and IAMR
> >> register, when the key is newly allocated/activated or is freed.
> >> We dont want some residual bits cause the hardware enforce
> >> unintended behavior when the key is activated or freed.
> >> 
> >> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> >> ---
> >>  arch/powerpc/include/asm/pkeys.h |   12 ++++++++++++
> >>  1 files changed, 12 insertions(+), 0 deletions(-)
> >> 
> >> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> >> index 5a83ed7..53bf13b 100644
> >> --- a/arch/powerpc/include/asm/pkeys.h
> >> +++ b/arch/powerpc/include/asm/pkeys.h
> >> @@ -54,6 +54,8 @@ static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> >>  		mm_set_pkey_is_allocated(mm, pkey));
> >>  }
> >>  
> >> +extern void __arch_activate_pkey(int pkey);
> >> +extern void __arch_deactivate_pkey(int pkey);
> >>  /*
> >>   * Returns a positive, 5-bit key on success, or -1 on failure.
> >>   */
> >> @@ -80,6 +82,12 @@ static inline int mm_pkey_alloc(struct mm_struct *mm)
> >>  
> >>  	ret = ffz((u32)mm_pkey_allocation_map(mm));
> >>  	mm_set_pkey_allocated(mm, ret);
> >> +
> >> +	/*
> >> +	 * enable the key in the hardware
> >> +	 */
> >> +	if (ret > 0)
> >> +		__arch_activate_pkey(ret);
> >>  	return ret;
> >>  }
> >>  
> >> @@ -91,6 +99,10 @@ static inline int mm_pkey_free(struct mm_struct *mm, int pkey)
> >>  	if (!mm_pkey_is_allocated(mm, pkey))
> >>  		return -EINVAL;
> >>  
> >> +	/*
> >> +	 * Disable the key in the hardware
> >> +	 */
> >> +	__arch_deactivate_pkey(pkey);
> >>  	mm_set_pkey_free(mm, pkey);
> >>  
> >>  	return 0;
> >
> > I think some of these patches can be merged, too much fine granularity
> > is hurting my ability to see the larger function/implementation.
> 
> 
> Completely agree

Me agree too :) I had to fine-grain it to satisfy comments received
during the initial versions; initially I had about 10-12 patches in
total.

Thanks, will merge a couple of these patches.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-09-14  8:13   ` Benjamin Herrenschmidt
  2017-10-23  8:52     ` Aneesh Kumar K.V
@ 2017-10-23 19:22     ` Ram Pai
  2017-10-24  3:37       ` Aneesh Kumar K.V
  1 sibling, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-23 19:22 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: mpe, linuxppc-dev, ebiederm, mhocko, paulus, aneesh.kumar,
	bauerman, khandual

On Thu, Sep 14, 2017 at 06:13:57PM +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:
> > The second part of the PTE will hold
> > (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.
> > NOTE: None of the bits in the secondary PTE were not used
> > by 64k-HPTE backed PTE.
> 
> Have you measured the performance impact of this ? The second part of
> the PTE being in a different cache line there could be one...

Hmm... I missed responding to this comment.

I did a preliminary measurement running the mmap bench in the
selftests. I ran it multiple times; almost always the numbers were
either equal to or better than without the patch series.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 02/25] powerpc: define an additional vma bit for protection keys.
  2017-09-14  8:11     ` Benjamin Herrenschmidt
@ 2017-10-23 21:06       ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-23 21:06 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Balbir Singh, ebiederm, mhocko, paulus, aneesh.kumar, bauerman,
	linuxppc-dev, khandual

On Thu, Sep 14, 2017 at 06:11:32PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2017-09-14 at 14:38 +1000, Balbir Singh wrote:
> > On Fri,  8 Sep 2017 15:44:50 -0700
> > Ram Pai <linuxram@us.ibm.com> wrote:
> > 
> > > powerpc needs an additional vma bit to support 32 keys.
> > > Till the additional vma bit lands in include/linux/mm.h
> > > we have to define  it  in powerpc specific header file.
> > > This is  needed to get pkeys working on power.
> > > 
> > > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > > ---
> > 
> > "This" being an arch specific hack for the additional bit?
> 
> Arch VMA bits ? really ? I'd rather we limit ourselves to 16 keys first
> then push for adding the extra bit to the generic code.

(hmm.. this mail did not land in my mailbox :( Sorry for the delay. Just
saw it in the mailing list.)

I think this is a good idea. I can code it in such a way that we
support 16 or 32 keys depending on the availability of the vma bit.

No more hunks in the code.


RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-10-23  8:52     ` Aneesh Kumar K.V
@ 2017-10-23 23:42       ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-23 23:42 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Benjamin Herrenschmidt, mpe, linuxppc-dev, mhocko, paulus,
	ebiederm, bauerman, khandual

On Mon, Oct 23, 2017 at 02:22:44PM +0530, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> 
> > On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:
> >> The second part of the PTE will hold
> >> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.
> >> NOTE: None of the bits in the secondary PTE were not used
> >> by 64k-HPTE backed PTE.
> >
> > Have you measured the performance impact of this ? The second part of
> > the PTE being in a different cache line there could be one...
> >
> 
> I am also looking at a patch series removing the slot tracking
> completely. With randomize address turned off and no swap in guest/host
> and making sure we touched most of guest ram, I don't find much impact
> in performance when we don't track the slot at all. I will post the
> patch series with numbers in a day or two. But my test was
> 
> while (5000) {
>       mmap(128M)
>       touch every page of 2048 pages
>       munmap()
> }
> 
> It could also be the best case in my run because I might have always
> found the hash pte slot in the primary. In one measurement with swap on
> and address randomization enabled, I did find a 50% impact. But then I
> was not able to recreate that again. So it could be something I did
> wrong in the test setup.
> 
> Ram,
> 
> Will you be able to get a test run with the above loop?

Yes. Results with the patch look good; better than without the patch.


/--------------------------------------------\
| Iteration | secs w/ patch | secs w/o patch |
|--------------------------------------------|
| 1         | 45.572621     | 49.046994      |
| 2         | 46.049545     | 49.378756      |
| 3         | 46.103657     | 49.223591      |
| 4         | 46.298903     | 48.991245      |
| 5         | 46.353202     | 48.988033      |
| 6         | 45.440878     | 49.175846      |
| 7         | 46.860373     | 49.008395      |
| 8         | 46.221390     | 49.236964      |
| 9         | 45.794993     | 49.171927      |
| 10        | 46.569491     | 48.995628      |
|--------------------------------------------|
| average   | 46.1265053    | 49.1217379     |
\--------------------------------------------/


The code is as follows:


diff --git a/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c b/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c
index 8d084a2..ef2ad87 100644
--- a/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c
+++ b/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c
@@ -10,14 +10,14 @@
 
 #include "utils.h"
 
-#define ITERATIONS 5000000
+#define ITERATIONS 5000
 
 #define MEMSIZE (128 * 1024 * 1024)
 
 int test_mmap(void)
 {
 	struct timespec ts_start, ts_end;
-	unsigned long i = ITERATIONS;
+	unsigned long i = ITERATIONS, j;
 
 	clock_gettime(CLOCK_MONOTONIC, &ts_start);
 
@@ -25,6 +25,10 @@ int test_mmap(void)
 		char *c = mmap(NULL, MEMSIZE, PROT_READ|PROT_WRITE,
 			       MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
 		FAIL_IF(c == MAP_FAILED);
+
+		for (j=0; j < (MEMSIZE >> 16); j++)
+			c[j<<16] = 0xf;
+
 		munmap(c, MEMSIZE);
 	}
 

^ permalink raw reply related	[flat|nested] 134+ messages in thread

* Re: [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K backed HPTE pages
  2017-10-23 19:22     ` Ram Pai
@ 2017-10-24  3:37       ` Aneesh Kumar K.V
  0 siblings, 0 replies; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-24  3:37 UTC (permalink / raw)
  To: Ram Pai, Benjamin Herrenschmidt
  Cc: mpe, linuxppc-dev, ebiederm, mhocko, paulus, bauerman, khandual



On 10/24/2017 12:52 AM, Ram Pai wrote:
> On Thu, Sep 14, 2017 at 06:13:57PM +1000, Benjamin Herrenschmidt wrote:
>> On Fri, 2017-09-08 at 15:44 -0700, Ram Pai wrote:
>>> The second part of the PTE will hold
>>> (H_PAGE_F_SECOND|H_PAGE_F_GIX) at bit 60,61,62,63.
>>> NOTE: None of the bits in the secondary PTE were not used
>>> by 64k-HPTE backed PTE.
>>
>> Have you measured the performance impact of this ? The second part of
>> the PTE being in a different cache line there could be one...
> 
> hmm..missed responding to this comment.
> 
> I did a preliminay measurement running mmap bench in the selftest.
> Ran it multiple times. almost always the numbers were either equal-to
> or better-than without the patch-series.

The mmap bench doesn't take any faults; it is just mmap/munmap in a loop.

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 09/25] powerpc: ability to create execute-disabled pkeys
  2017-09-08 22:44 ` [PATCH 09/25] powerpc: ability to create execute-disabled pkeys Ram Pai
  2017-10-18  3:42   ` Balbir Singh
@ 2017-10-24  4:36   ` Aneesh Kumar K.V
  2017-10-28 23:18     ` Ram Pai
  1 sibling, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-24  4:36 UTC (permalink / raw)
  To: Ram Pai, mpe, linuxppc-dev
  Cc: benh, paulus, khandual, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> powerpc has hardware support to disable execute on a pkey.
> This patch enables the ability to create execute-disabled
> keys.

Can you summarize here how this works?  Access to the IAMR is
privileged, so how will the keys framework work with the IAMR?

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
  2017-09-08 22:44 ` [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers Ram Pai
  2017-10-18  3:24   ` Balbir Singh
@ 2017-10-24  6:25   ` Aneesh Kumar K.V
  2017-10-24  7:04     ` Ram Pai
  1 sibling, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-24  6:25 UTC (permalink / raw)
  To: Ram Pai, mpe, linuxppc-dev
  Cc: benh, paulus, khandual, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> Introduce  helper functions that can initialize the bits in the AMR,
> IAMR and UAMOR register; the bits that correspond to the given pkey.
>
> Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> ---
>  arch/powerpc/include/asm/pkeys.h |    1 +
>  arch/powerpc/mm/pkeys.c          |   46 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 47 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> index 133f8c4..5a83ed7 100644
> --- a/arch/powerpc/include/asm/pkeys.h
> +++ b/arch/powerpc/include/asm/pkeys.h
> @@ -26,6 +26,7 @@
>  #define arch_max_pkey()  pkeys_total
>  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
>  				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> +#define AMR_BITS_PER_PKEY 2
>
>  #define pkey_alloc_mask(pkey) (0x1 << pkey)
>
> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index ebc9e84..178aa33 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -59,3 +59,49 @@ void __init pkey_initialize(void)
>  	for (i = 2; i < (pkeys_total - os_reserved); i++)
>  		initial_allocation_mask &= ~(0x1<<i);
>  }
> +
> +#define PKEY_REG_BITS (sizeof(u64)*8)
> +#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
> +
> +static inline void init_amr(int pkey, u8 init_bits)
> +{
> +	u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
> +	u64 old_amr = read_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
> +
> +	write_amr(old_amr | new_amr_bits);
> +}
> +
> +static inline void init_iamr(int pkey, u8 init_bits)
> +{
> +	u64 new_iamr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
> +	u64 old_iamr = read_iamr() & ~((u64)(0x3ul) << pkeyshift(pkey));
> +
> +	write_iamr(old_iamr | new_iamr_bits);
> +}
> +
> +static void pkey_status_change(int pkey, bool enable)
> +{
> +	u64 old_uamor;
> +
> +	/* reset the AMR and IAMR bits for this key */
> +	init_amr(pkey, 0x0);
> +	init_iamr(pkey, 0x0);
> +
> +	/* enable/disable key */
> +	old_uamor = read_uamor();
> +	if (enable)
> +		old_uamor |= (0x3ul << pkeyshift(pkey));
> +	else
> +		old_uamor &= ~(0x3ul << pkeyshift(pkey));
> +	write_uamor(old_uamor);
> +}

That one is confusing. We discussed this off-list, but I want to bring
it to the list for further discussion. So now the core kernel requests
a key via mm_pkey_alloc(). Why not do the below there:

static inline int mm_pkey_alloc(struct mm_struct *mm)
{
	/*
	 * Note: this is the one and only place we make sure
	 * that the pkey is valid as far as the hardware is
	 * concerned.  The rest of the kernel trusts that
	 * only good, valid pkeys come out of here.
	 */
	u32 all_pkeys_mask = (u32)(~(0x0));
	int ret;

	if (!pkey_inited)
		return -1;
	/*
	 * Are we out of pkeys?  We must handle this specially
	 * because ffz() behavior is undefined if there are no
	 * zeros.
	 */
	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
		return -1;

	ret = ffz((u32)mm_pkey_allocation_map(mm));
	mm_set_pkey_allocated(mm, ret);

	return ret;
}

Your mm_pkey_allocation_map() should have the keys specified in AMOR and
UAMOR marked as allocated; they are in use by the hypervisor and OS
respectively.


There is no need for __arch_activate_pkey() etc. By default, if the OS
has not requested a key for its internal use, UAMOR should be
0xFFFFFFFF, and the AMOR value you derive from the device tree based on
what keys the hypervisor has reserved.

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 03/25] powerpc: track allocation status of all pkeys
  2017-09-08 22:44 ` [PATCH 03/25] powerpc: track allocation status of all pkeys Ram Pai
                     ` (2 preceding siblings ...)
  2017-10-23  9:41   ` Aneesh Kumar K.V
@ 2017-10-24  6:28   ` Aneesh Kumar K.V
  2017-10-24  7:23     ` Ram Pai
  3 siblings, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-24  6:28 UTC (permalink / raw)
  To: Ram Pai, mpe, linuxppc-dev
  Cc: benh, paulus, khandual, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:
 +
> +#define mm_set_pkey_is_allocated(mm, pkey)	\
> +	(mm_pkey_allocation_map(mm) & pkey_alloc_mask(pkey))
> +

>  static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
>  {
> -	return (pkey == 0);
> +	/* a reserved key is never considered as 'explicitly allocated' */
> +	return ((pkey < arch_max_pkey()) &&
> +		!mm_set_pkey_is_reserved(mm, pkey) &&
> +		mm_set_pkey_is_allocated(mm, pkey));
>  }
>

That is confusing naming. What is mm_set_pkey_is_allocated()? The
'set' in that name is confusing.

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 09/25] powerpc: ability to create execute-disabled pkeys
  2017-10-18  5:15     ` Ram Pai
@ 2017-10-24  6:58       ` Aneesh Kumar K.V
  2017-10-24  7:20         ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-24  6:58 UTC (permalink / raw)
  To: Ram Pai, Balbir Singh
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, hbabu, mhocko,
	bauerman, ebiederm

Ram Pai <linuxram@us.ibm.com> writes:

> On Wed, Oct 18, 2017 at 02:42:56PM +1100, Balbir Singh wrote:
>> On Fri,  8 Sep 2017 15:44:57 -0700
>> Ram Pai <linuxram@us.ibm.com> wrote:
>> 
>> > powerpc has hardware support to disable execute on a pkey.
>> > This patch enables the ability to create execute-disabled
>> > keys.
>> > 
>> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
>> > ---
>> >  arch/powerpc/include/uapi/asm/mman.h |    6 ++++++
>> >  arch/powerpc/mm/pkeys.c              |   16 ++++++++++++++++
>> >  2 files changed, 22 insertions(+), 0 deletions(-)
>> > 
>> > diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
>> > index ab45cc2..f272b09 100644
>> > --- a/arch/powerpc/include/uapi/asm/mman.h
>> > +++ b/arch/powerpc/include/uapi/asm/mman.h
>> > @@ -45,4 +45,10 @@
>> >  #define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)	/* 1GB   HugeTLB Page */
>> >  #define MAP_HUGE_16GB	(34 << MAP_HUGE_SHIFT)	/* 16GB  HugeTLB Page */
>> >  
>> > +/* override any generic PKEY Permission defines */
>> > +#define PKEY_DISABLE_EXECUTE   0x4
>> > +#undef PKEY_ACCESS_MASK
>> > +#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
>> > +				PKEY_DISABLE_WRITE  |\
>> > +				PKEY_DISABLE_EXECUTE)
>> >  #endif /* _UAPI_ASM_POWERPC_MMAN_H */
>> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
>> > index cc5be6a..2282864 100644
>> > --- a/arch/powerpc/mm/pkeys.c
>> > +++ b/arch/powerpc/mm/pkeys.c
>> > @@ -24,6 +24,14 @@ void __init pkey_initialize(void)
>> >  {
>> >  	int os_reserved, i;
>> >  
>> > +	/*
>> > +	 * we define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
>> > +	 * generic defines for PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.
>> > +	 * Ensure that the bits a distinct.
>> > +	 */
>> > +	BUILD_BUG_ON(PKEY_DISABLE_EXECUTE &
>> > +		     (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
>> 
> >> Will these values ever change? It's good to have I guess.
>> 
>> > +
>> >  	/* disable the pkey system till everything
>> >  	 * is in place. A patch further down the
>> >  	 * line will enable it.
>> > @@ -120,10 +128,18 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
>> >  		unsigned long init_val)
>> >  {
>> >  	u64 new_amr_bits = 0x0ul;
>> > +	u64 new_iamr_bits = 0x0ul;
>> >  
>> >  	if (!is_pkey_enabled(pkey))
>> >  		return -EINVAL;
>> >  
>> > +	if ((init_val & PKEY_DISABLE_EXECUTE)) {
>> > +		if (!pkey_execute_disable_support)
>> > +			return -EINVAL;
>> > +		new_iamr_bits |= IAMR_EX_BIT;
>> > +	}
>> > +	init_iamr(pkey, new_iamr_bits);
>> > +
>> 
>> Where do we check the reserved keys?
>
> The main gatekeepers against spurious keys are the system calls:
> sys_pkey_mprotect(), sys_pkey_free() and sys_pkey_modify() are the ones
> that will check against reserved and unallocated keys.  Once it has
> passed the check, all other internal functions trust the key values
> provided to them. I can put in additional checks but that will
> unnecessarily chew a few cpu cycles.
>
> Agree?
>
> BTW: you raise a good point though, I may have missed guarding against
> unallocated or reserved keys in sys_pkey_modify(). That was a power
> specific system call that I have introduced to change the permissions on
> a key.

Why do you need a power-specific syscall? We should ideally not require
anything powerpc-specific in the application to use memory keys. If it
is for an execute-only key, the programming model should remain the
same as for the other keys.

NOTE: I am not able to find patch that add sys_pkey_modify()

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
  2017-10-24  6:25   ` Aneesh Kumar K.V
@ 2017-10-24  7:04     ` Ram Pai
  2017-10-24  8:29       ` Michael Ellerman
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-24  7:04 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, bsingharora, hbabu,
	mhocko, bauerman, ebiederm

On Tue, Oct 24, 2017 at 11:55:41AM +0530, Aneesh Kumar K.V wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > Introduce  helper functions that can initialize the bits in the AMR,
> > IAMR and UAMOR register; the bits that correspond to the given pkey.
> >
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> > ---
> >  arch/powerpc/include/asm/pkeys.h |    1 +
> >  arch/powerpc/mm/pkeys.c          |   46 ++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 47 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/pkeys.h b/arch/powerpc/include/asm/pkeys.h
> > index 133f8c4..5a83ed7 100644
> > --- a/arch/powerpc/include/asm/pkeys.h
> > +++ b/arch/powerpc/include/asm/pkeys.h
> > @@ -26,6 +26,7 @@
> >  #define arch_max_pkey()  pkeys_total
> >  #define ARCH_VM_PKEY_FLAGS (VM_PKEY_BIT0 | VM_PKEY_BIT1 | VM_PKEY_BIT2 | \
> >  				VM_PKEY_BIT3 | VM_PKEY_BIT4)
> > +#define AMR_BITS_PER_PKEY 2
> >
> >  #define pkey_alloc_mask(pkey) (0x1 << pkey)
> >
> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > index ebc9e84..178aa33 100644
> > --- a/arch/powerpc/mm/pkeys.c
> > +++ b/arch/powerpc/mm/pkeys.c
> > @@ -59,3 +59,49 @@ void __init pkey_initialize(void)
> >  	for (i = 2; i < (pkeys_total - os_reserved); i++)
> >  		initial_allocation_mask &= ~(0x1<<i);
> >  }
> > +
> > +#define PKEY_REG_BITS (sizeof(u64)*8)
> > +#define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
> > +
> > +static inline void init_amr(int pkey, u8 init_bits)
> > +{
> > +	u64 new_amr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
> > +	u64 old_amr = read_amr() & ~((u64)(0x3ul) << pkeyshift(pkey));
> > +
> > +	write_amr(old_amr | new_amr_bits);
> > +}
> > +
> > +static inline void init_iamr(int pkey, u8 init_bits)
> > +{
> > +	u64 new_iamr_bits = (((u64)init_bits & 0x3UL) << pkeyshift(pkey));
> > +	u64 old_iamr = read_iamr() & ~((u64)(0x3ul) << pkeyshift(pkey));
> > +
> > +	write_iamr(old_iamr | new_iamr_bits);
> > +}
> > +
> > +static void pkey_status_change(int pkey, bool enable)
> > +{
> > +	u64 old_uamor;
> > +
> > +	/* reset the AMR and IAMR bits for this key */
> > +	init_amr(pkey, 0x0);
> > +	init_iamr(pkey, 0x0);
> > +
> > +	/* enable/disable key */
> > +	old_uamor = read_uamor();
> > +	if (enable)
> > +		old_uamor |= (0x3ul << pkeyshift(pkey));
> > +	else
> > +		old_uamor &= ~(0x3ul << pkeyshift(pkey));
> > +	write_uamor(old_uamor);
> > +}
> 
> That one is confusing. We discussed this off-list, but I want to bring
> it to the list for further discussion. So now the core kernel requests
> a key via mm_pkey_alloc(). Why not do the below there:
> 
> static inline int mm_pkey_alloc(struct mm_struct *mm)
> {
> 	/*
> 	 * Note: this is the one and only place we make sure
> 	 * that the pkey is valid as far as the hardware is
> 	 * concerned.  The rest of the kernel trusts that
> 	 * only good, valid pkeys come out of here.
> 	 */
> 	u32 all_pkeys_mask = (u32)(~(0x0));
> 	int ret;
> 
> 	if (!pkey_inited)
> 		return -1;
> 	/*
> 	 * Are we out of pkeys?  We must handle this specially
> 	 * because ffz() behavior is undefined if there are no
> 	 * zeros.
> 	 */
> 	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
> 		return -1;
> 
> 	ret = ffz((u32)mm_pkey_allocation_map(mm));
> 	mm_set_pkey_allocated(mm, ret);
> 
> 	return ret;
> }
> 
> Your mm_pkey_allocation_map() should have the keys specified in AMOR and
> UAMOR marked as allocated; they are in use by the hypervisor and OS
> respectively.
> 
> 
> There is no need for __arch_activate_pkey() etc. By default, if the OS
> has not requested a key for its internal use, UAMOR should be
> 0xFFFFFFFF, and the AMOR value you derive from the device tree based on
> what keys the hypervisor has reserved.


Ok. You are suggesting a programming model where
a) keys that are reserved by hypervisor are enabled by default.
b) keys that are reserved by the OS are enabled by default.
c) keys that are not yet allocated to userspace are enabled by default.

The problem with this model is that userspace can change the
permissions of an unallocated key, and those permissions will go into
effect immediately, because every key is enabled by default. If it
happens to be a key that is reserved by the OS or the hypervisor, bad
things can happen.


The programming model that I have implemented -- **which is the
programming model expected by Linux** -- is:

a)  keys that are reserved by hypervisor are left to the hypervisor to
	enable/disable whenever it needs to.
b)  keys that are reserved by the OS are left to the OS to
	enable/disable whenever it needs to.
c)  keys that are not yet allocated to userspace are not yet enabled.
	Will be enabled when the key is allocated to userspace
	through sys_pkey_alloc() syscall.

In this programming model, userspace is expected to use only keys that
are allocated. And in case it inadvertently uses a key that is not
allocated, nothing bad happens because that key is not activated unless
it is allocated.  And in case it uses a key that is reserved by the
hypervisor or OS, bad things will still not happen because those keys
should not be kept enabled when the process is in userspace. 
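
To make the difference concrete, the two policies boil down to the
UAMOR default. A sketch using the helpers from the quoted patch
(reserved_uamor_mask is a made-up name):

	/* suggested model: every non-reserved key is user-writable
	 * from boot */
	write_uamor(~reserved_uamor_mask);

	/* implemented model: no key is user-writable until it is
	 * allocated ... */
	write_uamor(0x0ul);

	/* ... and sys_pkey_alloc() then enables just that key */
	write_uamor(read_uamor() | (0x3ul << pkeyshift(pkey)));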


These are my thoughts, I will let the jury decide. :)
RP


> 
> -aneesh

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 09/25] powerpc: ability to create execute-disabled pkeys
  2017-10-24  6:58       ` Aneesh Kumar K.V
@ 2017-10-24  7:20         ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-24  7:20 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Balbir Singh, mpe, linuxppc-dev, benh, paulus, khandual, hbabu,
	mhocko, bauerman, ebiederm

On Tue, Oct 24, 2017 at 12:28:35PM +0530, Aneesh Kumar K.V wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > On Wed, Oct 18, 2017 at 02:42:56PM +1100, Balbir Singh wrote:
> >> On Fri,  8 Sep 2017 15:44:57 -0700
> >> Ram Pai <linuxram@us.ibm.com> wrote:
> >> 
> >> > powerpc has hardware support to disable execute on a pkey.
> >> > This patch enables the ability to create execute-disabled
> >> > keys.
> >> > 
> >> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> >> > ---
> >> >  arch/powerpc/include/uapi/asm/mman.h |    6 ++++++
> >> >  arch/powerpc/mm/pkeys.c              |   16 ++++++++++++++++
> >> >  2 files changed, 22 insertions(+), 0 deletions(-)
> >> > 
> >> > diff --git a/arch/powerpc/include/uapi/asm/mman.h b/arch/powerpc/include/uapi/asm/mman.h
> >> > index ab45cc2..f272b09 100644
> >> > --- a/arch/powerpc/include/uapi/asm/mman.h
> >> > +++ b/arch/powerpc/include/uapi/asm/mman.h
> >> > @@ -45,4 +45,10 @@
> >> >  #define MAP_HUGE_1GB	(30 << MAP_HUGE_SHIFT)	/* 1GB   HugeTLB Page */
> >> >  #define MAP_HUGE_16GB	(34 << MAP_HUGE_SHIFT)	/* 16GB  HugeTLB Page */
> >> >  
> >> > +/* override any generic PKEY Permission defines */
> >> > +#define PKEY_DISABLE_EXECUTE   0x4
> >> > +#undef PKEY_ACCESS_MASK
> >> > +#define PKEY_ACCESS_MASK       (PKEY_DISABLE_ACCESS |\
> >> > +				PKEY_DISABLE_WRITE  |\
> >> > +				PKEY_DISABLE_EXECUTE)
> >> >  #endif /* _UAPI_ASM_POWERPC_MMAN_H */
> >> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> >> > index cc5be6a..2282864 100644
> >> > --- a/arch/powerpc/mm/pkeys.c
> >> > +++ b/arch/powerpc/mm/pkeys.c
> >> > @@ -24,6 +24,14 @@ void __init pkey_initialize(void)
> >> >  {
> >> >  	int os_reserved, i;
> >> >  
> >> > +	/*
> >> > +	 * we define PKEY_DISABLE_EXECUTE in addition to the arch-neutral
> >> > +	 * generic defines for PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE.
> >> > +	 * Ensure that the bits are distinct.
> >> > +	 */
> >> > +	BUILD_BUG_ON(PKEY_DISABLE_EXECUTE &
> >> > +		     (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE));
> >> 
> >> Will these values ever change? It's good to have I guess.
> >> 
> >> > +
> >> >  	/* disable the pkey system till everything
> >> >  	 * is in place. A patch further down the
> >> >  	 * line will enable it.
> >> > @@ -120,10 +128,18 @@ int __arch_set_user_pkey_access(struct task_struct *tsk, int pkey,
> >> >  		unsigned long init_val)
> >> >  {
> >> >  	u64 new_amr_bits = 0x0ul;
> >> > +	u64 new_iamr_bits = 0x0ul;
> >> >  
> >> >  	if (!is_pkey_enabled(pkey))
> >> >  		return -EINVAL;
> >> >  
> >> > +	if ((init_val & PKEY_DISABLE_EXECUTE)) {
> >> > +		if (!pkey_execute_disable_support)
> >> > +			return -EINVAL;
> >> > +		new_iamr_bits |= IAMR_EX_BIT;
> >> > +	}
> >> > +	init_iamr(pkey, new_iamr_bits);
> >> > +
> >> 
> >> Where do we check the reserved keys?
> >
> > The main gatekeepers against spurious keys are the system calls:
> > sys_pkey_mprotect(), sys_pkey_free() and sys_pkey_modify() are the ones
> > that will check against reserved and unallocated keys.  Once it has
> > passed the check, all other internal functions trust the key values
> > provided to them. I can put in additional checks but that will
> > unnecessarily chew a few cpu cycles.
> >
> > Agree?
> >
> > BTW: you raise a good point though, I may have missed guarding against
> > unallocated or reserved keys in sys_pkey_modify(). That was a power
> > specific system call that I have introduced to change the permissions on
> > a key.
> 
> Why do you need a power specific syscall? We should ideally not require
> anything powerpc specific in the application to use memory keys. If it
> is for exectue only key, the programming model should remain same as the
> other keys.

The programming model has not changed. It continues to be the same,
i.e. (a minimal userspace sketch follows the list below):

a) allocate a key  through sys_pkey_alloc()
b) associate the key to a addressspace through sys_pkey_mprotect()
c) change the permissions on the key by programming the AMR register as
	and when needed.
d) free the key through sys_pkey_free() when done.
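
A minimal userspace sketch of those four steps (assuming the
glibc-style pkey_alloc()/pkey_mprotect()/pkey_free() wrappers;
pkey_set() stands in for whatever helper reprograms the AMR bits,
and error handling is omitted):

#define _GNU_SOURCE
#include <sys/mman.h>

void pkey_flow(void *addr, size_t len)
{
	int pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);		/* (a) */

	pkey_mprotect(addr, len, PROT_READ | PROT_WRITE, pkey);	/* (b) */
	pkey_set(pkey, 0);		/* (c) drop the write restriction */
	pkey_free(pkey);					/* (d) */
}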


The problem is with programming execute-permission on a key. x86
does not support execute-permission and so does not have this issue.

powerpc supports execute-permission but unfortunately has not exposed
that capability to userspace, because the IAMR cannot be programmed from
userspace. I have filled that gap by providing a power-specific
system call, sys_pkey_modify().  It is a way to enable the exact
same programming model on keys for execute-permission.


> 
> NOTE: I am not able to find patch that add sys_pkey_modify()

Yes, that patch was added only recently to my tree, after consulting
Michael Ellerman. I have yet to send out that patch; I will be doing so
in my next version.

RP

> 
> -aneesh

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 03/25] powerpc: track allocation status of all pkeys
  2017-10-24  6:28   ` Aneesh Kumar K.V
@ 2017-10-24  7:23     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-24  7:23 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: mpe, linuxppc-dev, benh, paulus, khandual, bsingharora, hbabu,
	mhocko, bauerman, ebiederm

On Tue, Oct 24, 2017 at 11:58:29AM +0530, Aneesh Kumar K.V wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
>  +
> > +#define mm_set_pkey_is_allocated(mm, pkey)	\
> > +	(mm_pkey_allocation_map(mm) & pkey_alloc_mask(pkey))
> > +
> 
> >  static inline bool mm_pkey_is_allocated(struct mm_struct *mm, int pkey)
> >  {
> > -	return (pkey == 0);
> > +	/* a reserved key is never considered as 'explicitly allocated' */
> > +	return ((pkey < arch_max_pkey()) &&
> > +		!mm_set_pkey_is_reserved(mm, pkey) &&
> > +		mm_set_pkey_is_allocated(mm, pkey));
> >  }
> >
> 
> that is confusing naming. what is mm_set_pkey_is_allocated()? . 'set' in
> that name is confusing.

Will change it by removing the 'set' from the names.

thanks,
RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 01/25] powerpc: initial pkey plumbing
  2017-10-19 17:11     ` Ram Pai
@ 2017-10-24  8:17       ` Michael Ellerman
  0 siblings, 0 replies; 134+ messages in thread
From: Michael Ellerman @ 2017-10-24  8:17 UTC (permalink / raw)
  To: Ram Pai
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm

Ram Pai <linuxram@us.ibm.com> writes:

> On Thu, Oct 19, 2017 at 03:20:36PM +1100, Michael Ellerman wrote:
>> Ram Pai <linuxram@us.ibm.com> writes:
>> 
>> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> > index 9fc3c0b..a4cd210 100644
>> > --- a/arch/powerpc/Kconfig
>> > +++ b/arch/powerpc/Kconfig
>> > @@ -864,6 +864,22 @@ config SECCOMP
>> >  
>> >  	  If unsure, say Y. Only embedded should say N here.
>> >  
>> > +config PPC64_MEMORY_PROTECTION_KEYS
>> 
>> That's pretty wordy, can we make it CONFIG_PPC_MEM_KEYS ?
>> 
>> I think you're a sufficient vim wizard to search and replace all
>> usages at once,
>
> I take that as a compliment for now ;)
>
>> if not I can do it before I apply the series.
>
> Will change it...just that I was trying to keep it similar to what intel
> has X86_INTEL_MEMORY_PROTECTION_KEYS

Yeah I realise that's why you chose that name, but I'd still prefer
something shorter. I don't think it matters that the CONFIG names are
slightly different between the arches, it's still pretty clear what it's
for I think.

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers
  2017-10-24  7:04     ` Ram Pai
@ 2017-10-24  8:29       ` Michael Ellerman
  0 siblings, 0 replies; 134+ messages in thread
From: Michael Ellerman @ 2017-10-24  8:29 UTC (permalink / raw)
  To: Ram Pai, Aneesh Kumar K.V
  Cc: linuxppc-dev, benh, paulus, khandual, bsingharora, hbabu, mhocko,
	bauerman, ebiederm

Ram Pai <linuxram@us.ibm.com> writes:
> On Tue, Oct 24, 2017 at 11:55:41AM +0530, Aneesh Kumar K.V wrote:
>> Ram Pai <linuxram@us.ibm.com> writes:
>> > +
>> > +static void pkey_status_change(int pkey, bool enable)
>> > +{
>> > +	u64 old_uamor;
>> > +
>> > +	/* reset the AMR and IAMR bits for this key */
>> > +	init_amr(pkey, 0x0);
>> > +	init_iamr(pkey, 0x0);
>> > +
>> > +	/* enable/disable key */
>> > +	old_uamor = read_uamor();
>> > +	if (enable)
>> > +		old_uamor |= (0x3ul << pkeyshift(pkey));
>> > +	else
>> > +		old_uamor &= ~(0x3ul << pkeyshift(pkey));
>> > +	write_uamor(old_uamor);
>> > +}
>> 
>> That one is confusing. We discussed this off-list, but I want to bring
>> it to the list for further discussion. So now the core kernel requests
>> a key via mm_pkey_alloc(). Why not do the below there:
>> 
>> static inline int mm_pkey_alloc(struct mm_struct *mm)
>> {
>> 	/*
>> 	 * Note: this is the one and only place we make sure
>> 	 * that the pkey is valid as far as the hardware is
>> 	 * concerned.  The rest of the kernel trusts that
>> 	 * only good, valid pkeys come out of here.
>> 	 */
>> 	u32 all_pkeys_mask = (u32)(~(0x0));
>> 	int ret;
>> 
>> 	if (!pkey_inited)
>> 		return -1;
>> 	/*
>> 	 * Are we out of pkeys?  We must handle this specially
>> 	 * because ffz() behavior is undefined if there are no
>> 	 * zeros.
>> 	 */
>> 	if (mm_pkey_allocation_map(mm) == all_pkeys_mask)
>> 		return -1;
>> 
>> 	ret = ffz((u32)mm_pkey_allocation_map(mm));
>> 	mm_set_pkey_allocated(mm, ret);
>> 
>> 	return ret;
>> }
>> 
>> Your mm_pkey_allocation_map() should have the keys specified in AMOR and
>> UAMOR marked as allocated; they are in use by the hypervisor and OS
>> respectively.
>> 
>> 
>> There is no need for __arch_activate_pkey() etc. By default, if the OS
>> has not requested a key for its internal use, UAMOR should be
>> 0xFFFFFFFF, and the AMOR value you derive from the device tree based on
>> what keys the hypervisor has reserved.
>
>
> Ok. You are suggesting a programming model where
> a) keys that are reserved by hypervisor are enabled by default.

No, he's not saying that.

> b) keys that are reserved by the OS are enabled by default.

Or that.

> c) keys that are not yet allocated to userspace are enabled by default.

But he is saying this.

> The problem with this model is that userspace can change the
> permissions of an unallocated key, and those permissions will go into
> effect immediately,

Correct, but an unallocated key should not be used anywhere, so it
should have no effect, unless there's a bug.

> because every key is enabled by default. If it

Not every key, every key that is not reserved by the hypervisor or OS.

> happens to be a key that is reserved by the OS or the hypervisor, bad
> things can happen.

So no, nothing bad should be able to happen.

> The programming model that I have implemented -- **which is the
> programming model expected by Linux** -- is:

>
> a)  keys that are reserved by hypervisor are left to the hypervisor to
> 	enable/disable whenever it needs to.

We agree on that.

> b)  keys that are reserved by the OS are left to the OS to
> 	enable/disable whenever it needs to.

And that.

> c)  keys that are not yet allocated to userspace are not yet enabled.
> 	Will be enabled when the key is allocated to userspace
> 	through sys_pkey_alloc() syscall.

This is the only part that is under discussion.

> In this programming model, userspace is expected to use only keys that
> are allocated. And in case it inadvertently uses a key that is not
> allocated, nothing bad happens because that key is not activated unless
> it is allocated.

This seems like a good design to me.

The only downside I see is we have to switch an extra SPR, but that's
not much in the scheme of things.

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 23/25] powerpc: Deliver SEGV signal on pkey violation
  2017-09-08 22:45 ` [PATCH 23/25] powerpc: Deliver SEGV signal on pkey violation Ram Pai
@ 2017-10-24 15:46   ` Michael Ellerman
  2017-10-24 17:19     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Michael Ellerman @ 2017-10-24 15:46 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> index ec74e20..f2a310d 100644
> --- a/arch/powerpc/kernel/traps.c
> +++ b/arch/powerpc/kernel/traps.c
> @@ -265,6 +266,15 @@ void user_single_step_siginfo(struct task_struct *tsk,
>  	info->si_addr = (void __user *)regs->nip;
>  }
>  
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +static void fill_sig_info_pkey(int si_code, siginfo_t *info, unsigned long addr)
> +{
> +	if (info->si_signo != SIGSEGV || si_code != SEGV_PKUERR)

Just checking si_code is sufficient there I think.

> +		return;
> +	info->si_pkey = get_paca()->paca_pkey;
> +}
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */

This should define an empty version in the #else case, so we don't need
the ifdef below.
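
i.e. something like this sketch, combining both suggestions:

#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
static void fill_sig_info_pkey(int si_code, siginfo_t *info, unsigned long addr)
{
	if (si_code != SEGV_PKUERR)
		return;
	info->si_pkey = get_paca()->paca_pkey;
}
#else
static void fill_sig_info_pkey(int si_code, siginfo_t *info, unsigned long addr)
{
}
#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */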

> @@ -292,6 +302,18 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
>  	info.si_signo = signr;
>  	info.si_code = code;
>  	info.si_addr = (void __user *) addr;
> +
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	/*
> +	 * update the thread's pkey related fields.
> +	 * core-dump handlers and other sub-systems
> +	 * depend on those values.
> +	 */
> +	thread_pkey_regs_save(&current->thread);

You shouldn't need to do this.

We're not putting any of the pkey regs in the signal frame, so you don't
need to save before we do that. [And if you did, the right place to do it
would be in setup_sigcontext() (or the TM version).]

For ptrace and coredumps it should happen in pkey_get(); see e.g.
fpr_get(), which does flush_fp_to_thread() as an example.
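
A sketch of the shape that suggests, modeled on fpr_get() (the regset
signature is from that era's API; the thread-struct field name is an
assumption):

static int pkey_get(struct task_struct *target,
		    const struct user_regset *regset,
		    unsigned int pos, unsigned int count,
		    void *kbuf, void __user *ubuf)
{
	/* flush the live AMR/IAMR/UAMOR values into the thread struct */
	thread_pkey_regs_save(&target->thread);

	return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
				   &target->thread.amr, 0, -1);
}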

> +	/* update the violated-key value */
> +	fill_sig_info_pkey(code, &info, addr);
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */

> +
>  	force_sig_info(signr, &info, current);
>  }

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 22/25] powerpc: capture the violated protection key on fault
  2017-09-08 22:45 ` [PATCH 22/25] powerpc: capture the violated protection key on fault Ram Pai
@ 2017-10-24 15:46   ` Michael Ellerman
  0 siblings, 0 replies; 134+ messages in thread
From: Michael Ellerman @ 2017-10-24 15:46 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
> index 04b60af..51c89c1 100644
> --- a/arch/powerpc/include/asm/paca.h
> +++ b/arch/powerpc/include/asm/paca.h
> @@ -97,6 +97,9 @@ struct paca_struct {
>  	struct dtl_entry *dispatch_log_end;
>  #endif /* CONFIG_PPC_STD_MMU_64 */
>  	u64 dscr_default;		/* per-CPU default DSCR */
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	u16 paca_pkey;                  /* exception causing pkey */
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */

I can't see any reason why this should be in the paca.

> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index a16bc43..ad31f6e 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -153,6 +153,7 @@ static int bad_page_fault_exception(struct pt_regs *regs, unsigned long address,
>  
>  #ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
>  	if (si_code & DSISR_KEYFAULT) {
> +		get_paca()->paca_pkey = get_pte_pkey(current->mm, address);

You seem to be using the paca as a temporary stash so that you don't
have to pass it to _exception().

But that's not what the paca is for; the paca is for per-cpu data, not
per-thread data, and (preferably) only for things that need to be
accessed in low-level code where proper per_cpu() variables don't work.

Updating _exception() to take the key would be a mess, because there are
so many callers who don't care about the key. For now we can probably
just do something ~=:

void _exception_pkey(int signr, struct pt_regs *regs, int code, unsigned long addr, int key)
{
	< current body of _exception >

	+ pkey bits
}

void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
{
	_exception_pkey(..., 0);
}


cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation
  2017-09-08 22:45 ` [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation Ram Pai
  2017-10-18 23:27   ` Balbir Singh
@ 2017-10-24 15:47   ` Michael Ellerman
  2017-10-24 18:26     ` Ram Pai
  2017-10-29 14:03     ` Aneesh Kumar K.V
  1 sibling, 2 replies; 134+ messages in thread
From: Michael Ellerman @ 2017-10-24 15:47 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> Handle Data and  Instruction exceptions caused by memory
> protection-key.
>
> The CPU will detect the key fault if the HPTE is already
> programmed with the key.
>
> However if the HPTE is not  hashed, a key fault will not
> be detected by the  hardware. The   software will detect
> pkey violation in such a case.

That seems like the wrong trade-off to me.

It means every fault has to go through arch_vma_access_permitted(),
which is at least a function call in the best case, even when pkeys are
not in use, and/or the range in question is not protected by a key.

Why not instead just service the fault and let the hardware catch it?

If that access to a protected page is a bug then the process will
probably just exit, in which case who cares about the overhead of the
extra fault.

If the access is not currently allowed, but the process wants to take
the signal and do something to allow it, then is the overhead of the
extra fault going to matter vs the signal etc?

I think we should just let the hardware take a 2nd fault, unless someone
can demonstrate a workload where the cost of that is prohibitive.

Or does that not work for some reason?

> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 4797d08..a16bc43 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -391,11 +408,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  		return 0;
>  
>  	if (unlikely(page_fault_is_bad(error_code))) {
> -		if (is_user) {
> -			_exception(SIGBUS, regs, BUS_OBJERR, address);
> -			return 0;
> -		}
> -		return SIGBUS;
> +		if (!is_user)
> +			return SIGBUS;
> +		return bad_page_fault_exception(regs, address, error_code);

I don't see why we want to handle the key fault here.

I know it's caught here at the moment, because it's in
DSISR_BAD_FAULT_64S, but I think now that we support keys we should
remove DSISR_KEYFAULT from DSISR_BAD_FAULT_64S.

Then we should let it fall through to further down, and handle it in
access_error().

That way eg. the page fault accounting will work as normal etc.

> @@ -492,6 +507,18 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
>  	if (unlikely(access_error(is_write, is_exec, vma)))
>  		return bad_area(regs, address);
>  
> +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> +	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
> +			is_exec, 0))
> +		return __bad_area(regs, address, SEGV_PKUERR);
> +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> +
> +
> +	/* handle_mm_fault() needs to know if its a instruction access
> +	 * fault.
> +	 */
> +	if (is_exec)
> +		flags |= FAULT_FLAG_INSTRUCTION;

What is this doing here? We already do that further up.

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 19/25] powerpc: implementation for arch_vma_access_permitted()
  2017-09-08 22:45 ` [PATCH 19/25] powerpc: implementation for arch_vma_access_permitted() Ram Pai
  2017-10-18 23:20   ` Balbir Singh
@ 2017-10-24 15:48   ` Michael Ellerman
  1 sibling, 0 replies; 134+ messages in thread
From: Michael Ellerman @ 2017-10-24 15:48 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> index 24589d9..21c3b42 100644
> --- a/arch/powerpc/mm/pkeys.c
> +++ b/arch/powerpc/mm/pkeys.c
> @@ -320,3 +320,46 @@ bool arch_pte_access_permitted(u64 pte, bool write, bool execute)
>  	return pkey_access_permitted(pte_to_pkey_bits(pte),
>  			write, execute);
>  }
> +
> +/*
> + * We only want to enforce protection keys on the current process
> + * because we effectively have no access to AMR/IAMR for other
> + * processes or any way to tell *which * AMR/IAMR in a threaded
> + * process we could use.

This comment doesn't match the code, or at least is confusing to me.

A "threaded process" is two tasks that have the same mm. ie. where
current->mm == vma->mm.

And in that case we *do* enforce protection, based on the AMR/IAMR of
the current thread.

> + * So do not enforce things if the VMA is not from the current
> + * mm, or if we are in a kernel thread.
> + */
> +static inline bool vma_is_foreign(struct vm_area_struct *vma)
> +{
> +	if (!current->mm)
> +		return true;
> +	/*
> +	 * if the VMA is from another process, then AMR/IAMR has no
> +	 * relevance and should not be enforced.
> +	 */
> +	if (current->mm != vma->vm_mm)
> +		return true;

ie. threaded processes end up here, because they *do* share an mm.

> +
> +	return false;
> +}
> +
> +bool arch_vma_access_permitted(struct vm_area_struct *vma,
> +		bool write, bool execute, bool foreign)
> +{
> +	int pkey;
> +
> +	if (!pkey_inited)
> +		return true;
> +
> +	/* allow access if the VMA is not one from this process */
> +	if (foreign || vma_is_foreign(vma))
> +		return true;
> +
> +	pkey = vma_pkey(vma);
> +
> +	if (!pkey)
> +		return true;

I think I'd prefer if we didn't special case key 0, instead just let it
go through to pkey_access_permitted().

> +
> +	return pkey_access_permitted(pkey, write, execute);
> +}

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 08/25] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
  2017-09-08 22:44 ` [PATCH 08/25] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls Ram Pai
@ 2017-10-24 15:48   ` Michael Ellerman
  2017-10-24 18:34     ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Michael Ellerman @ 2017-10-24 15:48 UTC (permalink / raw)
  To: Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, aneesh.kumar, bsingharora, hbabu, mhocko,
	bauerman, ebiederm, linuxram

Ram Pai <linuxram@us.ibm.com> writes:

> Finally this patch provides the ability for a process to
> allocate and free a protection key.

This must be the last patch in the series.

We don't want to expose a half working interface to userspace.

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 23/25] powerpc: Deliver SEGV signal on pkey violation
  2017-10-24 15:46   ` Michael Ellerman
@ 2017-10-24 17:19     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-24 17:19 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm

On Tue, Oct 24, 2017 at 05:46:38PM +0200, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index ec74e20..f2a310d 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -265,6 +266,15 @@ void user_single_step_siginfo(struct task_struct *tsk,
> >  	info->si_addr = (void __user *)regs->nip;
> >  }
> >  
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +static void fill_sig_info_pkey(int si_code, siginfo_t *info, unsigned long addr)
> > +{
> > +	if (info->si_signo != SIGSEGV || si_code != SEGV_PKUERR)
> 
> Just checking si_code is sufficient there I think.
> 
> > +		return;
> > +	info->si_pkey = get_paca()->paca_pkey;
> > +}
> > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> 
> This should define an empty version in the #else case, so we don't need
> the ifdef below.
> 
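
Presumably that means an empty stub along these lines (an untested
sketch, mirroring the signature above):

	#else /* !CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
	static inline void fill_sig_info_pkey(int si_code, siginfo_t *info,
					      unsigned long addr)
	{
	}
	#endif

so that the call site below can lose its #ifdef.
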
> > @@ -292,6 +302,18 @@ void _exception(int signr, struct pt_regs *regs, int code, unsigned long addr)
> >  	info.si_signo = signr;
> >  	info.si_code = code;
> >  	info.si_addr = (void __user *) addr;
> > +
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	/*
> > +	 * update the thread's pkey related fields.
> > +	 * core-dump handlers and other sub-systems
> > +	 * depend on those values.
> > +	 */
> > +	thread_pkey_regs_save(&current->thread);
> 
> You shouldn't need to do this.
> 
> We're not putting any of the pkey regs in the signal frame, so you don't
> need to save before we do that. [And if you did the right place to do it
> would be in setup_sigcontext() (or the TM version).]

One of the comments in the past asked for the ability to capture the
AMR register state in core dumps, and also the ability to modify the
AMR through ptrace. Thiago wrote a patch towards that; it is one of
the subsequent patches sent in this series.

I will move the code to pkey_get().

> 
> For ptrace and coredumps it should happen in pkey_get(), see eg.
> fpr_get() which does flush_fp_to_thread() as an example.

Ok.

> 
> > +	/* update the violated-key value */
> > +	fill_sig_info_pkey(code, &info, addr);
> > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> 
> > +
> >  	force_sig_info(signr, &info, current);
> >  }
> 
> cheers

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation
  2017-10-24 15:47   ` Michael Ellerman
@ 2017-10-24 18:26     ` Ram Pai
  2017-10-29 14:03     ` Aneesh Kumar K.V
  1 sibling, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-24 18:26 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm

On Tue, Oct 24, 2017 at 05:47:38PM +0200, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > Handle Data and  Instruction exceptions caused by memory
> > protection-key.
> >
> > The CPU will detect the key fault if the HPTE is already
> > programmed with the key.
> >
> > However if the HPTE is not  hashed, a key fault will not
> > be detected by the  hardware. The   software will detect
> > pkey violation in such a case.
> 
> That seems like the wrong trade off to me.
> 
> It means every fault has to go through arch_vma_access_permitted(),
> which is at least a function call in the best case, even when pkeys are
> not in use, and/or the range in question is not protected by a key.
> 
> Why not instead just service the fault and let the hardware catch it?
> 
> If that access to a protected page is a bug then the process will
> probably just exit, in which case who cares about the overhead of the
> extra fault.
> 
> If the access is not currently allowed, but the process wants to take
> the signal and do something to allow it, then is the overhead of the
> extra fault going to matter vs the signal etc?
> 
> I think we should just let the hardware take a 2nd fault, unless someone
> can demonstrate a workload where the cost of that is prohibitive.
> 
> Or does that not work for some reason?

It does not work, because the arch-neutral code errors out the fault
without letting the powerpc-specific code page in the faulting page.

So neither does the page get faulted in, nor does the key fault get
signalled.

> 
> > diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> > index 4797d08..a16bc43 100644
> > --- a/arch/powerpc/mm/fault.c
> > +++ b/arch/powerpc/mm/fault.c
> > @@ -391,11 +408,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
> >  		return 0;
> >  
> >  	if (unlikely(page_fault_is_bad(error_code))) {
> > -		if (is_user) {
> > -			_exception(SIGBUS, regs, BUS_OBJERR, address);
> > -			return 0;
> > -		}
> > -		return SIGBUS;
> > +		if (!is_user)
> > +			return SIGBUS;
> > +		return bad_page_fault_exception(regs, address, error_code);
> 
> I don't see why we want to handle the key fault here.
> 
> I know it's caught here at the moment, because it's in
> DSISR_BAD_FAULT_64S, but I think now that we support keys we should
> remove DSISR_KEYFAULT from DSISR_BAD_FAULT_64S.
> 
> Then we should let it fall through to further down, and handle it in
> access_error().
> 
> That way eg. the page fault accounting will work as normal etc.

Bit tricky to do that. Will try. For one, if I take DSISR_KEYFAULT out
of DSISR_BAD_FAULT_64S, then the assembly code in do_hash_page() will
not call __do_page_fault(). We want it called, but need to somehow
special-case DSISR_KEYFAULT so that it reaches access_error().

> 
> > @@ -492,6 +507,18 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
> >  	if (unlikely(access_error(is_write, is_exec, vma)))
> >  		return bad_area(regs, address);
> >  
> > +#ifdef CONFIG_PPC64_MEMORY_PROTECTION_KEYS
> > +	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
> > +			is_exec, 0))
> > +		return __bad_area(regs, address, SEGV_PKUERR);
> > +#endif /* CONFIG_PPC64_MEMORY_PROTECTION_KEYS */
> > +
> > +
> > +	/* handle_mm_fault() needs to know if it's an instruction access
> > +	 * fault.
> > +	 */
> > +	if (is_exec)
> > +		flags |= FAULT_FLAG_INSTRUCTION;
> 
> What is this doing here? We already do that further up.

The more I think about this, the more I find that the code can be
optimized to remove the redundancy. I am unnecessarily calling
arch_vma_access_permitted() above, when all the hard work gets done by
handle_mm_fault() anyway; handle_mm_fault() itself calls
arch_vma_access_permitted().

I could instead use the return value of handle_mm_fault() to signal
a key error to userspace.
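
Roughly, as a hypothetical sketch (assuming the core mm reports a pkey
denial as VM_FAULT_SIGSEGV):

	fault = handle_mm_fault(vma, address, flags);
	/*
	 * handle_mm_fault() already calls arch_vma_access_permitted(),
	 * so a key denial shows up in its return value; map that onto
	 * SEGV_PKUERR here instead of checking the key up front.
	 */
	if (unlikely(fault & VM_FAULT_SIGSEGV))
		return __bad_area(regs, address, SEGV_PKUERR);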

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 08/25] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
  2017-10-24 15:48   ` Michael Ellerman
@ 2017-10-24 18:34     ` Ram Pai
  2017-10-25  9:26       ` Michael Ellerman
  0 siblings, 1 reply; 134+ messages in thread
From: Ram Pai @ 2017-10-24 18:34 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm

On Tue, Oct 24, 2017 at 05:48:15PM +0200, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > Finally this patch provides the ability for a process to
> > allocate and free a protection key.
> 
> This must be the last patch in the series.
> 
> We don't want to expose a half working interface to userspace.

The way the patch series is organized -- even though this patch
introduces the syscall, the syscall will fail because the pkey subsystem
is enabled only by the last patch. Until then the code exists, but does
a great job of failing with an appropriate return code.

Hope this helps,
RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-10-23 16:29       ` Ram Pai
@ 2017-10-25  9:18         ` Michael Ellerman
  2017-10-26  6:08           ` Ram Pai
  0 siblings, 1 reply; 134+ messages in thread
From: Michael Ellerman @ 2017-10-25  9:18 UTC (permalink / raw)
  To: Ram Pai, Aneesh Kumar K.V
  Cc: linuxppc-dev, benh, paulus, khandual, bsingharora, hbabu, mhocko,
	bauerman, ebiederm

Ram Pai <linuxram@us.ibm.com> writes:
> On Mon, Oct 23, 2017 at 02:17:39PM +0530, Aneesh Kumar K.V wrote:
>> Michael Ellerman <mpe@ellerman.id.au> writes:
>> > Ram Pai <linuxram@us.ibm.com> writes:
...
>> > Should be:
>> > 	/*
>> > 	 * Initialize all hidx entries to invalid value, the first time
>> >          * the PTE is about to allocate a 4K HPTE.
>> > 	 */
>> >
>> >> +	if (!(old_pte & H_PAGE_COMBO))
>> >> +		rpte.hidx = ~0x0UL;
>> >> +
>> >
>> > Paul had the idea that if we biased the slot number by 1, we could make
>> > the "invalid" value be == 0.
>> >
>> > That would avoid needing to that above, and also mean the value is
>> > correctly invalid from the get-go, which would be good IMO.
>> >
>> > I think now that you've added the slot accessors it would be pretty easy
>> > to do.
>> 
>> That would imply we lose one slot in the primary group, which means we
>> will do extra work in some cases because our primary now has only 7
>> slots. And in the case of pseries, the hypervisor will always return the
>> least available slot, which implies we will do extra hcalls in case of an
>> hpte insert to an empty group?
>
> No, that is not the idea.  The idea is that slot 'F' in the secondary
> will continue to be an invalid slot, but slots will be represented
> offset-by-one in the PTE.  In other words, 0 will be represented as 1,
> 1 as 2...  and n as (n+1)%32.

Right.

> The idea seems feasible.  It has the advantage that 0 in the PTE
> means an invalid slot.  But it can be confusing to the casual code
> reader.  Will need to put in a big comment to explain that.

This code is already confusing to *any* reader, so I don't think it's a
worry. :)
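
For concreteness, the offset-by-one scheme amounts to something like
this (hypothetical helpers, untested):

	/* bias the slot by one, so that hidx == 0 always means "invalid" */
	static inline unsigned long hidx_encode(unsigned long slot)
	{
		return (slot + 1) & 0x1f;	/* n stored as (n + 1) % 32 */
	}

	static inline bool hidx_is_valid(unsigned long hidx)
	{
		return hidx != 0;
	}

	static inline unsigned long hidx_decode(unsigned long hidx)
	{
		return (hidx - 1) & 0x1f;
	}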

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 08/25] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
  2017-10-24 18:34     ` Ram Pai
@ 2017-10-25  9:26       ` Michael Ellerman
  0 siblings, 0 replies; 134+ messages in thread
From: Michael Ellerman @ 2017-10-25  9:26 UTC (permalink / raw)
  To: Ram Pai
  Cc: linuxppc-dev, benh, paulus, khandual, aneesh.kumar, bsingharora,
	hbabu, mhocko, bauerman, ebiederm

Ram Pai <linuxram@us.ibm.com> writes:

> On Tue, Oct 24, 2017 at 05:48:15PM +0200, Michael Ellerman wrote:
>> Ram Pai <linuxram@us.ibm.com> writes:
>> 
>> > Finally this patch provides the ability for a process to
>> > allocate and free a protection key.
>> 
>> This must be the last patch in the series.
>> 
>> We don't want to expose a half working interface to userspace.
>
> The way the patch series is organized -- even though this patch
> introduces the syscall, the syscall will fail because the pkey subsystem
> is enabled only by the last patch. Until then the code exists, but does
> a great job of failing with an appropriate return code.

See my previous mail :)

Please don't add the syscall until it can work.

cheers

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
  2017-10-25  9:18         ` Michael Ellerman
@ 2017-10-26  6:08           ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-26  6:08 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Aneesh Kumar K.V, linuxppc-dev, benh, paulus, khandual,
	bsingharora, hbabu, mhocko, bauerman, ebiederm

On Wed, Oct 25, 2017 at 11:18:49AM +0200, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> > On Mon, Oct 23, 2017 at 02:17:39PM +0530, Aneesh Kumar K.V wrote:
> >> Michael Ellerman <mpe@ellerman.id.au> writes:
> >> > Ram Pai <linuxram@us.ibm.com> writes:
> ...
> >> > Should be:
> >> > 	/*
> >> > 	 * Initialize all hidx entries to invalid value, the first time
> >> >          * the PTE is about to allocate a 4K HPTE.
> >> > 	 */
> >> >
> >> >> +	if (!(old_pte & H_PAGE_COMBO))
> >> >> +		rpte.hidx = ~0x0UL;
> >> >> +
> >> >
> >> > Paul had the idea that if we biased the slot number by 1, we could make
> >> > the "invalid" value be == 0.
> >> >
> >> > That would avoid needing to that above, and also mean the value is
> >> > correctly invalid from the get-go, which would be good IMO.
> >> >
> >> > I think now that you've added the slot accessors it would be pretty easy
> >> > to do.
> >> 
> >> That would imply we lose one slot in the primary group, which means we
> >> will do extra work in some cases because our primary now has only 7
> >> slots. And in the case of pseries, the hypervisor will always return the
> >> least available slot, which implies we will do extra hcalls in case of an
> >> hpte insert to an empty group?
> >
> > No, that is not the idea.  The idea is that slot 'F' in the secondary
> > will continue to be an invalid slot, but slots will be represented
> > offset-by-one in the PTE.  In other words, 0 will be represented as 1,
> > 1 as 2...  and n as (n+1)%32.
> 
> Right.
> 
> > The idea seems feasible.  It has the advantage that 0 in the PTE
> > means an invalid slot.  But it can be confusing to the casual code
> > reader.  Will need to put in a big comment to explain that.
> 
> This code is already confusing to *any* reader, so I don't think it's a
> worry. :)

I just got it coded and working.  But I see no advantage to implementing
the shifted value.  The hidx in the secondary part of the pte still
needs to be initialized to all-zeros, because it could contain the hidx
value corresponding to the 64K-backed hpte, which needs to be cleared.
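
That is, the initialization just flips polarity rather than going away;
a sketch paralleling the hunk quoted above:

	/* with the biased encoding the invalid pattern is all-zeros
	 * instead of all-ones, but the wipe is still needed to clear
	 * a stale hidx left over from a 64K-backed hpte
	 */
	if (!(old_pte & H_PAGE_COMBO))
		rpte.hidx = 0x0UL;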

I will send the patch anyway, but we should not apply it,
for I see no apparent gain.

RP

> 
> cheers

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 09/25] powerpc: ability to create execute-disabled pkeys
  2017-10-24  4:36   ` Aneesh Kumar K.V
@ 2017-10-28 23:18     ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-28 23:18 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: mpe, linuxppc-dev, mhocko, paulus, ebiederm, bauerman, khandual

On Tue, Oct 24, 2017 at 10:06:18AM +0530, Aneesh Kumar K.V wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > powerpc has hardware support to disable execute on a pkey.
> > This patch enables the ability to create execute-disabled
> > keys.
> 
> Can you summarize here how this works?  Access to IAMR is
> privileged so how will keys framework work with IAMR? 
> 
> -aneesh

Right. The IAMR will have to be programmed through a system call.
I have introduced a sys_pkey_modify() which takes a key value
and the permission that it wants to enable/disable on that key.
This syscall is powerpc-specific for now, since no other
arch needs it.
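
The interface would be shaped roughly like this (hypothetical signature;
see the patch linked below for the real one):

	/* enable/disable the given access rights on 'pkey';
	 * the names here are illustrative only
	 */
	long sys_pkey_modify(int pkey, unsigned long rights);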

The patch is at http://patchwork.ozlabs.org/patch/817961/

-- 
Ram Pai

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation
  2017-10-24 15:47   ` Michael Ellerman
  2017-10-24 18:26     ` Ram Pai
@ 2017-10-29 14:03     ` Aneesh Kumar K.V
  2017-10-30  0:37       ` Ram Pai
  1 sibling, 1 reply; 134+ messages in thread
From: Aneesh Kumar K.V @ 2017-10-29 14:03 UTC (permalink / raw)
  To: Michael Ellerman, Ram Pai, linuxppc-dev
  Cc: benh, paulus, khandual, bsingharora, hbabu, mhocko, bauerman,
	ebiederm, linuxram

Michael Ellerman <mpe@ellerman.id.au> writes:

> Ram Pai <linuxram@us.ibm.com> writes:
>
>> Handle Data and  Instruction exceptions caused by memory
>> protection-key.
>>
>> The CPU will detect the key fault if the HPTE is already
>> programmed with the key.
>>
>> However if the HPTE is not  hashed, a key fault will not
>> be detected by the  hardware. The   software will detect
>> pkey violation in such a case.
>
> That seems like the wrong trade off to me.
>
> It means every fault has to go through arch_vma_access_permitted(),
> which is at least a function call in the best case, even when pkeys are
> not in use, and/or the range in question is not protected by a key.

We don't really need to call arch_vma_access_permitted() in
arch/powerpc/ do_page_fault(). The core kernel does that in
handle_mm_fault(). So if the first fault is a bad access,
handle_mm_fault() handles it. If it is a valid access, we insert the
right hash page table entry; if we then do a wrong access, we detect
the key fault in the low-level hash fault handler. IIUC, the call to
arch_vma_access_permitted() from arch/powerpc/ can go away?

-aneesh

^ permalink raw reply	[flat|nested] 134+ messages in thread

* Re: [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation
  2017-10-29 14:03     ` Aneesh Kumar K.V
@ 2017-10-30  0:37       ` Ram Pai
  0 siblings, 0 replies; 134+ messages in thread
From: Ram Pai @ 2017-10-30  0:37 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Michael Ellerman, linuxppc-dev, mhocko, paulus, ebiederm,
	bauerman, khandual

On Sun, Oct 29, 2017 at 07:33:25PM +0530, Aneesh Kumar K.V wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> 
> > Ram Pai <linuxram@us.ibm.com> writes:
> >
> >> Handle Data and  Instruction exceptions caused by memory
> >> protection-key.
> >>
> >> The CPU will detect the key fault if the HPTE is already
> >> programmed with the key.
> >>
> >> However if the HPTE is not  hashed, a key fault will not
> >> be detected by the  hardware. The   software will detect
> >> pkey violation in such a case.
> >
> > That seems like the wrong trade off to me.
> >
> > It means every fault has to go through arch_vma_access_permitted(),
> > which is at least a function call in the best case, even when pkeys are
> > not in use, and/or the range in question is not protected by a key.
> 
> We don't really need to call arch_vma_access_permitted() in
> arch/powerpc/ do_page_fault(). The core kernel does that in
> handle_mm_fault(). So if the first fault is a bad access,
> handle_mm_fault() handles it. If it is a valid access, we insert the
> right hash page table entry; if we then do a wrong access, we detect
> the key fault in the low-level hash fault handler. IIUC, the call to
> arch_vma_access_permitted() from arch/powerpc/ can go away?


Yes. Since handle_mm_fault() checks for key violations, we can leverage
that in __do_page_fault() instead of calling arch_vma_access_permitted().

Have fixed it in the next version of the patches, about to hit the
mailing list this week.

RP

^ permalink raw reply	[flat|nested] 134+ messages in thread

end of thread, other threads:[~2017-10-30  0:37 UTC | newest]

Thread overview: 134+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-08 22:44 [PATCH 0/7] powerpc: Free up RPAGE_RSV bits Ram Pai
2017-09-08 22:44 ` [PATCH 1/7] powerpc: introduce pte_set_hash_slot() helper Ram Pai
2017-09-13  7:55   ` Balbir Singh
2017-10-19  4:52   ` Michael Ellerman
2017-09-08 22:44 ` [PATCH 2/7] powerpc: introduce pte_get_hash_gslot() helper Ram Pai
2017-09-13  9:32   ` Balbir Singh
2017-09-13 20:10     ` Ram Pai
2017-09-08 22:44 ` [PATCH 3/7] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages Ram Pai
2017-09-14  1:18   ` Balbir Singh
2017-10-19  3:25   ` Michael Ellerman
2017-10-19 17:02     ` Ram Pai
2017-10-23  8:47     ` Aneesh Kumar K.V
2017-10-23 16:29       ` Ram Pai
2017-10-25  9:18         ` Michael Ellerman
2017-10-26  6:08           ` Ram Pai
2017-09-08 22:44 ` [PATCH 4/7] powerpc: Free up four 64K PTE bits in 64K " Ram Pai
2017-09-14  1:44   ` Balbir Singh
2017-09-14 17:54     ` Ram Pai
2017-09-14 18:25       ` Ram Pai
2017-09-14  8:13   ` Benjamin Herrenschmidt
2017-10-23  8:52     ` Aneesh Kumar K.V
2017-10-23 23:42       ` Ram Pai
2017-10-23 19:22     ` Ram Pai
2017-10-24  3:37       ` Aneesh Kumar K.V
2017-09-08 22:44 ` [PATCH 5/7] powerpc: Swizzle around 4K PTE bits to free up bit 5 and bit 6 Ram Pai
2017-09-14  1:48   ` Balbir Singh
2017-09-14 17:23     ` Ram Pai
2017-09-08 22:44 ` [PATCH 6/7] powerpc: use helper functions to get and set hash slots Ram Pai
2017-09-08 22:44 ` [PATCH 7/7] powerpc: capture the PTE format changes in the dump pte report Ram Pai
2017-09-14  3:22   ` Balbir Singh
2017-09-14 17:19     ` Ram Pai
2017-09-08 22:44 ` [PATCH 00/25] powerpc: Memory Protection Keys Ram Pai
2017-09-08 22:44 ` [PATCH 01/25] powerpc: initial pkey plumbing Ram Pai
2017-09-14  3:32   ` Balbir Singh
2017-09-14 16:17     ` Ram Pai
2017-10-19  4:20   ` Michael Ellerman
2017-10-19 17:11     ` Ram Pai
2017-10-24  8:17       ` Michael Ellerman
2017-09-08 22:44 ` [PATCH 02/25] powerpc: define an additional vma bit for protection keys Ram Pai
2017-09-14  4:38   ` Balbir Singh
2017-09-14  8:11     ` Benjamin Herrenschmidt
2017-10-23 21:06       ` Ram Pai
2017-09-14 16:15     ` Ram Pai
2017-10-23  9:25   ` Aneesh Kumar K.V
2017-10-23  9:28     ` Aneesh Kumar K.V
2017-10-23 17:57       ` Ram Pai
2017-10-23 17:43     ` Ram Pai
2017-09-08 22:44 ` [PATCH 03/25] powerpc: track allocation status of all pkeys Ram Pai
2017-10-07 10:02   ` Michael Ellerman
2017-10-08 23:02     ` Ram Pai
2017-10-18  2:47   ` Balbir Singh
2017-10-23  9:41   ` Aneesh Kumar K.V
2017-10-23 18:14     ` Ram Pai
2017-10-24  6:28   ` Aneesh Kumar K.V
2017-10-24  7:23     ` Ram Pai
2017-09-08 22:44 ` [PATCH 04/25] powerpc: helper function to read, write AMR, IAMR, UAMOR registers Ram Pai
2017-10-18  3:17   ` [PATCH 04/25] powerpc: helper function to read,write AMR,IAMR,UAMOR registers Balbir Singh
2017-10-18  3:42     ` Ram Pai
2017-09-08 22:44 ` [PATCH 05/25] powerpc: helper functions to initialize AMR, IAMR and UAMOR registers Ram Pai
2017-10-18  3:24   ` Balbir Singh
2017-10-18 20:38     ` Ram Pai
2017-10-24  6:25   ` Aneesh Kumar K.V
2017-10-24  7:04     ` Ram Pai
2017-10-24  8:29       ` Michael Ellerman
2017-09-08 22:44 ` [PATCH 06/25] powerpc: cleaup AMR, iAMR when a key is allocated or freed Ram Pai
2017-10-18  3:34   ` [PATCH 06/25] powerpc: cleaup AMR,iAMR " Balbir Singh
2017-10-23  9:43     ` [PATCH 06/25] powerpc: cleaup AMR, iAMR " Aneesh Kumar K.V
2017-10-23 18:36       ` [PATCH 06/25] powerpc: cleaup AMR,iAMR " Ram Pai
2017-10-23  9:43   ` [PATCH 06/25] powerpc: cleaup AMR, iAMR " Aneesh Kumar K.V
2017-10-23 18:29     ` [PATCH 06/25] powerpc: cleaup AMR,iAMR " Ram Pai
2017-09-08 22:44 ` [PATCH 07/25] powerpc: implementation for arch_set_user_pkey_access() Ram Pai
2017-09-08 22:44 ` [PATCH 08/25] powerpc: sys_pkey_alloc() and sys_pkey_free() system calls Ram Pai
2017-10-24 15:48   ` Michael Ellerman
2017-10-24 18:34     ` Ram Pai
2017-10-25  9:26       ` Michael Ellerman
2017-09-08 22:44 ` [PATCH 09/25] powerpc: ability to create execute-disabled pkeys Ram Pai
2017-10-18  3:42   ` Balbir Singh
2017-10-18  5:15     ` Ram Pai
2017-10-24  6:58       ` Aneesh Kumar K.V
2017-10-24  7:20         ` Ram Pai
2017-10-24  4:36   ` Aneesh Kumar K.V
2017-10-28 23:18     ` Ram Pai
2017-09-08 22:44 ` [PATCH 10/25] powerpc: store and restore the pkey state across context switches Ram Pai
2017-10-18  3:49   ` Balbir Singh
2017-10-18 20:47     ` Ram Pai
2017-10-18 23:00       ` Balbir Singh
2017-10-19  0:52         ` Ram Pai
2017-09-08 22:44 ` [PATCH 11/25] powerpc: introduce execute-only pkey Ram Pai
2017-10-18  4:15   ` Balbir Singh
2017-10-18 20:57     ` Ram Pai
2017-10-18 23:02       ` Balbir Singh
2017-10-19 15:52         ` Ram Pai
2017-09-08 22:45 ` [PATCH 12/25] powerpc: ability to associate pkey to a vma Ram Pai
2017-10-18  4:27   ` Balbir Singh
2017-10-18 21:01     ` Ram Pai
2017-09-08 22:45 ` [PATCH 13/25] powerpc: implementation for arch_override_mprotect_pkey() Ram Pai
2017-10-18  4:36   ` Balbir Singh
2017-10-18 21:10     ` Ram Pai
2017-10-18 23:04       ` Balbir Singh
2017-10-19 16:39         ` Ram Pai
2017-09-08 22:45 ` [PATCH 14/25] powerpc: map vma key-protection bits to pte key bits Ram Pai
2017-10-18  4:39   ` Balbir Singh
2017-10-18 21:14     ` Ram Pai
2017-09-08 22:45 ` [PATCH 15/25] powerpc: sys_pkey_mprotect() system call Ram Pai
2017-09-08 22:45 ` [PATCH 16/25] powerpc: Program HPTE key protection bits Ram Pai
2017-10-18  4:43   ` Balbir Singh
2017-09-08 22:45 ` [PATCH 17/25] powerpc: helper to validate key-access permissions of a pte Ram Pai
2017-10-18  4:48   ` Balbir Singh
2017-10-18 21:19     ` Ram Pai
2017-09-08 22:45 ` [PATCH 18/25] powerpc: check key protection for user page access Ram Pai
2017-10-18 19:57   ` Balbir Singh
2017-10-18 21:29     ` Ram Pai
2017-10-18 23:08       ` Balbir Singh
2017-10-19 16:46         ` Ram Pai
2017-09-08 22:45 ` [PATCH 19/25] powerpc: implementation for arch_vma_access_permitted() Ram Pai
2017-10-18 23:20   ` Balbir Singh
2017-10-24 15:48   ` Michael Ellerman
2017-09-08 22:45 ` [PATCH 20/25] powerpc: Handle exceptions caused by pkey violation Ram Pai
2017-10-18 23:27   ` Balbir Singh
2017-10-19 16:53     ` Ram Pai
2017-10-24 15:47   ` Michael Ellerman
2017-10-24 18:26     ` Ram Pai
2017-10-29 14:03     ` Aneesh Kumar K.V
2017-10-30  0:37       ` Ram Pai
2017-09-08 22:45 ` [PATCH 21/25] powerpc: introduce get_pte_pkey() helper Ram Pai
2017-10-18 23:29   ` Balbir Singh
2017-10-19 16:55     ` Ram Pai
2017-09-08 22:45 ` [PATCH 22/25] powerpc: capture the violated protection key on fault Ram Pai
2017-10-24 15:46   ` Michael Ellerman
2017-09-08 22:45 ` [PATCH 23/25] powerpc: Deliver SEGV signal on pkey violation Ram Pai
2017-10-24 15:46   ` Michael Ellerman
2017-10-24 17:19     ` Ram Pai
2017-09-08 22:45 ` [PATCH 24/25] powerpc/ptrace: Add memory protection key regset Ram Pai
2017-09-08 22:45 ` [PATCH 25/25] powerpc: Enable pkey subsystem Ram Pai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).