All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V6 0/2] PTE fixes for ARM LPAE
@ 2014-07-03 12:06 Steve Capper
  2014-07-03 12:06 ` [PATCH V6 1/2] arm: mm: Introduce {pte, pmd}_isset and {pte, pmd}_isclear Steve Capper
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Steve Capper @ 2014-07-03 12:06 UTC (permalink / raw)
  To: linux-arm-kernel

This series fixes a couple of problems that I encountered on ARM with
LPAE.
 1) Some pte accessors can have their results cancelled out by a
    downcast. This is addressed by the first patch.

 2) It is impossible to distinguish between clean writeable ptes and
    read only ptes. This is addressed by the second patch.

Notable changes from V5->V6:
 o) Renamed PMD_SECT_RDONLY to PMD_SECT_AP2 to make the distinction
    between software and hardware bits clearer.
 o) pmd_trans_huge tidied up to use !pmd_table(.)

Notable changes in this series from V4->V5:
 o) Rather than switch to L_PTE_WRITE semantics, we move L_PTE_RDONLY
    to a software bit. This allows for a much smaller patch set that
    does not affect the 2-level code.

Notable changes in this series from V3->V4:
 o) Rebased to take into account the BE LPAE fix in 3.16-rc1.
 o) THP logic added as this was missing.

This series has been tested with libhugetlbfs unit tests, ltp mm tests
and a custom THP test to test what happens when PROT_NONE THPs are
faulted. I have tested on an Arndale board with LPAE and classic MMU
(with classic MMU, only ltp mm tests were run). Testing was performed
against 3.16-rc3.

Any comments/feedback would be appreciated.

Cheers,
--
Steve

Steve Capper (2):
  arm: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear
  arm: mm: Modify pte_write and pmd_write logic for LPAE

 arch/arm/include/asm/pgtable-3level-hwdef.h |  3 +-
 arch/arm/include/asm/pgtable-3level.h       | 49 ++++++++++++++++++-----------
 arch/arm/include/asm/pgtable.h              | 18 ++++++-----
 arch/arm/mm/dump.c                          |  4 +--
 arch/arm/mm/proc-v7-3level.S                |  9 ++++--
 5 files changed, 52 insertions(+), 31 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH V6 1/2] arm: mm: Introduce {pte, pmd}_isset and {pte, pmd}_isclear
  2014-07-03 12:06 [PATCH V6 0/2] PTE fixes for ARM LPAE Steve Capper
@ 2014-07-03 12:06 ` Steve Capper
  2014-07-03 16:01   ` [PATCH V6 1/2] arm: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear Will Deacon
  2014-07-03 12:06 ` [PATCH V6 2/2] arm: mm: Modify pte_write and pmd_write logic for LPAE Steve Capper
  2014-07-03 15:44 ` [PATCH V7] " Steve Capper
  2 siblings, 1 reply; 8+ messages in thread
From: Steve Capper @ 2014-07-03 12:06 UTC (permalink / raw)
  To: linux-arm-kernel

Long descriptors on ARM are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.

For example:
	gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.

This patch introduces a new macro pte_isset which performs the bitwise
and, then performs a double logical invert (where needed) to ensure
predictable downcasting. The logical inverse pte_isclear is also
introduced.

Equivalent pmd functions for Transparent HugePages have also been
added.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
---
 arch/arm/include/asm/pgtable-3level.h | 12 ++++++++----
 arch/arm/include/asm/pgtable.h        | 18 +++++++++++-------
 2 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 85c60ad..34f371c 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -207,17 +207,21 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 #define pte_huge(pte)		(pte_val(pte) && !(pte_val(pte) & PTE_TABLE_BIT))
 #define pte_mkhuge(pte)		(__pte(pte_val(pte) & ~PTE_TABLE_BIT))
 
-#define pmd_young(pmd)		(pmd_val(pmd) & PMD_SECT_AF)
+#define pmd_isset(pmd, val)	((u32)(val) == (val) ? pmd_val(pmd) & (val) \
+						: !!(pmd_val(pmd) & (val)))
+#define pmd_isclear(pmd, val)	(!(pmd_val(pmd) & (val)))
+
+#define pmd_young(pmd)		(pmd_isset((pmd), PMD_SECT_AF))
 
 #define __HAVE_ARCH_PMD_WRITE
-#define pmd_write(pmd)		(!(pmd_val(pmd) & PMD_SECT_RDONLY))
+#define pmd_write(pmd)		(pmd_isclear((pmd), PMD_SECT_RDONLY))
 
 #define pmd_hugewillfault(pmd)	(!pmd_young(pmd) || !pmd_write(pmd))
 #define pmd_thp_or_huge(pmd)	(pmd_huge(pmd) || pmd_trans_huge(pmd))
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define pmd_trans_huge(pmd)	(pmd_val(pmd) && !(pmd_val(pmd) & PMD_TABLE_BIT))
-#define pmd_trans_splitting(pmd) (pmd_val(pmd) & PMD_SECT_SPLITTING)
+#define pmd_trans_huge(pmd)	(pmd_val(pmd) && !pmd_table(pmd))
+#define pmd_trans_splitting(pmd) (pmd_isset((pmd), PMD_SECT_SPLITTING))
 #endif
 
 #define PMD_BIT_FUNC(fn,op) \
diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h
index 5478e5d..01baef0 100644
--- a/arch/arm/include/asm/pgtable.h
+++ b/arch/arm/include/asm/pgtable.h
@@ -214,18 +214,22 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 
 #define pte_clear(mm,addr,ptep)	set_pte_ext(ptep, __pte(0), 0)
 
+#define pte_isset(pte, val)	((u32)(val) == (val) ? pte_val(pte) & (val) \
+						: !!(pte_val(pte) & (val)))
+#define pte_isclear(pte, val)	(!(pte_val(pte) & (val)))
+
 #define pte_none(pte)		(!pte_val(pte))
-#define pte_present(pte)	(pte_val(pte) & L_PTE_PRESENT)
-#define pte_valid(pte)		(pte_val(pte) & L_PTE_VALID)
+#define pte_present(pte)	(pte_isset((pte), L_PTE_PRESENT))
+#define pte_valid(pte)		(pte_isset((pte), L_PTE_VALID))
 #define pte_accessible(mm, pte)	(mm_tlb_flush_pending(mm) ? pte_present(pte) : pte_valid(pte))
-#define pte_write(pte)		(!(pte_val(pte) & L_PTE_RDONLY))
-#define pte_dirty(pte)		(pte_val(pte) & L_PTE_DIRTY)
-#define pte_young(pte)		(pte_val(pte) & L_PTE_YOUNG)
-#define pte_exec(pte)		(!(pte_val(pte) & L_PTE_XN))
+#define pte_write(pte)		(pte_isclear((pte), L_PTE_RDONLY))
+#define pte_dirty(pte)		(pte_isset((pte), L_PTE_DIRTY))
+#define pte_young(pte)		(pte_isset((pte), L_PTE_YOUNG))
+#define pte_exec(pte)		(pte_isclear((pte), L_PTE_XN))
 #define pte_special(pte)	(0)
 
 #define pte_valid_user(pte)	\
-	(pte_valid(pte) && (pte_val(pte) & L_PTE_USER) && pte_young(pte))
+	(pte_valid(pte) && pte_isset((pte), L_PTE_USER) && pte_young(pte))
 
 #if __LINUX_ARM_ARCH__ < 6
 static inline void __sync_icache_dcache(pte_t pteval)
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH V6 2/2] arm: mm: Modify pte_write and pmd_write logic for LPAE
  2014-07-03 12:06 [PATCH V6 0/2] PTE fixes for ARM LPAE Steve Capper
  2014-07-03 12:06 ` [PATCH V6 1/2] arm: mm: Introduce {pte, pmd}_isset and {pte, pmd}_isclear Steve Capper
@ 2014-07-03 12:06 ` Steve Capper
  2014-07-03 12:18   ` Russell King - ARM Linux
  2014-07-03 15:44 ` [PATCH V7] " Steve Capper
  2 siblings, 1 reply; 8+ messages in thread
From: Steve Capper @ 2014-07-03 12:06 UTC (permalink / raw)
  To: linux-arm-kernel

For LPAE, we have the following means for encoding writable or dirty
ptes:
                              L_PTE_DIRTY       L_PTE_RDONLY
    !pte_dirty && !pte_write        0               1
    !pte_dirty && pte_write         0               1
    pte_dirty && !pte_write         1               1
    pte_dirty && pte_write          1               0

So we can't distinguish between writeable clean ptes and read only
ptes. This can cause problems with ptes being incorrectly flagged as
read only when they are writeable but not dirty.

This patch renumbers L_PTE_RDONLY from AP[2] to a software bit #58,
and adds additional logic to set AP[2] whenever the pte is read only
or not dirty. That way we can distinguish between clean writeable ptes
and read only ptes.

HugeTLB pages will use this new logic automatically.

We need to add some logic to Transparent HugePages to ensure that they
correctly interpret the revised pgprot permissions (L_PTE_RDONLY has
moved and no longer matches PMD_SECT_AP2). In the process of revising
THP, the names of the PMD software bits have been prefixed with L_ to
make them easier to distinguish from their hardware bit counterparts.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
---
 arch/arm/include/asm/pgtable-3level-hwdef.h |  3 ++-
 arch/arm/include/asm/pgtable-3level.h       | 41 +++++++++++++++++------------
 arch/arm/mm/dump.c                          |  4 +--
 arch/arm/mm/proc-v7-3level.S                |  9 +++++--
 4 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
index 626989f..9fd61c7 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -43,7 +43,7 @@
 #define PMD_SECT_BUFFERABLE	(_AT(pmdval_t, 1) << 2)
 #define PMD_SECT_CACHEABLE	(_AT(pmdval_t, 1) << 3)
 #define PMD_SECT_USER		(_AT(pmdval_t, 1) << 6)		/* AP[1] */
-#define PMD_SECT_RDONLY		(_AT(pmdval_t, 1) << 7)		/* AP[2] */
+#define PMD_SECT_AP2		(_AT(pmdval_t, 1) << 7)		/* read only */
 #define PMD_SECT_S		(_AT(pmdval_t, 3) << 8)
 #define PMD_SECT_AF		(_AT(pmdval_t, 1) << 10)
 #define PMD_SECT_nG		(_AT(pmdval_t, 1) << 11)
@@ -72,6 +72,7 @@
 #define PTE_TABLE_BIT		(_AT(pteval_t, 1) << 1)
 #define PTE_BUFFERABLE		(_AT(pteval_t, 1) << 2)		/* AttrIndx[0] */
 #define PTE_CACHEABLE		(_AT(pteval_t, 1) << 3)		/* AttrIndx[1] */
+#define PTE_AP2			(_AT(pteval_t, 1) << 7)		/* AP[2] */
 #define PTE_EXT_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
 #define PTE_EXT_AF		(_AT(pteval_t, 1) << 10)	/* Access Flag */
 #define PTE_EXT_NG		(_AT(pteval_t, 1) << 11)	/* nG */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 34f371c..06e0bc0 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -79,18 +79,19 @@
 #define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Present */
 #define L_PTE_FILE		(_AT(pteval_t, 1) << 2)		/* only when !PRESENT */
 #define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
-#define L_PTE_RDONLY		(_AT(pteval_t, 1) << 7)		/* AP[2] */
 #define L_PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
 #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 10)	/* AF */
 #define L_PTE_XN		(_AT(pteval_t, 1) << 54)	/* XN */
-#define L_PTE_DIRTY		(_AT(pteval_t, 1) << 55)	/* unused */
-#define L_PTE_SPECIAL		(_AT(pteval_t, 1) << 56)	/* unused */
+#define L_PTE_DIRTY		(_AT(pteval_t, 1) << 55)
+#define L_PTE_SPECIAL		(_AT(pteval_t, 1) << 56)
 #define L_PTE_NONE		(_AT(pteval_t, 1) << 57)	/* PROT_NONE */
+#define L_PTE_RDONLY		(_AT(pteval_t, 1) << 58)	/* READ ONLY */
 
-#define PMD_SECT_VALID		(_AT(pmdval_t, 1) << 0)
-#define PMD_SECT_DIRTY		(_AT(pmdval_t, 1) << 55)
-#define PMD_SECT_SPLITTING	(_AT(pmdval_t, 1) << 56)
-#define PMD_SECT_NONE		(_AT(pmdval_t, 1) << 57)
+#define L_PMD_SECT_VALID	(_AT(pmdval_t, 1) << 0)
+#define L_PMD_SECT_DIRTY	(_AT(pmdval_t, 1) << 55)
+#define L_PMD_SECT_SPLITTING	(_AT(pmdval_t, 1) << 56)
+#define L_PMD_SECT_NONE		(_AT(pmdval_t, 1) << 57)
+#define L_PMD_SECT_RDONLY	(_AT(pteval_t, 1) << 58)
 
 /*
  * To be used in assembly code with the upper page attributes.
@@ -214,24 +215,25 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 #define pmd_young(pmd)		(pmd_isset((pmd), PMD_SECT_AF))
 
 #define __HAVE_ARCH_PMD_WRITE
-#define pmd_write(pmd)		(pmd_isclear((pmd), PMD_SECT_RDONLY))
+#define pmd_write(pmd)		(pmd_isclear((pmd), L_PMD_SECT_RDONLY))
+#define pmd_dirty(pmd)		(pmd_isset((pmd), L_PMD_SECT_DIRTY))
 
 #define pmd_hugewillfault(pmd)	(!pmd_young(pmd) || !pmd_write(pmd))
 #define pmd_thp_or_huge(pmd)	(pmd_huge(pmd) || pmd_trans_huge(pmd))
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define pmd_trans_huge(pmd)	(pmd_val(pmd) && !pmd_table(pmd))
-#define pmd_trans_splitting(pmd) (pmd_isset((pmd), PMD_SECT_SPLITTING))
+#define pmd_trans_splitting(pmd) (pmd_isset((pmd), L_PMD_SECT_SPLITTING))
 #endif
 
 #define PMD_BIT_FUNC(fn,op) \
 static inline pmd_t pmd_##fn(pmd_t pmd) { pmd_val(pmd) op; return pmd; }
 
-PMD_BIT_FUNC(wrprotect,	|= PMD_SECT_RDONLY);
+PMD_BIT_FUNC(wrprotect,	|= L_PMD_SECT_RDONLY);
 PMD_BIT_FUNC(mkold,	&= ~PMD_SECT_AF);
-PMD_BIT_FUNC(mksplitting, |= PMD_SECT_SPLITTING);
-PMD_BIT_FUNC(mkwrite,   &= ~PMD_SECT_RDONLY);
-PMD_BIT_FUNC(mkdirty,   |= PMD_SECT_DIRTY);
+PMD_BIT_FUNC(mksplitting, |= L_PMD_SECT_SPLITTING);
+PMD_BIT_FUNC(mkwrite,   &= ~L_PMD_SECT_RDONLY);
+PMD_BIT_FUNC(mkdirty,   |= L_PMD_SECT_DIRTY);
 PMD_BIT_FUNC(mkyoung,   |= PMD_SECT_AF);
 
 #define pmd_mkhuge(pmd)		(__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT))
@@ -245,8 +247,8 @@ PMD_BIT_FUNC(mkyoung,   |= PMD_SECT_AF);
 
 static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
 {
-	const pmdval_t mask = PMD_SECT_USER | PMD_SECT_XN | PMD_SECT_RDONLY |
-				PMD_SECT_VALID | PMD_SECT_NONE;
+	const pmdval_t mask = PMD_SECT_USER | PMD_SECT_XN | L_PMD_SECT_RDONLY |
+				L_PMD_SECT_VALID | L_PMD_SECT_NONE;
 	pmd_val(pmd) = (pmd_val(pmd) & ~mask) | (pgprot_val(newprot) & mask);
 	return pmd;
 }
@@ -257,8 +259,13 @@ static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 	BUG_ON(addr >= TASK_SIZE);
 
 	/* create a faulting entry if PROT_NONE protected */
-	if (pmd_val(pmd) & PMD_SECT_NONE)
-		pmd_val(pmd) &= ~PMD_SECT_VALID;
+	if (pmd_val(pmd) & L_PMD_SECT_NONE)
+		pmd_val(pmd) &= ~L_PMD_SECT_VALID;
+
+	if (pmd_write(pmd) && pmd_dirty(pmd))
+		pmd_val(pmd) &= ~PMD_SECT_AP2;
+	else
+		pmd_val(pmd) |= PMD_SECT_AP2;
 
 	*pmdp = __pmd(pmd_val(pmd) | PMD_SECT_nG);
 	flush_pmd_entry(pmdp);
diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
index c508f41..00a1cf9 100644
--- a/arch/arm/mm/dump.c
+++ b/arch/arm/mm/dump.c
@@ -126,8 +126,8 @@ static const struct prot_bits section_bits[] = {
 		.val	= PMD_SECT_USER,
 		.set	= "USR",
 	}, {
-		.mask	= PMD_SECT_RDONLY,
-		.val	= PMD_SECT_RDONLY,
+		.mask	= PMD_SECT_AP2,
+		.val	= PMD_SECT_AP2,
 		.set	= "ro",
 		.clear	= "RW",
 #elif __LINUX_ARM_ARCH__ >= 6
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 22e3ad6..eb81123 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -86,8 +86,13 @@ ENTRY(cpu_v7_set_pte_ext)
 	tst	rh, #1 << (57 - 32)		@ L_PTE_NONE
 	bicne	rl, #L_PTE_VALID
 	bne	1f
-	tst	rh, #1 << (55 - 32)		@ L_PTE_DIRTY
-	orreq	rl, #L_PTE_RDONLY
+
+	eor	ip, rh, #1 << (55 - 32)	@ toggle L_PTE_DIRTY in temp reg to
+					@ test for !L_PTE_DIRTY || L_PTE_RDONLY
+	tst	ip, #1 << (55 - 32) | 1 << (58 - 32)
+	orrne	rl, #PTE_AP2
+	biceq	rl, #PTE_AP2
+
 1:	strd	r2, r3, [r0]
 	ALT_SMP(W(nop))
 	ALT_UP (mcr	p15, 0, r0, c7, c10, 1)		@ flush_pte
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH V6 2/2] arm: mm: Modify pte_write and pmd_write logic for LPAE
  2014-07-03 12:06 ` [PATCH V6 2/2] arm: mm: Modify pte_write and pmd_write logic for LPAE Steve Capper
@ 2014-07-03 12:18   ` Russell King - ARM Linux
  2014-07-03 13:47     ` Steve Capper
  0 siblings, 1 reply; 8+ messages in thread
From: Russell King - ARM Linux @ 2014-07-03 12:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 03, 2014 at 01:06:13PM +0100, Steve Capper wrote:
> diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
> index c508f41..00a1cf9 100644
> --- a/arch/arm/mm/dump.c
> +++ b/arch/arm/mm/dump.c
> @@ -126,8 +126,8 @@ static const struct prot_bits section_bits[] = {
>  		.val	= PMD_SECT_USER,
>  		.set	= "USR",
>  	}, {
> -		.mask	= PMD_SECT_RDONLY,
> -		.val	= PMD_SECT_RDONLY,
> +		.mask	= PMD_SECT_AP2,
> +		.val	= PMD_SECT_AP2,

I think we're almost there, except for this hunk, which I think can
just be deleted.

We want to report what the PTEs/PMDs are requested to be, not what they
physically are at the point where we dump them out.  In other words, we
don't want to know that they're physically read-only because the dirty
bit isn't set - what we want to know is that they have permission to be
written to if the dirty bit were to be set.

The reasoning is that we want the dump to reflect what is possible -
for example, we want to know whether a page can be executed _and_ written
to.

Consider the case where the dirty bit is implemented in hardware.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH V6 2/2] arm: mm: Modify pte_write and pmd_write logic for LPAE
  2014-07-03 12:18   ` Russell King - ARM Linux
@ 2014-07-03 13:47     ` Steve Capper
  0 siblings, 0 replies; 8+ messages in thread
From: Steve Capper @ 2014-07-03 13:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 03, 2014 at 01:18:08PM +0100, Russell King - ARM Linux wrote:
> On Thu, Jul 03, 2014 at 01:06:13PM +0100, Steve Capper wrote:
> > diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
> > index c508f41..00a1cf9 100644
> > --- a/arch/arm/mm/dump.c
> > +++ b/arch/arm/mm/dump.c
> > @@ -126,8 +126,8 @@ static const struct prot_bits section_bits[] = {
> >  		.val	= PMD_SECT_USER,
> >  		.set	= "USR",
> >  	}, {
> > -		.mask	= PMD_SECT_RDONLY,
> > -		.val	= PMD_SECT_RDONLY,
> > +		.mask	= PMD_SECT_AP2,
> > +		.val	= PMD_SECT_AP2,
> 
> I think we're almost there, except for this hunk, which I think can
> just be deleted.
> 
> We want to report what the PTEs/PMDs are requested to be, not what they
> physically are at the point where we dump them out.  In other words, we
> don't want to know that they're physically read-only because the dirty
> bit isn't set - what we want to know is that they have permission to be
> written to if the dirty bit were to be set.
> 
> The reasoning is that we want the dump to reflect what is possible -
> for example, we want to know whether a page can be executed _and_ written
> to.
> 
> Consider the case where the dirty bit is implemented in hardware.

Thanks Russell, that makes more sense. I'll change the mask and val to:
L_PMD_SECT_RDONLY, to catch the software bit.

Cheers,
-- 
Steve

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH V7] arm: mm: Modify pte_write and pmd_write logic for LPAE
  2014-07-03 12:06 [PATCH V6 0/2] PTE fixes for ARM LPAE Steve Capper
  2014-07-03 12:06 ` [PATCH V6 1/2] arm: mm: Introduce {pte, pmd}_isset and {pte, pmd}_isclear Steve Capper
  2014-07-03 12:06 ` [PATCH V6 2/2] arm: mm: Modify pte_write and pmd_write logic for LPAE Steve Capper
@ 2014-07-03 15:44 ` Steve Capper
  2014-07-03 16:02   ` Will Deacon
  2 siblings, 1 reply; 8+ messages in thread
From: Steve Capper @ 2014-07-03 15:44 UTC (permalink / raw)
  To: linux-arm-kernel

For LPAE, we have the following means for encoding writable or dirty
ptes:
                              L_PTE_DIRTY       L_PTE_RDONLY
    !pte_dirty && !pte_write        0               1
    !pte_dirty && pte_write         0               1
    pte_dirty && !pte_write         1               1
    pte_dirty && pte_write          1               0

So we can't distinguish between writeable clean ptes and read only
ptes. This can cause problems with ptes being incorrectly flagged as
read only when they are writeable but not dirty.

This patch renumbers L_PTE_RDONLY from AP[2] to a software bit #58,
and adds additional logic to set AP[2] whenever the pte is read only
or not dirty. That way we can distinguish between clean writeable ptes
and read only ptes.

HugeTLB pages will use this new logic automatically.

We need to add some logic to Transparent HugePages to ensure that they
correctly interpret the revised pgprot permissions (L_PTE_RDONLY has
moved and no longer matches PMD_SECT_AP2). In the process of revising
THP, the names of the PMD software bits have been prefixed with L_ to
make them easier to distinguish from their hardware bit counterparts.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
---
Just posting this patch, it should replace the last patch in the series:
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-July/268977.html

Changed in v7: 
Check for L_PMD_SECT_RDONLY in kernel pagetable layout walker rather than
PMD_SECT_AP2.
---
 arch/arm/include/asm/pgtable-3level-hwdef.h |  3 ++-
 arch/arm/include/asm/pgtable-3level.h       | 41 +++++++++++++++++------------
 arch/arm/mm/dump.c                          |  4 +--
 arch/arm/mm/proc-v7-3level.S                |  9 +++++--
 4 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h b/arch/arm/include/asm/pgtable-3level-hwdef.h
index 626989f..9fd61c7 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -43,7 +43,7 @@
 #define PMD_SECT_BUFFERABLE	(_AT(pmdval_t, 1) << 2)
 #define PMD_SECT_CACHEABLE	(_AT(pmdval_t, 1) << 3)
 #define PMD_SECT_USER		(_AT(pmdval_t, 1) << 6)		/* AP[1] */
-#define PMD_SECT_RDONLY		(_AT(pmdval_t, 1) << 7)		/* AP[2] */
+#define PMD_SECT_AP2		(_AT(pmdval_t, 1) << 7)		/* read only */
 #define PMD_SECT_S		(_AT(pmdval_t, 3) << 8)
 #define PMD_SECT_AF		(_AT(pmdval_t, 1) << 10)
 #define PMD_SECT_nG		(_AT(pmdval_t, 1) << 11)
@@ -72,6 +72,7 @@
 #define PTE_TABLE_BIT		(_AT(pteval_t, 1) << 1)
 #define PTE_BUFFERABLE		(_AT(pteval_t, 1) << 2)		/* AttrIndx[0] */
 #define PTE_CACHEABLE		(_AT(pteval_t, 1) << 3)		/* AttrIndx[1] */
+#define PTE_AP2			(_AT(pteval_t, 1) << 7)		/* AP[2] */
 #define PTE_EXT_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
 #define PTE_EXT_AF		(_AT(pteval_t, 1) << 10)	/* Access Flag */
 #define PTE_EXT_NG		(_AT(pteval_t, 1) << 11)	/* nG */
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index 34f371c..06e0bc0 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -79,18 +79,19 @@
 #define L_PTE_PRESENT		(_AT(pteval_t, 3) << 0)		/* Present */
 #define L_PTE_FILE		(_AT(pteval_t, 1) << 2)		/* only when !PRESENT */
 #define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
-#define L_PTE_RDONLY		(_AT(pteval_t, 1) << 7)		/* AP[2] */
 #define L_PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
 #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 10)	/* AF */
 #define L_PTE_XN		(_AT(pteval_t, 1) << 54)	/* XN */
-#define L_PTE_DIRTY		(_AT(pteval_t, 1) << 55)	/* unused */
-#define L_PTE_SPECIAL		(_AT(pteval_t, 1) << 56)	/* unused */
+#define L_PTE_DIRTY		(_AT(pteval_t, 1) << 55)
+#define L_PTE_SPECIAL		(_AT(pteval_t, 1) << 56)
 #define L_PTE_NONE		(_AT(pteval_t, 1) << 57)	/* PROT_NONE */
+#define L_PTE_RDONLY		(_AT(pteval_t, 1) << 58)	/* READ ONLY */
 
-#define PMD_SECT_VALID		(_AT(pmdval_t, 1) << 0)
-#define PMD_SECT_DIRTY		(_AT(pmdval_t, 1) << 55)
-#define PMD_SECT_SPLITTING	(_AT(pmdval_t, 1) << 56)
-#define PMD_SECT_NONE		(_AT(pmdval_t, 1) << 57)
+#define L_PMD_SECT_VALID	(_AT(pmdval_t, 1) << 0)
+#define L_PMD_SECT_DIRTY	(_AT(pmdval_t, 1) << 55)
+#define L_PMD_SECT_SPLITTING	(_AT(pmdval_t, 1) << 56)
+#define L_PMD_SECT_NONE		(_AT(pmdval_t, 1) << 57)
+#define L_PMD_SECT_RDONLY	(_AT(pteval_t, 1) << 58)
 
 /*
  * To be used in assembly code with the upper page attributes.
@@ -214,24 +215,25 @@ static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
 #define pmd_young(pmd)		(pmd_isset((pmd), PMD_SECT_AF))
 
 #define __HAVE_ARCH_PMD_WRITE
-#define pmd_write(pmd)		(pmd_isclear((pmd), PMD_SECT_RDONLY))
+#define pmd_write(pmd)		(pmd_isclear((pmd), L_PMD_SECT_RDONLY))
+#define pmd_dirty(pmd)		(pmd_isset((pmd), L_PMD_SECT_DIRTY))
 
 #define pmd_hugewillfault(pmd)	(!pmd_young(pmd) || !pmd_write(pmd))
 #define pmd_thp_or_huge(pmd)	(pmd_huge(pmd) || pmd_trans_huge(pmd))
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define pmd_trans_huge(pmd)	(pmd_val(pmd) && !pmd_table(pmd))
-#define pmd_trans_splitting(pmd) (pmd_isset((pmd), PMD_SECT_SPLITTING))
+#define pmd_trans_splitting(pmd) (pmd_isset((pmd), L_PMD_SECT_SPLITTING))
 #endif
 
 #define PMD_BIT_FUNC(fn,op) \
 static inline pmd_t pmd_##fn(pmd_t pmd) { pmd_val(pmd) op; return pmd; }
 
-PMD_BIT_FUNC(wrprotect,	|= PMD_SECT_RDONLY);
+PMD_BIT_FUNC(wrprotect,	|= L_PMD_SECT_RDONLY);
 PMD_BIT_FUNC(mkold,	&= ~PMD_SECT_AF);
-PMD_BIT_FUNC(mksplitting, |= PMD_SECT_SPLITTING);
-PMD_BIT_FUNC(mkwrite,   &= ~PMD_SECT_RDONLY);
-PMD_BIT_FUNC(mkdirty,   |= PMD_SECT_DIRTY);
+PMD_BIT_FUNC(mksplitting, |= L_PMD_SECT_SPLITTING);
+PMD_BIT_FUNC(mkwrite,   &= ~L_PMD_SECT_RDONLY);
+PMD_BIT_FUNC(mkdirty,   |= L_PMD_SECT_DIRTY);
 PMD_BIT_FUNC(mkyoung,   |= PMD_SECT_AF);
 
 #define pmd_mkhuge(pmd)		(__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT))
@@ -245,8 +247,8 @@ PMD_BIT_FUNC(mkyoung,   |= PMD_SECT_AF);
 
 static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
 {
-	const pmdval_t mask = PMD_SECT_USER | PMD_SECT_XN | PMD_SECT_RDONLY |
-				PMD_SECT_VALID | PMD_SECT_NONE;
+	const pmdval_t mask = PMD_SECT_USER | PMD_SECT_XN | L_PMD_SECT_RDONLY |
+				L_PMD_SECT_VALID | L_PMD_SECT_NONE;
 	pmd_val(pmd) = (pmd_val(pmd) & ~mask) | (pgprot_val(newprot) & mask);
 	return pmd;
 }
@@ -257,8 +259,13 @@ static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 	BUG_ON(addr >= TASK_SIZE);
 
 	/* create a faulting entry if PROT_NONE protected */
-	if (pmd_val(pmd) & PMD_SECT_NONE)
-		pmd_val(pmd) &= ~PMD_SECT_VALID;
+	if (pmd_val(pmd) & L_PMD_SECT_NONE)
+		pmd_val(pmd) &= ~L_PMD_SECT_VALID;
+
+	if (pmd_write(pmd) && pmd_dirty(pmd))
+		pmd_val(pmd) &= ~PMD_SECT_AP2;
+	else
+		pmd_val(pmd) |= PMD_SECT_AP2;
 
 	*pmdp = __pmd(pmd_val(pmd) | PMD_SECT_nG);
 	flush_pmd_entry(pmdp);
diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c
index c508f41..5942493 100644
--- a/arch/arm/mm/dump.c
+++ b/arch/arm/mm/dump.c
@@ -126,8 +126,8 @@ static const struct prot_bits section_bits[] = {
 		.val	= PMD_SECT_USER,
 		.set	= "USR",
 	}, {
-		.mask	= PMD_SECT_RDONLY,
-		.val	= PMD_SECT_RDONLY,
+		.mask	= L_PMD_SECT_RDONLY,
+		.val	= L_PMD_SECT_RDONLY,
 		.set	= "ro",
 		.clear	= "RW",
 #elif __LINUX_ARM_ARCH__ >= 6
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 22e3ad6..eb81123 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -86,8 +86,13 @@ ENTRY(cpu_v7_set_pte_ext)
 	tst	rh, #1 << (57 - 32)		@ L_PTE_NONE
 	bicne	rl, #L_PTE_VALID
 	bne	1f
-	tst	rh, #1 << (55 - 32)		@ L_PTE_DIRTY
-	orreq	rl, #L_PTE_RDONLY
+
+	eor	ip, rh, #1 << (55 - 32)	@ toggle L_PTE_DIRTY in temp reg to
+					@ test for !L_PTE_DIRTY || L_PTE_RDONLY
+	tst	ip, #1 << (55 - 32) | 1 << (58 - 32)
+	orrne	rl, #PTE_AP2
+	biceq	rl, #PTE_AP2
+
 1:	strd	r2, r3, [r0]
 	ALT_SMP(W(nop))
 	ALT_UP (mcr	p15, 0, r0, c7, c10, 1)		@ flush_pte
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH V6 1/2] arm: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear
  2014-07-03 12:06 ` [PATCH V6 1/2] arm: mm: Introduce {pte, pmd}_isset and {pte, pmd}_isclear Steve Capper
@ 2014-07-03 16:01   ` Will Deacon
  0 siblings, 0 replies; 8+ messages in thread
From: Will Deacon @ 2014-07-03 16:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 03, 2014 at 01:06:12PM +0100, Steve Capper wrote:
> Long descriptors on ARM are 64 bits, and some pte functions such as
> pte_dirty return a bitwise-and of a flag with the pte value. If the
> flag to be tested resides in the upper 32 bits of the pte, then we run
> into the danger of the result being dropped if downcast.
> 
> For example:
> 	gather_stats(page, md, pte_dirty(*pte), 1);
> where pte_dirty(*pte) is downcast to an int.
> 
> This patch introduces a new macro pte_isset which performs the bitwise
> and, then performs a double logical invert (where needed) to ensure
> predictable downcasting. The logical inverse pte_isclear is also
> introduced.
> 
> Equivalent pmd functions for Transparent HugePages have also been
> added.
> 
> Signed-off-by: Steve Capper <steve.capper@linaro.org>

  Reviewed-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH V7] arm: mm: Modify pte_write and pmd_write logic for LPAE
  2014-07-03 15:44 ` [PATCH V7] " Steve Capper
@ 2014-07-03 16:02   ` Will Deacon
  0 siblings, 0 replies; 8+ messages in thread
From: Will Deacon @ 2014-07-03 16:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jul 03, 2014 at 04:44:56PM +0100, Steve Capper wrote:
> For LPAE, we have the following means for encoding writable or dirty
> ptes:
>                               L_PTE_DIRTY       L_PTE_RDONLY
>     !pte_dirty && !pte_write        0               1
>     !pte_dirty && pte_write         0               1
>     pte_dirty && !pte_write         1               1
>     pte_dirty && pte_write          1               0
> 
> So we can't distinguish between writeable clean ptes and read only
> ptes. This can cause problems with ptes being incorrectly flagged as
> read only when they are writeable but not dirty.
> 
> This patch renumbers L_PTE_RDONLY from AP[2] to a software bit #58,
> and adds additional logic to set AP[2] whenever the pte is read only
> or not dirty. That way we can distinguish between clean writeable ptes
> and read only ptes.
> 
> HugeTLB pages will use this new logic automatically.
> 
> We need to add some logic to Transparent HugePages to ensure that they
> correctly interpret the revised pgprot permissions (L_PTE_RDONLY has
> moved and no longer matches PMD_SECT_AP2). In the process of revising
> THP, the names of the PMD software bits have been prefixed with L_ to
> make them easier to distinguish from their hardware bit counterparts.
> 
> Signed-off-by: Steve Capper <steve.capper@linaro.org>

  Reviewed-by: Will Deacon <will.deacon@arm.com>

Will

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2014-07-03 16:02 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-03 12:06 [PATCH V6 0/2] PTE fixes for ARM LPAE Steve Capper
2014-07-03 12:06 ` [PATCH V6 1/2] arm: mm: Introduce {pte, pmd}_isset and {pte, pmd}_isclear Steve Capper
2014-07-03 16:01   ` [PATCH V6 1/2] arm: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear Will Deacon
2014-07-03 12:06 ` [PATCH V6 2/2] arm: mm: Modify pte_write and pmd_write logic for LPAE Steve Capper
2014-07-03 12:18   ` Russell King - ARM Linux
2014-07-03 13:47     ` Steve Capper
2014-07-03 15:44 ` [PATCH V7] " Steve Capper
2014-07-03 16:02   ` Will Deacon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.