* [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
@ 2017-02-09 2:58 Aneesh Kumar K.V
2017-02-09 2:58 ` [PATCH 2/3] powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm Aneesh Kumar K.V
` (4 more replies)
0 siblings, 5 replies; 10+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-09 2:58 UTC (permalink / raw)
To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V
In the kernel we do follow the below sequence in different code paths.
pte = ptep_get_clear(ptep)
....
set_pte_at(ptep, pte)
We do that for mremap, autonuma protection update and softdirty clearing. This
implies our optimization to skip a tlb flush when clearing a pte update is
not valid, because for DD1 system that followup set_pte_at will be done witout
doing the required tlbflush. Fix that by always doing the dd1 style pte update
irrespective of new_pte value. In a later patch we will optimize the application
exit case.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/book3s/64/radix.h | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index b4d1302387a3..70a3cdcdbe47 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
* new value of pte
*/
new_pte = (old_pte | set) & ~clr;
- /*
- * If we are trying to clear the pte, we can skip
- * the below sequence and batch the tlb flush. The
- * tlb flush batching is done by mmu gather code
- */
- if (new_pte) {
- asm volatile("ptesync" : : : "memory");
- radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
+ asm volatile("ptesync" : : : "memory");
+ radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
+ if (new_pte)
__radix_pte_update(ptep, 0, new_pte);
- }
} else
old_pte = __radix_pte_update(ptep, clr, set);
asm volatile("ptesync" : : : "memory");
--
2.7.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 2/3] powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm
2017-02-09 2:58 [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Aneesh Kumar K.V
@ 2017-02-09 2:58 ` Aneesh Kumar K.V
2017-02-14 4:17 ` Michael Neuling
2017-02-09 2:58 ` [PATCH 3/3] powerpc/mm/radix: Skip ptesync in pte update helpers Aneesh Kumar K.V
` (3 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-09 2:58 UTC (permalink / raw)
To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V
This helps us to do some optimization for application exit case, where we can
skip the DD1 style pte update sequence.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 17 +++++++++++++++++
arch/powerpc/include/asm/book3s/64/radix.h | 23 ++++++++++++++++++++++-
2 files changed, 39 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 6f15bde94da2..e91ada786d48 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -373,6 +373,23 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
return __pte(old);
}
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
+static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
+ unsigned long addr,
+ pte_t *ptep, int full)
+{
+ if (full && radix_enabled()) {
+ /*
+ * Let's skip the DD1 style pte update here. We know that
+ * this is a full mm pte clear and hence can be sure there is
+ * no parallel set_pte.
+ */
+ return radix__ptep_get_and_clear_full(mm, addr, ptep, full);
+ }
+ return ptep_get_and_clear(mm, addr, ptep);
+}
+
+
static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
pte_t * ptep)
{
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 70a3cdcdbe47..fcf822d6c204 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -139,7 +139,7 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
unsigned long new_pte;
- old_pte = __radix_pte_update(ptep, ~0, 0);
+ old_pte = __radix_pte_update(ptep, ~0ul, 0);
/*
* new value of pte
*/
@@ -157,6 +157,27 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
return old_pte;
}
+static inline pte_t radix__ptep_get_and_clear_full(struct mm_struct *mm,
+ unsigned long addr,
+ pte_t *ptep, int full)
+{
+ unsigned long old_pte;
+
+ if (full) {
+ /*
+ * If we are trying to clear the pte, we can skip
+ * the DD1 pte update sequence and batch the tlb flush. The
+ * tlb flush batching is done by mmu gather code. We
+ * still keep the cmp_xchg update to make sure we get
+ * correct R/C bit which might be updated via Nest MMU.
+ */
+ old_pte = __radix_pte_update(ptep, ~0ul, 0);
+ } else
+ old_pte = radix__pte_update(mm, addr, ptep, ~0ul, 0, 0);
+
+ return __pte(old_pte);
+}
+
/*
* Set the dirty and/or accessed bits atomically in a linux PTE, this
* function doesn't need to invalidate tlb.
--
2.7.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH 3/3] powerpc/mm/radix: Skip ptesync in pte update helpers
2017-02-09 2:58 [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Aneesh Kumar K.V
2017-02-09 2:58 ` [PATCH 2/3] powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm Aneesh Kumar K.V
@ 2017-02-09 2:58 ` Aneesh Kumar K.V
2017-02-14 4:17 ` Michael Neuling
2017-02-09 3:49 ` [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Benjamin Herrenschmidt
` (2 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-09 2:58 UTC (permalink / raw)
To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V
We do them at the start of tlb flush, and we are sure a pte update will be
followed by a tlbflush. Hence we can skip the ptesync in pte update helpers.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/book3s/64/radix.h | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index fcf822d6c204..77e590c77299 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -144,13 +144,11 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
* new value of pte
*/
new_pte = (old_pte | set) & ~clr;
- asm volatile("ptesync" : : : "memory");
radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
if (new_pte)
__radix_pte_update(ptep, 0, new_pte);
} else
old_pte = __radix_pte_update(ptep, clr, set);
- asm volatile("ptesync" : : : "memory");
if (!huge)
assert_pte_locked(mm, addr);
@@ -195,7 +193,6 @@ static inline void radix__ptep_set_access_flags(struct mm_struct *mm,
unsigned long old_pte, new_pte;
old_pte = __radix_pte_update(ptep, ~0, 0);
- asm volatile("ptesync" : : : "memory");
/*
* new value of pte
*/
--
2.7.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
2017-02-09 2:58 [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Aneesh Kumar K.V
2017-02-09 2:58 ` [PATCH 2/3] powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm Aneesh Kumar K.V
2017-02-09 2:58 ` [PATCH 3/3] powerpc/mm/radix: Skip ptesync in pte update helpers Aneesh Kumar K.V
@ 2017-02-09 3:49 ` Benjamin Herrenschmidt
2017-02-09 4:15 ` Benjamin Herrenschmidt
2017-02-14 4:17 ` Michael Neuling
2017-02-16 5:59 ` [1/3] " Michael Ellerman
4 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2017-02-09 3:49 UTC (permalink / raw)
To: Aneesh Kumar K.V, paulus, mpe; +Cc: linuxppc-dev
On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
> In the kernel we do follow the below sequence in different code paths.
> pte = ptep_get_clear(ptep)
> ....
> set_pte_at(ptep, pte)
>
> We do that for mremap, autonuma protection update and softdirty clearing. This
> implies our optimization to skip a tlb flush when clearing a pte update is
> not valid, because for DD1 system that followup set_pte_at will be done witout
> doing the required tlbflush. Fix that by always doing the dd1 style pte update
> irrespective of new_pte value. In a later patch we will optimize the application
> exit case.
What about my change to set_pte_at() ? We seem to be overwriting valid PTEs,
shouldn't we deal with that ?
Cheers,
Ben.
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/radix.h | 12 +++---------
> 1 file changed, 3 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index b4d1302387a3..70a3cdcdbe47 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struct mm_struct *mm,
> > * new value of pte
> > */
> > new_pte = (old_pte | set) & ~clr;
> > - /*
> > - * If we are trying to clear the pte, we can skip
> > - * the below sequence and batch the tlb flush. The
> > - * tlb flush batching is done by mmu gather code
> > - */
> > - if (new_pte) {
> > - asm volatile("ptesync" : : : "memory");
> > - radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> > + asm volatile("ptesync" : : : "memory");
> > + radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> > + if (new_pte)
> > __radix_pte_update(ptep, 0, new_pte);
> > - }
> > } else
> > old_pte = __radix_pte_update(ptep, clr, set);
> > asm volatile("ptesync" : : : "memory");
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
2017-02-09 3:49 ` [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Benjamin Herrenschmidt
@ 2017-02-09 4:15 ` Benjamin Herrenschmidt
2017-02-09 4:34 ` Aneesh Kumar K.V
0 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2017-02-09 4:15 UTC (permalink / raw)
To: Aneesh Kumar K.V, paulus, mpe; +Cc: linuxppc-dev
On Thu, 2017-02-09 at 14:49 +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
> > In the kernel we do follow the below sequence in different code
> > paths.
> > pte = ptep_get_clear(ptep)
> > ....
> > set_pte_at(ptep, pte)
> >
> > We do that for mremap, autonuma protection update and softdirty
> > clearing. This
> > implies our optimization to skip a tlb flush when clearing a pte
> > update is
> > not valid, because for DD1 system that followup set_pte_at will be
> > done witout
> > doing the required tlbflush. Fix that by always doing the dd1 style
> > pte update
> > irrespective of new_pte value. In a later patch we will optimize
> > the application
> > exit case.
>
> What about my change to set_pte_at() ? We seem to be overwriting
> valid PTEs,
> shouldn't we deal with that ?
So the HW guys confirmed that the TLB will never cache a valid entry
that has all permissions clear. That leaves the THP write problem
though.
Cheers,
Ben.
> Cheers,
> Ben.
>
> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> >
> > ---
> > arch/powerpc/include/asm/book3s/64/radix.h | 12 +++---------
> > 1 file changed, 3 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/book3s/64/radix.h
> > b/arch/powerpc/include/asm/book3s/64/radix.h
> > index b4d1302387a3..70a3cdcdbe47 100644
> > --- a/arch/powerpc/include/asm/book3s/64/radix.h
> > +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> > @@ -144,16 +144,10 @@ static inline unsigned long
> > radix__pte_update(struct mm_struct *mm,
> > > * new value of pte
> > > */
> > > new_pte = (old_pte | set) & ~clr;
> > > - /*
> > > - * If we are trying to clear the pte, we can
> > > skip
> > > - * the below sequence and batch the tlb flush.
> > > The
> > > - * tlb flush batching is done by mmu gather code
> > > - */
> > > - if (new_pte) {
> > > - asm volatile("ptesync" : : : "memory");
> > > - radix__flush_tlb_pte_p9_dd1(old_pte, mm,
> > > addr);
> > > + asm volatile("ptesync" : : : "memory");
> > > + radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> > > + if (new_pte)
> > > __radix_pte_update(ptep, 0, new_pte);
> > > - }
> > > } else
> > > old_pte = __radix_pte_update(ptep, clr, set);
> > > asm volatile("ptesync" : : : "memory");
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
2017-02-09 4:15 ` Benjamin Herrenschmidt
@ 2017-02-09 4:34 ` Aneesh Kumar K.V
0 siblings, 0 replies; 10+ messages in thread
From: Aneesh Kumar K.V @ 2017-02-09 4:34 UTC (permalink / raw)
To: Benjamin Herrenschmidt, paulus, mpe; +Cc: linuxppc-dev
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> On Thu, 2017-02-09 at 14:49 +1100, Benjamin Herrenschmidt wrote:
>> On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
>> > In the kernel we do follow the below sequence in different code
>> > paths.
>> > pte = ptep_get_clear(ptep)
>> > ....
>> > set_pte_at(ptep, pte)
>> >
>> > We do that for mremap, autonuma protection update and softdirty
>> > clearing. This
>> > implies our optimization to skip a tlb flush when clearing a pte
>> > update is
>> > not valid, because for DD1 system that followup set_pte_at will be
>> > done witout
>> > doing the required tlbflush. Fix that by always doing the dd1 style
>> > pte update
>> > irrespective of new_pte value. In a later patch we will optimize
>> > the application
>> > exit case.
>>
>> What about my change to set_pte_at() ? We seem to be overwriting
>> valid PTEs,
>> shouldn't we deal with that ?
>
> So the HW guys confirmed that the TLB will never cache a valid entry
> that has all permissions clear. That leaves the THP write problem
> though.
Which is fixed by the autonuma related changes posted at
https://lkml.kernel.org/r/1486609259-6796-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com
Right now i am running a kernel compile in loop with parallel perf
bench numa mem run to make sure we got most of the details correct.
(this is on p8)
-aneesh
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
2017-02-09 2:58 [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Aneesh Kumar K.V
` (2 preceding siblings ...)
2017-02-09 3:49 ` [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Benjamin Herrenschmidt
@ 2017-02-14 4:17 ` Michael Neuling
2017-02-16 5:59 ` [1/3] " Michael Ellerman
4 siblings, 0 replies; 10+ messages in thread
From: Michael Neuling @ 2017-02-14 4:17 UTC (permalink / raw)
To: Aneesh Kumar K.V, benh, paulus, mpe; +Cc: linuxppc-dev
On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
> In the kernel we do follow the below sequence in different code paths.
> pte =3D ptep_get_clear(ptep)
> ....
> set_pte_at(ptep, pte)
>=20
> We do that for mremap, autonuma protection update and softdirty clearing.=
This
> implies our optimization to skip a tlb flush when clearing a pte update i=
s
> not valid, because for DD1 system that followup set_pte_at will be done w=
itout
> doing the required tlbflush. Fix that by always doing the dd1 style pte u=
pdate
> irrespective of new_pte value. In a later patch we will optimize the
> application
> exit case.
>=20
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Michael Neuling <mikey@neuling.org>
> ---
> =C2=A0arch/powerpc/include/asm/book3s/64/radix.h | 12 +++---------
> =C2=A01 file changed, 3 insertions(+), 9 deletions(-)
>=20
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h
> b/arch/powerpc/include/asm/book3s/64/radix.h
> index b4d1302387a3..70a3cdcdbe47 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -144,16 +144,10 @@ static inline unsigned long radix__pte_update(struc=
t
> mm_struct *mm,
> =C2=A0 =C2=A0* new value of pte
> =C2=A0 =C2=A0*/
> =C2=A0 new_pte =3D (old_pte | set) & ~clr;
> - /*
> - =C2=A0* If we are trying to clear the pte, we can skip
> - =C2=A0* the below sequence and batch the tlb flush. The
> - =C2=A0* tlb flush batching is done by mmu gather code
> - =C2=A0*/
> - if (new_pte) {
> - asm volatile("ptesync" : : : "memory");
> - radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> + asm volatile("ptesync" : : : "memory");
> + radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> + if (new_pte)
> =C2=A0 __radix_pte_update(ptep, 0, new_pte);
> - }
> =C2=A0 } else
> =C2=A0 old_pte =3D __radix_pte_update(ptep, clr, set);
> =C2=A0 asm volatile("ptesync" : : : "memory");
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 2/3] powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm
2017-02-09 2:58 ` [PATCH 2/3] powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm Aneesh Kumar K.V
@ 2017-02-14 4:17 ` Michael Neuling
0 siblings, 0 replies; 10+ messages in thread
From: Michael Neuling @ 2017-02-14 4:17 UTC (permalink / raw)
To: Aneesh Kumar K.V, benh, paulus, mpe; +Cc: linuxppc-dev
On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
> This helps us to do some optimization for application exit case, where we=
can
> skip the DD1 style pte update sequence.
>=20
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Michael Neuling <mikey@neuling.org>
> ---
> =C2=A0arch/powerpc/include/asm/book3s/64/pgtable.h | 17 +++++++++++++++++
> =C2=A0arch/powerpc/include/asm/book3s/64/radix.h=C2=A0=C2=A0=C2=A0| 23 ++=
++++++++++++++++++++-
> =C2=A02 files changed, 39 insertions(+), 1 deletion(-)
>=20
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h
> b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 6f15bde94da2..e91ada786d48 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -373,6 +373,23 @@ static inline pte_t ptep_get_and_clear(struct mm_str=
uct
> *mm,
> =C2=A0 return __pte(old);
> =C2=A0}
> =C2=A0
> +#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_FULL
> +static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
> + =C2=A0=C2=A0=C2=A0=C2=A0unsigned long addr,
> + =C2=A0=C2=A0=C2=A0=C2=A0pte_t *ptep, int full)
> +{
> + if (full && radix_enabled()) {
> + /*
> + =C2=A0* Let's skip the DD1 style pte update here. We know that
> + =C2=A0* this is a full mm pte clear and hence can be sure there is
> + =C2=A0* no parallel set_pte.
> + =C2=A0*/
> + return radix__ptep_get_and_clear_full(mm, addr, ptep, full);
> + }
> + return ptep_get_and_clear(mm, addr, ptep);
> +}
> +
> +
> =C2=A0static inline void pte_clear(struct mm_struct *mm, unsigned long ad=
dr,
> =C2=A0 =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0pte_t * ptep)
> =C2=A0{
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h
> b/arch/powerpc/include/asm/book3s/64/radix.h
> index 70a3cdcdbe47..fcf822d6c204 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -139,7 +139,7 @@ static inline unsigned long radix__pte_update(struct
> mm_struct *mm,
> =C2=A0
> =C2=A0 unsigned long new_pte;
> =C2=A0
> - old_pte =3D __radix_pte_update(ptep, ~0, 0);
> + old_pte =3D __radix_pte_update(ptep, ~0ul, 0);
> =C2=A0 /*
> =C2=A0 =C2=A0* new value of pte
> =C2=A0 =C2=A0*/
> @@ -157,6 +157,27 @@ static inline unsigned long radix__pte_update(struct
> mm_struct *mm,
> =C2=A0 return old_pte;
> =C2=A0}
> =C2=A0
> +static inline pte_t radix__ptep_get_and_clear_full(struct mm_struct *mm,
> + =C2=A0=C2=A0=C2=A0unsigned long addr,
> + =C2=A0=C2=A0=C2=A0pte_t *ptep, int full)
> +{
> + unsigned long old_pte;
> +
> + if (full) {
> + /*
> + =C2=A0* If we are trying to clear the pte, we can skip
> + =C2=A0* the DD1 pte update sequence and batch the tlb flush. The
> + =C2=A0* tlb flush batching is done by mmu gather code. We
> + =C2=A0* still keep the cmp_xchg update to make sure we get
> + =C2=A0* correct R/C bit which might be updated via Nest MMU.
> + =C2=A0*/
> + old_pte =3D __radix_pte_update(ptep, ~0ul, 0);
> + } else
> + old_pte =3D radix__pte_update(mm, addr, ptep, ~0ul, 0, 0);
> +
> + return __pte(old_pte);
> +}
> +
> =C2=A0/*
> =C2=A0 * Set the dirty and/or accessed bits atomically in a linux PTE, th=
is
> =C2=A0 * function doesn't need to invalidate tlb.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] powerpc/mm/radix: Skip ptesync in pte update helpers
2017-02-09 2:58 ` [PATCH 3/3] powerpc/mm/radix: Skip ptesync in pte update helpers Aneesh Kumar K.V
@ 2017-02-14 4:17 ` Michael Neuling
0 siblings, 0 replies; 10+ messages in thread
From: Michael Neuling @ 2017-02-14 4:17 UTC (permalink / raw)
To: Aneesh Kumar K.V, benh, paulus, mpe; +Cc: linuxppc-dev
On Thu, 2017-02-09 at 08:28 +0530, Aneesh Kumar K.V wrote:
> We do them at the start of tlb flush, and we are sure a pte update will b=
e
> followed by a tlbflush. Hence we can skip the ptesync in pte update helpe=
rs.
>=20
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Tested-by: Michael Neuling <mikey@neuling.org>
> ---
> =C2=A0arch/powerpc/include/asm/book3s/64/radix.h | 3 ---
> =C2=A01 file changed, 3 deletions(-)
>=20
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h
> b/arch/powerpc/include/asm/book3s/64/radix.h
> index fcf822d6c204..77e590c77299 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -144,13 +144,11 @@ static inline unsigned long radix__pte_update(struc=
t
> mm_struct *mm,
> =C2=A0 =C2=A0* new value of pte
> =C2=A0 =C2=A0*/
> =C2=A0 new_pte =3D (old_pte | set) & ~clr;
> - asm volatile("ptesync" : : : "memory");
> =C2=A0 radix__flush_tlb_pte_p9_dd1(old_pte, mm, addr);
> =C2=A0 if (new_pte)
> =C2=A0 __radix_pte_update(ptep, 0, new_pte);
> =C2=A0 } else
> =C2=A0 old_pte =3D __radix_pte_update(ptep, clr, set);
> - asm volatile("ptesync" : : : "memory");
> =C2=A0 if (!huge)
> =C2=A0 assert_pte_locked(mm, addr);
> =C2=A0
> @@ -195,7 +193,6 @@ static inline void radix__ptep_set_access_flags(struc=
t
> mm_struct *mm,
> =C2=A0 unsigned long old_pte, new_pte;
> =C2=A0
> =C2=A0 old_pte =3D __radix_pte_update(ptep, ~0, 0);
> - asm volatile("ptesync" : : : "memory");
> =C2=A0 /*
> =C2=A0 =C2=A0* new value of pte
> =C2=A0 =C2=A0*/
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [1/3] powerpc/mm/radix: Update pte update sequence for pte clear case
2017-02-09 2:58 [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Aneesh Kumar K.V
` (3 preceding siblings ...)
2017-02-14 4:17 ` Michael Neuling
@ 2017-02-16 5:59 ` Michael Ellerman
4 siblings, 0 replies; 10+ messages in thread
From: Michael Ellerman @ 2017-02-16 5:59 UTC (permalink / raw)
To: Aneesh Kumar K.V, benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V
On Thu, 2017-02-09 at 02:58:19 UTC, "Aneesh Kumar K.V" wrote:
> In the kernel we do follow the below sequence in different code paths.
> pte = ptep_get_clear(ptep)
> ....
> set_pte_at(ptep, pte)
>
> We do that for mremap, autonuma protection update and softdirty clearing. This
> implies our optimization to skip a tlb flush when clearing a pte update is
> not valid, because for DD1 system that followup set_pte_at will be done witout
> doing the required tlbflush. Fix that by always doing the dd1 style pte update
> irrespective of new_pte value. In a later patch we will optimize the application
> exit case.
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Tested-by: Michael Neuling <mikey@neuling.org>
Series applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/ca94573b9c69d224e50e1084a27767
cheers
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2017-02-16 5:59 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-09 2:58 [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Aneesh Kumar K.V
2017-02-09 2:58 ` [PATCH 2/3] powerpc/mm/radix: Use ptep_get_and_clear_full when clearing pte for full mm Aneesh Kumar K.V
2017-02-14 4:17 ` Michael Neuling
2017-02-09 2:58 ` [PATCH 3/3] powerpc/mm/radix: Skip ptesync in pte update helpers Aneesh Kumar K.V
2017-02-14 4:17 ` Michael Neuling
2017-02-09 3:49 ` [PATCH 1/3] powerpc/mm/radix: Update pte update sequence for pte clear case Benjamin Herrenschmidt
2017-02-09 4:15 ` Benjamin Herrenschmidt
2017-02-09 4:34 ` Aneesh Kumar K.V
2017-02-14 4:17 ` Michael Neuling
2017-02-16 5:59 ` [1/3] " Michael Ellerman
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.