* [PATCH] powerpc/mm: Fix set_memory_*() against concurrent accesses
From: Michael Ellerman @ 2021-08-17 13:25 UTC
  To: linuxppc-dev; +Cc: lvivier, farosas, jniethe5, npiggin, aneesh.kumar

Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
on one of his systems:

  kernel tried to execute exec-protected page (c008000004073278) - exploit attempt? (uid: 0)
  BUG: Unable to handle kernel instruction fetch
  Faulting instruction address: 0xc008000004073278
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
  CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
  Workqueue: events control_work_handler [virtio_console]
  NIP:  c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
  REGS: c00000002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
  MSR:  800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002822 XER: 200400cf
  ...
  NIP fill_queue+0xf0/0x210 [virtio_console]
  LR  fill_queue+0xf0/0x210 [virtio_console]
  Call Trace:
    fill_queue+0xb4/0x210 [virtio_console] (unreliable)
    add_port+0x1a8/0x470 [virtio_console]
    control_work_handler+0xbc/0x1e8 [virtio_console]
    process_one_work+0x290/0x590
    worker_thread+0x88/0x620
    kthread+0x194/0x1a0
    ret_from_kernel_thread+0x5c/0x64

Jordan, Fabiano & Murilo were able to reproduce and identify that the
problem is caused by the call to module_enable_ro() in do_init_module(),
which happens after the module's init function has already been called.

Our current implementation of change_page_attr() is not safe against
concurrent accesses, because it invalidates the PTE before flushing the
TLB and then installing the new PTE. That leaves a window in time where
there is no valid PTE for the page; if another CPU tries to access the
page during that window, we see something like the fault above.
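
For illustration, the old sequence is roughly (a simplified sketch of
the code being removed below):

  /* 1. PTE cleared: no valid mapping from here on */
  pte = ptep_get_and_clear(&init_mm, addr, ptep);
  /* 2. stale translations dropped from the TLB */
  flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
  /* <-- window: an access from another CPU faults here --> */
  /* 3. new PTE finally becomes visible */
  set_pte_at(&init_mm, addr, ptep, pte);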

We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
code doesn't handle a set_pte_at() of a valid PTE. See [1].

But we do have pte_update(), which replaces the old PTE with the new,
meaning there's no window where the PTE is invalid. And the hash MMU
version hash__pte_update() deals with synchronising the hash page table
correctly.
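
Its semantics are roughly (a simplified sketch; the real implementations
do the update with an atomic loop, and hash__pte_update() additionally
handles the hash page table):

  /* pte_update(mm, addr, ptep, clr, set, huge) does, atomically: */
  old = pte_val(*ptep);
  new = (old & ~clr) | set;
  /* ...install new and return old, so the PTE is never invalid */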

Because pte_update() takes the set of PTE bits to set and clear, we can't
use our existing helpers, e.g. pte_wrprotect(), and instead have to
open-code the set of flags. We will clean that up somehow in a future
commit.

[1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/

Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/mm/pageattr.c | 45 +++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
index 0876216ceee6..72425b61eb7e 100644
--- a/arch/powerpc/mm/pageattr.c
+++ b/arch/powerpc/mm/pageattr.c
@@ -18,52 +18,61 @@
 /*
  * Updates the attributes of a page in three steps:
  *
- * 1. invalidate the page table entry
- * 2. flush the TLB
- * 3. install the new entry with the updated attributes
- *
- * Invalidating the pte means there are situations where this will not work
- * when in theory it should.
- * For example:
- * - removing write from page whilst it is being executed
- * - setting a page read-only whilst it is being read by another CPU
+ * 1. take the page_table_lock
+ * 2. install the new entry with the updated attributes
+ * 3. flush the TLB
  *
+ * This sequence is safe against concurrent updates, and also allows updating the
+ * attributes of a page currently being executed or accessed.
  */
 static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
 {
 	long action = (long)data;
-	pte_t pte;
+	unsigned long set, clear;
 
 	spin_lock(&init_mm.page_table_lock);
 
-	/* invalidate the PTE so it's safe to modify */
-	pte = ptep_get_and_clear(&init_mm, addr, ptep);
-	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+	set = clear = 0;
 
 	/* modify the PTE bits as desired, then apply */
 	switch (action) {
 	case SET_MEMORY_RO:
-		pte = pte_wrprotect(pte);
+#ifdef CONFIG_PPC_BOOK3S_64
+		clear = _PAGE_WRITE;
+#elif defined(CONFIG_PPC_8xx)
+		set = _PAGE_RO;
+#else
+		clear = _PAGE_RW;
+#endif
 		break;
 	case SET_MEMORY_RW:
-		pte = pte_mkwrite(pte_mkdirty(pte));
+#ifdef CONFIG_PPC_8xx
+		clear = _PAGE_RO;
+#elif defined(CONFIG_PPC_BOOK3S_64)
+		set = _PAGE_RW | _PAGE_DIRTY | _PAGE_SOFT_DIRTY;
+#else
+		set = _PAGE_RW | _PAGE_DIRTY;
+#endif
 		break;
 	case SET_MEMORY_NX:
-		pte = pte_exprotect(pte);
+		clear = _PAGE_EXEC;
 		break;
 	case SET_MEMORY_X:
-		pte = pte_mkexec(pte);
+		set = _PAGE_EXEC;
 		break;
 	default:
 		WARN_ON_ONCE(1);
 		break;
 	}
 
-	set_pte_at(&init_mm, addr, ptep, pte);
+	pte_update(&init_mm, addr, ptep, clear, set, 0);
 
 	/* See ptesync comment in radix__set_pte_at() */
 	if (radix_enabled())
 		asm volatile("ptesync": : :"memory");
+
+	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+
 	spin_unlock(&init_mm.page_table_lock);
 
 	return 0;

base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30
-- 
2.25.1



* Re: [PATCH] powerpc/mm: Fix set_memory_*() against concurrent accesses
From: Christophe Leroy @ 2021-08-17 14:20 UTC
  To: Michael Ellerman, linuxppc-dev
  Cc: lvivier, jniethe5, aneesh.kumar, npiggin, farosas



On 17/08/2021 at 15:25, Michael Ellerman wrote:
> Laurent reported that STRICT_MODULE_RWX was causing intermittent crashes
> on one of his systems:
> 
>    kernel tried to execute exec-protected page (c008000004073278) - exploit attempt? (uid: 0)
>    BUG: Unable to handle kernel instruction fetch
>    Faulting instruction address: 0xc008000004073278
>    Oops: Kernel access of bad area, sig: 11 [#1]
>    LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>    Modules linked in: drm virtio_console fuse drm_panel_orientation_quirks ...
>    CPU: 3 PID: 44 Comm: kworker/3:1 Not tainted 5.14.0-rc4+ #12
>    Workqueue: events control_work_handler [virtio_console]
>    NIP:  c008000004073278 LR: c008000004073278 CTR: c0000000001e9de0
>    REGS: c00000002e4ef7e0 TRAP: 0400   Not tainted  (5.14.0-rc4+)
>    MSR:  800000004280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002822 XER: 200400cf
>    ...
>    NIP fill_queue+0xf0/0x210 [virtio_console]
>    LR  fill_queue+0xf0/0x210 [virtio_console]
>    Call Trace:
>      fill_queue+0xb4/0x210 [virtio_console] (unreliable)
>      add_port+0x1a8/0x470 [virtio_console]
>      control_work_handler+0xbc/0x1e8 [virtio_console]
>      process_one_work+0x290/0x590
>      worker_thread+0x88/0x620
>      kthread+0x194/0x1a0
>      ret_from_kernel_thread+0x5c/0x64
> 
> Jordan, Fabiano & Murilo were able to reproduce and identify that the
> problem is caused by the call to module_enable_ro() in do_init_module(),
> which happens after the module's init function has already been called.
> 
> Our current implementation of change_page_attr() is not safe against
> concurrent accesses, because it invalidates the PTE before flushing the
> TLB and then installing the new PTE. That leaves a window in time where
> there is no valid PTE for the page; if another CPU tries to access the
> page during that window, we see something like the fault above.
> 
> We can't simply switch to set_pte_at()/flush TLB, because our hash MMU
> code doesn't handle a set_pte_at() of a valid PTE. See [1].
> 
> But we do have pte_update(), which replaces the old PTE with the new,
> meaning there's no window where the PTE is invalid. And the hash MMU
> version hash__pte_update() deals with synchronising the hash page table
> correctly.
> 
> Because pte_update() takes the set of PTE bits to set and clear, we can't
> use our existing helpers, e.g. pte_wrprotect(), and instead have to
> open-code the set of flags. We will clean that up somehow in a future
> commit.
> 
> [1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/
> 
> Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
> Reported-by: Laurent Vivier <lvivier@redhat.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>   arch/powerpc/mm/pageattr.c | 45 +++++++++++++++++++++++---------------
>   1 file changed, 27 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/powerpc/mm/pageattr.c b/arch/powerpc/mm/pageattr.c
> index 0876216ceee6..72425b61eb7e 100644
> --- a/arch/powerpc/mm/pageattr.c
> +++ b/arch/powerpc/mm/pageattr.c
> @@ -18,52 +18,61 @@
>   /*
>    * Updates the attributes of a page in three steps:
>    *
> - * 1. invalidate the page table entry
> - * 2. flush the TLB
> - * 3. install the new entry with the updated attributes
> - *
> - * Invalidating the pte means there are situations where this will not work
> - * when in theory it should.
> - * For example:
> - * - removing write from page whilst it is being executed
> - * - setting a page read-only whilst it is being read by another CPU
> + * 1. take the page_table_lock
> + * 2. install the new entry with the updated attributes
> + * 3. flush the TLB
>    *
> + * This sequence is safe against concurrent updates, and also allows updating the
> + * attributes of a page currently being executed or accessed.
>    */
>   static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>   {
>   	long action = (long)data;
> -	pte_t pte;
> +	unsigned long set, clear;
>   
>   	spin_lock(&init_mm.page_table_lock);
>   
> -	/* invalidate the PTE so it's safe to modify */
> -	pte = ptep_get_and_clear(&init_mm, addr, ptep);
> -	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	set = clear = 0;
>   
>   	/* modify the PTE bits as desired, then apply */
>   	switch (action) {
>   	case SET_MEMORY_RO:
> -		pte = pte_wrprotect(pte);
> +#ifdef CONFIG_PPC_BOOK3S_64
> +		clear = _PAGE_WRITE;
> +#elif defined(CONFIG_PPC_8xx)
> +		set = _PAGE_RO;
> +#else
> +		clear = _PAGE_RW;
> +#endif

I think it can be handled as follows (untested):

new = pte_wrprotect(pte);

set = pte_val(new) & ~pte_val(pte);
clear = ~pte_val(new) & pte_val(pte);

So just put those two lines before the pte_update() and only change the switch cases to create a 
'new' pte instead of modifying the existing one.
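
The whole function with that approach would look something like this
(again untested, just to illustrate the idea):

static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
{
	long action = (long)data;
	pte_t pte, new;
	unsigned long set, clear;

	spin_lock(&init_mm.page_table_lock);

	pte = new = *ptep;

	switch (action) {
	case SET_MEMORY_RO:
		new = pte_wrprotect(pte);
		break;
	case SET_MEMORY_RW:
		new = pte_mkwrite(pte_mkdirty(pte));
		break;
	case SET_MEMORY_NX:
		new = pte_exprotect(pte);
		break;
	case SET_MEMORY_X:
		new = pte_mkexec(pte);
		break;
	default:
		WARN_ON_ONCE(1);
		break;
	}

	/* derive the masks from old vs new instead of open coding them */
	set = pte_val(new) & ~pte_val(pte);
	clear = ~pte_val(new) & pte_val(pte);

	pte_update(&init_mm, addr, ptep, clear, set, 0);

	/* See ptesync comment in radix__set_pte_at() */
	if (radix_enabled())
		asm volatile("ptesync": : :"memory");

	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

	spin_unlock(&init_mm.page_table_lock);

	return 0;
}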


Or you can do it the way we do in ptep_set_wrprotect() in <asm/nohash/32/pgtable.h>

Or can __ptep_set_access_flags() be used?

>   		break;
>   	case SET_MEMORY_RW:
> -		pte = pte_mkwrite(pte_mkdirty(pte));
> +#ifdef CONFIG_PPC_8xx
> +		clear = _PAGE_RO;
> +#elif defined(CONFIG_PPC_BOOK3S_64)
> +		set = _PAGE_RW | _PAGE_DIRTY | _PAGE_SOFT_DIRTY;
> +#else
> +		set = _PAGE_RW | _PAGE_DIRTY;
> +#endif
>   		break;
>   	case SET_MEMORY_NX:
> -		pte = pte_exprotect(pte);
> +		clear = _PAGE_EXEC;
>   		break;
>   	case SET_MEMORY_X:
> -		pte = pte_mkexec(pte);
> +		set = _PAGE_EXEC;
>   		break;
>   	default:
>   		WARN_ON_ONCE(1);
>   		break;
>   	}
>   
> -	set_pte_at(&init_mm, addr, ptep, pte);
> +	pte_update(&init_mm, addr, ptep, clear, set, 0);
>   
>   	/* See ptesync comment in radix__set_pte_at() */
>   	if (radix_enabled())
>   		asm volatile("ptesync": : :"memory");
> +
> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

Can we use a per-page version like flush_tlb_page() in order to avoid a 'tlbia'? (Maybe as another 
patch, since it was already like that.)

> +
>   	spin_unlock(&init_mm.page_table_lock);
>   
>   	return 0;
> 
> base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30
> 


* Re: [PATCH] powerpc/mm: Fix set_memory_*() against concurrent accesses
From: Fabiano Rosas @ 2021-08-17 14:21 UTC
  To: Michael Ellerman, linuxppc-dev; +Cc: lvivier, jniethe5, npiggin, aneesh.kumar

Michael Ellerman <mpe@ellerman.id.au> writes:

Hi, I already mentioned these things in private, but I'll post here so
everyone can see:

> Because pte_update() takes the set of PTE bits to set and clear, we can't
> use our existing helpers, e.g. pte_wrprotect(), and instead have to
> open-code the set of flags. We will clean that up somehow in a future
> commit.

I tested the following on P9 and it seems to work fine. Not sure if it
works for CONFIG_PPC_8xx, though.


 static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
 {
 	long action = (long)data;
 	pte_t pte;
 
 	spin_lock(&init_mm.page_table_lock);
-
-	/* invalidate the PTE so it's safe to modify */
-	pte = ptep_get_and_clear(&init_mm, addr, ptep);
-	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+	pte = *ptep;
 
 	/* modify the PTE bits as desired, then apply */
 	switch (action) {
@@ -59,11 +42,9 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
 		break;
 	}
 
-	set_pte_at(&init_mm, addr, ptep, pte);
+	pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);
+	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 
-	/* See ptesync comment in radix__set_pte_at() */
-	if (radix_enabled())
-		asm volatile("ptesync": : :"memory");
 	spin_unlock(&init_mm.page_table_lock);
 
 	return 0;
---

For reference, the full patch is here:
https://github.com/farosas/linux/commit/923c95c84d7081d7be9503bf5b276dd93bd17036.patch

>
> [1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/
>
> Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
> Reported-by: Laurent Vivier <lvivier@redhat.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---

...

> -	set_pte_at(&init_mm, addr, ptep, pte);
> +	pte_update(&init_mm, addr, ptep, clear, set, 0);
>  
>  	/* See ptesync comment in radix__set_pte_at() */
>  	if (radix_enabled())
>  		asm volatile("ptesync": : :"memory");
> +
> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

I think there's an optimization possible here, when relaxing access, to
skip the TLB flush. Would still need the ptesync though. Similar to what
Nick did in e5f7cb58c2b7 ("powerpc/64s/radix: do not flush TLB when
relaxing access"). It is out of scope for this patch but maybe worth
thinking about.
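
Roughly, assuming a hypothetical relaxes_access() helper that returns
true for SET_MEMORY_RW and SET_MEMORY_X (an untested sketch):

	pte_update(&init_mm, addr, ptep, clear, set, 0);

	/* See ptesync comment in radix__set_pte_at() */
	if (radix_enabled())
		asm volatile("ptesync": : :"memory");

	/* as in e5f7cb58c2b7, when only relaxing access on Radix the
	 * TLB flush can be skipped; the ptesync above is enough */
	if (!(radix_enabled() && relaxes_access(action)))
		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);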

> +
>  	spin_unlock(&init_mm.page_table_lock);
>  
>  	return 0;
>
> base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30


* Re: [PATCH] powerpc/mm: Fix set_memory_*() against concurrent accesses
From: Christophe Leroy @ 2021-08-17 14:28 UTC
  To: Fabiano Rosas, Michael Ellerman, linuxppc-dev
  Cc: lvivier, jniethe5, npiggin, aneesh.kumar



On 17/08/2021 at 16:21, Fabiano Rosas wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> 
> Hi, I already mentioned these things in private, but I'll post here so
> everyone can see:
> 
>> Because pte_update() takes the set of PTE bits to set and clear, we can't
>> use our existing helpers, e.g. pte_wrprotect(), and instead have to
>> open-code the set of flags. We will clean that up somehow in a future
>> commit.
> 
> I tested the following on P9 and it seems to work fine. Not sure if it
> works for CONFIG_PPC_8xx, though.
> 
> 
>   static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>   {
>   	long action = (long)data;
>   	pte_t pte;
>   
>   	spin_lock(&init_mm.page_table_lock);
> -
> -	/* invalidate the PTE so it's safe to modify */
> -	pte = ptep_get_and_clear(&init_mm, addr, ptep);
> -	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	pte = *ptep;

Maybe using ptep_get() is better.

>   
>   	/* modify the PTE bits as desired, then apply */
>   	switch (action) {
> @@ -59,11 +42,9 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>   		break;
>   	}
>   
> -	set_pte_at(&init_mm, addr, ptep, pte);
> +	pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);

Good simple idea, indeed it should work without much more effort.


> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>   
> -	/* See ptesync comment in radix__set_pte_at() */
> -	if (radix_enabled())
> -		asm volatile("ptesync": : :"memory");
>   	spin_unlock(&init_mm.page_table_lock);
>   
>   	return 0;
> ---
> 
> For reference, the full patch is here:
> https://github.com/farosas/linux/commit/923c95c84d7081d7be9503bf5b276dd93bd17036.patch
> 
>>
>> [1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/
>>
>> Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
>> Reported-by: Laurent Vivier <lvivier@redhat.com>
>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>> ---
> 
> ...
> 
>> -	set_pte_at(&init_mm, addr, ptep, pte);
>> +	pte_update(&init_mm, addr, ptep, clear, set, 0);
>>   
>>   	/* See ptesync comment in radix__set_pte_at() */
>>   	if (radix_enabled())
>>   		asm volatile("ptesync": : :"memory");
>> +
>> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> 
> I think there's an optimization possible here, when relaxing access, to
> skip the TLB flush. Would still need the ptesync though. Similar to what
> Nick did in e5f7cb58c2b7 ("powerpc/64s/radix: do not flush TLB when
> relaxing access"). It is out of scope for this patch but maybe worth
> thinking about.
> 
>> +
>>   	spin_unlock(&init_mm.page_table_lock);
>>   
>>   	return 0;
>>
>> base-commit: cbc06f051c524dcfe52ef0d1f30647828e226d30


* Re: [PATCH] powerpc/mm: Fix set_memory_*() against concurrent accesses
From: Michael Ellerman @ 2021-08-18  7:46 UTC
  To: Fabiano Rosas, linuxppc-dev; +Cc: lvivier, jniethe5, npiggin, aneesh.kumar

Fabiano Rosas <farosas@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>
> Hi, I already mentioned these things in private, but I'll post here so
> everyone can see:
>
>> Because pte_update() takes the set of PTE bits to set and clear, we can't
>> use our existing helpers, e.g. pte_wrprotect(), and instead have to
>> open-code the set of flags. We will clean that up somehow in a future
>> commit.
>
> I tested the following on P9 and it seems to work fine. Not sure if it
> works for CONFIG_PPC_8xx, though.
>
>
>  static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>  {
>  	long action = (long)data;
>  	pte_t pte;
>  
>  	spin_lock(&init_mm.page_table_lock);
> -
> -	/* invalidate the PTE so it's safe to modify */
> -	pte = ptep_get_and_clear(&init_mm, addr, ptep);
> -	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
> +	pte = *ptep;
>  
>  	/* modify the PTE bits as desired, then apply */
>  	switch (action) {
> @@ -59,11 +42,9 @@ static int change_page_attr(pte_t *ptep, unsigned long addr, void *data)
>  		break;
>  	}
>  
> -	set_pte_at(&init_mm, addr, ptep, pte);
> +	pte_update(&init_mm, addr, ptep, ~0UL, pte_val(pte), 0);

I avoided that because it's not atomic, but pte_update() is not atomic
on some platforms anyway. And for now at least we still have the
page_table_lock as well.

So you're right that's a nicer way to do it.

And I'll use ptep_get() as Christophe suggested.

> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>  
> -	/* See ptesync comment in radix__set_pte_at() */
> -	if (radix_enabled())
> -		asm volatile("ptesync": : :"memory");

I didn't do that because I wanted to keep the patch minimal. We can do
that as a separate patch.

>  	spin_unlock(&init_mm.page_table_lock);
>  
>  	return 0;
> ---
>
> For reference, the full patch is here:
> https://github.com/farosas/linux/commit/923c95c84d7081d7be9503bf5b276dd93bd17036.patch
>
>>
>> [1]: https://lore.kernel.org/linuxppc-dev/87y318wp9r.fsf@linux.ibm.com/
>>
>> Fixes: 1f9ad21c3b38 ("powerpc/mm: Implement set_memory() routines")
>> Reported-by: Laurent Vivier <lvivier@redhat.com>
>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>> ---
>
> ...
>
>> -	set_pte_at(&init_mm, addr, ptep, pte);
>> +	pte_update(&init_mm, addr, ptep, clear, set, 0);
>>  
>>  	/* See ptesync comment in radix__set_pte_at() */
>>  	if (radix_enabled())
>>  		asm volatile("ptesync": : :"memory");
>> +
>> +	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
>
> I think there's an optimization possible here, when relaxing access, to
> skip the TLB flush. Would still need the ptesync though. Similar to what
> Nick did in e5f7cb58c2b7 ("powerpc/64s/radix: do not flush TLB when
> relaxing access").

That commit is specific to Radix, whereas this code needs to work on all
platforms.

We'd need to verify that it's safe to skip the flush on all platforms
and CPU versions.

What I think we can do, and would possibly be a more meaningful
optimisation, is to move the TLB flush out of the loop and up into
change_memory_attr(). So we just do it once for the range, rather than
per page. But that too would be a separate patch.
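
Roughly (an untested sketch, with the flush_tlb_kernel_range() call
dropped from change_page_attr() itself):

int change_memory_attr(unsigned long addr, int numpages, long action)
{
	unsigned long start = ALIGN_DOWN(addr, PAGE_SIZE);
	unsigned long size = numpages << PAGE_SHIFT;
	int ret;

	if (!numpages)
		return 0;

	ret = apply_to_existing_page_range(&init_mm, start, size,
					   change_page_attr, (void *)action);

	/* one flush for the whole range rather than one per page */
	flush_tlb_kernel_range(start, start + size);

	return ret;
}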

cheers

