* [PATCH] powerpc: Save AMR/IAMR when switching tasks
From: Samuel Holland @ 2022-09-16  5:05 UTC
  To: Christophe Leroy, Michael Ellerman, Nicholas Piggin
  Cc: Samuel Holland, Eric W. Biederman, Kees Cook, Russell Currey,
	linux-kernel, linuxppc-dev

With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
to switch away from a task inside copy_{from,to}_user. This left the CPU
with userspace access enabled until after the next IRQ or privilege
level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
switching back to the original task, the userspace access would fault:

  Kernel attempted to write user page (3fff7ab68190) - exploit attempt? (uid: 65536)
  ------------[ cut here ]------------
  Bug: Write fault blocked by KUAP!
  WARNING: CPU: 56 PID: 4939 at arch/powerpc/mm/fault.c:228 ___do_page_fault+0x7b4/0xaa0
  CPU: 56 PID: 4939 Comm: git Tainted: G        W         5.19.8-00005-gba424747260d #1
  NIP:  c0000000000555e4 LR: c0000000000555e0 CTR: c00000000079d9d0
  REGS: c00000008f507370 TRAP: 0700   Tainted: G        W          (5.19.8-00005-gba424747260d)
  MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 28042222  XER: 20040000
  CFAR: c000000000123780 IRQMASK: 3
  NIP [c0000000000555e4] ___do_page_fault+0x7b4/0xaa0
  LR [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0
  Call Trace:
  [c00000008f507610] [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0 (unreliable)
  [c00000008f5076c0] [c000000000055938] do_page_fault+0x68/0x130
  [c00000008f5076f0] [c000000000008914] data_access_common_virt+0x194/0x1f0
  --- interrupt: 300 at __copy_tofrom_user_base+0x9c/0x5a4
  NIP:  c00000000007b1a8 LR: c00000000073f4d4 CTR: 0000000000000080
  REGS: c00000008f507760 TRAP: 0300   Tainted: G        W          (5.19.8-00005-gba424747260d)
  MSR:  900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 24002220  XER: 20040000
  CFAR: c00000000007b174 DAR: 00003fff7ab68190 DSISR: 0a000000 IRQMASK: 0
  NIP [c00000000007b1a8] __copy_tofrom_user_base+0x9c/0x5a4
  LR [c00000000073f4d4] copyout+0x74/0x150
  --- interrupt: 300
  [c00000008f507a30] [c0000000007430cc] copy_page_to_iter+0x12c/0x4b0
  [c00000008f507ab0] [c0000000002c7c20] filemap_read+0x200/0x460
  [c00000008f507bf0] [c0000000005f96f4] xfs_file_buffered_read+0x104/0x170
  [c00000008f507c30] [c0000000005f9800] xfs_file_read_iter+0xa0/0x150
  [c00000008f507c70] [c0000000003bddc8] new_sync_read+0x108/0x180
  [c00000008f507d10] [c0000000003c06b0] vfs_read+0x1d0/0x240
  [c00000008f507d60] [c0000000003c0ba4] ksys_read+0x84/0x140
  [c00000008f507db0] [c00000000002a3fc] system_call_exception+0x15c/0x300
  [c00000008f507e10] [c00000000000c63c] system_call_common+0xec/0x250
  --- interrupt: c00 at 0x3fff83aa7238
  NIP:  00003fff83aa7238 LR: 00003fff83a923b8 CTR: 0000000000000000
  REGS: c00000008f507e80 TRAP: 0c00   Tainted: G        W          (5.19.8-00005-gba424747260d)
  MSR:  900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 80002482  XER: 00000000
  IRQMASK: 0
  NIP [00003fff83aa7238] 0x3fff83aa7238
  LR [00003fff83a923b8] 0x3fff83a923b8
  --- interrupt: c00
  Instruction dump:
  e87f0100 48101021 60000000 2c230000 4182fee8 408e0128 3c82ff80 3884e978
  3c62ff80 3863ea78 480ce13d 60000000 <0fe00000> fb010070 fb810090 e80100c0
  ---[ end trace 0000000000000000 ]---

Fix this by saving and restoring the kernel-side AMR/IAMR values when
switching tasks.

Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
Signed-off-by: Samuel Holland <samuel@sholland.org>
---
I have no idea if this is the right change to make, and it could be
optimized, but my system has been stable with this patch for 5 days now.

Without the patch, I hit the bug every few minutes when my load average
is <1, and I hit it immediately if I try to do a parallel kernel build.

Because of the instability (file I/O randomly raises SIGBUS), I don't
think anyone would run a system in this configuration, so I don't think
this bug is exploitable.

 arch/powerpc/kernel/process.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 0fbda89cd1bb..69b189d63124 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1150,6 +1150,12 @@ static inline void save_sprs(struct thread_struct *t)
 		 */
 		t->tar = mfspr(SPRN_TAR);
 	}
+	if (t->regs) {
+		if (mmu_has_feature(MMU_FTR_BOOK3S_KUAP))
+			t->regs->amr = mfspr(SPRN_AMR);
+		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP))
+			t->regs->iamr = mfspr(SPRN_IAMR);
+	}
 #endif
 }
 
@@ -1228,6 +1234,13 @@ static inline void restore_sprs(struct thread_struct *old_thread,
 	if (cpu_has_feature(CPU_FTR_P9_TIDR) &&
 	    old_thread->tidr != new_thread->tidr)
 		mtspr(SPRN_TIDR, new_thread->tidr);
+	if (new_thread->regs) {
+		if (mmu_has_feature(MMU_FTR_BOOK3S_KUAP))
+			mtspr(SPRN_AMR, new_thread->regs->amr);
+		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP))
+			mtspr(SPRN_IAMR, new_thread->regs->iamr);
+		isync();
+	}
 #endif
 
 }
-- 
2.35.1



* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
From: Christophe Leroy @ 2022-09-17  8:16 UTC
  To: Samuel Holland, Michael Ellerman, Nicholas Piggin
  Cc: Eric W. Biederman, Kees Cook, Russell Currey, linux-kernel, linuxppc-dev



On 16/09/2022 at 07:05, Samuel Holland wrote:
> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
> to switch away from a task inside copy_{from,to}_user. This left the CPU
> with userspace access enabled until after the next IRQ or privilege
> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
> switching back to the original task, the userspace access would fault:

This is not supposed to happen. You never switch away from a task
magically. A task switch will always happen in an interrupt, which means
copy_{from,to}_user() gets interrupted.

Whenever an interrupt is taken, the kuap_save_amr_and_lock() macro is used
to save the KUAP status onto the stack and then lock KUAP access. At
interrupt exit, the kuap_kernel_restore() macro or function restores KUAP
access from the stack. At the time the task switch happens, KUAP access
is expected to be locked. During the task switch, the stack is switched,
so the KUAP status is taken back from the new task's stack.
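
In rough C terms, the expected lifecycle is something like the sketch
below. This is a simplification for illustration only: in reality
kuap_save_amr_and_lock is an asm macro, and the helpers under
arch/powerpc take different arguments on each platform.

/* Sketch of the KUAP protocol around an interrupt (simplified). */
static inline void kuap_save_amr_and_lock(struct pt_regs *regs)
{
	regs->amr = mfspr(SPRN_AMR);		/* save KUAP status to the stack frame */
	mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);	/* lock user access */
	isync();
}

static inline void kuap_kernel_restore(struct pt_regs *regs)
{
	mtspr(SPRN_AMR, regs->amr);		/* restore the saved KUAP status */
	isync();
}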

Your fix suggests that there is some path where the KUAP status is not
properly saved and/or restored. Did you try running with
CONFIG_PPC_KUAP_DEBUG? It should warn whenever KUAP access is left
unlocked.

> 
>    Kernel attempted to write user page (3fff7ab68190) - exploit attempt? (uid: 65536)
>    ------------[ cut here ]------------
>    Bug: Write fault blocked by KUAP!
>    WARNING: CPU: 56 PID: 4939 at arch/powerpc/mm/fault.c:228 ___do_page_fault+0x7b4/0xaa0
>    CPU: 56 PID: 4939 Comm: git Tainted: G        W         5.19.8-00005-gba424747260d #1
>    NIP:  c0000000000555e4 LR: c0000000000555e0 CTR: c00000000079d9d0
>    REGS: c00000008f507370 TRAP: 0700   Tainted: G        W          (5.19.8-00005-gba424747260d)
>    MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 28042222  XER: 20040000
>    CFAR: c000000000123780 IRQMASK: 3
>    NIP [c0000000000555e4] ___do_page_fault+0x7b4/0xaa0
>    LR [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0
>    Call Trace:
>    [c00000008f507610] [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0 (unreliable)
>    [c00000008f5076c0] [c000000000055938] do_page_fault+0x68/0x130
>    [c00000008f5076f0] [c000000000008914] data_access_common_virt+0x194/0x1f0
>    --- interrupt: 300 at __copy_tofrom_user_base+0x9c/0x5a4

...

> 
> Fix this by saving and restoring the kernel-side AMR/IAMR values when
> switching tasks.

As explained above, KUAP access should be locked at that time, so saving 
and restoring it should not have any effect. If it does, it means 
something goes wrong somewhere else.

> 
> Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
> Signed-off-by: Samuel Holland <samuel@sholland.org>
> ---
> I have no idea if this is the right change to make, and it could be
> optimized, but my system has been stable with this patch for 5 days now.
> 
> Without the patch, I hit the bug every few minutes when my load average
> is <1, and I hit it immediately if I try to do a parallel kernel build.

Great, then can you give it a try with CONFIG_PPC_KUAP_DEBUG?

> 
> Because of the instability (file I/O randomly raises SIGBUS), I don't
> think anyone would run a system in this configuration, so I don't think
> this bug is exploitable.
> 
>   arch/powerpc/kernel/process.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 0fbda89cd1bb..69b189d63124 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1150,6 +1150,12 @@ static inline void save_sprs(struct thread_struct *t)
>   		 */
>   		t->tar = mfspr(SPRN_TAR);
>   	}
> +	if (t->regs) {
> +		if (mmu_has_feature(MMU_FTR_BOOK3S_KUAP))
> +			t->regs->amr = mfspr(SPRN_AMR);
> +		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP))
> +			t->regs->iamr = mfspr(SPRN_IAMR);
> +	}
>   #endif
>   }
>   
> @@ -1228,6 +1234,13 @@ static inline void restore_sprs(struct thread_struct *old_thread,
>   	if (cpu_has_feature(CPU_FTR_P9_TIDR) &&
>   	    old_thread->tidr != new_thread->tidr)
>   		mtspr(SPRN_TIDR, new_thread->tidr);
> +	if (new_thread->regs) {
> +		if (mmu_has_feature(MMU_FTR_BOOK3S_KUAP))
> +			mtspr(SPRN_AMR, new_thread->regs->amr);
> +		if (mmu_has_feature(MMU_FTR_BOOK3S_KUEP))
> +			mtspr(SPRN_IAMR, new_thread->regs->iamr);
> +		isync();
> +	}
>   #endif
>   
>   }


* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
From: Samuel Holland @ 2022-09-17 18:38 UTC
  To: Christophe Leroy, Michael Ellerman, Nicholas Piggin
  Cc: Eric W. Biederman, Kees Cook, Russell Currey, linux-kernel, linuxppc-dev

On 9/17/22 03:16, Christophe Leroy wrote:
> On 16/09/2022 at 07:05, Samuel Holland wrote:
>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>> with userspace access enabled until after the next IRQ or privilege
>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>> switching back to the original task, the userspace access would fault:
> 
> This is not supposed to happen. You never switch away from a task
> magically. A task switch will always happen in an interrupt, which means
> copy_{from,to}_user() gets interrupted.

That makes sense; the interrupt handler is responsible for saving the
KUAP status. It looks like neither DEFINE_INTERRUPT_HANDLER_RAW nor any
of its users (performance_monitor_exception(), do_slb_fault()) does that.
Yet they still call one of the interrupt_return variants, which restores
the status from the stack.

> Whenever an interrupt is taken, the kuap_save_amr_and_lock() macro is used
> to save the KUAP status onto the stack and then lock KUAP access. At
> interrupt exit, the kuap_kernel_restore() macro or function restores KUAP
> access from the stack. At the time the task switch happens, KUAP access
> is expected to be locked. During the task switch, the stack is switched,
> so the KUAP status is taken back from the new task's stack.

What if another task calls schedule() from kernel process context, and
the scheduler switches to a task that had been preempted inside
copy_{from,to}_user()? Then there is no interrupt involved, and I don't
see where kuap_kernel_restore() would get called.

> Your fix suggests that there is some path where the KUAP status is not
> properly saved and/or restored. Did you try running with
> CONFIG_PPC_KUAP_DEBUG? It should warn whenever KUAP access is left
> unlocked.
> 
>>
>>    Kernel attempted to write user page (3fff7ab68190) - exploit attempt? (uid: 65536)
>>    ------------[ cut here ]------------
>>    Bug: Write fault blocked by KUAP!
>>    WARNING: CPU: 56 PID: 4939 at arch/powerpc/mm/fault.c:228 ___do_page_fault+0x7b4/0xaa0
>>    CPU: 56 PID: 4939 Comm: git Tainted: G        W         5.19.8-00005-gba424747260d #1
>>    NIP:  c0000000000555e4 LR: c0000000000555e0 CTR: c00000000079d9d0
>>    REGS: c00000008f507370 TRAP: 0700   Tainted: G        W          (5.19.8-00005-gba424747260d)
>>    MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 28042222  XER: 20040000
>>    CFAR: c000000000123780 IRQMASK: 3
>>    NIP [c0000000000555e4] ___do_page_fault+0x7b4/0xaa0
>>    LR [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0
>>    Call Trace:
>>    [c00000008f507610] [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0 (unreliable)
>>    [c00000008f5076c0] [c000000000055938] do_page_fault+0x68/0x130
>>    [c00000008f5076f0] [c000000000008914] data_access_common_virt+0x194/0x1f0
>>    --- interrupt: 300 at __copy_tofrom_user_base+0x9c/0x5a4
> 
> ...
> 
>>
>> Fix this by saving and restoring the kernel-side AMR/IAMR values when
>> switching tasks.
> 
> As explained above, KUAP access should be locked at that time, so saving 
> and restoring it should not have any effect. If it does, it means 
> something goes wrong somewhere else.
> 
>>
>> Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
>> Signed-off-by: Samuel Holland <samuel@sholland.org>
>> ---
>> I have no idea if this is the right change to make, and it could be
>> optimized, but my system has been stable with this patch for 5 days now.
>>
>> Without the patch, I hit the bug every few minutes when my load average
>> is <1, and I hit it immediately if I try to do a parallel kernel build.
> 
> Great, then can you give it a try with CONFIG_PPC_KUAP_DEBUG?

Yes, I will try this out in the next few days.

Regards,
Samuel



* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
From: Christophe Leroy @ 2022-09-18  8:21 UTC
  To: Samuel Holland, Michael Ellerman, Nicholas Piggin
  Cc: Eric W. Biederman, Kees Cook, Russell Currey, linux-kernel, linuxppc-dev



On 17/09/2022 at 20:38, Samuel Holland wrote:
> On 9/17/22 03:16, Christophe Leroy wrote:
>> On 16/09/2022 at 07:05, Samuel Holland wrote:
>>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>>> with userspace access enabled until after the next IRQ or privilege
>>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>>> switching back to the original task, the userspace access would fault:
>>
>> This is not supposed to happen. You never switch away from a task
>> magically. A task switch will always happen in an interrupt, which means
>> copy_{from,to}_user() gets interrupted.
> 
> That makes sense; the interrupt handler is responsible for saving the
> KUAP status. It looks like neither DEFINE_INTERRUPT_HANDLER_RAW nor any
> of its users (performance_monitor_exception(), do_slb_fault()) does that.

As far as I can see, that's done earlier, in exceptions-64s.S.
Look for kuap_save_amr_and_lock.

Now, it may be possible that one of the exception paths misses it.

> Yet they still call one of the interrupt_return variants, which restores
> the status from the stack.
> 
>> Whenever an interrupt is taken, the kuap_save_amr_and_lock() macro is used
>> to save the KUAP status onto the stack and then lock KUAP access. At
>> interrupt exit, the kuap_kernel_restore() macro or function restores KUAP
>> access from the stack. At the time the task switch happens, KUAP access
>> is expected to be locked. During the task switch, the stack is switched,
>> so the KUAP status is taken back from the new task's stack.
> 
> What if another task calls schedule() from kernel process context, and
> the scheduler switches to a task that had been preempted inside
> copy_{from,to}_user()? Then there is no interrupt involved, and I don't
> see where kuap_kernel_restore() would get called.

Yes, there is an interrupt involved. If that task was preempted
inside copy_from_user(), it must have been through an interrupt, likely
a timer interrupt. So switching back to that task means doing an
interrupt return in the context of that task. That's when the KUAP
status should be restored.
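
That is, for a task preempted via an interrupt, the round trip looks
roughly like this (an illustrative path; the exact symbols vary):

  timer interrupt, taken while task A has user access open
    kuap_save_amr_and_lock      -> open AMR saved into A's pt_regs, AMR locked
    ...
    __schedule()                -> switches away from A
    ...                            (later, something switches back to A)
    interrupt_return
      kuap_kernel_restore       -> AMR restored from A's pt_regs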

> 
>> Your fix suggests that there is some path where the KUAP status is not
>> properly saved and/or restored. Did you try running with
>> CONFIG_PPC_KUAP_DEBUG? It should warn whenever KUAP access is left
>> unlocked.
>>
>>>
>>>     Kernel attempted to write user page (3fff7ab68190) - exploit attempt? (uid: 65536)
>>>     ------------[ cut here ]------------
>>>     Bug: Write fault blocked by KUAP!
>>>     WARNING: CPU: 56 PID: 4939 at arch/powerpc/mm/fault.c:228 ___do_page_fault+0x7b4/0xaa0
>>>     CPU: 56 PID: 4939 Comm: git Tainted: G        W         5.19.8-00005-gba424747260d #1
>>>     NIP:  c0000000000555e4 LR: c0000000000555e0 CTR: c00000000079d9d0
>>>     REGS: c00000008f507370 TRAP: 0700   Tainted: G        W          (5.19.8-00005-gba424747260d)
>>>     MSR:  9000000000021033 <SF,HV,ME,IR,DR,RI,LE>  CR: 28042222  XER: 20040000
>>>     CFAR: c000000000123780 IRQMASK: 3
>>>     NIP [c0000000000555e4] ___do_page_fault+0x7b4/0xaa0
>>>     LR [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0
>>>     Call Trace:
>>>     [c00000008f507610] [c0000000000555e0] ___do_page_fault+0x7b0/0xaa0 (unreliable)
>>>     [c00000008f5076c0] [c000000000055938] do_page_fault+0x68/0x130
>>>     [c00000008f5076f0] [c000000000008914] data_access_common_virt+0x194/0x1f0
>>>     --- interrupt: 300 at __copy_tofrom_user_base+0x9c/0x5a4
>>
>> ...
>>
>>>
>>> Fix this by saving and restoring the kernel-side AMR/IAMR values when
>>> switching tasks.
>>
>> As explained above, KUAP access should be locked at that time, so saving
>> and restoring it should not have any effect. If it does, it means
>> something goes wrong somewhere else.
>>
>>>
>>> Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
>>> Signed-off-by: Samuel Holland <samuel@sholland.org>
>>> ---
>>> I have no idea if this is the right change to make, and it could be
>>> optimized, but my system has been stable with this patch for 5 days now.
>>>
>>> Without the patch, I hit the bug every few minutes when my load average
>>> is <1, and I hit it immediately if I try to do a parallel kernel build.
>>
>> Great, then can you give it a try with CONFIG_PPC_KUAP_DEBUG?
> 
> Yes, I will try this out in the next few days.
> 

Thanks
Christophe


* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
From: Michael Ellerman @ 2022-09-19 12:37 UTC
  To: Christophe Leroy, Samuel Holland, Nicholas Piggin
  Cc: Eric W. Biederman, Kees Cook, Russell Currey, linux-kernel, linuxppc-dev

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> On 16/09/2022 at 07:05, Samuel Holland wrote:
>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>> with userspace access enabled until after the next IRQ or privilege
>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>> switching back to the original task, the userspace access would fault:
>
> This is not supposed to happen. You never switch away from a task
> magically. A task switch will always happen in an interrupt, which means
> copy_{from,to}_user() gets interrupted.

Unfortunately this isn't true when CONFIG_PREEMPT=y.

We can switch away without an interrupt via:
  __copy_tofrom_user()
    -> __copy_tofrom_user_power7()
       -> exit_vmx_usercopy()
          -> preempt_enable()
             -> __preempt_schedule()
                -> preempt_schedule()
                   -> preempt_schedule_common()
                      -> __schedule()

I do some boot tests with CONFIG_PREEMPT=y, but I realise now those are
all on Power8, which is a bit of an oversight on my part.

And clearly no one else tests it, until now :)

I think the root of our problem is that our KUAP lock/unlock is at too
high a level, ie. we do it in C around the low-level copy to/from.

eg:

static inline unsigned long
raw_copy_to_user(void __user *to, const void *from, unsigned long n)
{
	unsigned long ret;

	allow_write_to_user(to, n);
	ret = __copy_tofrom_user(to, (__force const void __user *)from, n);
	prevent_write_to_user(to, n);
	return ret;
}

There's a reason we did that, which is that we have various different
KUAP methods on different platforms, not a simple instruction like other
arches.

But that means we have that exit_vmx_usercopy() being called deep in the
guts of __copy_tofrom_user(), with KUAP disabled, and then we call into
the preempt machinery and eventually schedule.
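
For reference, this is the helper in question before the fix (abridged
from arch/powerpc/lib/vmx-helper.c, as shown unpatched in the diff below):

int exit_vmx_usercopy(void)
{
	disable_kernel_altivec();
	pagefault_enable();
	preempt_enable();	/* may call __preempt_schedule() with user access still open */
	return 0;
}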

I don't see an easy way to fix that "properly", it would be a big change
to all platforms to push the KUAP save/restore down into the low level
asm code.

But I think the patch below does fix it, although it abuses things a
little. Namely it only works because the 64s KUAP code can handle a
double call to prevent, and doesn't need the addresses or size for the
allow.
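
Roughly speaking, that works because on 64s both helpers just write a
fixed AMR value, along the lines of the sketch below (a simplified
assumption about the Radix KUAP implementation; the real definitions in
arch/powerpc/include/asm/book3s/64/kup.h differ in detail):

static inline void prevent_user_access(unsigned long dir)
{
	mtspr(SPRN_AMR, AMR_KUAP_BLOCKED);	/* unconditional, so a double call is harmless */
	isync();
}

static inline void allow_read_write_user(void __user *to,
					 const void __user *from,
					 unsigned long size)
{
	mtspr(SPRN_AMR, 0);	/* opens read+write; the arguments are unused
				   here, hence NULL, NULL, 0 is fine */
	isync();
}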

Still I think it might be our best option for an easy fix.

Samuel, can you try this on your system and check it works for you?

cheers


diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 97a77b37daa3..c50080c6a136 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -432,6 +432,7 @@ int speround_handler(struct pt_regs *regs);
 /* VMX copying */
 int enter_vmx_usercopy(void);
 int exit_vmx_usercopy(void);
+void exit_vmx_usercopy_continue(void);
 int enter_vmx_ops(void);
 void *exit_vmx_ops(void *dest);
 
diff --git a/arch/powerpc/lib/copyuser_power7.S b/arch/powerpc/lib/copyuser_power7.S
index 28f0be523c06..77804860383c 100644
--- a/arch/powerpc/lib/copyuser_power7.S
+++ b/arch/powerpc/lib/copyuser_power7.S
@@ -47,7 +47,7 @@
 	ld	r15,STK_REG(R15)(r1)
 	ld	r14,STK_REG(R14)(r1)
 .Ldo_err3:
-	bl	exit_vmx_usercopy
+	bl	exit_vmx_usercopy_continue
 	ld	r0,STACKFRAMESIZE+16(r1)
 	mtlr	r0
 	b	.Lexit
diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c
index f76a50291fd7..78a18b8384ff 100644
--- a/arch/powerpc/lib/vmx-helper.c
+++ b/arch/powerpc/lib/vmx-helper.c
@@ -8,6 +8,7 @@
  */
 #include <linux/uaccess.h>
 #include <linux/hardirq.h>
+#include <asm/kup.h>
 #include <asm/switch_to.h>
 
 int enter_vmx_usercopy(void)
@@ -34,12 +35,19 @@ int enter_vmx_usercopy(void)
  */
 int exit_vmx_usercopy(void)
 {
+	prevent_user_access(KUAP_READ_WRITE);
 	disable_kernel_altivec();
 	pagefault_enable();
 	preempt_enable();
 	return 0;
 }
 
+void exit_vmx_usercopy_continue(void)
+{
+	exit_vmx_usercopy();
+	allow_read_write_user(NULL, NULL, 0);
+}
+
 int enter_vmx_ops(void)
 {
 	if (in_interrupt())



* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
From: Christophe Leroy @ 2022-09-19 13:39 UTC
  To: Michael Ellerman, Samuel Holland, Nicholas Piggin
  Cc: Eric W. Biederman, Kees Cook, Russell Currey, linux-kernel, linuxppc-dev



On 19/09/2022 at 14:37, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> On 16/09/2022 at 07:05, Samuel Holland wrote:
>>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>>> with userspace access enabled until after the next IRQ or privilege
>>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>>> switching back to the original task, the userspace access would fault:
>>
>> This is not supposed to happen. You never switch away from a task
>> magically. A task switch will always happen in an interrupt, which means
>> copy_{from,to}_user() gets interrupted.
> 
> Unfortunately this isn't true when CONFIG_PREEMPT=y.

Argh, yes, I wrote the above with the assumption that we properly follow
the main principle that no complex function is to be used while KUAP is
open... which is apparently not true here. x86 would have detected it
with objtool, but we don't have that yet on powerpc.

> 
> We can switch away without an interrupt via:
>    __copy_tofrom_user()
>      -> __copy_tofrom_user_power7()
>         -> exit_vmx_usercopy()
>            -> preempt_enable()
>               -> __preempt_schedule()
>                  -> preempt_schedule()
>                     -> preempt_schedule_common()
>                        -> __schedule()


Should we use preempt_enable_no_resched() to avoid that?


> 
> I do some boot tests with CONFIG_PREEMPT=y, but I realise now those are
> all on Power8, which is a bit of an oversight on my part.
> 
> And clearly no one else tests it, until now :)
> 
> I think the root of our problem is that our KUAP lock/unlock is at too
> high a level, ie. we do it in C around the low-level copy to/from.
> 
> eg:
> 
> static inline unsigned long
> raw_copy_to_user(void __user *to, const void *from, unsigned long n)
> {
> 	unsigned long ret;
> 
> 	allow_write_to_user(to, n);
> 	ret = __copy_tofrom_user(to, (__force const void __user *)from, n);
> 	prevent_write_to_user(to, n);
> 	return ret;
> }
> 
> There's a reason we did that, which is that we have various different
> KUAP methods on different platforms, not a simple instruction like other
> arches.
> 
> But that means we have that exit_vmx_usercopy() being called deep in the
> guts of __copy_tofrom_user(), with KUAP disabled, and then we call into
> the preempt machinery and eventually schedule.
> 
> I don't see an easy way to fix that "properly", it would be a big change
> to all platforms to push the KUAP save/restore down into the low level
> asm code.
> 
> But I think the patch below does fix it, although it abuses things a
> little. Namely it only works because the 64s KUAP code can handle a
> double call to prevent, and doesn't need the addresses or size for the
> allow.
> 
> Still I think it might be our best option for an easy fix.

Wouldn't it be even easier and less abusive to use
preempt_enable_no_resched()? Or is there definitely a good reason to
reschedule after a VMX copy when we don't with regular copies?

Christophe


* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
From: Samuel Holland @ 2022-09-21  3:33 UTC
  To: Michael Ellerman, Christophe Leroy, Nicholas Piggin
  Cc: linux-kernel, linuxppc-dev, Eric W. Biederman, Kees Cook

On 9/19/22 07:37, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> On 16/09/2022 at 07:05, Samuel Holland wrote:
>>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>>> with userspace access enabled until after the next IRQ or privilege
>>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>>> switching back to the original task, the userspace access would fault:
>>
>> This is not supposed to happen. You never switch away from a task
>> magically. A task switch will always happen in an interrupt, which means
>> copy_{from,to}_user() gets interrupted.
> 
> Unfortunately this isn't true when CONFIG_PREEMPT=y.
> 
> We can switch away without an interrupt via:
>   __copy_tofrom_user()
>     -> __copy_tofrom_user_power7()
>        -> exit_vmx_usercopy()
>           -> preempt_enable()
>              -> __preempt_schedule()
>                 -> preempt_schedule()
>                    -> preempt_schedule_common()
>                       -> __schedule()
> 
> I do some boot tests with CONFIG_PREEMPT=y, but I realise now those are
> all on Power8, which is a bit of an oversight on my part.
> 
> And clearly no one else tests it, until now :)
> 
> I think the root of our problem is that our KUAP lock/unlock is at too
> high a level, ie. we do it in C around the low-level copy to/from.
> 
> eg:
> 
> static inline unsigned long
> raw_copy_to_user(void __user *to, const void *from, unsigned long n)
> {
> 	unsigned long ret;
> 
> 	allow_write_to_user(to, n);
> 	ret = __copy_tofrom_user(to, (__force const void __user *)from, n);
> 	prevent_write_to_user(to, n);
> 	return ret;
> }
> 
> There's a reason we did that, which is that we have various different
> KUAP methods on different platforms, not a simple instruction like other
> arches.
> 
> But that means we have that exit_vmx_usercopy() being called deep in the
> guts of __copy_tofrom_user(), with KUAP disabled, and then we call into
> the preempt machinery and eventually schedule.
> 
> I don't see an easy way to fix that "properly", it would be a big change
> to all platforms to push the KUAP save/restore down into the low level
> asm code.
> 
> But I think the patch below does fix it, although it abuses things a
> little. Namely it only works because the 64s KUAP code can handle a
> double call to prevent, and doesn't need the addresses or size for the
> allow.
> 
> Still I think it might be our best option for an easy fix.
> 
> Samuel, can you try this on your system and check it works for you?

It looks like your patch works. Thanks for the correct fix!

I replaced my patch with the one below, and enabled
CONFIG_PPC_KUAP_DEBUG=y, and I was able to do several kernel builds
without any crashes or splats in dmesg.

I suppose the other calls to exit_vmx_usercopy() in copyuser_power7.S
would not cause a crash, because there is no userspace memory access
afterward, but couldn't they still leave KUAP erroneously unlocked?

Regards,
Samuel

> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index 97a77b37daa3..c50080c6a136 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -432,6 +432,7 @@ int speround_handler(struct pt_regs *regs);
>  /* VMX copying */
>  int enter_vmx_usercopy(void);
>  int exit_vmx_usercopy(void);
> +void exit_vmx_usercopy_continue(void);
>  int enter_vmx_ops(void);
>  void *exit_vmx_ops(void *dest);
>  
> diff --git a/arch/powerpc/lib/copyuser_power7.S b/arch/powerpc/lib/copyuser_power7.S
> index 28f0be523c06..77804860383c 100644
> --- a/arch/powerpc/lib/copyuser_power7.S
> +++ b/arch/powerpc/lib/copyuser_power7.S
> @@ -47,7 +47,7 @@
>  	ld	r15,STK_REG(R15)(r1)
>  	ld	r14,STK_REG(R14)(r1)
>  .Ldo_err3:
> -	bl	exit_vmx_usercopy
> +	bl	exit_vmx_usercopy_continue
>  	ld	r0,STACKFRAMESIZE+16(r1)
>  	mtlr	r0
>  	b	.Lexit
> diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c
> index f76a50291fd7..78a18b8384ff 100644
> --- a/arch/powerpc/lib/vmx-helper.c
> +++ b/arch/powerpc/lib/vmx-helper.c
> @@ -8,6 +8,7 @@
>   */
>  #include <linux/uaccess.h>
>  #include <linux/hardirq.h>
> +#include <asm/kup.h>
>  #include <asm/switch_to.h>
>  
>  int enter_vmx_usercopy(void)
> @@ -34,12 +35,19 @@ int enter_vmx_usercopy(void)
>   */
>  int exit_vmx_usercopy(void)
>  {
> +	prevent_user_access(KUAP_READ_WRITE);
>  	disable_kernel_altivec();
>  	pagefault_enable();
>  	preempt_enable();
>  	return 0;
>  }
>  
> +void exit_vmx_usercopy_continue(void)
> +{
> +	exit_vmx_usercopy();
> +	allow_read_write_user(NULL, NULL, 0);
> +}
> +
>  int enter_vmx_ops(void)
>  {
>  	if (in_interrupt())
> 


* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
  2022-09-21  3:33       ` Samuel Holland
@ 2022-09-21  5:17         ` Christophe Leroy
  0 siblings, 0 replies; 22+ messages in thread
From: Christophe Leroy @ 2022-09-21  5:17 UTC (permalink / raw)
  To: Samuel Holland, Michael Ellerman, Nicholas Piggin
  Cc: Eric W. Biederman, Kees Cook, Russell Currey, linux-kernel, linuxppc-dev



On 21/09/2022 at 05:33, Samuel Holland wrote:
> On 9/19/22 07:37, Michael Ellerman wrote:
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> On 16/09/2022 at 07:05, Samuel Holland wrote:
>>>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>>>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>>>> with userspace access enabled until after the next IRQ or privilege
>>>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>>>> switching back to the original task, the userspace access would fault:
>>>
>>> This is not supposed to happen. You never switch away from a task
>>> magically. Task switch will always happen in an interrupt, that means
>>> copy_{from,to}_user() get interrupted.
>>
>> Unfortunately this isn't true when CONFIG_PREEMPT=y.
>>
>> We can switch away without an interrupt via:
>>    __copy_tofrom_user()
>>      -> __copy_tofrom_user_power7()
>>         -> exit_vmx_usercopy()
>>            -> preempt_enable()
>>               -> __preempt_schedule()
>>                  -> preempt_schedule()
>>                     -> preempt_schedule_common()
>>                        -> __schedule()
>>
>> I do some boot tests with CONFIG_PREEMPT=y, but I realise now those are
>> all on Power8, which is a bit of an oversight on my part.
>>
>> And clearly no one else tests it, until now :)
>>
>> I think the root of our problem is that our KUAP lock/unlock is at too
>> high a level, ie. we do it in C around the low-level copy to/from.
>>
>> eg:
>>
>> static inline unsigned long
>> raw_copy_to_user(void __user *to, const void *from, unsigned long n)
>> {
>> 	unsigned long ret;
>>
>> 	allow_write_to_user(to, n);
>> 	ret = __copy_tofrom_user(to, (__force const void __user *)from, n);
>> 	prevent_write_to_user(to, n);
>> 	return ret;
>> }
>>
>> There's a reason we did that, which is that we have various different
>> KUAP methods on different platforms, not a simple instruction like other
>> arches.
>>
>> But that means we have that exit_vmx_usercopy() being called deep in the
>> guts of __copy_tofrom_user(), with KUAP disabled, and then we call into
>> the preempt machinery and eventually schedule.
>>
>> I don't see an easy way to fix that "properly", it would be a big change
>> to all platforms to push the KUAP save/restore down into the low level
>> asm code.
>>
>> But I think the patch below does fix it, although it abuses things a
>> little. Namely it only works because the 64s KUAP code can handle a
>> double call to prevent, and doesn't need the addresses or size for the
>> allow.
>>
>> Still I think it might be our best option for an easy fix.
>>
>> Samuel, can you try this on your system and check it works for you?
> 
> It looks like your patch works. Thanks for the correct fix!

Instead of the patch from Michael, could you try replacing
preempt_enable() with preempt_enable_no_resched() in exit_vmx_usercopy()?

> 
> I replaced my patch with the one below, and enabled
> CONFIG_PPC_KUAP_DEBUG=y, and I was able to do several kernel builds
> without any crashes or splats in dmesg.

Did you try CONFIG_PPC_KUAP_DEBUG without the patch? Did it detect any
problem?

Thanks
Christophe

> 
> I suppose the other calls to exit_vmx_usercopy() in copyuser_power7.S
> would not cause a crash, because there is no userspace memory access
> afterward, but couldn't they still leave KUAP erroneously unlocked?
> 
> Regards,
> Samuel
> 
>> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
>> index 97a77b37daa3..c50080c6a136 100644
>> --- a/arch/powerpc/include/asm/processor.h
>> +++ b/arch/powerpc/include/asm/processor.h
>> @@ -432,6 +432,7 @@ int speround_handler(struct pt_regs *regs);
>>   /* VMX copying */
>>   int enter_vmx_usercopy(void);
>>   int exit_vmx_usercopy(void);
>> +void exit_vmx_usercopy_continue(void);
>>   int enter_vmx_ops(void);
>>   void *exit_vmx_ops(void *dest);
>>   
>> diff --git a/arch/powerpc/lib/copyuser_power7.S b/arch/powerpc/lib/copyuser_power7.S
>> index 28f0be523c06..77804860383c 100644
>> --- a/arch/powerpc/lib/copyuser_power7.S
>> +++ b/arch/powerpc/lib/copyuser_power7.S
>> @@ -47,7 +47,7 @@
>>   	ld	r15,STK_REG(R15)(r1)
>>   	ld	r14,STK_REG(R14)(r1)
>>   .Ldo_err3:
>> -	bl	exit_vmx_usercopy
>> +	bl	exit_vmx_usercopy_continue
>>   	ld	r0,STACKFRAMESIZE+16(r1)
>>   	mtlr	r0
>>   	b	.Lexit
>> diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c
>> index f76a50291fd7..78a18b8384ff 100644
>> --- a/arch/powerpc/lib/vmx-helper.c
>> +++ b/arch/powerpc/lib/vmx-helper.c
>> @@ -8,6 +8,7 @@
>>    */
>>   #include <linux/uaccess.h>
>>   #include <linux/hardirq.h>
>> +#include <asm/kup.h>
>>   #include <asm/switch_to.h>
>>   
>>   int enter_vmx_usercopy(void)
>> @@ -34,12 +35,19 @@ int enter_vmx_usercopy(void)
>>    */
>>   int exit_vmx_usercopy(void)
>>   {
>> +	prevent_user_access(KUAP_READ_WRITE);
>>   	disable_kernel_altivec();
>>   	pagefault_enable();
>>   	preempt_enable();
>>   	return 0;
>>   }
>>   
>> +void exit_vmx_usercopy_continue(void)
>> +{
>> +	exit_vmx_usercopy();
>> +	allow_read_write_user(NULL, NULL, 0);
>> +}
>> +
>>   int enter_vmx_ops(void)
>>   {
>>   	if (in_interrupt())
>>
> 

* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
  2022-09-19 13:39       ` Christophe Leroy
@ 2022-09-21 13:00         ` Michael Ellerman
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Ellerman @ 2022-09-21 13:00 UTC (permalink / raw)
  To: Christophe Leroy, Samuel Holland, Nicholas Piggin
  Cc: Eric W. Biederman, Kees Cook, Russell Currey, linux-kernel, linuxppc-dev

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> On 19/09/2022 at 14:37, Michael Ellerman wrote:
>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>> On 16/09/2022 at 07:05, Samuel Holland wrote:
>>>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>>>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>>>> with userspace access enabled until after the next IRQ or privilege
>>>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>>>> switching back to the original task, the userspace access would fault:
>>>
>>> This is not supposed to happen. You never switch away from a task
>>> magically. Task switch will always happen in an interrupt, that means
>>> copy_{from,to}_user() get interrupted.
>> 
>> Unfortunately this isn't true when CONFIG_PREEMPT=y.
>
> Argh, yes, I wrote the above with the assumption that we properly follow 
> the main principle that no complex function is to be used while KUAP is
> open ... Which is apparently not true here. x86 would have detected it 
> with objtool, but we don't have it yet in powerpc.

Yes and yes :/

>> We can switch away without an interrupt via:
>>    __copy_tofrom_user()
>>      -> __copy_tofrom_user_power7()
>>         -> exit_vmx_usercopy()
>>            -> preempt_enable()
>>               -> __preempt_schedule()
>>                  -> preempt_schedule()
>>                     -> preempt_schedule_common()
>>                        -> __schedule()
>
>
> Should we use preempt_enable_no_resched() to avoid that ?

Good point :)

...
>> 
>> Still I think it might be our best option for an easy fix.
>
> Wouldn't it be even easier and less abusive to use
> preempt_enable_no_resched()? Or is there definitely a good reason to
> resched after a VMX copy while we don't with regular copies?

I don't think it's important to reschedule there. I guess it means
another task that wants to preempt will have to wait a little longer,
but I doubt it's measurable.

One reason to do the KUAP lock is that it also protects us from running
disable_kernel_altivec() with KUAP unlocked.

That in turn calls into msr_check_and_clear() and
__msr_check_and_clear(), neither of which is a problem per se, but we'd
need to make that all notrace to be 100% safe.

At the moment we're running that all with KUAP unlocked anyway, so using
preempt_enable_no_resched() would fix the actual bug and is much nicer
than my patch, so we should probably do that.

We can look at making disable_kernel_altivec() etc. notrace as a
follow-up.

cheers
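
To make the tradeoff concrete, the two primitives differ roughly as
below under CONFIG_PREEMPT=y (a simplified sketch of the definitions in
include/linux/preempt.h; the real macros go through a few more layers):

	/* simplified sketch, CONFIG_PREEMPT=y */
	#define preempt_enable() do { \
		barrier(); \
		if (unlikely(preempt_count_dec_and_test())) \
			__preempt_schedule(); /* may enter __schedule() right here */ \
	} while (0)

	#define preempt_enable_no_resched() do { \
		barrier(); \
		/* no immediate reschedule; it waits for the next preemption point */ \
		preempt_count_dec(); \
	} while (0)

With the no_resched variant the VMX copy path can no longer
context-switch while KUAP is unlocked; any pending reschedule happens at
the next interrupt or kernel exit instead.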

* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
  2022-09-21  5:17         ` Christophe Leroy
@ 2022-10-26  4:27           ` Samuel Holland
  0 siblings, 0 replies; 22+ messages in thread
From: Samuel Holland @ 2022-10-26  4:27 UTC (permalink / raw)
  To: Christophe Leroy, Michael Ellerman, Nicholas Piggin
  Cc: Eric W. Biederman, Kees Cook, Russell Currey, linux-kernel, linuxppc-dev

On 9/21/22 00:17, Christophe Leroy wrote:
> On 21/09/2022 at 05:33, Samuel Holland wrote:
>> On 9/19/22 07:37, Michael Ellerman wrote:
>>> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>>>> On 16/09/2022 at 07:05, Samuel Holland wrote:
>>>>> With CONFIG_PREEMPT=y (involuntary preemption enabled), it is possible
>>>>> to switch away from a task inside copy_{from,to}_user. This left the CPU
>>>>> with userspace access enabled until after the next IRQ or privilege
>>>>> level switch, when AMR/IAMR got reset to AMR_KU[AE]P_BLOCKED. Then, when
>>>>> switching back to the original task, the userspace access would fault:
>>>>
>>>> This is not supposed to happen. You never switch away from a task
>>>> magically. Task switch will always happen in an interrupt, that means
>>>> copy_{from,to}_user() get interrupted.
>>>
>>> Unfortunately this isn't true when CONFIG_PREEMPT=y.
>>>
>>> We can switch away without an interrupt via:
>>>    __copy_tofrom_user()
>>>      -> __copy_tofrom_user_power7()
>>>         -> exit_vmx_usercopy()
>>>            -> preempt_enable()
>>>               -> __preempt_schedule()
>>>                  -> preempt_schedule()
>>>                     -> preempt_schedule_common()
>>>                        -> __schedule()
>>>
>>> I do some boot tests with CONFIG_PREEMPT=y, but I realise now those are
>>> all on Power8, which is a bit of an oversight on my part.
>>>
>>> And clearly no one else tests it, until now :)
>>>
>>> I think the root of our problem is that our KUAP lock/unlock is at too
>>> high a level, ie. we do it in C around the low-level copy to/from.
>>>
>>> eg:
>>>
>>> static inline unsigned long
>>> raw_copy_to_user(void __user *to, const void *from, unsigned long n)
>>> {
>>> 	unsigned long ret;
>>>
>>> 	allow_write_to_user(to, n);
>>> 	ret = __copy_tofrom_user(to, (__force const void __user *)from, n);
>>> 	prevent_write_to_user(to, n);
>>> 	return ret;
>>> }
>>>
>>> There's a reason we did that, which is that we have various different
>>> KUAP methods on different platforms, not a simple instruction like other
>>> arches.
>>>
>>> But that means we have that exit_vmx_usercopy() being called deep in the
>>> guts of __copy_tofrom_user(), with KUAP disabled, and then we call into
>>> the preempt machinery and eventually schedule.
>>>
>>> I don't see an easy way to fix that "properly", it would be a big change
>>> to all platforms to push the KUAP save/restore down into the low level
>>> asm code.
>>>
>>> But I think the patch below does fix it, although it abuses things a
>>> little. Namely it only works because the 64s KUAP code can handle a
>>> double call to prevent, and doesn't need the addresses or size for the
>>> allow.
>>>
>>> Still I think it might be our best option for an easy fix.
>>>
>>> Samuel, can you try this on your system and check it works for you?
>>
>> It looks like your patch works. Thanks for the correct fix!
> 
> Instead of the patch from Michael, could you try replacing
> preempt_enable() with preempt_enable_no_resched() in exit_vmx_usercopy()?

I finally got a chance to test this, and the simpler fix of using
preempt_enable_no_resched() works as well.

>> I replaced my patch with the one below, and enabled
>> CONFIG_PPC_KUAP_DEBUG=y, and I was able to do several kernel builds
>> without any crashes or splats in dmesg.
> 
> Did you try CONFIG_PPC_KUAP_DEBUG without the patch? Did it detect any
> problem?

I believe I did at one point, and it did not detect anything.

Regards,
Samuel


* Re: [PATCH] powerpc: Save AMR/IAMR when switching tasks
  2022-10-26  4:27           ` Samuel Holland
@ 2022-11-14 12:14             ` Christophe Leroy
  0 siblings, 0 replies; 22+ messages in thread
From: Christophe Leroy @ 2022-11-14 12:14 UTC (permalink / raw)
  To: Samuel Holland, Michael Ellerman, Nicholas Piggin
  Cc: Eric W. Biederman, Kees Cook, Russell Currey, linux-kernel, linuxppc-dev



On 26/10/2022 at 06:27, Samuel Holland wrote:
> On 9/21/22 00:17, Christophe Leroy wrote:
>>
>> Instead of the patch from Michael, could you try replacing
>> preempt_enable() with preempt_enable_no_resched() in exit_vmx_usercopy()?
> 
> I finally got a chance to test this, and the simpler fix of using
> preempt_enable_no_resched() works as well.
> 

Thanks.

That should be fixed upstream now, by commit 
https://github.com/torvalds/linux/commit/00ff1eaac129a24516a3f6d75adfb9df1efb55dd

Christophe
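
For the record, the substance of that commit, as this thread describes
it, is the one-line swap Samuel tested above; in diff form it amounts to
something like the following (a sketch for readers, see the linked
commit for the exact patch and changelog):

	--- a/arch/powerpc/lib/vmx-helper.c
	+++ b/arch/powerpc/lib/vmx-helper.c
	@@ ... @@ int exit_vmx_usercopy(void)
	 	disable_kernel_altivec();
	 	pagefault_enable();
	-	preempt_enable();
	+	preempt_enable_no_resched();
	 	return 0;
	 }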

Thread overview: 11 messages
2022-09-16  5:05 [PATCH] powerpc: Save AMR/IAMR when switching tasks Samuel Holland
2022-09-17  8:16 ` Christophe Leroy
2022-09-17 18:38   ` Samuel Holland
2022-09-18  8:21     ` Christophe Leroy
2022-09-19 12:37   ` Michael Ellerman
2022-09-19 13:39     ` Christophe Leroy
2022-09-21 13:00       ` Michael Ellerman
2022-09-21  3:33     ` Samuel Holland
2022-09-21  5:17       ` Christophe Leroy
2022-10-26  4:27         ` Samuel Holland
2022-11-14 12:14           ` Christophe Leroy