* [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP
@ 2021-01-09  3:25 Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user Christopher M. Riedl
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-09  3:25 UTC (permalink / raw)
  To: linuxppc-dev

As reported by Anton, there is a large penalty to signal handling
performance on radix systems using KUAP. The signal handling code
performs many user access operations, each of which needs to switch the
KUAP permissions bit to open and then close user access. This involves a
costly 'mtspr' operation [0].
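
Roughly, with KUAP enabled every individual user access today does the
equivalent of (names as used in the powerpc uaccess code; schematic
only):

	allow_read_from_user(from, n);		/* mtspr to open user access  */
	/* ... the single access ... */
	prevent_read_from_user(from, n);	/* mtspr to close user access */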

There is existing work on x86, and by Christophe Leroy for PPC32, to
instead open up user access in "blocks" using user_*_access_{begin,end}.
We can do the same on PPC64 to bring performance back up on KUAP-enabled
radix and now also hash MMU systems [1].
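
For illustration only (not code from this series; uptr, a, b, and the
failed label are placeholders), the blocked pattern looks roughly like
this, so the costly open/close pair happens once per block rather than
once per access:

	if (!user_write_access_begin(uptr, size))	/* open user access once  */
		return -EFAULT;
	unsafe_put_user(a, &uptr->a, failed);
	unsafe_put_user(b, &uptr->b, failed);
	user_write_access_end();			/* close user access once */
	return 0;
failed:
	user_write_access_end();
	return -EFAULT;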

Hash MMU KUAP support along with uaccess flush has landed in linuxppc/next
since the last revision. This series also provides a large benefit on hash
with KUAP. However, in the hash implementation of KUAP the user AMR is
always restored during system_call_exception() which cannot be avoided.
Fewer user access switches naturally also result in less uaccess flushing.

The first two patches add some needed 'unsafe' versions of copy-from
functions. While these do not make use of asm-goto, they still allow
callers to avoid the repeated uaccess switches.
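
Concretely, the copy routines are wrapped with unsafe_op_wrap(), which
in powerpc's uaccess.h is roughly:

	#define unsafe_op_wrap(op, label)	do { if (unlikely(op)) goto label; } while (0)

so a non-zero return from the copy jumps to the caller-supplied label,
matching the convention of the asm-goto based unsafe_put_user().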

The third patch moves functions called by setup_sigcontext() into a new
prepare_setup_sigcontext() to simplify converting setup_sigcontext()
into an 'unsafe' version which assumes an open uaccess window later.

The fourth patch cleans up some of the Transactional Memory (TM)
ifdeffery to simplify using uaccess blocks later.

The next two patches rewrite some of the signal64 helper functions to
be 'unsafe'. Finally, the last two patches update the main signal
handling functions to make use of the new 'unsafe' helpers and eliminate
some additional uaccess switching.

I used the will-it-scale signal1 benchmark to measure and compare
performance [2]. The below results are from running a minimal
kernel+initramfs QEMU/KVM guest on a POWER9 Blackbird:

	signal1_threads -t1 -s10

	|                             | hash   | radix  |
	| --------------------------- | ------ | ------ |
	| linuxppc/next               | 118693 | 133296 |
	| linuxppc/next w/o KUAP+KUEP | 228911 | 228654 |
	| unsafe-signal64             | 200480 | 234067 |

There were questions previously about the unsafe_copy_from_user() and
unsafe_copy_{vsx,fpr}_from_user() implementations in the first two
patches of this series [3][4]. The two results below show the performance
degradations when using the proposed alternate implementations:

	| unsafe-signal64-regs        | 178688 | 201128 |
	| unsafe-signal64-copy        | 147443 | 165759 |

Full trees with the two alternate implementations are available [5][6].

[0]: https://github.com/linuxppc/issues/issues/277
[1]: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=196278
[2]: https://github.com/antonblanchard/will-it-scale/blob/master/tests/signal1.c
[3]: https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-October/219355.html
[4]: https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-October/219351.html
[5]: https://git.sr.ht/~cmr/linux/tree/unsafe-signal64-v3-regs
[6]: https://git.sr.ht/~cmr/linux/tree/unsafe-signal64-v3-copy

v3:	* Rebase on latest linuxppc/next
	* Reword confusing commit messages
	* Add missing comma in macro in signal.h which broke compiles without
	  CONFIG_ALTIVEC
	* Validate hash KUAP signal performance improvements

v2:	* Rebase on latest linuxppc/next + Christophe Leroy's PPC32
	  signal series
	* Simplify/remove TM ifdefery similar to PPC32 series and clean
	  up the uaccess begin/end calls
	* Isolate non-inline functions so they are not called when
	  uaccess window is open


Christopher M. Riedl (6):
  powerpc/uaccess: Add unsafe_copy_from_user
  powerpc/signal: Add unsafe_copy_{vsx,fpr}_from_user()
  powerpc/signal64: Move non-inline functions out of setup_sigcontext()
  powerpc/signal64: Remove TM ifdefery in middle of if/else block
  powerpc/signal64: Replace setup_sigcontext() w/
    unsafe_setup_sigcontext()
  powerpc/signal64: Replace restore_sigcontext() w/
    unsafe_restore_sigcontext()

Daniel Axtens (2):
  powerpc/signal64: Rewrite handle_rt_signal64() to minimise uaccess
    switches
  powerpc/signal64: Rewrite rt_sigreturn() to minimise uaccess switches

 arch/powerpc/include/asm/uaccess.h |  28 ++--
 arch/powerpc/kernel/signal.h       |  33 ++++
 arch/powerpc/kernel/signal_64.c    | 237 ++++++++++++++++++-----------
 3 files changed, 198 insertions(+), 100 deletions(-)

-- 
2.26.1



* [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-09  3:25 [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP Christopher M. Riedl
@ 2021-01-09  3:25 ` Christopher M. Riedl
  2021-01-11 13:22   ` Christophe Leroy
  2021-01-09  3:25 ` [PATCH v3 2/8] powerpc/signal: Add unsafe_copy_{vsx,fpr}_from_user() Christopher M. Riedl
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-09  3:25 UTC (permalink / raw)
  To: linuxppc-dev

Implement raw_copy_from_user_allowed() which assumes that userspace read
access is open. Use this new function to implement raw_copy_from_user().
Finally, wrap the new function to follow the usual "unsafe_" convention
of taking a label argument.
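
A hypothetical caller (not part of this patch) uses it inside an
already-open read window and jumps to the supplied label on failure:

	if (!user_read_access_begin(from, len))
		return -EFAULT;
	unsafe_copy_from_user(to, from, len, failed);
	user_read_access_end();
	return 0;
failed:
	user_read_access_end();
	return -EFAULT;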

The new raw_copy_from_user_allowed() calls non-inline __copy_tofrom_user()
internally. This is still safe to call inside user access blocks formed
with user_*_access_begin()/user_*_access_end() since asm functions are not
instrumented for tracing.

Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
---
 arch/powerpc/include/asm/uaccess.h | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
index 501c9a79038c..698f3a6d6ae5 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -403,38 +403,45 @@ raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)
 }
 #endif /* __powerpc64__ */
 
-static inline unsigned long raw_copy_from_user(void *to,
-		const void __user *from, unsigned long n)
+static inline unsigned long
+raw_copy_from_user_allowed(void *to, const void __user *from, unsigned long n)
 {
-	unsigned long ret;
 	if (__builtin_constant_p(n) && (n <= 8)) {
-		ret = 1;
+		unsigned long ret = 1;
 
 		switch (n) {
 		case 1:
 			barrier_nospec();
-			__get_user_size(*(u8 *)to, from, 1, ret);
+			__get_user_size_allowed(*(u8 *)to, from, 1, ret);
 			break;
 		case 2:
 			barrier_nospec();
-			__get_user_size(*(u16 *)to, from, 2, ret);
+			__get_user_size_allowed(*(u16 *)to, from, 2, ret);
 			break;
 		case 4:
 			barrier_nospec();
-			__get_user_size(*(u32 *)to, from, 4, ret);
+			__get_user_size_allowed(*(u32 *)to, from, 4, ret);
 			break;
 		case 8:
 			barrier_nospec();
-			__get_user_size(*(u64 *)to, from, 8, ret);
+			__get_user_size_allowed(*(u64 *)to, from, 8, ret);
 			break;
 		}
 		if (ret == 0)
 			return 0;
 	}
 
+	return __copy_tofrom_user((__force void __user *)to, from, n);
+}
+
+static inline unsigned long
+raw_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	unsigned long ret;
+
 	barrier_nospec();
 	allow_read_from_user(from, n);
-	ret = __copy_tofrom_user((__force void __user *)to, from, n);
+	ret = raw_copy_from_user_allowed(to, from, n);
 	prevent_read_from_user(from, n);
 	return ret;
 }
@@ -542,6 +549,9 @@ user_write_access_begin(const void __user *ptr, size_t len)
 #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
 #define unsafe_put_user(x, p, e) __put_user_goto(x, p, e)
 
+#define unsafe_copy_from_user(d, s, l, e) \
+	unsafe_op_wrap(raw_copy_from_user_allowed(d, s, l), e)
+
 #define unsafe_copy_to_user(d, s, l, e) \
 do {									\
 	u8 __user *_dst = (u8 __user *)(d);				\
-- 
2.26.1



* [PATCH v3 2/8] powerpc/signal: Add unsafe_copy_{vsx,fpr}_from_user()
  2021-01-09  3:25 [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user Christopher M. Riedl
@ 2021-01-09  3:25 ` Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 3/8] powerpc/signal64: Move non-inline functions out of setup_sigcontext() Christopher M. Riedl
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-09  3:25 UTC (permalink / raw)
  To: linuxppc-dev

Reuse the "safe" implementation from signal.c except for calling
unsafe_copy_from_user() to copy into a local buffer.
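
A hypothetical call site (the later signal64 patches use this shape)
invokes the helpers with the read window already open and a failure
label to jump to:

	/* inside user_read_access_begin()/user_read_access_end() */
	unsafe_copy_fpr_from_user(tsk, &sc->fp_regs, efault_out);
	unsafe_copy_vsx_from_user(tsk, v_regs, efault_out);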

Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
---
 arch/powerpc/kernel/signal.h | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/arch/powerpc/kernel/signal.h b/arch/powerpc/kernel/signal.h
index 2559a681536e..c18402d625f1 100644
--- a/arch/powerpc/kernel/signal.h
+++ b/arch/powerpc/kernel/signal.h
@@ -53,6 +53,33 @@ unsigned long copy_ckfpr_from_user(struct task_struct *task, void __user *from);
 				&buf[i], label);\
 } while (0)
 
+#define unsafe_copy_fpr_from_user(task, from, label)	do {		\
+	struct task_struct *__t = task;					\
+	u64 __user *__f = (u64 __user *)from;				\
+	u64 buf[ELF_NFPREG];						\
+	int i;								\
+									\
+	unsafe_copy_from_user(buf, __f, ELF_NFPREG * sizeof(double),	\
+				label);					\
+	for (i = 0; i < ELF_NFPREG - 1; i++)				\
+		__t->thread.TS_FPR(i) = buf[i];				\
+	__t->thread.fp_state.fpscr = buf[i];				\
+} while (0)
+
+#define unsafe_copy_vsx_from_user(task, from, label)	do {		\
+	struct task_struct *__t = task;					\
+	u64 __user *__f = (u64 __user *)from;				\
+	u64 buf[ELF_NVSRHALFREG];					\
+	int i;								\
+									\
+	unsafe_copy_from_user(buf, __f,					\
+				ELF_NVSRHALFREG * sizeof(double),	\
+				label);					\
+	for (i = 0; i < ELF_NVSRHALFREG ; i++)				\
+		__t->thread.fp_state.fpr[i][TS_VSRLOWOFFSET] = buf[i];	\
+} while (0)
+
+
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 #define unsafe_copy_ckfpr_to_user(to, task, label)	do {		\
 	struct task_struct *__t = task;					\
@@ -80,6 +107,10 @@ unsigned long copy_ckfpr_from_user(struct task_struct *task, void __user *from);
 	unsafe_copy_to_user(to, (task)->thread.fp_state.fpr,	\
 			    ELF_NFPREG * sizeof(double), label)
 
+#define unsafe_copy_fpr_from_user(task, from, label)			\
+	unsafe_copy_from_user((task)->thread.fp_state.fpr, from,	\
+			    ELF_NFPREG * sizeof(double), label)
+
 static inline unsigned long
 copy_fpr_to_user(void __user *to, struct task_struct *task)
 {
@@ -115,6 +146,8 @@ copy_ckfpr_from_user(struct task_struct *task, void __user *from)
 #else
 #define unsafe_copy_fpr_to_user(to, task, label) do { } while (0)
 
+#define unsafe_copy_fpr_from_user(task, from, label) do { } while (0)
+
 static inline unsigned long
 copy_fpr_to_user(void __user *to, struct task_struct *task)
 {
-- 
2.26.1



* [PATCH v3 3/8] powerpc/signal64: Move non-inline functions out of setup_sigcontext()
  2021-01-09  3:25 [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 2/8] powerpc/signal: Add unsafe_copy_{vsx,fpr}_from_user() Christopher M. Riedl
@ 2021-01-09  3:25 ` Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 4/8] powerpc/signal64: Remove TM ifdefery in middle of if/else block Christopher M. Riedl
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-09  3:25 UTC (permalink / raw)
  To: linuxppc-dev

setup_sigcontext() calls several non-inline functions to save register
state to the thread struct. Such calls should be avoided while the
uaccess window is open, so move them into a separate
prepare_setup_sigcontext() function which must be called before the
uaccess window is opened. This allows the majority of setup_sigcontext()
to execute in an "unsafe" context (uaccess window open); a follow-up
commit converts setup_sigcontext() itself to be "unsafe".
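
A sketch of the resulting call ordering (the concrete call sites appear
in later patches of this series):

	prepare_setup_sigcontext(tsk, 1);	/* may call non-inline flush_*_to_thread() */
	if (!user_write_access_begin(frame, sizeof(*frame)))
		return -EFAULT;
	/* ... unsafe_setup_sigcontext() and other unsafe_* accesses ... */
	user_write_access_end();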

Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
---
 arch/powerpc/kernel/signal_64.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index f9e4a1ac440f..b211a8ea4f6e 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -79,6 +79,24 @@ static elf_vrreg_t __user *sigcontext_vmx_regs(struct sigcontext __user *sc)
 }
 #endif
 
+static void prepare_setup_sigcontext(struct task_struct *tsk, int ctx_has_vsx_region)
+{
+#ifdef CONFIG_ALTIVEC
+	/* save altivec registers */
+	if (tsk->thread.used_vr)
+		flush_altivec_to_thread(tsk);
+	if (cpu_has_feature(CPU_FTR_ALTIVEC))
+		tsk->thread.vrsave = mfspr(SPRN_VRSAVE);
+#endif /* CONFIG_ALTIVEC */
+
+	flush_fp_to_thread(tsk);
+
+#ifdef CONFIG_VSX
+	if (tsk->thread.used_vsr && ctx_has_vsx_region)
+		flush_vsx_to_thread(tsk);
+#endif /* CONFIG_VSX */
+}
+
 /*
  * Set up the sigcontext for the signal frame.
  */
@@ -97,7 +115,6 @@ static long setup_sigcontext(struct sigcontext __user *sc,
 	 */
 #ifdef CONFIG_ALTIVEC
 	elf_vrreg_t __user *v_regs = sigcontext_vmx_regs(sc);
-	unsigned long vrsave;
 #endif
 	struct pt_regs *regs = tsk->thread.regs;
 	unsigned long msr = regs->msr;
@@ -112,7 +129,6 @@ static long setup_sigcontext(struct sigcontext __user *sc,
 
 	/* save altivec registers */
 	if (tsk->thread.used_vr) {
-		flush_altivec_to_thread(tsk);
 		/* Copy 33 vec registers (vr0..31 and vscr) to the stack */
 		err |= __copy_to_user(v_regs, &tsk->thread.vr_state,
 				      33 * sizeof(vector128));
@@ -124,17 +140,10 @@ static long setup_sigcontext(struct sigcontext __user *sc,
 	/* We always copy to/from vrsave, it's 0 if we don't have or don't
 	 * use altivec.
 	 */
-	vrsave = 0;
-	if (cpu_has_feature(CPU_FTR_ALTIVEC)) {
-		vrsave = mfspr(SPRN_VRSAVE);
-		tsk->thread.vrsave = vrsave;
-	}
-
-	err |= __put_user(vrsave, (u32 __user *)&v_regs[33]);
+	err |= __put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);
 #else /* CONFIG_ALTIVEC */
 	err |= __put_user(0, &sc->v_regs);
 #endif /* CONFIG_ALTIVEC */
-	flush_fp_to_thread(tsk);
 	/* copy fpr regs and fpscr */
 	err |= copy_fpr_to_user(&sc->fp_regs, tsk);
 
@@ -150,7 +159,6 @@ static long setup_sigcontext(struct sigcontext __user *sc,
 	 * VMX data.
 	 */
 	if (tsk->thread.used_vsr && ctx_has_vsx_region) {
-		flush_vsx_to_thread(tsk);
 		v_regs += ELF_NVRREG;
 		err |= copy_vsx_to_user(v_regs, tsk);
 		/* set MSR_VSX in the MSR value in the frame to
@@ -655,6 +663,7 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
 		ctx_has_vsx_region = 1;
 
 	if (old_ctx != NULL) {
+		prepare_setup_sigcontext(current, ctx_has_vsx_region);
 		if (!access_ok(old_ctx, ctx_size)
 		    || setup_sigcontext(&old_ctx->uc_mcontext, current, 0, NULL, 0,
 					ctx_has_vsx_region)
@@ -842,6 +851,7 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 #endif
 	{
 		err |= __put_user(0, &frame->uc.uc_link);
+		prepare_setup_sigcontext(tsk, 1);
 		err |= setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,
 					NULL, (unsigned long)ksig->ka.sa.sa_handler,
 					1);
-- 
2.26.1



* [PATCH v3 4/8] powerpc/signal64: Remove TM ifdefery in middle of if/else block
  2021-01-09  3:25 [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP Christopher M. Riedl
                   ` (2 preceding siblings ...)
  2021-01-09  3:25 ` [PATCH v3 3/8] powerpc/signal64: Move non-inline functions out of setup_sigcontext() Christopher M. Riedl
@ 2021-01-09  3:25 ` Christopher M. Riedl
  2021-01-11 13:29   ` Christophe Leroy
  2021-01-09  3:25 ` [PATCH v3 5/8] powerpc/signal64: Replace setup_sigcontext() w/ unsafe_setup_sigcontext() Christopher M. Riedl
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-09  3:25 UTC (permalink / raw)
  To: linuxppc-dev

Rework the messy ifdef breaking up the if-else for TM similar to
commit f1cf4f93de2f ("powerpc/signal32: Remove ifdefery in middle of if/else").

Unlike that commit for ppc32, the ifdef can't be removed entirely since
uc_transact in sigframe depends on CONFIG_PPC_TRANSACTIONAL_MEM.
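
Schematically, the reworked code keeps the braces balanced outside the
ifdef:

	if (MSR_TM_ACTIVE(msr)) {
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
		/* TM-only body; uc_transact only exists when TM is configured */
#endif
	} else {
		/* non-TM path */
	}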

Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
---
 arch/powerpc/kernel/signal_64.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index b211a8ea4f6e..dd3787f67a78 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -710,9 +710,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	struct pt_regs *regs = current_pt_regs();
 	struct ucontext __user *uc = (struct ucontext __user *)regs->gpr[1];
 	sigset_t set;
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	unsigned long msr;
-#endif
 
 	/* Always make any pending restarted system calls return -EINTR */
 	current->restart_block.fn = do_no_restart_syscall;
@@ -762,10 +760,12 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	 * restore_tm_sigcontexts.
 	 */
 	regs->msr &= ~MSR_TS_MASK;
+#endif
 
 	if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
 		goto badframe;
 	if (MSR_TM_ACTIVE(msr)) {
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 		/* We recheckpoint on return. */
 		struct ucontext __user *uc_transact;
 
@@ -778,9 +778,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
 		if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
 					   &uc_transact->uc_mcontext))
 			goto badframe;
-	} else
 #endif
-	{
+	} else {
 		/*
 		 * Fall through, for non-TM restore
 		 *
@@ -818,10 +817,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 	unsigned long newsp = 0;
 	long err = 0;
 	struct pt_regs *regs = tsk->thread.regs;
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	/* Save the thread's msr before get_tm_stackpointer() changes it */
-	unsigned long msr = regs->msr;
-#endif
+	unsigned long msr __maybe_unused = regs->msr;
 
 	frame = get_sigframe(ksig, tsk, sizeof(*frame), 0);
 	if (!access_ok(frame, sizeof(*frame)))
@@ -836,8 +833,9 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 	/* Create the ucontext.  */
 	err |= __put_user(0, &frame->uc.uc_flags);
 	err |= __save_altstack(&frame->uc.uc_stack, regs->gpr[1]);
-#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+
 	if (MSR_TM_ACTIVE(msr)) {
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 		/* The ucontext_t passed to userland points to the second
 		 * ucontext_t (for transactional state) with its uc_link ptr.
 		 */
@@ -847,9 +845,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 					    tsk, ksig->sig, NULL,
 					    (unsigned long)ksig->ka.sa.sa_handler,
 					    msr);
-	} else
 #endif
-	{
+	} else {
 		err |= __put_user(0, &frame->uc.uc_link);
 		prepare_setup_sigcontext(tsk, 1);
 		err |= setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,
-- 
2.26.1



* [PATCH v3 5/8] powerpc/signal64: Replace setup_sigcontext() w/ unsafe_setup_sigcontext()
  2021-01-09  3:25 [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP Christopher M. Riedl
                   ` (3 preceding siblings ...)
  2021-01-09  3:25 ` [PATCH v3 4/8] powerpc/signal64: Remove TM ifdefery in middle of if/else block Christopher M. Riedl
@ 2021-01-09  3:25 ` Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 6/8] powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext() Christopher M. Riedl
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-09  3:25 UTC (permalink / raw)
  To: linuxppc-dev

Previously setup_sigcontext() performed a costly KUAP switch on every
uaccess operation. These repeated uaccess switches cause a significant
drop in signal handling performance.

Rewrite setup_sigcontext() to assume that a userspace write access window
is open. Replace all uaccess functions with their 'unsafe' versions
which avoid the repeated uaccess switches.

Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
---
 arch/powerpc/kernel/signal_64.c | 70 ++++++++++++++++++++-------------
 1 file changed, 43 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index dd3787f67a78..fd3c1ab9a1ea 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -101,9 +101,13 @@ static void prepare_setup_sigcontext(struct task_struct *tsk, int ctx_has_vsx_re
  * Set up the sigcontext for the signal frame.
  */
 
-static long setup_sigcontext(struct sigcontext __user *sc,
-		struct task_struct *tsk, int signr, sigset_t *set,
-		unsigned long handler, int ctx_has_vsx_region)
+#define unsafe_setup_sigcontext(sc, tsk, signr, set, handler,		\
+				ctx_has_vsx_region, e)			\
+	unsafe_op_wrap(__unsafe_setup_sigcontext(sc, tsk, signr, set,	\
+			handler, ctx_has_vsx_region), e)
+static long notrace __unsafe_setup_sigcontext(struct sigcontext __user *sc,
+					struct task_struct *tsk, int signr, sigset_t *set,
+					unsigned long handler, int ctx_has_vsx_region)
 {
 	/* When CONFIG_ALTIVEC is set, we _always_ setup v_regs even if the
 	 * process never used altivec yet (MSR_VEC is zero in pt_regs of
@@ -118,20 +122,19 @@ static long setup_sigcontext(struct sigcontext __user *sc,
 #endif
 	struct pt_regs *regs = tsk->thread.regs;
 	unsigned long msr = regs->msr;
-	long err = 0;
 	/* Force usr to alway see softe as 1 (interrupts enabled) */
 	unsigned long softe = 0x1;
 
 	BUG_ON(tsk != current);
 
 #ifdef CONFIG_ALTIVEC
-	err |= __put_user(v_regs, &sc->v_regs);
+	unsafe_put_user(v_regs, &sc->v_regs, efault_out);
 
 	/* save altivec registers */
 	if (tsk->thread.used_vr) {
 		/* Copy 33 vec registers (vr0..31 and vscr) to the stack */
-		err |= __copy_to_user(v_regs, &tsk->thread.vr_state,
-				      33 * sizeof(vector128));
+		unsafe_copy_to_user(v_regs, &tsk->thread.vr_state,
+				    33 * sizeof(vector128), efault_out);
 		/* set MSR_VEC in the MSR value in the frame to indicate that sc->v_reg)
 		 * contains valid data.
 		 */
@@ -140,12 +143,12 @@ static long setup_sigcontext(struct sigcontext __user *sc,
 	/* We always copy to/from vrsave, it's 0 if we don't have or don't
 	 * use altivec.
 	 */
-	err |= __put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);
+	unsafe_put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33], efault_out);
 #else /* CONFIG_ALTIVEC */
-	err |= __put_user(0, &sc->v_regs);
+	unsafe_put_user(0, &sc->v_regs, efault_out);
 #endif /* CONFIG_ALTIVEC */
 	/* copy fpr regs and fpscr */
-	err |= copy_fpr_to_user(&sc->fp_regs, tsk);
+	unsafe_copy_fpr_to_user(&sc->fp_regs, tsk, efault_out);
 
 	/*
 	 * Clear the MSR VSX bit to indicate there is no valid state attached
@@ -160,24 +163,27 @@ static long setup_sigcontext(struct sigcontext __user *sc,
 	 */
 	if (tsk->thread.used_vsr && ctx_has_vsx_region) {
 		v_regs += ELF_NVRREG;
-		err |= copy_vsx_to_user(v_regs, tsk);
+		unsafe_copy_vsx_to_user(v_regs, tsk, efault_out);
 		/* set MSR_VSX in the MSR value in the frame to
 		 * indicate that sc->vs_reg) contains valid data.
 		 */
 		msr |= MSR_VSX;
 	}
 #endif /* CONFIG_VSX */
-	err |= __put_user(&sc->gp_regs, &sc->regs);
+	unsafe_put_user(&sc->gp_regs, &sc->regs, efault_out);
 	WARN_ON(!FULL_REGS(regs));
-	err |= __copy_to_user(&sc->gp_regs, regs, GP_REGS_SIZE);
-	err |= __put_user(msr, &sc->gp_regs[PT_MSR]);
-	err |= __put_user(softe, &sc->gp_regs[PT_SOFTE]);
-	err |= __put_user(signr, &sc->signal);
-	err |= __put_user(handler, &sc->handler);
+	unsafe_copy_to_user(&sc->gp_regs, regs, GP_REGS_SIZE, efault_out);
+	unsafe_put_user(msr, &sc->gp_regs[PT_MSR], efault_out);
+	unsafe_put_user(softe, &sc->gp_regs[PT_SOFTE], efault_out);
+	unsafe_put_user(signr, &sc->signal, efault_out);
+	unsafe_put_user(handler, &sc->handler, efault_out);
 	if (set != NULL)
-		err |=  __put_user(set->sig[0], &sc->oldmask);
+		unsafe_put_user(set->sig[0], &sc->oldmask, efault_out);
 
-	return err;
+	return 0;
+
+efault_out:
+	return -EFAULT;
 }
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
@@ -664,12 +670,15 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
 
 	if (old_ctx != NULL) {
 		prepare_setup_sigcontext(current, ctx_has_vsx_region);
-		if (!access_ok(old_ctx, ctx_size)
-		    || setup_sigcontext(&old_ctx->uc_mcontext, current, 0, NULL, 0,
-					ctx_has_vsx_region)
-		    || __copy_to_user(&old_ctx->uc_sigmask,
-				      &current->blocked, sizeof(sigset_t)))
+		if (!user_write_access_begin(old_ctx, ctx_size))
 			return -EFAULT;
+
+		unsafe_setup_sigcontext(&old_ctx->uc_mcontext, current, 0, NULL,
+					0, ctx_has_vsx_region, efault_out);
+		unsafe_copy_to_user(&old_ctx->uc_sigmask, &current->blocked,
+				    sizeof(sigset_t), efault_out);
+
+		user_write_access_end();
 	}
 	if (new_ctx == NULL)
 		return 0;
@@ -698,6 +707,10 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
 	/* This returns like rt_sigreturn */
 	set_thread_flag(TIF_RESTOREALL);
 	return 0;
+
+efault_out:
+	user_write_access_end();
+	return -EFAULT;
 }
 
 
@@ -849,9 +862,12 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 	} else {
 		err |= __put_user(0, &frame->uc.uc_link);
 		prepare_setup_sigcontext(tsk, 1);
-		err |= setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,
-					NULL, (unsigned long)ksig->ka.sa.sa_handler,
-					1);
+		if (!user_write_access_begin(frame, sizeof(struct rt_sigframe)))
+			return -EFAULT;
+		err |= __unsafe_setup_sigcontext(&frame->uc.uc_mcontext, tsk,
+						ksig->sig, NULL,
+						(unsigned long)ksig->ka.sa.sa_handler, 1);
+		user_write_access_end();
 	}
 	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
 	if (err)
-- 
2.26.1



* [PATCH v3 6/8] powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext()
  2021-01-09  3:25 [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP Christopher M. Riedl
                   ` (4 preceding siblings ...)
  2021-01-09  3:25 ` [PATCH v3 5/8] powerpc/signal64: Replace setup_sigcontext() w/ unsafe_setup_sigcontext() Christopher M. Riedl
@ 2021-01-09  3:25 ` Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 7/8] powerpc/signal64: Rewrite handle_rt_signal64() to minimise uaccess switches Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 8/8] powerpc/signal64: Rewrite rt_sigreturn() " Christopher M. Riedl
  7 siblings, 0 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-09  3:25 UTC (permalink / raw)
  To: linuxppc-dev

Previously restore_sigcontext() performed a costly KUAP switch on every
uaccess operation. These repeated uaccess switches cause a significant
drop in signal handling performance.

Rewrite restore_sigcontext() to assume that a userspace read access
window is open. Replace all uaccess functions with their 'unsafe'
versions which avoid the repeated uaccess switches.

Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
---
 arch/powerpc/kernel/signal_64.c | 68 ++++++++++++++++++++-------------
 1 file changed, 41 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index fd3c1ab9a1ea..0c7f633b832b 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -326,14 +326,14 @@ static long setup_tm_sigcontexts(struct sigcontext __user *sc,
 /*
  * Restore the sigcontext from the signal frame.
  */
-
-static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
-			      struct sigcontext __user *sc)
+#define unsafe_restore_sigcontext(tsk, set, sig, sc, e) \
+	unsafe_op_wrap(__unsafe_restore_sigcontext(tsk, set, sig, sc), e)
+static long notrace __unsafe_restore_sigcontext(struct task_struct *tsk, sigset_t *set,
+						int sig, struct sigcontext __user *sc)
 {
 #ifdef CONFIG_ALTIVEC
 	elf_vrreg_t __user *v_regs;
 #endif
-	unsigned long err = 0;
 	unsigned long save_r13 = 0;
 	unsigned long msr;
 	struct pt_regs *regs = tsk->thread.regs;
@@ -348,27 +348,28 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
 		save_r13 = regs->gpr[13];
 
 	/* copy the GPRs */
-	err |= __copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr));
-	err |= __get_user(regs->nip, &sc->gp_regs[PT_NIP]);
+	unsafe_copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr),
+			      efault_out);
+	unsafe_get_user(regs->nip, &sc->gp_regs[PT_NIP], efault_out);
 	/* get MSR separately, transfer the LE bit if doing signal return */
-	err |= __get_user(msr, &sc->gp_regs[PT_MSR]);
+	unsafe_get_user(msr, &sc->gp_regs[PT_MSR], efault_out);
 	if (sig)
 		regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
-	err |= __get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3]);
-	err |= __get_user(regs->ctr, &sc->gp_regs[PT_CTR]);
-	err |= __get_user(regs->link, &sc->gp_regs[PT_LNK]);
-	err |= __get_user(regs->xer, &sc->gp_regs[PT_XER]);
-	err |= __get_user(regs->ccr, &sc->gp_regs[PT_CCR]);
+	unsafe_get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3], efault_out);
+	unsafe_get_user(regs->ctr, &sc->gp_regs[PT_CTR], efault_out);
+	unsafe_get_user(regs->link, &sc->gp_regs[PT_LNK], efault_out);
+	unsafe_get_user(regs->xer, &sc->gp_regs[PT_XER], efault_out);
+	unsafe_get_user(regs->ccr, &sc->gp_regs[PT_CCR], efault_out);
 	/* Don't allow userspace to set SOFTE */
 	set_trap_norestart(regs);
-	err |= __get_user(regs->dar, &sc->gp_regs[PT_DAR]);
-	err |= __get_user(regs->dsisr, &sc->gp_regs[PT_DSISR]);
-	err |= __get_user(regs->result, &sc->gp_regs[PT_RESULT]);
+	unsafe_get_user(regs->dar, &sc->gp_regs[PT_DAR], efault_out);
+	unsafe_get_user(regs->dsisr, &sc->gp_regs[PT_DSISR], efault_out);
+	unsafe_get_user(regs->result, &sc->gp_regs[PT_RESULT], efault_out);
 
 	if (!sig)
 		regs->gpr[13] = save_r13;
 	if (set != NULL)
-		err |=  __get_user(set->sig[0], &sc->oldmask);
+		unsafe_get_user(set->sig[0], &sc->oldmask, efault_out);
 
 	/*
 	 * Force reload of FP/VEC.
@@ -378,29 +379,28 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
 	regs->msr &= ~(MSR_FP | MSR_FE0 | MSR_FE1 | MSR_VEC | MSR_VSX);
 
 #ifdef CONFIG_ALTIVEC
-	err |= __get_user(v_regs, &sc->v_regs);
-	if (err)
-		return err;
+	unsafe_get_user(v_regs, &sc->v_regs, efault_out);
 	if (v_regs && !access_ok(v_regs, 34 * sizeof(vector128)))
 		return -EFAULT;
 	/* Copy 33 vec registers (vr0..31 and vscr) from the stack */
 	if (v_regs != NULL && (msr & MSR_VEC) != 0) {
-		err |= __copy_from_user(&tsk->thread.vr_state, v_regs,
-					33 * sizeof(vector128));
+		unsafe_copy_from_user(&tsk->thread.vr_state, v_regs,
+				      33 * sizeof(vector128), efault_out);
 		tsk->thread.used_vr = true;
 	} else if (tsk->thread.used_vr) {
 		memset(&tsk->thread.vr_state, 0, 33 * sizeof(vector128));
 	}
 	/* Always get VRSAVE back */
 	if (v_regs != NULL)
-		err |= __get_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);
+		unsafe_get_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33],
+				efault_out);
 	else
 		tsk->thread.vrsave = 0;
 	if (cpu_has_feature(CPU_FTR_ALTIVEC))
 		mtspr(SPRN_VRSAVE, tsk->thread.vrsave);
 #endif /* CONFIG_ALTIVEC */
 	/* restore floating point */
-	err |= copy_fpr_from_user(tsk, &sc->fp_regs);
+	unsafe_copy_fpr_from_user(tsk, &sc->fp_regs, efault_out);
 #ifdef CONFIG_VSX
 	/*
 	 * Get additional VSX data. Update v_regs to point after the
@@ -409,14 +409,17 @@ static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int sig,
 	 */
 	v_regs += ELF_NVRREG;
 	if ((msr & MSR_VSX) != 0) {
-		err |= copy_vsx_from_user(tsk, v_regs);
+		unsafe_copy_vsx_from_user(tsk, v_regs, efault_out);
 		tsk->thread.used_vsr = true;
 	} else {
 		for (i = 0; i < 32 ; i++)
 			tsk->thread.fp_state.fpr[i][TS_VSRLOWOFFSET] = 0;
 	}
 #endif
-	return err;
+	return 0;
+
+efault_out:
+	return -EFAULT;
 }
 
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
@@ -701,8 +704,14 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, old_ctx,
 	if (__copy_from_user(&set, &new_ctx->uc_sigmask, sizeof(set)))
 		do_exit(SIGSEGV);
 	set_current_blocked(&set);
-	if (restore_sigcontext(current, NULL, 0, &new_ctx->uc_mcontext))
+
+	if (!user_read_access_begin(new_ctx, ctx_size))
+		return -EFAULT;
+	if (__unsafe_restore_sigcontext(current, NULL, 0, &new_ctx->uc_mcontext)) {
+		user_read_access_end();
 		do_exit(SIGSEGV);
+	}
+	user_read_access_end();
 
 	/* This returns like rt_sigreturn */
 	set_thread_flag(TIF_RESTOREALL);
@@ -806,8 +815,13 @@ SYSCALL_DEFINE0(rt_sigreturn)
 		 * causing a TM bad thing.
 		 */
 		current->thread.regs->msr &= ~MSR_TS_MASK;
-		if (restore_sigcontext(current, NULL, 1, &uc->uc_mcontext))
+		if (!user_read_access_begin(uc, sizeof(*uc)))
+			return -EFAULT;
+		if (__unsafe_restore_sigcontext(current, NULL, 1, &uc->uc_mcontext)) {
+			user_read_access_end();
 			goto badframe;
+		}
+		user_read_access_end();
 	}
 
 	if (restore_altstack(&uc->uc_stack))
-- 
2.26.1



* [PATCH v3 7/8] powerpc/signal64: Rewrite handle_rt_signal64() to minimise uaccess switches
  2021-01-09  3:25 [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP Christopher M. Riedl
                   ` (5 preceding siblings ...)
  2021-01-09  3:25 ` [PATCH v3 6/8] powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext() Christopher M. Riedl
@ 2021-01-09  3:25 ` Christopher M. Riedl
  2021-01-09  3:25 ` [PATCH v3 8/8] powerpc/signal64: Rewrite rt_sigreturn() " Christopher M. Riedl
  7 siblings, 0 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-09  3:25 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Daniel Axtens

From: Daniel Axtens <dja@axtens.net>

Add uaccess blocks and use the 'unsafe' versions of functions doing user
access where possible to reduce the number of times uaccess has to be
opened/closed.

There is no 'unsafe' version of copy_siginfo_to_user, so move it
slightly to allow for a "longer" uaccess block.
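
Schematically (condensed from the diff below), copy_siginfo_to_user(),
which does its own user access handling internally, simply runs after
the unsafe block:

	/* ... unsafe_put_user()/unsafe_copy_to_user() ... */
	user_write_access_end();

	if (copy_siginfo_to_user(&frame->info, &ksig->info))
		goto badframe;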

Signed-off-by: Daniel Axtens <dja@axtens.net>
Co-developed-by: Christopher M. Riedl <cmr@codefail.de>
Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
---
 arch/powerpc/kernel/signal_64.c | 54 +++++++++++++++++++++------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 0c7f633b832b..7e62d62ef482 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -848,44 +848,51 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 	unsigned long msr __maybe_unused = regs->msr;
 
 	frame = get_sigframe(ksig, tsk, sizeof(*frame), 0);
-	if (!access_ok(frame, sizeof(*frame)))
-		goto badframe;
 
-	err |= __put_user(&frame->info, &frame->pinfo);
-	err |= __put_user(&frame->uc, &frame->puc);
-	err |= copy_siginfo_to_user(&frame->info, &ksig->info);
-	if (err)
+	/* This only applies when calling unsafe_setup_sigcontext() and must be
+	 * called before opening the uaccess window.
+	 */
+	if (!MSR_TM_ACTIVE(msr))
+		prepare_setup_sigcontext(tsk, 1);
+
+	if (!user_write_access_begin(frame, sizeof(*frame)))
 		goto badframe;
 
+	unsafe_put_user(&frame->info, &frame->pinfo, badframe_block);
+	unsafe_put_user(&frame->uc, &frame->puc, badframe_block);
+
 	/* Create the ucontext.  */
-	err |= __put_user(0, &frame->uc.uc_flags);
-	err |= __save_altstack(&frame->uc.uc_stack, regs->gpr[1]);
+	unsafe_put_user(0, &frame->uc.uc_flags, badframe_block);
+	unsafe_save_altstack(&frame->uc.uc_stack, regs->gpr[1], badframe_block);
 
 	if (MSR_TM_ACTIVE(msr)) {
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 		/* The ucontext_t passed to userland points to the second
 		 * ucontext_t (for transactional state) with its uc_link ptr.
 		 */
-		err |= __put_user(&frame->uc_transact, &frame->uc.uc_link);
+		unsafe_put_user(&frame->uc_transact, &frame->uc.uc_link, badframe_block);
+
+		user_write_access_end();
+
 		err |= setup_tm_sigcontexts(&frame->uc.uc_mcontext,
 					    &frame->uc_transact.uc_mcontext,
 					    tsk, ksig->sig, NULL,
 					    (unsigned long)ksig->ka.sa.sa_handler,
 					    msr);
+
+		if (!user_write_access_begin(frame, sizeof(struct rt_sigframe)))
+			goto badframe;
+
 #endif
 	} else {
-		err |= __put_user(0, &frame->uc.uc_link);
-		prepare_setup_sigcontext(tsk, 1);
-		if (!user_write_access_begin(frame, sizeof(struct rt_sigframe)))
-			return -EFAULT;
-		err |= __unsafe_setup_sigcontext(&frame->uc.uc_mcontext, tsk,
-						ksig->sig, NULL,
-						(unsigned long)ksig->ka.sa.sa_handler, 1);
-		user_write_access_end();
+		unsafe_put_user(0, &frame->uc.uc_link, badframe_block);
+		unsafe_setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,
+					NULL, (unsigned long)ksig->ka.sa.sa_handler,
+					1, badframe_block);
 	}
-	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
-	if (err)
-		goto badframe;
+
+	unsafe_copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set), badframe_block);
+	user_write_access_end();
 
 	/* Make sure signal handler doesn't get spurious FP exceptions */
 	tsk->thread.fp_state.fpscr = 0;
@@ -900,6 +907,11 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 		regs->nip = (unsigned long) &frame->tramp[0];
 	}
 
+
+	/* Save the siginfo outside of the unsafe block. */
+	if (copy_siginfo_to_user(&frame->info, &ksig->info))
+		goto badframe;
+
 	/* Allocate a dummy caller frame for the signal handler. */
 	newsp = ((unsigned long)frame) - __SIGNAL_FRAMESIZE;
 	err |= put_user(regs->gpr[1], (unsigned long __user *)newsp);
@@ -939,6 +951,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
 
 	return 0;
 
+badframe_block:
+	user_write_access_end();
 badframe:
 	signal_fault(current, regs, "handle_rt_signal64", frame);
 
-- 
2.26.1



* [PATCH v3 8/8] powerpc/signal64: Rewrite rt_sigreturn() to minimise uaccess switches
  2021-01-09  3:25 [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP Christopher M. Riedl
                   ` (6 preceding siblings ...)
  2021-01-09  3:25 ` [PATCH v3 7/8] powerpc/signal64: Rewrite handle_rt_signal64() to minimise uaccess switches Christopher M. Riedl
@ 2021-01-09  3:25 ` Christopher M. Riedl
  7 siblings, 0 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-09  3:25 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Daniel Axtens

From: Daniel Axtens <dja@axtens.net>

Add uaccess blocks and use the 'unsafe' versions of functions doing user
access where possible to reduce the number of times uaccess has to be
opened/closed.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Co-developed-by: Christopher M. Riedl <cmr@codefail.de>
Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
---
 arch/powerpc/kernel/signal_64.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
index 7e62d62ef482..1b1e39663999 100644
--- a/arch/powerpc/kernel/signal_64.c
+++ b/arch/powerpc/kernel/signal_64.c
@@ -784,8 +784,11 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	regs->msr &= ~MSR_TS_MASK;
 #endif
 
-	if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
+	if (!user_read_access_begin(uc, sizeof(*uc)))
 		goto badframe;
+
+	unsafe_get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR], badframe_block);
+
 	if (MSR_TM_ACTIVE(msr)) {
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 		/* We recheckpoint on return. */
@@ -793,10 +796,12 @@ SYSCALL_DEFINE0(rt_sigreturn)
 
 		/* Trying to start TM on non TM system */
 		if (!cpu_has_feature(CPU_FTR_TM))
-			goto badframe;
+			goto badframe_block;
+
+		unsafe_get_user(uc_transact, &uc->uc_link, badframe_block);
+
+		user_read_access_end();
 
-		if (__get_user(uc_transact, &uc->uc_link))
-			goto badframe;
 		if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
 					   &uc_transact->uc_mcontext))
 			goto badframe;
@@ -815,12 +820,9 @@ SYSCALL_DEFINE0(rt_sigreturn)
 		 * causing a TM bad thing.
 		 */
 		current->thread.regs->msr &= ~MSR_TS_MASK;
-		if (!user_read_access_begin(uc, sizeof(*uc)))
-			return -EFAULT;
-		if (__unsafe_restore_sigcontext(current, NULL, 1, &uc->uc_mcontext)) {
-			user_read_access_end();
-			goto badframe;
-		}
+		unsafe_restore_sigcontext(current, NULL, 1, &uc->uc_mcontext,
+					  badframe_block);
+
 		user_read_access_end();
 	}
 
@@ -830,6 +832,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	set_thread_flag(TIF_RESTOREALL);
 	return 0;
 
+badframe_block:
+	user_read_access_end();
 badframe:
 	signal_fault(current, regs, "rt_sigreturn", uc);
 
-- 
2.26.1



* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-09  3:25 ` [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user Christopher M. Riedl
@ 2021-01-11 13:22   ` Christophe Leroy
  2021-01-17 17:19     ` Christopher M. Riedl
  0 siblings, 1 reply; 20+ messages in thread
From: Christophe Leroy @ 2021-01-11 13:22 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev



On 09/01/2021 at 04:25, Christopher M. Riedl wrote:
> Implement raw_copy_from_user_allowed() which assumes that userspace read
> access is open. Use this new function to implement raw_copy_from_user().
> Finally, wrap the new function to follow the usual "unsafe_" convention
> of taking a label argument.

I think there is no point implementing raw_copy_from_user_allowed(), see 
https://github.com/linuxppc/linux/commit/4b842e4e25b1 and 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/

You should simply do:

	#define unsafe_copy_from_user(d, s, l, e) \
		unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)


Christophe

> 
> The new raw_copy_from_user_allowed() calls non-inline __copy_tofrom_user()
> internally. This is still safe to call inside user access blocks formed
> with user_*_access_begin()/user_*_access_end() since asm functions are not
> instrumented for tracing.
> 
> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> ---
>   arch/powerpc/include/asm/uaccess.h | 28 +++++++++++++++++++---------
>   1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> index 501c9a79038c..698f3a6d6ae5 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -403,38 +403,45 @@ raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)
>   }
>   #endif /* __powerpc64__ */
>   
> -static inline unsigned long raw_copy_from_user(void *to,
> -		const void __user *from, unsigned long n)
> +static inline unsigned long
> +raw_copy_from_user_allowed(void *to, const void __user *from, unsigned long n)
>   {
> -	unsigned long ret;
>   	if (__builtin_constant_p(n) && (n <= 8)) {
> -		ret = 1;
> +		unsigned long ret = 1;
>   
>   		switch (n) {
>   		case 1:
>   			barrier_nospec();
> -			__get_user_size(*(u8 *)to, from, 1, ret);
> +			__get_user_size_allowed(*(u8 *)to, from, 1, ret);
>   			break;
>   		case 2:
>   			barrier_nospec();
> -			__get_user_size(*(u16 *)to, from, 2, ret);
> +			__get_user_size_allowed(*(u16 *)to, from, 2, ret);
>   			break;
>   		case 4:
>   			barrier_nospec();
> -			__get_user_size(*(u32 *)to, from, 4, ret);
> +			__get_user_size_allowed(*(u32 *)to, from, 4, ret);
>   			break;
>   		case 8:
>   			barrier_nospec();
> -			__get_user_size(*(u64 *)to, from, 8, ret);
> +			__get_user_size_allowed(*(u64 *)to, from, 8, ret);
>   			break;
>   		}
>   		if (ret == 0)
>   			return 0;
>   	}
>   
> +	return __copy_tofrom_user((__force void __user *)to, from, n);
> +}
> +
> +static inline unsigned long
> +raw_copy_from_user(void *to, const void __user *from, unsigned long n)
> +{
> +	unsigned long ret;
> +
>   	barrier_nospec();
>   	allow_read_from_user(from, n);
> -	ret = __copy_tofrom_user((__force void __user *)to, from, n);
> +	ret = raw_copy_from_user_allowed(to, from, n);
>   	prevent_read_from_user(from, n);
>   	return ret;
>   }
> @@ -542,6 +549,9 @@ user_write_access_begin(const void __user *ptr, size_t len)
>   #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
>   #define unsafe_put_user(x, p, e) __put_user_goto(x, p, e)
>   
> +#define unsafe_copy_from_user(d, s, l, e) \
> +	unsafe_op_wrap(raw_copy_from_user_allowed(d, s, l), e)
> +
>   #define unsafe_copy_to_user(d, s, l, e) \
>   do {									\
>   	u8 __user *_dst = (u8 __user *)(d);				\
> 


* Re: [PATCH v3 4/8] powerpc/signal64: Remove TM ifdefery in middle of if/else block
  2021-01-09  3:25 ` [PATCH v3 4/8] powerpc/signal64: Remove TM ifdefery in middle of if/else block Christopher M. Riedl
@ 2021-01-11 13:29   ` Christophe Leroy
  2021-01-17 17:16     ` Christopher M. Riedl
  0 siblings, 1 reply; 20+ messages in thread
From: Christophe Leroy @ 2021-01-11 13:29 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev



On 09/01/2021 at 04:25, Christopher M. Riedl wrote:
> Rework the messy ifdef breaking up the if-else for TM similar to
> commit f1cf4f93de2f ("powerpc/signal32: Remove ifdefery in middle of if/else").
> 
> Unlike that commit for ppc32, the ifdef can't be removed entirely since
> uc_transact in sigframe depends on CONFIG_PPC_TRANSACTIONAL_MEM.
> 
> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> ---
>   arch/powerpc/kernel/signal_64.c | 17 +++++++----------
>   1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> index b211a8ea4f6e..dd3787f67a78 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -710,9 +710,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
>   	struct pt_regs *regs = current_pt_regs();
>   	struct ucontext __user *uc = (struct ucontext __user *)regs->gpr[1];
>   	sigset_t set;
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>   	unsigned long msr;
> -#endif
>   
>   	/* Always make any pending restarted system calls return -EINTR */
>   	current->restart_block.fn = do_no_restart_syscall;
> @@ -762,10 +760,12 @@ SYSCALL_DEFINE0(rt_sigreturn)
>   	 * restore_tm_sigcontexts.
>   	 */
>   	regs->msr &= ~MSR_TS_MASK;
> +#endif
>   
>   	if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
>   		goto badframe;

This means you are doing that __get_user() even when msr is not used. That should be avoided.

>   	if (MSR_TM_ACTIVE(msr)) {
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>   		/* We recheckpoint on return. */
>   		struct ucontext __user *uc_transact;
>   
> @@ -778,9 +778,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
>   		if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
>   					   &uc_transact->uc_mcontext))
>   			goto badframe;
> -	} else
>   #endif
> -	{
> +	} else {
>   		/*
>   		 * Fall through, for non-TM restore
>   		 *
> @@ -818,10 +817,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
>   	unsigned long newsp = 0;
>   	long err = 0;
>   	struct pt_regs *regs = tsk->thread.regs;
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>   	/* Save the thread's msr before get_tm_stackpointer() changes it */
> -	unsigned long msr = regs->msr;
> -#endif
> +	unsigned long msr __maybe_unused = regs->msr;

I don't think __maybe_unused() is the right solution.

I think MSR_TM_ACTIVE() should be fixed instead, either by changing it into a static inline 
function, or doing something similar to 
https://github.com/linuxppc/linux/commit/05a4ab823983d9136a460b7b5e0d49ee709a6f86

>   
>   	frame = get_sigframe(ksig, tsk, sizeof(*frame), 0);
>   	if (!access_ok(frame, sizeof(*frame)))
> @@ -836,8 +833,9 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
>   	/* Create the ucontext.  */
>   	err |= __put_user(0, &frame->uc.uc_flags);
>   	err |= __save_altstack(&frame->uc.uc_stack, regs->gpr[1]);
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> +
>   	if (MSR_TM_ACTIVE(msr)) {
> +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>   		/* The ucontext_t passed to userland points to the second
>   		 * ucontext_t (for transactional state) with its uc_link ptr.
>   		 */
> @@ -847,9 +845,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
>   					    tsk, ksig->sig, NULL,
>   					    (unsigned long)ksig->ka.sa.sa_handler,
>   					    msr);
> -	} else
>   #endif
> -	{
> +	} else {
>   		err |= __put_user(0, &frame->uc.uc_link);
>   		prepare_setup_sigcontext(tsk, 1);
>   		err |= setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,
> 

Christophe


* Re: [PATCH v3 4/8] powerpc/signal64: Remove TM ifdefery in middle of if/else block
  2021-01-11 13:29   ` Christophe Leroy
@ 2021-01-17 17:16     ` Christopher M. Riedl
  0 siblings, 0 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-17 17:16 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

On Mon Jan 11, 2021 at 7:29 AM CST, Christophe Leroy wrote:
>
>
> On 09/01/2021 at 04:25, Christopher M. Riedl wrote:
> > Rework the messy ifdef breaking up the if-else for TM similar to
> > commit f1cf4f93de2f ("powerpc/signal32: Remove ifdefery in middle of if/else").
> > 
> > Unlike that commit for ppc32, the ifdef can't be removed entirely since
> > uc_transact in sigframe depends on CONFIG_PPC_TRANSACTIONAL_MEM.
> > 
> > Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> > ---
> >   arch/powerpc/kernel/signal_64.c | 17 +++++++----------
> >   1 file changed, 7 insertions(+), 10 deletions(-)
> > 
> > diff --git a/arch/powerpc/kernel/signal_64.c b/arch/powerpc/kernel/signal_64.c
> > index b211a8ea4f6e..dd3787f67a78 100644
> > --- a/arch/powerpc/kernel/signal_64.c
> > +++ b/arch/powerpc/kernel/signal_64.c
> > @@ -710,9 +710,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >   	struct pt_regs *regs = current_pt_regs();
> >   	struct ucontext __user *uc = (struct ucontext __user *)regs->gpr[1];
> >   	sigset_t set;
> > -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> >   	unsigned long msr;
> > -#endif
> >   
> >   	/* Always make any pending restarted system calls return -EINTR */
> >   	current->restart_block.fn = do_no_restart_syscall;
> > @@ -762,10 +760,12 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >   	 * restore_tm_sigcontexts.
> >   	 */
> >   	regs->msr &= ~MSR_TS_MASK;
> > +#endif
> >   
> >   	if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
> >   		goto badframe;
>
> This means you are doing that __get_user() even when msr is not used.
> That should be avoided.
>

Thanks, I moved it into the #ifdef block right above it instead for the
next spin.

> >   	if (MSR_TM_ACTIVE(msr)) {
> > +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> >   		/* We recheckpoint on return. */
> >   		struct ucontext __user *uc_transact;
> >   
> > @@ -778,9 +778,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >   		if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
> >   					   &uc_transact->uc_mcontext))
> >   			goto badframe;
> > -	} else
> >   #endif
> > -	{
> > +	} else {
> >   		/*
> >   		 * Fall through, for non-TM restore
> >   		 *
> > @@ -818,10 +817,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
> >   	unsigned long newsp = 0;
> >   	long err = 0;
> >   	struct pt_regs *regs = tsk->thread.regs;
> > -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> >   	/* Save the thread's msr before get_tm_stackpointer() changes it */
> > -	unsigned long msr = regs->msr;
> > -#endif
> > +	unsigned long msr __maybe_unused = regs->msr;
>
> I don't think __maybe_unused() is the right solution.
>
> I think MSR_TM_ACTIVE() should be fixed instead, either by changing it
> into a static inline
> function, or doing something similar to
> https://github.com/linuxppc/linux/commit/05a4ab823983d9136a460b7b5e0d49ee709a6f86
>

Agreed, I'll change MSR_TM_ACTIVE() to reference its argument in the
macro. This keeps it consistent with all the other MSR_TM_* macros in
reg.h. Probably better than changing it to static inline since that
would mean changing all the macros too, which seems unnecessary.
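
For instance, one possible shape of that change (a sketch only; the
actual definitions in reg.h may differ):

	#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
	#define MSR_TM_ACTIVE(x)	(((x) & MSR_TS_MASK) != 0)
	#else
	#define MSR_TM_ACTIVE(x)	((void)(x), 0)
	#endif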

> >   
> >   	frame = get_sigframe(ksig, tsk, sizeof(*frame), 0);
> >   	if (!access_ok(frame, sizeof(*frame)))
> > @@ -836,8 +833,9 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
> >   	/* Create the ucontext.  */
> >   	err |= __put_user(0, &frame->uc.uc_flags);
> >   	err |= __save_altstack(&frame->uc.uc_stack, regs->gpr[1]);
> > -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> > +
> >   	if (MSR_TM_ACTIVE(msr)) {
> > +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> >   		/* The ucontext_t passed to userland points to the second
> >   		 * ucontext_t (for transactional state) with its uc_link ptr.
> >   		 */
> > @@ -847,9 +845,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t *set,
> >   					    tsk, ksig->sig, NULL,
> >   					    (unsigned long)ksig->ka.sa.sa_handler,
> >   					    msr);
> > -	} else
> >   #endif
> > -	{
> > +	} else {
> >   		err |= __put_user(0, &frame->uc.uc_link);
> >   		prepare_setup_sigcontext(tsk, 1);
> >   		err |= setup_sigcontext(&frame->uc.uc_mcontext, tsk, ksig->sig,
> > 
>
> Christophe



* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-11 13:22   ` Christophe Leroy
@ 2021-01-17 17:19     ` Christopher M. Riedl
  2021-01-19  2:11       ` Michael Ellerman
  2021-01-19  7:29       ` Christophe Leroy
  0 siblings, 2 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-17 17:19 UTC (permalink / raw)
  To: Christophe Leroy, linuxppc-dev

On Mon Jan 11, 2021 at 7:22 AM CST, Christophe Leroy wrote:
>
>
> On 09/01/2021 at 04:25, Christopher M. Riedl wrote:
> > Implement raw_copy_from_user_allowed() which assumes that userspace read
> > access is open. Use this new function to implement raw_copy_from_user().
> > Finally, wrap the new function to follow the usual "unsafe_" convention
> > of taking a label argument.
>
> I think there is no point implementing raw_copy_from_user_allowed(), see
> https://github.com/linuxppc/linux/commit/4b842e4e25b1 and
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/
>
> You should simply do:
>
> #define unsafe_copy_from_user(d, s, l, e) \
> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
>

I gave this a try and the signal ops decreased by ~8K. Now, to be
honest, I am not sure what an "acceptable" benchmark number here
actually is - so maybe this is ok? Same loss with both radix and hash:

	|                                      | hash   | radix  |
	| ------------------------------------ | ------ | ------ |
	| linuxppc/next                        | 118693 | 133296 |
	| linuxppc/next w/o KUAP+KUEP          | 228911 | 228654 |
	| unsafe-signal64                      | 200480 | 234067 |
	| unsafe-signal64 (__copy_tofrom_user) | 192467 | 225119 |

To put this into perspective, prior to KUAP and uaccess flush, signal
performance in this benchmark was ~290K on hash.

>
> Christophe
>
> > 
> > The new raw_copy_from_user_allowed() calls non-inline __copy_tofrom_user()
> > internally. This is still safe to call inside user access blocks formed
> > with user_*_access_begin()/user_*_access_end() since asm functions are not
> > instrumented for tracing.
> > 
> > Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
> > ---
> >   arch/powerpc/include/asm/uaccess.h | 28 +++++++++++++++++++---------
> >   1 file changed, 19 insertions(+), 9 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
> > index 501c9a79038c..698f3a6d6ae5 100644
> > --- a/arch/powerpc/include/asm/uaccess.h
> > +++ b/arch/powerpc/include/asm/uaccess.h
> > @@ -403,38 +403,45 @@ raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)
> >   }
> >   #endif /* __powerpc64__ */
> >   
> > -static inline unsigned long raw_copy_from_user(void *to,
> > -		const void __user *from, unsigned long n)
> > +static inline unsigned long
> > +raw_copy_from_user_allowed(void *to, const void __user *from, unsigned long n)
> >   {
> > -	unsigned long ret;
> >   	if (__builtin_constant_p(n) && (n <= 8)) {
> > -		ret = 1;
> > +		unsigned long ret = 1;
> >   
> >   		switch (n) {
> >   		case 1:
> >   			barrier_nospec();
> > -			__get_user_size(*(u8 *)to, from, 1, ret);
> > +			__get_user_size_allowed(*(u8 *)to, from, 1, ret);
> >   			break;
> >   		case 2:
> >   			barrier_nospec();
> > -			__get_user_size(*(u16 *)to, from, 2, ret);
> > +			__get_user_size_allowed(*(u16 *)to, from, 2, ret);
> >   			break;
> >   		case 4:
> >   			barrier_nospec();
> > -			__get_user_size(*(u32 *)to, from, 4, ret);
> > +			__get_user_size_allowed(*(u32 *)to, from, 4, ret);
> >   			break;
> >   		case 8:
> >   			barrier_nospec();
> > -			__get_user_size(*(u64 *)to, from, 8, ret);
> > +			__get_user_size_allowed(*(u64 *)to, from, 8, ret);
> >   			break;
> >   		}
> >   		if (ret == 0)
> >   			return 0;
> >   	}
> >   
> > +	return __copy_tofrom_user((__force void __user *)to, from, n);
> > +}
> > +
> > +static inline unsigned long
> > +raw_copy_from_user(void *to, const void __user *from, unsigned long n)
> > +{
> > +	unsigned long ret;
> > +
> >   	barrier_nospec();
> >   	allow_read_from_user(from, n);
> > -	ret = __copy_tofrom_user((__force void __user *)to, from, n);
> > +	ret = raw_copy_from_user_allowed(to, from, n);
> >   	prevent_read_from_user(from, n);
> >   	return ret;
> >   }
> > @@ -542,6 +549,9 @@ user_write_access_begin(const void __user *ptr, size_t len)
> >   #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
> >   #define unsafe_put_user(x, p, e) __put_user_goto(x, p, e)
> >   
> > +#define unsafe_copy_from_user(d, s, l, e) \
> > +	unsafe_op_wrap(raw_copy_from_user_allowed(d, s, l), e)
> > +
> >   #define unsafe_copy_to_user(d, s, l, e) \
> >   do {									\
> >   	u8 __user *_dst = (u8 __user *)(d);				\
> > 


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-17 17:19     ` Christopher M. Riedl
@ 2021-01-19  2:11       ` Michael Ellerman
  2021-01-19 12:33         ` Christophe Leroy
  2021-02-09 14:09         ` Christophe Leroy
  2021-01-19  7:29       ` Christophe Leroy
  1 sibling, 2 replies; 20+ messages in thread
From: Michael Ellerman @ 2021-01-19  2:11 UTC (permalink / raw)
  To: Christopher M. Riedl, Christophe Leroy, linuxppc-dev

"Christopher M. Riedl" <cmr@codefail.de> writes:
> On Mon Jan 11, 2021 at 7:22 AM CST, Christophe Leroy wrote:
>> Le 09/01/2021 à 04:25, Christopher M. Riedl a écrit :
>> > Implement raw_copy_from_user_allowed() which assumes that userspace read
>> > access is open. Use this new function to implement raw_copy_from_user().
>> > Finally, wrap the new function to follow the usual "unsafe_" convention
>> > of taking a label argument.
>>
>> I think there is no point implementing raw_copy_from_user_allowed(), see
>> https://github.com/linuxppc/linux/commit/4b842e4e25b1 and
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/
>>
>> You should simply do:
>>
>> #define unsafe_copy_from_user(d, s, l, e) \
>> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
>>
>
> I gave this a try and the signal ops decreased by ~8K. Now, to be
> honest, I am not sure what an "acceptable" benchmark number here
> actually is - so maybe this is ok? Same loss with both radix and hash:
>
> 	|                                      | hash   | radix  |
> 	| ------------------------------------ | ------ | ------ |
> 	| linuxppc/next                        | 118693 | 133296 |
> 	| linuxppc/next w/o KUAP+KUEP          | 228911 | 228654 |
> 	| unsafe-signal64                      | 200480 | 234067 |
> 	| unsafe-signal64 (__copy_tofrom_user) | 192467 | 225119 |
>
> To put this into perspective, prior to KUAP and uaccess flush, signal
> performance in this benchmark was ~290K on hash.

If I'm doing the math right 8K is ~4% of the best number.

It seems like 4% is worth a few lines of code to handle these constant
sizes. It's not like we have performance to throw away.

Or, we should chase down where the call sites are that are doing small
constant copies with copy_to/from_user() and change them to use
get/put_user().
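
For example (made-up names here; uptr is some u64 __user pointer and val a
u64, not a specific call site in this series), that conversion looks roughly
like:

	/* before: constant 8-byte copy going through the generic copy path */
	if (copy_from_user(&val, uptr, sizeof(val)))
		return -EFAULT;

	/* after: get_user() opens/closes user access itself and uses the
	 * fixed-size 8-byte accessor directly */
	if (get_user(val, uptr))
		return -EFAULT;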

cheers

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-17 17:19     ` Christopher M. Riedl
  2021-01-19  2:11       ` Michael Ellerman
@ 2021-01-19  7:29       ` Christophe Leroy
  1 sibling, 0 replies; 20+ messages in thread
From: Christophe Leroy @ 2021-01-19  7:29 UTC (permalink / raw)
  To: Christopher M. Riedl, linuxppc-dev



Le 17/01/2021 à 18:19, Christopher M. Riedl a écrit :
> On Mon Jan 11, 2021 at 7:22 AM CST, Christophe Leroy wrote:
>>
>>
>> Le 09/01/2021 à 04:25, Christopher M. Riedl a écrit :
>>> Implement raw_copy_from_user_allowed() which assumes that userspace read
>>> access is open. Use this new function to implement raw_copy_from_user().
>>> Finally, wrap the new function to follow the usual "unsafe_" convention
>>> of taking a label argument.
>>
>> I think there is no point implementing raw_copy_from_user_allowed(), see
>> https://github.com/linuxppc/linux/commit/4b842e4e25b1 and
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/
>>
>> You should simply do:
>>
>> #define unsafe_copy_from_user(d, s, l, e) \
>> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
>>
> 
> I gave this a try and the signal ops decreased by ~8K. Now, to be
> honest, I am not sure what an "acceptable" benchmark number here
> actually is - so maybe this is ok? Same loss with both radix and hash:

I don't think this is ok, but it probably means that you are using unsafe_copy_from_user() to copy 
small constant-size data that should be copied with unsafe_get_user() instead.
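
As a rough sketch (hypothetical names, not an actual call site from this
series), inside an already-open user access window that means turning:

	unsafe_copy_from_user(&msr, &frame->regs[PT_MSR], sizeof(msr), efault);

into:

	unsafe_get_user(msr, &frame->regs[PT_MSR], efault);

so the constant 8-byte read uses the fixed-size accessor instead of the
generic copy routine.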

> 
> 	|                                      | hash   | radix  |
> 	| ------------------------------------ | ------ | ------ |
> 	| linuxppc/next                        | 118693 | 133296 |
> 	| linuxppc/next w/o KUAP+KUEP          | 228911 | 228654 |
> 	| unsafe-signal64                      | 200480 | 234067 |
> 	| unsafe-signal64 (__copy_tofrom_user) | 192467 | 225119 |
> 
> To put this into perspective, prior to KUAP and uaccess flush, signal
> performance in this benchmark was ~290K on hash.
> 
>>
>> Christophe
>>
>>>
>>> The new raw_copy_from_user_allowed() calls non-inline __copy_tofrom_user()
>>> internally. This is still safe to call inside user access blocks formed
>>> with user_*_access_begin()/user_*_access_end() since asm functions are not
>>> instrumented for tracing.
>>>
>>> Signed-off-by: Christopher M. Riedl <cmr@codefail.de>
>>> ---
>>>    arch/powerpc/include/asm/uaccess.h | 28 +++++++++++++++++++---------
>>>    1 file changed, 19 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h
>>> index 501c9a79038c..698f3a6d6ae5 100644
>>> --- a/arch/powerpc/include/asm/uaccess.h
>>> +++ b/arch/powerpc/include/asm/uaccess.h
>>> @@ -403,38 +403,45 @@ raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)
>>>    }
>>>    #endif /* __powerpc64__ */
>>>    
>>> -static inline unsigned long raw_copy_from_user(void *to,
>>> -		const void __user *from, unsigned long n)
>>> +static inline unsigned long
>>> +raw_copy_from_user_allowed(void *to, const void __user *from, unsigned long n)
>>>    {
>>> -	unsigned long ret;
>>>    	if (__builtin_constant_p(n) && (n <= 8)) {
>>> -		ret = 1;
>>> +		unsigned long ret = 1;
>>>    
>>>    		switch (n) {
>>>    		case 1:
>>>    			barrier_nospec();
>>> -			__get_user_size(*(u8 *)to, from, 1, ret);
>>> +			__get_user_size_allowed(*(u8 *)to, from, 1, ret);
>>>    			break;
>>>    		case 2:
>>>    			barrier_nospec();
>>> -			__get_user_size(*(u16 *)to, from, 2, ret);
>>> +			__get_user_size_allowed(*(u16 *)to, from, 2, ret);
>>>    			break;
>>>    		case 4:
>>>    			barrier_nospec();
>>> -			__get_user_size(*(u32 *)to, from, 4, ret);
>>> +			__get_user_size_allowed(*(u32 *)to, from, 4, ret);
>>>    			break;
>>>    		case 8:
>>>    			barrier_nospec();
>>> -			__get_user_size(*(u64 *)to, from, 8, ret);
>>> +			__get_user_size_allowed(*(u64 *)to, from, 8, ret);
>>>    			break;
>>>    		}
>>>    		if (ret == 0)
>>>    			return 0;
>>>    	}
>>>    
>>> +	return __copy_tofrom_user((__force void __user *)to, from, n);
>>> +}
>>> +
>>> +static inline unsigned long
>>> +raw_copy_from_user(void *to, const void __user *from, unsigned long n)
>>> +{
>>> +	unsigned long ret;
>>> +
>>>    	barrier_nospec();
>>>    	allow_read_from_user(from, n);
>>> -	ret = __copy_tofrom_user((__force void __user *)to, from, n);
>>> +	ret = raw_copy_from_user_allowed(to, from, n);
>>>    	prevent_read_from_user(from, n);
>>>    	return ret;
>>>    }
>>> @@ -542,6 +549,9 @@ user_write_access_begin(const void __user *ptr, size_t len)
>>>    #define unsafe_get_user(x, p, e) unsafe_op_wrap(__get_user_allowed(x, p), e)
>>>    #define unsafe_put_user(x, p, e) __put_user_goto(x, p, e)
>>>    
>>> +#define unsafe_copy_from_user(d, s, l, e) \
>>> +	unsafe_op_wrap(raw_copy_from_user_allowed(d, s, l), e)
>>> +
>>>    #define unsafe_copy_to_user(d, s, l, e) \
>>>    do {									\
>>>    	u8 __user *_dst = (u8 __user *)(d);				\
>>>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-19  2:11       ` Michael Ellerman
@ 2021-01-19 12:33         ` Christophe Leroy
  2021-01-19 17:02           ` Christopher M. Riedl
  2021-02-09 14:09         ` Christophe Leroy
  1 sibling, 1 reply; 20+ messages in thread
From: Christophe Leroy @ 2021-01-19 12:33 UTC (permalink / raw)
  To: Michael Ellerman, Christopher M. Riedl, linuxppc-dev



Le 19/01/2021 à 03:11, Michael Ellerman a écrit :
> "Christopher M. Riedl" <cmr@codefail.de> writes:
>> On Mon Jan 11, 2021 at 7:22 AM CST, Christophe Leroy wrote:
>>> Le 09/01/2021 à 04:25, Christopher M. Riedl a écrit :
>>>> Implement raw_copy_from_user_allowed() which assumes that userspace read
>>>> access is open. Use this new function to implement raw_copy_from_user().
>>>> Finally, wrap the new function to follow the usual "unsafe_" convention
>>>> of taking a label argument.
>>>
>>> I think there is no point implementing raw_copy_from_user_allowed(), see
>>> https://github.com/linuxppc/linux/commit/4b842e4e25b1 and
>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/
>>>
>>> You should simply do:
>>>
>>> #define unsafe_copy_from_user(d, s, l, e) \
>>> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
>>>
>>
>> I gave this a try and the signal ops decreased by ~8K. Now, to be
>> honest, I am not sure what an "acceptable" benchmark number here
>> actually is - so maybe this is ok? Same loss with both radix and hash:
>>
>> 	|                                      | hash   | radix  |
>> 	| ------------------------------------ | ------ | ------ |
>> 	| linuxppc/next                        | 118693 | 133296 |
>> 	| linuxppc/next w/o KUAP+KUEP          | 228911 | 228654 |
>> 	| unsafe-signal64                      | 200480 | 234067 |
>> 	| unsafe-signal64 (__copy_tofrom_user) | 192467 | 225119 |
>>
>> To put this into perspective, prior to KUAP and uaccess flush, signal
>> performance in this benchmark was ~290K on hash.
> 
> If I'm doing the math right 8K is ~4% of the best number.
> 
> It seems like 4% is worth a few lines of code to handle these constant
> sizes. It's not like we have performance to throw away.
> 
> Or, we should chase down where the call sites are that are doing small
> constant copies with copy_to/from_user() and change them to use
> get/put_user().
> 

Christopher, when you say you gave it a try, is it my series or only the following?

#define unsafe_copy_from_user(d, s, l, e) \
unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)

Because I see no use of unsafe_copy_from_user() that would explain that.

Christophe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-19 12:33         ` Christophe Leroy
@ 2021-01-19 17:02           ` Christopher M. Riedl
  2021-01-19 17:27             ` Christophe Leroy
  0 siblings, 1 reply; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-19 17:02 UTC (permalink / raw)
  To: Christophe Leroy, Michael Ellerman, linuxppc-dev

On Tue Jan 19, 2021 at 6:33 AM CST, Christophe Leroy wrote:
>
>
> Le 19/01/2021 à 03:11, Michael Ellerman a écrit :
> > "Christopher M. Riedl" <cmr@codefail.de> writes:
> >> On Mon Jan 11, 2021 at 7:22 AM CST, Christophe Leroy wrote:
> >>> Le 09/01/2021 à 04:25, Christopher M. Riedl a écrit :
> >>>> Implement raw_copy_from_user_allowed() which assumes that userspace read
> >>>> access is open. Use this new function to implement raw_copy_from_user().
> >>>> Finally, wrap the new function to follow the usual "unsafe_" convention
> >>>> of taking a label argument.
> >>>
> >>> I think there is no point implementing raw_copy_from_user_allowed(), see
> >>> https://github.com/linuxppc/linux/commit/4b842e4e25b1 and
> >>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/
> >>>
> >>> You should simply do:
> >>>
> >>> #define unsafe_copy_from_user(d, s, l, e) \
> >>> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
> >>>
> >>
> >> I gave this a try and the signal ops decreased by ~8K. Now, to be
> >> honest, I am not sure what an "acceptable" benchmark number here
> >> actually is - so maybe this is ok? Same loss with both radix and hash:
> >>
> >> 	|                                      | hash   | radix  |
> >> 	| ------------------------------------ | ------ | ------ |
> >> 	| linuxppc/next                        | 118693 | 133296 |
> >> 	| linuxppc/next w/o KUAP+KUEP          | 228911 | 228654 |
> >> 	| unsafe-signal64                      | 200480 | 234067 |
> >> 	| unsafe-signal64 (__copy_tofrom_user) | 192467 | 225119 |
> >>
> >> To put this into perspective, prior to KUAP and uaccess flush, signal
> >> performance in this benchmark was ~290K on hash.
> > 
> > If I'm doing the math right 8K is ~4% of the best number.
> > 
> > It seems like 4% is worth a few lines of code to handle these constant
> > sizes. It's not like we have performance to throw away.
> > 
> > Or, we should chase down where the call sites are that are doing small
> > constant copies with copy_to/from_user() and change them to use
> > get/put_user().
> > 
>
> Christopher, when you say you gave it a try, is it my series or only the
> following?
>
> #define unsafe_copy_from_user(d, s, l, e) \
> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
>

I only used the above to replace this patch in my series (so none of my
changes implementing raw_copy_from_user_allowed() are included).

>
> Because I see no use of unsafe_copy_from_user() that would explain that.
>
> Christophe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-19 17:02           ` Christopher M. Riedl
@ 2021-01-19 17:27             ` Christophe Leroy
  2021-01-20  5:08               ` Christopher M. Riedl
  0 siblings, 1 reply; 20+ messages in thread
From: Christophe Leroy @ 2021-01-19 17:27 UTC (permalink / raw)
  To: Christopher M. Riedl, Michael Ellerman, linuxppc-dev



Le 19/01/2021 à 18:02, Christopher M. Riedl a écrit :
> On Tue Jan 19, 2021 at 6:33 AM CST, Christophe Leroy wrote:
>>
>>
>> Le 19/01/2021 à 03:11, Michael Ellerman a écrit :
>>> "Christopher M. Riedl" <cmr@codefail.de> writes:
>>>> On Mon Jan 11, 2021 at 7:22 AM CST, Christophe Leroy wrote:
>>>>> Le 09/01/2021 à 04:25, Christopher M. Riedl a écrit :
>>>>>> Implement raw_copy_from_user_allowed() which assumes that userspace read
>>>>>> access is open. Use this new function to implement raw_copy_from_user().
>>>>>> Finally, wrap the new function to follow the usual "unsafe_" convention
>>>>>> of taking a label argument.
>>>>>
>>>>> I think there is no point implementing raw_copy_from_user_allowed(), see
>>>>> https://github.com/linuxppc/linux/commit/4b842e4e25b1 and
>>>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/
>>>>>
>>>>> You should simply do:
>>>>>
>>>>> #define unsafe_copy_from_user(d, s, l, e) \
>>>>> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
>>>>>
>>>>
>>>> I gave this a try and the signal ops decreased by ~8K. Now, to be
>>>> honest, I am not sure what an "acceptable" benchmark number here
>>>> actually is - so maybe this is ok? Same loss with both radix and hash:
>>>>
>>>> 	|                                      | hash   | radix  |
>>>> 	| ------------------------------------ | ------ | ------ |
>>>> 	| linuxppc/next                        | 118693 | 133296 |
>>>> 	| linuxppc/next w/o KUAP+KUEP          | 228911 | 228654 |
>>>> 	| unsafe-signal64                      | 200480 | 234067 |
>>>> 	| unsafe-signal64 (__copy_tofrom_user) | 192467 | 225119 |
>>>>
>>>> To put this into perspective, prior to KUAP and uaccess flush, signal
>>>> performance in this benchmark was ~290K on hash.
>>>
>>> If I'm doing the math right 8K is ~4% of the best number.
>>>
>>> It seems like 4% is worth a few lines of code to handle these constant
>>> sizes. It's not like we have performance to throw away.
>>>
>>> Or, we should chase down where the call sites are that are doing small
>>> constant copies with copy_to/from_user() and change them to use
>>> get/put_user().
>>>
>>
>> Christopher, when you say you gave it a try, is it my series or only the
>> following?
>>
>> #define unsafe_copy_from_user(d, s, l, e) \
>> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
>>
> 
> I only used the above to replace this patch in my series (so none of my
> changes implementing raw_copy_from_user_allowed() are included).

Then I see no reason why the performance would be different, because you only call 
unsafe_copy_from_user() with non-trivial lengths.

> 
>>
>> Because I see no use of unsafe_copy_from_user() that would explain that.
>>
>> Christophe

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-19 17:27             ` Christophe Leroy
@ 2021-01-20  5:08               ` Christopher M. Riedl
  0 siblings, 0 replies; 20+ messages in thread
From: Christopher M. Riedl @ 2021-01-20  5:08 UTC (permalink / raw)
  To: Christophe Leroy, Michael Ellerman, linuxppc-dev

On Tue Jan 19, 2021 at 11:27 AM CST, Christophe Leroy wrote:
>
>
> Le 19/01/2021 à 18:02, Christopher M. Riedl a écrit :
> > On Tue Jan 19, 2021 at 6:33 AM CST, Christophe Leroy wrote:
> >>
> >>
> >> Le 19/01/2021 à 03:11, Michael Ellerman a écrit :
> >>> "Christopher M. Riedl" <cmr@codefail.de> writes:
> >>>> On Mon Jan 11, 2021 at 7:22 AM CST, Christophe Leroy wrote:
> >>>>> Le 09/01/2021 à 04:25, Christopher M. Riedl a écrit :
> >>>>>> Implement raw_copy_from_user_allowed() which assumes that userspace read
> >>>>>> access is open. Use this new function to implement raw_copy_from_user().
> >>>>>> Finally, wrap the new function to follow the usual "unsafe_" convention
> >>>>>> of taking a label argument.
> >>>>>
> >>>>> I think there is no point implementing raw_copy_from_user_allowed(), see
> >>>>> https://github.com/linuxppc/linux/commit/4b842e4e25b1 and
> >>>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/
> >>>>>
> >>>>> You should simply do:
> >>>>>
> >>>>> #define unsafe_copy_from_user(d, s, l, e) \
> >>>>> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
> >>>>>
> >>>>
> >>>> I gave this a try and the signal ops decreased by ~8K. Now, to be
> >>>> honest, I am not sure what an "acceptable" benchmark number here
> >>>> actually is - so maybe this is ok? Same loss with both radix and hash:
> >>>>
> >>>> 	|                                      | hash   | radix  |
> >>>> 	| ------------------------------------ | ------ | ------ |
> >>>> 	| linuxppc/next                        | 118693 | 133296 |
> >>>> 	| linuxppc/next w/o KUAP+KUEP          | 228911 | 228654 |
> >>>> 	| unsafe-signal64                      | 200480 | 234067 |
> >>>> 	| unsafe-signal64 (__copy_tofrom_user) | 192467 | 225119 |
> >>>>
> >>>> To put this into perspective, prior to KUAP and uaccess flush, signal
> >>>> performance in this benchmark was ~290K on hash.
> >>>
> >>> If I'm doing the math right 8K is ~4% of the best number.
> >>>
> >>> It seems like 4% is worth a few lines of code to handle these constant
> >>> sizes. It's not like we have performance to throw away.
> >>>
> >>> Or, we should chase down where the call sites are that are doing small
> >>> constant copies with copy_to/from_user() and change them to use
> >>> get/put_user().
> >>>
> >>
> >> Christopher, when you say you gave it a try, is it my series or only the
> >> following?
> >>
> >> #define unsafe_copy_from_user(d, s, l, e) \
> >> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
> >>
> > 
> > I only used the above to replace this patch in my series (so none of my
> > changes implementing raw_copy_from_user_allowed() are included).
>
> Then I see no reason why the performance would be different, because you
> only call
> unsafe_copy_from_user() with non-trivial lengths.
>

Ok, I made a mistake - this actually included the other improvements from
your feedback on this series; specifically, moving the unsafe_get_user()
call that reads the msr value from the regs into the #ifdef. My first pass
resulted in another __get_user() call, which explains some of the perf loss.

With that mistake fixed, the benchmark performance is still only ~197k
on hash (consistent). I agree that there are no places where
unsafe_copy_from_user() is called with trivial lengths, so the only
conclusion I can draw is that the changes in this patch marginally
speed up the other __copy_from_user() calls. I started comparing the
disassembly but nothing immediately obvious stands out to me. In fact, I
observe the speed-up even if I keep this patch _and_ apply this change:

#define unsafe_copy_from_user(d, s, l, e) \
unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)

Related to that, I think sigset_t is always 8B on ppc64, so these will
probably need a wrapper if we pick up your series to get rid of the trivial
size optimizations:

__copy_from_user(&set, &uc->uc_sigmask, sizeof(set))

I can bring performance up to ~200K on hash again by replacing the two
__copy_from_user(&set, ...) with direct calls to __get_user_size() (it's
kind of hacky I think). See the last commit in this tree:

https://git.sr.ht/~cmr/linux/log/unsafe-signal64-v4

All my comments apply to performance on radix as well.
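
Roughly, such a wrapper could look like this (just a sketch, name included,
assuming sigset_t on ppc64 is a single 64-bit word in sig[0]; the hacky
version I actually benchmarked is the one in the tree above):

	static inline int __get_user_sigset(sigset_t *dst,
					    const sigset_t __user *src)
	{
		/* sigset_t fits in one unsigned long on ppc64, so a single
		 * fixed-size read avoids the generic copy loop; __get_user()
		 * still opens/closes user access itself, just like
		 * __copy_from_user() does at these call sites today. */
		return __get_user(dst->sig[0], &src->sig[0]);
	}

with the two call sites becoming __get_user_sigset(&set, &uc->uc_sigmask).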

> > 
> >>
> >> Because I see no use of unsafe_copy_from_user() that would explain that.
> >>
> >> Christophe


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user
  2021-01-19  2:11       ` Michael Ellerman
  2021-01-19 12:33         ` Christophe Leroy
@ 2021-02-09 14:09         ` Christophe Leroy
  1 sibling, 0 replies; 20+ messages in thread
From: Christophe Leroy @ 2021-02-09 14:09 UTC (permalink / raw)
  To: Michael Ellerman, Christopher M. Riedl, linuxppc-dev



Le 19/01/2021 à 03:11, Michael Ellerman a écrit :
> "Christopher M. Riedl" <cmr@codefail.de> writes:
>> On Mon Jan 11, 2021 at 7:22 AM CST, Christophe Leroy wrote:
>>> Le 09/01/2021 à 04:25, Christopher M. Riedl a écrit :
>>>> Implement raw_copy_from_user_allowed() which assumes that userspace read
>>>> access is open. Use this new function to implement raw_copy_from_user().
>>>> Finally, wrap the new function to follow the usual "unsafe_" convention
>>>> of taking a label argument.
>>>
>>> I think there is no point implementing raw_copy_from_user_allowed(), see
>>> https://github.com/linuxppc/linux/commit/4b842e4e25b1 and
>>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/8c74fc9ce8131cabb10b3e95dc0e430f396ee83e.1610369143.git.christophe.leroy@csgroup.eu/
>>>
>>> You should simply do:
>>>
>>> #define unsafe_copy_from_user(d, s, l, e) \
>>> unsafe_op_wrap(__copy_tofrom_user((__force void __user *)d, s, l), e)
>>>
>>
>> I gave this a try and the signal ops decreased by ~8K. Now, to be
>> honest, I am not sure what an "acceptable" benchmark number here
>> actually is - so maybe this is ok? Same loss with both radix and hash:
>>
>> 	|                                      | hash   | radix  |
>> 	| ------------------------------------ | ------ | ------ |
>> 	| linuxppc/next                        | 118693 | 133296 |
>> 	| linuxppc/next w/o KUAP+KUEP          | 228911 | 228654 |
>> 	| unsafe-signal64                      | 200480 | 234067 |
>> 	| unsafe-signal64 (__copy_tofrom_user) | 192467 | 225119 |
>>
>> To put this into perspective, prior to KUAP and uaccess flush, signal
>> performance in this benchmark was ~290K on hash.
> 
> If I'm doing the math right 8K is ~4% of the best number.
> 
> It seems like 4% is worth a few lines of code to handle these constant
> sizes. It's not like we have performance to throw away.
> 
> Or, we should chase down where the call sites are that are doing small
> constant copies with copy_to/from_user() and change them to use
> get/put_user().
> 

I have built pmac32_defconfig and ppc64_defconfig with a
BUILD_BUG_ON(__builtin_constant_p(n) && (n == 1 || n == 2 || n == 4 || n == 8))
in raw_copy_from_user() and raw_copy_to_user():

On pmac32_defconfig, no hit.

On ppc64_defconfig, two hits:
- copies of sigset_t in signal64. This problem is only on linux/next; on next-test we don't have 
this problem anymore thanks to the series from Christopher.
- in pkey_set() in arch/powerpc/kernel/ptrace/ptrace-view.c, in the copy of new_amr. This is not a 
hot path, I think, so we can live with it.
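
For reference, the temporary check added at the top of both
raw_copy_from_user() and raw_copy_to_user() for this audit was essentially:

	/* build-time audit only, not meant for merging: flag callers that
	 * pass a small constant size and could use get/put_user() instead */
	BUILD_BUG_ON(__builtin_constant_p(n) &&
		     (n == 1 || n == 2 || n == 4 || n == 8));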

Christophe

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-02-09 14:17 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-09  3:25 [PATCH v3 0/8] Improve signal performance on PPC64 with KUAP Christopher M. Riedl
2021-01-09  3:25 ` [PATCH v3 1/8] powerpc/uaccess: Add unsafe_copy_from_user Christopher M. Riedl
2021-01-11 13:22   ` Christophe Leroy
2021-01-17 17:19     ` Christopher M. Riedl
2021-01-19  2:11       ` Michael Ellerman
2021-01-19 12:33         ` Christophe Leroy
2021-01-19 17:02           ` Christopher M. Riedl
2021-01-19 17:27             ` Christophe Leroy
2021-01-20  5:08               ` Christopher M. Riedl
2021-02-09 14:09         ` Christophe Leroy
2021-01-19  7:29       ` Christophe Leroy
2021-01-09  3:25 ` [PATCH v3 2/8] powerpc/signal: Add unsafe_copy_{vsx,fpr}_from_user() Christopher M. Riedl
2021-01-09  3:25 ` [PATCH v3 3/8] powerpc/signal64: Move non-inline functions out of setup_sigcontext() Christopher M. Riedl
2021-01-09  3:25 ` [PATCH v3 4/8] powerpc/signal64: Remove TM ifdefery in middle of if/else block Christopher M. Riedl
2021-01-11 13:29   ` Christophe Leroy
2021-01-17 17:16     ` Christopher M. Riedl
2021-01-09  3:25 ` [PATCH v3 5/8] powerpc/signal64: Replace setup_sigcontext() w/ unsafe_setup_sigcontext() Christopher M. Riedl
2021-01-09  3:25 ` [PATCH v3 6/8] powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext() Christopher M. Riedl
2021-01-09  3:25 ` [PATCH v3 7/8] powerpc/signal64: Rewrite handle_rt_signal64() to minimise uaccess switches Christopher M. Riedl
2021-01-09  3:25 ` [PATCH v3 8/8] powerpc/signal64: Rewrite rt_sigreturn() " Christopher M. Riedl
