From: Nicholas Piggin <npiggin@gmail.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
    Christophe Leroy <christophe.leroy@csgroup.eu>,
    Michael Ellerman <mpe@ellerman.id.au>,
    msuchanek@suse.de,
    Paul Mackerras <paulus@samba.org>
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v5 16/22] powerpc/syscall: Avoid stack frame in likely part of system_call_exception()
Date: Tue, 09 Feb 2021 11:55:17 +1000
Message-ID: <1612834634.qle1lc7n6y.astroid@bobo.none>
In-Reply-To: <981edfd50d4c980634b74c4bb76b765c499a87ec.1612796617.git.christophe.leroy@csgroup.eu>

Excerpts from Christophe Leroy's message of February 9, 2021 1:10 am:
> When r3 is not modified, reload it from regs->orig_r3 to free
> volatile registers. This avoids a stack frame for the likely part
> of system_call_exception()

This doesn't on my 64s build, but it does reduce one non volatile
register save/restore. With quite a bit more register pressure
reduction 64s can avoid the stack frame as well.

It's a cool trick but quite code and compiler specific so I don't know
how worthwhile it is to keep considering we're calling out into random
kernel C code after this.

Maybe just keep it PPC32 specific for the moment, will have to do more
tuning for 64 and we have other stuff to do there first.

If you are happy to make it 32-bit only then

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

>
> Before the patch:
>
> c000b4d4 <system_call_exception>:
> c000b4d4:	7c 08 02 a6 	mflr    r0
> c000b4d8:	94 21 ff e0 	stwu    r1,-32(r1)
> c000b4dc:	93 e1 00 1c 	stw     r31,28(r1)
> c000b4e0:	90 01 00 24 	stw     r0,36(r1)
> c000b4e4:	90 6a 00 88 	stw     r3,136(r10)
> c000b4e8:	81 6a 00 84 	lwz     r11,132(r10)
> c000b4ec:	69 6b 00 02 	xori    r11,r11,2
> c000b4f0:	55 6b ff fe 	rlwinm  r11,r11,31,31,31
> c000b4f4:	0f 0b 00 00 	twnei   r11,0
> c000b4f8:	81 6a 00 a0 	lwz     r11,160(r10)
> c000b4fc:	55 6b 07 fe 	clrlwi  r11,r11,31
> c000b500:	0f 0b 00 00 	twnei   r11,0
> c000b504:	7c 0c 42 e6 	mftb    r0
> c000b508:	83 e2 00 08 	lwz     r31,8(r2)
> c000b50c:	81 82 00 28 	lwz     r12,40(r2)
> c000b510:	90 02 00 24 	stw     r0,36(r2)
> c000b514:	7d 8c f8 50 	subf    r12,r12,r31
> c000b518:	7c 0c 02 14 	add     r0,r12,r0
> c000b51c:	90 02 00 08 	stw     r0,8(r2)
> c000b520:	7c 10 13 a6 	mtspr   80,r0
> c000b524:	81 62 00 70 	lwz     r11,112(r2)
> c000b528:	71 60 86 91 	andi.   r0,r11,34449
> c000b52c:	40 82 00 34 	bne     c000b560 <system_call_exception+0x8c>
> c000b530:	2b 89 01 b6 	cmplwi  cr7,r9,438
> c000b534:	41 9d 00 64 	bgt     cr7,c000b598 <system_call_exception+0xc4>
> c000b538:	3d 40 c0 5c 	lis     r10,-16292
> c000b53c:	55 29 10 3a 	rlwinm  r9,r9,2,0,29
> c000b540:	39 4a 41 e8 	addi    r10,r10,16872
> c000b544:	80 01 00 24 	lwz     r0,36(r1)
> c000b548:	7d 2a 48 2e 	lwzx    r9,r10,r9
> c000b54c:	7c 08 03 a6 	mtlr    r0
> c000b550:	7d 29 03 a6 	mtctr   r9
> c000b554:	83 e1 00 1c 	lwz     r31,28(r1)
> c000b558:	38 21 00 20 	addi    r1,r1,32
> c000b55c:	4e 80 04 20 	bctr
>
> After the patch:
>
> c000b4d4 <system_call_exception>:
> c000b4d4:	81 6a 00 84 	lwz     r11,132(r10)
> c000b4d8:	90 6a 00 88 	stw     r3,136(r10)
> c000b4dc:	69 6b 00 02 	xori    r11,r11,2
> c000b4e0:	55 6b ff fe 	rlwinm  r11,r11,31,31,31
> c000b4e4:	0f 0b 00 00 	twnei   r11,0
> c000b4e8:	80 6a 00 a0 	lwz     r3,160(r10)
> c000b4ec:	54 63 07 fe 	clrlwi  r3,r3,31
> c000b4f0:	0f 03 00 00 	twnei   r3,0
> c000b4f4:	7d 6c 42 e6 	mftb    r11
> c000b4f8:	81 82 00 08 	lwz     r12,8(r2)
> c000b4fc:	80 02 00 28 	lwz     r0,40(r2)
> c000b500:	91 62 00 24 	stw     r11,36(r2)
> c000b504:	7c 00 60 50 	subf    r0,r0,r12
> c000b508:	7d 60 5a 14 	add     r11,r0,r11
> c000b50c:	91 62 00 08 	stw     r11,8(r2)
> c000b510:	7c 10 13 a6 	mtspr   80,r0
> c000b514:	80 62 00 70 	lwz     r3,112(r2)
> c000b518:	70 6b 86 91 	andi.   r11,r3,34449
> c000b51c:	40 82 00 28 	bne     c000b544 <system_call_exception+0x70>
> c000b520:	2b 89 01 b6 	cmplwi  cr7,r9,438
> c000b524:	41 9d 00 84 	bgt     cr7,c000b5a8 <system_call_exception+0xd4>
> c000b528:	80 6a 00 88 	lwz     r3,136(r10)
> c000b52c:	3d 40 c0 5c 	lis     r10,-16292
> c000b530:	55 29 10 3a 	rlwinm  r9,r9,2,0,29
> c000b534:	39 4a 41 e4 	addi    r10,r10,16868
> c000b538:	7d 2a 48 2e 	lwzx    r9,r10,r9
> c000b53c:	7d 29 03 a6 	mtctr   r9
> c000b540:	4e 80 04 20 	bctr
>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
>  arch/powerpc/kernel/interrupt.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index 107ec39f05cb..205902052112 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -117,6 +117,9 @@ notrace long system_call_exception(long r3, long r4, long r5,
>  			return regs->gpr[3];
>  		}
>  		return -ENOSYS;
> +	} else {
> +		/* Restore r3 from orig_gpr3 to free up a volatile reg */
> +		r3 = regs->orig_gpr3;
>  	}
>
>  	/* May be faster to do array_index_nospec? */
> --
> 2.25.0
>
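
For readers following the thread, here is a minimal userspace sketch of what "32-bit only" could look like. It is an illustration under assumptions, not the form the patch finally took: struct mock_pt_regs, MOCK_PPC32 and first_arg_for_dispatch() are hypothetical stand-ins (in the kernel the check would presumably be something like IS_ENABLED(CONFIG_PPC32) around the reload of regs->orig_gpr3).

```c
/* Hypothetical stand-in for the kernel's struct pt_regs; only one field. */
struct mock_pt_regs {
	long orig_gpr3;	/* first syscall argument, saved on entry */
};

/* Stand-in for a 32-bit build check; set to 0 to model a 64-bit build. */
#define MOCK_PPC32 1

/*
 * Models the end of the likely path: r3 was already stored in
 * regs->orig_gpr3 on entry, so on 32-bit it is reloaded just before
 * dispatch instead of being kept live across the earlier checks.
 */
static long first_arg_for_dispatch(long r3, const struct mock_pt_regs *regs)
{
	if (MOCK_PPC32)
		r3 = regs->orig_gpr3;	/* reload frees the register earlier */
	return r3;
}
```

With MOCK_PPC32 set to 0 the function simply passes r3 through, which models leaving the 64-bit path untouched. Whether the reload actually removes the stack frame still depends on the compiler and surrounding code, as noted in the reply above.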