linux-rt-users.vger.kernel.org archive mirror
* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
       [not found] <202305142217.46875.pisa@cmp.felk.cvut.cz>
@ 2023-05-16 16:33 ` Sebastian Andrzej Siewior
  2023-05-16 17:13   ` Ard Biesheuvel
  2023-05-19 18:52 ` Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010 Sebastian Andrzej Siewior
  1 sibling, 1 reply; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-16 16:33 UTC (permalink / raw)
  To: Pavel Pisa
  Cc: linux-rt-users, Pavel Hronek, Thomas Gleixner, Ard Biesheuvel,
	Peter Zijlstra

+ ard, peterz

I've been looking at the PREEMPT_RT report from Pavel:

On 2023-05-14 22:17:46 [+0200], Pavel Pisa wrote:
> [  199.657099] ------------[ cut here ]------------
> [  199.657116] WARNING: CPU: 1 PID: 1702 at kernel/softirq.c:172 __local_bh_disable_ip+0xd8/0x16c
> [  199.657150] DEBUG_LOCKS_WARN_ON(this_cpu_read(softirq_ctrl.cnt))
> [  199.657155] Modules linked in: ctucanfd_platform ctucanfd dtbocfg(O) uio_pdrv_genirq ip_tables x_tables
> [  199.657192] CPU: 1 PID: 1702 Comm: find Tainted: G           O       6.3.0-rc7-rt9-dut #1
> [  199.657207] Hardware name: Xilinx Zynq Platform
> [  199.657224]  unwind_backtrace from show_stack+0x10/0x14
> [  199.657259]  show_stack from dump_stack_lvl+0x40/0x4c
> [  199.657294]  dump_stack_lvl from __warn+0x84/0x168
> [  199.657330]  __warn from warn_slowpath_fmt+0x134/0x1b8
> [  199.657357]  warn_slowpath_fmt from __local_bh_disable_ip+0xd8/0x16c
> [  199.657379]  __local_bh_disable_ip from vfp_sync_hwstate+0x28/0xa0
> [  199.657403]  vfp_sync_hwstate from vfp_notifier+0x30/0x174
> [  199.657424]  vfp_notifier from atomic_notifier_call_chain+0x64/0x88
> [  199.657446]  atomic_notifier_call_chain from copy_thread+0xa4/0xe0
> [  199.657474]  copy_thread from copy_process+0x1258/0x1ba8
> [  199.657503]  copy_process from kernel_clone+0x94/0x3b8
> [  199.657525]  kernel_clone from sys_clone+0x70/0x98
> [  199.657547]  sys_clone from ret_fast_syscall+0x0/0x54
> [  199.657565] Exception stack(0xf1231fa8 to 0xf1231ff0)
> [  199.657578] 1fa0:                   b6f5f088 b6f5f520 01200011 00000000 00000000 00000000
> [  199.657592] 1fc0: b6f5f088 b6f5f520 00000001 00000078 004efba8 00000000 004cc2fc 00000000
> [  199.657601] 1fe0: 00000078 bef2b740 b6df8fe3 b6d8e616
> [  199.657608] ---[ end trace 0000000000000000 ]---

The problem is that vfp_sync_hwstate() does:
	preempt_disable();
	local_bh_disable();

this *would* be okay *if* it is guaranteed that BH is not already
disabled on this CPU. This isn't the case because something disabled BH,
got preempted and then this occurred.
This could be (probably) fixed by dropping that get_cpu() from
vfp_sync_hwstate() (unless preemption should really be disabled as
claimed by the comment). But then I looked further and stumbled over
commit
   62b95a7b44d1a ("ARM: 9282/1: vfp: Manipulate task VFP state with softirqs disabled")

and got a little sad. On PREEMPT_RT you can't simply add
SOFTIRQ_DISABLE_OFFSET to the preemption counter because it works
differently. You would have to invoke local_bh_disable() which *may*
involve a context switch if BH was disabled by another task which got
preempted. I guess this only happens from userland at which point it is
guaranteed that BH is disabled since it is not preemptible on !RT.

This local_bh_disable() is invoked from do_vfp which has interrupts
enabled. The counter-part enables BH by simply removing that constant.
Don't we miss a scheduling opportunity if an interrupt occurred and the
NEED_RESCHED flag was set? Also if an interrupt raised softirqs while
they were disabled, the local_bh_enable() should invoke "do_softirq()"
or they remain pending until the next opportunity. Unless your way back
there has a check somewhere.

The snippet below is probably disgusting but I managed to boot in qemu.

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 06b48ce23e1ca..47c9f14f8c5e9 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -244,6 +244,7 @@ THUMB(	fpreg	.req	r7	)
 	.endm
 #endif
 
+#if 0
 	.macro	local_bh_disable, ti, tmp
 	ldr	\tmp, [\ti, #TI_PREEMPT]
 	add	\tmp, \tmp, #SOFTIRQ_DISABLE_OFFSET
@@ -256,6 +257,7 @@ THUMB(	fpreg	.req	r7	)
 	sub	\tmp, \tmp, #SOFTIRQ_DISABLE_OFFSET
 	str	\tmp, [\ti, #TI_PREEMPT]
 	.endm
+#endif
 
 #define USERL(l, x...)				\
 9999:	x;					\
diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
index 9a89264cdcc0b..80e45999c2961 100644
--- a/arch/arm/vfp/entry.S
+++ b/arch/arm/vfp/entry.S
@@ -22,7 +22,14 @@
 @  IRQs enabled.
 @
 ENTRY(do_vfp)
-	local_bh_disable r10, r4
+	push {r0, r2, r9, r10, lr}
+	ldr     r0, 0f
+	mov	r1, #SOFTIRQ_DISABLE_OFFSET
+0:
+	bl	__local_bh_disable_ip
+	pop {r0, r2, r9, r10, lr}
+
+/*	local_bh_disable r10, r4 */
  	ldr	r4, .LCvfp
 	ldr	r11, [r10, #TI_CPU]	@ CPU number
 	add	r10, r10, #TI_VFPSTATE	@ r10 = workspace
@@ -30,7 +37,11 @@ ENTRY(do_vfp)
 ENDPROC(do_vfp)
 
 ENTRY(vfp_null_entry)
-	local_bh_enable_ti r10, r4
+/*	local_bh_enable_ti r10, r4 */
+	ldr     r0, 0f
+	mov	r1, #SOFTIRQ_DISABLE_OFFSET
+0:
+	bl	__local_bh_enable_ip
 	ret	lr
 ENDPROC(vfp_null_entry)
 
diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index 26c4f61ecfa39..e9f183aad39ed 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -175,7 +175,13 @@ ENTRY(vfp_support_entry)
 					@ else it's one 32-bit instruction, so
 					@ always subtract 4 from the following
 					@ instruction address.
-	local_bh_enable_ti r10, r4
+	push { r0, r2, r9, r10, r11, lr}
+	ldr     r0, 0f
+	mov	r1, #SOFTIRQ_DISABLE_OFFSET
+0:
+	bl	__local_bh_enable_ip
+	pop { r0, r2, r9, r10, r11, lr}
+/*	local_bh_enable_ti r10, r4 */
 	ret	r9			@ we think we have handled things
 
 
@@ -200,7 +206,13 @@ ENTRY(vfp_support_entry)
 	@ not recognised by VFP
 
 	DBGSTR	"not VFP"
-	local_bh_enable_ti r10, r4
+/*	local_bh_enable_ti r10, r4 */
+	push { r0, r2, r9, r10, r11, lr}
+	ldr     r0, 0f
+	mov	r1, #SOFTIRQ_DISABLE_OFFSET
+0:
+	bl	__local_bh_enable_ip
+	pop { r0, r2, r9, r10, r11, lr}
 	ret	lr
 
 process_exception:
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 01bc48d738478..d4f68806e66b9 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -515,11 +515,9 @@ static inline void vfp_pm_init(void) { }
  */
 void vfp_sync_hwstate(struct thread_info *thread)
 {
-	unsigned int cpu = get_cpu();
-
 	local_bh_disable();
-
-	if (vfp_state_in_hw(cpu, thread)) {
+	preempt_disable();
+	if (vfp_state_in_hw(smp_processor_id(), thread)) {
 		u32 fpexc = fmrx(FPEXC);
 
 		/*
@@ -530,8 +528,8 @@ void vfp_sync_hwstate(struct thread_info *thread)
 		fmxr(FPEXC, fpexc);
 	}
 
+	preempt_enable();
 	local_bh_enable();
-	put_cpu();
 }
 
 /* Ensure that the thread reloads the hardware VFP state on the next use. */

Sebastian

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
  2023-05-16 16:33 ` Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010 Sebastian Andrzej Siewior
@ 2023-05-16 17:13   ` Ard Biesheuvel
  2023-05-17  8:54     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 16+ messages in thread
From: Ard Biesheuvel @ 2023-05-16 17:13 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

On Tue, 16 May 2023 at 18:33, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> + ard, peterz

Hello Sebastian,

>
> I've been looking at the PREEMPT_RT report from Pavel:
>
> On 2023-05-14 22:17:46 [+0200], Pavel Pisa wrote:
> > [  199.657099] ------------[ cut here ]------------
> > [  199.657116] WARNING: CPU: 1 PID: 1702 at kernel/softirq.c:172 __local_bh_disable_ip+0xd8/0x16c
> > [  199.657150] DEBUG_LOCKS_WARN_ON(this_cpu_read(softirq_ctrl.cnt))
> > [  199.657155] Modules linked in: ctucanfd_platform ctucanfd dtbocfg(O) uio_pdrv_genirq ip_tables x_tables
> > [  199.657192] CPU: 1 PID: 1702 Comm: find Tainted: G           O       6.3.0-rc7-rt9-dut #1
> > [  199.657207] Hardware name: Xilinx Zynq Platform
> > [  199.657224]  unwind_backtrace from show_stack+0x10/0x14
> > [  199.657259]  show_stack from dump_stack_lvl+0x40/0x4c
> > [  199.657294]  dump_stack_lvl from __warn+0x84/0x168
> > [  199.657330]  __warn from warn_slowpath_fmt+0x134/0x1b8
> > [  199.657357]  warn_slowpath_fmt from __local_bh_disable_ip+0xd8/0x16c
> > [  199.657379]  __local_bh_disable_ip from vfp_sync_hwstate+0x28/0xa0
> > [  199.657403]  vfp_sync_hwstate from vfp_notifier+0x30/0x174
> > [  199.657424]  vfp_notifier from atomic_notifier_call_chain+0x64/0x88
> > [  199.657446]  atomic_notifier_call_chain from copy_thread+0xa4/0xe0
> > [  199.657474]  copy_thread from copy_process+0x1258/0x1ba8
> > [  199.657503]  copy_process from kernel_clone+0x94/0x3b8
> > [  199.657525]  kernel_clone from sys_clone+0x70/0x98
> > [  199.657547]  sys_clone from ret_fast_syscall+0x0/0x54
> > [  199.657565] Exception stack(0xf1231fa8 to 0xf1231ff0)
> > [  199.657578] 1fa0:                   b6f5f088 b6f5f520 01200011 00000000 00000000 00000000
> > [  199.657592] 1fc0: b6f5f088 b6f5f520 00000001 00000078 004efba8 00000000 004cc2fc 00000000
> > [  199.657601] 1fe0: 00000078 bef2b740 b6df8fe3 b6d8e616
> > [  199.657608] ---[ end trace 0000000000000000 ]---
>
> The problem is that vfp_sync_hwstate() does:
>         preempt_disable();
>         local_bh_disable();
>
> this *would* be okay *if* it is guaranteed that BH is not already
> disabled on this CPU. This isn't the case because something disabled BH,
> got preempted and then this occurred.
> This could be (probably) fixed by dropping that get_cpu() from
> vfp_sync_hwstate() (unless preemption should really be disabled as
> claimed by the comment). But then I looked further and stumbled over
> commit
>    62b95a7b44d1a ("ARM: 9282/1: vfp: Manipulate task VFP state with softirqs disabled")
>
> and got a little sad. On PREEMPT_RT you can't simply add
> SOFTIRQ_DISABLE_OFFSET to the preemption counter because it works
> differently. You would have to invoke local_bh_disable() which *may*
> involve a context switch if BH was disabled by another task which got
> preempted. I guess this only happens from userland at which point it is
> guaranteed that BH is disabled since it is not preemptible on !RT.
>
> This local_bh_disable() is invoked from do_vfp which has interrupts
> enabled. The counter-part enables BH by simply removing that constant.
> Don't we miss a scheduling opportunity if an interrupt occurred and the
> NEED_RESCHED flag was set? Also if an interrupt raised softirqs while
> they were disabled, the local_bh_enable() should invoke "do_softirq()"
> or they remain pending until the next opportunity. Unless your way back
> there has a check somewhere.
>

Please check whether 6.4-rc2 is equally affected - we fixed some
issues related to BH en/disabling from asm code.

In particular,

2b951b0efbaa6c80 ARM: 9297/1: vfp: avoid unbalanced stack on 'success'
return path
c76c6c4ecbec0deb ARM: 9294/2: vfp: Fix broken softirq handling with
instrumentation enabled
3a2bdad0b46649cc ARM: 9293/1: vfp: Pass successful return address via
register R3
dae904d96ad6a5fa ARM: 9292/1: vfp: Pass thread_info pointer to vfp_support_entry

Please take a look, and confirm whether or not we still need to drop
the get_cpu() call from vfp_sync_hwstate()

Thanks,
Ard.



> The snippet below is probably disgusting but I managed to boot in qemu.
>
> diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> index 06b48ce23e1ca..47c9f14f8c5e9 100644
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@ -244,6 +244,7 @@ THUMB(      fpreg   .req    r7      )
>         .endm
>  #endif
>
> +#if 0
>         .macro  local_bh_disable, ti, tmp
>         ldr     \tmp, [\ti, #TI_PREEMPT]
>         add     \tmp, \tmp, #SOFTIRQ_DISABLE_OFFSET
> @@ -256,6 +257,7 @@ THUMB(      fpreg   .req    r7      )
>         sub     \tmp, \tmp, #SOFTIRQ_DISABLE_OFFSET
>         str     \tmp, [\ti, #TI_PREEMPT]
>         .endm
> +#endif
>
>  #define USERL(l, x...)                         \
>  9999:  x;                                      \
> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
> index 9a89264cdcc0b..80e45999c2961 100644
> --- a/arch/arm/vfp/entry.S
> +++ b/arch/arm/vfp/entry.S
> @@ -22,7 +22,14 @@
>  @  IRQs enabled.
>  @
>  ENTRY(do_vfp)
> -       local_bh_disable r10, r4
> +       push {r0, r2, r9, r10, lr}
> +       ldr     r0, 0f
> +       mov     r1, #SOFTIRQ_DISABLE_OFFSET
> +0:
> +       bl      __local_bh_disable_ip
> +       pop {r0, r2, r9, r10, lr}
> +
> +/*     local_bh_disable r10, r4 */
>         ldr     r4, .LCvfp
>         ldr     r11, [r10, #TI_CPU]     @ CPU number
>         add     r10, r10, #TI_VFPSTATE  @ r10 = workspace
> @@ -30,7 +37,11 @@ ENTRY(do_vfp)
>  ENDPROC(do_vfp)
>
>  ENTRY(vfp_null_entry)
> -       local_bh_enable_ti r10, r4
> +/*     local_bh_enable_ti r10, r4 */
> +       ldr     r0, 0f
> +       mov     r1, #SOFTIRQ_DISABLE_OFFSET
> +0:
> +       bl      __local_bh_enable_ip
>         ret     lr
>  ENDPROC(vfp_null_entry)
>
> diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
> index 26c4f61ecfa39..e9f183aad39ed 100644
> --- a/arch/arm/vfp/vfphw.S
> +++ b/arch/arm/vfp/vfphw.S
> @@ -175,7 +175,13 @@ ENTRY(vfp_support_entry)
>                                         @ else it's one 32-bit instruction, so
>                                         @ always subtract 4 from the following
>                                         @ instruction address.
> -       local_bh_enable_ti r10, r4
> +       push { r0, r2, r9, r10, r11, lr}
> +       ldr     r0, 0f
> +       mov     r1, #SOFTIRQ_DISABLE_OFFSET
> +0:
> +       bl      __local_bh_enable_ip
> +       pop { r0, r2, r9, r10, r11, lr}
> +/*     local_bh_enable_ti r10, r4 */
>         ret     r9                      @ we think we have handled things
>
>
> @@ -200,7 +206,13 @@ ENTRY(vfp_support_entry)
>         @ not recognised by VFP
>
>         DBGSTR  "not VFP"
> -       local_bh_enable_ti r10, r4
> +/*     local_bh_enable_ti r10, r4 */
> +       push { r0, r2, r9, r10, r11, lr}
> +       ldr     r0, 0f
> +       mov     r1, #SOFTIRQ_DISABLE_OFFSET
> +0:
> +       bl      __local_bh_enable_ip
> +       pop { r0, r2, r9, r10, r11, lr}
>         ret     lr
>
>  process_exception:
> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> index 01bc48d738478..d4f68806e66b9 100644
> --- a/arch/arm/vfp/vfpmodule.c
> +++ b/arch/arm/vfp/vfpmodule.c
> @@ -515,11 +515,9 @@ static inline void vfp_pm_init(void) { }
>   */
>  void vfp_sync_hwstate(struct thread_info *thread)
>  {
> -       unsigned int cpu = get_cpu();
> -
>         local_bh_disable();
> -
> -       if (vfp_state_in_hw(cpu, thread)) {
> +       preempt_disable();
> +       if (vfp_state_in_hw(smp_processor_id(), thread)) {
>                 u32 fpexc = fmrx(FPEXC);
>
>                 /*
> @@ -530,8 +528,8 @@ void vfp_sync_hwstate(struct thread_info *thread)
>                 fmxr(FPEXC, fpexc);
>         }
>
> +       preempt_enable();
>         local_bh_enable();
> -       put_cpu();
>  }
>
>  /* Ensure that the thread reloads the hardware VFP state on the next use. */
>
> Sebastian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
  2023-05-16 17:13   ` Ard Biesheuvel
@ 2023-05-17  8:54     ` Sebastian Andrzej Siewior
  2023-05-17 11:05       ` Ard Biesheuvel
  0 siblings, 1 reply; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-17  8:54 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

On 2023-05-16 19:13:26 [+0200], Ard Biesheuvel wrote:
> On Tue, 16 May 2023 at 18:33, Sebastian Andrzej Siewior
> <bigeasy@linutronix.de> wrote:
> >
> > + ard, peterz
> 
> Hello Sebastian,
Hi Ard,

> Please check whether 6.4-rc2 is equally affected - we fixed some
> issues related to BH en/disabling from asm code.

oh oh Ard, you are the best. Thank you.

…
> Please take a look, and confirm whether or not we still need to drop
> the get_cpu() call from vfp_sync_hwstate()

Yes, it needs to be dealt with. At the time vfp_sync_hwstate() is
invoked, the context is fully preemptible. So that get_cpu() needs to be
removed before local_bh_disable(). But…

If I understand the logic right, then the VFP state is saved on context
switch and the FPU access is disabled but the content remains and is in
sync with vfp_current_hw_state. Upon first usage (after the context
switch) in userland there is an exception at which point the access to
the FPU is enabled (FPEXC_EX) and the FPU content is loaded unless it is
in sync as per vfp_current_hw_state.

That means it relies on disabled preemption while operating on
vfp_current_hw_state. On PREEMPT_RT softirqs are only handled in process
context (for instance at the end of the threaded interrupt). Therefore
it is sufficient to disable preemption instead of BH. It would need
something like x86's fpregs_lock().
Otherwise vfp_entry() can get preempted after decisions based on
vfp_current_hw_state have been made. Also kernel_neon_begin() could get
preempted after enabling FPU. I think based on current logic, after
kernel_neon_begin() a task can get preempted on PREEMPT_RT and since
vfp_current_hw_state is NULL the registers won't be saved and a zero set
of registers will be loaded once the neon task gets back on the CPU (it
seems that VFP exceptions in kernel mode are treated the same way as
those in user mode).

What do you think?

> Thanks,
> Ard.

Sebastian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
  2023-05-17  8:54     ` Sebastian Andrzej Siewior
@ 2023-05-17 11:05       ` Ard Biesheuvel
  2023-05-17 14:37         ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 16+ messages in thread
From: Ard Biesheuvel @ 2023-05-17 11:05 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

On Wed, 17 May 2023 at 10:54, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2023-05-16 19:13:26 [+0200], Ard Biesheuvel wrote:
> > On Tue, 16 May 2023 at 18:33, Sebastian Andrzej Siewior
> > <bigeasy@linutronix.de> wrote:
> > >
> > > + ard, peterz
> >
> > Hello Sebastian,
> Hi Ard,
>
> > Please check whether 6.4-rc2 is equally affected - we fixed some
> > issues related to BH en/disabling from asm code.
>
> oh oh Ard, you are the best. Thank you.
>

Well, I was the one who broke things in the first place, so it's only
reasonable that I was the one fixing them again.

For context, these changes give a substantial performance boost on the
RX path to IPsec or WireGuard using SIMD based software crypto.

> …
> > Please take a look, and confirm whether or not we still need to drop
> > the get_cpu() call from vfp_sync_hwstate()
>
> Yes, it needs to be dealt with. At the time vfp_sync_hwstate() is
> invoked, the context is fully preemptible. So that get_cpu() needs to
> be removed before local_bh_disable(). But…
>
> If I understand the logic right, then the VFP state is saved on context
> switch and the FPU access is disabled but the content remains and is in
> sync with vfp_current_hw_state. Upon first usage (after the context
> switch) in userland there is an exception at which point the access to
> the FPU is enabled (FPEXC_EX) and the FPU content is loaded unless it is
> in sync as per vfp_current_hw_state.
>

Exactly. On UP we lazily preserve the user space FP state, and on SMP
we always preserve it on a context switch. Then, only when user space
actually tries to use the SIMD context, the correct data is copied in.
That way, we don't take the performance hit for tasks that are blocked
in the kernel.

> That means it relies on disabled preemption while operating on
> vfp_current_hw_state.

In general, this code relies on preserving/restoring the VFP context
to/from memory to be a critical section, as it needs to run to
completion once it is started.

> On PREEMPT_RT softirqs are only handled in process
> context (for instance at the end of the threaded interrupt). Therefore
> it is sufficient to disable preemption instead of BH. It would need
> something like x86's fpregs_lock().

I think this is not the only issue, to be honest. We cannot preempt
tasks while they are using the SIMD unit in kernel mode, as the kernel
mode context is never preserved/restored. So we at least need to
disable preemption again in kernel_neon_begin() [which now relies on
BH disable to disable preemption but as you point out, that is only
the case on !RT]

Given this, I wonder whether supporting kernel mode NEON on PREEMPT_RT
is worth it to begin with. AIUI, the alternative is to disable both
preemption and BH when touching the VFP state.

> Otherwise vfp_entry() can get preempted after decisions based on
> vfp_current_hw_state have been made. Also kernel_neon_begin() could get
> preempted after enabling FPU. I think based on current logic, after
> kernel_neon_begin() a task can get preempted on PREEMPT_RT and since
> vfp_current_hw_state is NULL the registers won't be saved and a zero set
> of registers will be loaded once the neon task gets back on the CPU (it
> seems that VFP exceptions in kernel mode are treated the same way as
> those in user mode).
>
> What do you think?
>

Indeed. kernel_neon_begin() no longer disabling preemption is
definitely wrong on PREEMPT_RT. The question is whether we want to
untangle this or just make PREEMPT_RT and KERNEL_MODE_NEON mutually
exclusive. This, by itself, makes quite a lot of sense, actually,
given that on actual 32-bit hardware, the SIMD speedup is not that
spectacular (the AES and other crypto instructions, which do give an
order of magnitude speedup are only implemented on 64-bit cores, which
usually run a 64-bit kernel). So if real-time behavior is a
requirement, using ordinary crypto implemented in C (which does not
require preemption to be disabled) is likely preferred anyway.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
  2023-05-17 11:05       ` Ard Biesheuvel
@ 2023-05-17 14:37         ` Sebastian Andrzej Siewior
  2023-05-17 22:05           ` Ard Biesheuvel
  0 siblings, 1 reply; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-17 14:37 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

On 2023-05-17 13:05:32 [+0200], Ard Biesheuvel wrote:
> For context, these changes give a substantial performance boost on the
> RX path to IPsec or WireGuard using SIMD based software crypto.
> 
> > …
> > > Please take a look, and confirm whether or not we still need to drop
> > > the get_cpu() call from vfp_sync_hwstate()
> >
> > Yes, it needs to be dealt with. At the time vfp_sync_hwstate() is
> > invoked, the context is fully preemptible. So that get_cpu() needs
> > to be removed before local_bh_disable(). But…
> >
> > If I understand the logic right, then the VFP state is saved on context
> > switch and the FPU access is disabled but the content remains and is in
> > sync with vfp_current_hw_state. Upon first usage (after the context
> > switch) in userland there is an exception at which point the access to
> > the FPU is enabled (FPEXC_EX) and the FPU content is loaded unless it is
> > in sync as per vfp_current_hw_state.
> >
> 
> Exactly. On UP we lazily preserve the user space FP state, and on SMP
> we always preserve it on a context switch. Then, only when user space
> actually tries to use the SIMD context, the correct data is copied in.
> That way, we don't take the performance hit for tasks that are blocked
> in the kernel.
> 
> > That means it relies on disabled preemption while operating on
> > vfp_current_hw_state.
> 
> In general, this code relies on preserving/restoring the VFP context
> to/from memory to be a critical section, as it needs to run to
> completion once it is started.
> 
> > On PREEMPT_RT softirqs are only handled in process
> > context (for instance at the end of the threaded interrupt). Therefore
> > it is sufficient to disable preemption instead of BH. It would need
> > something like x86's fpregs_lock().
> 
> I think this is not the only issue, to be honest. We cannot preempt
> tasks while they are using the SIMD unit in kernel mode, as the kernel
> mode context is never preserved/restored. So we at least need to
> disable preemption again in kernel_neon_begin() [which now relies on
> BH disable to disable preemption but as you point out, that is only
> the case on !RT]
> 
> Given this, I wonder whether supporting kernel mode NEON on PREEMPT_RT
> is worth it to begin with. AIUI, the alternative is to disable both
> preemption and BH when touching the VFP state.

Only preemption on RT. So it is debatable if it makes sense to use NEON
on RT. The preempt-off section is limited to PAGE_SIZE due to the
scatter/gather walk if I remember correctly.

> > Otherwise vfp_entry() can get preempted after decisions based on
> > vfp_current_hw_state have been made. Also kernel_neon_begin() could get
> > preempted after enabling FPU. I think based on current logic, after
> > kernel_neon_begin() a task can get preempted on PREEMPT_RT and since
> > vfp_current_hw_state is NULL the registers won't be saved and a zero set
> > of registers will be loaded once the neon task gets back on the CPU (it
> > seems that VFP exceptions in kernel mode are treated the same way as
> > those in user mode).
> >
> > What do you think?
> >
> 
> Indeed. kernel_neon_begin() no longer disabling preemption is
> definitely wrong on PREEMPT_RT. The question is whether we want to
> untangle this or just make PREEMPT_RT and KERNEL_MODE_NEON mutually
> exclusive. This, by itself, makes quite a lot of sense, actually,
> given that on actual 32-bit hardware, the SIMD speedup is not that
> spectacular (the AES and other crypto instructions, which do give an
> order of magnitude speedup are only implemented on 64-bit cores, which
> usually run a 64-bit kernel). So if real-time behavior is a
> requirement, using ordinary crypto implemented in C (which does not
> require preemption to be disabled) is likely preferred anyway.

I have no opinion which can be backed up by numbers. So I have no
problem to go along with it and disable KERNEL_MODE_NEON since it is
simpler and the benefit is questionable.

So patch
- #1 KERNEL_MODE_NEON depends on !RT
- #2 disable BH followed by preemption in vfp_sync_hwstate() and
  vfp_entry().

if so, let me prepare something unless you want to :)

Sebastian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
  2023-05-17 14:37         ` Sebastian Andrzej Siewior
@ 2023-05-17 22:05           ` Ard Biesheuvel
  2023-05-19 14:57             ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Sebastian Andrzej Siewior
  0 siblings, 1 reply; 16+ messages in thread
From: Ard Biesheuvel @ 2023-05-17 22:05 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

On Wed, 17 May 2023 at 16:37, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2023-05-17 13:05:32 [+0200], Ard Biesheuvel wrote:
> > For context, these changes give a substantial performance boost on the
> > RX path to IPsec or WireGuard using SIMD based software crypto.
> >
> > > …
> > > > Please take a look, and confirm whether or not we still need to drop
> > > > the get_cpu() call from vfp_sync_hwstate()
> > >
> > > Yes, it needs to be dealt with. At the time vfp_sync_hwstate() is
> > > invoked, the context is fully preemptible. So that get_cpu() needs
> > > to be removed before local_bh_disable(). But…
> > >
> > > If I understand the logic right, then the VFP state is saved on context
> > > switch and the FPU access is disabled but the content remains and is in
> > > sync with vfp_current_hw_state. Upon first usage (after the context
> > > switch) in userland there is an exception at which point the access to
> > > the FPU is enabled (FPEXC_EX) and the FPU content is loaded unless it is
> > > in sync as per vfp_current_hw_state.
> > >
> >
> > Exactly. On UP we lazily preserve the user space FP state, and on SMP
> > we always preserve it on a context switch. Then, only when user space
> > actually tries to use the SIMD context, the correct data is copied in.
> > That way, we don't take the performance hit for tasks that are blocked
> > in the kernel.
> >
> > > That means it relies on disabled preemption while operating on
> > > vfp_current_hw_state.
> >
> > In general, this code relies on preserving/restoring the VFP context
> > to/from memory to be a critical section, as it needs to run to
> > completion once it is started.
> >
> > > On PREEMPT_RT softirqs are only handled in process
> > > context (for instance at the end of the threaded interrupt). Therefore
> > > it is sufficient to disable preemption instead of BH. It would need
> > > something like x86's fpregs_lock().
> >
> > I think this is not the only issue, to be honest. We cannot preempt
> > tasks while they are using the SIMD unit in kernel mode, as the kernel
> > mode context is never preserved/restored. So we at least need to
> > disable preemption again in kernel_neon_begin() [which now relies on
> > BH disable to disable preemption but as you point out, that is only
> > the case on !RT]
> >
> > Given this, I wonder whether supporting kernel mode NEON on PREEMPT_RT
> > is worth it to begin with. AIUI, the alternative is to disable both
> > preemption and BH when touching the VFP state.
>
> Only preemption on RT.

Right.

> So it is debatable if it makes sense to use NEON
> on RT. The preempt-off section is limited to PAGE_SIZE due to the
> scatter/gather walk if I remember correctly.
>

This depends on the type of algorithm (skciphers/aeads vs
shashes/ahashes) but they generally all have a bounded quantum of work
before yielding the NEON (and therefore the CPU).

> > > Otherwise vfp_entry() can get preempted after decisions based on
> > > vfp_current_hw_state have been made. Also kernel_neon_begin() could get
> > > preempted after enabling FPU. I think based on current logic, after
> > > kernel_neon_begin() a task can get preempted on PREEMPT_RT and since
> > > vfp_current_hw_state is NULL the registers won't be saved and a zero set
> > > of registers will be loaded once the neon task gets back on the CPU (it
> > > seems that VFP exceptions in kernel mode are treated the same way as
> > > those in user mode).
> > >
> > > What do you think?
> > >
> >
> > Indeed. kernel_neon_begin() no longer disabling preemption is
> > definitely wrong on PREEMPT_RT. The question is whether we want to
> > untangle this or just make PREEMPT_RT and KERNEL_MODE_NEON mutually
> > exclusive. This, by itself, makes quite a lot of sense, actually,
> > given that on actual 32-bit hardware, the SIMD speedup is not that
> > spectacular (the AES and other crypto instructions, which do give an
> > order of magnitude speedup are only implemented on 64-bit cores, which
> > usually run a 64-bit kernel). So if real-time behavior is a
> > requirement, using ordinary crypto implemented in C (which does not
> > require preemption to be disabled) is likely preferred anyway.
>
> I have no opinion which can be backed up by numbers. So I have no
> problem to go along with it and disable KERNEL_MODE_NEON since it is
> simpler and the benefit is questionable.
>
> So patch
> - #1 KERNEL_MODE_NEON depends on !RT

Actually, given your explanation above, I think this may not be
necessary. Given that on !RT, disabling BH implies disabling
preemption and on RT disabling preemption implies disabling BH, it
should just be a matter of doing one or the other consistently,
depending on IS_ENABLED(CONFIG_PREEMPT_RT)

> - #2 disable BH followed by preemption in vfp_sync_hwstate() and
>   vfp_entry().
>

Not sure what to do here, given my answer to #1

In most cases where the VFP code disables BH, it is either because an
interrupting softirq may enable the SIMD unit and disable it again,
which would result in a FP exception when the interrupted context
tries to access the registers, or because such an interruption would
corrupt the preserved/restored FP state if it occurs while such a
preserve/restore is in progress.

Maybe the commit log of patch 62b95a7b44d1a30b3a9 sheds some more light here.

> if so, let me prepare something unless you want to :)
>

Don't let me stop you :-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT
  2023-05-17 22:05           ` Ard Biesheuvel
@ 2023-05-19 14:57             ` Sebastian Andrzej Siewior
  2023-05-19 14:57               ` [PATCH 1/3] ARM: vfp: Provide vfp_lock() for VFP locking Sebastian Andrzej Siewior
                                 ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-19 14:57 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

Hi Ard,

On 2023-05-18 00:05:33 [+0200], Ard Biesheuvel wrote:
> Don't let me stop you :-)

made a small series. It seems to address everything and with your
assembly rework it is not a big mess.

What do you think?

Sebastian



^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3] ARM: vfp: Provide vfp_lock() for VFP locking.
  2023-05-19 14:57             ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Sebastian Andrzej Siewior
@ 2023-05-19 14:57               ` Sebastian Andrzej Siewior
  2023-05-19 14:57               ` [PATCH 2/3] ARM: vfp: Use vfp_lock() in vfp_sync_hwstate() Sebastian Andrzej Siewior
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-19 14:57 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra, Sebastian Andrzej Siewior

kernel_neon_begin() uses local_bh_disable() to ensure exclusive access
to the VFP unit. This is broken on PREEMPT_RT because a BH-disabled
section remains preemptible there.

Introduce vfp_lock(), which uses local_bh_disable() on non-RT kernels
and preempt_disable() on PREEMPT_RT. Since softirqs are always processed
in thread context on RT, disabling preemption is enough to ensure that
the current context won't get interrupted by something that is using the
VFP. Use it in kernel_neon_begin().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/arm/vfp/vfpmodule.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 349dcb944a937..57f9527d1e50e 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -54,6 +54,34 @@ static unsigned int __initdata VFP_arch;
  */
 union vfp_state *vfp_current_hw_state[NR_CPUS];
 
+/*
+ * Claim ownership of the VFP unit.
+ *
+ * The caller may change VFP registers until vfp_unlock() is called.
+ *
+ * local_bh_disable() is used to disable preemption and to disable VFP
+ * processing in softirq context. On PREEMPT_RT kernels local_bh_disable() is
+ * not sufficient because it only serializes soft interrupt related sections
+ * via a local lock, but stays preemptible. Disabling preemption is the right
+ * choice here as bottom half processing is always in thread context on RT
+ * kernels so it implicitly prevents bottom half processing as well.
+ */
+static void vfp_lock(void)
+{
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_disable();
+	else
+		preempt_disable();
+}
+
+static void vfp_unlock(void)
+{
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_enable();
+	else
+		preempt_enable();
+}
+
 /*
  * Is 'thread's most up to date state stored in this CPUs hardware?
  * Must be called from non-preemptible context.
@@ -738,7 +766,7 @@ void kernel_neon_begin(void)
 	unsigned int cpu;
 	u32 fpexc;
 
-	local_bh_disable();
+	vfp_lock();
 
 	/*
 	 * Kernel mode NEON is only allowed outside of hardirq context with
@@ -769,7 +797,7 @@ void kernel_neon_end(void)
 {
 	/* Disable the NEON/VFP unit. */
 	fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
-	local_bh_enable();
+	vfp_unlock();
 }
 EXPORT_SYMBOL(kernel_neon_end);
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] ARM: vfp: Use vfp_lock() in vfp_sync_hwstate().
  2023-05-19 14:57             ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Sebastian Andrzej Siewior
  2023-05-19 14:57               ` [PATCH 1/3] ARM: vfp: Provide vfp_lock() for VFP locking Sebastian Andrzej Siewior
@ 2023-05-19 14:57               ` Sebastian Andrzej Siewior
  2023-05-19 14:57               ` [PATCH 3/3] ARM: vfp: Use vfp_lock() in vfp_entry() Sebastian Andrzej Siewior
  2023-05-19 16:14               ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Ard Biesheuvel
  3 siblings, 0 replies; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-19 14:57 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra, Sebastian Andrzej Siewior

vfp_sync_hwstate() uses preempt_disable() followed by local_bh_disable()
to ensure that it won't get interrupted while checking the VFP state.
This breaks on PREEMPT_RT because softirq handling there can get
preempted, and local_bh_disable() serializes the related section via a
sleeping lock, which must not be acquired with preemption disabled.

Use the vfp_lock() to synchronize the access.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/arm/vfp/vfpmodule.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 57f9527d1e50e..543dc7f5a27e3 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -542,11 +542,9 @@ static inline void vfp_pm_init(void) { }
  */
 void vfp_sync_hwstate(struct thread_info *thread)
 {
-	unsigned int cpu = get_cpu();
+	vfp_lock();
 
-	local_bh_disable();
-
-	if (vfp_state_in_hw(cpu, thread)) {
+	if (vfp_state_in_hw(raw_smp_processor_id(), thread)) {
 		u32 fpexc = fmrx(FPEXC);
 
 		/*
@@ -557,8 +555,7 @@ void vfp_sync_hwstate(struct thread_info *thread)
 		fmxr(FPEXC, fpexc);
 	}
 
-	local_bh_enable();
-	put_cpu();
+	vfp_unlock();
 }
 
 /* Ensure that the thread reloads the hardware VFP state on the next use. */
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] ARM: vfp: Use vfp_lock() in vfp_entry().
  2023-05-19 14:57             ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Sebastian Andrzej Siewior
  2023-05-19 14:57               ` [PATCH 1/3] ARM: vfp: Provide vfp_lock() for VFP locking Sebastian Andrzej Siewior
  2023-05-19 14:57               ` [PATCH 2/3] ARM: vfp: Use vfp_lock() in vfp_sync_hwstate() Sebastian Andrzej Siewior
@ 2023-05-19 14:57               ` Sebastian Andrzej Siewior
  2023-05-19 16:14               ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Ard Biesheuvel
  3 siblings, 0 replies; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-19 14:57 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra, Sebastian Andrzej Siewior

vfp_entry() is invoked from the exception handler and is fully
preemptible. It uses local_bh_disable() to remain uninterrupted while
checking the VFP state.
This does not work on PREEMPT_RT because local_bh_disable()
serializes the relevant section while the context remains fully
preemptible.

Use vfp_lock() for uninterrupted access.

VFP_bounce() is invoked from within vfp_entry() and may send a signal.
Sending a signal uses spinlock_t which becomes a sleeping lock
on PREEMPT_RT and must not be acquired within a preempt-disabled
section. Move the vfp_raise_sigfpe() block outside of the
preempt-disabled section.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/arm/vfp/vfphw.S     |  7 ++-----
 arch/arm/vfp/vfpmodule.c | 30 ++++++++++++++++++++----------
 2 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/arch/arm/vfp/vfphw.S b/arch/arm/vfp/vfphw.S
index a4610d0f32152..860512042bc21 100644
--- a/arch/arm/vfp/vfphw.S
+++ b/arch/arm/vfp/vfphw.S
@@ -180,10 +180,7 @@ ENTRY(vfp_support_entry)
 					@ always subtract 4 from the following
 					@ instruction address.
 
-local_bh_enable_and_ret:
-	adr	r0, .
-	mov	r1, #SOFTIRQ_DISABLE_OFFSET
-	b	__local_bh_enable_ip	@ tail call
+	b	vfp_exit	@ tail call
 
 look_for_VFP_exceptions:
 	@ Check for synchronous or asynchronous exception
@@ -206,7 +203,7 @@ ENTRY(vfp_support_entry)
 	@ not recognised by VFP
 
 	DBGSTR	"not VFP"
-	b	local_bh_enable_and_ret
+	b	vfp_exit	@ tail call
 
 process_exception:
 	DBGSTR	"bounce"
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 543dc7f5a27e3..a2745d17e9c71 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -267,7 +267,7 @@ static void vfp_panic(char *reason, u32 inst)
 /*
  * Process bitmask of exception conditions.
  */
-static void vfp_raise_exceptions(u32 exceptions, u32 inst, u32 fpscr, struct pt_regs *regs)
+static int vfp_raise_exceptions(u32 exceptions, u32 inst, u32 fpscr)
 {
 	int si_code = 0;
 
@@ -275,8 +275,7 @@ static void vfp_raise_exceptions(u32 exceptions, u32 inst, u32 fpscr, struct pt_
 
 	if (exceptions == VFP_EXCEPTION_ERROR) {
 		vfp_panic("unhandled bounce", inst);
-		vfp_raise_sigfpe(FPE_FLTINV, regs);
-		return;
+		return FPE_FLTINV;
 	}
 
 	/*
@@ -304,8 +303,7 @@ static void vfp_raise_exceptions(u32 exceptions, u32 inst, u32 fpscr, struct pt_
 	RAISE(FPSCR_OFC, FPSCR_OFE, FPE_FLTOVF);
 	RAISE(FPSCR_IOC, FPSCR_IOE, FPE_FLTINV);
 
-	if (si_code)
-		vfp_raise_sigfpe(si_code, regs);
+	return si_code;
 }
 
 /*
@@ -350,6 +348,8 @@ static u32 vfp_emulate_instruction(u32 inst, u32 fpscr, struct pt_regs *regs)
 void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs)
 {
 	u32 fpscr, orig_fpscr, fpsid, exceptions;
+	int si_code2 = 0;
+	int si_code = 0;
 
 	pr_debug("VFP: bounce: trigger %08x fpexc %08x\n", trigger, fpexc);
 
@@ -397,7 +397,7 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs)
 		 * unallocated VFP instruction but with FPSCR.IXE set and not
 		 * on VFP subarch 1.
 		 */
-		 vfp_raise_exceptions(VFP_EXCEPTION_ERROR, trigger, fpscr, regs);
+		si_code = vfp_raise_exceptions(VFP_EXCEPTION_ERROR, trigger, fpscr);
 		goto exit;
 	}
 
@@ -422,7 +422,7 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs)
 	 */
 	exceptions = vfp_emulate_instruction(trigger, fpscr, regs);
 	if (exceptions)
-		vfp_raise_exceptions(exceptions, trigger, orig_fpscr, regs);
+		si_code2 = vfp_raise_exceptions(exceptions, trigger, orig_fpscr);
 
 	/*
 	 * If there isn't a second FP instruction, exit now. Note that
@@ -441,9 +441,14 @@ void VFP_bounce(u32 trigger, u32 fpexc, struct pt_regs *regs)
  emulate:
 	exceptions = vfp_emulate_instruction(trigger, orig_fpscr, regs);
 	if (exceptions)
-		vfp_raise_exceptions(exceptions, trigger, orig_fpscr, regs);
+		si_code = vfp_raise_exceptions(exceptions, trigger, orig_fpscr);
+
  exit:
-	local_bh_enable();
+	vfp_unlock();
+	if (si_code2)
+		vfp_raise_sigfpe(si_code2, regs);
+	if (si_code)
+		vfp_raise_sigfpe(si_code, regs);
 }
 
 static void vfp_enable(void *unused)
@@ -684,10 +689,15 @@ asmlinkage void vfp_entry(u32 trigger, struct thread_info *ti, u32 resume_pc,
 	if (unlikely(!have_vfp))
 		return;
 
-	local_bh_disable();
+	vfp_lock();
 	vfp_support_entry(trigger, ti, resume_pc, resume_return_address);
 }
 
+asmlinkage void vfp_exit(void)
+{
+	vfp_unlock();
+}
+
 #ifdef CONFIG_KERNEL_MODE_NEON
 
 static int vfp_kmode_exception(struct pt_regs *regs, unsigned int instr)
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT
  2023-05-19 14:57             ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Sebastian Andrzej Siewior
                                 ` (2 preceding siblings ...)
  2023-05-19 14:57               ` [PATCH 3/3] ARM: vfp: Use vfp_lock() in vfp_entry() Sebastian Andrzej Siewior
@ 2023-05-19 16:14               ` Ard Biesheuvel
  2023-05-19 18:06                 ` Sebastian Andrzej Siewior
  3 siblings, 1 reply; 16+ messages in thread
From: Ard Biesheuvel @ 2023-05-19 16:14 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

On Fri, 19 May 2023 at 16:57, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> Hi Ard,
>
> On 2023-05-18 00:05:33 [+0200], Ard Biesheuvel wrote:
> > Don't let me stop you :-)
>
> made a small series. It seems to address everything and with your
> assembly rework it is not a big mess.
>
> What do you think?
>

This looks reasonable to me - thanks for taking the time.

However, I was about to hit send on my PR to Russell for the following series:

https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-vfp-softirq-fixes

which moves all of the VFP processing to C, and so this is going to
conflict at the very least, but also provide a cleaner base for patch
#2 in this series.

Also, aren't there a few other places in the VFP code where BH are
disabled and enabled again? Do those need the same treatment?

Thanks,
Ard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT
  2023-05-19 16:14               ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Ard Biesheuvel
@ 2023-05-19 18:06                 ` Sebastian Andrzej Siewior
  2023-05-19 19:42                   ` Ard Biesheuvel
  0 siblings, 1 reply; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-19 18:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

On 2023-05-19 18:14:38 [+0200], Ard Biesheuvel wrote:
> This looks reasonable to me - thanks for taking the time.
> 
> However, I was about to hit send on my PR to Russell for the following series:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-vfp-softirq-fixes
> 
> which moves all of the VFP processing to C, and so this is going to
> conflict at the very least, but also provide a cleaner base for patch
> #2 in this series.

Okay. PR? Is this still going through rmk's patch tracking system, or did
something change? Either way, please let me know once this gets through so
I can rebase accordingly. In the meantime I'll throw this into -RT.

> Also, aren't there a few other places in the VFP code where BH are
> disabled and enabled again? Do those need the same treatment?

If so, I haven't found them. A grep in arch/arm/ for bh_disable() now
has one hit, and that is vfp_lock().

> Thanks,
> Ard.

Sebastian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
       [not found] <202305142217.46875.pisa@cmp.felk.cvut.cz>
  2023-05-16 16:33 ` Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010 Sebastian Andrzej Siewior
@ 2023-05-19 18:52 ` Sebastian Andrzej Siewior
  2023-05-21 14:57   ` Pavel Pisa
  1 sibling, 1 reply; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-19 18:52 UTC (permalink / raw)
  To: Pavel Pisa; +Cc: linux-rt-users, Pavel Hronek, Thomas Gleixner

On 2023-05-14 22:17:46 [+0200], Pavel Pisa wrote:
> Dear PREEMPT RT Linux kernel maintainers,
> 
> we have started to experience failures of PREEMPT RT kernels builds from
> 6.3 RC versions. We use kernel on MZ_APO educational kits
> based on MicroZed, Xilinx Zynq 7Z010 SoC. More about actual HW

I just released v6.3.3-rt15. Could you please give it a try and report
back if all problems are gone or not? Based on the backtrace I can't
tell if the other problems are follow-up or other things. 

Sebastian

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT
  2023-05-19 18:06                 ` Sebastian Andrzej Siewior
@ 2023-05-19 19:42                   ` Ard Biesheuvel
  0 siblings, 0 replies; 16+ messages in thread
From: Ard Biesheuvel @ 2023-05-19 19:42 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Pavel Pisa, linux-rt-users, Pavel Hronek, Thomas Gleixner,
	Peter Zijlstra

On Fri, 19 May 2023 at 20:06, Sebastian Andrzej Siewior
<bigeasy@linutronix.de> wrote:
>
> On 2023-05-19 18:14:38 [+0200], Ard Biesheuvel wrote:
> > This looks reasonable to me - thanks for taking the time.
> >
> > However, I was about to hit send on my PR to Russell for the following series:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/log/?h=arm-vfp-softirq-fixes
> >
> > which moves all of the VFP processing to C, and so this is going to
> > conflict at the very least, but also provide a cleaner base for patch
> > #2 in this series.
>
> Okay. PR? Is this still going through rmk's patch tracking system, or did
> something change? Either way, please let me know once this gets through so
> I can rebase accordingly. In the meantime I'll throw this into -RT.
>

As I said, I am about to send it but I haven't yet, and it usually
takes Russell some time to catch up. So this won't be in -next for
another week or two.

> > Also, aren't there a few other places in the VFP code where BH are
> > disabled and enabled again? Do those need the same treatment?
>
> If so, I haven't found them. A grep in arch/arm/ for bh_disable() now
> has one hit, and that is vfp_lock().
>

Apologies, I got myself confused.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
  2023-05-19 18:52 ` Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010 Sebastian Andrzej Siewior
@ 2023-05-21 14:57   ` Pavel Pisa
  2023-05-22  7:44     ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 16+ messages in thread
From: Pavel Pisa @ 2023-05-21 14:57 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users, Pavel Hronek, Thomas Gleixner

Dear Sebastian,

thanks much for the time invested in the problem analysis
and in designing the solution. Pavel Hronek has checked the actual daily
runs of for-kbuild-bot/current-stable of

https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel

and manually tested your 6.3 RT version and reports that it is stable
now.

But he had the problem that his e-mail did not pass the mailing list
filter rules. I am trying to resend it from my other e-mail. I cannot send
from my university account either, so I am trying my company one to see if
it passes

-------- Forwarded Message --------
Subject: Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
Date: Sun, 21 May 2023 16:42:13 +0200
From: Pavel Hronek <hronepa1@fel.cvut.cz>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
CC: Pavel Pisa <pisa@cmp.felk.cvut.cz>, linux-rt-users@vger.kernel.org, 
Thomas Gleixner <tglx@linutronix.de>

Dear maintainers,

Thank you for looking into the issue.

On 19. 05. 23 20:52, Sebastian Andrzej Siewior wrote:
> I just released v6.3.3-rt15. Could you please give it a try and report
> back if all problems are gone or not? Based on the backtrace I can't
> tell if the other problems are follow-up or other things.

The last two rounds of automated tests on 6.4.0-rc2-rt1
(https://canbus.pages.fel.cvut.cz/can-latester/inspect.html?kernel=rt&flood=1&prio=1&kern=1&load=1)
have completed without the lockup and the messages no longer appear in 
the console.

I've also run the tests manually on 6.3.3-rt15 and the same workload 
that would freeze an earlier version of 6.3 within a minute has now 
lasted for over 30 minutes before I stopped it. No warnings here either. 
I think I can confirm the problems are gone.

Best regards,

Pavel Hronek


On Friday 19 of May 2023 20:52:41 Sebastian Andrzej Siewior wrote:
> On 2023-05-14 22:17:46 [+0200], Pavel Pisa wrote:
> > Dear PREEMPT RT Linux kernel maintainers,
> >
> > we have started to experience failures of PREEMPT RT kernels builds from
> > 6.3 RC versions. We use kernel on MZ_APO educational kits
> > based on MicroZed, Xilinx Zynq 7Z010 SoC. More about actual HW
>
> I just released v6.3.3-rt15. Could you please give it a try and report
> back if all problems are gone or not? Based on the backtrace I can't
> tell if the other problems are follow-up or other things.
>
> Sebastian


-- 
                Pavel Pisa
    phone:      +420 603531357
    e-mail:     pisa@cmp.felk.cvut.cz
    Department of Control Engineering FEE CVUT
    Karlovo namesti 13, 121 35, Prague 2
    university: http://control.fel.cvut.cz/
    personal:   http://cmp.felk.cvut.cz/~pisa
    projects:   https://www.openhub.net/accounts/ppisa
    CAN related:http://canbus.pages.fel.cvut.cz/
    RISC-V education: https://comparch.edu.cvut.cz/
    Open Technologies Research Education and Exchange Services
    https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
  2023-05-21 14:57   ` Pavel Pisa
@ 2023-05-22  7:44     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 16+ messages in thread
From: Sebastian Andrzej Siewior @ 2023-05-22  7:44 UTC (permalink / raw)
  To: Pavel Pisa; +Cc: linux-rt-users, Pavel Hronek, Thomas Gleixner

On 2023-05-21 16:57:21 [+0200], Pavel Pisa wrote:
> Dear Sebastian,
Hi Pavel,

> and manually tested your 6.3 RT version and reports that it is stable
> now.

thank you.

> But he had the problem that his e-mail did not pass the mailing list
> filter rules. I am trying to resend it from my other e-mail. I cannot send
> from my university account either, so I am trying my company one to see if
> it passes

If it is too large it will be dropped. Was there a bounce message?

> -------- Forwarded Message --------
> Subject: Re: Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010
> Date: Sun, 21 May 2023 16:42:13 +0200
> From: Pavel Hronek <hronepa1@fel.cvut.cz>
> To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> CC: Pavel Pisa <pisa@cmp.felk.cvut.cz>, linux-rt-users@vger.kernel.org, 
> Thomas Gleixner <tglx@linutronix.de>
> 
> The last two rounds of automated tests on 6.4.0-rc2-rt1
> (https://canbus.pages.fel.cvut.cz/can-latester/inspect.html?kernel=rt&flood=1&prio=1&kern=1&load=1)
> have completed without the lockup and the messages no longer appear in 
> the console.

6.4 got the same fixes as I added to the last 6.3 release. Good to hear
that there isn't anything new.

> Pavel Hronek

Sebastian

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-05-22  7:44 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <202305142217.46875.pisa@cmp.felk.cvut.cz>
2023-05-16 16:33 ` Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010 Sebastian Andrzej Siewior
2023-05-16 17:13   ` Ard Biesheuvel
2023-05-17  8:54     ` Sebastian Andrzej Siewior
2023-05-17 11:05       ` Ard Biesheuvel
2023-05-17 14:37         ` Sebastian Andrzej Siewior
2023-05-17 22:05           ` Ard Biesheuvel
2023-05-19 14:57             ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Sebastian Andrzej Siewior
2023-05-19 14:57               ` [PATCH 1/3] ARM: vfp: Provide vfp_lock() for VFP locking Sebastian Andrzej Siewior
2023-05-19 14:57               ` [PATCH 2/3] ARM: vfp: Use vfp_lock() in vfp_sync_hwstate() Sebastian Andrzej Siewior
2023-05-19 14:57               ` [PATCH 3/3] ARM: vfp: Use vfp_lock() in vfp_entry() Sebastian Andrzej Siewior
2023-05-19 16:14               ` [PATCH 0/3] ARM: vfp: Fixes for PREEMP_RT Ard Biesheuvel
2023-05-19 18:06                 ` Sebastian Andrzej Siewior
2023-05-19 19:42                   ` Ard Biesheuvel
2023-05-19 18:52 ` Preemp-RT kernel builds from 6.3 on fails on Xilinx Zynq 7Z010 Sebastian Andrzej Siewior
2023-05-21 14:57   ` Pavel Pisa
2023-05-22  7:44     ` Sebastian Andrzej Siewior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).