linux-mips.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
@ 2023-03-10 12:13 Álvaro Fernández Rojas
  2023-03-11 17:32 ` Florian Fainelli
  2023-03-16 18:05 ` [PATCH v2] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() for TP1 Álvaro Fernández Rojas
  0 siblings, 2 replies; 17+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-10 12:13 UTC (permalink / raw)
  To: f.fainelli, jonas.gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel
  Cc: Álvaro Fernández Rojas

arch_sync_dma_for_cpu_all() causes kernel panics on BCM6358 with EHCI/OHCI:
[    3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform
[    3.895011] Reserved instruction in kernel code[#1]:
[    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
[    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
[    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
[    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
[    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
[    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
[    3.932848] $20   : 00000000 00000000 55590000 77d70000
[    3.938251] $24   : 00000018 00000010
[    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
[    3.949058] Hi    : 00000000
[    3.952013] Lo    : 00000000
[    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
[    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
[    3.965913] Status: 10008703	KERNEL EXL IE
[    3.970216] Cause : 00800028 (ExcCode 0a)
[    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
[    3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[    3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8)
[    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470
[    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74
[    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003
[    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000
[    4.035512]         77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000
[    4.044196]         ...
[    4.046706] Call Trace:
[    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
[    4.054356] [<80015c70>] setup_frame+0xdc/0x124
[    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
[    4.064207] [<80011b50>] work_notifysig+0x10/0x18
[    4.069036]
[    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> ac830004  3c048063  0c0228aa  24846a00  26240010
[    4.080686]
[    4.082517] ---[ end trace 22a8edb41f5f983b ]---
[    4.087374] Kernel panic - not syncing: Fatal exception
[    4.092753] Rebooting in 1 seconds..

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
---
 arch/mips/bmips/dma.c   | 5 +++++
 arch/mips/bmips/setup.c | 5 +++++
 2 files changed, 10 insertions(+)

diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
index 33788668cbdb..3769d5efdede 100644
--- a/arch/mips/bmips/dma.c
+++ b/arch/mips/bmips/dma.c
@@ -5,11 +5,16 @@
 #include <asm/bmips.h>
 #include <asm/io.h>
 
+int bmips_dma_sync_enabled = 1;
+
 void arch_sync_dma_for_cpu_all(void)
 {
 	void __iomem *cbr = BMIPS_GET_CBR();
 	u32 cfg;
 
+	if (!bmips_dma_sync_enabled)
+		return;
+
 	if (boot_cpu_type() != CPU_BMIPS3300 &&
 	    boot_cpu_type() != CPU_BMIPS4350 &&
 	    boot_cpu_type() != CPU_BMIPS4380)
diff --git a/arch/mips/bmips/setup.c b/arch/mips/bmips/setup.c
index e95b3f78e7cd..825e85e14010 100644
--- a/arch/mips/bmips/setup.c
+++ b/arch/mips/bmips/setup.c
@@ -35,6 +35,8 @@
 #define REG_BCM6328_OTP		((void __iomem *)CKSEG1ADDR(0x1000062c))
 #define BCM6328_TP1_DISABLED	BIT(9)
 
+extern int bmips_dma_sync_enabled;
+
 static const unsigned long kbase = VMLINUX_LOAD_ADDRESS & 0xfff00000;
 
 struct bmips_quirk {
@@ -104,6 +106,9 @@ static void bcm6358_quirks(void)
 	 * disable SMP for now
 	 */
 	bmips_smp_enabled = 0;
+
+	/* ARCH_HAS_SYNC_DMA_FOR_CPU_ALL causes kernel panics on BCM6358 */
+	bmips_dma_sync_enabled = 0;
 }
 
 static void bcm6368_quirks(void)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-10 12:13 [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() Álvaro Fernández Rojas
@ 2023-03-11 17:32 ` Florian Fainelli
  2023-03-11 19:44   ` Jonas Gorski
  2023-03-16 18:05 ` [PATCH v2] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() for TP1 Álvaro Fernández Rojas
  1 sibling, 1 reply; 17+ messages in thread
From: Florian Fainelli @ 2023-03-11 17:32 UTC (permalink / raw)
  To: Álvaro Fernández Rojas, jonas.gorski,
	bcm-kernel-feedback-list, tsbogend, linux-mips, linux-kernel



On 3/10/2023 4:13 AM, Álvaro Fernández Rojas wrote:
> arch_sync_dma_for_cpu_all() causes kernel panics on BCM6358 with EHCI/OHCI:
> [    3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform
> [    3.895011] Reserved instruction in kernel code[#1]:
> [    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
> [    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
> [    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
> [    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
> [    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
> [    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
> [    3.932848] $20   : 00000000 00000000 55590000 77d70000
> [    3.938251] $24   : 00000018 00000010
> [    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
> [    3.949058] Hi    : 00000000
> [    3.952013] Lo    : 00000000
> [    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
> [    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
> [    3.965913] Status: 10008703	KERNEL EXL IE
> [    3.970216] Cause : 00800028 (ExcCode 0a)
> [    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
> [    3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
> [    3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8)
> [    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470
> [    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74
> [    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003
> [    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000
> [    4.035512]         77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000
> [    4.044196]         ...
> [    4.046706] Call Trace:
> [    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
> [    4.054356] [<80015c70>] setup_frame+0xdc/0x124
> [    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
> [    4.064207] [<80011b50>] work_notifysig+0x10/0x18
> [    4.069036]
> [    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> ac830004  3c048063  0c0228aa  24846a00  26240010
> [    4.080686]
> [    4.082517] ---[ end trace 22a8edb41f5f983b ]---
> [    4.087374] Kernel panic - not syncing: Fatal exception
> [    4.092753] Rebooting in 1 seconds..

Did you pinpoint which specific instruction within 
arch_sync_dma_for_cpu_all() is causing the reserved instruction exception?

> 
> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> ---
>   arch/mips/bmips/dma.c   | 5 +++++
>   arch/mips/bmips/setup.c | 5 +++++
>   2 files changed, 10 insertions(+)
> 
> diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
> index 33788668cbdb..3769d5efdede 100644
> --- a/arch/mips/bmips/dma.c
> +++ b/arch/mips/bmips/dma.c
> @@ -5,11 +5,16 @@
>   #include <asm/bmips.h>
>   #include <asm/io.h>
>   
> +int bmips_dma_sync_enabled = 1;

bool bmips_dma_sync_disabled;

> +
>   void arch_sync_dma_for_cpu_all(void)
>   {
>   	void __iomem *cbr = BMIPS_GET_CBR();
>   	u32 cfg;
>   
> +	if (!bmips_dma_sync_enabled)
> +		return;

if (bmips_dma_sync_disabled)

> +
>   	if (boot_cpu_type() != CPU_BMIPS3300 &&
>   	    boot_cpu_type() != CPU_BMIPS4350 &&
>   	    boot_cpu_type() != CPU_BMIPS4380)
> diff --git a/arch/mips/bmips/setup.c b/arch/mips/bmips/setup.c
> index e95b3f78e7cd..825e85e14010 100644
> --- a/arch/mips/bmips/setup.c
> +++ b/arch/mips/bmips/setup.c
> @@ -35,6 +35,8 @@
>   #define REG_BCM6328_OTP		((void __iomem *)CKSEG1ADDR(0x1000062c))
>   #define BCM6328_TP1_DISABLED	BIT(9)
>   
> +extern int bmips_dma_sync_enabled;
> +
>   static const unsigned long kbase = VMLINUX_LOAD_ADDRESS & 0xfff00000;
>   
>   struct bmips_quirk {
> @@ -104,6 +106,9 @@ static void bcm6358_quirks(void)
>   	 * disable SMP for now
>   	 */
>   	bmips_smp_enabled = 0;
> +
> +	/* ARCH_HAS_SYNC_DMA_FOR_CPU_ALL causes kernel panics on BCM6358 */
> +	bmips_dma_sync_enabled = 0;

bmips_dma_sync_disabled = true;

>   }
>   
>   static void bcm6368_quirks(void)

-- 
Florian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-11 17:32 ` Florian Fainelli
@ 2023-03-11 19:44   ` Jonas Gorski
  2023-03-12  7:31     ` William Zhang
  0 siblings, 1 reply; 17+ messages in thread
From: Jonas Gorski @ 2023-03-11 19:44 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Álvaro Fernández Rojas, bcm-kernel-feedback-list,
	tsbogend, linux-mips, linux-kernel

On Sat, 11 Mar 2023 at 18:32, Florian Fainelli <f.fainelli@gmail.com> wrote:
>
>
>
> On 3/10/2023 4:13 AM, Álvaro Fernández Rojas wrote:
> > arch_sync_dma_for_cpu_all() causes kernel panics on BCM6358 with EHCI/OHCI:
> > [    3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform
> > [    3.895011] Reserved instruction in kernel code[#1]:
> > [    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
> > [    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
> > [    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
> > [    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
> > [    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
> > [    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
> > [    3.932848] $20   : 00000000 00000000 55590000 77d70000
> > [    3.938251] $24   : 00000018 00000010
> > [    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
> > [    3.949058] Hi    : 00000000
> > [    3.952013] Lo    : 00000000
> > [    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
> > [    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
> > [    3.965913] Status: 10008703       KERNEL EXL IE
> > [    3.970216] Cause : 00800028 (ExcCode 0a)
> > [    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
> > [    3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
> > [    3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8)
> > [    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470
> > [    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74
> > [    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003
> > [    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000
> > [    4.035512]         77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000
> > [    4.044196]         ...
> > [    4.046706] Call Trace:
> > [    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
> > [    4.054356] [<80015c70>] setup_frame+0xdc/0x124
> > [    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
> > [    4.064207] [<80011b50>] work_notifysig+0x10/0x18
> > [    4.069036]
> > [    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> ac830004  3c048063  0c0228aa  24846a00  26240010
> > [    4.080686]
> > [    4.082517] ---[ end trace 22a8edb41f5f983b ]---
> > [    4.087374] Kernel panic - not syncing: Fatal exception
> > [    4.092753] Rebooting in 1 seconds..
>
> Did you pinpoint which specific instruction within
> arch_sync_dma_for_cpu_all() is causing the reserved instruction exception?

It's setup_sigcontext(), not arch_sync_dma_for_cpu_all() that's
causing the exception ;-)

Hand decoding the Code gives me

lw $1, 0xb4($fp)
or $v0, 0, 0
addiu $a0, $s1, 8
sw $v0, 0($a0) <- the code in brackets, so I guess EPC?
sw $v1, 4($a0)

which I assume is this part:

err |= __put_user(regs->cp0_epc, &sc->sc_pc);

(0xb4 is the offset of cp0_epc, 0x8 the offset of sc_pc)

One thing I see is that we do the RAC flush for BMIPS3300, 4350 and
4380, but only initialize it for 3300 [1], but leave it at whatever
state the bootloader did for the other ones. Maybe it has some invalid
config in (that particuar?) 6358 that triggers issues later on after a
flush? E.g. the flush puts it in an error state, and the next time
something triggers a prefetch(write?) (by trying to access userspace)
it generates an error exception.

Just spit balling though.

[1] https://elixir.bootlin.com/linux/latest/source/arch/mips/kernel/smp-bmips.c#L587

Jonas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-11 19:44   ` Jonas Gorski
@ 2023-03-12  7:31     ` William Zhang
  2023-03-12 16:16       ` Florian Fainelli
  0 siblings, 1 reply; 17+ messages in thread
From: William Zhang @ 2023-03-12  7:31 UTC (permalink / raw)
  To: Jonas Gorski, Florian Fainelli
  Cc: Álvaro Fernández Rojas, bcm-kernel-feedback-list,
	tsbogend, linux-mips, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4076 bytes --]



On 03/11/2023 11:44 AM, Jonas Gorski wrote:
> On Sat, 11 Mar 2023 at 18:32, Florian Fainelli <f.fainelli@gmail.com> wrote:
>>
>>
>>
>> On 3/10/2023 4:13 AM, Álvaro Fernández Rojas wrote:
>>> arch_sync_dma_for_cpu_all() causes kernel panics on BCM6358 with EHCI/OHCI:
>>> [    3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform
>>> [    3.895011] Reserved instruction in kernel code[#1]:
>>> [    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
>>> [    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
>>> [    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
>>> [    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
>>> [    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
>>> [    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
>>> [    3.932848] $20   : 00000000 00000000 55590000 77d70000
>>> [    3.938251] $24   : 00000018 00000010
>>> [    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
>>> [    3.949058] Hi    : 00000000
>>> [    3.952013] Lo    : 00000000
>>> [    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
>>> [    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
>>> [    3.965913] Status: 10008703       KERNEL EXL IE
>>> [    3.970216] Cause : 00800028 (ExcCode 0a)
>>> [    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
>>> [    3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
>>> [    3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8)
>>> [    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470
>>> [    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74
>>> [    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003
>>> [    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000
>>> [    4.035512]         77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000
>>> [    4.044196]         ...
>>> [    4.046706] Call Trace:
>>> [    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
>>> [    4.054356] [<80015c70>] setup_frame+0xdc/0x124
>>> [    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
>>> [    4.064207] [<80011b50>] work_notifysig+0x10/0x18
>>> [    4.069036]
>>> [    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> ac830004  3c048063  0c0228aa  24846a00  26240010
>>> [    4.080686]
>>> [    4.082517] ---[ end trace 22a8edb41f5f983b ]---
>>> [    4.087374] Kernel panic - not syncing: Fatal exception
>>> [    4.092753] Rebooting in 1 seconds..
>>
>> Did you pinpoint which specific instruction within
>> arch_sync_dma_for_cpu_all() is causing the reserved instruction exception?
> 
> It's setup_sigcontext(), not arch_sync_dma_for_cpu_all() that's
> causing the exception ;-)
> 
> Hand decoding the Code gives me
> 
> lw $1, 0xb4($fp)
> or $v0, 0, 0
> addiu $a0, $s1, 8
> sw $v0, 0($a0) <- the code in brackets, so I guess EPC?
> sw $v1, 4($a0)
> 
> which I assume is this part:
> 
> err |= __put_user(regs->cp0_epc, &sc->sc_pc);
> 
> (0xb4 is the offset of cp0_epc, 0x8 the offset of sc_pc)
> 
> One thing I see is that we do the RAC flush for BMIPS3300, 4350 and
> 4380, but only initialize it for 3300 [1], but leave it at whatever
> state the bootloader did for the other ones. Maybe it has some invalid
> config in (that particuar?) 6358 that triggers issues later on after a
> flush? E.g. the flush puts it in an error state, and the next time
> something triggers a prefetch(write?) (by trying to access userspace)
> it generates an error exception.
> 
Depending on the bootloader but likely bootloader does not use RAC at 
all.  So agree that RAC may not be properly initialized when the flush 
function is called and push the stale data to corrupt memory and cause 
problem later on the userspace.

> Just spit balling though.
> 
> [1] https://elixir.bootlin.com/linux/latest/source/arch/mips/kernel/smp-bmips.c#L587
> 
> Jonas
> 

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4212 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-12  7:31     ` William Zhang
@ 2023-03-12 16:16       ` Florian Fainelli
  2023-03-12 18:50         ` Álvaro Fernández Rojas
  0 siblings, 1 reply; 17+ messages in thread
From: Florian Fainelli @ 2023-03-12 16:16 UTC (permalink / raw)
  To: William Zhang, Jonas Gorski
  Cc: Álvaro Fernández Rojas, bcm-kernel-feedback-list,
	tsbogend, linux-mips, linux-kernel



On 3/11/2023 11:31 PM, William Zhang wrote:
> 
> 
> On 03/11/2023 11:44 AM, Jonas Gorski wrote:
>> On Sat, 11 Mar 2023 at 18:32, Florian Fainelli <f.fainelli@gmail.com> 
>> wrote:
>>>
>>>
>>>
>>> On 3/10/2023 4:13 AM, Álvaro Fernández Rojas wrote:
>>>> arch_sync_dma_for_cpu_all() causes kernel panics on BCM6358 with 
>>>> EHCI/OHCI:
>>>> [    3.881739] usb 1-1: new high-speed USB device number 2 using 
>>>> ehci-platform
>>>> [    3.895011] Reserved instruction in kernel code[#1]:
>>>> [    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
>>>> [    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
>>>> [    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
>>>> [    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
>>>> [    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
>>>> [    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
>>>> [    3.932848] $20   : 00000000 00000000 55590000 77d70000
>>>> [    3.938251] $24   : 00000018 00000010
>>>> [    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
>>>> [    3.949058] Hi    : 00000000
>>>> [    3.952013] Lo    : 00000000
>>>> [    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
>>>> [    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
>>>> [    3.965913] Status: 10008703       KERNEL EXL IE
>>>> [    3.970216] Cause : 00800028 (ExcCode 0a)
>>>> [    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
>>>> [    3.979170] Modules linked in: ohci_platform ohci_hcd 
>>>> fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug 
>>>> usbcore nls_base usb_common
>>>> [    3.992907] Process init (pid: 1, threadinfo=(ptrval), 
>>>> task=(ptrval), tls=77e22ec8)
>>>> [    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 
>>>> 81431edc 7ff559b8 81428470
>>>> [    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 
>>>> 80015c70 806f0000 8063ae74
>>>> [    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 
>>>> 0000000a 77d6b418 00000003
>>>> [    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 
>>>> 00000001 00000000 04000000
>>>> [    4.035512]         77d54874 00000000 00000000 00000000 00000000 
>>>> 00000012 00000002 00000000
>>>> [    4.044196]         ...
>>>> [    4.046706] Call Trace:
>>>> [    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
>>>> [    4.054356] [<80015c70>] setup_frame+0xdc/0x124
>>>> [    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
>>>> [    4.064207] [<80011b50>] work_notifysig+0x10/0x18
>>>> [    4.069036]
>>>> [    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> 
>>>> ac830004  3c048063  0c0228aa  24846a00  26240010
>>>> [    4.080686]
>>>> [    4.082517] ---[ end trace 22a8edb41f5f983b ]---
>>>> [    4.087374] Kernel panic - not syncing: Fatal exception
>>>> [    4.092753] Rebooting in 1 seconds..
>>>
>>> Did you pinpoint which specific instruction within
>>> arch_sync_dma_for_cpu_all() is causing the reserved instruction 
>>> exception?
>>
>> It's setup_sigcontext(), not arch_sync_dma_for_cpu_all() that's
>> causing the exception ;-)
>>
>> Hand decoding the Code gives me
>>
>> lw $1, 0xb4($fp)
>> or $v0, 0, 0
>> addiu $a0, $s1, 8
>> sw $v0, 0($a0) <- the code in brackets, so I guess EPC?
>> sw $v1, 4($a0)
>>
>> which I assume is this part:
>>
>> err |= __put_user(regs->cp0_epc, &sc->sc_pc);
>>
>> (0xb4 is the offset of cp0_epc, 0x8 the offset of sc_pc)
>>
>> One thing I see is that we do the RAC flush for BMIPS3300, 4350 and
>> 4380, but only initialize it for 3300 [1], but leave it at whatever
>> state the bootloader did for the other ones. Maybe it has some invalid
>> config in (that particuar?) 6358 that triggers issues later on after a
>> flush? E.g. the flush puts it in an error state, and the next time
>> something triggers a prefetch(write?) (by trying to access userspace)
>> it generates an error exception.
>>
> Depending on the bootloader but likely bootloader does not use RAC at 
> all.  So agree that RAC may not be properly initialized when the flush 
> function is called and push the stale data to corrupt memory and cause 
> problem later on the userspace.

Alvaro, could you do the following and let us know the results, at boot 
time in bcm6358_quirks():

- issue a read_c0_brcm_config_0() and look whether bit 29 (RAC present) 
is set, if it is not set, then we should forcibly disable the use of the 
RAC using a flag

- issue a __raw_readl(cbr + BMIPS_RAC_CONFIG) and check whether bits 0 
(RAC_I) or 1 (RAC_D) are set, if not, try to set them and see whether it 
works

Thanks!
-- 
Florian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-12 16:16       ` Florian Fainelli
@ 2023-03-12 18:50         ` Álvaro Fernández Rojas
  2023-03-13 17:37           ` Florian Fainelli
  0 siblings, 1 reply; 17+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-12 18:50 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: William Zhang, Jonas Gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel

Hi Florian,

I tried what you suggested but it stil panics on EHCI:

[    0.000000] Linux version 5.15.98 (noltari@atlantis)
(mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
[    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
[    0.000000] bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x3c1b8041
[    0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)

It looks like bit 29 is set so RAC should be present.
And RAC_I seems to be set, but not RAC_D...

BTW, this is what I added to bmips_cpu_setup:

case CPU_BMIPS4350:
cfg = read_c0_brcm_config_0();
pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);

cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
pr_info("bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x%x\n", cfg);
__raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
__raw_readl(cbr + BMIPS_RAC_CONFIG);
break;

Best regards,
Álvaro.


El dom, 12 mar 2023 a las 17:16, Florian Fainelli
(<f.fainelli@gmail.com>) escribió:
>
>
>
> On 3/11/2023 11:31 PM, William Zhang wrote:
> >
> >
> > On 03/11/2023 11:44 AM, Jonas Gorski wrote:
> >> On Sat, 11 Mar 2023 at 18:32, Florian Fainelli <f.fainelli@gmail.com>
> >> wrote:
> >>>
> >>>
> >>>
> >>> On 3/10/2023 4:13 AM, Álvaro Fernández Rojas wrote:
> >>>> arch_sync_dma_for_cpu_all() causes kernel panics on BCM6358 with
> >>>> EHCI/OHCI:
> >>>> [    3.881739] usb 1-1: new high-speed USB device number 2 using
> >>>> ehci-platform
> >>>> [    3.895011] Reserved instruction in kernel code[#1]:
> >>>> [    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
> >>>> [    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
> >>>> [    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
> >>>> [    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
> >>>> [    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
> >>>> [    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
> >>>> [    3.932848] $20   : 00000000 00000000 55590000 77d70000
> >>>> [    3.938251] $24   : 00000018 00000010
> >>>> [    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
> >>>> [    3.949058] Hi    : 00000000
> >>>> [    3.952013] Lo    : 00000000
> >>>> [    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
> >>>> [    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
> >>>> [    3.965913] Status: 10008703       KERNEL EXL IE
> >>>> [    3.970216] Cause : 00800028 (ExcCode 0a)
> >>>> [    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
> >>>> [    3.979170] Modules linked in: ohci_platform ohci_hcd
> >>>> fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug
> >>>> usbcore nls_base usb_common
> >>>> [    3.992907] Process init (pid: 1, threadinfo=(ptrval),
> >>>> task=(ptrval), tls=77e22ec8)
> >>>> [    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068
> >>>> 81431edc 7ff559b8 81428470
> >>>> [    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c
> >>>> 80015c70 806f0000 8063ae74
> >>>> [    4.018149]         08100002 81431f28 0000000a 08100002 81431f28
> >>>> 0000000a 77d6b418 00000003
> >>>> [    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc
> >>>> 00000001 00000000 04000000
> >>>> [    4.035512]         77d54874 00000000 00000000 00000000 00000000
> >>>> 00000012 00000002 00000000
> >>>> [    4.044196]         ...
> >>>> [    4.046706] Call Trace:
> >>>> [    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
> >>>> [    4.054356] [<80015c70>] setup_frame+0xdc/0x124
> >>>> [    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
> >>>> [    4.064207] [<80011b50>] work_notifysig+0x10/0x18
> >>>> [    4.069036]
> >>>> [    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000>
> >>>> ac830004  3c048063  0c0228aa  24846a00  26240010
> >>>> [    4.080686]
> >>>> [    4.082517] ---[ end trace 22a8edb41f5f983b ]---
> >>>> [    4.087374] Kernel panic - not syncing: Fatal exception
> >>>> [    4.092753] Rebooting in 1 seconds..
> >>>
> >>> Did you pinpoint which specific instruction within
> >>> arch_sync_dma_for_cpu_all() is causing the reserved instruction
> >>> exception?
> >>
> >> It's setup_sigcontext(), not arch_sync_dma_for_cpu_all() that's
> >> causing the exception ;-)
> >>
> >> Hand decoding the Code gives me
> >>
> >> lw $1, 0xb4($fp)
> >> or $v0, 0, 0
> >> addiu $a0, $s1, 8
> >> sw $v0, 0($a0) <- the code in brackets, so I guess EPC?
> >> sw $v1, 4($a0)
> >>
> >> which I assume is this part:
> >>
> >> err |= __put_user(regs->cp0_epc, &sc->sc_pc);
> >>
> >> (0xb4 is the offset of cp0_epc, 0x8 the offset of sc_pc)
> >>
> >> One thing I see is that we do the RAC flush for BMIPS3300, 4350 and
> >> 4380, but only initialize it for 3300 [1], but leave it at whatever
> >> state the bootloader did for the other ones. Maybe it has some invalid
> >> config in (that particuar?) 6358 that triggers issues later on after a
> >> flush? E.g. the flush puts it in an error state, and the next time
> >> something triggers a prefetch(write?) (by trying to access userspace)
> >> it generates an error exception.
> >>
> > Depending on the bootloader but likely bootloader does not use RAC at
> > all.  So agree that RAC may not be properly initialized when the flush
> > function is called and push the stale data to corrupt memory and cause
> > problem later on the userspace.
>
> Alvaro, could you do the following and let us know the results, at boot
> time in bcm6358_quirks():
>
> - issue a read_c0_brcm_config_0() and look whether bit 29 (RAC present)
> is set, if it is not set, then we should forcibly disable the use of the
> RAC using a flag
>
> - issue a __raw_readl(cbr + BMIPS_RAC_CONFIG) and check whether bits 0
> (RAC_I) or 1 (RAC_D) are set, if not, try to set them and see whether it
> works
>
> Thanks!
> --
> Florian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-12 18:50         ` Álvaro Fernández Rojas
@ 2023-03-13 17:37           ` Florian Fainelli
  2023-03-13 21:39             ` Álvaro Fernández Rojas
  0 siblings, 1 reply; 17+ messages in thread
From: Florian Fainelli @ 2023-03-13 17:37 UTC (permalink / raw)
  To: Álvaro Fernández Rojas, Florian Fainelli
  Cc: William Zhang, Jonas Gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel

On 3/12/23 11:50, Álvaro Fernández Rojas wrote:
> Hi Florian,
> 
> I tried what you suggested but it stil panics on EHCI:
> 
> [    0.000000] Linux version 5.15.98 (noltari@atlantis)
> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
> 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
> [    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
> [    0.000000] bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x3c1b8041
> [    0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
> 
> It looks like bit 29 is set so RAC should be present.
> And RAC_I seems to be set, but not RAC_D...
> 
> BTW, this is what I added to bmips_cpu_setup:
> 
> case CPU_BMIPS4350:
> cfg = read_c0_brcm_config_0();
> pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);
> 
> cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
> pr_info("bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x%x\n", cfg);
> __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
> __raw_readl(cbr + BMIPS_RAC_CONFIG);
> break;

Thanks for running those experiments, I cannot explain what you are 
seeing, so there must be some sort of erratum applicable to the 
BMIPS4380 revision used on the 6358 somehow...

If you can make the suggested change to use negative logic in order to 
disable the RAC flushing, that would work for me, also maybe add a 
Fixes: tag so it gets backported to stable trees?

Thanks!
-- 
Florian


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-13 17:37           ` Florian Fainelli
@ 2023-03-13 21:39             ` Álvaro Fernández Rojas
  2023-03-13 21:46               ` Florian Fainelli
  0 siblings, 1 reply; 17+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-13 21:39 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: William Zhang, Jonas Gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel

Hi Florian,

I did another test changing from TP1 to TP0 and this is the result:
[ 0.000000] Linux version 5.15.98 (noltari@atlantis)
(mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
[ 0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
[ 0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x2a00015
[ 0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)

And there were no exceptions with EHCI/OHCI as opposed to TP1.
So the issue is only happening when booting from TP1.
Maybe it's due to the fact that BCM6358 has a shared TLB?

Maybe the correct way of solving the issue would be by adding the
following code at bcm6358_quirks():
if (read_c0_brcm_cmt_local() & (1 << 31))
    bmips_dma_sync_disabled = 1;

BTW, if I understood it correctly, you want me to reverse the logic,
so bmips_dma_sync_disabled instead of bmips_dma_sync_enabled.
Is this correct?

Best regards,
Álvaro.


El lun, 13 mar 2023 a las 18:37, Florian Fainelli
(<f.fainelli@gmail.com>) escribió:
>
> On 3/12/23 11:50, Álvaro Fernández Rojas wrote:
> > Hi Florian,
> >
> > I tried what you suggested but it stil panics on EHCI:
> >
> > [    0.000000] Linux version 5.15.98 (noltari@atlantis)
> > (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
> > 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
> > [    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
> > [    0.000000] bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x3c1b8041
> > [    0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
> >
> > It looks like bit 29 is set so RAC should be present.
> > And RAC_I seems to be set, but not RAC_D...
> >
> > BTW, this is what I added to bmips_cpu_setup:
> >
> > case CPU_BMIPS4350:
> > cfg = read_c0_brcm_config_0();
> > pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);
> >
> > cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
> > pr_info("bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x%x\n", cfg);
> > __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
> > __raw_readl(cbr + BMIPS_RAC_CONFIG);
> > break;
>
> Thanks for running those experiments, I cannot explain what you are
> seeing, so there must be some sort of erratum applicable to the
> BMIPS4380 revision used on the 6358 somehow...
>
> If you can make the suggested change to use negative logic in order to
> disable the RAC flushing, that would work for me, also maybe add a
> Fixes: tag so it gets backported to stable trees?
>
> Thanks!
> --
> Florian
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-13 21:39             ` Álvaro Fernández Rojas
@ 2023-03-13 21:46               ` Florian Fainelli
  2023-03-14 18:14                 ` Álvaro Fernández Rojas
  0 siblings, 1 reply; 17+ messages in thread
From: Florian Fainelli @ 2023-03-13 21:46 UTC (permalink / raw)
  To: Álvaro Fernández Rojas, Florian Fainelli
  Cc: William Zhang, Jonas Gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel

(please don't top post)

On 3/13/23 14:39, Álvaro Fernández Rojas wrote:
> Hi Florian,
> 
> I did another test changing from TP1 to TP0 and this is the result:
> [ 0.000000] Linux version 5.15.98 (noltari@atlantis)
> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
> 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
> [ 0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
> [ 0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x2a00015
> [ 0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
> 
> And there were no exceptions with EHCI/OHCI as opposed to TP1.
> So the issue is only happening when booting from TP1.

Ah, that explains it then, I was just about to ask you which TP was the 
kernel booted on.

> Maybe it's due to the fact that BCM6358 has a shared TLB?

I think it has to do with the fact that the BMIPS_RAC_CONFIG_1 is likely 
not enabling the RAC since that register pertains to TP1, could you dump 
its contents, and if they do not set bit 0 and/or 1, please set them and 
try again and see whether it works any better? The RAC provides 
substantial performance improvements, it would be a change to keep it 
disabled.

> 
> Maybe the correct way of solving the issue would be by adding the
> following code at bcm6358_quirks():
> if (read_c0_brcm_cmt_local() & (1 << 31))
>      bmips_dma_sync_disabled = 1;
> 
> BTW, if I understood it correctly, you want me to reverse the logic,
> so bmips_dma_sync_disabled instead of bmips_dma_sync_enabled.
> Is this correct?

Yes, I want the logic such that we need to set a variable to 1/true 
rather setting one to 0, less change to get it wrong IMHO.

> 
> Best regards,
> Álvaro.
> 
> 
> El lun, 13 mar 2023 a las 18:37, Florian Fainelli
> (<f.fainelli@gmail.com>) escribió:
>>
>> On 3/12/23 11:50, Álvaro Fernández Rojas wrote:
>>> Hi Florian,
>>>
>>> I tried what you suggested but it stil panics on EHCI:
>>>
>>> [    0.000000] Linux version 5.15.98 (noltari@atlantis)
>>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
>>> 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
>>> [    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
>>> [    0.000000] bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x3c1b8041
>>> [    0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
>>>
>>> It looks like bit 29 is set so RAC should be present.
>>> And RAC_I seems to be set, but not RAC_D...
>>>
>>> BTW, this is what I added to bmips_cpu_setup:
>>>
>>> case CPU_BMIPS4350:
>>> cfg = read_c0_brcm_config_0();
>>> pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);
>>>
>>> cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
>>> pr_info("bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x%x\n", cfg);
>>> __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
>>> __raw_readl(cbr + BMIPS_RAC_CONFIG);
>>> break;
>>
>> Thanks for running those experiments, I cannot explain what you are
>> seeing, so there must be some sort of erratum applicable to the
>> BMIPS4380 revision used on the 6358 somehow...
>>
>> If you can make the suggested change to use negative logic in order to
>> disable the RAC flushing, that would work for me, also maybe add a
>> Fixes: tag so it gets backported to stable trees?
>>
>> Thanks!
>> --
>> Florian
>>

-- 
Florian


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-13 21:46               ` Florian Fainelli
@ 2023-03-14 18:14                 ` Álvaro Fernández Rojas
  2023-03-16 17:40                   ` Florian Fainelli
  0 siblings, 1 reply; 17+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-14 18:14 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: William Zhang, Jonas Gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel

Hi Florian,

El lun, 13 mar 2023 a las 22:46, Florian Fainelli
(<f.fainelli@gmail.com>) escribió:
>
> (please don't top post)
>
> On 3/13/23 14:39, Álvaro Fernández Rojas wrote:
> > Hi Florian,
> >
> > I did another test changing from TP1 to TP0 and this is the result:
> > [ 0.000000] Linux version 5.15.98 (noltari@atlantis)
> > (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
> > 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
> > [ 0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
> > [ 0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x2a00015
> > [ 0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
> >
> > And there were no exceptions with EHCI/OHCI as opposed to TP1.
> > So the issue is only happening when booting from TP1.
>
> Ah, that explains it then, I was just about to ask you which TP was the
> kernel booted on.
>
> > Maybe it's due to the fact that BCM6358 has a shared TLB?
>
> I think it has to do with the fact that the BMIPS_RAC_CONFIG_1 is likely
> not enabling the RAC since that register pertains to TP1, could you dump
> its contents, and if they do not set bit 0 and/or 1, please set them and
> try again and see whether it works any better? The RAC provides
> substantial performance improvements, it would be a change to keep it
> disabled.

This is the code that I added to bmips_cpu_setup():
    case CPU_BMIPS4350:
        cfg = read_c0_brcm_cmt_local();
        pr_info("bmips_cpu_setup: read_c0_brcm_cmt_local() = 0x%x\n", cfg);

        cfg = read_c0_brcm_config_0();
        pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);

        cfg = __raw_readl(cbr + BMIPS_RAC_ADDRESS_RANGE);
        pr_info("bmips_cpu_setup: BMIPS_RAC_ADDRESS_RANGE = 0x%x\n", cfg);

        cfg = __raw_readl(cbr + BMIPS_L2_CONFIG);
        pr_info("bmips_cpu_setup: BMIPS_L2_CONFIG = 0x%x\n", cfg);

        cfg = __raw_readl(cbr + BMIPS_LMB_CONTROL);
        pr_info("bmips_cpu_setup: BMIPS_LMB_CONTROL = 0x%x\n", cfg);

        cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
        pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x%x\n", cfg);
        __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
        pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x%x\n", cfg);
        __raw_readl(cbr + BMIPS_RAC_CONFIG);

        cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG_1);
        pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x%x\n", cfg);
        __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG_1);
        pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x%x\n", cfg);
        __raw_readl(cbr + BMIPS_RAC_CONFIG_1);
        break;

And this is the result:
[    0.000000] bmips_cpu_setup: read_c0_brcm_cmt_local() = 0x80000000
[    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
[    0.000000] bmips_cpu_setup: BMIPS_RAC_ADDRESS_RANGE = 0x277bdab0
[    0.000000] bmips_cpu_setup: BMIPS_L2_CONFIG = 0x241a0008
[    0.000000] bmips_cpu_setup: BMIPS_LMB_CONTROL = 0x0
[    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x3c1b8041
[    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x3c1b8041
[    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x3600008
[    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x3600008

As you can see the bit's aren't set and all the registers appear to
have strange values and not the usual ones when initialized by the
bootloader...

>
> >
> > Maybe the correct way of solving the issue would be by adding the
> > following code at bcm6358_quirks():
> > if (read_c0_brcm_cmt_local() & (1 << 31))
> >      bmips_dma_sync_disabled = 1;
> >
> > BTW, if I understood it correctly, you want me to reverse the logic,
> > so bmips_dma_sync_disabled instead of bmips_dma_sync_enabled.
> > Is this correct?
>
> Yes, I want the logic such that we need to set a variable to 1/true
> rather setting one to 0, less change to get it wrong IMHO.
>
> >
> > Best regards,
> > Álvaro.
> >
> >
> > El lun, 13 mar 2023 a las 18:37, Florian Fainelli
> > (<f.fainelli@gmail.com>) escribió:
> >>
> >> On 3/12/23 11:50, Álvaro Fernández Rojas wrote:
> >>> Hi Florian,
> >>>
> >>> I tried what you suggested but it stil panics on EHCI:
> >>>
> >>> [    0.000000] Linux version 5.15.98 (noltari@atlantis)
> >>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
> >>> 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
> >>> [    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
> >>> [    0.000000] bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x3c1b8041
> >>> [    0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
> >>>
> >>> It looks like bit 29 is set so RAC should be present.
> >>> And RAC_I seems to be set, but not RAC_D...
> >>>
> >>> BTW, this is what I added to bmips_cpu_setup:
> >>>
> >>> case CPU_BMIPS4350:
> >>> cfg = read_c0_brcm_config_0();
> >>> pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);
> >>>
> >>> cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
> >>> pr_info("bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x%x\n", cfg);
> >>> __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
> >>> __raw_readl(cbr + BMIPS_RAC_CONFIG);
> >>> break;
> >>
> >> Thanks for running those experiments, I cannot explain what you are
> >> seeing, so there must be some sort of erratum applicable to the
> >> BMIPS4380 revision used on the 6358 somehow...
> >>
> >> If you can make the suggested change to use negative logic in order to
> >> disable the RAC flushing, that would work for me, also maybe add a
> >> Fixes: tag so it gets backported to stable trees?
> >>
> >> Thanks!
> >> --
> >> Florian
> >>
>
> --
> Florian
>

Álvaro

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-14 18:14                 ` Álvaro Fernández Rojas
@ 2023-03-16 17:40                   ` Florian Fainelli
  2023-03-16 18:03                     ` Álvaro Fernández Rojas
  0 siblings, 1 reply; 17+ messages in thread
From: Florian Fainelli @ 2023-03-16 17:40 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: William Zhang, Jonas Gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel

On 3/14/23 11:14, Álvaro Fernández Rojas wrote:
> Hi Florian,
> 
> El lun, 13 mar 2023 a las 22:46, Florian Fainelli
> (<f.fainelli@gmail.com>) escribió:
>>
>> (please don't top post)
>>
>> On 3/13/23 14:39, Álvaro Fernández Rojas wrote:
>>> Hi Florian,
>>>
>>> I did another test changing from TP1 to TP0 and this is the result:
>>> [ 0.000000] Linux version 5.15.98 (noltari@atlantis)
>>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
>>> 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
>>> [ 0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
>>> [ 0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x2a00015
>>> [ 0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
>>>
>>> And there were no exceptions with EHCI/OHCI as opposed to TP1.
>>> So the issue is only happening when booting from TP1.
>>
>> Ah, that explains it then, I was just about to ask you which TP was the
>> kernel booted on.
>>
>>> Maybe it's due to the fact that BCM6358 has a shared TLB?
>>
>> I think it has to do with the fact that the BMIPS_RAC_CONFIG_1 is likely
>> not enabling the RAC since that register pertains to TP1, could you dump
>> its contents, and if they do not set bit 0 and/or 1, please set them and
>> try again and see whether it works any better? The RAC provides
>> substantial performance improvements, it would be a change to keep it
>> disabled.
> 
> This is the code that I added to bmips_cpu_setup():
>      case CPU_BMIPS4350:
>          cfg = read_c0_brcm_cmt_local();
>          pr_info("bmips_cpu_setup: read_c0_brcm_cmt_local() = 0x%x\n", cfg);
> 
>          cfg = read_c0_brcm_config_0();
>          pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);
> 
>          cfg = __raw_readl(cbr + BMIPS_RAC_ADDRESS_RANGE);
>          pr_info("bmips_cpu_setup: BMIPS_RAC_ADDRESS_RANGE = 0x%x\n", cfg);
> 
>          cfg = __raw_readl(cbr + BMIPS_L2_CONFIG);
>          pr_info("bmips_cpu_setup: BMIPS_L2_CONFIG = 0x%x\n", cfg);
> 
>          cfg = __raw_readl(cbr + BMIPS_LMB_CONTROL);
>          pr_info("bmips_cpu_setup: BMIPS_LMB_CONTROL = 0x%x\n", cfg);
> 
>          cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
>          pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x%x\n", cfg);
>          __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
>          pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x%x\n", cfg);
>          __raw_readl(cbr + BMIPS_RAC_CONFIG);
> 
>          cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG_1);
>          pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x%x\n", cfg);
>          __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG_1);
>          pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x%x\n", cfg);
>          __raw_readl(cbr + BMIPS_RAC_CONFIG_1);
>          break;
> 
> And this is the result:
> [    0.000000] bmips_cpu_setup: read_c0_brcm_cmt_local() = 0x80000000

OK, so we are executing from TP1, but we knew that already.

> [    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006

bit 31: instruction cache enabled
bit 30: data cache enabled
bit 29: RAC present
bit 25: DSU power enabled
bit 24: data cache power enabled
bit 19: low-latency memory bus (LMB) present
bit 18: concurrent multi threading (CMT) present
bit 17: reserved
bit 12: split instruction cache
bit 2: number of Hi/Lo special registers - 1
bit 1: eDSP present

This seems to match the recommended and default values

> [    0.000000] bmips_cpu_setup: BMIPS_RAC_ADDRESS_RANGE = 0x277bdab0

That does not look intended, the reset value is supposed to be 0. Can 
you apply the same values as the ones programmed in the CPU_BMIPS_3300 case?

> [    0.000000] bmips_cpu_setup: BMIPS_L2_CONFIG = 0x241a0008


> [    0.000000] bmips_cpu_setup: BMIPS_LMB_CONTROL = 0x0

LMB not enabled, that's OK.

> [    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x3c1b8041
> [    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x3c1b8041
> [    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x3600008
> [    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x3600008
> 
> As you can see the bit's aren't set and all the registers appear to
> have strange values and not the usual ones when initialized by the
> bootloader...

Yes, so maybe the best way to go about is indeed to go with your change 
such that if we are running on TP1, it is safe to assume that CFE has 
not done any sensible initialization, and we have a non functional RAC.

> 
>>
>>>
>>> Maybe the correct way of solving the issue would be by adding the
>>> following code at bcm6358_quirks():
>>> if (read_c0_brcm_cmt_local() & (1 << 31))
>>>       bmips_dma_sync_disabled = 1;
>>>
>>> BTW, if I understood it correctly, you want me to reverse the logic,
>>> so bmips_dma_sync_disabled instead of bmips_dma_sync_enabled.
>>> Is this correct?
>>
>> Yes, I want the logic such that we need to set a variable to 1/true
>> rather setting one to 0, less change to get it wrong IMHO.
>>
>>>
>>> Best regards,
>>> Álvaro.
>>>
>>>
>>> El lun, 13 mar 2023 a las 18:37, Florian Fainelli
>>> (<f.fainelli@gmail.com>) escribió:
>>>>
>>>> On 3/12/23 11:50, Álvaro Fernández Rojas wrote:
>>>>> Hi Florian,
>>>>>
>>>>> I tried what you suggested but it stil panics on EHCI:
>>>>>
>>>>> [    0.000000] Linux version 5.15.98 (noltari@atlantis)
>>>>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
>>>>> 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
>>>>> [    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
>>>>> [    0.000000] bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x3c1b8041
>>>>> [    0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
>>>>>
>>>>> It looks like bit 29 is set so RAC should be present.
>>>>> And RAC_I seems to be set, but not RAC_D...
>>>>>
>>>>> BTW, this is what I added to bmips_cpu_setup:
>>>>>
>>>>> case CPU_BMIPS4350:
>>>>> cfg = read_c0_brcm_config_0();
>>>>> pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);
>>>>>
>>>>> cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
>>>>> pr_info("bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x%x\n", cfg);
>>>>> __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
>>>>> __raw_readl(cbr + BMIPS_RAC_CONFIG);
>>>>> break;
>>>>
>>>> Thanks for running those experiments, I cannot explain what you are
>>>> seeing, so there must be some sort of erratum applicable to the
>>>> BMIPS4380 revision used on the 6358 somehow...
>>>>
>>>> If you can make the suggested change to use negative logic in order to
>>>> disable the RAC flushing, that would work for me, also maybe add a
>>>> Fixes: tag so it gets backported to stable trees?
>>>>
>>>> Thanks!
>>>> --
>>>> Florian
>>>>
>>
>> --
>> Florian
>>
> 
> Álvaro

-- 
Florian


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all()
  2023-03-16 17:40                   ` Florian Fainelli
@ 2023-03-16 18:03                     ` Álvaro Fernández Rojas
  0 siblings, 0 replies; 17+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-16 18:03 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: William Zhang, Jonas Gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel

El jue, 16 mar 2023 a las 18:40, Florian Fainelli
(<f.fainelli@gmail.com>) escribió:
>
> On 3/14/23 11:14, Álvaro Fernández Rojas wrote:
> > Hi Florian,
> >
> > El lun, 13 mar 2023 a las 22:46, Florian Fainelli
> > (<f.fainelli@gmail.com>) escribió:
> >>
> >> (please don't top post)
> >>
> >> On 3/13/23 14:39, Álvaro Fernández Rojas wrote:
> >>> Hi Florian,
> >>>
> >>> I did another test changing from TP1 to TP0 and this is the result:
> >>> [ 0.000000] Linux version 5.15.98 (noltari@atlantis)
> >>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
> >>> 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
> >>> [ 0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
> >>> [ 0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x2a00015
> >>> [ 0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
> >>>
> >>> And there were no exceptions with EHCI/OHCI as opposed to TP1.
> >>> So the issue is only happening when booting from TP1.
> >>
> >> Ah, that explains it then, I was just about to ask you which TP was the
> >> kernel booted on.
> >>
> >>> Maybe it's due to the fact that BCM6358 has a shared TLB?
> >>
> >> I think it has to do with the fact that the BMIPS_RAC_CONFIG_1 is likely
> >> not enabling the RAC since that register pertains to TP1, could you dump
> >> its contents, and if they do not set bit 0 and/or 1, please set them and
> >> try again and see whether it works any better? The RAC provides
> >> substantial performance improvements, it would be a change to keep it
> >> disabled.
> >
> > This is the code that I added to bmips_cpu_setup():
> >      case CPU_BMIPS4350:
> >          cfg = read_c0_brcm_cmt_local();
> >          pr_info("bmips_cpu_setup: read_c0_brcm_cmt_local() = 0x%x\n", cfg);
> >
> >          cfg = read_c0_brcm_config_0();
> >          pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);
> >
> >          cfg = __raw_readl(cbr + BMIPS_RAC_ADDRESS_RANGE);
> >          pr_info("bmips_cpu_setup: BMIPS_RAC_ADDRESS_RANGE = 0x%x\n", cfg);
> >
> >          cfg = __raw_readl(cbr + BMIPS_L2_CONFIG);
> >          pr_info("bmips_cpu_setup: BMIPS_L2_CONFIG = 0x%x\n", cfg);
> >
> >          cfg = __raw_readl(cbr + BMIPS_LMB_CONTROL);
> >          pr_info("bmips_cpu_setup: BMIPS_LMB_CONTROL = 0x%x\n", cfg);
> >
> >          cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
> >          pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x%x\n", cfg);
> >          __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
> >          pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x%x\n", cfg);
> >          __raw_readl(cbr + BMIPS_RAC_CONFIG);
> >
> >          cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG_1);
> >          pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x%x\n", cfg);
> >          __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG_1);
> >          pr_info("bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x%x\n", cfg);
> >          __raw_readl(cbr + BMIPS_RAC_CONFIG_1);
> >          break;
> >
> > And this is the result:
> > [    0.000000] bmips_cpu_setup: read_c0_brcm_cmt_local() = 0x80000000
>
> OK, so we are executing from TP1, but we knew that already.
>
> > [    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
>
> bit 31: instruction cache enabled
> bit 30: data cache enabled
> bit 29: RAC present
> bit 25: DSU power enabled
> bit 24: data cache power enabled
> bit 19: low-latency memory bus (LMB) present
> bit 18: concurrent multi threading (CMT) present
> bit 17: reserved
> bit 12: split instruction cache
> bit 2: number of Hi/Lo special registers - 1
> bit 1: eDSP present

I'm curious, is there any bit that indicates a split data cache?

>
> This seems to match the recommended and default values
>
> > [    0.000000] bmips_cpu_setup: BMIPS_RAC_ADDRESS_RANGE = 0x277bdab0
>
> That does not look intended, the reset value is supposed to be 0. Can
> you apply the same values as the ones programmed in the CPU_BMIPS_3300 case?

I've just tried this but it still panics...

>
> > [    0.000000] bmips_cpu_setup: BMIPS_L2_CONFIG = 0x241a0008
>
>
> > [    0.000000] bmips_cpu_setup: BMIPS_LMB_CONTROL = 0x0
>
> LMB not enabled, that's OK.
>
> > [    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x3c1b8041
> > [    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG = 0x3c1b8041
> > [    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x3600008
> > [    0.000000] bmips_cpu_setup: BMIPS_RAC_CONFIG_1 = 0x3600008
> >
> > As you can see the bit's aren't set and all the registers appear to
> > have strange values and not the usual ones when initialized by the
> > bootloader...
>
> Yes, so maybe the best way to go about is indeed to go with your change
> such that if we are running on TP1, it is safe to assume that CFE has
> not done any sensible initialization, and we have a non functional RAC.

Yes, I will send v2 checking if we're running on TP1.

>
> >
> >>
> >>>
> >>> Maybe the correct way of solving the issue would be by adding the
> >>> following code at bcm6358_quirks():
> >>> if (read_c0_brcm_cmt_local() & (1 << 31))
> >>>       bmips_dma_sync_disabled = 1;
> >>>
> >>> BTW, if I understood it correctly, you want me to reverse the logic,
> >>> so bmips_dma_sync_disabled instead of bmips_dma_sync_enabled.
> >>> Is this correct?
> >>
> >> Yes, I want the logic such that we need to set a variable to 1/true
> >> rather setting one to 0, less change to get it wrong IMHO.
> >>
> >>>
> >>> Best regards,
> >>> Álvaro.
> >>>
> >>>
> >>> El lun, 13 mar 2023 a las 18:37, Florian Fainelli
> >>> (<f.fainelli@gmail.com>) escribió:
> >>>>
> >>>> On 3/12/23 11:50, Álvaro Fernández Rojas wrote:
> >>>>> Hi Florian,
> >>>>>
> >>>>> I tried what you suggested but it stil panics on EHCI:
> >>>>>
> >>>>> [    0.000000] Linux version 5.15.98 (noltari@atlantis)
> >>>>> (mips-openwrt-linux-musl-gcc (OpenWrt GCC 12.2.0 r22187+1-19817fa3f5)
> >>>>> 12.2.0, GNU ld (GNU Binutils) 2.40.0) #0 SMP Sun Mar 12 18:23:28 2023
> >>>>> [    0.000000] bmips_cpu_setup: read_c0_brcm_config_0() = 0xe30e1006
> >>>>> [    0.000000] bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x3c1b8041
> >>>>> [    0.000000] CPU0 revision is: 0002a010 (Broadcom BMIPS4350)
> >>>>>
> >>>>> It looks like bit 29 is set so RAC should be present.
> >>>>> And RAC_I seems to be set, but not RAC_D...
> >>>>>
> >>>>> BTW, this is what I added to bmips_cpu_setup:
> >>>>>
> >>>>> case CPU_BMIPS4350:
> >>>>> cfg = read_c0_brcm_config_0();
> >>>>> pr_info("bmips_cpu_setup: read_c0_brcm_config_0() = 0x%x\n", cfg);
> >>>>>
> >>>>> cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
> >>>>> pr_info("bmips_cpu_setup: cbr + BMIPS_RAC_CONFIG = 0x%x\n", cfg);
> >>>>> __raw_writel(cfg | BIT(0) | BIT(1), cbr + BMIPS_RAC_CONFIG);
> >>>>> __raw_readl(cbr + BMIPS_RAC_CONFIG);
> >>>>> break;
> >>>>
> >>>> Thanks for running those experiments, I cannot explain what you are
> >>>> seeing, so there must be some sort of erratum applicable to the
> >>>> BMIPS4380 revision used on the 6358 somehow...
> >>>>
> >>>> If you can make the suggested change to use negative logic in order to
> >>>> disable the RAC flushing, that would work for me, also maybe add a
> >>>> Fixes: tag so it gets backported to stable trees?
> >>>>
> >>>> Thanks!
> >>>> --
> >>>> Florian
> >>>>
> >>
> >> --
> >> Florian
> >>
> >
> > Álvaro
>
> --
> Florian
>

Álvaro

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() for TP1
  2023-03-10 12:13 [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() Álvaro Fernández Rojas
  2023-03-11 17:32 ` Florian Fainelli
@ 2023-03-16 18:05 ` Álvaro Fernández Rojas
  2023-03-17  0:00   ` Florian Fainelli
  2023-03-17 10:20   ` [PATCH v3] mips: bmips: BCM6358: disable RAC flush " Álvaro Fernández Rojas
  1 sibling, 2 replies; 17+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-16 18:05 UTC (permalink / raw)
  To: f.fainelli, jonas.gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel
  Cc: Álvaro Fernández Rojas

arch_sync_dma_for_cpu_all() causes kernel panics on BCM6358 with EHCI/OHCI:
[    3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform
[    3.895011] Reserved instruction in kernel code[#1]:
[    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
[    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
[    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
[    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
[    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
[    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
[    3.932848] $20   : 00000000 00000000 55590000 77d70000
[    3.938251] $24   : 00000018 00000010
[    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
[    3.949058] Hi    : 00000000
[    3.952013] Lo    : 00000000
[    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
[    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
[    3.965913] Status: 10008703	KERNEL EXL IE
[    3.970216] Cause : 00800028 (ExcCode 0a)
[    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
[    3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[    3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8)
[    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470
[    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74
[    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003
[    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000
[    4.035512]         77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000
[    4.044196]         ...
[    4.046706] Call Trace:
[    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
[    4.054356] [<80015c70>] setup_frame+0xdc/0x124
[    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
[    4.064207] [<80011b50>] work_notifysig+0x10/0x18
[    4.069036]
[    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> ac830004  3c048063  0c0228aa  24846a00  26240010
[    4.080686]
[    4.082517] ---[ end trace 22a8edb41f5f983b ]---
[    4.087374] Kernel panic - not syncing: Fatal exception
[    4.092753] Rebooting in 1 seconds..

This only happens when booting from TP1 instead of TP0.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
---
 v2: check if we're running on TP1 and invert logic.

 arch/mips/bmips/dma.c   | 5 +++++
 arch/mips/bmips/setup.c | 9 +++++++++
 2 files changed, 14 insertions(+)

diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
index 33788668cbdb..2e470ef8d30f 100644
--- a/arch/mips/bmips/dma.c
+++ b/arch/mips/bmips/dma.c
@@ -5,11 +5,16 @@
 #include <asm/bmips.h>
 #include <asm/io.h>
 
+int bmips_dma_sync_disable = 0;
+
 void arch_sync_dma_for_cpu_all(void)
 {
 	void __iomem *cbr = BMIPS_GET_CBR();
 	u32 cfg;
 
+	if (bmips_dma_sync_disable)
+		return;
+
 	if (boot_cpu_type() != CPU_BMIPS3300 &&
 	    boot_cpu_type() != CPU_BMIPS4350 &&
 	    boot_cpu_type() != CPU_BMIPS4380)
diff --git a/arch/mips/bmips/setup.c b/arch/mips/bmips/setup.c
index e95b3f78e7cd..15543e8d5b0c 100644
--- a/arch/mips/bmips/setup.c
+++ b/arch/mips/bmips/setup.c
@@ -35,6 +35,8 @@
 #define REG_BCM6328_OTP		((void __iomem *)CKSEG1ADDR(0x1000062c))
 #define BCM6328_TP1_DISABLED	BIT(9)
 
+extern int bmips_dma_sync_disable;
+
 static const unsigned long kbase = VMLINUX_LOAD_ADDRESS & 0xfff00000;
 
 struct bmips_quirk {
@@ -104,6 +106,13 @@ static void bcm6358_quirks(void)
 	 * disable SMP for now
 	 */
 	bmips_smp_enabled = 0;
+
+	/*
+	 * ARCH_HAS_SYNC_DMA_FOR_CPU_ALL causes kernel panics on BCM6358 when
+	 * booting from TP1
+	 */
+	if (read_c0_brcm_cmt_local() & (1 << 31))
+		bmips_dma_sync_disable = 1;
 }
 
 static void bcm6368_quirks(void)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() for TP1
  2023-03-16 18:05 ` [PATCH v2] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() for TP1 Álvaro Fernández Rojas
@ 2023-03-17  0:00   ` Florian Fainelli
  2023-03-17 10:20   ` [PATCH v3] mips: bmips: BCM6358: disable RAC flush " Álvaro Fernández Rojas
  1 sibling, 0 replies; 17+ messages in thread
From: Florian Fainelli @ 2023-03-17  0:00 UTC (permalink / raw)
  To: Álvaro Fernández Rojas, jonas.gorski,
	bcm-kernel-feedback-list, tsbogend, linux-mips, linux-kernel

On 3/16/23 11:05, Álvaro Fernández Rojas wrote:
> arch_sync_dma_for_cpu_all() causes kernel panics on BCM6358 with EHCI/OHCI:
> [    3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform
> [    3.895011] Reserved instruction in kernel code[#1]:
> [    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
> [    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
> [    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
> [    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
> [    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
> [    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
> [    3.932848] $20   : 00000000 00000000 55590000 77d70000
> [    3.938251] $24   : 00000018 00000010
> [    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
> [    3.949058] Hi    : 00000000
> [    3.952013] Lo    : 00000000
> [    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
> [    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
> [    3.965913] Status: 10008703	KERNEL EXL IE
> [    3.970216] Cause : 00800028 (ExcCode 0a)
> [    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
> [    3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
> [    3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8)
> [    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470
> [    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74
> [    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003
> [    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000
> [    4.035512]         77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000
> [    4.044196]         ...
> [    4.046706] Call Trace:
> [    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
> [    4.054356] [<80015c70>] setup_frame+0xdc/0x124
> [    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
> [    4.064207] [<80011b50>] work_notifysig+0x10/0x18
> [    4.069036]
> [    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> ac830004  3c048063  0c0228aa  24846a00  26240010
> [    4.080686]
> [    4.082517] ---[ end trace 22a8edb41f5f983b ]---
> [    4.087374] Kernel panic - not syncing: Fatal exception
> [    4.092753] Rebooting in 1 seconds..
> 
> This only happens when booting from TP1 instead of TP0.

Because the bootloader (CFE) is not initializing the Read-ahead cache 
properly on the second thread (TP1). Since the RAC was not initialized 
properly, we should avoid flushing it at the risk of corrupting the 
instruction stream as seen in the trace above.

> 
> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> ---
>   v2: check if we're running on TP1 and invert logic.
> 
>   arch/mips/bmips/dma.c   | 5 +++++
>   arch/mips/bmips/setup.c | 9 +++++++++
>   2 files changed, 14 insertions(+)
> 
> diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
> index 33788668cbdb..2e470ef8d30f 100644
> --- a/arch/mips/bmips/dma.c
> +++ b/arch/mips/bmips/dma.c
> @@ -5,11 +5,16 @@
>   #include <asm/bmips.h>
>   #include <asm/io.h>
>   
> +int bmips_dma_sync_disable = 0;

bool bmips_rac_flush_disable;

No need for the false, it is initialized to 0 by default.

> +
>   void arch_sync_dma_for_cpu_all(void)
>   {
>   	void __iomem *cbr = BMIPS_GET_CBR();
>   	u32 cfg;
>   
> +	if (bmips_dma_sync_disable)
> +		return;

if (unlikely(bmips_rac_flush_disable))

and move it after the check on the boot_cpu_type() so we do not impact 
other BMIPS CPUs.

> +
>   	if (boot_cpu_type() != CPU_BMIPS3300 &&
>   	    boot_cpu_type() != CPU_BMIPS4350 &&
>   	    boot_cpu_type() != CPU_BMIPS4380)
> diff --git a/arch/mips/bmips/setup.c b/arch/mips/bmips/setup.c
> index e95b3f78e7cd..15543e8d5b0c 100644
> --- a/arch/mips/bmips/setup.c
> +++ b/arch/mips/bmips/setup.c
> @@ -35,6 +35,8 @@
>   #define REG_BCM6328_OTP		((void __iomem *)CKSEG1ADDR(0x1000062c))
>   #define BCM6328_TP1_DISABLED	BIT(9)
>   
> +extern int bmips_dma_sync_disable;

extern bool bmips_rac_flush_disable;

> +
>   static const unsigned long kbase = VMLINUX_LOAD_ADDRESS & 0xfff00000;
>   
>   struct bmips_quirk {
> @@ -104,6 +106,13 @@ static void bcm6358_quirks(void)
>   	 * disable SMP for now
>   	 */
>   	bmips_smp_enabled = 0;
> +
> +	/*
> +	 * ARCH_HAS_SYNC_DMA_FOR_CPU_ALL causes kernel panics on BCM6358 when
> +	 * booting from TP1
> +	 */
> +	if (read_c0_brcm_cmt_local() & (1 << 31))
> +		bmips_dma_sync_disable = 1;

	bmips_rac_flush_disable = !!(read_c0_brcm_cmt_local() & (1 << 31));

	is more compact, but what you have works too.

Sorry for being so insisting about the name, but signaling what this 
does is fairly important.
-- 
Florian


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3] mips: bmips: BCM6358: disable RAC flush for TP1
  2023-03-16 18:05 ` [PATCH v2] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() for TP1 Álvaro Fernández Rojas
  2023-03-17  0:00   ` Florian Fainelli
@ 2023-03-17 10:20   ` Álvaro Fernández Rojas
  2023-03-17 16:44     ` Florian Fainelli
  2023-03-18 13:38     ` Thomas Bogendoerfer
  1 sibling, 2 replies; 17+ messages in thread
From: Álvaro Fernández Rojas @ 2023-03-17 10:20 UTC (permalink / raw)
  To: f.fainelli, jonas.gorski, bcm-kernel-feedback-list, tsbogend,
	linux-mips, linux-kernel
  Cc: Álvaro Fernández Rojas

RAC flush causes kernel panics on BCM6358 with EHCI/OHCI when booting from TP1:
[    3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform
[    3.895011] Reserved instruction in kernel code[#1]:
[    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
[    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
[    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
[    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
[    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
[    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
[    3.932848] $20   : 00000000 00000000 55590000 77d70000
[    3.938251] $24   : 00000018 00000010
[    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
[    3.949058] Hi    : 00000000
[    3.952013] Lo    : 00000000
[    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
[    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
[    3.965913] Status: 10008703	KERNEL EXL IE
[    3.970216] Cause : 00800028 (ExcCode 0a)
[    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
[    3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
[    3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8)
[    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470
[    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74
[    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003
[    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000
[    4.035512]         77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000
[    4.044196]         ...
[    4.046706] Call Trace:
[    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
[    4.054356] [<80015c70>] setup_frame+0xdc/0x124
[    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
[    4.064207] [<80011b50>] work_notifysig+0x10/0x18
[    4.069036]
[    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> ac830004  3c048063  0c0228aa  24846a00  26240010
[    4.080686]
[    4.082517] ---[ end trace 22a8edb41f5f983b ]---
[    4.087374] Kernel panic - not syncing: Fatal exception
[    4.092753] Rebooting in 1 seconds..

Because the bootloader (CFE) is not initializing the Read-ahead cache properly
on the second thread (TP1). Since the RAC was not initialized properly, we
should avoid flushing it at the risk of corrupting the instruction stream as
seen in the trace above.

Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
---
 v3: add changes suggested by Florian:
   - Switch to a bool and remove unneeded initialization.
   - Remove if from bcm6358_quirks().
   - Improve commit description and bcm6358_quirks() comment.
 v2: check if we're running on TP1 and invert logic.

 arch/mips/bmips/dma.c   | 5 +++++
 arch/mips/bmips/setup.c | 8 ++++++++
 2 files changed, 13 insertions(+)

diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
index 33788668cbdb..3779e7855bd7 100644
--- a/arch/mips/bmips/dma.c
+++ b/arch/mips/bmips/dma.c
@@ -5,6 +5,8 @@
 #include <asm/bmips.h>
 #include <asm/io.h>
 
+bool bmips_rac_flush_disable;
+
 void arch_sync_dma_for_cpu_all(void)
 {
 	void __iomem *cbr = BMIPS_GET_CBR();
@@ -15,6 +17,9 @@ void arch_sync_dma_for_cpu_all(void)
 	    boot_cpu_type() != CPU_BMIPS4380)
 		return;
 
+	if (unlikely(bmips_rac_flush_disable))
+		return;
+
 	/* Flush stale data out of the readahead cache */
 	cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
 	__raw_writel(cfg | 0x100, cbr + BMIPS_RAC_CONFIG);
diff --git a/arch/mips/bmips/setup.c b/arch/mips/bmips/setup.c
index e95b3f78e7cd..549a6392a3d2 100644
--- a/arch/mips/bmips/setup.c
+++ b/arch/mips/bmips/setup.c
@@ -35,6 +35,8 @@
 #define REG_BCM6328_OTP		((void __iomem *)CKSEG1ADDR(0x1000062c))
 #define BCM6328_TP1_DISABLED	BIT(9)
 
+extern bool bmips_rac_flush_disable;
+
 static const unsigned long kbase = VMLINUX_LOAD_ADDRESS & 0xfff00000;
 
 struct bmips_quirk {
@@ -104,6 +106,12 @@ static void bcm6358_quirks(void)
 	 * disable SMP for now
 	 */
 	bmips_smp_enabled = 0;
+
+	/*
+	 * RAC flush causes kernel panics on BCM6358 when booting from TP1
+	 * because the bootloader is not initializing it properly.
+	 */
+	bmips_rac_flush_disable = !!(read_c0_brcm_cmt_local() & (1 << 31));
 }
 
 static void bcm6368_quirks(void)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3] mips: bmips: BCM6358: disable RAC flush for TP1
  2023-03-17 10:20   ` [PATCH v3] mips: bmips: BCM6358: disable RAC flush " Álvaro Fernández Rojas
@ 2023-03-17 16:44     ` Florian Fainelli
  2023-03-18 13:38     ` Thomas Bogendoerfer
  1 sibling, 0 replies; 17+ messages in thread
From: Florian Fainelli @ 2023-03-17 16:44 UTC (permalink / raw)
  To: Álvaro Fernández Rojas, f.fainelli, jonas.gorski,
	bcm-kernel-feedback-list, tsbogend, linux-mips, linux-kernel

On 3/17/23 03:20, Álvaro Fernández Rojas wrote:
> RAC flush causes kernel panics on BCM6358 with EHCI/OHCI when booting from TP1:
> [    3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform
> [    3.895011] Reserved instruction in kernel code[#1]:
> [    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
> [    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
> [    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
> [    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
> [    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
> [    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
> [    3.932848] $20   : 00000000 00000000 55590000 77d70000
> [    3.938251] $24   : 00000018 00000010
> [    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
> [    3.949058] Hi    : 00000000
> [    3.952013] Lo    : 00000000
> [    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
> [    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
> [    3.965913] Status: 10008703	KERNEL EXL IE
> [    3.970216] Cause : 00800028 (ExcCode 0a)
> [    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
> [    3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
> [    3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8)
> [    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470
> [    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74
> [    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003
> [    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000
> [    4.035512]         77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000
> [    4.044196]         ...
> [    4.046706] Call Trace:
> [    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
> [    4.054356] [<80015c70>] setup_frame+0xdc/0x124
> [    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
> [    4.064207] [<80011b50>] work_notifysig+0x10/0x18
> [    4.069036]
> [    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> ac830004  3c048063  0c0228aa  24846a00  26240010
> [    4.080686]
> [    4.082517] ---[ end trace 22a8edb41f5f983b ]---
> [    4.087374] Kernel panic - not syncing: Fatal exception
> [    4.092753] Rebooting in 1 seconds..
> 
> Because the bootloader (CFE) is not initializing the Read-ahead cache properly
> on the second thread (TP1). Since the RAC was not initialized properly, we
> should avoid flushing it at the risk of corrupting the instruction stream as
> seen in the trace above.
> 
> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>

Third time is a charm, Thomas, I believe this deserves adding a:

Fixes: d59098a0e9cb ("MIPS: bmips: use generic dma noncoherent ops")

such that this backports cleanly. Thanks!
-- 
Florian


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3] mips: bmips: BCM6358: disable RAC flush for TP1
  2023-03-17 10:20   ` [PATCH v3] mips: bmips: BCM6358: disable RAC flush " Álvaro Fernández Rojas
  2023-03-17 16:44     ` Florian Fainelli
@ 2023-03-18 13:38     ` Thomas Bogendoerfer
  1 sibling, 0 replies; 17+ messages in thread
From: Thomas Bogendoerfer @ 2023-03-18 13:38 UTC (permalink / raw)
  To: Álvaro Fernández Rojas
  Cc: f.fainelli, jonas.gorski, bcm-kernel-feedback-list, linux-mips,
	linux-kernel

On Fri, Mar 17, 2023 at 11:20:04AM +0100, Álvaro Fernández Rojas wrote:
> RAC flush causes kernel panics on BCM6358 with EHCI/OHCI when booting from TP1:
> [    3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform
> [    3.895011] Reserved instruction in kernel code[#1]:
> [    3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0
> [    3.905829] $ 0   : 00000000 10008700 00000000 77d94060
> [    3.911238] $ 4   : 7fd1f088 00000000 81431cac 81431ca0
> [    3.916641] $ 8   : 00000000 ffffefff 8075cd34 00000000
> [    3.922043] $12   : 806f8d40 f3e812b7 00000000 000d9aaa
> [    3.927446] $16   : 7fd1f068 7fd1f080 7ff559b8 81428470
> [    3.932848] $20   : 00000000 00000000 55590000 77d70000
> [    3.938251] $24   : 00000018 00000010
> [    3.943655] $28   : 81430000 81431e60 81431f28 800157fc
> [    3.949058] Hi    : 00000000
> [    3.952013] Lo    : 00000000
> [    3.955019] epc   : 80015808 setup_sigcontext+0x54/0x24c
> [    3.960464] ra    : 800157fc setup_sigcontext+0x48/0x24c
> [    3.965913] Status: 10008703	KERNEL EXL IE
> [    3.970216] Cause : 00800028 (ExcCode 0a)
> [    3.974340] PrId  : 0002a010 (Broadcom BMIPS4350)
> [    3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common
> [    3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8)
> [    4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470
> [    4.009467]         81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74
> [    4.018149]         08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003
> [    4.026831]         ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000
> [    4.035512]         77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000
> [    4.044196]         ...
> [    4.046706] Call Trace:
> [    4.049238] [<80015808>] setup_sigcontext+0x54/0x24c
> [    4.054356] [<80015c70>] setup_frame+0xdc/0x124
> [    4.059015] [<80016414>] do_notify_resume+0x1dc/0x288
> [    4.064207] [<80011b50>] work_notifysig+0x10/0x18
> [    4.069036]
> [    4.070538] Code: 8fc300b4  00001025  26240008 <ac820000> ac830004  3c048063  0c0228aa  24846a00  26240010
> [    4.080686]
> [    4.082517] ---[ end trace 22a8edb41f5f983b ]---
> [    4.087374] Kernel panic - not syncing: Fatal exception
> [    4.092753] Rebooting in 1 seconds..
> 
> Because the bootloader (CFE) is not initializing the Read-ahead cache properly
> on the second thread (TP1). Since the RAC was not initialized properly, we
> should avoid flushing it at the risk of corrupting the instruction stream as
> seen in the trace above.
> 
> Signed-off-by: Álvaro Fernández Rojas <noltari@gmail.com>
> ---
>  v3: add changes suggested by Florian:
>    - Switch to a bool and remove unneeded initialization.
>    - Remove if from bcm6358_quirks().
>    - Improve commit description and bcm6358_quirks() comment.
>  v2: check if we're running on TP1 and invert logic.
> 
>  arch/mips/bmips/dma.c   | 5 +++++
>  arch/mips/bmips/setup.c | 8 ++++++++
>  2 files changed, 13 insertions(+)

applied to mips-fixes with the Fixes tag added.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.                                                [ RFC1925, 2.3 ]

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-03-18 13:38 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-10 12:13 [PATCH] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() Álvaro Fernández Rojas
2023-03-11 17:32 ` Florian Fainelli
2023-03-11 19:44   ` Jonas Gorski
2023-03-12  7:31     ` William Zhang
2023-03-12 16:16       ` Florian Fainelli
2023-03-12 18:50         ` Álvaro Fernández Rojas
2023-03-13 17:37           ` Florian Fainelli
2023-03-13 21:39             ` Álvaro Fernández Rojas
2023-03-13 21:46               ` Florian Fainelli
2023-03-14 18:14                 ` Álvaro Fernández Rojas
2023-03-16 17:40                   ` Florian Fainelli
2023-03-16 18:03                     ` Álvaro Fernández Rojas
2023-03-16 18:05 ` [PATCH v2] mips: bmips: BCM6358: disable arch_sync_dma_for_cpu_all() for TP1 Álvaro Fernández Rojas
2023-03-17  0:00   ` Florian Fainelli
2023-03-17 10:20   ` [PATCH v3] mips: bmips: BCM6358: disable RAC flush " Álvaro Fernández Rojas
2023-03-17 16:44     ` Florian Fainelli
2023-03-18 13:38     ` Thomas Bogendoerfer

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).